Qwen3-VL can scan two-hour videos and pinpoint nearly every detail

moralestapia

To me, this already qualifies as some sort of ASI.

thot_experiment

anyone have a tl;dr for me on the best way to get the video comprehension stuff going? i use qwen-30b-vl locally all the time as my go-to model because it's just so insanely fast, and i'm curious to mess with the video stuff. the vision comprehension works great, and i use it for OCR and classification all the time