AI Is Slowing Down Tracker
21 comments
· August 25, 2025 · DrNosferatu
rpdillon
Can vouch. Had an .mkv that browsers wouldn't play, and asked AI to give me a command line that maximized compatibility so I could stream it from CopyParty without folks on my network having to mount it and stream to VLC, rather than just play in the browser.
This is one of those cases where I couldn't really verify that what it suggested was correct:
ffmpeg -i file.mkv -c:v libx264 -profile:v baseline -level 3.0 -pix_fmt yuv420p -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" -c:a aac -b:a 128k -movflags +faststart output.mp4
I could look up all those flags, but I haven't. But the command did seem to work! I'm sure the HN crowd can critique. =)
And so I'm curious about the concept you're getting at: command line tools that offer natural language interfaces. Maybe the command line can become more broadly accessible, especially if designed with that use case in mind.
skydhash
It seems that my computer usage is mild. I use ffmpeg to convert albums (which I don't usually do because I download everything in flac), and I've got a couple of convert.sh lying around. My most advanced usage of ffmpeg was enabling streaming in aac for gonic (you can't do gapless with mp3 and opus wouldn't play).
This is just to say that anything with more than a couple of flags usually ends up as an alias, function, or shell script.
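As a rough sketch of what such a wrapper could look like, here is the command from the parent comment turned into a shell function. The name webify, the DRY_RUN switch, and the .web.mp4 default output name are made up for illustration; the ffmpeg flags are copied unchanged from the command above.

```shell
# Hypothetical wrapper around the command quoted in the parent comment.
# Usage: webify input.mkv [output.mp4]
# Set DRY_RUN=1 to print the command instead of running it.
webify() {
    src=$1
    # Default output name (an assumption): input name with .web.mp4 appended
    # in place of the original extension, so the input is never overwritten.
    dst=${2:-${src%.*}.web.mp4}
    ${DRY_RUN:+echo} ffmpeg -i "$src" \
        -c:v libx264 -profile:v baseline -level 3.0 -pix_fmt yuv420p \
        -vf "scale=trunc(iw/2)*2:trunc(ih/2)*2" \
        -c:a aac -b:a 128k -movflags +faststart "$dst"
}
```

Dropped into a shell startup file, `webify file.mkv` would then stand in for the whole line, and `DRY_RUN=1 webify file.mkv` shows what would be run.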
> I could look up all those flags, but I haven't. But the command did seem to work! I'm sure the HN crowd can critique. =)
Not a critique, just highlighting that ffmpeg is a power tool. Most of the intricacy is in codec and audio/video knowledge, not in ffmpeg itself. You have containers, video codecs, audio codecs, audio channels, media tracks, resolution, bitrate, quality (lossy), compression rate (lossless), ... and a bunch of manipulations depending on the media type.
Just so you know, H.264 (the standard) is one of the most widely supported video formats, playable on anything not from the stone age. Its successor is H.265 (which restarted the license controversy), which is mostly used for 4K media on the high seas. Then you need a container (MP4) that can contain both the video and audio tracks. MKV is another type of container. yuv420 is how color is represented (chroma subsampling), much better than RGB when you want free compression. faststart lets playback start as soon as possible, instead of having to download a good part of the file first. I think PDF has something like that too.
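To put a number on the "free compression" from chroma subsampling, here is a small back-of-the-envelope sketch. It assumes 8-bit samples and even dimensions; the function names are made up for illustration.

```shell
# Raw size in bytes of one uncompressed 8-bit frame of the given width and height.
# rgb24: three full-resolution samples per pixel.
rgb24_bytes()   { echo $(( $1 * $2 * 3 )); }
# yuv420p: one full-resolution luma plane plus two chroma planes at a quarter
# of the resolution each, i.e. 1.5 bytes per pixel on average.
yuv420p_bytes() { echo $(( $1 * $2 * 3 / 2 )); }

rgb24_bytes 1920 1080     # prints 6220800
yuv420p_bytes 1920 1080   # prints 3110400
```

So before the codec does any work at all, a 4:2:0 frame is already half the size of the same frame in RGB.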
DrNosferatu
I don’t know about the HN crowd, but my AI sure has things to say about your FFmpeg command:
<< On the ffmpeg command,
• It’s conservative but works. The key bits for web playback are: H.264 video, yuv420p, AAC audio, MP4 container with +faststart. That’s exactly what it ensures.
• Where it’s sub‑optimal in 2025:
• profile/level: baseline, level 3.0 maximizes legacy compatibility but hurts efficiency/quality (no B‑frames, CABAC, etc.). High, level 4.0 (or auto) is widely supported on modern browsers/devices.
• quality control: better to use CRF + preset than implicit defaults. Example: -crf 20 -preset veryfast (or slow if you can wait).
• scaling: forcing even dimensions is fine; you can also just let libx264 pad/scale as needed or do scale=ceil(iw/2)*2:ceil(ih/2)*2 to avoid rounding down.
• redundancy: -pix_fmt yuv420p is good; adding format=yuv420p in -vf is redundant if -pix_fmt is set.
• Practical “ladder” that minimizes work and preserves quality:
1. If codecs already web‑friendly, just remux: ffmpeg -i in.mkv -c copy -movflags +faststart out.mp4 (Works when video is H.264 yuv420p and audio is AAC.)
2. If video is OK but audio isn’t (e.g., AC3/Opus), transcode audio only: ffmpeg -i in.mkv -c:v copy -c:a aac -b:a 160k -movflags +faststart out.mp4
3. If video needs re-encode, use modern defaults: ffmpeg -i in.mkv -c:v libx264 -profile:v high -level 4.0 -pix_fmt yuv420p -crf 20 -preset veryfast -vf "scale=ceil(iw/2)*2:ceil(ih/2)*2" -c:a aac -b:a 160k -movflags +faststart out.mp4
4. If you have GPU/QSV and just need “good enough” fast: ffmpeg -hwaccel auto -i in.mkv -c:v h264_nvenc -preset p5 -rc vbr -cq 23 -b:v 5M -maxrate 8M -bufsize 10M -profile:v high -pix_fmt yuv420p -c:a aac -b:a 160k -movflags +faststart out.mp4
• Quick verification after transcoding: ffprobe -v error -select_streams v:0 -show_entries stream=codec_name,profile,level,pix_fmt,width,height -of default=nw=1 out.mp4 >>
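The "ladder" above can be sketched as a small decision function. This is only an illustration under assumptions: the names ladder_step and probe are hypothetical, the check looks only at codec name and pixel format (a real check might also want profile and level), and step 4 is folded into the re-encode case.

```shell
# Hypothetical helper: map probed stream properties to a step of the ladder.
# Arguments: video codec, pixel format, audio codec (as ffprobe reports them).
ladder_step() {
    if [ "$1" = h264 ] && [ "$2" = yuv420p ]; then
        if [ "$3" = aac ]; then
            echo remux           # step 1: -c copy
        else
            echo audio-only      # step 2: -c:v copy -c:a aac
        fi
    else
        echo full-reencode       # step 3 (or 4 with a hardware encoder)
    fi
}

# Read one property of the first video (v) or audio (a) stream of a file.
# Usage: probe v codec_name in.mkv
probe() {
    ffprobe -v error -select_streams "$1:0" -show_entries "stream=$2" \
        -of default=noprint_wrappers=1:nokey=1 "$3"
}
```

With ffprobe installed, something like `ladder_step "$(probe v codec_name in.mkv)" "$(probe v pix_fmt in.mkv)" "$(probe a codec_name in.mkv)"` would then name the cheapest step that applies.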
teraflop
> forcing even dimensions is fine; you can also just let libx264 pad/scale as needed
This part is wrong, because libx264 will reject an input with odd width or height rather than padding or scaling it automatically.
> redundancy: -pix_fmt yuv420p is good; adding format=yuv420p in -vf is redundant if -pix_fmt is set.
This seems to have hallucinated a redundancy that isn't there.
reactordev
For sure. You downgraded the video to half the size, then blew it back up again, converted the audio, set the apple mov headers, and spit that sucker out as an mp4 with probably half the resolution in pixel density but hey - it played.
I would try it again without the pix_fmt flag, the vf flag (and string). No idea what -level 3.0 is as it’s not in the docs anywhere (hallucination?). The video filter scaling definitely needs to go if you want it to be as close to the original resolution.
Cool part is, it worked. Even with a bad flag or two, ffmpeg said “Hold my beer”
teraflop
> You downgraded the video to half the size, then blew it back up again
No, that's not what that command does. It performs a single rescaling that resizes the video dimensions to the next lower multiple of 2. e.g. it will resize an 801x601 pixel video to 800x600.
If the video size is already an even number of pixels, it's a no-op and doesn't lose any detail.
If the video size isn't already even, then the rescaling is necessary because H.264 doesn't support odd dimensions.
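The rounding is easy to check by hand with shell integer arithmetic, which truncates the same way for positive sizes. A small sketch (the function names are made up; even_ceil mirrors the ceil variant suggested elsewhere in the thread):

```shell
# Round a dimension down to an even number, mirroring trunc(x/2)*2.
even_floor() { echo $(( $1 / 2 * 2 )); }
# Round a dimension up to an even number, mirroring ceil(x/2)*2.
even_ceil()  { echo $(( ($1 + 1) / 2 * 2 )); }

printf '%sx%s\n' "$(even_floor 801)" "$(even_floor 601)"   # 800x600
printf '%sx%s\n' "$(even_floor 800)" "$(even_floor 600)"   # 800x600 (unchanged)
printf '%sx%s\n' "$(even_ceil 801)" "$(even_ceil 601)"     # 802x602
```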
> No idea what -level 3.0 is as it’s not in the docs anywhere (hallucination?).
It's documented here: https://ffmpeg.org/ffmpeg-codecs.html#Options-40
kylehotchkiss
Total AI expenditure justified
DrNosferatu
But synthesizing FFmpeg commands is not the total gain from “AI expenditure”, is it?
There are infinite similar use cases.
I guess the killer app (for AI coding) will be a framework to successfully structure projects in appropriately sized complexity morsels LLMs can efficiently chew through.
- Has Amazon’s Kiro truly managed this?
- What other efforts are there in this direction?
skydhash
> I guess the killer app (for AI coding) will be a framework to successfully structure projects in appropriately sized complexity morsels LLMs can efficiently chew through.
AI is a step too late. We already have a solution. They are called SDKs and frameworks. The few things left are the business logic, which you'll gather in meetings. The coding is mostly tabbing (for completion), copy-pasting (you already have something 80% similar), and refactoring (a new integration is needed, and the static configuration isn't going to cut it).
A lot of the coding work is hunting down bugs (because you assumed instead of knowing) and moving things around in a way that won't make the whole thing crash.
eddiewithzato
Yeah, so basically just a better search engine. Not what the VCs were promised though
nopinsight
More than half of the 2024 links, about 15, appeared between o1-preview’s September launch and a few days after o3’s late-December announcement. That span was arguably the most rapid period of advancement for these models in recent years.
boombapoom
ironic this was built with replit
techpineapple
Is it increasing or decreasing? Need some graphs.
guluarte
Who would have guessed that glorified Markov chains are not the path to AGI
WanderPanda
You know what's a glorified Markov chain? The Universe
AI has made FFmpeg easily usable to the mere mortal - that alone is a technological revolution.