FFmpeg by Example
253 comments
·January 14, 2025hbn
simonw
I use ffmpeg multiple times a week thanks to LLMs. It's my top use-case for my "llm cmd" tool:
uv tool install llm
llm install llm-cmd
llm cmd use ffmpeg to extract audio from myfile.mov and save that as mp3
https://github.com/simonw/llm-cmd
resonious
I tried this (though with a different tool called aichat) for extremely simple stuff like just "convert this mov to mp4" and it generated overly complex commands that failed due to missing libraries. When I removed the "crap" from the commands, they worked.
So much like code assistance, they still need a fair amount of baby sitting. A good boost for experienced operators but might suck for beginners.
cm2187
Plus you need to know the format of your source file to design the command correctly. How many audio tracks, is the first video track a thumbnail or the video, are the subtitles tracks forced, etc.
And in some situations ffmpeg has some warts you have to go around. Like they introduced recently a moronic change of behaviour where the first sub tracks becomes forced/default irrespective of the original forced/default flag of the source. You need to add "-default_mode infer_no_subs" to counter that.
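A minimal sketch of inspecting the source layout before writing the command, assuming ffprobe is on the PATH. The helper names (probe_streams, summarize_streams) and the sample data are made up for illustration; the JSON keys are the ones ffprobe prints with -show_streams -of json.

```python
import json
import subprocess


def probe_streams(path):
    """Run ffprobe and return its list of stream dicts (needs ffprobe installed)."""
    out = subprocess.run(
        ["ffprobe", "-v", "error", "-show_streams", "-of", "json", path],
        capture_output=True, text=True, check=True,
    ).stdout
    return json.loads(out)["streams"]


def summarize_streams(streams):
    """Turn ffprobe stream dicts into short one-line descriptions."""
    lines = []
    for s in streams:
        disp = s.get("disposition", {})
        flags = [name for name in ("default", "forced", "attached_pic") if disp.get(name)]
        codec = s.get("codec_name", "?")
        lines.append(f"{s['index']}: {s['codec_type']} {codec} {','.join(flags) or '-'}")
    return lines


# Hypothetical sample, shaped like ffprobe's JSON output.
sample = [
    {"index": 0, "codec_type": "video", "codec_name": "mjpeg", "disposition": {"attached_pic": 1}},
    {"index": 1, "codec_type": "video", "codec_name": "h264", "disposition": {"default": 1}},
    {"index": 2, "codec_type": "subtitle", "codec_name": "subrip", "disposition": {"default": 0, "forced": 1}},
]
for line in summarize_streams(sample):
    print(line)
```

With a listing like that in hand it is easier to tell the thumbnail from the real video track and to see which subtitle tracks carry the forced flag.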
Over2Chars
My feelings exactly, but I think that's OK!
It's another tool and one that might actually improve with time. I don't see GNU's man pages getting any better spontaneously.
Whoa, what if they started to use AI to auto-generate man pages...
BiteCode_dev
Reading this feels like seeing a guy getting his first car in 1920 and complaining he still has to drive it himself.
assimpleaspossi
My experience exactly.
I no longer check with these AI tools after a number of attempts. Unrelated, a friend thought there was a NFL football game last Saturday at noon. Checking with Google's Gemini, it said "no", but there was one between two teams whose season had ended two weeks before at 1:00 Eastern Time and 2:00 Central. (The times are backwards.)
sdesol
> "convert this mov to mp4"
Did any of the commands look like the ones in the left window:
https://beta.gitsense.com/?chats=12850fe4-ffb1-4618-9215-c13...
The left window contains a summary of all the LLMs asked, including all commands. The right window contains the individual LLM responses.
I asked about gotchas with missing libraries as well, and Sonnet 3.5 said there were. Were these the same libraries that were missing for you?
keeganpoppen
what exactly do you want the llm to do here? if the ask was so unambiguous and simple that it could be reliably generated, then the interface wouldn't be so complicated to use in the first place! LLMs are not in any way best suited for one-shot prompt => perfect output, and expectations to that effect are extremely unreasonable. the reason why LLMs are still hard for beginners to use is because the software is hard to use correctly. as with LLM output goes life itself: the results you get from using a tool can only ever be as good as the (mental) model used to choose that tool & the inputs to begin with. if all the information required to generate the output were contained by the initial prompt, then there would be absolutely no need to use the LLM at all in the first place.
Philpax
Hate to be that guy, but which LLM was doing the generation? GPT-4 Turbo / Claude 3.x have not really let me down in generating ffmpeg commands - especially for basic requests - with most of their failures resulting from domain-specific vagaries that an expert would need to weigh in on m
Melomomololo
[dead]
pmarreck
A while back I simply wrote my own bash function for this called `please`
as in
bash> please "use ffmpeg to extract audio from myfile.mov and save it as mp3"
It will then courteously show you the command it wants to run before you agree to do it.
Here is the whole thing, with its two dependent functions, so that people stop writing their own versions of this lol. All it needs is an OPENAI_API_KEY, feel free to modify for other LLMs
EDIT: Moved to a gist: https://gist.github.com/pmarreck/9ce17f7996347dd532f3e20a2a3...
Suggestions welcome- for example I want to add a feature that either just copies it (for further modification) or prepopulates the command line with it somehow (possibly for further modification, or even for skipping the approval step)
smusamashah
please is such an appropriate name. Will rename my ChatGPT alias to please.
atoav
Did you just invent the LLM-equivalent of curl-piping unread shell scripts into sh?
I am sure that will never cause any problems.
bspammer
It displays the generated command to you, there's an additional step to confirm.
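The show-then-confirm step can be sketched in a few lines. This is not the code from the gist, just a hypothetical Python version of the same idea; confirm_and_run and its ask/run parameters are names invented here so the prompt and the runner can be swapped out.

```python
import shlex
import subprocess


def confirm_and_run(argv, ask=input, run=subprocess.run):
    """Show a command, run it only on an explicit 'y', and report whether it ran."""
    print("Proposed command:")
    print("  " + shlex.join(argv))
    answer = ask("Run it? [y/N] ").strip().lower()
    if answer != "y":
        print("Skipped.")
        return False
    run(argv, check=True)
    return True
```

Defaulting to "no" means a stray Enter never executes something unread, which is the whole point of the extra step.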
ykonstant
> Did you just invent the LLM-equivalent of curl-piping unread shell scripts into sh?
Many such cases.
dekhn
"The future is already here. It's just not very well distributed"
(honestly, the work you share is very inspiring)
zahlman
>This will then be displayed in your terminal ready for you to edit it, or hit <enter> to execute the prompt. If the command doesn't look right, hit Ctrl+C to cancel.
I appreciate the UI choice here. I have yet to do anything with AI (consciously and deliberately, anyway) but this sort of thing is exactly what I imagine as a proper use case.
hnuser123456
Just like all other code. There will be user-respecting open source code and tools, and there's user-disrespecting profitable closed code that makes too many decisions for you.
mvonballmo
Hypertalk <https://en.wikipedia.org/wiki/HyperTalk> lives.
th0ma5
You should figure out what went wrong for the other commenter and fix your tool.
levocardia
For the longest time I had ffmpeg in the same bucket as regex: "God I really need to learn this but I'm going to hate it so much." Then ChatGPT came along and solved both problems!
zxvkhkxvdvbdxz
Interesting. Being able to use regexps for text processing through my career has probably saved me a few thousand hours of programming one-off solutions so far. It is one of those skills that really pays off to learn proper.
And speaking of ffmpeg, or tooling in general, I tend to make notes. After a while you end up with a pretty decent curated reference.
codetrotter
I use regexes a lot. The main thing that always trips me up is dealing with escaping, because different tools I use – vim, sed, rg, and so on – sometimes have different meanings for when to escape or not.
In one tool you’ll use + to match one or more times, and \+ to mean literal plus sign.
In another tool you’ll use \+ to match one or more times, and + to mean literal plus sign.
In one tool you’ll use ( and ) to create a match group, and \( and \) to mean literal open and close parentheses.
In another tool you’ll use \( and \) to create a match group, and ( and ) to mean literal open and close parentheses.
This is basically the only problem I have when writing regexes, for the kinds of regexes I write.
Also, one thing that’s not a problem per se but something that leads me to write my regexes with more characters than strictly necessary is that I rarely use shorthand for groups of characters. For example the tool might have a shorthand for digit but I always write [0-9] when I need to match a digit. Also probably because the shorthand might or might not be different for different tools.
Regexes are also known to be “write once read never”, in that writing a regex is relatively easy, but revisiting a semi-complicated regex you or someone else wrote in the past takes a little bit of extra effort to figure out what it’s matching and what edits one should make to it. In this case, tools like https://regex101.com/ or https://www.debuggex.com/ help a lot.
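For a concrete reference point, here is how one dialect (Python's re module) settles those questions. Tools that use POSIX basic regular expressions, such as GNU sed and grep without -E, go the other way on + and parentheses. The last two checks also suggest why spelling out [0-9] can be the safer habit.

```python
import re

# Python's `re` treats a bare + as a quantifier and \+ as a literal plus.
assert re.fullmatch(r"a+", "aaa")
assert re.fullmatch(r"a\+", "a+")
assert re.fullmatch(r"a\+", "aaa") is None

# Bare parentheses create a group; escaped ones match literal parentheses.
assert re.fullmatch(r"(ab)+", "abab").group(1) == "ab"
assert re.fullmatch(r"\(ab\)", "(ab)")

# \d is broader than [0-9] for str patterns: it also accepts non-ASCII digits.
arabic_three = "\u0663"
assert re.fullmatch(r"\d", arabic_three)
assert re.fullmatch(r"[0-9]", arabic_three) is None
```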
mystified5016
No one doubts the power or utility of regexes or ffmpeg, but they are both complicated beasts that really take a lot of skill.
They're both tools where if they're part of your daily workflow you'll get immense value out of learning them thoroughly. If instead you need a regex once or twice a week, the benefit is not greater than the cost of learning to do it myself. I have a hundred other equally complicated things to learn and remember, half the job of the computer is to know things I can't put in my brain. If it can do the regex for me, I suddenly get 70% of the value at no cost.
Regex is not a tool I need often enough to justify the hours and brain space. But it is still an indispensable tool. So when I need a regex, I either ask a human wizard I know, or now I ask my computer directly.
earnestinger
Not sure about ffmpeg, but you should definitely try memorising regexp. Casual Search&replace that becomes possible is worth it.
sergiotapia
in 15 years it never sticks and by the time i need it again i've forgotten it! :D
teaearlgraycold
Gotta be honest, years of configuring automod on Reddit have honed me into a regex God.
jmb99
For me, it wasn’t so much learning ffmpeg, as it was understanding containers/codecs/encoders/streams/etc. Learning all of the intricacies there made ffmpeg make a lot more sense.
skydhash
Almost no one cares to understand the domain of the tool anymore, they only want result and expect a simplified interface that already does the unique thing they want to do, but can’t accept that a power tool can only be used with training.
shlomo_z
... Then ChatGPT came along and I had 3 problems! https://regex.info/blog/2006-09-15/247
hackingonempty
CSS has entered the ChatGPT.
kccqzy
My rule for using LLMs is that anything that's one off is okay. Anything that's more permanent and committed to a repo needs a human review. I strongly suggest you have an understanding of the basics (at least the box model) so that you are competent at reviewing CSS code before using LLM for that.
permo-w
I've been looking for a good guide on prompting LLMs for CSS.
does anyone know of any?
juancroldan
Same here, it's one of these things where AI has taken over completely and I'm just a broker that copy-pastes error traces.
jjcm
In addition to the many others mentioned, here's a script I just threw together that simplifies a lot of these chained commands - llmpeg: https://github.com/jjcm/llmpeg
If you have ffmpeg installed and an OpenAI env api key set, it should work out of the box.
Demo: https://image.non.io/1c7a92ef-0917-49ef-9460-6298c7a9116c.we...
magarnicle
My experience got even better once I learned how complex filters worked.
dylan604
learning how to use splits to do multiple things all in one command is a godsend. only needing to read the source and convert to baseband video once is a great savings.
i started with avisynth, and it took time for my brain to switch to ffmpeg. i don't know how i could function without ffmpeg at this point
NetOpWibby
Truly, a net positive to my life. Just a few days ago I asked my AI buddy (Claude) to create a zsh script to organize my downloads folder according to the Johnny Decimal system. I’ve since modified it to move the files to a JD setup on my desktop.
The sense of elation I get when I wonder aloud to my digital friend and they generate what I thought was too much to expect. Well worth the subscription.
bambax
Basic syntax for re-encoding a video file did take me some time to memorize, but isn't in fact too hard:
ffmpeg <Input file(s)> <Codec(s)> <MAPping of streams> <Video Filters> output_file
- input file: -i, can be repeated for multiple input files, like so: ffmpeg -i file1.mp4 -i file2.mkv
If there is more than one input file then some mapping is needed to decide what goes out in the output file.
- codec: -c:x where x is the type of codec (v: video, a: audio or s: subtitles), followed by its name, like so:
-c:v libx265
I usually never set the audio codec as the guesses made by ffmpeg, based on output file type, are always right (in my experience), but deciding the video codec is useful, and so is the subtitles codec, as not all containers (file formats) support all codecs; mkv is the most flexible for subtitles codecs.
- mapping of streams: -map <input_file>:<stream_type>:<order>, like so:
-map 0:v:0 -map 1:a:1 -map 1:a:0 -map 1:s:4
Map tells ffmpeg what stream from the input files to put in the output file. The first number is the position of the input file in the command, so if we're following the same example as above, '0' would be 'file1.mp4' and '1' would be 'file2.mkv'. The parameter in the middle is the stream type (v for video, a for audio, s for subtitles). The last number is the position of the stream IN THE INPUT FILE (NOT in the output file).
The position of the stream in the output file is determined by the position of the map command in the command line, so for example in the command above we are inverting the position of the audio streams (taken from 'file2.mkv'), as audio stream 1 will be in first position in the output file, and audio stream 0 (the first in the second input file) will be in second position in the output file.
This map thing is for me the most counter-intuitive because it's unusual for a CLI to be order-dependent. But, well, it is.
- video filters: -vf
Video filters can be extremely complex and I don't pretend to know how to use them by heart. But one simple video filter that I use often is 'scale', for resizing a video:
-vf scale=<width>:<height>
width and height can be exact values in pixels, or one of them can be '-1' and then ffmpeg computes it based on the current aspect ratio and the other provided value, like this for example: -vf scale=320:-1
This doesn't always work because the computed value should be an even integer; if it's not, ffmpeg will raise an error and tell you why; then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).
And that's about it! ffmpeg options are immense, but this gets me through 90% of my video encoding needs, without looking at a manual or asking an LLM. (The only other options I use often are -ss and -t for start time and duration, to time-crop a video.)
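The same layout (inputs, codec, maps, filter, output) can be written down as a small helper. A sketch only: build_ffmpeg_cmd and its parameters are names invented for this example, and it only assembles the argument list, it doesn't run anything.

```python
def build_ffmpeg_cmd(inputs, output, video_codec=None, maps=(), vf=None):
    """Assemble an ffmpeg argv in the order: inputs, codec, maps, filter, output."""
    argv = ["ffmpeg"]
    for path in inputs:
        argv += ["-i", path]
    if video_codec:
        argv += ["-c:v", video_codec]
    for spec in maps:
        # Each spec is <input_file>:<stream_type>:<order>; output order follows list order.
        argv += ["-map", spec]
    if vf:
        argv += ["-vf", vf]
    argv.append(output)
    return argv


cmd = build_ffmpeg_cmd(
    ["file1.mp4", "file2.mkv"],
    "out.mkv",
    video_codec="libx265",
    maps=["0:v:0", "1:a:1", "1:a:0"],
    vf="scale=320:-1",
)
print(" ".join(cmd))
```

Passing the list to subprocess.run avoids shell quoting issues with odd file names.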
izacus
> This doesn't always work because the computed value should be an even integer; if it's not, ffmpeg will raise an error and tell you why; then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).
It's not about integer, but some of the sizes need to be even. You can use `-vf scale=320:-2` to ensure that.
bambax
It's hard for a number to be even without first being an integer, no? ;-)
But thanks for '-2', didn't know about that! It's the exact default option I needed! Will be using that always from now on.
https://stackoverflow.com/questions/71092347/ffmeg-option-sc...
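To see why -1 can land on an odd number and what -2 buys you, here is a rough sketch. even_scaled_height is a made-up helper that only approximates the idea (aspect-preserving height snapped to an even value); it is not claimed to reproduce ffmpeg's exact rounding.

```python
def even_scaled_height(in_w, in_h, out_w):
    """Aspect-preserving height for out_w, rounded to the nearest even integer."""
    exact = out_w * in_h / in_w
    return max(2, round(exact / 2) * 2)


def scale_filter(out_w):
    """Filter string that asks ffmpeg to pick an even height itself."""
    return f"scale={out_w}:-2"


# 1280x720 scaled to width 250 gives an exact height of 140.625,
# which plain rounding would turn into the odd value 141.
print(even_scaled_height(1280, 720, 250))
print(scale_filter(250))
```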
0x38B
A practical example of mapping streams:
ffmpeg -i <movie-with-many-tracks.mkv> -map 0:0 -map 0:5 -map 0:12 -vcodec copy -acodec copy -scodec copy "output-movie.mkv"
Use: sometimes I have a file with a lot of audio and/or subtitle streams but only want one or two of each – here, 0:0 is the video, 0:5 is English audio, and 0:12 was the subtitle track I wanted. Setting the codecs to “copy” means nothing gets reencoded.
jmb99
> then you can replace the -1 with the nearest even integer (I wonder why it can't do that by itself, but apparently, it can't).
Likely because the aspect ratio will no longer be the same. There will either be lost information (cropping), compression/stretching, or black bars, none of which should be default behaviour. Hence, the warning.
pdyc
I ended up creating my own tool to generate ffmpeg commands https://newbeelearn.com/tools/videoeditor/
jazzyjackson
This reminds me I need to publish my write up on how I've been converting digitized home video tapes into clips using scene detection, but in case anyone is googling for it, here's a gist I landed on that does a good job of it [0] but sometimes it's fooled by e.g. camera flashes or camera shake so I need to give it a start and end file and have ffmpeg concatenate them back together [1]
Weird thing is I got better performance without "-c:v h264_videotoolbox" on latest Mac update, maybe some performance regression in Sequoia? I don't know. The equivalent flag for my windows machine with Nvidia GPU is "-c:v h264_nvenc" . I wonder why ffmpeg doesn't just auto detect this? I get about 8x performance boost from this. Probably the one time I actually earned my salary at work was when we were about to pay out the nose for more cloud servers with GPU to process video when I noticed the version of ffmpeg that came installed on the machines was compiled without GPU acceleration !
[0] https://gist.githubusercontent.com/nielsbom/c86c504fa5fd61ae...
[1] https://gist.githubusercontent.com/jazzyjackson/bf9282df0a40...
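A sketch of checking what a given ffmpeg build was compiled with before choosing an encoder, assuming the table layout printed by `ffmpeg -encoders` (a legend, a dashed separator, then one encoder per line). The function names are invented here, and falling back to libx264 assumes the build includes it.

```python
import subprocess


def list_encoders(encoders_output):
    """Extract encoder names from the text printed by `ffmpeg -encoders`."""
    names = set()
    in_table = False
    for line in encoders_output.splitlines():
        if line.strip().startswith("------"):
            in_table = True  # the encoder table starts after this separator
            continue
        parts = line.split()
        if in_table and len(parts) >= 2:
            names.add(parts[1])  # first column is flags, second is the name
    return names


def pick_h264_encoder(available, preferred=("h264_nvenc", "h264_videotoolbox")):
    """Return the first preferred hardware encoder present, else libx264."""
    for name in preferred:
        if name in available:
            return name
    return "libx264"


def installed_encoders():
    """Ask the local ffmpeg build what it supports (needs ffmpeg installed)."""
    out = subprocess.run(
        ["ffmpeg", "-hide_banner", "-encoders"],
        capture_output=True, text=True, check=True,
    ).stdout
    return list_encoders(out)
```

Something like pick_h264_encoder(installed_encoders()) would have flagged the build without GPU acceleration straight away. An encoder being listed only means it was compiled in, not that the hardware is present.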
jack_pp
> Probably the one time I actually earned my salary at work was when we were about to pay out the nose for more cloud servers with GPU to process video when I noticed the version of ffmpeg that came installed on the machines was compiled without GPU acceleration !
Issue with cloud CPU's is that they don't come with any of the consumer grade CPU built-in hardware video encoders so you'll have to go with the GPU machines that cost so much more. To be honest I haven't tried using HW accel in the cloud to have a proper price comparison, are you saying you did it and it was worth it?
radicality
Are the hardware encoders even good? I thought that unless you need something realtime, it's always better to spend the cpu cycles on a better encode with the software encoder. Or have things changed?
jmb99
They still suck compared to software encoders. This is true for both H.264 and H.265 on AMD, Nvidia, and Intel GPUs. They’re “good enough” for live streaming, or for things like Plex transcoding, or where you care only about encoding speed and have a large bandwidth budget. They’re better than they used to be, but not worth using for anything you really care about.
zos_kia
That's my experience too. I transcode a lot of video for a personal project and hardware acceleration isn't much faster. I figure that's because on CPU I can max out my 12 cores.
The file size is also problematic: I've had hardware encodes twice as large as the same video encoded with CPU.
jack_pp
I know they used to be worse, haven't tested the newest ones
jazzyjackson
We were a quick and dirty R&D team that had to do a lot of video processing quickly, we were not very cost sensitive and didn’t have anything other than AWS to work with, so I can’t speak to whether it was worth it :)
dekhn
I used ffmpeg for empty scene detection- I have a camera pointed at the flight path for SFO, and stripped out all the frames that didn't have motion in them. You end up with a continuous movie of planes passing through, with none of the boring bits.
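The comment doesn't say which filter was used. One common way to get this effect, offered here only as an assumption, is the select filter with its scene-change score, followed by setpts to close the gaps in the timestamps. The helper name and the threshold value are made up and would need tuning per camera.

```python
def motion_only_cmd(src, dst, threshold=0.01):
    """Build an ffmpeg argv that keeps frames whose scene-change score exceeds threshold."""
    # The quotes protect the comma inside gt() from the filter-chain parser.
    vf = f"select='gt(scene,{threshold})',setpts=N/FRAME_RATE/TB"
    # -an drops the audio, which would no longer line up with the kept frames.
    return ["ffmpeg", "-i", src, "-vf", vf, "-an", dst]


print(motion_only_cmd("sky.mp4", "planes.mp4"))
```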
hnuser123456
Then can you merge all the clips starting when motion starts and see hundreds of planes fly across at once?
dekhn
Interesting. Yes, I assume that's possible although I'm not sure how you handle the background- I guess you find an empty frame, and subtract that from every image with a plane.
One of the advantages of working with image data is that movies are really just 3d data and as long as all the movies you work with are the same size, if you have enough ram, or use dask, you could basically do this in a couple lines of numpy.
rahimnathwani
-c:v h264_nvenc
This is useful for batch encoding, when you're encoding a lot of different videos at once, because you can get better encoding throughput.
But in my limited experiments a while back, I found the output quality to be slightly worse than with libx264. I don't know if there's a way around it, but I'm not the only one who had that experience.
ziml77
IIRC they have improved the hardware encoder over the generations of cards, but yes NVENC has worse quality than libx264. NVENC is really meant for running the compression in real-time with minimal performance impact to the system. Basically for recording/streaming games.
icelancer
So counterintuitive that nvenc confers worse quality than QSV/x264 variants, but it is both in theory and in my testing as well.
But for multiple streams or speed requirements, nvenc is the only way to fly.
Gormo
Why's that counterintuitive? It makes intuitive sense to me that an approach optimized for throughput would make trade-offs that are less optimized for quality.
xnx
Co-signing. Encode time was faster with nvenc, but quality was noticeably worse even to my untrained eye.
jazzyjackson
Fascinating, it didn't occur to me quality could take a hit, I thought the flag merely meant "perform h264 encoding over here"
Edit: relevant docs from ffmpeg, they back up your perception, and now I'm left to wonder how much I want to learn about profiles in order to cut up these videos. I suppose I'll run an overnight job to reencode them from AVI to h264 at high quality, and make sure the scene detect script is only doing copies, not reencoding, since that's the part I'm doing interactively, there's no real reason I should be sitting at the computer while it's transcoding.
Hardware encoders typically generate output of significantly lower quality than good software encoders like x264, but are generally faster and do not use much CPU resource. (That is, they require a higher bitrate to make output with the same perceptual quality, or they make output with a lower perceptual quality at the same bitrate.)
Gormo
> I wonder why ffmpeg doesn't just auto detect this?
Hardware encoding is often less configurable and involves greater trade-offs than using sophisticated software codecs, and doesn't produce exactly equivalent results even with equivalent parameters. On top of that, systems often have multiple hardware APIs to choose from that often offer different features.
FFMpeg is a complex command-line tool intended for users who are willing to learn its intricacies, so I'm not sure it makes sense for it to set defaults based on assumptions.
Trixter
In your snippets, you don't appear to be deinterlacing. If your pre-digitized clips are already deinterlaced, that's fine, but if they're not, you're encoding interlaced material as progressive, and mangling the quality. Try adding a bwdif filter so that your 30i content gets encoded as 60p (which will look more like the original videotapes).
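A sketch of what adding bwdif could look like, assuming the source really is interlaced. deinterlace_cmd is a made-up helper; mode=send_field asks bwdif for one output frame per field, which is what turns 30i into 60p.

```python
def deinterlace_cmd(src, dst, video_codec="libx264"):
    """Build an ffmpeg argv that deinterlaces with bwdif, one output frame per field."""
    return [
        "ffmpeg", "-i", src,
        "-vf", "bwdif=mode=send_field",  # one frame per field: 30i in, 60p out
        "-c:v", video_codec,
        dst,
    ]


print(" ".join(deinterlace_cmd("tape01.avi", "tape01.mp4")))
```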
dekhn
I've gotten pretty good at various bits of ffmpeg over time. Its CLI has a certain logic to it... it's order dependent (not all unix CLIs are).
Lately, I've been playing around with more esoteric functionality. For example, storing raw video straight off a video camera on a fairly slow machine. I built a microscope and it reads frames off the camera at 120FPS in raw video format (YUYV 1280x720) which is voluminous if you save it directly to disk (gigs per minute). Disks are cheap but that seemed wasteful, so I was curious about various close-to-lossless techniques to store the exact images, but compressed quickly. I've noticed that RGB24 conversion in ffmpeg is extremely slow, so instead after playing around with the command line I ended up with:
ffmpeg -y -f rawvideo -pix_fmt yuyv422 -s 1280x720 -i test.raw -vcodec libx264 -pix_fmt yuv420p -crf 13 movie.mp4
This reads in raw video- because raw video doesn't have a container, it lacks metadata like "pixel format" and "image size", so I have to provide those. It's order dependent- everything before "-i test.raw" is for decoding the input, and everything after is for writing the output. I do one tiny pixel format conversion (that ffmpeg can do really fast) and then write the data out in a very, very close to lossless format with a container (I've found .mkv to be the best container in most cases).
Because I hate command lines, I ended up using ffmpeg-python which composes the command line from this:
self.process = (
    ffmpeg
    .input(
        "pipe:",
        format="rawvideo",
        pix_fmt="yuyv422",
        s="{}x{}".format(1280, 720),
        threads=8
    )
    .output(
        fname, pix_fmt="yuv422p", vcodec="libx264", crf=13
    )
    .overwrite_output()
    .global_args("-threads", "8")
    .run_async(pipe_stdin=True)
)
and then I literally write() my frames into the stdin of that process. I had to limit the number of threads because the machine has 12 cores and uses at least 2 at all times to run the microscope.
I'm still looking for better/faster lossless YUV encoding.
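For anyone without ffmpeg-python, roughly the same pipeline can be sketched with only the standard library. The function names are hypothetical, and the frame size check relies on YUYV 4:2:2 packing two bytes per pixel.

```python
import subprocess


def frame_bytes(width, height):
    """Bytes per YUYV 4:2:2 frame: two bytes per pixel."""
    return width * height * 2


def raw_encode_cmd(width, height, fps, dst, crf=13):
    """ffmpeg argv that reads raw YUYV frames from stdin and encodes with libx264."""
    return [
        "ffmpeg", "-y",
        # Input options: raw video has no header, so describe it explicitly.
        "-f", "rawvideo", "-pix_fmt", "yuyv422",
        "-s", f"{width}x{height}", "-framerate", str(fps),
        "-i", "pipe:",
        # Output options: everything after -i applies to the file being written.
        "-c:v", "libx264", "-pix_fmt", "yuv422p", "-crf", str(crf),
        dst,
    ]


def encode_frames(frames, width, height, fps, dst):
    """Stream an iterable of raw frame buffers into ffmpeg (needs ffmpeg installed)."""
    proc = subprocess.Popen(raw_encode_cmd(width, height, fps, dst), stdin=subprocess.PIPE)
    for frame in frames:
        if len(frame) != frame_bytes(width, height):
            raise ValueError("unexpected frame size")
        proc.stdin.write(frame)
    proc.stdin.close()
    return proc.wait()


# 1280x720 at 120 fps is about 221 MB of raw data per second.
print(frame_bytes(1280, 720) * 120)
```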
zahlman
>Its CLI has a certain logic to it... it's order dependent (not all unix CLIs are).
Which is appropriate. A Unix pipeline is dependent on the order of the components, and complex FFMpeg invocations entail doing something analogous.
>I ended up using ffmpeg-python which composes the command line from this
A lot of people like this aesthetic, but doing "fluent" interfaces like this is often considered un-Pythonic. (My understanding is that ffmpeg-python is designed to mirror the command-line order closely.) The preference (reinforced by the design of the standard library and built-in types) is to have strong https://en.wikipedia.org/wiki/Command%E2%80%93query_separati... . By this principle, it would look something more like
ffmpeg(global_args=..., overwrite_output=True).process_async(piped_input(...), output(...))
where using a separate construction process for the input produces a different runtime type, which also cues the processing code that it needs to read from stdin.
dekhn
To be honest what I really wanted is more like a programming API or config file than attempting to express complex pipelines and filters in a single command line.
As for what's unpythonic: don't care. My application has code horrors that even Senior Fellows cannot unsee.
zahlman
I get that. My critique is for the library authors, not you.
Ch00k
Similar, arguably simpler, Python library that provides an interface to FFmpeg command line is ffmpy [0], of which I am the author.
jcalvinowens
> I'm still looking for better/faster lossless YUV encoding.
Look no further: https://trac.ffmpeg.org/wiki/Encode/FFV1
dekhn
I spent some time with this on my data set, and in my hands I wasn't able to produce results that were convincingly better than libx264; what I got was slower encodes and larger output files. It's really hard to beat libx264.
jcalvinowens
>> I'm still looking for better/faster lossless YUV encoding.
> I wasn't able to produce results that were convincingly better than libx264
With "-qp 0"? Otherwise, it's not a valid comparison... "-crf 13" is nowhere near lossless (though it might appear so visually).
FFV1 is much better than H264 at lossless compression in my experience. Here's a random sample of a ten second 4K input I had handy (5.5G uncompressed):
h264-ultrafast 1.951s 850M
h264-veryslow 46.528s 715M
ffv1 8.883s 637M
But yeah, if you don't actually require truly lossless data, it's a huge waste.
at_a_remove
I am here to sell you on one word: ramdisks.
If you are doing processing with intermediate steps you do not want to keep? Ramdisks. Oh yeah. Oh yeah.
Moru
This seems to be very forgotten tech. First time I used that was to load NetHack to ram instead of the slow diskette on my Atari. Now I still use it as webcache for work to not bother the database with so many requests.
When I set up the server, the ramdisk didn't have a way of shrinking when space wasn't needed so had to make sure it doesn't eat up all memory when growing unlimited. I bet it's smarter nowadays.
Gormo
Not forgotten at all -- in fact, if your system has a /tmp directory, it is almost certainly a RAMdisk.
mixmastamyk
Slow ffmpeg pipelines are typically cpu-bound rather than io-bound.
e.g. When doing a simple copy, progress status messages upgrade to scientific notation.
latexr
I thought this was going to be a website managed by an experienced user of FFmpeg sharing from their collection of accumulated knowledge, but then was immediately disappointed on the first example I clicked on.
https://www.ffmpegbyexample.com/examples/l1bilxyl/get_the_du...
Don’t call two extra tools to do string processing, that is insane. FFprobe is perfectly capable of giving you just the duration (or whatever) on its own:
ffprobe -loglevel quiet -output_format csv=p=0 -show_entries format=duration video.mp4
Don’t simply stop at the first thing that works; once it does think to yourself if maybe there is a way to improve it.
gariany
Hi, original poster here. I think calling it "insane" is a bit of an exaggeration lol. Don't you think?
I like your solution better!
latexr
> I think calling it "insane" is a bit of an exaggeration
Yes, I agree. It was decidedly the wrong word to use and the post would undoubtedly have been better without that part. Unfortunately, the edit window had already passed by the time I reread it.
fastily
Nice! This reminds me of my own ffmpeg cheatsheet; I would imagine that everyone who uses ffmpeg frequently has a similar set of notes
nickdothutton
FFmpeg is one of those tools I need to use so infrequently that the exact syntax never seems to stick. I've resorted to using an LLM to give me the command line I need. The only other tool that I ever had trouble with was 1990s-era MegaCLI from LSI Logic, also something I barely used from one year to the next (but one where you really need to get it right under pressure).
pseudosavant
I've been using FFMPEG for 15+ years, and still can't remember almost any commands. LLMs have been amazing for using FFMPEG though. ChatGPT and Claude do wonders with "give me an ffmpeg command that will remux a video into mkv, include subtitle.srt in the file, and I only want it between 0:00:05 and 0:01:00." It produced this in case you were wondering: `ffmpeg -i input.mp4 -i subtitle.srt -ss 00:00:05 -to 00:01:00 -map 0 -map 1 -c copy -c:s mov_text output.mkv`
I wonder how small of an LLM you could develop if you only wanted to target creating ffmpeg commands. Perhaps it could be small enough to be hosted on a static webpage where it is run locally?
porterde
Perhaps small enough to include in ffmpeg itself so you can just write commands `ffmpeg do this thing I want`.
Now I say this, it seems like there should already be a shell that is also an LLM where you can mix bits of commands you vaguely remember and natural language a bit like Del Boy speaking French...
Alex-Programs
Warp terminal does that. It's cool.
pseudosavant
That would be an amazingly useful feature of ffmpeg, and considering how large its dependencies are (390MB of packages for `apt install ffmpeg` on a fresh Raspberry Pi OS install), it would be reasonable to have an optional model package.
7jjjjjjj
-c:s mov_text is unnecessary and in fact might be fucking things up
pseudosavant
I was kind of curious about that one too. It is encoding the SRT into the MP4/MOV subtitle format. I use it all the time when muxing subs into MP4s, but I haven't seen what happens with an MKV like that. It is very well supported in MP4s.
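If the -c:s mov_text override is the problem, the simplest variant to try is the same command without it, so the SRT stream is just copied. A sketch under that assumption (Matroska can hold SubRip text directly, as far as I know); remux_with_subs_cmd is a made-up helper that only builds the argument list.

```python
def remux_with_subs_cmd(video, subs, dst, start="00:00:05", end="00:01:00"):
    """Build an ffmpeg argv that stream-copies video plus an SRT file into dst."""
    return [
        "ffmpeg",
        "-i", video,
        "-i", subs,
        "-ss", start, "-to", end,
        "-map", "0", "-map", "1",
        "-c", "copy",  # no -c:s override: the subtitle stream is copied as-is
        dst,
    ]


print(" ".join(remux_with_subs_cmd("input.mp4", "subtitle.srt", "output.mkv")))
```

Since nothing is re-encoded, the cut points may snap to nearby keyframes rather than the exact timestamps.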
escapecharacter
I've just maintained my own note doc, going on 15 years now, of my most commonly used syntax. When that fails, I grep my bash history.
daveslash
Same. The only thing that sticks is converting from format X to .mp4. Everything else I need to look up every single time.
Relevant XKCD https://xkcd.com/1168/
dmd
Yeah I commented the other day, tongue firmly in cheek, that it's probably worth burning down all the rainforests just so LLMs can tell me the right ffmpeg flags to do what I want.
greenavocado
Don't forget that Gstreamer exists and its command line and documentation make a little bit more sense than ffmpeg because GStreamer is pipeline based and the composition is a little bit more sane. I stopped using ffmpeg entirely and only use GStreamer for intense video work.
jack_pp
Gstreamer can give you more control and has friendlier API's if you're gonna make a pipeline programmatically but for one off stuff ffmpeg seems much friendlier to me. For example it has sane x264 defaults while with gst-launch you have to really know what you're doing to get quality x264 encoding
legends2k
I thought FFmpeg is pipeline based too; graph of filters. Am I missing something? You can set up a complex graph of source, sink and transform filters.
lehi
FFmpeg's filter DSL is so good that I get annoyed if I ever have to fall back to command line switches.
radicality
I wish there was some sort of local gui / tool to drag and drop nodes, connect them together, type-check the graph if possible, and it would only show the ffmpeg command to run which you could paste. Anyone know of anything?
pbmahol
see lavfi-preview on github. It's a GUI app for libavfilter/FFmpeg filters
greenavocado
You're right, but gstreamer is a little bit more sane for many use cases. Maybe ffmpeg is more advanced; I am not sure. I find the pieces fit together better with gstreamer.
remram
GStreamer feels like abandonware though, they also got big vulnerabilities recently, and their docs are very defunct.
greenavocado
Not sure about the vulns but I am actively discussing things with the devs on their Matrix
AdieuToLogic
Here is the GitHub repo for a ffmpeg book which may be a nice supplement to this site:
Narciss
Oh this is nice, thank you!
merksoftworks
ffmpeg has always felt like a gui application crammed into tui format. I've had the displeasure of using the C api a few times; while it's straightforward in many respects, it makes invalid states extremely easy to represent. I would love a realtime AV1 encoding framework that "just works".
jmb99
> ffmpeg has always felt like a gui application crammed into tui format.
It’s one of the only tools where I reach for a GUI equivalent (Handbrake) by default, unless I’m doing batch processing. There are a few pure ffmpeg GUIs out there as well. There’s just something about working with video that CLI doesn’t work right with my brain for.
mastax
I can vouch for GStreamer as an API. I was using the Rust bindings so not super familiar with the C API but it looks good. GObject makes some things verbose but once you understand it you can interact with every object in the API consistently. There is a ton of necessary complexity (video is hard) but it’s really well designed and pretty well implemented and documented.
If you have a pretty normal use case the Bins (decodebin, transcodebin, playbin) make that pretty easy. If you have a more complex use case the flexibility of the design makes it possible.
garaetjjte
ffmpeg API is somewhat clunky but it works fine. I dread working with gstreamer, sea of leaky abstractions, inexplicable workarounds and mysterious bugs.
tiborsaas
I like this insight, but TUI is something graphical while ffmpeg is just CLI.
It would be cool to see if a TUI tool existed. Something like https://github.com/Twinklebear/fbed but more feature complete.
joshbaptiste
One thing on Linux systems I like to do is build ffmpeg statically.. as distro versions are sometimes too old or don't include modules I prefer.. this containerized version has done wonders for me https://github.com/wader/static-ffmpeg
franze
I love using FFMpeg via Wasm for ... senseless ... mini projects i.e.: https://video-2-sprites.franzai.com/ Video 2 Sprites Converter - totally over-engineered
dsp_person
Love the bouncing progress bar. Also nice ffmpeg wasm only 11MB
alpb
I love "X by Example" sites! But if you don't work with a tool like ffmpeg or imagemagick day in and day out, there's no way you'll remember their unintuitive syntax or will want to spend the time to get your one-time job done. I'd still probably not use this site to scan a dozen examples and try to put together the pieces of the puzzle; instead, I'd probably just use an LLM that has already scanned the entire web corpus and can probably get me to a solution faster, right? At that point, I wonder what folks get out of this site?
gariany
It's for when people google how do I do X. Also, I've built this site before chatgpt was a thing...
I've enjoyed using ffmpeg 1000% more since I was able to stop doing manually the tedious task of Googling for Stack Overflow answers and cobbling them into a command and got Chat GPT to write me commands instead.