Google is using YouTube videos to train its AI video generator
38 comments
·June 19, 2025
pier25
emodendroket
I think that phrase ought to be retired simply because even if you are paying money you often still are “the product.”
kube-system
If you're paying money, you still might not be a company's real customer: https://www.statista.com/statistics/1093781/distribution-of-...
emodendroket
I don’t think it’s even right to think that there’s one “real” customer and one “fake” one really. It seems like an oversimplified model that doesn’t accurately describe how anybody operates besides a mom-and-pop.
nuodag
Yes, and a way out of that is open source, where you aren't a customer…
pier25
Absolutely. See TVs for example. Price has gone down because they sell the data of what you're watching.
techjamie
Drop tens of thousands on a new vehicle at a stealership and you'll get sketchy companies offering warranties on your exact vehicle within a week and then forever-more afterwards.
The way I understand it, either the dealership, the software they enter your information into, or both typically sell off your data after the sale.
Also I get calls and letters from places asking to buy a vehicle I haven't owned since 2018 regularly.
josefritzishere
This is the most important thread here. I don't think even Andrew Lewis saw this coming. Now we are always the product because we lack digital rights. It's all been legislated away.
echelon
This should have resulted in an antitrust dismantlement by now. Google has every structural advantage in the world.
Years ago, Google would have been worth more if sold for parts. They were giving away far too much (and pissing on entire industries while doing so). Now they're activating all of those assets for strong, explosive incremental growth. It's hard to even call it incremental. More like checkmate world.
They're going to off so many businesses this decade and collect all the money.
They own the web, they own most of mobile, they control the other half of mobile, they own search, they own media, they own advertising. There's not a dollar that gets made that doesn't flow through Google somehow.
You can't even build a brand anymore without getting extorted by Google. You'll have your competitors paying to trademark squat you, and the browser itself defaults to Google search.
Google really needs to be split into about a half dozen companies. This is way bigger and way worse than Ma Bell.
hagbard_c
This comes as no surprise as I suspect many if not most other 'AI video generator' projects are being fed 'content' from Youtube, Vimeo, Rumble and any and all other accessible video sites - where else would they get a wide spectrum of video material to train on?
bgwalter
Here is a free business idea: Create an agentic "AI" video watcher. "AI" YouTube creators can register with the service, which will then watch their videos, will generate click-throughs to the advertisers and interact with the advertiser's web pages. The service is financed by profit sharing.
This streamlines video watching, which humans are notoriously slow at. It could lead to efficiency gains in video and ad watching that are practically unlimited.
kube-system
I'm guessing that is a facetious response, but in case it isn't: this is just plain old fraud.
JohnFen
I don't think that's fraud unless it's done by the channel operator. Me as an end user auto-clicking ads is not even in the same ballpark as actual fraud.
kube-system
> I don't think that's fraud unless it's done by the channel operator.
That's exactly what the parent comment suggested.
lovich
Sounds like you just need to execute on it fast enough that the government can't respond before you're too big to fight. Standard strategy.
kube-system
Not really. Click fraud isn't anything new, it has existed for decades, and there are many ways that it can be (and currently is) mitigated privately. The most common way is to ban, shadowban, or demonetize the offender. And if that doesn't work you can always be held civilly liable.
Contracting with others to commit fraud and violate contracts is not a good business idea even if you stay off the government's radar.
gauku
Almost sure that's a tongue-in-cheek response. Right?
bgwalter
Yes! I'm pretty sure though that given the current hype someone can come up with an elaborate legal and moral justification for increasing video watching efficiency.
add-sub-mul-div
Be Facebook and call it pivot-to-video.
kunzhi
Reading this article I couldn’t help but remember the Key & Peele skit about joke theft - “high on potenuse.” All this AI training feels similar to me on some level. Yeah, it’s “just making a copy,” but on the other hand the person who originated the idea doesn’t get to participate in the success.
Life is hard, but at least on the other hand, it’s also unfair.
JohnFen
genAI videos are already making YouTube worse than it was, and that trend is only starting. Maybe that, plus Google using user videos in this way, will finally allow one of their competitors to gain more traction.
paxys
Well, no shit.
Remember when OpenAI's CTO was asked to confirm that they don't use YouTube to train Sora and she evaded the question...?
Everyone is training on everything they can get their hands on, period.
cavisne
Hilariously, there was a story about how Google could not train on YouTube data due to their TOS, so they changed it for new videos. Meanwhile everyone else was scraping YouTube as much as they liked and training on it.
superkuh
Additionally, no one blocks googlebot even though it's being used just as much for LLM/etc AI training as any web spider out there. Too big to block. Too big to not use.
krunck
... because Youtube videos meet some minimum level of content quality?
adzm
There is just so much of it, on so many different topics. Especially esoteric things that aren't popular "influencer" things that everyone is going to think of initially.
kube-system
Most of it is better than this: https://www.youtube.com/watch?v=XQr4Xklqzw8
leumon
Just like with LLMs, for the base model you need quantity, not quality. It just needs to learn how to correctly predict the next frame.
add-sub-mul-div
If low quality influencer garbage is what people are watching, they'll be happy to generate more of it and I don't think they'll lose sleep about the quality.
greatgib
It kind of does make sense, like a Library would use the books at its disposal.
But what is not normal is that they will easily block, ban and sue you if you try to do the same, as if the catalog of content belonged to them.
kube-system
> It kind of does make sense, like a Library would use the books at its disposal.
Libraries don't really "use" books to produce anything, except to support accessibility like translations or indexing. Their lending of books is under the first-sale doctrine, which wouldn't be applicable to YouTube videos streamed under license.
> But what is not normal is that they will easily block, ban and sue you if you try to do the same, as if the catalog of content belonged to them.
Because they do have rights to the content. All of the content on YouTube has been licensed to YouTube, and the licensor has assigned some rights to them.
echoangle
Does it not? Do you not give those rights to YouTube once you upload a video?
throwaway29843
Not all YouTube videos are uploaded by their rightholders though. There's plenty of stuff reuploaded from other platforms, which Google is feeding into their AI indiscriminately.
To no one's surprise. If you're not the customer you're the product.