Stargate Project: SoftBank, OpenAI, Oracle, MGX to build data centers
560 comments
·January 21, 2025
serjester
You have to keep in mind Microsoft is planning on spending almost 100B in datacenter capex this year and they're not alone. This is basically OpenAI matching the major cloud providers' spending.
This could also be (at least partly) a reaction to Microsoft threatening to pull OpenAI's cloud credits last year. OpenAI wants to maintain independence and with compute accounting for 25–50% of their expenses (currently) [2], this strategy may actually be prudent.
[1] https://www.cnbc.com/2025/01/03/microsoft-expects-to-spend-8...
throitallaway
Microsoft has lots of revenue streams tied to that capex outlay. Does OpenAI have similar revenue numbers to Microsoft?
tuvang
OpenAI has a very healthy revenue stream in the form of other companies throwing money at them.
But to answer your question: no, they aren’t even profitable by themselves.
manquer
> they aren’t even profitable
Depends on your definition of profitability. They are not recovering R&D and training costs, but they (and MS) are recouping inference costs from user subscription and API revenue with a healthy operating margin.
Today they will not survive if they stop investing in R&D, but they do have to slow down at some point. It looks like they and other big players are betting on a moat they hope to build with the $100B DCs and ASICs that open weight models or others cannot compete with.
This will be either because training will be too expensive (few entities have a $10B+ training budget and no need to monetize it), or because even where such models are available, it may be impossible to run inference on them with off-the-shelf GPUs, i.e. these models can only run on ASICs, which only large players will have access to[1].
In this scenario corporations will have to pay them for the best models; when that happens, OpenAI can slow down R&D and become profitable with capex considered.
[1] This is a natural progression in a compute-bottlenecked sector; we saw a similar evolution from CPUs to ASICs and GPUs in crypto a few years ago. It is a slightly distorted comparison due to the switch from PoW to PoS and the intentional design for GPUs in some coins; even then you needed DC-scale operations in a cheap-power location to be profitable.
MR4D
Given the release of the new DeepSeek R1 model [0], OpenAI’s future revenue stream is probably more at risk than it was a week ago.
[0] - https://arstechnica.com/ai/2025/01/china-is-catching-up-with...
tantalor
That's like saying I have a healthy revenue stream from my credit card.
SecretDreams
Serious question - why Texas???
tempusalaria
Texas is a world leader in renewable energy. Easy permitting, lots of space, lots of existing grid infrastructure from the o&g industry.
LarsDu88
My kneejerk response was to point to the incoming administration, but the fact Stargate has been in the works for more than a year now says to me it's because of tax credits.
wilson090
It's where the energy is for this project.
This is unfortunately paywalled but a good writeup on how the datacenter came to be: https://www.theinformation.com/articles/why-openai-and-oracl...
PittleyDunkin
.
oldpersonintx
Who is "we"?
This isn't your money
kdmtctl
It is not. But this kind of money does have an impact on society in any field. So this is a proper concern.
heydenberk
~$125B per year would be 2-3% of all domestic investment. It's similar in scale to the GDP of a small middle income country.
If the electric grid — particularly the interconnection queue — is already the bottleneck to data center deployment, is something on this scale even close to possible? If it's a rationalized policy framework (big if!), I would guess there's some major permitting reform announcement coming soon.
constantcrying
They say this will include hundreds of thousands of jobs. I have little doubt that dedicated power generation and storage is included in their plans.
Also I have no doubt that the timing is deliberate and that this is not happening without government endorsement. If I had to guess the US military also is involved in this and sees this initiative as important for national security.
cmdli
Is there really any government involvement here? I only see Softbank, Oracle, and OpenAI pledging to invest $500B (over some timescale), but no real support on the government end outside of moral support. This isn't some infrastructure investment package like the IRA, it's just a unilateral promise by a few companies to invest in data centers (which I'm sure they are doing anyway).
beezle
hundreds of thousands of jobs? I'll wait for the postmortem on that prediction. Sounds a lot like Foxconn in Wisconsin but with more players.
shrubble
Just as there is an AWS for the public and something similar only for Federal use, there could be AI cloud services available to the public and a separate cloud service for Federal use. I am sure that military intelligence agencies, etc. would like to buy such a service.
szvsw
AWS GovCloud already exists FYI (as you hinted) and it is absolutely used by the DoD extensively already.
n2d4
Yes, Trump announced this as a massive foreign investment coming into the US: https://x.com/WatcherGuru/status/1881832899852542082
JumpCrisscross
> It's similar in scale to the GDP of a small middle income country
I’ve been advocating for a data centre analogue to the Heavy Press Programme for some years [1].
This isn’t quite it. But when I mapped out costs, $1tn over 10 years was very doable. (A lot of it would go to power generation and data transmission infrastructure.)
markus_zhang
Maybe they will invest in nuclear reactors.
Data center, AI and nuclear power stations. Three advanced technologies, that's pretty good.
UltraSane
They are trying. Microsoft wants to restart the Three Mile Island reactor. And other companies have been signing contracts for small modular reactors. SMRs are a perfect fit for modern data centers IF they can be made cheaply enough.
jonisgold
I think this is right- data centers powered by fission reactors. Something like Oklo (https://oklo.com) makes sense.
ericcumbee
Watching the press conference, onsite power production was mentioned. I assume this means SMRs and solar.
jazzyjackson
just as likely to be natural gas or a combination of gas and solar. I don't know what supply chain looks like for solar panels, but I know gas can be done quickly [1], which is how this money has to be spent if they want to reach their target of 125 billion a year.
> The companies said they will develop land controlled by Wise Asset to provide on-site natural gas power plant solutions that can be quickly deployed to meet demand in the ERCOT.
> The two firms are currently working to develop more than 3,000 acres in the Dallas-Fort Worth region of Texas, with availability as soon as 2027
[0] https://www.datacenterdynamics.com/en/news/rpower-and-wise-a...
[1.a] https://enchantedrock.com/data-centers/
[1.b] https://www.powermag.com/vistra-in-talks-to-expand-power-for...
toomuchtodo
US domestic PV module manufacturing capacity is ~40GW/year.
gunian
could something of this magnitude be powered by renewables only?
apsec112
I don't think any assembly line exists that can manufacture and deploy SMRs en masse on that kind of timeframe, even with a cooperative NRC
mikeyouse
There have been literally 0 production SMR deployments to date so there’s no possibility they’re basing any of their plans on the availability of them.
dhx
Hasn't the US decided to prefer nuclear and fossil fuels (most expensive generation methods) over renewables (least expensive generation methods)?[1][2]
I doubt the US choice of energy generation is ideological so much as a practicality. China absolutely dominates renewables with 80% of solar PV modules manufactured in China and 95% of wafers manufactured in China.[3] China installed a world record 277GW of new solar PV generation in 2024 which was a 45% year-on-year increase.[4] By contrast, the US only installed ~1/10th this capacity in 2024 with only 14GW of solar PV generation installed in the first half of 2024.[5]
[1] https://en.wikipedia.org/wiki/Cost_of_electricity_by_source
[2] https://www.iea.org/data-and-statistics/charts/lcoe-and-valu...
[3] https://www.iea.org/reports/advancing-clean-technology-manuf...
[4] https://www.pv-magazine.com/2025/01/21/china-hits-277-17-gw-...
[5] https://www.energy.gov/eere/solar/quarterly-solar-industry-u...
cavisne
Much more likely is what xAI did, portable gas turbines until the grid catches up.
cameldrv
One possibility would be just to build their own power plants colocated with the datacenters and not interconnect at all.
jiggawatts
Notably it is significantly more than the revenue of either AWS or Azure. It is very comparable to the sum of both, but consolidated into the continental US instead of distributed globally.
deelowe
DCs will start generating power on site soon. I know micro nuclear is one area actively being explored.
jscottbee
Small or modular reactors in the US are more than 10 years away, probably more like 15-20. These are facts, not made-up political claims or techno-snob pipe dreams.
JumpCrisscross
> Small or modular reactors in the US are more than 10 years away, probably more like 15-20
Could be 5 to 10 with $20+ bn/year in scale and research spend.
Trump is screwing over his China hawks. The anti-China and pro-nuclear lobbies have significant overlap; this could be how Trump keeps e.g. Peter Thiel from going thermonuclear on him.
dwnw
Don't worry, they said they are doing it in Texas where the power grid is super reliable and able to handle the massive additional load.
dang
"Don't be snarky."
"Eschew flamebait."
Let's not have regional flamewar on HN please.
dwnw
Not guilty. No sarcasm intended, of course. If your guidelines are so broad to include this, you should work on them, and in turn, yourself.
Governor says our power grid is the best in the universe. Why don't you believe us?
Stop breaking your own rules.
"Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith."
"Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something."
Let's not ruin HN with overmoderation. This kind of thing is no longer in fashion, right?
lvl155
Probably because they don’t have to deal with energy-related regulations…
llamaimperative
That was sarcasm, the Texas grid falls over pretty much annually at this point.
heydenberk
Say what you will about Texas, but they are adding energy capacity, renewables especially, at a much faster rate than any comparable state.
segasaturn
How much capacity do solar and wind add compared to nuclear, per square foot of land used? Also I thought the new administration was placing a ban on new renewable installations.
CapcomGo
Ok but their grid sure seems to fail a lot.
dwnw
Probably the first state to power all those renewables down at the whim of the president too.
383toast
Where are they getting the $500B? Softbank's market cap is 84b and their entire vision fund is only $100b, Oracle only has $11b cash on hand, OpenAI's only raised $17b total...
philipwhiuk
MGX has at least $100bn: https://www.theinformation.com/articles/a-100-billion-middle...
This is Abu Dhabi money.
petesergeant
This would be a large outlay even for UAE, who would be giving it to a direct competitor in the space: UAE is one of the few countries outside of the US who are in any way serious about AI.
notatoad
there doesn't appear to be any timeline announced here. the article says the "initial investment" is expected to be $100bn, but even that doesn't mean $100bn this year.
if this is part of softbank's existing plan to invest $100bn in ai over the next four years, then all that's being announced here is that Sama and Larry Ellison wanted to stand on a stage beside trump and remind people about it.
HotHotLava
The literal first sentence of the announcement is:
> The Stargate Project is a new company which intends to invest $500 billion over the next four years
themagician
Softbank is being granted a block of TRUMP MEMES, the price of which will skyrocket when they are included in the bucket of crypto assets purchased as part of the crypto reserve.
1oooqooq
how I wish that was a joke...
TuringNYC
>> Where are they getting the $500B? Softbank's market cap is 84b and their entire vision fund is only $100b, Oracle only has $11b cash on hand, OpenAI's only raised $17b total...
1. The outlays can be over many years.
2. They can raise debt. People will happily invest at modest yields.
3. They can raise an equity fund.
jameshart
Soooo this isn’t so much ‘announcing an investment’ as ‘announcing an investment opportunity’?
Why not continue:
4. They can start a kickstarter or go fund me
5. They can go on Dragons’ Den
…
TuringNYC
>> 4. They can start a kickstarter or go fund me
Debt/Equity Fundraising is basically a kickstarter! Remarkably similar.
griomnib
6. ??? 7. Profit.
sangnoir
4. The US government can chip in via grants, tax breaks or contracts.
It's all very Dr. Strangelove. "Mr. President, we must not allow an AI gap! Now give us billions"
dkrich
Psst: it’s probably going to end up being a fraction of that but doesn’t make for as good a headline
LarsDu88
Quite possibly pulled out of their asses...
If Son can actually build a 500B Vision Fund it can only come from one of two places...
somehow the dollar depreciates radically OR Saudis
Vision Fund was heavily invested in by the Saudis so...
jhallenworld
Oracle's cash on hand is presumably irrelevant- I think they are on the receiving end of the money, in return for servers. No wonder Larry Ellison was so fawning.
Is this a good investment by Softbank? Who knows.. they did invest in Uber, but also have many bad investments.
handfuloflight
Sleight of hand with the phrasing "up to" $500B.
JSTrading
Wasn’t this announced months ago? I feel like it was. https://www.techradar.com/pro/could-amd-be-the-key-to-micros...
gilgoomesh
Interesting that 6 months ago, Microsoft was attached but now they're missing from today's announcement.
Maxious
Scroll down:
> Other partners in the project include Microsoft, investor MGX and the chipmakers Arm and NVIDIA, according to separate statements by Oracle and OpenAI.
daveguy
Well, I've never known Trump to take credit for something someone else did.
lantry
yeah, it sounds like they're just relabeling an existing plan
> Ellison noted that the data centers are already under construction with 10 being built so far.
MichaelMoser123
The moon program was $318 billion in 2023 dollars, this one is $500 billion. So that's why the tech barons who were present at the inauguration were high as a kite yesterday, they just got the financing for a real moon shot!
aurareturn
To be fair, it’s not easy to monetize the moon program into profitability. This has a far better shot of sustaining profitability.
TheAceOfHearts
I'm confused and a bit disturbed; honestly having a very difficult time internalizing and processing this information. This announcement is making me wonder if I'm poorly calibrated on the current progress of AI development and the potential path forward. Is the key idea here that current AI development has figured out enough to brute force a path towards AGI? Or I guess the alternative is that they expect to figure it out in the next 4 years...
I don't know how to make sense of this level of investment. I feel that I lack the proper conceptual framework to make sense of the purchasing power of half a trillion USD in this context.
Davidzheng
Let me avoid the use of the word AGI here because the term is a little too loaded for me these days.
1) reasoning capabilities in latest models are rapidly approaching superhuman levels and continue to scale with compute.
2) intelligence at a certain level is easier to achieve algorithmically when the hardware improves. There's also a larger path to intelligence and often simpler mechanisms
3) most current generation reasoning AI models leverage test time compute and RL in training--both of which can make use of more compute readily. For example, RL on coding against compilers, or on proofs against verifiers.
All of this points to compute now being basically the only bottleneck to massively superhuman AIs in domains like math and coding--rest no comment (idk what superhuman is in a domain with no objective evals)
philipwhiuk
You can't block AGI on a whim and then deploy 'superhuman' without justification.
A calculator is superhuman if you're prepared to put up with its foibles.
Davidzheng
It is superhuman in a very specific domain. I didn't use AGI because its definitions are one of two flavors.
One, capable of replacing some large proportion of global gdp (this definition has a lot of obstructions: organizational, bureaucratic, robotic)...
Two, it is difficult to find problems which an average human can solve but the model cannot. The problem with this definition is that the distinct nature of AI intelligence and the broadness of tasks are such that this metric is probably only achievable after AI is, in reality, already massively superhuman in aggregate. Compare this with Go AIs, which were massively superhuman and often still failed to count ladders correctly, which was also fixed by more scaling.
All in all, I avoid the term AGI because for me AGI means comparing average intelligence on broad tasks relative to humans, and I'm already not sure whether it's achieved by current models, whereas superhuman research math is clearly not achieved because humans are still making all of the progress on new results.
lossolo
> All of this points to compute now being basically the only bottleneck to massively superhuman AIs
This is true for brute force algorithms as well and has been known for decades. With infinite compute, you can achieve wonders. But the problem lies in diminishing returns[1][2], and it seems things do not scale linearly, at least for transformers.
1. https://www.bloomberg.com/news/articles/2024-12-19/anthropic...
2. https://www.bloomberg.com/news/articles/2024-11-13/openai-go...
HarHarVeryFunny
The largest GPU cluster at the moment is xAI's 100K H100s, which is ~$2.5B worth of GPUs. So something 10x bigger (1M GPUs) is $25B, and add $10B for a 1GW nuclear reactor.
This sort of $100-500B budget doesn't sound like training cluster money, more like anticipating massive industry uptake and multiple datacenters running inference (with all of corporate America's data sitting in the cloud).
dauhak
> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI?
My sense anecdotally from within the space is yes people are feeling like we most likely have a "straight shot" to AGI now. Progress has been insane over the last few years but there's been this lurking worry around signs that the pre-training scaling paradigm has diminishing returns.
What recent outputs like o1, o3, DeepSeek-R1 are showing is that that's fine, we now have a new paradigm around test-time compute. For various reasons people think this is going to be more scalable and not run into the kind of data issues you'd get with a pre-training paradigm.
You can definitely debate on whether that's true or not but this is the first time I've been really seeing people think we've cracked "it", and the rest is scaling, better training etc.
catmanjan
This has nothing to do with technology it is a purely financial and political exercise...
ilaksh
I think the only way you get to that kind of budget is by assuming that the models are like 5 or 10 times larger than most LLMs, and that you want to be able to do a lot of training runs simultaneously and quickly, AND build the power stations into the facilities at the same time. Maybe they are video or multimodal models that have text and image generation grounded in a ton of video data which eats a lot of VRAM.
layer8
> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI?
It rather means that they see their only chance for substantial progress in Moar Power!
petesergeant
> Is the key idea here that current AI development has figured out enough to brute force a path towards AGI? Or I guess the alternative is that they expect to figure it out in the next 4 years...
Can't answer that question, but, if the only thing to change in the next four years was that generation got cheaper and cheaper, we haven't even begun to understand the transformative power of what we have available today. I think we've felt like 5-10% of the effects that integrating today's technology can bring, especially if generation costs come down to maybe 1% of what they currently are, and latency of the big models becomes close to instantaneous.
lvl155
It appears this basically locks out Google, Amazon and Meta. Why are we declaring OpenAI as the winner? This is like declaring Netscape the winner before the dust settled. Having the govt involved in this manner can’t be a good thing.
VectorLock
Since the CEOs of Google, Amazon and Meta were seated in the front row of the inauguration, IN FRONT OF the incoming cabinet, I'm pretty confident their techno-power-barrel will come via other channels.
jvm___
Broligarchs
skepticATX
Interestingly, there seems to be no actual government involvement aside from the announcement taking place at the White House. It all seems to be private money.
trhway
Government enforcing or relaxing/fast-tracking regulations and permits can kill or propel even a 100B project, and thus can be thought of as having its own value on the scale of the given project’s monetary investment, especially in the case of a will/favor/whim-based government instead of a hard-rules-based deep-state one.
cmdli
Isn't that a state and local-level thing, though? I can't imagine that there is much federal permitting in building a data center, unless it is powered by a nuclear reactor.
rcpt
Yeah but the linked article makes it seem like the current, one-day-old, administration is responsible for the whole thing.
janalsncm
The article also mentions that this all started last year.
HarHarVeryFunny
Trump just tore up Biden's AI safety bill, so this is OpenAI's thank-you - let Trump take some credit
modeless
I generally agree that government sponsorship of this could be bad for competition. But Google in particular doesn't necessarily need outside investment to compete with this. They're vertically integrated in AI datacenters and they don't have to pay Nvidia.
shuckles
Google definitely needs outside investment to spend $500b on capex.
modeless
They don't have to spend $500B to compete. Their costs should be much lower.
That said, I don't think they have the courage to invest even the lower amount that it would take to compete with this. But it's not clear if it's truly necessary either, as DeepSeek is proving that you don't need a billion to get to the frontier. For all we know we might all be running AGI locally on our gaming PCs in a few years' time. I'm glad I'm not the one writing the checks here.
jonas21
Over what time frame? They could easily spend that much over the next 5 to 10 years without outside investment (and they probably will).
chairmansteve
TFA says $100 billion. The $500 billion is maybe, eventually.
misiti3780
Probably not a popular opinion - but I actually think Google is winning this now. Deep Research is the most useful AI product I have used (Claude is significantly more useful than OpenAI)
impulser_
Because this is Oracle's and OpenAI's project with SoftBank and MGX as investors.
jazzyjackson
It's who you know. Sam is buddies with Masa, simple as.
OutOfHere
I am not sure if OpenAI will be the winner despite this investment. Currently, I see various DeepSeek AI models as offering much more bang for the buck at a vastly cheaper cost for small tasks, but not yet for large context tasks.
layer8
Amazon MGM will do the media tie-ins. ;)
qgin
How involved is the government at all? I’m still having a hard time seeing how Trump or anyone in the government is involved except to do the announcement. These are private companies coming together to do a deal.
jparishy
I hear this joked about sometimes or used as a metaphor, but in the literal sense of the phrase, are we in a cold war right now? These types of dollars feel "defense-y", if that makes sense. Especially with the big focus on energy, whatever that ends up meaning. Defense as a motivation can get a lot done very fast so it will be interesting to watch, though it raises the hair on my arms
kube-system
Absolutely
for instance: https://en.wikipedia.org/wiki/2024_United_States_telecommuni...
jparishy
Right, but they've been doing that for a while, to everyone. The US is much quieter about it, right? But you can twist this move and see how the gov would not want to display that level of investment within itself, as it could be interpreted as a sign of aggression. But it makes sense to me that they'd have no issue working through corporations to achieve the same ends while being able to deny direct involvement.
kube-system
I don't think this administration is worried too much about showing aggression. If anything they are embracing it. Today was the first full day, and they have already threatened the sovereignty of at least four nations.
distortionfield
We certainly are, if you ask me. Especially when you realize that we haven’t had official comms with Russia since the war in Ukraine broke out.
etblg
The US government and its media partners sure seem to think so.
non-
Any clues to how they plan to invest $500 billion? What infrastructure are they planning that will cost that much?
burnte
That was literally my question. Is this basically just for more datacenters, NVidia chips, and electricity with a sprinkling of engineers to run it all? If so, then that $500bn should NOT be invested in today's tech, but instead in making more powerful and power efficient chips, IMO.
kristianp
Nvidia and TSMC are already working on more powerful and efficient chips, but the physical limits to scaling mean lots more power is going to be used in each new generation of chips. They might improve by offering specific features such as FP4, but Moore's law is still dead.
bitmasher9
I don’t know if $500bn could put anyone ahead of Nvidia/TSMC.
amluto
$500bn of usefully deployed engineering, mostly software, seems like it would put AMD far ahead of Nvidia. Actually usefully deploying large amounts of money is not so easy, though, and this would still go through TSMC.
entropicdrifter
Nvidia's in on it, so presumably this is a doubling-down on Nvidia as the chip developers
Havoc
Add some nuclear power and you’ve suddenly got a big bill
bdangubic
if only $500bn was enough to make more powerful and power efficient chips…
patall
He wanted to do that, but would have needed 5T for that. Only got 100 bn so far, so this is what you get (only slightly /s)
TrainedMonkey
I'll make a wild guess that they will be building data centers and maybe robotic labs. They are starting with 100B of money committed mostly by Softbank, but probably not yet transacted.
> building new AI infrastructure for OpenAI in the United States
The carrot is probably something like - we will build enough compute to make a super intelligence that will solve all the problems, ???, profit.
K0balt
If we look at the processing requirements in nature, I think that the main trend in AI going forward is going to be doing more with less, not doing less with more, as the current scaling is going.
Thermodynamic neural networks may also basically turn everything on its ear, especially if we figure out how to scale them like NAND flash.
If anything, I would estimate that this is a space-race type effort to “win” the AI “wars”. In the short term, it might work. In the long term, it’s probably going to result in a massive glut in accelerated data center capacity.
The trend of technology is towards doing better than natural processes, not doing it 100000x less efficiently. I don’t think AI will be an exception.
If we look at what is -theoretically- possible using thermodynamic wells, with current model architectures, for instance, we could (theoretically) make a network that applies 1t parameters in something like 1cm2. It would use about 20watts, back of the napkin, and be able to generate a few thousand T/S.
Operational thermodynamic wells have already been demonstrated in silicon. There are scaling challenges, cooling requirements, etc., but AFAIK no theoretical roadblocks to scaling.
Obviously, the theoretical doesn’t translate to results, but it does correlate strongly with the trend.
So the real question is, what can we build that can only be done if there are hundreds of millions of NVIDIA GPUs sitting around idle in ten years? Or alternatively, if those systems are depreciated and available on secondary markets?
What does that look like?
jppope
Reasonably speaking, there is no way they can know how they plan to invest $500 billion. The current generation of large language models basically use all human text that's ever been created for the parameters... not really sure where you go after that using the same tech.
Philpax
That's not really true - the current generation, as in "of the last three months", uses reinforcement learning to synthesize new training data for themselves: https://huggingface.co/deepseek-ai/DeepSeek-R1-Zero
bandrami
It worked well for the Habsburg family; what could go wrong?
XorNot
Right but that's kind of the point: there's no way forward which could benefit from "moar data". In fact it's weird we need so much data now - i.e. my son in learning to talk hardly needs to have read the complete works of Shakespeare.
If it's possible to produce intelligence from just ingesting text, then current tech companies have all the data they need from their initial scrapes of the internet. They don't need more. That's different to keeping models up to date on current affairs.
jazzyjackson
It seems to me you could generate a lot of fresh information from running every youtube video, every hour of TV on archive.org, every movie on the pirate bay -- do scene by scene image captioning + high quality whisper transcriptions (not whatever junk auto-transcription YouTube has applied), and use that to produce screenplays of everything anyone has ever seen.
I'm not sure why I've never heard of this being done, it would be a good use of GPUs in between training runs.
jensvdh
The fact that OpenAI can just scrape all of Youtube and Google isn't even taking legal action or attempting to stop it is wild to me. Is Google just asleep?
ilaksh
I think that this is the obvious path to more robust models -- grounding language on video.
airstrike
Don't forget every hour of news broadcasting, of which we likely won't run out any time soon. Plus high quality radio
miltonlost
> a lot of fresh information from running every youtube video
EVERY youtube video?? Even the 9/11 truther videos? Sandy Hook conspiracy videos? Flat earth? Even the blatantly racist? This would be some bad training data without some pruning.
cavisne
The new scaling vector is “test time compute” ie spending more compute in inference.
riku_iki
I think there is huge amount of corporate knowledge.
croddin
This could be a clue
layer8
I’m more interested in how they plan to draw the rest of the damn owl.
lukeplato
hopefully nuclear power plants
HarHarVeryFunny
They are going to buy 50 $10B nuclear aircraft carriers and use them as a power source.
thecrumb
"create hundreds of thousands of American jobs"... Given the current educational system in the US, this should be fun to watch. Oh yeah, Musk and his H-1B Visa thing. Now it's making sense.
jedberg
If they're creating that many jobs, it means most of them are construction work.
Skilled labor for sure, but not necessarily college educated.
raphman
How does this work out in the long term? Operating a data center does not require that many blue-collar workers.
I'm imagining a future where the US builds a Tower of Babel from thousands of data centers just to keep people employed and occupied. Maybe also add in some paperclip factories¹?
jedberg
I doubt these are permanent jobs. This project will create a ton of temporary work though!
bdangubic
you put Trump (implicitly) and “long-term” in the same sentence… :)
dwnw
How many jobs will it net if "successful" and the AI eliminates jobs?
stevenwoo
This is what the 2024 Nobel prize winners in economics call "creative destruction" to repeat from their book Why Nations Fail. They really did not have a lot of sympathy for those they lumped in with Luddites who were collateral damage to progress.
kortilla
Data centers are nearly all blue collar work.
FergusArgyll
If you're familiar with this kind of work, please elaborate!
Do you mean building the centers or maintenance or both?
insane_dreamer
maybe this is to employ the hundreds of thousands of federal employees that are about to lose their jobs?
patall
Last year, sama's goal was 5 to 7T. Now he is going with 100B, with an option for another 400B. Huge numbers, but it still feels like a bit of a downturn.
Havoc
Let’s be real, the 5T was a wild ass guess
OutOfHere
I think that coming down from 5T to 0.5T means that TSMC cannot be reproduced locally, but everything else is on the table. At least TSMC has a serious roadmap for its Arizona fab facility, so that too is domestically captured, although not its latest gen fab.
We changed the URL from https://openai.com/index/announcing-the-stargate-project/ to a third-party report. Readers may want to read both. If there's a better URL, we can change it again.