The Gentle Singularity
514 comments · June 10, 2025
grafmax
yencabulator
> The utopia Altman describes won’t appear.
Sure it will, as far as Altman is concerned. To make the whole post make sense, add "... for the rich" where appropriate.
sjducb
The problem is that housing and health insurance are too expensive. Tech isn’t responsible for either of those problems.
BriggyDwiggs42
Parent didn’t claim tech was responsible for every problem? Housing prices are likely an inequality issue; as a greater portion of money in the economy is held by rich people, more money is invested and less is spent on goods/services, hence a scarce asset like land sees an increase in value.
kjkjadksj
In a way it is. Why are housing costs so high in Redmond, WA? The result of an influx of high income tech workers to the local housing market and the resulting shift of prices such as to eventually dilute the utility of that high salary to begin with. People in the area without a hook on that whale are of course left high and dry.
uses
No... the reason housing prices are high and rising is that not enough housing is getting built in the places people want to live. The main reason for that is that the people who already live in those places can block construction of new housing. That, and zoning.
skeaker
Citation needed, my understanding was that housing prices are being driven up by real estate owners/agencies buying tons of property to either rent at extortionary prices or sit on until they sell for a higher price to the next sucker. Also stuff like the RealPage scandal where they simply illegally collude: https://news.ycombinator.com/item?id=41330007
I think the idea of a law that only allows a limited number of owned properties per person and requires them to actually be using those properties would be interesting to alleviate this.
_DeadFred_
I mean, home prices went up insanely in California due to tech. Many people cashed out and bought homes in cheaper locations...driving up the housing prices there beyond what locals could afford.
How did Hacker News already forget these things?
nradov
Real wages have risen a lot since 1980 when you include employer contributions to employee health insurance.
Rooster61
It's difficult for me to call those wages "real" when medical costs have been gouged so absurdly that they eat up those contributions. Those increases have had no real impact on the average consumer, and they are profoundly awful for those without access to employment that provides that insurance.
nradov
That's not entirely accurate. Wages are real enough whether paid out as cash, or paid to a third party for the employee's benefit. Some medical costs are unreasonably inflated, but on the other hand much of the cost increase reflects greater capabilities. We can effectively treat many conditions today that would have been a death sentence in 1980, but some of those cutting edge treatments cost >$1M per patient. That doesn't directly benefit the average healthy consumer, but it can help a lot if you get sick.
I do agree that it makes no logical sense to couple medical insurance to employment. This system was created sort of accidentally as a side effect of wartime tax law and has persisted mainly due to inertia.
kjkjadksj
All costs. New roof on the house? $30k. Hail damage on the car? It’s actually totaled. Whatever is in the store? Inflation certainly hit there too.
boole1854
Even without including employer health insurance costs, real wages are up 67% since 1980.
Source: https://fred.stlouisfed.org/graph/?g=1JxBn
Details: uses the "Wage and salary accruals per full-time-equivalent employee" time series, which is the broadest wage measure for FTE employees, and adjusts for inflation using the PCE price index, which is the most economically meaningful measure of "how much did prices change for consumers" (and is the inflation index that the Fed targets)
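Mechanically, "adjusting for inflation using the PCE price index" just means deflating each year's nominal wage by that year's index before comparing. A minimal sketch of the calculation; the numbers below are invented for illustration, not the actual FRED series values:

    # Illustrative only: hypothetical numbers, not the real wage or PCE data.
    nominal_wage_1980 = 15_000   # nominal $/year (made up)
    nominal_wage_2024 = 80_000   # nominal $/year (made up)
    pce_index_1980 = 40.0        # PCE price index level (made up)
    pce_index_2024 = 125.0       # PCE price index level (made up)

    real_1980 = nominal_wage_1980 / pce_index_1980
    real_2024 = nominal_wage_2024 / pce_index_2024
    growth = real_2024 / real_1980 - 1
    print(f"real wage growth: {growth:.0%}")  # deflate each year, then compare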
fusionadvocate
How has inflation behaved since 1980?
xboxnolifes
No, that's not a real wage increase, that's a nominal wage increase. If I make $20k more but health insurance costs also went up $20k, my real wage did not change. I am no richer.
nradov
No, that's a real wage increase. The healthcare system can effectively treat a lot more conditions than it could in 1980. That makes you richer as measured in terms of QALYs.
insane_dreamer
Not when you account for the insane rise in cost of health care.
ImHereToVote
Does it correct for housing costs?
azan_
> Real wages haven’t risen since 1980.
Do people really believe that? I think people either have too rosy a view of the 80s or think that real wages should also adjust for lifestyle inflation.
naryJane
Yes. It’s even a part of Ray Dalio’s speeches on the topic. Here is one example where he mentions it: https://www.linkedin.com/pulse/why-how-capitalism-needs-refo...
tim333
>Real wages haven’t risen since 1980
is a very US thing. In China they've probably 10xd over that time.
hollerith
Even after that 10x growth, median Chinese household income is only 13-16% of US median household income:
China: ~$10,000 – $12,000
US: ~$74,580 (U.S. Census Bureau, 2022)
Y_Y
I think GDP (PPP) per capita is a better measure here, but the story is similar. From the IMF's 2025 numbers: USA $90k, China $29k, world average $26k.
Lovely tabulation of the data here: https://en.wikipedia.org/wiki/List_of_countries_by_GDP_(PPP)...
ethbr1
Purchasing power parity matters a lot too.
grafmax
That is because of China's and the US's positions in the global system over this time. The wage/labor/inequality story is broadly true across the global north; China can credit forward-thinking central planning, social programs, and industrialization for its economic progress (yet it continues to live under authoritarian rule).
tim333
It varies a lot - check this out https://www.businessinsider.com/france-vs-us-real-wages-2015...
The US is kind of an outlier.
I guess if you have the opportunity to have your stuff made by cheap or free labour, whether low-paid Chinese workers or AI robots, societies have a choice in how to distribute the benefits, which has varied from everything-to-the-rich in the US to fairly even in, say, France. Such choices will be important with AI.
Herring
Lots of people don't care about "progress" in an absolute sense, eg longer healthier lifespans for all. They only care about it in a relative sense, eg if cop violence against minorities goes down, they feel anxiety and resentment. They really really want to be the biggest fish in the little pond. That's how a caste system works, it "makes a captive of everyone within it". Equality feels like a demotion [1].
That's why we have a whole thing about immigration going on. It's the one issue that the president is not underwater on right now [2]. You can't get much of a labor movement like this.
[1] https://www.texasobserver.org/white-people-rural-arkansas-re...
[2] https://www.natesilver.net/p/trump-approval-ratings-nate-sil...
flessner
> Already we live with incredible digital intelligence, and after some initial shock, most of us are pretty used to it. Very quickly we go from being amazed that AI can generate a beautifully-written paragraph to wondering when it can generate a beautifully-written novel;
It was probably around 7 years ago when I first got interested in machine learning. Back then I followed a crude YouTube tutorial which consisted of downloading a Reddit comment dump and training an ML model on it to predict the next character for a given input. It was magical.
I always see LLMs as an evolution of that. Instead of the next character, it's now the next token. Instead of GBs of Reddit comments, it's now TBs of "everything". Instead of millions of parameters, it's now billions of parameters.
Over the years, the magic was never lost on me. However, I can never see LLMs as more than a "token prediction machine". Maybe throwing more compute and data at it will at some point make it so great that it's worthy to be called "AGI" anyway? I don't know.
Well anyway, thanks for the nostalgia trip on my birthday! I don't entirely share the same optimism - but I guess optimism is a necessary trait for a CEO, isn't it?
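For anyone who never did that kind of tutorial, here is a toy sketch of the same next-character idea, using plain bigram counts rather than the neural net the tutorial trained (so it only gestures at what the ML version does):

    import random
    from collections import Counter, defaultdict

    corpus = "the cat sat on the mat. the dog sat on the log."
    counts = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        counts[prev][nxt] += 1  # how often nxt follows prev in the training text

    def next_char(prev: str) -> str:
        options = counts.get(prev)
        if not options:
            return random.choice(corpus)
        chars, weights = zip(*options.items())
        return random.choices(chars, weights=weights)[0]

    text = "t"
    for _ in range(40):
        text += next_char(text[-1])
    print(text)  # babbles plausible-ish character sequences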
helloplanets
What's your take on Anthropic's 'Tracing the thoughts of a large language model'? [0]
> To write the second line, the model had to satisfy two constraints at the same time: the need to rhyme (with "grab it"), and the need to make sense (why did he grab the carrot?). Our guess was that Claude was writing word-by-word without much forethought until the end of the line, where it would make sure to pick a word that rhymes. We therefore expected to see a circuit with parallel paths, one for ensuring the final word made sense, and one for ensuring it rhymes.
> Instead, we found that Claude plans ahead. Before starting the second line, it began "thinking" of potential on-topic words that would rhyme with "grab it". Then, with these plans in mind, it writes a line to end with the planned word.
This is an older model (Claude 3.5 Haiku) with no test time compute.
[0]: https://www.anthropic.com/news/tracing-thoughts-language-mod...
Sammi
What is called "planning" or "thinking" here doesn't seem conceptually much different to me than going from a naive breadth-first-search-based Dijkstra shortest-path search to adding a heuristic that makes it search in a particular direction first and calling it A*. In both cases you're adding another layer to an existing algorithm in order to make it more effective. Doesn't make either AGI.
I'm really no expert in neural nets or LLMs, so my thinking here is not an expert opinion, but as a CS major reading that blog from Anthropic, I just cannot see how they provided any evidence for "thinking". To me it's pretty aggressive marketing to call this "thinking".
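To make the Dijkstra/A* analogy concrete, here is a minimal sketch showing that the only difference between the two searches is the heuristic added to the priority; with a zero heuristic the code below is Dijkstra (the grid and the Manhattan heuristic are invented for illustration):

    import heapq

    def shortest_path(grid, start, goal, heuristic=lambda a, b: 0):
        # With the default zero heuristic this is Dijkstra; pass a real
        # heuristic (e.g. Manhattan distance) and it becomes A*.
        dist = {start: 0}
        frontier = [(heuristic(start, goal), start)]
        while frontier:
            _, node = heapq.heappop(frontier)
            if node == goal:
                return dist[node]
            x, y = node
            for nxt in [(x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)]:
                if nxt in grid:  # grid is a set of walkable cells
                    nd = dist[node] + 1
                    if nd < dist.get(nxt, float("inf")):
                        dist[nxt] = nd
                        heapq.heappush(frontier, (nd + heuristic(nxt, goal), nxt))
        return None

    cells = {(x, y) for x in range(5) for y in range(5)}
    manhattan = lambda a, b: abs(a[0] - b[0]) + abs(a[1] - b[1])
    print(shortest_path(cells, (0, 0), (4, 4)))             # Dijkstra: 8
    print(shortest_path(cells, (0, 0), (4, 4), manhattan))  # A*: 8, fewer pops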
helloplanets
They definitely do strain the neurology and thinking metaphors in that article. But the Dijkstra's algorithm and A* comparisons are the flipside of that same coin. They aren't trying to make it more effective. And definitely not trying to argue for anything AGI related.
Either way: They're tampering with the inference process, by turning circuits in the LLM on and off, in an attempt to prove that those circuits are related with a specific function. [0]
They noticed that circuits related to a token that is only relevant ~8 tokens forward were already activated on the newline token. Instead of only looking at the sequence of tokens that has been generated so far (aka backwards), and generating the next token based off of that information, the model is activating circuits related to tokens that are not relevant to the next token only, but to specific tokens a handful of tokens after.
So, information related to more than just the next upcoming token (including a reference to just one specific token) is being cached during a newline token. Wouldn't call that thinking, but I don't think calling it planning is misguided. Caching this sort of information in the hidden state would be an emergent feature, rather than a feature that was knowingly aimed at by following a specific training method, unlike with models that do test time compute. (DeepSeek-R1 paper being an example, with a very direct aim at turbocharging test time compute, aka 'reasoning'. [1])
The way they went about defining the function of a circuit was by using their circuit-tracing method, which is open source so you can try it out for yourself. [2] Here's the method in short: [3]
> Our feature visualizations show snippets of samples from public datasets that most strongly activate the feature, as well as examples that activate the feature to varying degrees interpolating between the maximum activation and zero.
> Highlights indicate the strength of the feature’s activation at a given token position. We also show the output tokens that the feature most strongly promotes / inhibits via its direct connections through the unembedding layer (note that this information is typically more meaningful for features in later model layers).
[0]: https://transformer-circuits.pub/2025/attribution-graphs/bio...
[1]: https://arxiv.org/pdf/2501.12948
[2]: https://github.com/safety-research/circuit-tracer
[3]: https://transformer-circuits.pub/2025/attribution-graphs/met...
exe34
> In both cases you're adding another layer to an existing algorithm in order to make it more effective. Doesn't make either AGI.
Yet. The human mind is a big bag of tricks. If the creators of AI can enumerate a large enough list of capabilities and implement them, and the product ends up as good as 90% of humans at a fraction of the cost and a billion times the speed, then it doesn't matter if it's AGI or not. It will have economic consequences.
yencabulator
Generalize the concept from predicting the next token to predicting the coming tokens and the rest still applies. LLMs are still incredibly poor at symbolic thought and following multi-step algorithms, and as a non-ML person I don't really see what in the LLM mechanism would provide such power. Or maybe we're still just another 1000x of scale off and symbolic thought will emerge at some point.
Personally, I expect LLMs to be a mere part of whatever will be invented later.
iNic
The mere token prediction comment is wrong, but I don't think any of the other comments really explained why. Next token prediction is not what the AI does, but its goal. It's like saying soccer is a boring sport having only ever seen the final scores. The important thing about LLMs is that they can internally represent many different complex ideas efficiently and coherently! This makes them an incredible starting point for further training. Nowadays no LLM you interact with will be a pure next token predictor anymore, they will have all gone through various stages of RL, so that they actually do what we want them to do. I think I really feel the magic looking at the "circuit" work by Anthropic. It really shows that these models have some internal processing / thinking that is complex and clever.
quonn
> that they can internally represent many different complex ideas efficiently and coherently
The Transformer circuits[0] suggest that this representation is not coherent at all.
iNic
I guess that depends on what you think is coherent. A key finding is that the larger the network, the more coherent the representation becomes. One example is that larger networks merge the same concept across different languages into a single concept (as humans do). The addition circuits are also fairly easy to interpret.
trashtester
The "next token prediction" is a distraction. That's not where the interesting part of an AI model happens.
If you think of the tokenization near the end as a serializer, something like turning an object model into JSON, you get a better understanding. The interesting part of an OOP program is not the JSON, but what happens in memory before the JSON is created.
Likewise, the interesting parts of a neural net model, whether it's LLM's, AlphaProteo or some diffusion based video model, happen in the steps that operate in their latent space, which is in many ways similar to our subconscious thinking.
In those layers, the AI models detect deeper and deeper patterns of reality. Much deeper than the surface pattern of the text, images, video etc used to train them. Also, many of these patterns generalize when different modalities are combined.
From this latent space, you can "serialize" outputs in several different ways. Text is one, image/video another. For now, the latent spaces are not general enough to do all equally well, instead models are created that specialize on one modality.
I think the step to AGI does not require throwing a lot more compute into the models, but rather to have them straddle multiple modalities better, in particular, these:
- Physical world modelling at the level of Veo3 (possibly with some lessons from self-driving or robotics models for elements like object permanence and perception)
- Symbolic processing of the best LLMs
- Ability to be goal-oriented and iterate towards a goal, similar to the Alpha* family of systems
- Optionally: optimized for the use of a few specific tools, including a humanoid robot
Once all of these are integrated into the same latent space, I think we basically have what it takes to replace most human thought.
sgt101
>which is in many ways similar to our subconscious thinking
this is just made up.
- We don't have any useful insight on human subconscious thinking.
- We don't have any useful insight on the structures that support human subconscious thinking.
- The mechanisms that support human cognition that we do know about are radically different from the mechanisms that current models use. For example, we know that biological neurons and synapses are structurally diverse, we know that suppression and control signals are used to change the behaviour of the networks, and we know that chemical control layers (hormones) transform the state of the system.
We also know that biological neural systems continuously learn and adapt, for example in the face of injury. Large models just don't do these things.
Also this thing about deeper and deeper realities? C'mon, it's surface level association all the way down!
ixtli
Yea whenever we get into this sort of “what’s happening in the network is like what’s going on in your brain” discussion people never have concrete evidence of what they’re talking about.
scarmig
The diversity is itself indicative, though, that intelligence isn't bound to the particularities of the human nervous system. Across different animal species, nervous systems show a radical diversity. Different architectures; different or reversed neurotransmitters; entirely different neural cell biologies. It's quite possible that "neurons" evolved twice, independently. There's nothing magic about the human brain.
Most of your critique is surface level: you can add all kinds of different structural diversity to an ML model and still find learning. Transformers themselves are formally equivalent to "fast weights" (suppression and control signals). Continuous learning is an entire field of study in ML. Or, for injury, you can randomly mask out half the weights of a model, still get reasonable performance, and retrain the unmasked weights to recover much of your loss.
Obviously there are still gaps in ML architectures compared to biological brains, but there's no particular reason to believe they're fundamental to existence in silico, as opposed to myelinated bags of neurotransmitters.
ben_w
The bullet list is a good point, but:
> We also know that biological neural systems continuously learn and adapt, for example in the face of injury. Large models just don't do these things.
This is a deliberate choice on the part of the model makers, because a fixed checkpoint is useful for a product. They could just keep the training mechanism going, but that's like writing code without version control.
> Also this thing about deeper and deeper realities? C'mon, it's surface level association all the way down!
To the extent I agree with this, I think it conflicts with your own point about us not knowing how human minds work. Do I, myself, have deeper truths? Or am I myself making surface-level association after surface-level association, but with enough levels to make it seem deep? I do not know how many grains make the heap.
phorkyas82
As far as I understand, any AI model is just a linear combination of its training data. Even if that corpus is as large as the entire web... it's still just a sophisticated compression of other people's expressions.
It has not had its own experiences, not interacted with the outer world. Dunno, I don't want to rule out that something operating solely on language artifacts could develop intelligence or consciousness, whatever that is... but so far there are also enough humans we could care about and invest in.
tlb
LLMs are not a linear combination of training data.
Some LLMs have interacted with the outside world, such as through reinforcement learning while trying to complete tasks in simulated physics environments.
olmo23
Just because humans can describe it, doesn't mean they can understand (predict) it.
And the web contains a lot more than people's expressions: think of all the scientific papers with tables and tables of interesting measurements.
andsoitis
> the AI models detect deeper and deeper patterns of reality. Much deeper than the surface pattern of the text
What are you talking about?
klipt
If you wish to make an apple pie from scratch
You must first invent the universe
If you wish to predict the next token really well
You must first model the universe
Aeolun
> wondering when it can generate a beautifully-written novel
Not quite yet, but I’m working on it. It’s ~~hard~~ impossible to get original ideas out of an LLM, so it’ll probably always be a human assisted effort.
agumonkey
The TBs of everything with transformers make a difference. Maybe I'm just too uneducated, but the amount of semantic context that can be taken into account when generating the next token really is disruptive.
marsten
> Over the years, the magic was never lost on me. However, I can never see LLMs as more than a "token prediction machine".
The "mere token prediction machine" criticism, like Pearl's "deep learning amounts to just curve fitting", is true but it also misses the point. AI in the end turns a mirror on humanity and will force us to accept that intelligence and consciousness can emerge from some pretty simple building blocks. That in some deep sense, all we are is curve fitting.
It reminds me of the lines from T.S. Eliot, “...And the end of all our exploring, Will be to arrive where we started, And know the place for the first time."
daxfohl
> although we’ll make plenty of mistakes and some things will go really wrong, we will learn and adapt quickly
If the "mistake" is that of concentrating too much power in too few hands, there's no recovery. Those with the willingness to adapt will not have the power to do so, and those with the power to adapt will not have the willingness. And it feels like we're halfway there. How do we establish a system of checks and balances to avoid this?
rcarmo
This read like a Philip K. Dick, Ubik-style advertisement for a dystopian future, and I’m pretty amazed it is an actual blog post by a corporate leader in 2025. Maybe Sam and Dario should be nominated for Hugos or something…
crossroadsguy
I have read his A Scanner Darkly and part of another book. Not sure whether you are overrating this post or insulting his writing style.
udev4096
[flagged]
FrustratedMonky
A crooked CEO doesn't seem to conflict with the concept of "advertisement for a dystopian future".
A CEO can be crooked, and selling a dystopian future.
So I don't think the parent was a delusional post.
wolecki
Some reasoning tokens on this post:
>Intelligence too cheap to meter is well within grasp
And also:
>cost of intelligence should eventually converge to near the cost of electricity.
Which is a meter-worthy resource. So intelligence's effect on people's lives is on the order of magnitude of one second of toaster use each day, in present value. This begs the question: what could you do with a toaster-second, say, 5 years from today?
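For scale, here is roughly what a toaster-second a day works out to; the toaster wattage and electricity price are assumed typical values, not figures from the post:

    toaster_watts = 1000                    # assumed typical toaster draw
    wh_per_day = toaster_watts * 1 / 3600   # one second of use per day, in Wh
    price_per_kwh = 0.15                    # assumed retail electricity price, $/kWh
    yearly_cost = wh_per_day / 1000 * price_per_kwh * 365
    print(f"{wh_per_day:.2f} Wh/day, ~${yearly_cost:.3f}/year")  # ~0.28 Wh/day, ~$0.015/year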
GarnetFloride
That's what they said about electricity when nuclear power plants were proposed. What's your electricity bill like today?
AnthonyMouse
The primary operating cost of traditional power plants is fuel, i.e. coal and natural gas. The fuel cost for nuclear power plants is negligible because the energy content is higher by more than a factor of a million. So if you build enough nuclear plants to power the grid, charging per kWh for the electricity is pointless because the marginal cost of the fuel is so low. Meanwhile the construction cost should be on par with a coal plant, since the operating mechanism (heat -> steam -> electricity) is basically the same.
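Rough numbers behind that "more than a factor of a million" claim, as a sanity check (ballpark textbook energy densities, not figures from this thread):

    coal_mj_per_kg = 24           # rough heating value of thermal coal, MJ/kg
    u235_mj_per_kg = 80_000_000   # rough energy from complete fission of 1 kg of U-235 (~80 TJ)
    print(f"ratio: {u235_mj_per_kg / coal_mj_per_kg:,.0f}x")  # ~3,300,000x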
Unsurprisingly, this scared the crap out of the fossil fuel industry in the US and countries like Russia that are net exporters of fossil fuels, so they've spent decades lobbying to bind nuclear plant construction up in red tape to prevent them being built and funding anti-nuclear propaganda.
You can see a lot of the same attempts being made with AI, proposals to ban it or regulate it etc., or make sure only large organizations have it.
But the difference is that power plants are inherently local. If the US makes it uneconomical to build new nuclear plants, US utility customers can't easily get electricity from power plants in China. That isn't really how it works with AI.
bruce511
Your point is not wrong, but I'd clarify a couple things;
The primary cost of a traditional plant is fuel. However a nuclear plant needs a tad more (qualified) oversight than a coal plant.
In the same way that the navy has specialists for running nuclear propulsion systems versus crew needed for diesel engines. Not surprisingly nuclear engines cost "more than fuel".
That cost may end up being insignificant in the long run, but it is not zero. And shortages of staff would matter (as they do with ATC at the moment).
Construction cost should be much lower than it is, but I don't think it'll be as cheap as say coal or diesel. The nature of the fuel would always require some extras, if only because the penalty-for-failure is so high. There's a difference between a coal-plant catastrophe and Chernobyl.
So there are costs for running nuclear; I don't think it necessarily gets "too cheap to meter".
roenxi
It is worth noting that the effect is jaw-droppingly stark. The regulators managed to invert the learning curve [0] so the more power plants get built the more expensive it gets! It is one of the most stunning failures of an industrial society in the modern era; the damage this did to us all is huge. It is disheartening that our leadership/society chose to turn their backs on the future and we're all lucky that the Chinese chose a different tack to the West's policy of energy failure. At least there are still people who believe in industry.
[0] https://www.mdpi.com/1996-1073/10/12/2169 - Fig 3
Joeri
I don't buy into the "red tape" argument. For me it's all due to a lack of a business case. I think building (LWR) nuclear plants is just freaking expensive, has always been freaking expensive, was bankrolled for a while by governments because they thought it would get cheaper, and then largely abandoned in most countries when it proved to be an engineering dead end.
Here's an MIT study that dug into the reasons behind high nuclear construction cost. They found that regulatory burdens were only responsible for 30% of cost increases. Most of the cost overruns were because of needing to adapt the design of the nuclear plant to the site.
https://news.mit.edu/2020/reasons-nuclear-overruns-1118
Now, you can criticize the methodology of that study, but then you have to bring your own study that shows precisely which regulatory burdens are causing these cost overruns, and which of those are in excess. Is it in excess that we have strict workplace safety regulation now? Is it in excess that we demand reactor containment vessels to prevent meltdowns from contaminating ground water supplies? In order to make a good red tape argument I expect detail in what is excess regulation, and I've never seen that.
Besides, if "red tape" and fossil industry marketing was really the cause of unpopularity, and the business case for nuclear was real when installing plants at scale, you would see Russia and China have the majority of their electricity production from nuclear power.
- Russia is the most pro-nuclear country in the world, and even they didn't get past 20% of electricity share. They claim levelized cost for nuclear on the order of that of solar and wind, but I am very skeptical of that number, and anyone who knows anything about the Russian government's relation to the truth will understand why. When they build nuclear plants in other countries (e.g. the bangladesh project) they are not that cheap.
- China sits at low single digit percentages of nuclear share, with a levelized cost that is significantly higher than Russia's and well above that of solar and wind. While they're planning to grow the nuclear share they assume it will be based on new nuclear technology that changes the business case.
Both Russia and China can rely on cheap skilled labor to bring down costs, a luxury western countries do not have.
And this is ultimately the issue: the nuclear industry has been promising designs that bring down costs for over half a century, and they have never delivered on that promise. It's all smoke and mirrors distracting from the fact that building nuclear plants is inherently freaking expensive. Maybe AI can help us design better and cheaper nuclear power plants, but as of today there is no proven nuclear plant design that is economical to build, and that is ultimately why you see so little new nuclear plant construction in the west.
squidbeak
Nuclear plants have other long-tail costs: decommissioning and waste containment.
derefr
> So if you build enough nuclear plants to power the grid, charging per kWh for the electricity is pointless because the marginal cost of the fuel is so low.
Wouldn't there still be the OpEx of maintaining the power plants + power infrastructure (long-distance HVDC lines, transformer substations, etc)? That isn't negligible.
...I say that, but I do note that I live in a coastal city with a temperate climate near a mountain watershed, and I literally do have unmetered water. I suppose I pay for the OpEx of the infrastructure with my taxes.
hn_throwaway_99
> Unsurprisingly, this scared the crap out of the fossil fuel industry in the US and countries like Russia that are net exporters of fossil fuels, so they've spent decades lobbying to bind nuclear plant construction up in red tape to prevent them being built and funding anti-nuclear propaganda.
This is frankly nonsense, and my hope is that this nonsense is coming from a person too young to remember the real, valid fears from disasters like 3 Mile Island, Chernobyl, and Fukushima.
Yes, I fully understand that over the long term coal plants cause many more deaths, and of course climate change is an order of magnitude worse, eventually. The issue is that human fears aren't based so much on "the long term" or "eventualities". When nuclear power plants failed, they had the unfortunate side effect of making hundreds or thousands of square miles uninhabitable for generations. The fact that societies demand heavy regulation for something that can fail so spectacularly isn't some underhanded conspiracy.
Just look at France, the country with probably the most successful wide-scale deployment of nuclear. They are rightfully proud of their nuclear industry there, but they are not a huge country (significantly smaller than Texas), and thus understand the importance of regulation to prevent any disasters. Electricity there is relatively cheap compared to the rest of Western Europe but still considerably higher than the US average.
UltraSane
I always thought with nuclear electricity it would actually make more sense to charge by the max watts a customer can draw.
danw1979
The thing is, nuclear was never on such a steep learning curve as solar and batteries are today.
It’ll never be too cheap to meter, but electricity will get much cheaper over the coming decades, and so will synthetic hydrocarbons on the back of it.
tim333
My and/or my family's electricity bills have never been near zero. On the other hand my AI bill is zero. I think different economics apply.
(that excludes a brief period when I camped with a solar panel)
TheOtherHobbes
Your electricity bill is set by the grift of archaic fossil energy industries. And nuclear qualifies as a fossil industry because it's still essentially digging ancient stuff out of the ground, moving it around the world, and burning it in huge dirty machines constructed at vast expense.
There are better options, and at scale they're capable of producing electricity that literally is too cheap to meter.
The reasons they haven't been built at scale are purely political.
Today's AI is computing's equivalent of nuclear energy - clumsy, centralised, crude, industrial, extractive, and massively overhyped and overpriced.
Real AI would be the step after that - distributed, decentralised, reliable, collaborative, free in all senses of the word.
greenie_beans
watch the cost of electricity go up because of the demand created by data centers. i'm building an off-grid solar system right now and it ain't cheap! thinking about a future where consumers are competing with data centers for electricity makes me think it might feel cheap in the future, though.
antihero
Datacentre operators want to keep energy costs down and also have capital to make it happen.
greenie_beans
weird i didn't know they could build nuclear power plants (surely you are being sarcastic?)
also this is weird i thought electricity prices only get cheaper? https://fred.stlouisfed.org/series/APU000072610
unstablediffusi
bro, how much did your electricity cost go up because of millions of people playing games on their 500w+ gpus? by a billion of people watching youtube? by hundreds of millions of women and children scrolling instagram and tiktok 8 hours a day?
greenie_beans
this is not the same thing and all the AI CEOs would agree with me. very absurd statement to try to use for your argument.
but here is some data bro! https://fred.stlouisfed.org/series/APU000072610
nhdjd
Fusion will show up soon, don't worry. AI will accelerate its arrival.
NoGravitas
Instead of always being 30 years in the future, it's now always 15 years in the future!
greenie_beans
good luck w that
jes5199
I’m not sure we’ll be metering electricity if Wright’s Law continues
TheAceOfHearts
Do you think we will get AI models capable of learning in real time, using a small number of examples similar to humans, in the next few years? This seems like a key barrier to AGI.
More broadly, I wonder how many key insights he thinks are actually missing for AGI or ASI. This article suggests that we've already cleared the major hurdles, but I think there are still some major keys missing. Overall his predictions seem like fairly safe bets, but they don't necessarily suggest superintelligence as I expect most people would define the term.
paradox242
This is a condensed version of Altman's greatest hits when it comes to his pitch for the promise of AI as he (allegedly) conceives it, and in that sense it is nothing new. What is conspicuous is that there is a not-so-subtle reframing. No longer is AGI just around the corner; instead one gets the sense that OpenAI has already looked around that corner and seen nothing there. No, this is one of what I expect will be many more public statements intended to cool things down a bit, and to reframe (investor) expectations that the timelines are going to be longer than were previously implied.
throw310822
Cool things down a bit? That's what you call "we're already in the accelerating part of the singularity, past the event horizon, the future progress curve looks vertical, the past one looks flat"? :D
le-mark
If any of that were true, then the LLMs would be actively involved in advancing themselves, or assisting humans in a meaningful way in that endeavor, which they're not, so far as I can tell.
fyrn_
For Sam Altman, honestly yes. The man has said some truly wild things for the VC
woopsn
Artificial intelligence is a nourished and well educated population. Plus some Adderall maybe. Those are the key insights which represent the only scientific basis for that term.
The crazy thing is that a well-crafted language model is a great product. A man should be content to say "my company did something akin to compressing the whole internet behind a single API" and take his just rewards. Why sully that reputation by boasting to have invented a singularity that solves every (economically registerable) problem on Earth?
ixtli
Because they promised the latter to the investors and due to the nature of capitalism they can never admit they’ve done all they can.
hoseja
Such is the nature of the scorpion.
thegeomaster
I hate to enter this discussion, but learning based on a small number of examples is called few-shot learning, and is something that GPT-3 could already do. It was considered a major breakthrough at the time. The fact that we call gradient descent "learning" doesn't mean that what happens with a well-placed prompt is not "learning" in the colloquial sense. Try it: you can teach today's frontier reasoner models to do fairly complex domain-specific tasks with light guidance and a few examples. It's what prompt engineering is about. I think you might be making a distinction on the complexity of the tasks, which is totally fine, but needs to be spelled out more precisely IMO.
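A minimal sketch of what that kind of in-context "learning" looks like in practice; call_llm is a hypothetical stand-in for whatever chat API you use, and the example tickets are invented:

    # Hypothetical few-shot prompt: the "learning" happens in the prompt itself,
    # with no gradient updates to the model's weights.
    examples = [
        ("refund my order #1234", "billing"),
        ("the app crashes on startup", "bug"),
        ("can you add dark mode?", "feature_request"),
    ]

    prompt = "Classify the support ticket as billing, bug, or feature_request.\n\n"
    for text, label in examples:
        prompt += f"Ticket: {text}\nLabel: {label}\n\n"
    prompt += "Ticket: my invoice is wrong\nLabel:"

    # response = call_llm(prompt)   # placeholder for any chat/completions API
    print(prompt)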
hdjdbdirbrbtv
Are you talking about teaching in the context window or fine tuning?
If it is the context window, then you are limited to the size of said window and everything is lost on the next run.
Learning is memory; what you are describing is an LLM being the main character in the movie Memento, i.e. no long-term memories past what was trained in the last training run.
thegeomaster
There's really no defensible way to call one "learning" and the other not. You can carry a half-full context window (aka prompt) with you at all times. Maybe you can't learn many things at once this way (though you might be surprised what knowledge can be densely stored in 1m tokens), but it definitely fits the GP's definition of (1) real-time and (2) based on a few examples.
tim333
AlphaZero learned various board games from scratch up to better than human levels. I guess in principle that sort of algorithm could be generalized to other things?
crazylogger
What you described can be (and is being) achieved by agentic systems like Claude Code. When you give it a task, it knows to learn best practices on the web, find out what other devs are doing in your codebase, and it adapts. And it condenses + persists its learnings in CLAUDE.md files.
Which underlying LLM powers your agent system doesn't matter. In fact you can swap them for any state-of-the-art model you like, or even point Cursor at your self-hosted LLM API.
So in a sense every advanced model today is AGI. We were already past the AGI "singularity" back in 2023 with GPT4. What we're going through now is a maybe-decades-long process of integrating AGI into each corner of society.
It's purely an interface problem. Coding agent products hook the LLM up to the real world with [web_search, exec_command, read_file, write_file, delegate_subtask, ...] tools. Other professions may require vastly more complicated interfaces (such as "attend_meeting"); it takes more engineering effort, sure, but those interfaces will 100% be built at some point in the coming years.
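A hedged sketch of the kind of tool interface being described; the tool names and the ask_model callback are placeholders, not any particular product's API:

    # Minimal agent-loop sketch: the model picks a tool, the harness executes it.
    import json, subprocess

    def read_file(path: str) -> str:
        with open(path) as f:
            return f.read()

    def exec_command(cmd: str) -> str:
        return subprocess.run(cmd, shell=True, capture_output=True, text=True).stdout

    TOOLS = {"read_file": read_file, "exec_command": exec_command}

    def run_agent(task: str, ask_model, max_steps: int = 10) -> str:
        # ask_model is a placeholder for any LLM call; it is assumed to return
        # JSON like {"tool": "...", "args": {...}} or {"answer": "..."}.
        history = [{"role": "user", "content": task}]
        for _ in range(max_steps):
            reply = json.loads(ask_model(history))
            if "answer" in reply:
                return reply["answer"]
            result = TOOLS[reply["tool"]](**reply["args"])
            history.append({"role": "tool", "content": result})
        return "gave up after max_steps"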
physix
I started quickly reading the article without reading who actually wrote it. As I scanned over the things being said, I started to ask myself: Who wrote this? It's probably some AI proponent, someone who has a vested interest. I had to smile when I saw who it was.
thrwwy_jhdkqsdj
I did the same thing, I thought "This post looks like a posthumous letter".
I hope LLM use will drive efforts for testing and overall quality processes up. If such thing as an AGI ever exists, we'll still need output testing.
To me it does not matter if the person doing something for you is smarter than you, if it's not well specified and tested it is as good as a guess.
Can't wait for the AI that is almost unusable for someone without a defined problem.
daxfohl
> although we’ll make plenty of mistakes and some things will go really wrong, we will learn and adapt quickly
Famous last words.
mjd
“So far, so good,” the man said as he fell past the twentieth floor of the Empire State Building.
afavour
Yes, in a world where populations increasingly ask an AI any question they have and believe whatever it says the opportunity for “things going really wrong” feels huge. To brush that aside so easily… well I wish I could say it surprises me.
munificent
"Some of you may die, but it's a sacrifice I am willing to make."
bigyabai
"I see!" Said the blind man, with his hammer and saw.
BjoernKW
The title is a likely nod to The Gentle Seduction by Marc Stiegler: http://www.skyhunter.com/marcs/GentleSeduction.html
kristjank
This reminds me of Pat Gelsinger quoting the Bible on Twitter, lol.
Between this and Ed Zitron at the other end of the spectrum, Ed's a lot more believable, to be honest.
absurdo
Link for context (to what Ed Zitron said)?
benjaminclauss
ab5tract
Excellent link, thank you for sharing this!
minimaxir
> A lot more people will be able to create software, and art. But the world wants a lot more of both, and experts will probably still be much better than novices, as long as they embrace the new tools.
This isn't correct: people want good software and good art, and the current trajectory of how LLMs are used on average in the real world unfortunately runs counter to that. This post doesn't offer any forecasts on the hallucination issues of LLMs.
> As datacenter production gets automated, the cost of intelligence should eventually converge to near the cost of electricity. (People are often curious about how much energy a ChatGPT query uses; the average query uses about 0.34 watt-hours
This is the first time a number has been given in terms of ChatGPT's cost-per-query and is obviously much lower than the 3 watts still cited by detractors, but there's a lot of asterisks in how that number might be calculated (is watt-hours the right unit of measurement here?).
Workaccount2
>and the current trajectory of how LLMs are used on average in the real world unfortunately run counter to that. This post doesn't offer any forecasts on the hallucination issues of LLMs.
You are getting confused here...
LLMs are terrible at doing software engineering jobs. They fall on their face trying to work in sprawling codebases.
However, they are excellent at writing small bespoke programs on behalf of regular people.
Becky doesn't need ChatGPT to one-shot Excel.exe to keep track of the things she sells at the craft fair that afternoon.
This is a huge blind-spot I keep seeing people miss. LLM "programmers" are way way more useful to regular people than they are to career programmers.
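The kind of bespoke program being described really is tiny; a sketch of what a one-shot craft-fair sales tracker might look like (file name and columns are invented):

    # Tiny craft-fair sales tracker: append sales to a CSV, print a running total.
    import csv, sys
    from datetime import date

    LOG = "sales.csv"  # hypothetical file name

    def add_sale(item: str, price: float) -> None:
        with open(LOG, "a", newline="") as f:
            csv.writer(f).writerow([date.today().isoformat(), item, price])

    def total() -> float:
        try:
            with open(LOG, newline="") as f:
                return sum(float(row[2]) for row in csv.reader(f))
        except FileNotFoundError:
            return 0.0

    if __name__ == "__main__":
        if len(sys.argv) == 3:
            add_sale(sys.argv[1], float(sys.argv[2]))
        print(f"total so far: ${total():.2f}")

Run it like "python3 sales.py mug 12.50" and it appends the sale and prints the running total.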
tsimionescu
Do you have any proof whatsoever that non-programmers are using LLMs to write small bespoke apps for them successfully? Programming ecosystems are generally very unfriendly to this type of use case, with plenty of setup required to get something usable out even if you have the code available. Especially if you'd like to run the results on your phone.
Sure, an LLM might be able to guide you through the steps, and even help when you stumble, but you still have to follow a hundred little steps exactly with no intuition whether things will work out at the end. I very much doubt that this is something many people will even think to ask for, and then follow-through with.
Especially since the code quality of LLMs is nowhere near what you make it out to be. You'll still need to bear with them and go through a lot of trial and error to get anything running, especially if you have no knowledge of the terms of art, nor any clue of what might be going wrong if it goes wrong. If you've ever seen consumer bug reports, you might have some idea of the quality of feedback the LLM may be getting back if something is not working perfectly the first time. It's very likely to be closer to "the button you added is not working" than to "when I click the button, the application crashes" or "when I click the button, the application freezes for a few seconds and then pops up an error saying that input validation failed" or "the button is not showing up on the screen" [...]
azan_
> Do you have any proof whatsoever that non-programmers are using LLMs to write small bespoke apps for them successfully?
I'm a radiologist. I've been paying like 200 USD per month for software that sped up my reporting. I've remade all the functionality I need in one evening with Cursor and added some things that I found missing from the original software.
Workaccount2
My company is a non-tech company in manufacturing. I'm not a programmer and neither is anyone else here.
So far I have created 7 programs that are now used daily in production. One of them replaces a $1k/yr/usr CAD package, and another we used to bring in a contractor to write. The others are miscellaneous apps for automating/simplifying our in-house processes. None of the programs are more than 6k LOC.
Avicebron
That's great, but Becky at the farmer's market doesn't want to be re-inventing notepad on her computer to sell sourdough.
This smells like "self-checkouts": the perfect bespoke solution so that everyone can check their own groceries out, no need to wait for a slow/expensive _human_ to get the job done, except you end up basically just doing the cashier's job and waiting around inconveniently whenever you want to get some beer.
bgwalter
And often you are still supervised by sometimes rude security guards who demand the receipts. So you have the worst of all worlds: You pay the supermarket, do the work and are treated like dirt.
Agentlien
On a tangent: what about self scanning? Here in Sweden that is the norm. For the last ten years I've brought my own bags, packed everything like I want it as I grab it - with a quick blip from the scanner - and then checkout is no line, no cashier, no unpacking and packing things again. Just scanning my card and going through a brief screen to check that the summary looks right.
Instead of a guard there's a scanner by the exit gate where you scan your ticket. In my case just a small stub since the actual ticket is sent digitally.
I think it works so smoothly, much better than before this system was introduced. My recent experience buying groceries in Belgium was very similar.
Workaccount2
>That's great, but Becky at the farmer's market doesn't want to be re-inventing notepad on her computer to sell sourdough.
That's exactly why she asks an LLM to do it for her. A program like this would be <1k LOC, well within even the current LLM one-shot-working-app domain.
satvikpendem
And yet I'd much rather self checkout than have a cashier do it for me.
theptip
Right. Jevons Paradox.
The lower barrier to entry might mean the average quality of software goes WAY down. But this could be great if it means that there are 10x as many little apps that make one person happy by doing only as much as they need.
The above is also consistent with the quantity of excellent software going way up as well (just slower than the below-average stuff).
const_cast
IMO the problem with the current software landscape isn't the number of applications, it's the quality and availability of integrations. Everything is in its own little kingdom, so the communication overhead is extreme and grows exponentially. And by overhead I mean human cognition - I need to remember what's in my notes, and then put that in my calendar, and then email it to myself or something, and on and on.
We have these really amazing computers with very high-quality software but still, so many processes are manual because we have too many hops from tool to tool. And each hop is a marginal increase in complexity, and an increase in risk of things going wrong.
I don't think AI will solve this necessarily, and it could make it worse. But, if we did want to improve it, AI can act as a sort of spongy API. It can make decisions in the face of uncertainty.
I imagine a world where I want to make a doctor's appointment and I don't open any apps at all, or pick up the phone. I tell my AI assistant, and it works it out, and I have all my records in one place. Maybe the AI assistant works over a phone line, like some sort of prehistoric amalgamation of old technology and new. Maybe the AI assistant doesn't even talk to a receptionist, but to another AI assistant. Maybe it gives me spongy human information, but speaks to others of its kind in more defined interfaces and dialects. Maybe, in the future, there are no applications at all. Applications are, after all, a human construction. A device intended to replace labor, but a gross approximation of it.
LtWorf
What the android/apple appstores have taught me compared to the old symbian appstore is that we don't need more apps, we need decent ones. The current appstores have more.
ben_w
> This is a huge blind-spot I keep seeing people miss. LLM "programmers" are way way more useful to regular people than they are to career programmers.
Kinda. They're good, and I like them, but I think of them like a power tool: just because you can buy an angle grinder or a plasma cutter from a discount supermarket, doesn't mean the tool is safe in the hands of an untrained amateur.
Someone using it to help fix up a spreadsheet? Probably fine. But you should at least be able to read the code well enough not to be caught out by this deliberately bad example, written to illustrate the point:
    #!/usr/bin/python3
    total_sales = 0.0

    def add_sale(total_sales, amount):
        # rebinds only the local parameter; the module-level total_sales never changes
        total_sales = total_sales + amount
        print("total_sales: " + str(total_sales))

Still useful, still sufficiently advanced technology that for most people it is (Clarketech) magic, but also still has some rough edges.
disambiguation
> They fall on their face trying to work in sprawling codebases.
>However, they are excellent at writing small bespoke programs
For programmers I predict a future of a million micro services.
Sprawling has always been an undesirable and unnecessary byproduct of growing code bases, but there's been no easy solution to that. I wonder if LLMs would perform better on a mono repo of many small services than one sprawling monolith.
Workaccount2
The issue as I have observed it is that software is written to cover as many use cases for as many users as possible. Obviously executives want to sell their product to as many hands as possible.
But end users often only need a tiny fraction of what the software is fully capable of, leaving a situation where you need to purchase 100% to use 5%.
LLMs can break down this wall by offering the user the ability to just make a bespoke 5% utility. I personally have had enormous success doing this.
antihero
Micro-services are part of a system, so to create anything meaningful with them your context window has to include knowledge about the rest of the system anyway. If your system is large and complex, that means a huge context window even if your code itself is small.
api
Micro service systems are just huge sprawling code bases with more glue code. Calling something over a network instead of via a local function call is still calling something.
bgwalter
> experts will probably still be much better than novices, as long as they embrace the new tools.
Sure, we'll all get a subscription and subject ourselves to biometric verification in order to continue our profession.
OpenAI's website looks pretty bad. Shouldn't it be the best website in existence if the new Übermensch programmers rely on "AI"?
What about the horrible code quality of "AI" applications in the "scientific" Python sphere? Shouldn't the code have been improved by "AI"?
So many questions and no one shows us the code.
actuallyalys
Yep, watt-hours are correct. Think of it this way: Power supplies and laptop chargers are rated in watts and that represents how much energy they can pull from the wall at a point in time. If you wanted to know how much energy a task takes on your computer and you know it maxes out your 300 watt desktop [0] for an entire hour, that would take 300 watt-hours.
[0] To my knowledge, no desktops are built to max out their power supply for an extended time; this is a simplification for illustration purposes.
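Extending that arithmetic to the 0.34 Wh/query figure from the post (the electricity price below is an assumed typical retail rate, not from the post):

    wh_per_query = 0.34            # figure from the post
    queries_per_kwh = 1000 / wh_per_query
    price_per_kwh = 0.15           # assumed retail electricity price, $/kWh
    print(f"{queries_per_kwh:,.0f} queries per kWh")             # ~2,941
    print(f"~${price_per_kwh / queries_per_kwh:.6f} per query")  # ~$0.000051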
theptip
Watt-hours seem good right now, and rebut the current arguments that LLMs are wasteful. They might soon stop being the right lens, if your agents can go off and (in parallel) expend some arbitrary and borderline-unknowable budget of energy on a single request.
At some point you’ll need to measure in terms of $ earned per W-H, or just $out/$in.
tsimionescu
(average) Watt-hours would still be the right unit to measure that, if the question is about energy efficiency / environmental impact. You still care how much energy those agents consume to achieve a given task, and the time it takes for tasks is still variable, so you need to multiply by time.
Money is irrelevant to energy efficiency.
actuallyalys
To be clear, I'm only saying the unit is correct, not whether the number of watt-hours is accurate or (assuming it is correct) that it's a good amount to spend on each query.
chrismartin
Hey Sam, does 0.34 watt-hours per query include an amortization of the energy consumed to train the model, or only the marginal energy consumed by inference?
I used to believe this wasn't worth considering (because while training is energy-intensive, once the model is trained we can use it forever.) But in practice it seems like we switch to newer/better models every few months, so a model has a limited period of usefulness.
> (is watt-hours the right unit of measurement here?)
Yes, because we want to measure energy consumed, while watts is only a measure of instantaneous power.
onlyrealcuzzo
> is watt-hours the right unit of measurement here?
That's how electricity is most commonly priced. What else would you want?
minimaxir
It's more that every time the topic has come up, it's reported in watts and I'm not sure if there's a nuance I'm missing. It's entirely possible those reports have been in watt-hours though and using watts for shorthand.
I still find the Technology Connections video counterintuitive. https://www.youtube.com/watch?v=OOK5xkFijPc
onlyrealcuzzo
A watt wouldn't make any sense.
A watt needs a time unit to be converted to joules because a watt is a measure of power, while a joule is a measure of energy.
Think of it like speed and distance:
A watt is a RATE of energy use, just like speed (e.g., miles per hour) is a RATE of travel.
A joule is a total amount of energy, just like distance (e.g., miles) is a total amount of travel.
If you're traveling 60 mph but you don't specify for how long... You won't know how far you went, how much gas you used, etc.
moab
Watt-hours is the only unit that makes sense here as we're describing the energy cost of some calculation. Watt, a rate unit, is not appropriate.
roywiggins
Watt-hours are what matter. If you want to boil a kettle, you need to use a certain minimum amount of watt-hours to do it, which corresponds to, say, burning a certain amount of propane. If you care about how many times you can boil a kettle on a bottle of propane, you care about the watt-hours expended. If you had perfect insulation you could boil a kettle by applying barely any watts for a long enough time and it would take the same amount of propane.
There are situations where watts per se matter, e.g. if you build a datacenter in an area that doesn't have enough spare electricity capacity to generate enough watts to cover you, and you'd end up browning out the grid. Or you have a faulty burner and can't pump enough watts into your water to boil it before the heat leaks out, no matter how long you run it.
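Roughly what the kettle comparison works out to, using the standard specific heat of water and the 0.34 Wh/query figure from the post:

    joules = 4186 * 1.0 * 80      # specific heat of water * 1 kg * 80 K rise (20 C -> 100 C)
    kettle_wh = joules / 3600     # ~93 Wh per boil, ignoring losses
    print(f"{kettle_wh:.0f} Wh per boil, ~{kettle_wh / 0.34:.0f} queries at 0.34 Wh each")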
jumploops
> people want good software
Absolutely! We’re in an interesting time, and LLMs are certainly over-hyped.
With that said, I’d argue that most software today is _average_. Not necessarily bad by design, but average because of the (historic) economies of software development.
We’ve all seen it: some beloved software tool that fills a niche. They raise funding/start to scale and all of a sudden they need to make this software work for more people.
The developers remove poweruser features, add functionality to make it easier to use, and all of a sudden the product is a shell of its former self.
This is what excites me about LLMs. Now we can build software for 10s or 100s of users, instead of only building software with the goal (and financial stability) of a billion users.
My hope is that, even with tons of terrible vibe coded atrocities littering the App Store/web, we’ll at least start to see some software that was better than before, because the developers can focus on the forest rather than each individual tree (similar to Assembly -> C -> Python).
jcelerier
> people want good software and good art,
minimaxir
There's a difference between a 2000's character brute-forced through millions of advertising dollars to sell ringtones to children, and whatever the hell is happening in the https://www.meta.ai/ Discover feed.
One aspect about modern generative AI that has surprised me is that it should have resulted in a new age of creatively surreal shitposts, but instead we have people Ghiblifying themselves and pictures of Trump eating tacos.
xwiz
Surreal shitposting you say?
bongodongobob
Why would you expect the average person to have the exceptional taste needed to produce things you like? Go listen to a random Spotify track, you probably won't like that either. Art still needs to be curated and given some direction. Meta's random AI feed created by the average FB user is indicative of nothing.
nradov
Whether software is good or not doesn't matter in most cases. Sure, quality matters for software used by millions. But for little bespoke applications good enough is fine. People will tolerate a lot of crap as long as it's cheap and gets the job done.
Caelus9
My attitude towards AI is one of balance: not overly dependent, but not completely rejecting it either. After all, AI is already playing an important role in many fields, and I believe the future will be a world where humans and AI coexist and collaborate. There are many things humans cannot do, but AI can help us achieve them. We provide the ideas, and AI can turn those ideas into actions, helping us accomplish tasks.
biophysboy
This is the correct take. It is technology, not a god or a demon.
> Scientific progress is the biggest driver of overall progress
> There will be very hard parts like whole classes of jobs going away, but on the other hand the world will be getting so much richer so quickly that we’ll be able to seriously entertain new policy ideas we never could before
Real wages haven’t risen since 1980. Wealth inequality has. Most people have much less political power than they used to as wealth - and thus power - have become concentrated. Today we have smartphones, but also algorithm-driven polarization and a worldwide rise in authoritarian leaders. Depression and anxiety affect roughly 30% of our population.
The rise of wealth inequality and the stagnation of wages corresponds to the collapse of the labor movement under globalization. Without a counterbalancing force from workers, wealth accrues to the business class. Technological advances have improved our lives in some ways but not on balance.
So if we look at people’s well-being, society as whole hasn’t progressed since the 1980s; in many ways it’s gotten worse. Thus the trajectory of progress described in the blog post is make believe. The utopia Altman describes won’t appear. Mass layoffs, if they happen, will further concentrate wealth. AI technology will be used more and more for mass surveillance, algorithmic decision making (that would make Kafka blush), and cost cutting.
What we can realistically expect is lowering of quality of life, an increased shift to precarious work, further concentration of wealth and power, and increasing rates of suffering.
What we need instead of science fiction is to rebuild the labor movement. Otherwise “value creation” and technology’s benefits will continue to accrue to a dwindling fraction of society. And more and more it will be at everyone else’s expense.