
S1: The $6 R1 Competitor?

57 comments

February 5, 2025

swiftcoder

> having 10,000 H100s just means that you can do 625 times more experiments than s1 did

I think the ball is very much in their court to demonstrate they actually are using their massive compute in such a productive fashion. My BigTech experience would tend to suggest that frugality went out the window the day the valuation took off, and they are in fact just burning compute for little gain, because why not...

gessha

This is pure speculation on my part, but I think at some point a company's valuation became tied to how big its compute is, so everybody jumped on the bandwagon.

whizzter

Mainly it points to a non-scientific "bigger is better" mentality, and the researchers probably didn't mind playing around with the power because "scale" is "cool".

Remember that the Lisp AI-lab people were working on unsolved problems on absolute potatoes of computers back in the day. We have a semblance of progress now, but so much of it has been brute force (even if there have been improvements in the field).

The big question is whether this insane spending has pulled the rug on real progress and we head into another AI winter of disillusionment, or whether there is enough real progress just around the corner to give investors hope in a post-DeepSeek valuation hangover.

wongarsu

We are in a phase where costs are really coming down. We had a phase from GPT-2 to about GPT-4 where the key to building better models was just building bigger models and training them for longer. But since then a lot of work has gone into distillation and other techniques to make smaller models more capable.
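For what it's worth, the core of the distillation idea mentioned above fits in a few lines: the smaller student is trained to match the larger teacher's softened output distribution in addition to the usual hard labels. This is a generic sketch; the temperature and weighting are illustrative, not any particular lab's recipe:

```python
# Generic knowledge-distillation loss: KL between temperature-softened teacher and student
# distributions, blended with the ordinary cross-entropy on the ground-truth labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T: float = 2.0, alpha: float = 0.5):
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)                                    # rescale so gradients keep a sensible magnitude
    hard = F.cross_entropy(student_logits, labels) # ordinary supervised loss on hard labels
    return alpha * soft + (1 - alpha) * hard
```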

If there is another AI winter, it will be more like the dotcom bubble: lots of important work got done during the dotcom bubble, and many of the big tech companies were built from the fruits of that labor in the decade after the bubble burst.

svantana

Besides that, AI training (aka gradient descent) is not really an "embarrassingly parallel" problem. At some point, there are diminishing returns on adding more GPUs, even though a lot of effort is going into making it as parallel as possible.
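A toy illustration of that synchronization cost, using made-up data and a hand-rolled least-squares objective: each simulated worker computes its gradient independently, but every step ends with an averaging step that all workers must wait for (the stand-in for an all-reduce), which is what limits how far adding GPUs helps.

```python
# Simulated synchronous data-parallel SGD on a least-squares problem.
import numpy as np

def parallel_sgd_step(w, shards, lr=0.1):
    # Each "worker" computes a gradient on its own shard (the embarrassingly parallel part)...
    grads = [2 * X.T @ (X @ w - y) / len(y) for X, y in shards]
    # ...but nobody can start the next step before this average (the all-reduce barrier).
    return w - lr * np.mean(grads, axis=0)

rng = np.random.default_rng(0)
X, y = rng.normal(size=(256, 4)), rng.normal(size=256)
shards = [(X[i::8], y[i::8]) for i in range(8)]   # 8 simulated workers
w = np.zeros(4)
for _ in range(100):
    w = parallel_sgd_step(w, shards)
```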

mark_l_watson

Off topic, but I just bookmarked Tim’s blog, great stuff.

I dismissed the X references to S1 without reading them, big mistake. I have been working generally in AI for 40 years and in neural networks for 35 years, and the exponential progress since the hacks that made deep learning possible has been breathtaking.

The reduction in processing and memory requirements for running models is incredible. I have personally been struggling to create my own LLM-based agents with weaker on-device models (the same experiments usually work with 4o-mini and above), but either my skills will get better or I can wait for better on-device models.

I was experimenting with the iOS/iPadOS/macOS app On-Device AI last night, and the person who wrote it has successfully combined web-search tool calling with a very small model - something I have been trying to perfect.

pona-a

If chain of thought acts as a scratch buffer by giving the model more temporary "layers" to process the text, I wonder if it would make sense to make this buffer a separate context with its own FFN and attention. In essence, there would be a macroprocess of "reasoning" that can take unbounded time to complete, and then a microprocess of describing that incomprehensible stream of embedding vectors as a natural-language explanation - in a way returning to an encoder/decoder architecture, but one where both parts are autoregressive. Maybe it would give us a denser representation of said "thought", not constrained by accurately reproducing the output.
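A loose PyTorch sketch of the split imagined above, purely to make the idea concrete: an inner module iterates on latent vectors for a variable number of steps (the "reasoning" macroprocess), while verbalization is left to a separate autoregressive decoder. The module sizes and the halting rule here are invented for illustration.

```python
# Hypothetical "latent reasoning" block: updates a buffer of embedding vectors in place,
# with a learned signal deciding when to stop, and never decodes the buffer to text itself.
import torch
import torch.nn as nn

class LatentReasoner(nn.Module):
    def __init__(self, d_model: int = 512, n_heads: int = 8, n_layers: int = 2):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.block = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.halt = nn.Linear(d_model, 1)   # learned "stop thinking" gate

    def forward(self, latents: torch.Tensor, max_steps: int = 16) -> torch.Tensor:
        for _ in range(max_steps):          # macroprocess: variable-length latent updates
            latents = self.block(latents)
            if torch.sigmoid(self.halt(latents[:, -1])).mean() > 0.5:
                break
        return latents                      # dense "thought"; a separate decoder verbalizes it

# A second, ordinary autoregressive decoder would cross-attend to the returned latents
# to produce the natural-language explanation (the microprocess).
reasoner = LatentReasoner()
thought = reasoner(torch.randn(1, 32, 512))   # batch of 1, 32 scratch slots
```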

cowsaymoo

The part about taking control of a reasoning model's output length using <think></think> tags is interesting.

> In s1, when the LLM tries to stop thinking with "</think>", they force it to keep going by replacing it with "Wait".

I found a few days ago that this lets you 'inject' your own CoT and jailbreak it more easily. Maybe these are related?

https://pastebin.com/G8Zzn0Lw

https://news.ycombinator.com/item?id=42891042#42896498
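For the curious, here is a minimal sketch of what that forcing loop could look like (not the s1 authors' actual code): greedy decoding with Hugging Face transformers, where an emitted "</think>" is replaced with "Wait" until a minimum number of forced continuations is reached. The model name and the literal tag strings are placeholders; s1 itself is a fine-tuned Qwen2.5-32B-Instruct.

```python
# Sketch of s1-style "budget forcing": keep the model thinking by swapping "</think>" for "Wait".
# Re-encoding the full text each step is slow but keeps the example short and clear.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "Qwen/Qwen2.5-0.5B-Instruct"  # placeholder; substitute the reasoning model you actually use
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL)

def generate_with_budget(prompt: str, min_forced_waits: int = 2, max_new_tokens: int = 512) -> str:
    text = prompt + "<think>\n"     # assumes the model was trained with <think>...</think> markers
    forced = 0
    for _ in range(max_new_tokens):
        ids = tok(text, return_tensors="pt").to(model.device)
        with torch.no_grad():
            next_id = int(model(**ids).logits[0, -1].argmax())   # greedy next token
        if next_id == tok.eos_token_id:
            break
        text += tok.decode([next_id])
        if text.endswith("</think>"):
            if forced < min_forced_waits:
                # The trick from the article: drop the closing tag and append "Wait" instead.
                text = text[: -len("</think>")] + "Wait"
                forced += 1
            else:
                break                # thinking budget spent; let it stop
    return text
```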

causal

This even points to a reason why OpenAI hides the "thinking" step: it would be too obvious that the context is being manipulated to induce more thinking.

sambull

That sovereign wealth fund with TikTok might set a good precedent: when we have to 'pour money' into these companies, we can do so with a stake in them held in our sovereign wealth fund.

bberenberg

In case you’re not sure what S1 is, here is the original paper: https://arxiv.org/html/2501.19393v1

mi_lk

it's also the first link in the article's first sentence

bberenberg

Good call, I must have missed it. I read the whole blog then went searching for what S1 was.

addandsubtract

It's linked in the blog post, too. In the first sentence, actually, but for some reason the author never bothered to attach the name to it. As if keeping track of o1, 4o, r1, r2d2, wasn't exhausting enough already.

kgwgk

> for some reason the author never bothered to attach the name to it

Respect for his readers’ intelligence, maybe.


theturtletalks

DeepSeek R1 uses <think></think> and "wait", and you can see it second-guessing itself in the thinking tokens. How does the model know when to wait?

These reasoning models feed into OP's last point about the NVIDIA and OpenAI data centers not being wasted, since reasoning models require more tokens and faster tps.

qwertox

Probably when it would expect a human to second-guess themselves, as shown in literature and maybe other sources.

ttyprintk

anentropic

mettamage

When you're only used to ollama, how do you go about using this model?

cyp0633

Qwen's QvQ-72B does many more "wait"s than other LLMs with CoT that I've tried - maybe they've already used that trick to some extent?

Havoc

The point about agents to conceal access to the model is a good one.

Hopefully we won't lose all access to models in the future.

yapyap

> If you believe that AI development is a prime national security advantage, then you absolutely should want even more money poured into AI development, to make it go even faster.

This, this is the problem for me with people deep in AI. They think it's the end all be all for everything. They have the vision of the 'AI' they've seen in movies in mind, see the current 'AI' being used, and to them it's basically almost the same; their brains are mentally bridging the concepts and saying it's only a matter of time.

To me, that's stupid. I observe the more populist and socially appealing CEOs of these VC startups (Sam Altman being the biggest, of course) just straight up lying to the masses, for financial gain, of course.

Real AI, artificial intelligence, is a fever dream. This is machine learning except the machines are bigger than ever before. There is no intellect.

And the enthusiasm of the people who are into it feeds into those who aren't aware of it in the slightest; they see you can chat with a 'robot', they hear all this hype from their peers, and they buy into it. We are social creatures after all.

I think using any of this in a national security setting is stupid, wasteful and very, very insecure.

Hell, if you really care about being ahead, pour 500 billion dollars into quantum computing so you can try to break current encryption. That'll get you so much further than this nonsensical BS.

menaerus

You can choose to be somewhat ignorant of the current state of AI, and I'd agree that at certain moments it appears totally overhyped, but the reality is that there hasn't been a bigger technology breakthrough in probably the last ~30 years.

This is not "just" machine learning, because we have never before been able to do the things we can today, and that is not only the result of better hardware. Better hardware is actually a byproduct. Why build a PFLOPS GPU when there is nothing that can utilize it?

If you spare some time to read through the actual (scientific) papers behind multiple generations of LLMs, the first one being the 2017 transformer paper from Google, you might come to understand that this is no fluff.

And I'm saying this from the position of a software engineer, without bias.

The reason why all this took off at such speed is the not-quite-expected results - early LLM experiments showed that, with the current transformer architecture, "knowledge" scales predictably with the amount of compute, data, and training time. That was very unexpected, and to this day scientists do not have a good answer for why it works.

So, after reading a bunch of material, I am inclined to think that this is something different. The future of loading a codebase into the model and asking it to fix bugs has never been so close and realistic. For better or worse.

dotancohen

> Real AI, artificial intelligence, is a fever dream. This is machine learning except the machines are bigger than ever before. There is no intellect.

That sounds to me like dismissing the idea that a Russian SSBN might cross the Pacific and nuke Los Angeles because "submarines can't swim".

Even if the machine learning isn't really intelligent, it is still capable of performing IF..THEN..ELSE operations, which could have detrimental effects for [some subset of] humans.

And even if you argue that such a machine _shouldn't_ be used for whatever doomsday scenario would harm us, rest assured that someone, somewhere, who either does not understand what the machines are designed to do or just pretends that they work like magic, will put the machines in a position to make such a decision.

amarcheschi

I couldn't agree more.

If we're not talking exclusively about cyber war, such as finding and exploiting vulnerabilities, then for the time being national security will still rest on traditional armies.

Just a few weeks ago, Italy announced a €16bn plan to buy more than 1,000 Rheinmetall IFVs. That alone would make Italy's army one of the best-equipped in Europe. I can't imagine what would happen with a $500bn investment in defense, lol. I don't agree with what Meloni's government is doing, but the minister I agree with most is the defense minister, Crosetto.

Furthermore, what is being shown, at least for the time being, is that open source can be and is crucial in helping develop better models. This collides with the big, single, "one winner takes it all" VC mentality (because, let's be honest, these defense pitches are still made by startup/VC bros).

piltdownman

>italy announced a 16bln€ plan to buy >1000 rheinmetall ifv vehicles. That alone would make italy's army one of the most equipped in Europe.

So, target practice for a beyond-the-horizon missile system launched ground-to-ground or air-to-ground? As an attacking force, conventional ground forces and tactics are a non-runner in a modern theatre of operations when faced with air and drone support. This is why no single EU country is incentivised to dump money into any single area - the only probable defense would be against the USA/Russia/China to begin with.

The US proved it beyond doubt in Afghanistan - partisans simply don't have a chance against a gunship with IR or NV optics; the last time they levelled the playing field against air interdictors was in Charlie Wilson's Afghanistan, when the Mujahideen took on that era of Soviet gunships with hand-held AA systems.

smcl

Been saying this for years, it's been fucking baffling. Generating images, video and text that sort-of resembles what a human would come up with is genuinely quite impressive. It is not "let's claim it'll fix our country" (looking at you, Keir) impressive though, and I cannot believe so much money has been pumped into it.

amarcheschi

But you have to over-promise and under-deliver, otherwise you won't receive that sweet, sweet money.

baq

I can only say that exponential curves grow nominally sublinearly before they take off. AI is not quite at the obvious take-off point, but owners of the biggest clusters have seen the extrapolations, and it isn't pretty - once your competitor achieves take-off and you aren't anywhere close, you're done for. The risks of not participating are too great.

amelius

Yes, I'd like to see some examples where our current AI can actually extrapolate rather than interpolate. Let it invent new things, new drawing styles, new story plots, etc. Maybe _then_ it will impress me.

mrshadowgoose

amelius

I'm not convinced. This is using the tooling and paradigms invented by humans.

moffkalast

Can you?

mrshadowgoose

> They think it’s the end all be all for everything.

Is (human-based) general intelligence not one of the fundamental enabling elements of literally every human activity throughout history, regardless of how many layers of automation and technology one has to peel back to get to it?

Can you maybe imagine how the ability to create arbitrary amounts of general intelligence, completely divorced from the normal lengthy biological process, could upend that foundation of human activity?

> They have the vision of the 'AI' they've seen in movies in mind, see the current 'AI' being used, and to them it's basically almost the same; their brains are mentally bridging the concepts and saying it's only a matter of time.

I've found that most AI-related movies focus exclusively on "quality ASI" scenarios, which are mostly irrelevant to our current state of the world, as an immense amount of danger/value/disruption will arrive with AGI. People who are seriously reasoning about the impacts of AGI are not using movies as references. "Those stupid movie-watching idiots" is just a crutch you are using to avoid thinking about something you disagree with.

> Real AI, artificial intelligence, is a fever dream. This is machine learning except the machines are bigger than ever before. There is no intellect.

Do you have any evidence to support this conclusion? And does it even matter? If "fake intellect" can replace a human, that human still has to deal with the very real issue of not having a job anymore. If "fake intellect" is used to conduct mass surveillance and to direct suppression activities toward divergent individuals, those individuals are still going to have a bad time.

mnky9800n

Also, the narrative that we are currently on the brink of an AI explosion and that this random paper shows it is the same tired old story AI hawks have been handing out for years. Like, yes, I agree with the general idea that more compute means more progress for humans, and perhaps a more responsive user interface through some kind of AI technology would be good. But I don't see why that will turn into Data from Star Trek. I also think all these AI hawks kind of narcissistically overvalue their own being. Blink and their lives are over in the grand scheme of things. Maybe our "awareness" of the world around us is an illusion provided by evolution because we needed it to value self-preservation whereas other animals don't. There is an inherent belief in the specialness of humans that I suppose I mistrust.

ben_w

> But I don’t see why that will turn into Data from Star Trek.

"Is Data genuinely sentient or is he just a machine with this impression" was a repeated plot point in TNG.

https://en.wikipedia.org/wiki/The_Measure_of_a_Man_(Star_Tre...

https://en.wikipedia.org/wiki/The_Offspring_(Star_Trek:_The_...

https://en.wikipedia.org/wiki/The_Ensigns_of_Command

https://en.wikipedia.org/wiki/The_Schizoid_Man_(Star_Trek:_T...

Similar with The Doctor on VOY.

Even then, what we have with LLMs is basically already at the level of the ship's main computer as it was written in TNG/DS9/VOY.

encipriano

I find the last part of the paragraph off-putting, and I agree.

HenryBemis

> Going forward, it’ll be nearly impossible to prevent distealing (unauthorized distilling). One thousand examples is definitely within the range of what a single person might do in normal usage, no less ten or a hundred people. I doubt that OpenAI has a realistic path to preventing or even detecting distealing outside of simply not releasing models.

(sorry for the long quote)

I will say (naively perhaps) "oh, but that is fairly simple". For any API request from 'unverified' users, add a 5-second delay before the next one. Make a "blue check" (à la X/Twitter). For the 'big sales', have a third-party vetting process, so that if US Corporation XYZ wants access, they prove themselves worthy/not Chinese competition, and then you do give them the 1000/min deal.

For everyone else, add the 5-second (or whatever other duration makes sense) timer/overhead and watch them drop from 1000 requests per minute to 500 per day. Or just cap them at 500 per day and close that back door. And if you get 'many cheap accounts' doing hand-overs (AccountA does 1-500, AccountB does 501-1000, AccountC does 1001-1500, and so on), then you mass-block them.
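A back-of-the-envelope version of that throttling scheme, with the thresholds taken from the comment rather than from any provider's actual API:

```python
# Toy per-account throttle: unverified accounts get a forced 5-second gap and a 500/day cap,
# verified accounts are waved through (a real service would still rate-limit them, e.g. 1000/min).
import time
from collections import defaultdict

UNVERIFIED_DELAY_S = 5
UNVERIFIED_DAILY_CAP = 500

_last_request = {}                 # account_id -> timestamp of the last accepted request
_daily_count = defaultdict(int)    # account_id -> requests accepted today (reset daily elsewhere)

def allow_request(account_id: str, verified: bool) -> bool:
    if verified:
        return True
    now = time.time()
    if _daily_count[account_id] >= UNVERIFIED_DAILY_CAP:
        return False               # daily cap reached
    if now - _last_request.get(account_id, 0.0) < UNVERIFIED_DELAY_S:
        return False               # too soon; enforce the 5-second gap
    _last_request[account_id] = now
    _daily_count[account_id] += 1
    return True
```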