DeepSeek R2 launch stalled as CEO balks at progress
28 comments
· June 27, 2025 · teruakohatu
Thorrez
The article says this:
>June 26 (Reuters) - Chinese AI startup DeepSeek has not yet determined the timing of the release of its R2 model as CEO Liang Wenfeng is not satisfied with its performance,
>Over the past several months, DeepSeek's engineers have been working to refine R2 until Liang gives the green light for release, according to The Information.
But yes, it is strange how the majority of the article is about lack of GPUs.
chvid
I am pretty sure that The Information has no access to / sources at DeepSeek. At most they are basing their article on selective random internet chatter amongst those who follow Chinese AI.
Davidzheng
but DeepSeek doesn't actually need to host inference, right, if they open-source it? I don't see why these companies even bother to host inference. DeepSeek doesn't need outreach (everyone knows about them), and the huge demand for SOTA models will force Western companies to host them anyway.
teruakohatu
Releasing the model has paid off handsomely with name recognition and making a significant geopolitical and cultural statement.
But will they keep releasing the weights or do an OpenAI and come up with a reason they can't release them anymore?
At the end of the day, even if they release the weights, they probably want to make money and leverage the brand by hosting the model API and the consumer mobile app.
ngruhn
If they continue to release the weights + detailed reports on what they did, I seriously don't understand why. I mean it's cool. I just don't understand why. It's such a cut-throat environment where every little bit of moat counts. I don't think they're naive. I think I'm naive.
impossiblefork
The lack of GPU capacity sounds like bullshit though, and it's unsourced. It's not like you can't offer it as a secondary thing, sort of like o3, or even just as a toggle that turns on the reasoning.
wizee
They just recently released the R1-0528 model, which was a massive upgrade over the original R1 and is roughly on par with the current best proprietary Western models. Let them take their time on R2.
A_D_E_P_T
At this point the only models I use are o3/o3-pro and R1-0528. The OpenAI model is better at handling data and drawing inferences, whereas the DeepSeek model is better at handling text as a thing in itself -- i.e. for all writing and editing tasks.
With this combo, I have no reason to use Claude/Gemini for anything.
People don't realize how good the new DeepSeek model is.
energy123
My experience with R1-0528 for python code generation was awful. But I was using a context length of 100k tokens, so that might be why. It scores decently in the lmarena code leaderboard, where context length is short.
diggan
Would love to see the system/user prompts involved, if possible.
Personally I get it to write the same code I'd produce, which obviously I think is OK code, but it seems others' experience differs a lot from my own, so I'm curious to understand why. I've iterated a lot on my system prompt, so it could be as simple as that.
spaceman_2020
Honestly, AI progress suffers because of these export restrictions. An open source model that can compete with Gemini Pro 2.5 and o3 is good for the world, and good for AI
energy123
Your views on this question are going to differ a lot depending on the probability you assign to a conflict with China in the next five years. I feel like that number should be offered up for scrutiny before a discussion on the cost vs benefits of export controls even starts.
diggan
> probability you assign to a conflict with China in the next five years. I feel like that number should be offered up for scrutiny before a discussion
Might as well talk about the probability of a conflict with South Africa. China might not be the best country to live in, nor the country that takes care of its own citizens best, but they seem non-violent towards other sovereign nations (so far), although of course there is a lot of posturing. Of the current "world powers", they seem to be the least violent.
spaceman_2020
I'm not American. Ever since I've been old enough to understand the world, the only country constantly at war everywhere is America. An all-powerful American AI is scarier to me than an open source Chinese one
layer8
Are you saying that large-model capabilities would make a substantial difference in a military conflict within the next five years? Because we aren’t seeing any signs of that in, say, the Ukraine war.
tazjin
> depending on the probability you assign to a conflict with China in the next five years
And on who you would support in such a conflict! ;)
qwertox
"We had difficulties accessing OpenAI, our data provider." /s
jamesblonde
Rumour was that DeepSeek used the outputs of the thinking steps in OpenAI's reasoning model (o1 at the time) to train DeepSeek's large reasoning model, R1.
astar1
This, my guess is OpenAI wised up after r1 and put safeguards in place for o3 that it didn't have for o1, hence the delay.
ozgune
I think that's unlikely.
DeepSeek-R1 0528 performs almost as well as o3 in AI quality benchmarks. So either OpenAI didn't restrict access, DeepSeek wasn't using OpenAI's output, or using OpenAI's output doesn't have a material impact on DeepSeek's performance.
https://artificialanalysis.ai/?models=gpt-4-1%2Co4-mini%2Co3...
nsoonhui
Not too sure why you are downvoted, but OpenAI did announce that they were investigating DeepSeek's (mis)use of their outputs, and that they were tightening up verification of those who use API access, presumably to prevent the misuse.
To me that does seem like a reasonable speculation, though unproven.
xdennis
I still find it amusing to call it "misuse". No AI company has ever asked for permission to train.
viraptor
Exactly because it's phrased as if the poster knows this is the reason. I wouldn't have downvoted it if it were clear speculation, with a link to the OAI announcement you mentioned for bonus points.
The title of the article is "DeepSeek R2 launch stalled as CEO balks at progress", but the body of the article says the launch stalled because of a lack of GPU capacity due to export restrictions, not because of a lack of progress. The body does not even mention the word "progress".
I can't imagine demand would be greater for R2 than for R1 unless it was a major leap ahead. Maybe R2 is going to be a larger/less performant/more expensive model?
DeepSeek could deploy in a US or EU datacenter ... but that would be admitting defeat.