GPT 4.5 level for 1% of the price

281 comments

March 16, 2025

GavCo

Surprised nobody has pointed this out yet — this is not a GPT 4.5 level model.

The source for this claim is apparently a chart in the second tweet in the thread, which compares ERNIE-4.5 to GPT-4.5 across 15 benchmarks and shows that ERNIE-4.5 scores an average of 79.6 vs 79.14 for GPT-4.5.

The problem is that the benchmarks they included in the average are cherry-picked.

They included benchmarks on 6 Chinese language datasets (C-Eval, CMMLU, Chinese SimpleQA, CNMO2024, CMath, and CLUEWSC) along with many of the standard datasets that all of the labs report results for. On 4 of these Chinese benchmarks, ERNIE-4.5 outperforms GPT-4.5 by a big margin, which skews the whole average.

This is not how results are normally reported and (together with the name) seems like a deliberate attempt to misrepresent how strong the model is.

Bottom line, ERNIE-4.5 is substantially worse than GPT-4.5 on most of the difficult benchmarks, matches GPT-4.5 and other top models on saturated benchmarks, and is better only on (some) Chinese datasets.

InkCanon

To try to avoid the inevitable long arguments about which benchmarks or sets of them are universally better: there is no such thing anymore. And even within benchmarks, we're increasingly squinting to see the difference.

threatripper

Do the benchmarks reflect real-world usability? My feeling is that the benchmark result numbers stop working above 75%.

In a real problem you may need to get 100 things right in a chain, which means a 99% chance of getting each single one correct results in only a 37% chance of getting the correct end result. But creating a diverse test that can correctly identify 99% correct results in complex domains sounds very hard, since the answers are often nuanced in details where correctness is hard to define and determine. From working in complex domains as a human, it often is not very clear if something is right or wrong or in a somewhat undefined and underexplored grey area. Yet we have to operate in those areas and then over many iterations converge on a result that works.
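As a quick check of that arithmetic, here is a minimal sketch that assumes the 100 steps are independent and equally reliable (real tasks rarely are, so treat it as a rough model). The function names are made up for illustration.

```python
def chain_success(p_step: float, n_steps: int) -> float:
    """Probability that n_steps independent steps all succeed."""
    return p_step ** n_steps


def required_step_accuracy(p_target: float, n_steps: int) -> float:
    """Per-step accuracy needed so the whole chain succeeds with p_target."""
    return p_target ** (1.0 / n_steps)


if __name__ == "__main__":
    # 99% per step over 100 steps gives roughly 37% end-to-end.
    print(f"{chain_success(0.99, 100):.1%}")
    # Per-step accuracy needed for a 90% end-to-end success rate.
    print(f"{required_step_accuracy(0.90, 100):.4%}")
```

Under this model, a benchmark that cannot tell 99% from 99.9% per-step accuracy also cannot tell a chain that mostly fails from one that mostly succeeds.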

Not sure how such complex domains should be benchmarked and how we objectively would compare the results.

fau

GPT-4.5's advantages are supposed to be in aspects that aren't being captured well in current benchmarks, so the claim would be shaky even if ERNIE's benchmarks actually showed better performance.

bdelmas

You know what's sad? Every Western company has been using this technique for a long time...

iandanforth

So, fairly accurate if you're Chinese?

GavCo

It doesn't really matter what nationality or ethnicity you are, but if you communicate with the model in Chinese you might get better results from this model.

Then again, if they've misrepresented the strength of the model overall, there might be some other shenanigans with their results. The fact that their results show their model is worse than GPT-4.5 on 2 Chinese language benchmarks, while it's so much stronger on some of the others, is a bit weird.

ksec

I guess this is the end of OpenAI? No more dreaming of Universal Basic Compute for AI, Multi Trillion for Fabs and Semi?

This is just like everything in China. They will find ways to drive down cost to below anyone previously imagined, subsidised or not. And even just competing among themselves with DeepSeek vs ERNIE and Open sourcing them meant there is very little to no space for most.

Both DRAM and NAND industry for Samsung / Micron may soon be gone, I thought this was going to happen sooner but it seems finally happening. GPU and CPU Designs are already in the pipelines with RISC-V, IMG and ARM-China. OLED is catching up, LCD is already taken over. Batteries we know. The only thing left is foundries.

Huawei may release its own Open Source PC OS soon. We are slowly but surely witnessing the collapse of Western Tech scene.

SirensOfTitan

> We are slowly but surely witnessing the collapse of Western Tech scene

Generally, I’ve found that almost no founders or friends I speak with have any vision for the future anymore. They care only about making money and do not care how. It’s a spectacular collapse of vision and purpose—these people have always existed but it feels incredibly pervasive now.

With that, I realize your comment is much broader than AI so below is too domain specific but…

VC has been investing in AI as-if it were a winner takes all market, but it has been obvious that isn’t the case.

Not only that, but the massive amount of cash thrown to anyone with even marginal credentials has undermined the constraints that often lead to innovation.

There is 0 reason that Safe Superintelligence should be raising for the second time at a 30 B valuation with no product.

nyarlathotep_

> VC has been investing in AI as-if it were a winner takes all market, but it has been obvious that isn’t the case.

I really don't see how this was supposed to go, and I've never heard an explanation.

I don't see any kind of coherent vision from any of these types.

Most normal folks (i.e not SV/HN types that seem to desire to replace their marketable programming skills with LLM output) really don't "want" LLMs in any real sense.

Sure, people use them like a search engine, kids cheat on homework with them etc, but there's not this overwhelming universal desire for them like there was for, say iPhones.

I never once have heard any sort of proposed roadmap for how LLMs were supposed to work as a product.

They were just going to get, uh "really good" and take everyone's office job or something?*

Normal F500 organizations that are obviously a target for LLM use (via hyperscaler sales) are still yet to see a clear path to "revolutionizing" their workforce or whatever via LLMs-it's just not there. Costs are too high, there's no obvious use case, "hallucinations" are a real impediment etc.

I'll add many of the public usecases for this (i.e those a hyperscaler would blog about as a sales promo) are seriously weak ("we reduced onboarding time by 20% with $MODEL")

I would really like to hear a proposal for how this is all supposed to come together. Does anyone have a concrete plan for the future for all this stuff?

*I'll note, this is NOT the way to sell a product to the masses, either.

Addendum: I'm not an "LLM hater" by any means. I pay for GH Copilot, and have been running local LLMs since it's been a thing (granted with limited hardware, and limited quality)--I intend to wait a bit and buy better hardware, with one motivating factor being running local LLMs in a year or so when the open-source offerings stabilize.

SpicyLemonZest

You're looking for a proposal for how it's supposed to come together in the future, but VCs are living in the world where it already has come together. Normal F500 organizations have widely adopted LLMs, with OpenAI reporting that 92% of them are customers. (https://www.axios.com/2024/08/29/openai-chatgpt-200-million-...) Determining individual business usage is always a bit sketchy, but at least one survey run by a major staffing firm indicates that a majority of workers and almost all executives use generative AI for their jobs. (https://www.weforum.org/stories/2024/01/ai-training-workforc...)

nasmorn

I have changed from Copilot to Cline with Claude 3.7 and it is a total gamechanger. For someone like me, being able to describe a multi-file edit is really empowering. I hate that kind of busy work.

Imustaskforhelp

can I say something.

This is one of the finest and most accurate things that I have read in a long time.

This really could be a blog post which I encourage you to make! (I would prefer github pages but if you really want , I have a domain name on cloudflare and I am more than willing to host the static page of such blog on my own domain name for absolutely free (lets go , cloudflare!)

Its just facts. Pure facts. "" Generally, I’ve found that almost no founders or friends I speak with have any vision for the future anymore. They care only about making money and do not care how. It’s a spectacular collapse of vision and purpose—these people have always existed but it feels incredibly pervasive now. ""

Why did I read it in a monotonous way as if a student from the future understands the current scenario. I felt as if it was the same level of sadness in my heart as that when you listen to some video which has raining background and he reads the dark comedy (something like burialgoods oats shitposting but this time more serious and real!)

Currently saving this on wayback machine just for this comment. Internet needs to preserve this comment , no matter what.

Imustaskforhelp

https://web.archive.org/web/20250316121222/https://news.ycom...

May archive never go down!

spamizbad

There used to be significant alignment between engineers, founders and certain VCs: lots of excitement around building software that genuinely made things better/easier/cheaper. Each group naturally wanted a different thing out of this arrangement but each camp was on the same page.

Now I feel like everything is more top-down. The tech sector feels less like market capitalism and more like something being centrally planned: we all must chase trends that come from various industry thought-leaders. And it all must be done a very specific way (this in particular is why Chinese companies are likely going to disrupt the AI market: they’re free from this burden)

amelius

VCs are happy to throw money at something if they believe they can corner a market. It just doesn't have to do much with reality. Until of course, it becomes self fulfilling. But in this case it seems like it's not going to happen because nobody has a moat.

jarsin

> I’ve found that almost no founders or friends I speak with have any vision for the future anymore.

I think in general there is a feeling that the time to get your bag is rapidly shrinking.

Once everything is built by these things there will be no reason to create anything as the platform owners (big tech) will be able to take everything for themselves and no longer have to share 70% with those pesky creators/small business/startups etc.

tmpz22

Founder risk has been nil for a long time either because they pay themselves six figures out the gate or because the job market has been hot enough that they can market utter failure to get another job.

There’s a lot of opportunity to make low cost software that out competes big tech just because it doesn’t demand 10000x returns on every if statement.

I’d encourage Europeans to start replacing American software vendors with small teams today. You won’t become the next American oligarch but you’ll be able to clean up millions from the incompetent Americans.

dukeyukey

I think there's huge opportunity for software that replicates the 80% most used features from big tech companies/packages, but at half the price and a tiny fraction of the overall complexity. Think of Python Anywhere, offering a very simple Python-focused VPS/PaaS platform, but so so much easier to use than AWS.

adam_arthur

It's similar to how China dominated manufacturing in prior decades.

They have massive amounts of low cost labor and, unfortunately, the US has pretty large walls up preventing mass in-migration of white collar workers.

H1B is capped and also more of a lottery than a points based system.

If the US allowed mass white collar immigration, wages would decline materially which would make our industries more competitive for the next generation of software.

Right now the system is geared around protectionism (intended or not) and wage inflation for US local workers.

The current market wages in software are far far above what a global equilibrium would be. Though myself and I'm sure most others here have benefited from it in the short term.

To be clear, established companies with an existing market are fine for now and can do well with high wages.

But the next generation of companies that are chasing smaller markets and margins, ones that require more elbow grease to out-compete are underserved.

e.g. the entire DeepSeek team was paid less than a few Meta engineers (with 7 figure comp each)

"The firm offers 14-month pay for various positions and the highest offer is for deep learning researchers for artificial general intelligence (AGI), with a monthly salary between 80,000 yuan ($10,983) and 110,000 yuan, which could mean an annual income of up to 1.54 million yuan, the report said."
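The figures in that quote can be cross-checked with a few lines. This is only a sketch: the exchange rate is not stated in the quote, so it is inferred here from the quoted pair (80,000 yuan = $10,983), and the variable names are illustrative.

```python
MONTHS_PAID = 14              # "14-month pay" from the quote
MONTHLY_LOW_CNY = 80_000      # low end of the quoted monthly range
MONTHLY_HIGH_CNY = 110_000    # high end of the quoted monthly range
MONTHLY_LOW_USD = 10_983      # USD figure quoted for the low end

# Assumption: the report used one exchange rate throughout, so it can be
# inferred from the single CNY/USD pair it gives.
cny_per_usd = MONTHLY_LOW_CNY / MONTHLY_LOW_USD

annual_low_cny = MONTHS_PAID * MONTHLY_LOW_CNY
annual_high_cny = MONTHS_PAID * MONTHLY_HIGH_CNY
annual_high_usd = annual_high_cny / cny_per_usd

if __name__ == "__main__":
    print(f"implied rate: {cny_per_usd:.2f} CNY per USD")
    print(f"annual range: {annual_low_cny:,} to {annual_high_cny:,} CNY")
    print(f"top of range in USD: about {annual_high_usd:,.0f}")
```

The top of the range works out to the 1.54 million yuan cited in the quote.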

ninetyninenine

> Huawei may release its own Open Source PC OS soon. We are slowly but surely witnessing the collapse of Western Tech scene.

The US has been so used to being number one that not being number one equates to “collapse”.

No the US Will NOT collapse. They just won’t be number one in economic/military/technological might. Similar to how many countries like the UK, Japan, and more have not existed as the number one economic super power.

jampekka

It will be (arguably already is) societally rough though. The west has been riding the asian cheap labor for decades (and the cheap colonial labor before that). People are not gonna be happy falling down the "value chain".

nebula8804

Probably explains Trump's moves: Force out all the illegal immigrants doing low value work, kill off as much fluff in the government as you can to cut the debt load, those unemployed people are forced to fill in the low value labor. You've solved all the problems and everyone except the capital owners are worse off..but at least the capital owners live to see another day.

pera

I guess the majority in here would agree that without strong market intervention OpenAI will soon implode. They urgently need:

- WIPO copyright exemption

- Anti-China protectionist measures

- Hard-line hardware export control

- Multi-billion dollar government contracts

bakuninsbart

This is the old way of doing it, and probably the way the US is going to go with, to the detriment of its own population. I would posit that since we are talking about digital goods, there is a better way:

Require open source / open weights of any company that used data it doesn't own to train its models. If Chinese companies do not comply, their copyright becomes void in the US, and these models are very easy to copy. Treat advances in architecture as a utility, and let the utilization of those architectures be the market for companies to compete in.

maxglute

That one weird trick. x4.

gdiamos

Gold

ForTheKidz

Of course, none of this will prevent china from producing technology that's clearly as impressive, if not more so.

megous

This is all protectionistic measures. Just let it implode, and let companies with better technology take over.

Why should people involved in some hyped company deserve all this "socialism for the rich" from the state?

pera

Yeah sorry I worded my comment in a weird way: I'm definitively not advocating for any of those points

viraptor

A copyright exemption would just put them at the level of deepseek officially, but they've been working around that anyway in practice. I'm not sure that change would make any difference.

rustc

> but they've been working around that anyway in practice

Working around how?

prng2021

When it comes to hardware who pioneered all those technologies? Definitely not China. They’ve stolen unimaginable amounts of IP and will continue to do so. But yes you’re right, they surprise everyone with how well they can scale the stolen innovation.

ak_111

Possible, but if you look at the graduate students and lecturers behind many of these IPs you will find they are Chinese (or Russians or Iranians).

This is the paradox in those who are championing barring Chinese students from the US to prevent them from stealing IP, they don't see that at least 50% of this IP is generated by students from China, in a way they will be handing the CCP a gift by incentivising those students to remain in China.

calf

I don't see how this is a proper reply to the prior comment; which Chinese lecturer has a Nobel prize in AI?

blackeyeblitzar

> they don't see that at least 50% of this IP is generated by students from China

Source? I noticed your post leaves out various other countries that have high representation in US universities.

occz

>They’ve stolen unimaginable amounts of IP and will continue to do so.

All AI models are built on the back of massive amounts of "IP stealing". Either we consider IP to be valid and then all western companies in this space are just as bad, or we go with the direction the western companies are claiming and then China is not doing anything wrong.

SalmoShalazar

These sour grapes comments are so goofy, and honestly a little racist. The millions of Chinese engineers working out in China are extremely talented, and to downplay their achievements like this and to chalk them all up as thieves is ridiculous. They have the skills, the man power, and the vision, and they’re eating the West’s lunch regardless of your feelings on how fair it is.

Tostino

I don't think it's that, so much as the average western citizen isn't able to go and create a knock off of a new invention / product and have it sold without legal consequences. That has been available to many Chinese citizens though.

j_maffe

All developing markets "steal" until they've caught up with the competition. Just look at the US and how they "stole" innovations and tech from Europe.

tonyhart7

"Just look at the US and how they "stole" innovations and tech from Europe."

except that they are not

rapsey

It is entirely irrelevant who pioneered the tech. This is why no one gives a crap about xerox anymore.

Dismissing Chinese tech is foolish. They are tech leaders in many areas and moving to new ones every day. Solar, Nuclear, Batteries, EVs, Drones, Robotics etc. They have no one to copy in those fields because they have left the rest of the world behind.

j_maffe

Expertise in HVDC is entirely contained in China it's incredible. They're definitely capable of innovations when needed.

mateus1

Would you say the west “stole” the IP for paper , ice cream, tea and noodle? Weird notion.

paganel

By the 1890s both the US and Germany had surpassed Britain when it came to industrial output, I don’t think it was any consolation for the Brits that they had invented it (almost) all.

wruza

> We are slowly but surely witnessing the collapse of Western Tech scene.

I think you're witnessing it rather getting back in touch with reality than collapsing. Multi-trillion out of jsx generator was too much from the beginning. You folks just don't know what to do with too much money you have.

rstuart4133

> I think you're witnessing it rather getting back in touch with reality than collapsing.

You're witnessing the USA tech scene getting back to reality. Software engineers in other western countries looked at the salaries the Tech scene was paying in the USA, and scratched their heads.

Imustaskforhelp

Its a collapse from fictional reality to real reality , but a collapse nonetheless.

Sometimes reality acts more weird than fiction itself. I have just now decided to call this "fictional reality"

Like yesterday when I realized that nuclear bombs weren't that far away from the creation of chemical resonance & they happened after world war I and I think , just really 5-6 years before nuclear bombs but still!

It actually gave me a lot of hope because I felt that a lot of people were focusing on AI , so I can use AI (sometimes , if I want) to focus on a passion project that I want , to maybe earn some money.

I have also thought of creating AI projects but that too for fun. I don't know two shits but I just want to know what the hype is about from a theoretical standpoint.

syntex

cheaper hardware usually means more adoption of the software and then even more demand for hardware

benjaminva

Correct answer, never think about the future in terms of linear extrapolations. It's a non-linear differential equation with lots of variables and expect complex feedback loops. Systems react to change.

nwellnhof

Jevons paradox: https://en.wikipedia.org/wiki/Jevons_paradox

apwell23

you are assuming that cost is stopping ppl from using these technologies.

These things are not actually useful. They hyper optimized it for coding usecase but it still sucks balls at it.

david-gpu

When the cost of training a model goes down, it doesn't simply become cheaper to the end user. In addition to that, the provider will train even larger and more capable models.

rixed

> We are slowly but surely witnessing the collapse of Western Tech scene

Is economy a zero sum game now? Isn't economic development supposed to be a good thing? Can the West only exist in a world of poverty and underdevelopment?

jampekka

The current western lifestyle of dirt cheap foreign goods exists only in a world of relative poverty and underdevelopment.

jononor

With high degrees of automation that need not be the case. Consumer electronics is highly automated right now, where-as clothing production is currently not.

milesrout

That doesn't make sense. Consumer goods have become cheaper and more accessible because the third world has become more productive and richer. If they become richer still it will be because they become more productive still, which will mean their exports will be more competitive (cheaper).

patrickhogan1

What's interesting about Baidu's AI model Ernie is that Baidu and its founder, Robin Li, have been working on AI for a long time. Robin Li has a strong background in AI research going back many years. Also notable is that some of the key early research on scaling laws—important for understanding how AI models improve as they get bigger—was done by Baidu's AI lab. This shows Baidu's significant role in the ongoing development of AI.

https://research.baidu.com/Blog/index-view?id=89

I am excited to see Baidu catch up. It feels like they have earned it, being very early.

gdiamos

Here’s a true story I find funny about scaling laws at Baidu.

From 2016-17 I did a projection using our scaling law equation with my coauthors about how many GPUs it would take to train an LLM with a step function in capability. Joel Hestness in particular did excellent experimental work to enable this.

I came out with a projection of about a $1 Billion GPU budget.

Baidu was in the middle of downsizing the US research center (SVAIL) in favor of AI in China and I was participating in the layoff of many of my colleagues while trying to keep the lights on long enough to finish our scaling law experiments, which I personally thought would change the world.

I actually wrote a report to Robin explaining the implications of scaling laws and asking for a $1 billion budget to train a Baidu LLM in 2016 and sat on it through 2017.

But I never sent it because I thought it would never have been supported in that environment. I sometimes wonder what Robin would have thought about it and how the world may have been different if Baidu had released ChatGPT.

We may be about to find out because the AI moat filled with simple algorithms and scale seems to be much more shallow than the processor and systems moat.

I have a huge amount of respect for Dario and Ilya for carrying on scaling laws at OpenAI or it may have never seen the light of day.

If there is one problem for the AI community to solve by 2030 I think it is the moat problem.

KaoruAoiShiho

Dario, founder of Anthropic is an ex-Baidu AI employee, it was at Baidu that he learned the bitter lesson.

ninetyninenine

Do most people feel the way you do? This is one factor out of multitudes of factors representing China's rise as a super power that will eclipse the US in technological, economical and military might.

I’m excited but most people are patriotic and I feel things like this or even the whole situation with BYD producing better cars than Tesla is something people take as an attack on their identity. If not an attack, it definitely represents an eroding of their patriotic identity.

Unfortunately Trump can’t slap a tariff on this. Maybe he can ban it like he was going to do with TikTok? The US really needs to get off its high horse and not associate its identity with being the sole economic super power in the world.

entropyneur

It's not about patriotism. Many people outside the US, myself included, see a problem with authoritarian superpowers per se. Although now that the US is rapidly drifting towards authoritarianism, that just seems like an inevitable future to prepare for.

ninetyninenine

Agreed. Within the US though a lot of it is definitely patriotism. But even for Europe a new super power on the block is not necessarily a good thing.

Would you prepare for such a future by banning TikTok and placing tariffs on all goods like BYD cars? I would say no. Those acts are done out of patriotism.

vitorgrs

Depends "where". Don't think most people in Latin America see a problem with the U.S being less powerful, even if it makes China more powerful.

This might sound weird, but in Latin America a lot of people see the U.S in a similar way that Europeans see Russia.

ben_w

Like 95% of the planet, I'm not American. Like 82% of the planet, I'm not Chinese.

BYD being better than Tesla isn't a matter of patriotism in most of the world. DeepSeek and Baidu can spend as long as they want playing musical chairs/rap battles with Anthropic and OpenAI, it makes no odds to me which wins.

America and China both have politics that have no reason to care for people like me, nor people like my friends. That they are so for different reasons, and differ in penalties for being an out-group, doesn't matter when I'm a foreigner to both, when my antecedents are who the 13 Colonies rebelled against and more recent antecedents forced unwanted opium sales on China.

ninetyninenine

I’m speaking to the composition of people on HN.

gaoryrt

So you from UK.

tw04

I think (hope) most folks care less about the “attack on patriotic identity” and are more concerned that what is essentially a dictatorship is rising in power significantly. History has shown dictatorships rarely end well for the general populace and the rest of the world.

Democracy has its flaws, but one of the features that most people prefer is that it can significantly change how it looks and operates to reflect the will of its people without violence.

somenameforme

I don't think this is really true. History mostly just shows that hegemonic powers rarely end well for other countries, and ultimately even for the people under said hegemony. The same will obviously be written of the US in the history books. We've invaded, overthrown, or tried to overthrow so many countries that you'd have a far easier time counting the countries we haven't tried to dominate in one way or the other.

And historically many of the greatest eras under Ancient Greece and Rome were under autocratic systems that advanced humanity by essentially every single metric. For that matter China has been among the most powerful countries in the world countless times - yet I think relatively few would ever know this because it's always been a quite insular nation, and never pursued hegemony in the same way as Western empires. Of course that could change but it seems extremely unlikely. Pursuing the perpetuation of global hegemony has been anything but fruitful for the US, and it should be a great lesson for the rest of the world. Those times, not just of the US - but of any global hegemon, are probably behind us.

czottmann

> more concerned that what is essentially a dictatorship is rising in power significantly

Which one?

krapp

>Democracy has its flaws, but one of the features that most people prefer is that it can significantly change how it looks and operates to reflect the will of its people without violence.

Internally, maybe. But China becoming a de facto superpower doesn't mean everyone else becomes Chinese any more than America being a superpower means everyone else becomes American. The salient point for most people is how that superpower balances the carrot of trade and the stick of violence to maintain its hegemony. To that end the US has far worse of a track record than does China.

Unless the implication is that China intends to directly colonize Western countries, which is something only the US is currently threatening to do.

xbmcuser

That has been falsely taught to you; the real fight has never been about the type of rule, but rather about the type of economy. The US and the West hate China not because it is a dictatorship, but because its economy is not a private-capital economy, and it is showing that a country can succeed without private citizens completely taking it over.

In the last 40-50 years it has been the US and Western countries that have been involved in bringing down democracies that had the slightest socialist tendencies, and that propped up dictatorships that allowed companies to exploit those countries' resources. So it is not about the type of government but rather the type of economy.

tm-infringement

Honestly I'm more worried about the US backsliding to full authoritarianism with the usual "spicier" foreign policy. The more politically insular China of the current regime seems stable enough. Xi could have even 15 years left in the tank before succession shenanigans start. Obviously this is from a LATAM perspective; if I were in Taiwan or South Korea, I would be considerably more spooked.

znpy

I’m more concerned about the silence from congress and other similar government entities, to be honest. Are they complicit?

sunaookami

As a European I can say that I like this development because prices go down and models get better and OpenAI has no monopoly anymore.

flir

Given the framing of "most people" and "patriotic": China's got 1bn+ people.

somenameforme

It has nothing to do with just giving up and going 'Wellp, I guess China wins.'

China and the US are obviously very different culturally in just about every way possible. This difference makes for great competition. Someone in another topic mentioned something that seemed pretty insightful to me - in that where LLM companies failed in the US was in basically becoming clones of each other, whereas DeepSeek (and now perhaps Baidu) were going in a different way, and that way turned out to be better.

US companies will inevitably copy these strategies, one way or the other, as will Chinese companies copy what ends up working well from the US (see their latest rockets looking more than a little inspired by Starship). And the true competitiveness ensures in the end that the main people who will win will not be whichever guy ended up founding an AI company first, but you and I. It's how capitalism is supposed to work - companies beat themselves down into a race to the bottom, and society reaps the rewards. It only gets really messed up when there's no "real" competition, which is an increasingly frequent state of affairs. But that definitely will not be the case here.

Expect the same thing from India in the future as well. Their economy is advancing rapidly, and soon enough we're going to have another 1.4 billion people able to fully utilize the outliers such a population entails to similarly drive things forward in their own unique way. It's a great future for the world as a whole.

greenie_beans

don't know why you're getting down voted because it's true. we should work together with the new world superpower instead of fight it.

and don't start on some dictator BS. the US does/has done as many, if not more, bad things as china.

jampekka

And open weights promised for June. China is really taking over in the ML game.

https://x.com/Baidu_Inc/status/1890292032318652719

vdfs

What's AI 3 months in real world time? could be old or obsolete by then

pacifika

Is the title claim correct? It is not mentioned as such in the tweet.

throaway55623

I feel like DeepSeek had such good media reception, and SOTA models are so close, that "GPT4x performance at y% the price" is an easy tagline that companies will be using in the coming 6 months. It's an easy goal to achieve because of diminishing returns in compute and game-able benchmarking, cherry-picking, distilling etc.

Not to say there can't be actual interesting improvements in performance/cost, but in many cases it will be more of a marketing angle.

Alifatisk

Yeah, I was wondering about that too. The benchmarks look good, but this seems to be more of a competitor to GPT-4o, not GPT-4.5.

aubanel

No it's not: their model is only on GPT-4.5 level on a few, saturated/cherry-picked benchmarks.

decide1000

ERNIE 4.5: Input and output prices start as low as $0.55 per 1M tokens and $2.2 per 1M tokens, respectively.

Comparison models: https://x.com/Baidu_Inc/status/1901094083508220035/photo/1
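To make the quoted prices concrete, here is a minimal sketch of the per-request arithmetic. It assumes the "start as low as" figures above apply flat to every token; the function name and the example workload are hypothetical, not from Baidu's announcement.

```python
# Prices quoted in the comment above, in USD per 1M tokens ("start as low as").
INPUT_PRICE_PER_M = 0.55
OUTPUT_PRICE_PER_M = 2.2


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request at the quoted flat prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workload: 10,000 prompt tokens and 2,000 completion tokens.
print(f"${request_cost(10_000, 2_000):.4f}")  # prints $0.0099
```

At these rates the output tokens cost four times as much as input tokens, so long completions dominate the bill even for prompt-heavy requests.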

simonw

Anyone managed to try this yet? https://yiyan.baidu.com/ appears to require a Chinese phone number.

taosx

Just tried it. Not sure exactly what model is behind the scenes but it was cringe. I provided specs for a coding task, it told me that the specs are possible but too complex so it just gave me an alternative naive way of doing it. I use LLMs as a tool so I'm trying to be very exact with my requirements and wording, this felt like it was basically negotiating the requirements with me...kinda annoyed me, lol. My suspicion is that it was trained too much on Chinese forums and the data was not refined enough.

dhx

You get one free question answered without a login. You can dismiss the login prompt which appears after submitting your question and use copy/paste with keyboard shortcuts or browser debug tools to retrieve the full answer (including the part hidden with CSS rules). Either use the XPath '//div[@id="answer_text_id"]//text()' or copy the text/event-stream response for the API call to https://yiyan.baidu.com/eb/chat/conversation/v2 once the SSE session has closed.[1] Clear cookies and site data and you'll get a new session and can keep going.

It can take about 20 seconds to return all tokens so it appears likely the login prompt is there to minimise resource consumption.

[1] https://developer.mozilla.org/en-US/docs/Web/API/Server-sent...
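If you go the copy-the-response route, the copied text still has to be split into events. Below is a minimal sketch of a parser for the generic text/event-stream framing (blank line ends an event, `data:` lines carry the payload, lines starting with `:` are comments). The function name and the sample text are made up for illustration; what the payloads from the endpoint above actually contain is not known here.

```python
import re


def parse_sse(stream_text: str) -> list[str]:
    """Return the data payload of each complete event in an SSE body."""
    events = []
    data_lines = []
    # Event streams may use \r\n, \n, or \r as line endings.
    for line in re.split(r"\r\n|\n|\r", stream_text):
        if line == "":
            # A blank line dispatches the event collected so far.
            if data_lines:
                events.append("\n".join(data_lines))
                data_lines = []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # One leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            data_lines.append(value)
        # Other fields (event:, id:, retry:) are ignored in this sketch.
    # Data not followed by a blank line is an incomplete event and is dropped.
    return events


# Invented sample, not a real response from any service.
sample = "data: Hello\n\n: keep-alive\ndata: wor\ndata: ld\n\n"
print(parse_sse(sample))
```

Multiple `data:` lines within one event are joined with a newline, so a payload split across lines comes back as a single string.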

pogue

I'm trying to figure out the same thing. They make claims about it being totally free, but everything is in Chinese and you appear to need a Chinese mobile number to register.

lopkeny12ko

"Free" does not mean "available to everyone."

andsoitis

The tweet is in English, which strongly suggests that the product is accessible in English, but it doesn’t appear to be.

That raises the question: what is the point of an announcement in English?

siva7

America, is this the future you want?

borgdefenser

Surely, this is as inevitable as not being able to use WeChat as an American.

The models aren't what worry me anyway. China is going to kick our ass when it comes to AI integration into society and the economy.

Imagine the difficulties faced by America vs China in integrating AI into healthcare.

We are just too worried about winning this AI model sporting event even though the entire concept is flawed and doomed to failure. We actually have to figure out how to use these models for more than how many Rs are in strawberry. That appears to be the actual hard part.

Of course, none of this is helped by having wasted an entire generation of some of America's best minds on JavaScript programming for obscene profit.

infecto

An entire generation is not wasted. The bigger issue is that China has no qualms about wiping out whole classes of jobs to be replaced by the next iteration, while America struggles with keeping those voters happy. Think about things like our lack of dock work automation in favor of keeping some labor unions happy.

Logge

GPT 4.5 is not a reasoning model. Reasoning models clearly outperform it. Even OpenAI's o3-mini is smarter while being orders of magnitude cheaper. Those 2 should be compared in my opinion. GPT 4.5 feels like a failed experiment to see how far you can push non-thinking models.

logicchains

>GPT 4.5 feels like a failed experiment to see how far you can push non-thinking models

It's not a failed experiment, it's a very good experiment, because it produced a very useful piece of information for the world (that there's limited return to further size scaling).

Logge

Good point. But pushing it as a product with that knowledge still puts it in a weird spot for me.

azinman2

Outperform in what way? Reasoning models may be able to solve problems correctly a bigger percentage of the time, but they burn many tokens to get there. So they’re much less efficient, both in latency and ultimately environmental cost.

colesantiago

Good.

OpenAI, Anthropic, et al, are getting sucked into a vortex of competition with China that is ultimately going to zero.

AI is the ultimate race to zero.

There is no moat. AI and intelligence are becoming a commodity, with nobody (except Nvidia) making money. This has been known for a while now.

The acceleration and adoption will only leave those in the middle, who aren't aware of the change happening, without a job and unable to get one.

The US-China competition in addition to Jevons Paradox will be so viciously fierce that jobs will be removed as soon as they are created.

naveen99

Intelligence cannot be a commodity, because complexity is infinite. By definition the top 1% in understanding complexity are the top 1% in intelligence.

jamesblonde

Baidu have a long history in the scalable distributed deep learning space. PaddlePaddle (so good they named it twice) predates Ray and supports both data-parallel and model-parallel training. It is still being developed.

https://github.com/PaddlePaddle/Paddle

They have pedigree.

kleiba

US: Could I interest you in my lunch?

China: Thanks, already on it.

curl-up

Cheap means small, small means low Q&A scores. I know that this isn't that important for the majority of applications, but I feel that over-reliance on RAG whenever Q&A performance is discussed is quite misleading.

Being able to clearly and correctly discuss science topics, to write about art, to understand nuances in (previously unseen) literature, etc. is impossible simply through powerful-reasoning + RAG, and so many advanced use cases would be enabled by this. Sonnet 3.5+ and GPT 4.5 are still unparalleled here, and it's not even close.