Alibaba Cloud says it cut Nvidia AI GPU use by 82% with new pooling system

kilotaras

Alibaba Cloud claims to have reduced the number of Nvidia GPUs used for serving unpopular models by 82% (emphasis mine)

> 17.7 per cent of GPUs allocated to serve only 1.35 per cent of requests in Alibaba Cloud’s marketplace, the researchers found

Instead of 1192 GPUs they now use 213 for serving those requests.
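
As a quick sanity check of the headline figure, here is a minimal sketch that recomputes the reduction from the two GPU counts quoted above. The function name `reduction_pct` is made up for illustration; only the numbers 1192 and 213 come from the thread.

```python
# GPU counts quoted in the comment above (before and after pooling).
GPUS_BEFORE = 1192
GPUS_AFTER = 213


def reduction_pct(before: int, after: int) -> float:
    """Return the percentage reduction going from `before` to `after`."""
    return (before - after) / before * 100


if __name__ == "__main__":
    # Prints the reduction rounded to one decimal place.
    print(f"{reduction_pct(GPUS_BEFORE, GPUS_AFTER):.1f}% fewer GPUs")
```

This only confirms that the two counts are consistent with the "82%" in the headline; it says nothing about which workload those counts refer to.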

yorwba

Not really, Figure 1(a) of the paper says that the 17.7% are relative to a total of 30k GPUs (i.e. 5310 GPUs for handling those 1.35% of requests) and the reduction is measured in a smaller beta deployment with only 47 different models (vs. the 733 "cold" models overall). Naïve extrapolation by model count suggests they would need 3321 GPUs to serve all cold models, a 37.5% reduction compared to before (or a 6.6% reduction of the full 30k-GPU cluster).
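
The extrapolation in this comment can be reproduced with a few lines of arithmetic. This is only a sketch of the commenter's back-of-the-envelope reasoning, assuming GPU demand scales linearly with model count (the comment itself calls this naïve); the variable names are illustrative and all inputs are the figures cited in the comment.

```python
# Inputs as cited in the comment (Figure 1(a) of the paper and the beta deployment).
TOTAL_GPUS = 30_000   # full cluster size
COLD_SHARE = 0.177    # share of GPUs serving the rarely used ("cold") models
COLD_MODELS = 733     # number of cold models overall
BETA_MODELS = 47      # models in the beta deployment
BETA_GPUS = 213       # GPUs used by the beta deployment

# GPUs dedicated to cold models before pooling.
cold_gpus_before = TOTAL_GPUS * COLD_SHARE

# Assumption: GPU need grows linearly with the number of models served.
cold_gpus_after = BETA_GPUS * COLD_MODELS / BETA_MODELS

saved = cold_gpus_before - cold_gpus_after
pct_of_cold = saved / cold_gpus_before * 100
pct_of_cluster = saved / TOTAL_GPUS * 100

if __name__ == "__main__":
    print(f"cold GPUs before pooling: {cold_gpus_before:.0f}")
    print(f"extrapolated cold GPUs after pooling: {cold_gpus_after:.1f}")
    print(f"reduction within the cold pool: {pct_of_cold:.1f}%")
    print(f"reduction relative to the full cluster: {pct_of_cluster:.1f}%")
```

Using the unrounded extrapolation (about 3321.9 GPUs) gives a reduction of roughly 37.4%; the 37.5% quoted above corresponds to truncating to 3321 GPUs first. Either way, the result is far from the 82% headline.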

djoldman

Key paragraph:

> However, a small handful of models such as Alibaba’s Qwen and DeepSeek are most popular for inference, with most other models only sporadically called upon. This leads to resource inefficiency, with 17.7 per cent of GPUs allocated to serve only 1.35 per cent of requests in Alibaba Cloud’s marketplace, the researchers found.

checker659

They are working with tiny models. Not sure how well it'd scale to bigger models (if at all).

CaptainOfCoit

They're all LLMs, so no, not tiny, but not exactly huge either:

> Our current deployment runs in a cross-region cluster comprising 213 H20 GPUs, serving twenty-eight 1.8–7B models (TP=1) and nineteen 32–72B models (TP=4).

hunglee2

The US attempt to slow down China's technological development succeeds on the basis of preventing China from directly following the same path, but may backfire in the sense that it forces innovation by China in a different direction. The overall outcome for us all may be increased efficiency as a result of this forced innovation, especially if Chinese companies continue to open source their advances, so we may in the end have reason to thank the US for their civilisational gatekeeping.

dlisboa

History has shown that withholding technology from China does not significantly stop them and they'll achieve it (or better) in a small number of years.

In many senses there's hubris in the western* view of China's accomplishments: most of what western companies have created has had significant contribution by Chinese scientists or manufacturing, without which those companies would have nothing. If you look at the names of AI researchers there's a strong pattern even if some are currently plying their trade in the west.

---

* I hate the term "western" because some "westerners" use it to separate what they think is "civilized" from "uncivilized", hence for them LATAM is not "western" even though everything about LATAM countries is western.

zawaideh

Re: Western. A similar thing plays out when the term "international community" is used in news. It refers to the US and its major allies which means US, Canada, Western Europe, Japan, Australia and New Zealand more or less.

nicoburns

> A similar thing plays out when the term "international community" is used in news. It refers to the US and its major allies which means US, Canada, Western Europe, Japan, Australia and New Zealand more or less.

Wait, really? I thought "international community" meant all countries.

newyankee

Essentially countries that were developed prior to 1990 or so, although South Korea is a tricky case today going by this definition, as are Taiwan, Hong Kong and Singapore.

tsunamifury

Yes, community refers to those who participate in the community.

How is this hard to understand?

Broadly speaking, Côte d'Ivoire and the like are not participants in the international community.

achierius

> most of what western companies have created has had significant contribution by Chinese scientists or manufacturing, without which those companies would have nothing. If you look at the names of AI researchers there's a strong pattern even if some are currently plying their trade in the west.

While I don't disagree with your overall point, it's important to recognize that this is only a phenomenon of the last ~30 years, and to avoid falling into the trap of Han racial chauvinism. E.g. there were ~no Chinese scientists in Germany in the 70s but they were heavily innovating nevertheless.

dlisboa

Absolutely. China obviously has a longer history with innovation but they like to make it seem everything was invented by them at some point in the past. I'd say newer technology is where China has had a bigger impact.

Consequently newer tech is precisely where global cooperation is most required so no country can really do it by themselves. We could even say no country, western or otherwise, has been doing it on their own for the past 500 years or so but alas...

onlyrealcuzzo

> History has shown that withholding technology from China does not significantly stop them and they'll achieve it (or better) in a small number of years.

It's worked for a very long time for aircraft.

China has been pushing to build its own aircraft for >23 years. It took 14 years for COMAC to get its first regional jet flying commercial flights on a Chinese airline, and 21 years to get a narrow-body plane flying a commercial flight on a Chinese airline.

Even if the reasons are purely political rather than technical, COMAC may still be decades away from being able to fly to most of the world.

Likewise, in ~5 years, China may be able to build chips that are as good as Nvidia's once Nvidia's 90% profit margin is taken into account - i.e. they are 1/10th as good for the price - but since they can buy them at cost, they're the same price for performance and good enough.

Even if only for purely political reasons, China may never be able to export these chips to most of the world - which limits their scale - which makes it harder to make them cost effective compared to Western chips.

Yoric

> Even if only for purely political reasons, China may never be able to export these chips to most of the world - which limits their scale - which makes it harder to make them cost effective compared to Western chips.

Note that this happens at the same time the US is breaking up its own alliances, so as of this writing, there's no such thing as certainty about politics.

wood_spirit

But at the same time they are fielding multiple new stealth aircraft and their jets and missiles outperformed western aircraft in the recent Pakistan India flare-up.

sofixa

> China has been pushing to build its own aircraft for >23 years. It took 14 years for COMAC to get its first regional jet flying commercial flights on a Chinese airline, and 21 years to get a narrow-body plane flying a commercial flight on a Chinese airline

And both those planes have a strong dependency on "western" components that won't be overcome before the 2030s, and even then, they're around a generation behind.

lawlessone

In a way withholding a tech becomes a signal saying "Hey this is important" so the result is China dedicates more resources to researching it lol.

raincole

> look at the names

Why would I do that tho? If we look at the names of scientists/researchers/engineers/businessmen, the conclusion would be that the US has contributed nothing to the world. Europeans did all the hard work!

thesmtsolver

Another equivalent way to look at that:

Historically, top scientists/researchers/engineers/businessmen migrate from the rest of the world to the US rather than to Europe or China.

Imagine if Europe or China were a bit more open with immigration and equally attractive, we would see the same pattern there too.

caycep

this is true for anyone - create challenges, and you optimize efficiency elsewhere.

Also, isn't this the usual path to better computer science? Reducing computation needs by making better/more efficient algorithms? The whole "trillions of dollars of brute force GPU strength" proposed by Altman, Nadella, Musk et al just seems to reinforce that these are business people at heart, not engineers/computer scientists...

rayiner

Nobody thinks the Japanese aren’t “civilized.” “Western” is just a euphemism for “rich and orderly.”

switchbak

It is an odd category, and Japan is often considered to be "Western" - these days at least. That certainly wasn't the case even a few generations ago.

I think it's ostensibly supposed to be more about shared cultural values, but even that is a pretty weak way to divide countries. Perhaps "an ally of the United States" is a little more accurate?

Any societal dividing line like this is bound to hit on problems once subjected to the real world.

bad_haircut72

It's more about democracy and adhering to the global (set up by America post WW2) system of laws and trade.

notepad0x90

I think anti-immigrant rhetoric will have the most impact against the US. A lot of the people innovating on this stuff are being maligned and leaving in droves.

Aside from geography, attracting talent from all over the world is the one edge the US has as a nation over countries like China. But now the US is trying to be xenophobic like China, restrict tech import/export like China but compete against 10x population and lack of similar levels of internal strife and fissures.

The world, even Europe, is looking for a new country to take on a leader/superpower role. China isn't there yet, but it might get there in a few years after their next-gen fighter jets and catching up to ASML.

But, China's greatest weakness is their lack of ambition and focus on regional matters like Taiwan and south china sea, instead of winning over western europe and india.

dlisboa

> But, China's greatest weakness is their lack of ambition and focus on regional matters like Taiwan and south china sea, instead of winning over western europe and india.

That's a strength. Them not having interest in global domination and regime change other than their backyard is what allows them to easily make partners in Africa and LATAM, the most important regions for raw materials.

notepad0x90

You would think so, but historically that's why they never became more than a regional power. Empires for millennia craved trade with China but only the mongols from that region made it all the way to western europe in their invasions.

It is a strength, if their goal is to have a stable and prosperous country long term, and that seems to be what they want. good for them. But nature abhors a vacuum, so there will always be an empire at the top of the food chain. Such empires want to maximize wealth for their people and secure them against threats, that's why invasions and exploitation of weaker countries happens. That game hasn't changed. Friendly relations work, until you need a lot of resources from a country that doesn't want to give it up. Or, like with the US, when they're opening up military bases next to your borders and you need a buffer state. Or, when naval blockades and sanctions are being enforced against your country for not complying with extra-sovereign demands.

History shows that countries content with what they have collapse or weaken very quickly.

China will have a population crisis in a few decades for example, and it won't have the large manufacturing base and its people will be too used to luxuries to go back to slaving for western countries for pennies. Keep in mind that the current china itself is so great and prosperous because of all the invasions it did against western china and satellite states like Vietnam and north Korea (the US isn't special in this regard).

ikidd

>Them not having interest in global domination and regime change

I don't even know where to begin with that one.

OrvalWintermute

if you've been tracking the shark deals they give countries for loans, I think you'd recant what you just said.

"while the CCP accuses the West of predatory interest rates, the average Chinese rescue loan carries an interest rate of about 5 percent, more than double the IMF’s standard 2 percent. As of Oct. 1, 2025, despite higher U.S. interest rates, the IMF’s Special Drawing Rights lending rate stands at only 3.41 percent, still significantly lower than what China charges struggling nations for so-called relief."

These countries paying these loans are the ones least able to pay them back, and at more than double IMF loans, they are really putting them in a vise.

bkandel

China's greatest weakness is that their working-age population has already peaked and is in the process of plummeting, which will continue over the coming decades.

notepad0x90

Yes, and being content and lacking ambition isn't good. Expansionism and immigration can solve that, but they're culturally stagnant in that regard.

Without immigration, the US would have faced the same problems.

rayiner

> But now the US is trying to … compete against 10x population and lack of similar levels of internal strife and fissures.

I can’t tell whether you think the anti-immigration stance is a good thing or bad thing.

csomar

> But, China's greatest weakness is their lack of ambition and focus on regional matters like Taiwan and south china sea, instead of winning over western europe and india.

How can they have international hegemony before they clear their regional order? China is more interested in aligning Taiwan than invading; though it’ll probably invade if it can’t align it diplomatically.

China is probably not interested in continuing the current Western-style order but in implementing their own sino-stuff. At least with the CCP at the helm.

lesuorac

The US isn't slowing China anymore.

China has an import ban on chips [1] so it's irrelevant what the US does.

[1]: https://www.cnbc.com/2025/09/17/nvidia-ceo-disappointed-afte...

xadhominemx

The US is certainly slowing down China considerably. China would certainly not have an import ban on Blackwell GPUs if they were made available. And upstream, the ban on EUV and other high-end semiconductor production equipment has severely limited China's capacity to produce logic and DRAM (including HBM).

overfeed

> China has an import ban on chips

Only in response to the US banning the export of cards China wanted. The import ban is the Chinese government burning the landing ships; it clearly communicates to everyone involved that there is no going back.

unethical_ban

Would they have done that if the US had been more "reliable" in providing the chips and didn't cut them off in the first place?

The point still stands that the US instigated the split.

sspiff

There are signs that China is not open sourcing their SOTA models anymore. Both Huawei and Qwen (Qwen-Max, WAN 2.5) have launched flagship models which are yet to be open-sourced.

natrys

Qwen's Max series has always been closed-weight; it's not a policy change like you are implying.

What exactly is Huawei's flagship series anyway? Because their PanGu line is open-weight, but Huawei is as of yet not in the LLM making business, their models are only meant to signal that it's possible to do training and inference on their hardware, that's all. No one actually uses those models.

camel_Snake

Small counterpoint, but there are also 2 new players putting out SOTA open source models (Moonshot's Kimi and Zhipu's GLM) so we're still seeing the same number of models overall, just via newer entrants.

segmondy

may backfire? it's a bit too late for that.

go to 2024, western labs were crushing it.

it's now 2025, and from china, we have deepseek, qwen, kimi, glm, ernie and many more capable models keeping up with western labs. there are actually now more chinese labs releasing sota models than western labs.

Workaccount2

But they aren't keeping up

They are lauded for their ability-to-cost ratio, or their ability-to-parameter ratio, but virtually everyone using LLMs for productive work is using ChatGPT/Gemini/Claude.

They are kind of like Huffy bicycles. Good value, work well, but if you go to any serious event, no one will be riding one.

hunglee2

too early to call a winner, though it is disappointing to see US withdrawal from open source. Still the main outcome of open source is distribution / diffusion of the idea, so it will inevitably mean US open source will come back, hopefully via some grass roots maniac; a Linus-like character will emerge at some point

mixologist

user growth has slowed. the technology that should help users is only being pushed from the top, while users refuse to use it. openai pivoted to porn.

does it really feel like they have a chance to recover all the expenses in the future?

crypto grifters pivoted to ai and, same as last time, normal people don’t want to have anything to do with them.

considering the amount of money burned on this garbage, i think we can at least declare a loser.

segmondy

i'm not calling a winner, i'm just saying that the chinese have caught up despite the embargo. google, openai & anthropic have phenomenal models. i stopped using openai & anthropic after they called for open weight/source regulation. i use google because they offer gemma and i got a year's gemini-pro subscription for free, use openai gpt-oss-120b since i can run it at home, and the only model i currently pay for is a chinese model.

rzerowan

Fingers crossed for convergence rather than divergence in the technical standards. Although the way things are going, it looks like the 2 stacks will diverge sooner rather than later, with the US+ banning the use of CHN models while simultaneously banning the export of its quasi-open models. We may very well end up in a situation like the old PAL vs NTSC video standards, except that PAL (EU/Asia/Africa) and NTSC (Americas/Japan) gradually converged with the adoption of digital formats. Here, instead, there would be a divergence based on geopolitical considerations.

hunglee2

positive take: a bifurcated tech tree might give us (humanity) a better chance of faster advancement, as it would be a persistent A/B test in a live environment. Where I would join you in the crossing of fingers is to ensure such A/B testing is competitive but not destructive. We may even evolve to a situation of complementarity, an American Yin vs the Chinese Yang. Let's hope so!

reliabilityguy

Tbh this whole situation reminds me of how Japan excelled in making a lot more with a lot less after WW2, e.g., fuel-efficient engines, light cars, etc. These constraints were not present in the US (and to some extent in Europe), and resulted in US cars being completely uncompetitive in non-US markets.

dataviz1000

I've been in Chile, Peru, Colombia, Panama, and Costa Rica.

The streets are flooded with cheap Chinese cars and I see more BYD than American cars. If the car wasn't made in Japan or Korea which probably account for most of the cars, it was likely made in China. Moreover, I haven't been in countries with the closest ties to China.

sofixa

> The streets are flooded with cheap Chinese cars and I see more BYD than American cars

This isn't surprising in any way, American "cars" (quotes because the vast majority of what American manufacturers pump out isn't cars, it's trucks) haven't been competitive in decades. The only globally competitive vehicles were developed in Europe by GM Europe (Opel, since sold to PSA now Stellantis) or Ford Europe (which axed all models bar the Puma). The rest is too big, expensive and inefficient for the vast majority of uses. Tariffs and good marketing keep American car manufacturers in business in the US, but those don't work in most other markets.

The more appropriate comparison is with European automakers such as VW Group, Stellantis (Peugeot, Citroën, DS, Fiat, Chrysler, Dodge, Ram), Renault. And there too BYD is winning in most countries, but at least there's a comparison possible.

tsunamifury

The premature optimizer is never the innovator.

Japan eventually stopped that role and their products improved greatly.

knowitnone3

It's much easier to copy what others are doing instead of spending the time and money for research and engineering. It's also much easier if you steal the tech. I could never have invented a bicycle but I can sure make a copy of one.

braza

Does someone know if there's some equivalent of those engineering/research blogs for Chinese companies?

I used to follow the ones from Western companies, but honestly, after some point in time, I would like to see some cases from what I consider a good engineering benchmark for everyone who does not work at FAANG.

supriyo-biswas

The company blogs of Chinese companies will often publish articles like this[1] talking about a new innovation or optimization that they did, but these will often just be mixed in with marketing articles too.

I would also assume there's a lot of content in the native Chinese forums, which unfortunately, as an English-speaking person, I wouldn't be able to easily refer to :(

[1] https://www.alibabacloud.com/blog/how-does-alibaba-ensure-th...

throwaway48476

It's easy enough for a well-resourced entity to take a pre-trained model and deploy it on new hardware to save on the NVDA tax. It's far less likely for research and model training to happen outside the mature NVDA ecosystem.

muddi900

[flagged]

Yoric

I haven't read the paper yet, but it's here: https://ennanzhai.github.io/pub/sosp25-aegaeon.pdf aka https://dl.acm.org/doi/10.1145/3731569.3764815 .

So, definitely not state media, probably not lying on the fundamentals. Of course, still presumably viewed favorably by the CCP, I'd imagine.

muddi900

Well if it is real, we will surely see our Claude Code limits go back up.

dotnet00

This is such a popular coping tactic from Americans when it comes to facing actual competition, especially from China. Everything they do must either be a lie or just stolen American technology, as if there's something inherently special about Americans that no one else has.

serf

It's easy to guess that an opponent that is focusing on information control and theatre above all else is doing so for reasons.

> Everything they do must either be a lie or just stolen American technology

as an aviation enthusiast for 30+ years this claim, while deliberately blunt, is not far from the truth -- the truth being that half of their hardware was stolen Russian design, too.

Let's consider : The KJ-600, the J-31, J-10, H-6, Z-20, J-7, J-15, J-11.

If it isn't a direct shape-to-shape knockoff like the J-31 it's either a licensed reproduction from Russia or something derived from a reverse engineering effort like the Su-33 prototype they got from Ukraine. Similar story with their Ghost Bat knockoffs.

There are very few novel designs. I'm not faulting the methodology -- the shape of the thing w.r.t. aircraft is half (if not more) of the struggle.

It's a tremendous advantage to start from a known good shape and go from there. If I were the boss I would do exactly the same thing when trying to bootstrap an aerospace industry.

>as if there's something inherently special about Americans that no one else has.

the US has proven numerous times that this is exactly the case.

muddi900

That is a great response to something I did not say.

throwacct

Interesting. So, we're going to deny most of the IP theft by China up to this moment? Do you even think China is this advanced just because of Chinese innovation? C'mon man.

lossolo

It comes from what people are taught in schools and from their own self perception. When those beliefs about American exceptionalism are challenged, cognitive dissonance kicks in.

https://en.wikipedia.org/wiki/American_exceptionalism

muddi900

Did y'all not read the last sentence?

larus_canus

Ah, yes, the American media environment, which is internationally famous for not lying.

muddi900

So we are in agreement, we should not take everything at face value.

larus_canus

I just hope you extend this skepticism consistently.

throawayonthe

scmp is kinda the opposite of state media lol

muddi900

It is owned by Alibaba.

Can anybody independently verify any of this?