
Huawei cloned Qwen and DeepSeek models, claimed as own

egypturnash

LLMs are apparently completely incompatible with copyright anyway, so if you can train them without paying a single dime to anyone whose work you ingest, then you should be able to clone them for free. What goes around comes around.

mensetmanusman

They are naïvely incompatible, but lawyers will find a way to make it not so.

bigmattystyles

Old maps (and perhaps new ones) used to include fake little alleys ("trap streets") so a publisher could quickly spot rival publishers copying its maps rather than going out and actually surveying. I wonder if something similar is possible with LLMs.

tedivm

When I was at Malwarebytes we had concerns that IObit was stealing our database and passing it off as their own. While we had a lot of obvious proof, we felt it wasn't enough for the average person to understand.

To get real proof we created a new program that only existed on a single machine, and then added a signature for that application. This way there could be no claim that they independently added something to their database, as the program was not malware and literally impossible to actually find in the wild. Once they added it to their database we made a blog post and the issue got a lot of attention.
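The canary trick described above can be sketched in a few lines: a harmless file derived from a secret seed exists on exactly one machine, so any rival database that flags it must have copied the signature rather than found the file in the wild. (All names and values here are illustrative, not Malwarebytes' actual tooling.)

```python
import hashlib

def make_canary(seed: str) -> bytes:
    # A unique, harmless byte sequence that exists nowhere else.
    return b"CANARY-" + hashlib.sha256(seed.encode()).hexdigest().encode()

def database_flags(database: set, blob: bytes) -> bool:
    # If a database flags a file that only ever existed on one machine,
    # its maintainers could not have discovered it independently.
    return hashlib.sha256(blob).hexdigest() in database

canary = make_canary("single-machine-only")
our_db = {hashlib.sha256(canary).hexdigest()}   # we add a signature for it
rival_db = set()                                # rival has no way to see it

print(database_flags(rival_db, canary))   # False: no copying (yet)
print(database_flags(our_db, canary))     # True: our own signature matches
```

If the rival's database later starts flagging the canary's hash, the only plausible explanation is that they copied the signature database.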

https://forums.malwarebytes.com/topic/29681-iobit-steals-mal...

landl0rd

The classic example here is subtle, harmless defects/anomalies built into computer chips. Half the stuff China has made is full of these, because it's straight ripped from reverse engineering of TI or whoever's designs.

Very funny that Chinese firms even do this to each other; equal-opportunity cheats.

throwaway74354

It's an important part of the culture and is not considered cheating. IP-protection laws and legal precedents are not universal truths.

This essay is a good explainer on the topic: https://aeon.co/essays/why-in-china-and-japan-a-copy-is-just... It's a thoroughly studied phenomenon.

ateng

Youtuber Jay Foreman made a video about fake alleys in maps https://www.youtube.com/watch?v=DeiATy-FfjI

Tokumei-no-hito

i have come across this one for example https://github.com/sentient-agi/OML-1.0-Fingerprinting

> Welcome to OML 1.0: Fingerprinting. This repository houses the tooling for generating and embedding secret fingerprints into LLMs through fine-tuning to enable identification of LLM ownership and protection against unauthorized use.
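As a hedged sketch of the general idea (not the OML repository's actual API): derive secret (prompt, response) pairs from an owner's key, fine-tune the model to emit each response, and later verify ownership by checking how many of the secret prompts still elicit the embedded replies.

```python
import hmac, hashlib

def fingerprint_pairs(secret: bytes, n: int = 8) -> list:
    # Derive obscure (prompt, expected-response) pairs from a secret key.
    # The owner would fine-tune these into the model; here we only show
    # the derivation and the later verification step.
    pairs = []
    for i in range(n):
        tag = hmac.new(secret, str(i).encode(), hashlib.sha256).hexdigest()
        pairs.append((f"fp-query-{tag[:12]}", f"fp-reply-{tag[12:24]}"))
    return pairs

def ownership_score(model, pairs) -> float:
    # Fraction of secret prompts that still elicit the embedded replies.
    hits = sum(model(q) == a for q, a in pairs)
    return hits / len(pairs)

pairs = fingerprinted_data = fingerprint_pairs(b"owner-secret")
fingerprinted = dict(pairs)   # toy stand-in for a fine-tuned model
print(ownership_score(fingerprinted.get, pairs))        # 1.0
print(ownership_score(lambda q: "", pairs))             # 0.0 for a clean model
```

The prompts are deliberately improbable strings, so a model that was never fine-tuned on them has essentially zero chance of producing the matching replies by accident.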

NitpickLawyer

Would be interesting to see if this kind of watermarking survives the frankenstein types of editing they are presumably doing. Per the linked account, they took a model, changed tokenizers, and added layers on top. They then presumably did some form of continued pre-training, and then post-training. It would have to be some very resistant watermarking to survive that. It's not as simple as making the model reply with "my tokens are my passport, verify me" when you ask them the weather in NonExistingCity... Interesting nonetheless.

yorwba

The original whistleblower article in Chinese at the bottom (but not the English version at the top) has this part:

实际上,对于后续训了很久很久的这个模型,Honestagi能够分析出这个量级的相似性我已经很诧异了,因为这个模型为了续训洗参数,所付出的算力甚至早就足够从头训一个同档位的模型了。听同事说他们为了洗掉千问的水印,采取了不少办法,甚至包括故意训了脏数据。这也为学术界研究模型血缘提供了一个前所未有的特殊模范吧。以后新的血缘方法提出可以拿出来溜溜。

In fact, I'm surprised that HonestAGI's analysis could show this level of similarity for this model that had been post-trained for a long time, because the computing power used to train-wash the parameters of this model was enough to train a model of the same size from scratch. I heard from my colleagues that they took many measures to wash off Qwen's watermark, even deliberately training on dirty data. This also provides an unprecedented case study for the academic community studying model lineage. If a new lineage method is put forward in the future, you can take it for a spin.
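Lineage analyses like HonestAGI's typically start from something as simple as comparing same-named weight tensors across two checkpoints: a "train-washed" derivative still tends to show suspiciously high per-layer similarity to its base. A minimal illustration, with toy vectors standing in for real tensors:

```python
import math

def cosine(u, v):
    # Cosine similarity between two flattened weight vectors.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def layer_similarity(model_a, model_b):
    # Compare same-named weight tensors between two checkpoints.
    return {name: cosine(model_a[name], model_b[name])
            for name in model_a.keys() & model_b.keys()}

base = {"attn.q": [0.2, -1.1, 0.7, 0.05], "mlp.up": [1.0, 0.3, -0.4, 0.9]}
# A lightly "train-washed" copy: small perturbations on top of the base.
washed = {name: [w + 0.01 * i for i, w in enumerate(ws)]
          for name, ws in base.items()}

sims = layer_similarity(base, washed)
print(all(s > 0.99 for s in sims.values()))  # layers remain near-identical
```

Real lineage methods go further (activation statistics, attention patterns, shared quirks on rare tokens), but the whistleblower's point stands: washing parameters hard enough to defeat all of these costs about as much compute as training from scratch.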

varispeed

I often say an odd thing on a public forum, or make up a story, and then see if an LLM can bring it up.

I started doing that after an LLM gave me a solution to a problem that was quite elegant but was not implemented in the particular project. It turned out it had learned it from a GitHub issue describing how the problem could be tackled, but the PR never actually landed.

richardw

I’ve wondered whether humans who wanted to protect some areas of knowledge just start writing BS here and there. Organised and large scale, with hidden orchestration channels, it could potentially really screw with models. Put the signal to humans in related but slightly removed places.

throwaway48476

Chinese efficiency. The west is held back by archaic IP laws.

JPLeRouzic

That's a very human and very honest report. It shows the confusion inside some big companies and how management pressure favors dishonest teams. The writer left the company. I hope he is well; he seems like a fine person.

dworks

Yes. In fact, this report should be read in the context of other farewell letters to employers that have been published recently in China. One, by a 15-year Alibaba veteran, decried the decline of the company's culture as a cause of its lost competitiveness and inability to launch new products.

The issues in this report really come down to: 1. Lies to the country about Huawei's capabilities (an important national issue) 2. Lies to customers who paid to use Huawei models 3. A rigid, KPI-focused, unthinking organization where dishonestly gaming the performance-review system not only works but seems to be the whole point and is tacitly approved (this, and the reporter's idealism and loss of faith, is the main point of the report as I see it)

yorwba

I think the reporter's motivations would've come across more clearly if you had posted a paragraph-by-paragraph translation instead of the current abridged version. (I assume Dilemma Works is your Substack.) Lots of details that add color to the story got lost.

option

"Organization: We belong to the “Fourth Field Army” initiative. Under its structure, core language large models fall under the 4th brigade; Wang Yunhe’s small-model group is the 16th brigade."

- Lol, what? So is this literally a part of CCP military?

option

Doesn't feel like a healthy culture, IF true. Also, apparently current DeepSeek lab members aren't allowed to travel to conferences. This is all maybe good for execution but absolutely not for innovation

gausswho

"Saturday was a working day by default, though occasionally we had afternoon tea or even crayfish."

Unexpected poetry. Is there a reason why crayfish would be served in this context?

tecleandor

I understood it as "even as they made us work on Saturday, we sometimes had the luck of getting an afternoon snack", and I guess crayfish might be popular there. Or maybe it's a mistranslation.

alwa

Immensely popular, delicious, and very beautiful on a plate or in a bowl, both whole/boiled/stir-fried and as snack packs of pre-peeled tails! See, e.g.,

https://mychinesehomekitchen.com/2022/06/24/chinese-style-sp...

So yes, I read it the same way you do: “They made us work weekends, but at least they’d order us in some pizzas.”

(…and if you’re in the US, you can have them air-freighted live to you, and a crawfish boil is an easy and darn festive thing to do in the summer. If you’re put off by the crustacean staring back at you, and you have access to a kitchen that operates in a Louisianan style, you might be able to find a “Cajun Popcorn” of the tails seasoned, battered, and fried. Or maybe one of the enormous number of “seafood boil” restaurants that have opened in the US in recent years.)

(I feel like those establishments came on quickly, that I notice them mainly in spaces formerly occupied by American-Chinese restaurants, and that it’s felt like a nationwide phenomenon… I suspect there’s a story there for an enterprising young investigative nonfiction writer sort.)

tecleandor

Oh! That sounds tasty. I'm in EU, but I'm gonna take note of both. Thanks.


kkzz99

Remember that there was a Huawei lab member who got fired for literally sabotaging training runs. I would not be surprised if that was him.

yorwba

I think the case you're talking about is this one: https://arstechnica.com/tech-policy/2024/10/bytedance-intern... where it was a ByteDance intern.

matt3210

The question is who really made the original models?

tengbretson

In the LLM intellectual property paradigm, I think this registers as a solid "Who cares?" level offence.

brookst

The point isn’t some moral outrage over IP, the point is a company may be falsely claiming to have expertise it does not have, which is meaningful to people who care about the market in general.


tonyedgecombe

Nobody who pays attention to Huawei will be surprised. They have a track record of this sort of behaviour going right back to their early days.

npteljes

While true, these sorts of reports are the track records which we can base our assessments on.

didibus

Ya, the models have stolen everyone's copyrighted intellectual property already, so I don't have a lot of sympathy. In fact, the more the merrier: if we're going to brush off that they're all trained on copyrighted material, we might as well make sure they end up a really cheap, competitive, low-margin, accessible commodity.

lambdasquirrel

Eh... you should read the article. It sounds like a pretty big deal.

some_random

Claiming to care deeply about IP theft in the more nebulous case of model training datasets then dismissing the extremely concrete case of outright theft seems pretty indefensible to me.

Arainach

Everyone has a finite amount of empathy, and I'm not going to waste any of mine on IP thieves complaining that someone stole their stolen IP from them.

pton_xd

> dismissing the extremely concrete case of outright theft seems pretty indefensible to me.

Outright theft is a meaningless term here. The new rules are different.

The AI space is built on "traditionally" bad faith actions. Misappropriation of IP by using pirated content and ignoring source code licenses. Borderline malicious website scraping. Recitation of data without attribution. Copying model code / artifacts / weights is just the next most convenient course of action. And really, who cares? The ethical operating standards of the industry have been established.

perching_aix

Par for the course for emotional thinking, I'm not even surprised anymore.


esskay

It is very hard to have any sympathy: they stole stolen material from people known not to care that they were stealing it in the first place.


mathverse

[flagged]

oblio

Didn't know Sam Altman was Chinese :-)

typon

LLMs are all built on stolen data. There is no such thing as intellectual property in LLMs.

mattnewton

That’s not the point IMO; the point was this was being used to display capabilities to train models with Huawei software and hardware.

hereme888

[flagged]

jambutters

I don't think anyone cares about that. OpenAI ripped off the internet and books; DeepSeek distilled some of OpenAI's output and pushed the field forward.