Open washing – why companies pretend to be open source
95 comments
· October 26, 2024 · martin-t
cbsks
I was looking for a list of free AI models and I searched for “open ai models”, which is when I first understood the terrible genius of the “OpenAI” name.
Spivak
I'm not sure why training on stolen data would disqualify them if said data was available or, at minimum, they accurately specified what it was.
youoy
If the (stolen) data is available to download, OK, that would fit an accurate definition of an open AI model. But "accurately specified" does not, because you would need to trust that the person specifying it is doing so honestly. And I think we all know what happens to all that honesty when economic interests are in play.
martin-t
The data is bound by licenses which affect how the resulting model can be used. I release most of my public code under AGPL so that, for most intents and purposes, anybody using it has to also make their code public and benefit society at large.
Now, with LLMs, anybody can launder my code and use it to build proprietary software for his own benefit without giving anything back. That is a violation of the spirit of AGPL and hopefully the law too.
Brian_K_White
Available doesn't excuse anything. I don't know why people say it like it matters.
When CBS lets you watch a show on their web site, even for free and anonymously, they still own the show and did not grant you any right to re-distribute or re-use it.
What AIs do is also not fair use, because that isn't just about the size of a quote but about usage. A discussion is fair use; excerpting simply to pluck a cherry and present it as your own is not.
ErikBjare
Not a lawyer, but my (possibly poor) understanding was that courts were leaning towards it indeed being fair use?
myworkinisgood
Great point!
neilv
Open source was always a corporate-friendly compromise, but it seemed like some of the people involved had a lot of integrity.
What we need is those open source people with integrity to put the smack down on those willfully abusing and destroying the terms.
If you can't do it with trademarks/certifications/licensing/memberships/etc., do it with mainstream journalism. That might be what is happening here, except The Register has long had rare insider knowledge, and is relatively niche. You need to get the message out to everyone who's not already in the know, including lawmakers.
(Incidentally, the FSF also has integrity, but, besides prompting open source by being zero-compromise -- which is fine in their case -- they have an additional challenge of seeming to be clinically incapable of advocacy in situations that are aligned.)
pyeri
That compromise thing was like eons ago, when folks like Bruce Perens and ESR tried to toe that fine line between commercial open source and free/libre paradigms and were successful to a great degree.
But today, such nuance doesn't exist. The commercial ones have gone full commercial and have no qualms about it (thus the title of this post).
If this attitude continues, all commercial interests in FOSS will be seen with high scepticism unless they have a proven track record of being a good actor.
bubblesnort
Open source never had any of the ethics or philosophy that free software has.
Free software > open source.
trehalose
Do you think, if open source never existed, if there were only free software and non-free software, we wouldn't be arguing about whether AI corporations can truly call their free models free?
mrweasel
Companies always seemed much more wary of "free software" as compared to open source, probably because of the ambiguous meaning of "free" in English. Honestly, that is one of the reasons we have open source as a concept.
Companies like the flexibility in "open source". Even companies that release code as GPL rarely talk about "free software"; they are open source companies.
pessimizer
How could we? Free Software makes it clear that when you modify the Free thing and productize it, you have to share the modifications with the public under the same licensing. What's there to argue about? You're either doing that or you're not. If you find a loophole in the text, then the license gets updated, the loophole explicitly closed, and everybody who agrees moves to the new version.
jraph
You are confusing Free Software with copyleft.
Free Software licenses and Open Source licenses are essentially the same (apart from a few odd examples).
The difference between the free software movement and the open source software movement is essentially philosophical.
ensignavenger
Non-copyleft licenses can also qualify as Free Software under the FSF definition.
Ekaros
Free is an ambiguous term. It might be free in code and price. Or it might be free in price, but closed source. It could be free for me as a private person, but not for a business.
Is freeware free software? It is a rather murky term to me.
arccy
Based on the current license choices of projects, it turns out most people don't agree...
mistrial9
in English, the word "free" has not served well.. suggested alternative "libre" ... oh, except LOSS does not sound great! seems challenging right now.. "free" has failed IMHO .. it is literally mocked by finance people no? every adult in the US and elsewhere must pay bills.. "free" is failing as a label
homebrewer
Probably should have called it "freedom software" like "freedom fighter" or "freedom units" (as opposed to metric units).
bubblesnort
It's not too late for that.
Ringz
Don’t forget „Freedom Fries“.
anthk
Fair software.
pessimizer
Free Software has been wildly and unimaginably successful, and undergirds the world economy.
mistrial9
certainly agree (to clarify)
an_d_rew
I have worked at multiple companies that vilified open source anything, while building their entire businesses on Linux, Java, Debian, and thousands of other pieces of "OSI Approved" software.
It's because, in my experience, the majority of businesses want to take but do not want to feel any obligation to give back or support.
Aeolun
Most businesses are started to earn money. Using free stuff while not giving anything away seems perfectly in line with those goals.
LtWorf
> Most businesses are started to earn money.
I thought tech startups were started to con people into thinking they might earn money.
pessimizer
Which was the entire purpose of Open Source, from conception, and the only way it is distinct from other licenses. Open Source is like Free Software, except you can use it without giving anything away.
dragonwriter
> Open Source is like Free Software, except you can use it without giving anything away.
No, Open Source and Free Software are two names for essentially the same thing. The Free Software Foundation has a preference for licenses which go beyond its own Free Software Definition [0] and which are also "Copyleft" [1], but does not define Free Software in a way which requires that it also be Copyleft.
[0] https://www.gnu.org/philosophy/free-sw.en.html [1] https://www.gnu.org/licenses/copyleft.en.html
goku12
To be clear, Open Source and Free Software aren't licenses. They are philosophies. FOSS licenses come in two major varieties: copyleft (like GPL) and permissive (like MIT). It's possible for either type of license to conform to both open source and free software philosophies. In fact, the vast majority of FOSS licenses, both copyleft and permissive, are endorsed by both camps (OSI and FSF). Also, both camps reject licenses for similar reasons, like having proprietary terms (as in the case of BSL).
The property of being able to keep changes to oneself is a property of permissive licenses, not of open source. Open source software under copyleft licenses cannot be modified and distributed while withholding changes. The inverse applies to FS under a permissive license too.
The real difference between free software and open source is in how they treat the software. FS camp considers software as something that should give the users total freedom over the computing devices they own. The software shouldn't constrain or exploit the end user in any manner. This of course needs the source to be open.
OSS camp established open source because they realized the advantages of 'open' source, but didn't like the emphasis on freedom. That's more in line with corporate philosophy - take advantage of unaffiliated talent to increase code volume and quality, without making any commitment to user freedom. This is why many companies completely avoid the term free software. It's also easy to find 'open source' code that's very exploitative towards users, despite being open and using FSF-endorsed licenses.
mirekrusin
True, this needs clarification that currently doesn't exist for large models, where training costs many millions and the binary artifact is both precious and malleable, unlike ordinary compilation.
Regardless of whether Meta chooses the path of adherence once OSI establishes its definition(s), they still deserve a paragraph of praise for what they're doing.
As a side note, OSI should also recognize that, in the era of giant cloud providers, protection from predatory market participants is also a thing and should exist as a clear licensing option. The Mongo, Elastic and Redis drama could be avoided in the future if there were a clear option to protect author-side sustainability without affecting the open source spirit for end users.
P.S. I also believe that "Open <something>" should be a protected phrase, similar to how "Police", "Federal", "Government" or "Organic" are protected so as not to mislead the public, so we don't have things like the "OpenAI" nonsense.
mdaniel
I can more readily(?) accept ones which mis-label their announcements of "Open Source!!1 under My Awesome License 1.0beta" than I can rug-pulls. Look, if you wanna use some rights-harming license and just shit on the term "Open Source," that's bad, but from a certain perspective understandable if the marketing folks don't grok the nuances of Open Source. The world is filled with misguided people, and I can just command-w the window and never use your product.
But if you accept contributions from the community for years, and ingrain your product in hundreds of thousands of workflows around the world, and only then decide "holy shit, salaries cost money, best yank our license", that should be a case of fraud and you should be civilly liable, in my opinion.
teddyh
Companies whose products’ licenses permit rugpulls exist because the company wants to have the option to rugpull. If you don’t want to have the rug pulled on you, don’t use products with such licenses.
kvemkon
Related:
OSI readies controversial open-source AI definition (26.10.2024)
scirob
An egregious example is thirdweb, which technically has the product open sourced, but it is written to not work without an API key and to phone home to their SaaS to check your API call limit.
https://github.com/thirdweb-dev/engine?tab=readme-ov-file https://portal.thirdweb.com/engine/self-host
It makes me sad because I was working on getting a team together to build a real open source and free alternative, but once they found thirdweb they all got discouraged, thinking that no one will understand why our real open product is different.
josephcsible
If it's open source, can't you just fork it and remove that antifeature?
BlueTemplar
Another example why the 4 freedoms aren't good enough any more, and we need the 11 freedoms :
LtWorf
What even is this thing?
Sytten
Direct consequence IMO of our failure to popularize good licenses in another concept like fair source that sits in-between open source and closed source. My small non-saas bootstrap company could not survive if it was OSS, but maybe fair source.
lordofgibbons
> The pair found that while a handful of lesser-known LLMs, such as AllenAI's OLMo and BigScience Workshop + HuggingFace with BloomZ could be considered open, most are not.
It's absolutely wild to think the deranged BigScience RAIL license, under which the Bloom LLM was released, is open in any way, shape, or form. It has more user-harming restrictions than basically any other LLM license out there.
meehai
I think Open Weights is a better name for AI models that don't share the reproducible training scripts and data.
goku12
By that logic, I can call any proprietary program 'open machine code' or 'open assembly'. If the artifact can't be built or modified easily, then it can't be considered open.
ahaucnx
I believe companies, or rather decision makers, are often afraid of going fully open source because they invested a lot of money into the product and fear that some other company will use it, offer it cheaper, and ultimately harm the originator.
So even though they might believe in open source, they put protections in place that ultimately lock it down and thus make it closed source, while trying to keep the impression of being open.
In our journey at AirGradient towards becoming fully open-source hardware (all code and hardware licensed under CC-BY-SA), we had the same concerns but ultimately decided to go all-in and open up everything with an officially approved open-source license.
I believe there are a few important aspects and "protections" that are open-source compatible that help companies protect their investments.
Firstly, requiring attribution is compatible with open source and can help companies get a lot of visibility. Competitors probably don't want to attribute another company and thus are often unlikely to clone.
Secondly, using a share-alike license also makes it unattractive for many other companies to use the code.
Lastly, I believe the code itself is often not the valuable part compared to the brand value, employees, reputation, business model, network and implicit knowledge that a company builds up.
It really worked for us to go that way with a true open-source license and I hope many others will do it too.
There are already some easy to understand licenses like CC in place and I do hope that they also create awareness around "open washing".
The second goal is muddying the waters and making people not care.
Say you're deciding between two programs (or AI models)[0], you prefer an open source one, a colleague prefers one that just pretends to be open. You say your choice is preferable because it's open, he says the same about his choice. Then you say the dreaded "well, actually" and either you sound like a fundamentalist or an asshole.
[0]: None of those are truly open source because they're all trained on stolen data. And see? Now I sound like a fundamentalist.