OpenEuroLLM
170 comments
· February 20, 2025 · simplecto
dang
Thanks! Macroexpanded:
Open Euro LLM: Open LLMs for Transparent AI in Europe - https://news.ycombinator.com/item?id=42922989 - Feb 2025 (279 comments)
seydor
> OpenEuroLLM has a total budget of €37.4 million of which €20.6 million comes from the Digital Europe Programme.
So on average 1.87 million per participating institution which might amount to funding ~5 PhD students per institution. Not bad for a training program.
The project has been awarded the Sovereignty Seal, an EU mark of Excellence, before it even started. This is truly in accordance with European values, where we reward participation and proclamation. I don't think we will ever hear again from this project.
Congratulations to the participants of the consortium for receiving this large EU grant. Thoughts and prayers to the students who will be writing the deliverable progress reports.
huijzer
> This is truly in accordance with European values, where we reward participation and proclamation.
Yes. In my experience the government is happy with "looks good doesn't work" as long as it truly looks good.
bee_rider
Thank goodness we have startup culture in the US, instead we can have “looks bad, doesn’t work, but Microsoft bought it so…”
HeatrayEnjoyer
and "actively erodes human community and democratic society"
aubanel
Microsoft is certainly much better at judging the value of software than any European administration.
FranzFerdiNaN
Lol yeah unlike corporations who are happy with “makes life intentionally worse but brings in money”.
Jesus the ideology in this place runs so thick with some people.
solid_fuel
It’s truly absurd the ways some people here twist logic. “Government did it” means it MUST be bad, of course. It takes a willful ignorance, pretending that all government efforts are bad and all corporate efforts are therefore good.
As if ARPAnet just sprang from a group of MBAs sitting around unemployed.
When I read 1984 in high school, I didn’t really get the scariest bit: a lot of people are PROUD to shout that 2 + 2 = 5 as long as it makes a poor person somewhere else sad.
gman83
Sovereignty Seal -- https://strategic-technologies.europa.eu/investors_en is just a fund for investing in strategic technologies, it's meant for projects that are getting started.
ta12653421
For 1.87M per project, in the EU you'd get more like 15-20 people :) (salaries are low here)
dagenleg
At least in France, where PhDs last only 3 years, a year of PhD would cost ~45K EUR in gross salary (granted the student gets around half of that after tax), then let's say ~10K travel and consumables costs, then add the inevitable 20% overhead costs and now you're looking at around 200K for the shortest possible frugal 3-year PhD.
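For reference, the arithmetic in the comment above can be sketched as follows. This is only a rough illustration: the salary, travel/consumables, overhead, and duration figures come from the comment, while the function name `phd_cost_eur` and the assumption that overhead is charged on all direct costs are hypothetical choices made for this example.

```python
def phd_cost_eur(salary_per_year, extras_per_year, overhead_percent, years):
    """Rough total cost of one PhD position, in EUR.

    Assumption (not from the thread): overhead is applied as a flat
    percentage on top of all direct costs (salary plus extras).
    Integer arithmetic is used to avoid floating-point rounding.
    """
    direct = (salary_per_year + extras_per_year) * years
    return direct * (100 + overhead_percent) // 100


# Figures quoted in the comment: ~45K gross salary, ~10K travel and
# consumables per year, 20% overhead, 3-year PhD.
print(phd_cost_eur(45_000, 10_000, 20, 3))  # 198000
```

That lands just under the ~200K quoted. Changing `overhead_percent` shows how sensitive the total is to the overhead rate an institution charges.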
rhubarbtree
At least in the UK, overheads are usually over 100%.
j-krieger
I agree, in Germany company PhD funding seems to be between 200K and 300K.
seydor
I assume the largest portion will be consumables, travel, meetings etc.
jpdus
My comment from the original submission [1]:
---
As someone who is in general skeptical of programs like this (and a European), there are 2 remarkable / timely things about this:
- This project doesn't just allocate money to universities or one large company, but includes top research institutions as well as startups and GPU time on supercomputing clusters. The participants are very well connected (e.g. also supported by HF, Together and the likes with European roots)
- Deepseek has just shown that you probably can't beat the big labs with these resources, but you can stay sufficiently close to the frontier to make a dent.
Europe needs to try this. Will this close the gap to the US/China? Probably not. But it could be a catalyst for competitive open-source models and partially revitalize AI in Europe. Let's see.
PS: on Twitter there was a screenshot yesterday that in a new EU draft, "accelerate" was used six times. Maybe times are changing a little bit.
Disclaimer: Our company is part of this project, so I might be biased.
---
I hope the next time this is on HN, it's with some cool release and not a PR :).
(@mods please delete if copy-quoting not allowed)
acka
What about the very similar sounding EuroLLM[1] project mentioned elsewhere[0] in the comments? If that is indeed a different project, why not pool resources? EuroLLM has already delivered some models, they are up on Hugging Face[2][3].
[0] https://news.ycombinator.com/item?id=43119913
[1] https://sites.google.com/view/eurollm
mrshu
It is worth noting there is _another_, completely unrelated project (also) called *EuroLLM* that is also EU funded which not only shares many of the same goals, but has already fulfilled many of them:
1. large multilingual dataset
2. open science approach
3. competitive performance
Here is the HF blogpost that introduced it in December 2024 (along with various benchmarks): https://huggingface.co/blog/eurollm-team/eurollm-9b
The project's lead has summarized the situation succinctly in their LinkedIn post [0]
I hope the different communities collaborate openly, share their expertise, and don't decide to reinvent the wheel every time a new project gets funded. Next what? "OpenEuroLLM with real cheese"?
[0] https://www.linkedin.com/posts/andre-martins-31476745_ai-art...
olejorgenb
Homepage: https://sites.google.com/view/eurollm
Deliverables:
- A series of models of different sizes for optimal effectiveness and efficiency (1B, 9B and 22B) trained on 4T tokens
- A multimodal model which can process and understand speech or text input
- Full project codebase available to the public with detailed data and model descriptions
I can't find the codebase yet though
amarcheschi
Results don't seem that bad for 9b https://huggingface.co/blog/eurollm-team/eurollm-9b
KronisLV
I've been running it with Ollama, it's actually pretty good for working with text in Latvian (and other EU languages). I'd be hard pressed to find another model of a similar size that's good at it, for example: https://huggingface.co/spaces/openGPT-X/european-llm-leaderb...
This won't be relevant to most people here, but it's cool to see even the smaller languages getting some love, instead of getting garbage outputs from Qwen (some versions of which are otherwise pretty good for programming) and anything below Llama 70B, or maybe looking at Gemma as a middle ground.
belter
"...EuroLLM-9B was trained on approximately 4 trillion tokens, using 400 Nvidia H100 GPUs on the MareNostrum5 supercomputer..."
GTP
Thanks for the heads up, I missed this project! However, on their page they write "Project Timeline: 1 May 2024 - 30 April 2025". April isn't far away, does anyone know what's supposed to happen afterwards?
egorfine
That timeline is just for the preliminary hearing on potential committee members.
No sarcasm, sorry.
dmacedo
This should probably link to the actual press release, since it's more of an announcement of something forming than a release of any models, code, whitepapers, etc.
picafrost
This is classic EU. An announcement of an effort to collect collaborators to discuss doing something that they might do in the future.
simion314
>This is classic EU. An announcement of an effort to collect collaborators to discuss doing something that they might do in the future.
It should be done in secret? How did they manage to create CERN? maybe there was no reddit like people commenting back then?
ffsm8
No, but collaboration comes with a cost too.
As a European myself, I would prefer them to put less emphasis on collaboration and more on actually doing something with the resources available to them and making that freely available. Collaboration will happen naturally and without having to coordinate.
But as they said, this is less about producing value than it is about signaling.
mmaunder
It’s like telling someone you’re planning on starting a diet and getting congratulated.
picafrost
> It should be done in secret?
No?
> How did they manage to create CERN?
I have no clue. It appears that was 70 years ago.
> maybe there was no reddit like people commenting back then?
Huh?
The EU is often criticized for its lack of competitiveness due to its highly regulated environment, low investment numbers, risk aversion, and slow moving bureaucracy. This announcement hits all of these points. I am European as well, and it just makes me sad? It is more of the same. This doesn't look like a serious effort to propel Europe to the cutting-edge or even the conversation. It's just enough to say we're doing something, without a high risk of calling it a failure if nothing ends up being delivered.
Europe doesn't lack talent or initiative. If you look at the top AI research institutions out there, a great many of them are composed of researchers who originated from Europe. What is the US offering them that Europe is not? That is many things, none of which are actively being addressed in the EU. There's a high likelihood that academic beneficiaries of these funds will end up in the US due to the absurd salaries and cutting-edge positions.
I prefer the regulated EU environment. I value my privacy and think the EU is doing the right, long-term thing. I don't mind the reduced salaries here -- I worked in the US for years but returned back to Europe because I share its values. But there's no point in pretending the EU will be a serious contender in this environment.
01HNNWZ0MV43FF
It probably should not be number 2 on Hacker News, unless Hacker News has a lot of readers who might contribute to this effort
dkyc
But can I run it on Gaia-X?
This really reads like a parody. Press release, “a consortium of 20 research institutions”, “awarded the STEP (Strategic Technologies for Europe Platform) seal”. Lots of grandiose self-congratulations. All with nothing to run, download or try of course.
rafram
> Press release
https://openai.com/news/company-announcements/
> a consortium of 20 research institutions
https://aimagazine.com/machine-learning/google-invests-in-ai...
> awarded the STEP (Strategic Technologies for Europe Platform) seal
https://openai.com/index/strengthening-americas-ai-leadershi...
> Lots of grandiose self-congratulations
https://x.com/sama/status/1891533802779910471
> All with nothing to run, download or try of course.
Legend2440
OpenAI has a real, groundbreaking product.
This has... a statement of intent to try to copy that product. Not remotely the same.
cess11
This is about industrial research, not about some product.
Members of the project have previously produced both niche and general models, but without the arrogance and bluster of usian corporate subcultures.
coalbin
Maybe you can't download their weights, but you can literally try out their products right from their homepage. What's your point?
rafram
OK, if you prefer: https://web.archive.org/web/20151211215507/https://openai.co...
It's normal to announce things long before they're actually available to end users. This is not some unique evil of the EU bureaucracy - if anything, it's very corporate of them.
sunshine-o
What I am gonna say here is not a political point, but I hope someone can point me to the pattern (and something to read about it) that I have observed with, for example, the EU.
Yes, it sounds like a parody or an Onion piece. We know the European search engine, cloud, blockchain never got anywhere. I don't even believe that anybody ever really tried.
Now you have to put yourself in their head for 2 minutes and here is what I noticed by knowing a few of them (the "EU type").
In their perception of reality it seems they really believe that if they declare something, it is real. This is why they get so deranged if you dare point to the facts or just ask questions. It seems they really believe they succeeded in all those projects. If they say it, it exists.
I am not really satisfied by the explanations we usually hear: they are incompetent, it is corruption or even insanity (some sort of mass hysteria that would take root in some institutions).
What I am wondering is: is there a concept in philosophy, or some similar pattern in previous civilisations, that could help us understand what is going on with the EU?
Because Gaia-X or OpenEuroLLM is one thing, but it is worrisome they now believe they can raise an army and declare war on everybody.
hanshansen43
As a European, the sad reality is that I see parallels with the late-stage Soviet Union and its satellite states.
NOT when it comes to the level of violence and repression or quality of living. Those two things are world-class.
But in the sense that there's a more or less unelected political establishment that's
a) Recursive: It does things only to show them off to itself.
b) Not exposed to real-world consequences.
c) Has a non-falsifiable pretense to validate whatever they do and caution against undoing whatever it is. For the soviets, it was anti-capitalism. For the EU it's some notion of safety or sustainability.
d) Inadvertently benefits itself and other elites and harms the people they pretend to protect.
My hope is that as a democratic institution, the EU is capable of reform.
sunshine-o
Yeah you are right there is probably no need to look very far ...
Now what worries me is that, from what I understand of the collapse of the Soviet Union (but I might be very wrong), they kind of let things happen and were less aggressive by the end.
On the contrary, the EC is now consolidating power rapidly and is getting very aggressive.
varjag
As someone who grew up in late-stage Soviet Union nope. Not even close.
Fnoord
There's various EU cloud providers. It seems to me it is difficult to compete with these energy prices.
amarcheschi
It is not different from any corporate speech, except that this time it is for public benefit rather than private, and will proceed much slower. And yes, I don't know why, but apparently consortia are named quite often; I'm in compsci in Italy and in HPC courses they get named a lot.
menaerus
It is. They will do nothing but distribute the EU taxpayers money into their pockets. Unfortunately.
Argonaut998
It’s par for the course for this union. It’s just comical given the very recent political events.
anonymousDan
In what way? What exactly has the US achieved?
jisnsm
I don’t know. Everything?
kandesbunzler
Uhhm.. they lead in pretty much everything especially tech related? You redditors are unbearable.
cyberax
Europe is moving at the speed of bureaucracy. It's slow, but inexorable.
And honestly, people don't _want_ the European bureaucracy to move fast. Case in point: the USA.
baggy_trough
The spending of money is inexorable, but little else is achieved (unless you count blocking productive people).
cyberax
European projects are often long and ponderous, but they do deliver. There's a long history of state-sponsored academic collaborations, like the venerable CERN.
kandesbunzler
> , people don't _want_ the European bureaucracy to move fast.
I'm German and yes, I would absolutely want it to move faster. And I guess you are an American?
cyberax
Ethnically Russian. And the Russian government is (and I'm not joking) quite effective and agile.
You can guess why I prefer a bit more ...gradual... style of governing.
Oras
> A series of foundation models for transparent AI in Europe
Am I the only one who doesn't see any link to any model? Too many words, no actual outcome.
davidcollantes
From https://openeurollm.eu/launch-press-release (3 Feb 2025):
> "The models will be developed within Europe's robust regulatory framework...
huijzer
Not a surprising stance if the project is funded by the people who are responsible for said regulatory framework
eej71
I think the intent of highlighting that sentence ('The models will be developed within Europe's robust regulatory framework') was to draw attention to the fact that the sponsors will not move fast nor achieve anything of note. To put it more sarcastically, with sponsors like that, who needs others throwing down roadblocks!
enbugger
To make the regulation even more effective of course
artninja1988
I think they've yet to train let alone release any models. This is just a press release about the effort
Oras
Starting by misleading? Nice start!
The title should be “effort to train model” or plans, not saying “series”! Series without having even one?
woah
The three goals featured prominently above the fold are:
> truly open > including data, documentation, training and testing code, and evaluation metrics; including community involvement
> compliant > under EU regulations, OpenEuroLLM will provide a series of transparent and performant LLMs
> diverse > for European languages and other socially and economically interesting ones, preserving linguistic and cultural diversity
The first one seems good, but the second two seem to be pretty beside the point of creating models that compete with the cutting edge of China and the USA.
rafram
People on HN complain constantly about "open-source" models not releasing their training data. That's what the second point ("transparent") seems to be alluding to. And that's a bad thing?
Others have responded to your "diversity" point, but making sure to train on adequate amounts of data in all EU languages is valuable, especially because LLMs are so prone to generating convincing BS when working close to the edges of their training set. If this exists, people in Malta are going to want to use it, so better for it to generate good Maltese than gibberish that sort of looks like Maltese, right?
ben_w
Why would diversity, especially linguistic diversity, be beside the point? Europe is a lot more culturally and linguistically diverse than either the USA or China.
Hier spricht man Deutsch.
A 600 km à l'ouest, on parle français.
50 km na wschód, Polska.
360 χλμ βόρεια, Δανέζικα, Σουηδικά; 250 χλμ νότια, Τσεχία; 750 χλμ νοτιοανατολικά, Ουγγρικά; και τα λοιπά.
Europe has a need, that the other models aren't bothered by — they can do it, but more by happenstance than on purpose.
woah
Depends on the goals. If they were fine-tuning leading foundation models, then I could see this being an entirely sensible undertaking. But since their goal seems to be to make foundation models, I don't think that they will end up being the leading models with so many other conflicting requirements.
pastage
Of the four languages I speak the different models do a pretty good job. I am sure there is something extra that can be added, but atm it is good enough for me.
layer8
Compliance and language diversity are important motivations to not just use the existing foreign models.
blackeyeblitzar
That note about EU regulations may also be dangerous. There is an increasing trend of European leaders supporting censorship of speech, on weak justifications like misinformation that are applied very aggressively. There are even videos of police showing up at people’s homes in some countries, over tweets they made. I don’t have faith that these European LLMs will be trustworthy as a result.
yorwba
Laws against defamation and fraud aren't exactly a new trend, nor are they limited to Europe.
I guess some people are surprised police might get involved in a defamation case because in the US it's not a crime but a civil wrong? Which means you can't get help from the police to identify the person who made a defamatory tweet? Or something?
papertokyo
Then you should also question what flavor of censorship and bias US-made LLMs have.
Also, if someone says something that could threaten my safety (either directly or through inciting others) I would very much like them to get a visit from the police. This situation is so easily avoided by not being a dick to people.
logicchains
Most of the people working on building American LLMs also support such censorship, they just don't have the political power now to achieve it in the US, especially given the first amendment.
Fnoord
> There are even videos of police showing up at people’s homes in some countries, over tweets they made.
Yeah, if you are from The Netherlands and want police showing up at your door, mention on Twitter that you want to shoot Mr. Wilders. Threatening to take someone's life has repercussions. How peculiar!
(Please don't do it. Example is just illustrative. Actually, I know a website with a forum where this happened approx 20 years ago. Server got seized. They didn't log. FDE, but obviously got broken at some point.)
Freedom of speech isn't that you can spout whatever you want and not face repercussions.
Besides that, there's Popper.
Furthermore, there's this thing called the chilling effect. You might wanna ask GOP Senators and Congressmen about that.
I have faith in LLMs and AI, as long as it is reproducible and transparent. Right now, when I use Mistral, it refers to sources. A step in the right direction.
cherryteastain
Why bother when there's Mistral which is already open, pretty good and, crucially, exists
MLENG
Somebody needs to train those PhDs that Mistral will eventually hire.
anonymousDan
Exactly. US commentators here are so tedious.
speedgoose
Where is the training data and the recipe?
throw8383848
> under EU regulations, OpenEuroLLM will provide a series of transparent and performant LLMs
What EU regulations? It is a moving target, and nobody knows what exactly applies. It would be nice to provide a list of regulations with references. And some testing suite or checklist, to verify AI use actually fits the regulations.
Right now, if I integrate spellchecker into my app, I have no idea if I am breaking any AI EU regulations!!!
MLENG
Don't worry, I would be immensely impressed if they even finish a pretraining run for a competitive model. Let alone get to the stage of doing any kind of fine tuning for any kind of purpose.
throw8383848
Well, perhaps they could take Deepseek, feed it all the EU directives, ask for each one whether it's related to AI (or whatever) and spit out the results.
It could even be a nice idea for a startup. All the data is publicly available...
cess11
Here's a summary of what one of the members has done before.
https://www.ai.se/en/ai-labs/natural-language-understanding/...
They've cooperated with a research agency known in part for their Prolog implementation, i.e. they've been at it since the last massive "AI" hype cycle.
sameermanek
They also had some search engine announced with similar name like "openeurosearch" or something close to that.
That project too seems dormant lately.
They just announce things and then the train leaves the station.
rafram
That was just an Ecosia rebrand, not an official EU thing: https://betterweb.qwant.com/en/2024/11/08/ecosia-and-qwant-j...
sunshine-o
The official EU "Google alternative" was Quaero (not Qwant which is something else). Announced in 2005 and ended in 2013.
I don't believe anything was ever made public.
harvey9
I wish they'd held the press release until there was something to see. There isn't even a hiring link.
schnable
Rushed it out to respond to the JD Vance speech?
layer8
This was announced over two weeks ago: https://openeurollm.eu/launch-press-release
https://news.ycombinator.com/item?id=42922989