Mlx-community/OLMo-2-0325-32B-Instruct-4bit

maxloh

You should link to the original model too: https://huggingface.co/allenai/OLMo-2-0325-32B-Instruct

Kudos to Allen AI for their great work on a fully-open LLM!

teruakohatu

OLMo uses open datasets, such as CommonCrawl and StackOverflow, for training, about 5TB worth of text. I wonder how well it would perform if it were also trained on Anna's Archive/LibGen (>600TB).

rajman187

Not a lawyer, but I would assume downloading material from LibGen is, in the vast majority of cases, illegal because it's a breach of copyright or similar. That’s landed Meta in quite a spectacle of late [1]

[1] https://www.loeb.com/en/insights/publications/2023/12/richar...

maxloh

CommonCrawl is composed of copyrighted content too. You gain copyright on your work automatically the moment you create it, including this very comment.

fulafel

In many jurisdictions it's just sharing that is illegal, not obtaining.

anon373839

I think it’s a big deal to see a fully open LLM now achieving this level of quality. While the partially open releases we’ve seen from the big labs are quite valuable, models like OLMo-2 are the only way that researchers can truly study this technology to answer questions about how the models’ capabilities are shaped by their training data and training process.

The closed and partly-closed models rely on a lot of secret sauce, so it’s also just really impressive to see their results being replicated in the open.

teruakohatu

I have struggled with SVG generation with just about all models; the SVG demo for this model is more or less what I get from much larger models.

Am I doing something wrong? Everyone seems to say how well models work at producing SVGs, but I get shapes in all sorts of wrong places. SVG documents are quite low level (versus editing them in Inkscape or Illustrator), so they're tricky to modify beyond very simple shapes.

simonw

The models are mostly terrible at SVG output, at least if you ask for something that's hard (or impossible?) to draw like a pelican riding a bicycle. That's why I use it as a benchmark, I think it's amusing: https://simonwillison.net/tags/pelican-riding-a-bicycle/

Some of them can do good SVGs for things that make sense, like simple diagrams.

showmexyz

These work well for SVGs that are simple and already in the training data, but not for harder SVGs, or even simple ones if they are outside the training distribution.

In Simon's example the whole purpose is to make it draw something that it has not seen before but can easily infer from geometry and spatial arrangement. I think it makes a fun problem.

pylotlight

"refreshingly abstract" is just another term for wrong.. not particularly helpful..

simonw

My joke there makes more sense in the context of the series: https://simonwillison.net/tags/pelican-riding-a-bicycle/
