
Sharing new research, models, and datasets from Meta FAIR

cube2222

There’s honestly so much interesting stuff here, esp. the llm-related things - large concept models (operating on and predicting concepts, not tokens), dynamic byte latent transformers (byte-level alternative to standard tokenization), sparse memory layers (successfully scaling key-value memory layers without an increase in computational requirements).

Here they are presented as separate things, each of which apparently improves quality / efficiency. I wonder what the quality / efficiency increase is of all those methods put together? Maybe that’s what Llama 4 will be?

This looks like a lot of innovation is happening at Meta in those areas, really cool!

ms8

I hope that Llama 4 or 5 will have a different architecture. All released Llamas are more or less the same at inference, just with a better training pipeline. The downside is that llama.cpp will probably not be able to run new models, and it may be too big a rewrite, so we will need new C, C++, Go, and Rust programs.

janeway

Side track, but does anyone have suggestions on how to better present content like this? I am struggling with similar docs/demos.

As a documentation page, each section is laid out uniformly with section heading, content, link to code and link to paper.

However the page itself is a blog post which will be difficult to find again next year.

Are there other examples of companies with well-presented technical summaries that remain findable from the home page?

airstrike

I'd put a table of contents-like page up front with some exciting short description of each section and use hyperlinks, allowing the user to navigate to the section and back

rmbyrro

it's a bit ironic that Meta ended up becoming the largest "open ai" org.

all right, yeah, it's not "open source", but hey, it is open to use and they're publishing their research openly as well.

airstrike

This is so cool! Playing around with the first demo is a lot of fun. First one to get the model to moonwalk wins. My best attempt was probably something like `(body_speed_forward < -0.3) * (head_height > 1.0) * (stay_still > 0.2) * (body_speed_vertical < 0.1) * (stay_upright > 0.9)`
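A rough Python rendering of that reward expression (the feature names and thresholds are just what the demo appears to expose; this is my guess at the semantics, not the demo's actual API):

```python
# Hypothetical sketch: each comparison is a boolean gate, and
# multiplying booleans in Python yields 0 or 1, so every condition
# must hold simultaneously for the reward to be nonzero.
def moonwalk_reward(body_speed_forward, head_height, stay_still,
                    body_speed_vertical, stay_upright):
    return (
        (body_speed_forward < -0.3)    # moving backwards
        * (head_height > 1.0)          # standing tall
        * (stay_still > 0.2)           # mostly in place
        * (body_speed_vertical < 0.1)  # no jumping
        * (stay_upright > 0.9)         # upright posture
    )

# A backwards, upright gait satisfies all gates:
print(moonwalk_reward(-0.5, 1.2, 0.3, 0.0, 0.95))  # → 1
```

The multiplicative form means a single violated condition zeroes out the whole reward, which is presumably why getting a convincing moonwalk takes some trial and error.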

https://i.imgur.com/O5hGMo5.gif

Then the "Meta Explore Theory of Mind" is even more interesting. There was a thread about a month ago in which some of us were discussing some of the concepts here like "beliefs" and updating a model of the world accordingly. https://news.ycombinator.com/item?id=42035985

modeless

I really hope Dynamic Byte Latent Transformers work out. Death to tokenizers!

Interesting that it's a hierarchical structure but only two levels of hierarchy. Stacking more levels seems like an obvious direction for further research.

entilzha

Author here :), I do think it’s a good direction to look into! That said, aside from it being a bit too much to do at once, you’d also have to be careful about how you distributed your FLOP budget across the hierarchy. With two levels, you can make one level (bytes/local encoder) FLOP efficient and the other (patches/global encoder) FLOP intensive. You’d also need to find a way to group patches into larger units. But ya, there are many directions to go from here!
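To make the grouping idea concrete, here is a toy illustration (my own sketch, not the paper's entropy-based patcher) of splitting a byte stream into patches at chosen boundary bytes; a third hierarchy level would need some analogous rule for grouping these patches again:

```python
# Toy sketch: split a byte stream into patches at "boundary" bytes.
# A stand-in for the learned, entropy-based patching in the BLT work.
def patch_bytes(data: bytes, boundary: set[int]) -> list[bytes]:
    patches, current = [], bytearray()
    for b in data:
        current.append(b)
        if b in boundary:           # close the current patch
            patches.append(bytes(current))
            current = bytearray()
    if current:                     # flush any trailing partial patch
        patches.append(bytes(current))
    return patches

print(patch_bytes(b"hello world, bye", boundary={ord(" ")}))
# → [b'hello ', b'world, ', b'bye']
```

The FLOP-budget point then maps onto the levels: a cheap model runs per byte, an expensive one per patch, and any third level would run even less often on groups of patches.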

Permik

In a way I'm kinda sad that tokenizers may go the way of the dinosaurs, as asking someone to give me a Unicode character from the private use area was one of the last ways you could actually distinguish a co-operative human from an LLM online. They simply don't have those characters tokenized, so they can't output them. (But this is technically moot if the LLM has a Python interpreter handy.)
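For reference, the Basic Multilingual Plane's Private Use Area spans U+E000..U+F8FF, and Python can generate and classify those code points directly:

```python
import unicodedata

ch = chr(0xE000)                 # first private-use code point
print(hex(ord(ch)))              # 0xe000
print(unicodedata.category(ch))  # 'Co' = "other, private use"
```

So the test is easy to pose programmatically, even if a tokenizer has no way to emit the character.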

djhn

How do you ask someone to give you a Unicode character from the private use area?

ks2048

When I wonder about the business behind Meta doing this, I see they have $70B in cash, so giving a bunch of AI experts hundreds of millions is pocket change.

wrsh07

Imagine that something fundamental shifts in the world of AI research. It could be anything: AI suddenly makes programmers much more productive, AI becomes very good at identifying vulnerabilities, AI chat becomes a new major source of entertainment, AI images become an item popularly shared on Instagram (etc)

Suppose any one of these things happened and suddenly Facebook wished that it had access to state of the art models so that it could customize them for its uses (internal developers or tools, embedding in their app).

Imagine how they would feel if the only way they could access these models were by signing 7-9 figure deals with a model dealer like OpenAI. Even worse, imagine if one of their main competitors in advertising started providing robust AI tools to help advertisers adapt their creatives to various form factors. Facebook is now way behind and possibly has to shell out millions to a company like OpenAI all while also losing ad market share worth billions per quarter (ads on Google start performing much better, so Google gets more ad spend)

If this worst case scenario came to pass, Facebook would look foolish. If even one of these things were likely, their investments make sense. The rest (open source, making Meta a cool place to work) are a strategy credit.

aoanevdus

“Commoditize your complement” may be a good way of framing it. Consider that if OpenAI succeeds dramatically and is the only game in town, they could extract huge rents from anyone using their service. So it’s in other companies’ interests (or anyone who wants to use AI) that the AI ecosystem have lots of competition to keep prices low.

cma

You can't have enough top researchers without letting them publish.

sangnoir

Those AI experts played a critical role in Meta getting that $70B in the first place.

almostgotcaught

everyone who has responded so far has it wrong (naively so).

FB sells ad space on several apps. those apps need people on them in order for the ad space to be worth anything. people, in turn, need content to attract them to the apps. so it's simple: enable people/companies/whomever to generate tons of content for cheap and consequently share it on the apps. that's it.

SideQuark

Except giving out the tools makes it easier for competitors like TikTok to do the same, drawing revenue away from Meta.

So that’s not it. Naively so.

tzs

Couldn't the same argument be made for all kinds of things companies have made open? Some examples:

• Tesla gave away its EV patents.

• Pixar and DreamWorks have both open-sourced some of their tools, including tools used to make some of their best works. For example DreamWorks' MoonRay renderer has been used on everything they have done since "How to Train Your Dragon: The Hidden World", including "Puss in Boots: The Last Wish" and "The Wild Robot", and will be used on their upcoming films.

• Facebook open-sourced React.

• Google open-sourced Chromium.

almostgotcaught

this is like saying that AMD making chips that intel/nvidia employees can buy and use to do their jobs is a bad strategy for AMD. lol. ok not every single strategic choice needs to both grow the top line and be anti-competitive. some can just grow the top line.

mttddd

the tools but not necessarily the data, presumably they have internally trained versions


mttddd

content, but also better ad targeting by better understanding all of the content that users post

mtkd

I was fortunate to get to a talk by Ross Taylor ex-Meta recently at the AI Engineer London meetup

He's recorded the full talk here now: https://www.youtube.com/watch?v=S5l5OvJ01ws

I had missed how much Meta have been doing on reasoning, ToM etc.

sharih

This is a great video - it places o1 in context. With OpenAI, Google, and Meta releases going at this pace, Anthropic is next up.

intalentive

Every time I have to clean text I wonder why I haven’t just trained a byte level denoising autoencoder to handle it for me.
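A minimal sketch of the training-pair side of that idea (the corruption process here is invented for illustration, and the model itself is elided):

```python
import random

# Build (noisy, clean) pairs for a byte-level denoising autoencoder:
# corrupt clean text with random drops and substitutions, then train
# a model to invert the corruption.
def corrupt(text: bytes, p: float = 0.1, seed: int = 0) -> bytes:
    rng = random.Random(seed)   # seeded for reproducible pairs
    noisy = bytearray()
    for b in text:
        r = rng.random()
        if r < p / 2:
            continue                          # drop this byte
        elif r < p:
            noisy.append(rng.randrange(256))  # substitute a random byte
        else:
            noisy.append(b)                   # keep it
    return bytes(noisy)

clean = b"The quick brown fox."
pair = (corrupt(clean), clean)  # (model input, reconstruction target)
```

The appeal of working at the byte level is that the same model handles mojibake, stray control characters, and encoding damage without any tokenizer-specific handling.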

anon373839

That’s a fun idea. I’ve always wondered about experimenting with u-nets and hourglass nets for text data since they’re so efficient at capturing global and local context (in vision, anyway). But I’ve never tried it.

puttycat

Can someone explain how watermarking AI videos voluntarily helps make AI safer?

benatkin

It lets those providing AI video generation services watermark all of their videos. So it isn't intended to be voluntary. You would be left with those services that don't comply with whatever the current Big Tech rules are, like people who used Grok/X.ai to generate images in support of Trump despite Grok/X.ai being inferior. https://arstechnica.com/information-technology/2024/08/musks...

refulgentis

I think this is the wrong / older article - when I click the link, this is Twitter's hosted Flux model making pictures of Kamala and Trump flying into the World Trade Center and Trump on a surfboard with busty cat girls. The X.ai one launched this week.

sangnoir

X hosted a white-label Flux model for a while, and freely admitted it.


bee_rider

How much does it take to train a model at this point? I’d expect it to be in range of any major state or most oligarchs in the next couple of years (if it isn’t already). So it’s probably best if everybody understands the watermarking to be voluntary. Images and videos aren’t worth the bits they are printed in at this point, as evidence of anything in particular.

bbor

Crazy stuff. Everyone’s covering how exciting all these are (especially LCM and the non-tokenizing-tokenizer), but I have to ask in case anyone’s been paying attention: why are they using the term “advanced machine intelligence”?

My initial thought is that they want to please/distract the doomers, but I’m prolly just self-centered!

rajman187

It originates in Yann LeCun’s paper from 2022 [1], the term AMI being distinct from AGI. However, the A has changed over the past few years from autonomous to advanced and even augmented, depending on context.

[1] https://openreview.net/pdf?id=BZ5a1r-kVsf

esafak

I think LeCun doesn't like the term AGI.

stevenhuang

I'm waiting for when they're called Minds :)

devmor

I would guess it’s in response to the recent market studies showing that the general public views anything labeled “AI” as a likely scam and untrustworthy.

pkkkzip

Meta has certainly redeemed itself by helping AI become moat-free.

echelon

Even though Meta doesn't sell I/PaaS, Meta's fitness goes up when AI is in the hands of more players than just Google and OpenAI. Commoditize AI and you create a diverse set of businesses that will reach customers through Meta's platforms.

ponector

They still ruin society with Facebook, no matter how much good they do with LLMs.

bubaumba

Like it or not Meta is a major player in AI world with its free models and tools.

As for social impact of the rest it's debatable. I personally don't have active social accounts, and not sure this is good.

mupuff1234

Like it or not, the social impact isn't really debatable; there's a decent amount of evidence, enough for the Surgeon General to issue a warning:

https://www.hhs.gov/about/news/2023/05/23/surgeon-general-is...

dailykoder

They are not free

croes

Free by accident.

mupuff1234

It's not redeeming if you still continue with the original sin.

SpaceManNabs

This is like learning 10 different new architectures lol

Flomolok

It's not hype when it delivers, and I'm also not seeing a ceiling yet.

Yet again interesting progress.

Also, I like the idea of using the pose model to generate not an NPC but an avatar living in my phone, or in a glass cube as a hologram. That would be quite sci-fi futuristic.