meltyness
echelon
> This research demo is not open to residents of, or those accessing the demo from, the States of Illinois or Texas.
Not accessible if you're in Illinois or Texas.
They must have anti-AI laws, probably aimed at voice conversion more so than at image segmentation or cartoon animation.
Hopefully the lawmakers see the beneficial use cases and fix their laws to target abuse instead of imposing a blanket, coarse-grained GenAI restriction.
azinman2
Illinois has laws against biometrics, which can be interpreted so broadly that they cover anything that even looks for a face as a binary classifier. The translation demo uses video, intended to be of your face.
Knowing Meta, they save all of it.
bongodongobob
"knowing meta" - as if any company working on AI isn't saving all the training data they can.
blagie
Excluding Texas sounds reasonable in general. I've written license terms that exclude Texas; it's the home of the patent trolls.
TC Heartland v. Kraft Foods is worth a read.
pridkett
Texas has the “Capture or Use of Biometric Identifiers” act. It’s very similar to the Illinois act that requires consent etc. Although it’s been on the books for a long time, Texas AG Paxton really only started enforcing it in 2022, 14 years after the law first appeared. The first target was Meta.
In this case it’s not the patent trolls, but the biometric collection acts shared by Illinois and Texas.
Aside - if you use Clear for airport security in those states, you get an additional consent screen. It seems like about 50% of the time the Clear employee clicks through the consent screen before you can read it. I imagine this does not fulfill the legal requirements when that happens.
JKCalhoun
I'm in Nebraska — but I think, due to my ISP, I appear to be in the Chicago area. Oh well.
hnuser123456
Sounds like your ISP needs to update their IRR and RIR records
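For what it's worth, you can sanity-check where one public geolocation database places your connection. A minimal sketch in Python, assuming you're fine hitting ipinfo.io (just an example service; the demo site may consult a completely different database):

    import json
    import urllib.request

    # Ask a public IP-geolocation service where it thinks this connection
    # originates. This only reflects ipinfo.io's database, not whatever
    # database Meta's geofence actually uses.
    with urllib.request.urlopen("https://ipinfo.io/json", timeout=10) as resp:
        info = json.load(resp)

    print(info.get("ip"), info.get("city"), info.get("region"), info.get("country"))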
kylecazar
Seamless translation is... Pretty incredible.
I speak English and Spanish, so I recorded some English sentences and listened to the Spanish output it generated. It came damn close to my own Spanish (although I have more Castilianisms in mine, which of course I wouldn't expect it to know)
heyjamesknight
A real test here would be to give it to my friend from Mendoza, Argentina.
I'm bilingual and still can't understand him. I'm not even sure half the things he says are actual words.
mattlondon
I tried it and it sounded nothing like me at all, just some random "generic" male voice that translated what I said into German. My wife put it as "that's shit - sounds nothing like you". Nuff said.
0xFEE1DEAD
Same for me.
I also tried speaking German and translating it to English and when I said "Hallo ich wollte das nur mal ausprobieren" (Hello I just wanted to try this out) it translated it to "Hi, how are you? Do you know anyone who quit smoking?".
I feel gaslit.
suddenlybananas
I translated from French to English and vice versa and the voice sounded nothing like me in either case. The English to French translation also made me sound about 90 years old.
ludwik
Same for me. I'm a man with a relatively deep voice. The translation was read out by some generic female AI voice.
gardenhedge
I think you clicked the wrong recording. The generic female AI voice is the translation of what you said.
svilen_dobrev
Which is good. Do you really want a deep fake that no one can distinguish?
foundry27
If that’s how it’s being advertised, and that advertising is the reason people are giving it a shot, then I certainly do! And so, I imagine, did the people who have left feedback so far!
recursive
Being good would be bad, therefore being bad is actually good.
lttlrck
Did it _sound_ like you though? It doesn't sound remotely like me.
kylecazar
It didn't really the first time. I recorded a second one and enunciated really strongly and clearly (and said more) -- that yielded the positive results.
xandrius
Unfortunate that the examples they provide were absolutely terrible and robotic.
It put me off from actually trying it; I might reconsider.
anal_reactor
Whether "we're there yet" on translation technology is still debated, but at some point we'll consider it "good enough" for most practical use cases, truly removing the linguistic barrier. This is actually both terrifying and exciting, because then it'll definitely start influencing spoken language to at least some degree.
suddenlybananas
It depends how much tolerance you have for mistakes. For a waiter or asking directions or things like that, 100% this works great. For a diplomatic discussion where nuance is very important however... It also doesn't work great for translating works of art where the translation itself is open-ended and can be done in a bunch of different ways and requires a lot of editorial/artistic decisions from the translator.
rob-olmos
Is this purposely spelled "Aidemos" somewhere, as the HN title says, instead of "AI Demos"?
sophiebits
HN automatically recapitalizes words in submission titles so I think it’s possible this could have been submitted as “AIDemos by Meta”.
rob-olmos
Ahh I see. Thanks for the info!
o-o-
Aidemos... the Greek god... of intelligence...?
riffraff
At least it's not AI Demons
saikatsg
Fixed.
ghxst
I'm pretty impressed with the Segment Anything[0] demo. Is this integrated into an actual product anywhere? I do some simple video editing for friends as a hobby and can see some of this being pretty useful.
barrenko
It probably is, but you won't hear it advertised as such.
brap
What is Meta’s angle with AI? They seem to be doing a lot of research but what is the end goal? Google and MSFT I understand, Meta not so much.
lanthissa
Meta believes the dollars at the end of the AI race will be in walled gardens and proprietary data, not data centers and models.
They are going to do everything they can to make sure no one uses the window when models and data centers are the limiting factors to disrupt them, in the same way Google demonetized the application layer of the web to prevent walled gardens from blocking search.
If models and hardware become commoditized at the end of the race, Meta will have a complete psychographic profile of people at the individual and group level, to study and to serve incredibly targeted content to.
Their only real competition in that would be someone developing a 'her'-like app that takes people out of social media and into their own individual siloed worlds. In a lot of ways Discord is the alternative world to Meta's ecosystem: hyper-focused, invite-only small communities.
mattlondon
> Their only real competition in that would be someone developing a 'her' like app that takes people out of social media and into their own individual silo'ed worlds
I take it you have not tried the new Gemini models in AI Studio? They do real-time streaming video input and conversation: you can genuinely ask questions about what you are looking at, in a conversational audio-in/audio-out way. This is basically "her"-level technology in an unpolished form, right here today.
azinman2
Her is about a lot more than just asking questions in pure audio. ChatGPT has also had this for a little while now.
theshackleford
ChatGPT has been doing this for ages. Is the Gemini version drastically different or something?
flir
> Meta believes the dollars at the end of the AI race will be in walled gardens
Will those walls keep AI-generated content out, or will they keep the people outside from accessing the AI-generated content in the garden?
If it's the first, somebody should tell them the slop's already up to their navels and they probably shouldn't be helping people generate more of it.
If it's the second, then the models that supply the content to the garden must have some kind of uniqueness/value, because otherwise you could get identical content from anywhere.
This is a genuine question, because I don't understand the logic here.
(I had assumed it was more like hardware companies funding open source way back when - Commoditize Your Complement).
sangnoir
> If it's the first, somebody should tell them the slop's already up to their navels and they probably shouldn't be helping people generate more of it.
One would imagine Meta can readily quantify how much AI-generated content is consumed across its properties.
Meta's play is simple: more engagement means more money for Meta, and that can come either from "slop", as you called it, or from expanding the audience of high-quality human-generated content, say via translation. A funny video in Albanian is probably still very funny after being translated to English.
twelve40
so in other words, "better targeting"? that's it?
HarHarVeryFunny
Better targeting
Better moderation (to the extent they still care)
Generation of AI slop for the sheep to feed on
Use of AI is really core to their business, so it's understandable that they want to build it themselves, but it's not so clear why they want to "open source" it (weights only) other than to harm companies like OpenAI.
pfisherman
Is something like automated personalized content creation (for ads) better targeting? Or is it qualitatively different?
I personally think that the population scale surveillance and behavioral manipulation infrastructure built by meta is unethical and incredibly dangerous.
jiggawatts
In the same way that an atomic bomb is “just” a better bomb.
I keep telling parents that Meta et al are spending the inflation-adjusted equivalent of the Manhattan project — not to defeat Japan — but to addict their child.
SV_BubbleTime
Has Meta done anything else?
alexashka
No.
Meta is a spy-on-all-the-world's-citizens arm of the US government, pretending to be a company.
They are not investing in AI to bring profit. They are investing in AI to improve their intelligence gathering capabilities.
Or did you think cheap on-the-fly translation is funded because the overlords just love languages so much? Sweet summer child...
They are open sourcing because everyone in tech knows Zuck is a sociopath. Naive tech nerds are willing to work for a sociopath if it's open source. They had their brains dipped in tech-utopia. They are bringing about worldwide 1984.
xyst
> walled gardens
Apple tried that and it’s crumbling. Meta/Zuckerfuck is always behind the curve.
- AR (failed)
- “metaverse” (failed)
The only thing that has kept them above water is social media and selling off user data, and that’s crumbling as well. Smaller players have been eating their lunch and the user base is aging out.
NBJack
Yeah, their stock is WAY overinflated. I know their data wells are drying up fast. The long bets aren't working out. The AI stuff is neat, and certainly disruptive, but it isn't a paying bet.
The writing is on the wall, and his "falling in line" with the political climate speaks volumes about his effort to keep Meta afloat.
postexitus
Paraphrasing from someone who is involved in this: their angle in AI is better targeting of ads. Better classification, clustering, better "recommendations" for the advertiser, including visuals, wording, video, etc.
These and others are just side benefits or some form of "greenwashing". Meta's main (and only) business is advertising. They failed to capitalize on everything else.
rsynnott
After the 'metaverse' stuff flopped, desperate to spend their money on some other thing that might be The Future(TM)?
Arguably this would be kind of rational behaviour for them even if they thought that LLM stuff had a low chance of being the next thing; they have lots and lots of money, and lots of revenue, so one strategy would be just to latch on to every new fad, and then if one is a real thing they don't get left behind (and if it's not, well, they can afford it).
My suspicion is that this is where most Big Tech interest in LLMs comes from; it's essentially risk management.
twelve40
Great question, I was wondering about that. I think it's mostly in a discovery phase right now, similar to how they dabbled in crypto before, and to the by-now largely finished "metaverse" experiment (yes, this dabbling sometimes involves a ton of money). These demos actually show what they might end up using AI for. But whether it's truly game-changing for their business, and whether it will be good for regular users, is still an open question, considering that their UIs in both FB and even Instagram are by now grossly obsolete, haven't changed in over a decade despite 70,000 people working there, and are nowadays mostly focused on violently shoving more ads over actual usefulness.
If their business remains a shitty, declining, buggy, 20-year-old Facebook and a 10+-year-old Instagram app, but they contribute to advancing open-source models similar to how they did with React, I'll consider that a net win though.
JTyQZSnP3cQGa8B
Money and manipulation? Was that a real question?
twelve40
Yes, that's a real question. Even for the money-and-manipulation use case, how does this help, especially the money part?
mistrial9
all math leads to cryptography; all media leads to ads (?)
isoprophlex
You forgot "fucking over the competition".
Not that I'm complaining about their open-weights model releases destroying OpenAI's moat... but still.
CPLX
Joel Spolsky in 2002 identified a major pattern in technology business & economics: the pattern of “commoditizing your complement”, an alternative to vertical integration, where companies seek to secure a chokepoint or quasi-monopoly in products composed of many necessary & sufficient layers by dominating one layer while fostering so much competition in another layer above or below its layer that no competing monopolist can emerge, prices are driven down to marginal costs elsewhere in the stack, total price drops & increases demand, and the majority of the consumer surplus of the final product can be diverted to the quasi-monopolist. No matter how valuable the original may be and how much one could charge for it, it can be more valuable to make it free if it increases profits elsewhere. A classic example is the commodification of PC hardware by the Microsoft OS monopoly, to the detriment of IBM & benefit of MS.
This pattern explains many otherwise odd or apparently self-sabotaging ventures by large tech companies into apparently irrelevant fields, such as the high rate of releasing open-source contributions by many Internet companies or the intrusion of advertising companies into smartphone manufacturing & web browser development & statistical software & fiber-optic networks & municipal WiFi & radio spectrum auctions & DNS (Google): they are pre-emptive attempts to commodify another company elsewhere in the stack, or defenses against it being done to them.
999900000999
AI make stock go up.
I think this is it. I'm kicking myself for not going harder, but I was very much into LLMs/ML back in 2019, had I not given up I might have a startup right now.
I'd need like 70k and a minimum of 6 months, but I still have a few ideas for AI driven startups.
barbazoo
Generated content is my assumption. Both by users and fully automated.
brap
I don’t think anyone wants generated content in their IG/FB feed, so not sure how this will play out in the long run
ketzo
Correction: Nobody wants content that they can tell is AI generated.
int_19h
People say that, yet how many likes and reshares does said generated content get?
twelve40
Sadly, I don't think they care much about what "everyone wants" because with a userbase this size they will figure out a way to forcefully shove whatever they come up with into people's faces.
cebert
The Seamless Translation demo is fantastic. The translated voice is passable for my own native voice. It will be incredible when we can achieve this in real time.
exgrv
We can! At Kyutai, we released a real-time, on-device speech translation demo last week. For now, it only works for French-to-English translation, on an iPhone 16 Pro: https://x.com/neilzegh/status/1887498102455869775
We released inference code and weights; you can check out our GitHub here: https://github.com/kyutai-labs/hibiki
mastermedo
Good work. The delay seems to be around 5 seconds. This is a step in the right direction. I'm wondering how much closer to real time we can push it.
ketzo
Damn, this is pretty amazing. Feels like we’re not far off from the babel fish.
lelag
It's not exhaustive. For example, it's missing the Meta Motivo demo at https://metamotivo.metademolab.com/ (humanoid control model).
rocauc
Meta deeply understands the impact of GPT-3 vs. ChatGPT. The model is a starting point, and the UX of what you do with the model is what showcases the intelligence. This is especially pronounced in visual models. Telling me SAM 2 can "segment anything" is neat. Clicking the soccer ball and watching the model track it seamlessly across the video, even when occluded, is incredible.
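For anyone who wants that click-to-track workflow outside the demo, here's a minimal sketch against the open-source sam2 package (the config/checkpoint paths, frame directory, and pixel coordinates below are placeholders, not values from the demo; see the facebookresearch/sam2 README for the exact files):

    import numpy as np
    import torch
    from sam2.build_sam import build_sam2_video_predictor

    # Placeholder config/checkpoint paths; substitute whichever SAM 2 variant you downloaded.
    predictor = build_sam2_video_predictor(
        "configs/sam2.1/sam2.1_hiera_l.yaml", "checkpoints/sam2.1_hiera_large.pt"
    )

    with torch.inference_mode():
        # Directory of JPEG frames extracted from the clip.
        state = predictor.init_state(video_path="soccer_clip_frames/")

        # One positive click on the ball in the first frame (x, y in pixels).
        predictor.add_new_points_or_box(
            inference_state=state,
            frame_idx=0,
            obj_id=1,
            points=np.array([[420, 310]], dtype=np.float32),
            labels=np.array([1], dtype=np.int32),
        )

        # Propagate the mask through the rest of the video, occlusions included.
        for frame_idx, obj_ids, mask_logits in predictor.propagate_in_video(state):
            masks = (mask_logits > 0.0).cpu().numpy()  # one binary mask per tracked object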
lm28469
We can add these to the pile of completely useless AI shit the world has built in the last two years. Are people under some kind of spell that forces them to be in awe? Looking at a lawnmower magazine is more interesting than these in terms of utility and interesting tech.
npalli
“ Our site is not available in your region at this time.”
Aurornis
Companies have to be very careful with AI products in international markets, and even in some US states, because there are a number of different pieces of AI legislation that need to be checked.
This is why cutting-edge models are delayed in certain regions.
The work to verify and document all of the compliance isn't worth it for various small demos, so they probably marked them as only allowed in the US and certain regions.
xnx
Getting this from the US
1832
I get
"Allow the use of cookies from Meta on this browser? We use cookies and similar technologies to help provide and improve content on . We also use them to provide a safer experience by using information we receive from cookies on and off Meta Quest, and to provide and improve Meta Products for people who have an account.
•
Essential cookies: These cookies are required to use Meta Products and are necessary for our sites to work as intended.
•
Cookies from other companies: We use these cookies to show you ads off of Meta Products and to provide features like maps and videos on Meta Products. These cookies are optional.
You have control over the optional cookies we use. Learn more about cookies and how we use them, and review or change your choices at any time in our
.
"should I click on accept?
techscruggs
Same. Texas.
chairmanwow1
I was getting this from inside the US, however setting my VPN to LA worked to get around it. I assume this is because that's where the Meta engineers are ¯\_(ツ)_/¯
EDIT: Once accessed there is this note:
> This research demo is not open to residents of, or those accessing the demo from, the States of Illinois or Texas.
and I'm in TX
malshe
Oh wow, thanks for finding this. I am also in TX. I was going crazy thinking it might be my iCloud Private Relay
meltyness
I think Texas has some recent law that could be interpreted as being against twinning tech / deep fakes like the voice cloning. ¯\_(ツ)_/¯ Seems like a good time to "ask the lawyers" and not make a political statement.
Even at a passing glance it would be immediately clear that it's not a real risk of any sort.
tsumnia
Neat, but I wish Meta would just say what this really is: "please give us some in-the-wild data to further train our models on".
I did the same technique years ago for estimating ages. A person uploads an image, helps align 10% of our facial landmark points, and we run the estimator. If we were wrong, we ask for a correction and refine.
It's still cool and all, but meh based on my prior experience.
It's a toolbox of demos with the following:
Segment Anything 2: Create video cutouts and other fun visual effects with a few clicks.
Seamless Translation: Hear what you sound like in another language.
Animated Drawings: Bring hand-drawn sketches to life with animations.
Audiobox: Create an audio story with AI-generated voices and sounds.