
OmniHuman-1: Human Animation Models

19 comments

·February 4, 2025

ggerules

This is a very good attempt at people playing musical instruments.

But there are some subtle timing tells that this is AI generated. Take a look at the singer playing the piano. The timing of the hands with the singer is slightly off. The same goes for the singer and the guitar. I'm not a guitar player or piano player, but I do play a lot of different musical instruments at a high level, and the timing looks off, slightly ahead of or behind the actual audio of the piece of music.

mkagenius

> Timing of the hands with the singer is slightly off.

Sure, the only way is up though. I haven't seen this level of realism in Sora or the Google one. Plus, it's synced with audio.

vessenes

These look.. great, by and large. Hands are super natural, coherency is really high. Showing off piano chord blocking is a huge flex.

I’d like to play with this! No code, but ByteDance often releases models, so I’m hopeful. It’s significantly better than VASA, and looks likely to be an iteration of that architecture.

liuliu

ByteDance didn't release their text-to-video model, which is the base of this work, so I would think a release is unlikely.

smusamashah

What are the tells in most of these videos? I can't point to any in many of them. Hands, teeth, lip sync, body and shoulder movement all look correct. Especially the TED-talk-like presentation examples near the bottom.

iandanforth

Many of these have tells, but this one fully crossed the uncanny valley for me. https://www.youtube.com/watch?v=1NU8NzvAxEg&t=16s

Good to know that I need to now assume performances are AI generated even if it's not obvious that they are!

lm28469

With the waxy hair and pulsating microphone?

aylmao

To be fair, the hair looks quite similar to the original: https://www.youtube.com/watch?v=39_OmBO9jVg

marci

On a phone, just scrolling?

smusamashah

This looks better than EMO (also closed source, by Alibaba Group: https://humanaigc.github.io/emote-portrait-alive/). See the rap example on their page. They apparently have EMO2 now, which doesn't look as believable to me.

EMO covers head + shoulders while OmniHuman-1 covers the full body, and it's looking even better. I would have easily mistaken these for real (especially while doom scrolling) if I was not looking for AI glitches.

UPDATE: Googling animate bytedance site:github.io returns many in the same domain (all proprietary). Found a few good ones.

- https://byteaigc.github.io/X-Portrait2/ Very expressive lifelike portrait animations

- https://byteaigc.github.io/x-portrait/ (previous version of the same, has source https://github.com/bytedance/X-Portrait)

- https://loopyavatar.github.io/ (portrait animations, looks good)

- https://cyberhost.github.io/

- https://grisoon.github.io/INFP/

- https://grisoon.github.io/PersonaTalk/

- https://headgap.github.io/

- https://kebii.github.io/MikuDance/ anime animations

kiwiguy1

I run YouTube channels with almost 2 billion views and this actually concerns me. I would love to try this in my productions!!

egnehots

This could be used as an incredibly low-bitrate codec for some streaming use cases (video conferencing or podcasts on <3G, for example: just send some keyframes + the audio).

emsign

It looks funny.

golol

Modern operating systems should include by default a very simple private/public key system to sign arbitrary files. I think it should not be very complicated? We badly need this in the age of AI.

Ajedi32

How would that help?

ssalka

Auto-watermarking of AI-generated content, I would imagine.

Ajedi32

What does that have to do with signing arbitrary files?

echelon

That's too much effort and the use cases are what exactly? Helping the prosecution or defense in lawsuits?

People are going to get so used to AI content that it won't really matter. Culture is plastic. This will be the new norm.

Capturing photons to send signals is the new butter churning.