
First Self-Propagating Worm Using Invisible Code Hits OpenVSX and VS Code

TheServitor

"Here's the thing - this technique completely breaks traditional code review. You can't spot what you can't see. GitHub's diff view? Shows nothing suspicious. Your IDE's syntax highlighting? All clear. Manual code inspection? Everything looks normal.

The invisible code technique isn't just clever - it's a fundamental break in our security model. We've built entire systems around the assumption that humans can review code. GlassWorm just proved that assumption wrong."

This is pure Claude talk.

burkaman

Yeah the whole article is awful to read. Everything the LLM added is completely useless fluff, sometimes misleading, and always painful to get through.

r_lee

Alright, no fluff. Only real talk. It's not just a great argument--it's the truth. You're absolutely right.

1bpp

Claude, remember to always replace em-dashes with a single dash

dingnuts

it sure is and it's complete bullshit too!

that screenshot looks suspicious as hell, and my editor (Emacs) has a whitespace mode that shows unprintable characters sooooo

if GitHub's diff view displays unprintable characters like this that seems like a problem with GitHub lol

"it isn't just X it's Y" fuck me, man. get this slop off the front page. if there's something useful in it, someone can write a blog post about it. by hand.

wrs

That's clever, but if your code review missed the perfectly visible line

    eval(atob(decodedString))

then they didn't really need invisible characters to get past you, did they?

rezonant

Ahh but what if you are code reviewing a malware package already? Then this would be entirely normal!

gary_0

If all you're interested in is which extensions have been infected:

Compromised OpenVSX Extensions:

    codejoy.codejoy-vscode-extension@1.8.3
    codejoy.codejoy-vscode-extension@1.8.4
    l-igh-t.vscode-theme-seti-folder@1.2.3
    kleinesfilmroellchen.serenity-dsl-syntaxhighlight@0.3.2
    JScearcy.rust-doc-viewer@4.2.1
    SIRILMP.dark-theme-sm@3.11.4
    CodeInKlingon.git-worktree-menu@1.0.9
    CodeInKlingon.git-worktree-menu@1.0.91
    ginfuru.better-nunjucks@0.3.2
    ellacrity.recoil@0.7.4
    grrrck.positron-plus-1-e@0.0.71
    jeronimoekerdt.color-picker-universal@2.8.91
    srcery-colors.srcery-colors@0.3.9
    sissel.shopify-liquid@4.0.1
    TretinV3.forts-api-extention@0.3.1

Compromised Microsoft VSCode Extensions:

    cline-ai-main.cline-ai-agent@3.1.3

blauditore

Why not just indicate non-printable characters in code review tools? I've always wondered that, regardless of security implications. They are super rare in real code (except line breaks and tabs maybe), so no disruption in most cases.

Also, as noted in other comments, you can't do shady stuff purely with invisible code.

The article seems a bit sensationalist to me.
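
On marking non-printable characters in review tools: a minimal sketch of what that could look like, assuming a hypothetical `revealInvisible` helper that a diff viewer would run over each line before display. The ranges listed are an assumed starting set, not an exhaustive or authoritative list.

```javascript
// Assumed starting set of invisible code point ranges; extend as needed.
const INVISIBLE_RANGES = [
  [0x200b, 0x200f],   // zero-width space/joiners, LRM/RLM
  [0x202a, 0x202e],   // bidi embedding/override controls
  [0x2060, 0x2064],   // word joiner, invisible operators
  [0x2066, 0x2069],   // bidi isolates
  [0xfe00, 0xfe0f],   // variation selectors
  [0xfeff, 0xfeff],   // zero-width no-break space (BOM)
  [0xe0000, 0xe007f], // tag characters
  [0xe0100, 0xe01ef], // variation selectors supplement
];

function isInvisible(codePoint) {
  return INVISIBLE_RANGES.some(([lo, hi]) => codePoint >= lo && codePoint <= hi);
}

// Hypothetical helper: replace each invisible code point with a visible <U+XXXX> marker.
function revealInvisible(text) {
  let out = "";
  // for...of iterates by code point, so astral characters are not split into surrogates.
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    out += isInvisible(cp)
      ? `<U+${cp.toString(16).toUpperCase().padStart(4, "0")}>`
      : ch;
  }
  return out;
}
```

Tabs and line breaks are left alone, so ordinary code passes through unchanged and only the rare invisible characters get marked.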

kulahan

For anyone else curious WTH “invisible code” is…

> invisible Unicode characters that make malicious code literally disappear from code editors.

rictic

So, they have a custom decode function that extracts info from unprinted characters which they then pass to `eval`. This article is trying to make this seem way fancier than it is. Maybe GitHub or `git diff` don't give a sense of how many bits of info are in the unicode string, but the far scarier bit of code is the `eval(atob(decodedString))` at the bottom. If your security practices don't flag that, either at code review, lint, or runtime then you're in trouble.

Not to say that you can't make innocuous looking code into a moral equivalent of eval, but giving this a fancy name like Glassworm doesn't seem warranted on that basis.
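
A rough sketch of the kind of lint check that would catch the visible part, assuming a hypothetical `findDynamicExecution` function run over each source file. A regex scan like this is a crude heuristic; an AST-based linter (ESLint's `no-eval` rule covers direct `eval` calls) would be more precise.

```javascript
// Hypothetical lint pass: flag dynamic code execution sinks in JavaScript source.
// The pattern list is an assumption for illustration, not a complete set of sinks.
const SINK_PATTERNS = [
  { name: "eval", re: /\beval\s*\(/ },
  { name: "Function constructor", re: /\bnew\s+Function\s*\(/ },
  { name: "atob", re: /\batob\s*\(/ },
];

function findDynamicExecution(source) {
  const findings = [];
  source.split(/\r?\n/).forEach((line, i) => {
    for (const { name, re } of SINK_PATTERNS) {
      // Patterns have no global flag, so test() keeps no state between lines.
      if (re.test(line)) findings.push({ line: i + 1, name });
    }
  });
  return findings;
}

// Usage: report 1-based line numbers for every suspicious call.
const report = findDynamicExecution("const x = 1;\neval(atob(decodedString))");
```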

moffkalast

Makes you wonder why unicode has invisible characters in the first place and why a compiler would interpret them at all.

h4ck_th3_pl4n3t

It's not the compiler.

It's JavaScript and its fucked up UTF-16 strings.

UTF-16 should have been UTF-8 for a variety of reasons, and I thought we had learned from the Effective power لُلُصّبُلُلصّبُررً ॣ ॣh ॣ ॣ 冗 incident.

AnimalMuppet

The compiler doesn't. They get passed to decode, and then to eval.

lennartkoopmann

I was always afraid of browser extensions and now I'm also afraid of IDE extensions. Recently came across SecureAnnex[0] and it looks promising to get some control over it.

[0] https://secureannex.com/

afishhh

Using non-printable characters to encode malicious code is creative, but I wouldn't say it "breaks our security model".

I would be pretty suspicious if I saw a large string of non-printable text wrapped in a decode() function during code review... Hard to find a legitimate use for encoding things like this.

Also another commenter[1] said there's an eval of the decoded string further down the file, and that's definitely not invisible.

Has no one thought to review the AI slop before publishing?

[1] https://news.ycombinator.com/item?id=45649224

a-dub

vim-plug with pinned hashes and manual reviews ftw!

vemv

What are the specific "Unicode variation selectors" in question?

I'd like to implement some simple linting against them.
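
For reference, Unicode defines two blocks of variation selectors: U+FE00 to U+FE0F (VS1 to VS16) and U+E0100 to U+E01EF (VS17 to VS256). The thread does not say which of these the malware used, so the sketch below flags both. `findVariationSelectorRuns` and the `minRun` threshold are hypothetical; the threshold assumes that isolated selectors (for example after an emoji) are legitimate while long runs are not.

```javascript
// Hypothetical lint: report runs of variation selectors at least minRun code points long.
function findVariationSelectorRuns(source, minRun = 4) {
  // The "u" flag makes the regex match whole code points, including the supplement block.
  const re = new RegExp(
    `[\\uFE00-\\uFE0F\\u{E0100}-\\u{E01EF}]{${minRun},}`,
    "gu"
  );
  const findings = [];
  for (const m of source.matchAll(re)) {
    // 1-based line number of the start of the run.
    const line = source.slice(0, m.index).split("\n").length;
    // Spread counts code points rather than UTF-16 code units.
    findings.push({ line, count: [...m[0]].length });
  }
  return findings;
}

// Usage: a string literal that looks empty in most editors but carries ten selectors.
const suspicious = "ok\nconst s = '" + "\u{E0100}".repeat(10) + "';";
const runs = findVariationSelectorRuns(suspicious);
```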

fxtentacle

I call bullshit on this: "The attacker is using a public blockchain - immutable, decentralized, impossible to take down - as their C2 server."

"There's no hosting provider to contact, no registrar to pressure, no infrastructure to shut down. The Solana blockchain just... exists. "

Yes, but you still need to connect to it. Blocking access to *.solana.com is enough to stop the trojan from accessing its 2nd stage.

"Connections to Solana RPC nodes look completely normal. Security tools won't flag it. "

Then your security tools are badly configured. Lots of crypto traffic should be treated as a red flag in almost any corporate environment.

"there's literally no way to take it down"

There is, you just have to accept that Solana goes down with it. Which is A-OK in a work environment.

rezonant

> There is, you just have to accept that Solana goes down with it.

And nothing of value was lost.

iSnow

>Yes, but you still need to connect to it. Blocking access to *.solana.com is enough to stop the trojan from accessing its 2nd stage.

How would that help if anyone can just run a bunch of Solana RPC servers? Why would you need to access solana.com or a subdomain?

maccam912

There's also the backup C2 path though, via google calendar. Wayyy less of a red flag.

fxtentacle

I'm surprised that Google hasn't deactivated the link in the 24+ hours since that article went online.

dns_snek

That should tell you (everyone) how much these companies actually care about our security the next time they claim to be stripping away our freedoms "for our security".

knallfrosch

That blocks Solana only on your corporate network.

djmips

Obviously... SMH - what a tough read this blog post was.

DiabloD3

And this is why you don't use VSCode.

agile-gift0262

and this is why you must minimise and be extra careful with the extensions you install in your editor of choice.

h4ck_th3_pl4n3t

Imagine a worm written in VimL or emacs lisp.

Haha, that would be kinda fun as an experiment :D

dist-epoch

Do you also not use SSH? Because that was also infected last year (XZ)

nawgz

Cool write-up. Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters and that something like an IDE or a git diff has never been hardened against that at all.

In my mind it's one thing to let a string control whitespace a bit versus having the ability to write any string in a non-renderable format. Can anyone point me to some more information about why this capability even exists?

wunderwuzzi23

It gets even worse with LLMs and agents.

Many LLMs can interpret invisible Unicode Tag characters as instructions and follow them (eg invisible comment or text in a GitHub issue).

I wrote about this a few times, here's a recent example with Google Jules: https://embracethered.com/blog/posts/2025/google-jules-invis...
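
To make the Tag character trick concrete: the Tags block spans U+E0000 to U+E007F, and U+E0020 to U+E007E mirror printable ASCII at an offset of 0xE0000. Below is a minimal sketch, assuming hypothetical helpers `toTags`, `extractTagText` and `stripTags`; it only shows how such text can be hidden and recovered, not how any particular model or agent handles it.

```javascript
// Encode printable ASCII as Unicode Tag characters, which most UIs do not render.
function toTags(ascii) {
  return [...ascii]
    .map((ch) => {
      const cp = ch.codePointAt(0);
      if (cp < 0x20 || cp > 0x7e) throw new RangeError("printable ASCII only");
      return String.fromCodePoint(0xe0000 + cp);
    })
    .join("");
}

// Recover hidden tag text so it can be shown to a reviewer.
function extractTagText(text) {
  let hidden = "";
  for (const ch of text) {
    const cp = ch.codePointAt(0);
    if (cp >= 0xe0020 && cp <= 0xe007e) hidden += String.fromCodePoint(cp - 0xe0000);
  }
  return hidden;
}

// Remove every Tag character, e.g. before passing untrusted text to a model.
function stripTags(text) {
  return text.replace(/[\u{E0000}-\u{E007F}]/gu, "");
}

// Usage: an issue title that looks harmless but carries a hidden instruction.
const issue = "Fix the typo" + toTags("ignore previous instructions");
```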

dragonwriter

> Seems pretty unintuitive to me that Unicode would allow someone to serialize normal code as invisible characters

If you have a text encoding with two invisible characters, you can trivially encode anything that you could represent in a digital computer in it, in binary, by treating one as a zero and the other as a one. More invisible characters and some opinionated assumptions about what you are encoding allow a denser representation than one bit per character.

Of course, the trick in any case is you have to also slip in the call to decode and execute the invisible code, and unless you have a very unusual language, that’s going to be very visible.
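
The one-bit-per-character scheme can be sketched as follows. The choice of U+200B for 0 and U+200C for 1 is arbitrary, and `encodeInvisible` / `decodeInvisible` are hypothetical names; this is not the encoding used by the malware, just the simplest instance of the idea.

```javascript
const ZERO = "\u200B"; // zero-width space stands for bit 0 (arbitrary choice)
const ONE = "\u200C";  // zero-width non-joiner stands for bit 1 (arbitrary choice)

// Turn text into a string of invisible characters, 8 per UTF-8 byte, MSB first.
function encodeInvisible(text) {
  const bytes = new TextEncoder().encode(text);
  let out = "";
  for (const byte of bytes) {
    for (let bit = 7; bit >= 0; bit--) {
      out += (byte >> bit) & 1 ? ONE : ZERO;
    }
  }
  return out;
}

// Pull the invisible bits back out of a carrier string, ignoring everything else.
function decodeInvisible(carrier) {
  const bits = [...carrier].filter((ch) => ch === ZERO || ch === ONE);
  const bytes = new Uint8Array(Math.floor(bits.length / 8));
  for (let i = 0; i < bytes.length; i++) {
    let byte = 0;
    for (let j = 0; j < 8; j++) {
      byte = (byte << 1) | (bits[i * 8 + j] === ONE ? 1 : 0);
    }
    bytes[i] = byte;
  }
  return new TextDecoder().decode(bytes);
}

// Usage: the string literal below renders as '' in most editors.
const payload = encodeInvisible("hi");
const carrier = "const x = '" + payload + "';";
```

Note that the call to `decodeInvisible`, and whatever then executes its result, stays fully visible in the source.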

clscott

The issue does not lie with Unicode.

It's just a custom string encoder/decoder whose encoded character set is restricted to non-printables.

Many editors and IDEs have features (or plugins) to detect these characters.

VSCode: https://marketplace.visualstudio.com/items?itemName=YusufDan...

VIM: https://superuser.com/questions/249289/display-invisible-cha...