
LLM Embeddings Explained: A Visual and Intuitive Guide

boulevard

This is a great visual guide! I’ve also been working on a similar concept focused on deep understanding - a visual + audio + quiz-driven lesson on LLM embeddings, hosted on app.vidyaarthi.ai.

https://app.vidyaarthi.ai/ai-tutor?session_id=C2Wr46JFIqslX7...

Our goal is to make abstract concepts more intuitive and interactive — kind of like a "learning-by-doing" approach. Would love feedback from folks here.

(Not trying to self-promote — just sharing a related learning tool we’ve put a lot of thought into.)

rithikrolex

She always hurting my mom

rithikrolex

But i don't like my grandma

rithikrolex

My grandma went to police station

carschno

Nice explanations! A (more advanced) aspect which I find missing would be the difference between encoder-decoder transformer models (BERT) and "decoder-only", generative models, with respect to the embeddings.

dust42

Minor correction: BERT is an encoder (not an encoder-decoder), and ChatGPT is a decoder.

Encoders like BERT produce better results for embeddings because they look at the whole sentence, while GPTs look from left to right:

Imagine you're trying to understand the meaning of a word in a sentence, and you can read the entire sentence before deciding what that word means. For example, in "The bank was steep and muddy," you can see "steep and muddy" at the end, which tells you "bank" means the side of a river (aka riverbank), not a financial institution. BERT works this way - it looks at all the words around a target word (both before and after) to understand its meaning.

Now imagine you have to understand each word as you read from left to right, but you're not allowed to peek ahead. So when you encounter "The bank was..." you have to decide what "bank" means based only on "The" - you can't see the helpful clues that come later. GPT models work this way because they're designed to generate text one word at a time, predicting what comes next based only on what they've seen so far.
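
A toy sketch of that difference in plain Python. It is not how either model is implemented: the hypothetical helper `visible_context` only lists which tokens a position is allowed to use under each regime, and the whitespace tokenization is an assumption for illustration (real models use subword tokens).

```python
def visible_context(tokens, index, bidirectional):
    """Return the tokens that position `index` may use to build its representation."""
    if bidirectional:
        # Encoder-style (BERT-like): every token in the sentence is visible.
        return list(tokens)
    # Decoder-style (GPT-like): only the token itself and those to its left.
    return list(tokens[: index + 1])


tokens = "The bank was steep and muddy".split()
bank = tokens.index("bank")

print(visible_context(tokens, bank, bidirectional=True))
# ['The', 'bank', 'was', 'steep', 'and', 'muddy']
print(visible_context(tokens, bank, bidirectional=False))
# ['The', 'bank']
```

In the left-to-right case, "steep" and "muddy" never reach the representation of "bank".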

Here is a link from Hugging Face about ModernBERT, which has more info: https://huggingface.co/blog/modernbert

Also worth a look: NeoBERT https://huggingface.co/papers/2502.19587

ubutler

Further to @dust42, BERT is an encoder, GPT is a decoder, and T5 is an encoder-decoder.

Encoder-decoders are not in vogue.

Encoders are favored for classification, extraction (eg, NER and extractive QA) and information retrieval.

Decoders are favored for text generation, summarization and translation.

Recent research (see, eg, the Ettin paper: https://arxiv.org/html/2507.11412v1 ) seems to confirm the previous understanding that encoders are indeed better for “encoder tasks” and vice versa.

Fundamentally, both are transformers and so an encoder could be turned into a decoder or a decoder could be turned into an encoder.

The design difference comes down to bidirectional (ie, all tokens can attend to all other tokens) versus autoregressive attention (ie, the current token can only attend to the previous tokens).
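
A minimal sketch of the two masks, assuming a single attention head with no learned weights and made-up 2-d token vectors (all names here are hypothetical, and only the standard library is used). It checks that under a causal mask the output for the second token is unaffected by what comes after it, while under a bidirectional mask it is not.

```python
import math


def attention_mask(n, causal):
    """mask[i][j] is True when token i may attend to token j."""
    return [[(j <= i) if causal else True for j in range(n)] for i in range(n)]


def masked_attention(vectors, mask):
    """Toy dot-product attention: the vectors act as queries, keys and values."""
    dim = len(vectors[0])
    outputs = []
    for i, query in enumerate(vectors):
        allowed = [j for j in range(len(vectors)) if mask[i][j]]
        scores = [
            sum(q * k for q, k in zip(query, vectors[j])) / math.sqrt(dim)
            for j in allowed
        ]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w / total * vectors[j][d] for w, j in zip(weights, allowed))
            for d in range(dim)
        ])
    return outputs


# Two made-up sequences that share their first two tokens and differ in the third.
river = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
money = [[1.0, 0.0], [0.5, 0.5], [1.0, -1.0]]

causal = attention_mask(3, causal=True)
full = attention_mask(3, causal=False)

same_causal = masked_attention(river, causal)[1] == masked_attention(money, causal)[1]
same_full = masked_attention(river, full)[1] == masked_attention(money, full)[1]
print(same_causal, same_full)  # prints: True False
```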

rithikrolex

Hey huggingface

nycdatasci

If you want to see many more than 50 words and also have an appreciation for 3D data visualization, check out the Embedding Projector (no affiliation): https://projector.tensorflow.org/

smcleod

Seems to be down?

Lots of console errors with the likes of "Content-Security-Policy: The page’s settings blocked an inline style (style-src-elem) from being applied because it violates the following directive: “style-src 'self'”." etc...

dotancohen

One of the first sentences of the page clearly states:

  > This blog post is recommended for desktop users.

That said, there is a lot of content here that could have been mobile-friendly with very little effort. The first image, of embeddings, is a prime example. It has been a very long time since I've seen any online content, let alone a blog post, that requires a desktop browser.

petesergeant

I wrote a still simpler explanation that follows a similar flow but approaches it from more of a "problems to solve" perspective: https://sgnt.ai/p/embeddings-explainer/

bob_theslob646

If someone enjoyed learning about this, where should I suggest they start to learn more about embeddings?

ayhanfuat

Vicki Boykis wrote a small book about it: https://vickiboykis.com/what_are_embeddings/

lynx97

Shameless plug: If you want to experiment with semantic search for the pages you visit: https://github.com/mlang/llm-embed-proxy -- an intercepting proxy as an `llm` plugin.