
How to Implement a Cosine Similarity Function in TypeScript

ashvardanian

It’s a nice post, but “using array methods” probably shouldn’t be placed in the “Efficient Implementation” section. As often happens with high-level languages, a single plain old loop is faster than three array methods.
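To illustrate the single-loop point, here is a minimal sketch (not the code from the post; the name `cosineSimilarityLoop` and the zero-vector handling are assumptions). It accumulates the dot product and both squared norms in one pass, without allocating intermediate arrays:

```typescript
// Single-pass cosine similarity: one loop accumulates the dot product
// and both squared norms, so no intermediate arrays are created.
function cosineSimilarityLoop(a: ArrayLike<number>, b: ArrayLike<number>): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i];
    const y = b[i];
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  // Assumption: return 0 when either vector has zero magnitude.
  return denom === 0 ? 0 : dot / denom;
}
```

Whether this actually beats the array-method version depends on the engine and the vector length, so it is worth benchmarking on your own data.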

Similarly, if you plan to query those vectors in search, you should consider contiguous `TypedArray` types and scalars smaller than the double-precision `number`.
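As a rough sketch of what that could look like (the helper names `packVectors` and `rowView` are hypothetical, and single-precision storage is assumed to be accurate enough for your embeddings), all vectors can live in one `Float32Array` instead of an array of `number[]`:

```typescript
// Pack all vectors into one contiguous Float32Array (4 bytes per scalar
// instead of the 8 bytes used by a double-precision number).
function packVectors(vectors: number[][], dim: number): Float32Array {
  const packed = new Float32Array(vectors.length * dim);
  vectors.forEach((v, row) => {
    if (v.length !== dim) {
      throw new Error(`Vector ${row} has length ${v.length}, expected ${dim}`);
    }
    packed.set(v, row * dim); // copy the row into its slot
  });
  return packed;
}

// Return a view (no copy) over one row of the packed buffer.
function rowView(packed: Float32Array, row: number, dim: number): Float32Array {
  return packed.subarray(row * dim, (row + 1) * dim);
}
```

Because `subarray` returns a view over the same buffer, reading a row does not allocate a new copy of the data.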

I know very little about JS, but some of the amazing HackerNews community members have previously helped port SimSIMD to JavaScript (https://github.com/ashvardanian/SimSIMD), and I wrote a blog post covering some of those JS/TS-specifics, NumJS, and MathJS in 2023 (https://ashvardanian.com/posts/javascript-ai-vector-search/).

Hopefully, it should help unlock another 10-100x in performance.

itishappy

Those are some janky diagrams. The labels are selectable, and therefore are repeatedly highlighted and un-highlighted while dragging the vector around. The "direction only" arrow prevents you from changing the magnitude, but it doesn't prevent said magnitude from changing, and it does so often because the inputs are quantized but the magnitude isn't. Multiple conventions for decimals are used within the same diagram. The second diagram doesn't center the angle indicator on the actual angle. Also, the "send me feedback on X" popup didn't respond to the close button, but then disappeared when I scrolled again, so maybe it did? I'm running Chrome 134.0.6998.36 for Windows 10 Enterprise 22H2 build 19045.5487.

This whole thing looks like unreviewed AI. Stylish but fundamentally broken. I haven't had a chance to dig into the meat of the article yet, but unfortunately this is distracting enough that I'm not sure I will.

Edit: I'm digging into the meat, and it's better! Fortunately, it appears accurate. Unfortunately, it's rather repetitive. There are two paragraphs discussing the meaning of -1, 0, and +1 interleaved with multiple paragraphs explaining how cosine similarity allows vectors to be compared regardless of magnitude. The motivation is spread throughout the whole thing and repetitive, and the real-world examples seem similar, though formatted just differently enough to make it hard to tell at a glance.

To try to offer suggestions instead of just complaining... Here are my recommended edits:

I'd move the simple English explanation to the top after the intro, then remove everything but the diagrams until you reach the example. I'd completely remove the explanation of vectors unless you're going to include an explanation of dot products. I really like the functional approach, but feel like you could combine it with the `Math.hypot` example (leave the full formula as a comment, the rest is identical), and with the full example (although it's missing the `Math.hypot` optimization). Finally, I feel like you could get away with just one real web programming example, though I don't know which one I'd choose. The last section about using OpenAI for embedding and its disclaimer is already great!
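A sketch of what the combined version might look like (this is a guess at the shape, not the article's code; the function name `cosineSimilarityHypot` and the zero-magnitude handling are assumptions):

```typescript
// Functional cosine similarity with Math.hypot for the magnitudes.
function cosineSimilarityHypot(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same length");
  }
  const dot = a.reduce((sum, x, i) => sum + x * b[i], 0);
  // Full formula for reference:
  //   Math.sqrt(v.reduce((sum, x) => sum + x * x, 0))
  // Math.hypot(...v) computes the same Euclidean norm.
  const magA = Math.hypot(...a);
  const magB = Math.hypot(...b);
  // Assumption: return 0 when either vector has zero magnitude.
  return magA === 0 || magB === 0 ? 0 : dot / (magA * magB);
}
```

One caveat: spreading a very long array into `Math.hypot` passes every element as a function argument, which can hit engine limits, so the explicit formula may be safer for large embeddings.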

alexop

Thank you for the good feedback. I tried to improve that. I was writing the blog post for myself to understand Cosine Similarity, which is why it's maybe a bit repetitive, but this is the best way for me to learn something. I get your point. Next time I will write it better. Good feedback - I love that.

schappim

I attempted to implement this on the front end of my e-commerce site, which has approximately 20,000 products (see gist [1]). My goal was to enhance search speed by performing as many local operations as possible.

The biggest performance gain came from moving to dot products.
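For readers wondering why dot products help: if every embedding is normalized to unit length once, up front, cosine similarity reduces to a plain dot product and the per-query square roots and divisions go away. A minimal sketch under that assumption (the gist may do it differently; `normalizeInPlace` and `dot` are hypothetical names):

```typescript
// Scale a vector to unit length in place. Do this once when building the index.
function normalizeInPlace(v: Float32Array): Float32Array {
  let sumSq = 0;
  for (let i = 0; i < v.length; i++) sumSq += v[i] * v[i];
  const norm = Math.sqrt(sumSq);
  if (norm > 0) {
    for (let i = 0; i < v.length; i++) v[i] /= norm;
  }
  return v;
}

// For unit-length vectors, this dot product equals their cosine similarity.
function dot(a: Float32Array, b: Float32Array): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}
```

The query vector has to be normalized the same way before comparing it against the index.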

Regrettably, the sheer size of the index of embeddings rendered it impractical to achieve the desired results.

1. https://gist.github.com/schappim/d4a6f0b29c1ef4279543f6b6740...

alexop

This looks nice. I also played around over the weekend with Vue and Transformers.js to build the embeddings locally. See https://github.com/alexanderop/vue-vector-search