
Building a full-text search engine in 150 lines of Python code (2021)

jankovicsandras

This is a good intro to text search. Shameless plug: If you throw in a bit more, ca. 250 SLOC, you can have BM25 search: https://github.com/jankovicsandras/bm25opt

cocodill

> return [token for token in tokens if token]

I love that kind of bullshit poetry.

ks2048

If you're used to this, it's nice and readable.

Or you can do,

    return filter(None,tokens)
Not obvious, but giving None is like giving "lambda x : x" to filter().

chaos_emergent

Holy shit, I’ve been writing python for 15 years and it’s the first time I’ve seen None used as an identity function. That’s nuts! How does it work under the hood? Does filter have a special case for evaluating None?

echoangle

Yes, that’s just a special case. You can’t call None as a function.

https://docs.python.org/3/library/functions.html#filter

> If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
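A quick way to check the special case described above. This is only a sketch with a made-up `tokens` list (not the article's data); it compares the list comprehension, `filter(None, ...)`, and an explicit identity lambda. Note that in Python 3 `filter()` returns a lazy iterator, so it is wrapped in `list()` here.

```python
# Hypothetical sample input: a mix of truthy and falsy values.
tokens = ["full", "", "text", None, "search", 0, "engine"]

# The list comprehension from the article: keep only truthy tokens.
via_comprehension = [token for token in tokens if token]

# filter(None, ...) also drops falsy items; list() materializes the iterator.
via_filter = list(filter(None, tokens))

# Passing an explicit identity function gives the same result.
via_lambda = list(filter(lambda x: x, tokens))

print(via_comprehension)
print(via_comprehension == via_filter == via_lambda)

# None itself is not callable; filter() simply special-cases it.
print(callable(None))
```

All three forms keep the same items; the only practical difference is that the comprehension builds a list immediately while `filter()` is lazy.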

dkga

It grew on me, too. Originally I frowned upon this kind of python shenanigans but now I must confess it makes me a bit happier inside whenever I have to type a similar thing.

graemep

I think it's lovely if it's reasonably readable (which this is), but they can get convoluted. I have written Python list comprehensions that I could not read myself the next day, so I am more careful now.

pastage

Yeah, but what is worse? I tried to understand the Haskell way in an article last week[0]

  pure (n, guard (factor /= n) $> factor)
Which I think is more or less the same as this python line.

  return [factor for factor in factors if factor and not factor == n]
The article does fancy stuff with memory caches which I believe is easy to do in python but I need to understand the Haskell code better.

[0] Haskell: A Great Procedural Language https://entropicthoughts.com/haskell-procedural-programming#... https://news.ycombinator.com/item?id=42754098

mrkeen

> pure (n, guard (factor /= n) $> factor)

Returns a tuple: Left side is n, right side is (Just factor) if factor is not n, or Nothing if it is.
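For comparison, a rough Python analogue of that description might look like the sketch below. The function name `factor_entry` is hypothetical, `None` stands in for `Nothing`, a plain value stands in for `Just factor`, and the monadic wrapping done by `pure` is ignored entirely.

```python
from typing import Optional, Tuple


def factor_entry(n: int, factor: int) -> Tuple[int, Optional[int]]:
    """Loose analogue of `pure (n, guard (factor /= n) $> factor)`.

    Returns (n, factor) when factor differs from n, otherwise (n, None).
    """
    # The conditional plays the role of `guard (factor /= n) $> factor`.
    return (n, factor if factor != n else None)


print(factor_entry(12, 2))
print(factor_entry(7, 7))
```

Unlike the list comprehension quoted earlier in the thread, this does not filter a list; it builds one pair whose second slot may be empty.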