Building a full-text search engine in 150 lines of Python code (2021)
9 comments
· January 22, 2025
cocodill
> return [token for token in tokens if token]
I love that kind of bullshit poetry.
ks2048
If you're used to this, it's nice and readable.
Or you can do,
return filter(None,tokens)
Not obvious, but giving None is like giving "lambda x: x" to filter().
chaos_emergent
Holy shit, I’ve been writing python for 15 years and it’s the first time I’ve seen None used as an identity function. That’s nuts! How does it work under the hood? Does filter have a special case for evaluating None?
echoangle
Yes, that’s just a special case. You can’t call None as a function.
https://docs.python.org/3/library/functions.html#filter
> If function is None, the identity function is assumed, that is, all elements of iterable that are false are removed.
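For anyone skimming the thread, here is a minimal sketch comparing the two spellings. The function names and the sample tokens are made up for illustration and are not from the article. One thing to watch: in Python 3, filter() returns a lazy iterator rather than a list, so the sketch wraps it in list() to get the same result type as the comprehension.

```python
def drop_falsy_comprehension(tokens):
    # Keep only truthy items, as in the quoted line from the article.
    return [token for token in tokens if token]


def drop_falsy_filter(tokens):
    # filter(None, ...) keeps truthy items. It returns a lazy iterator in
    # Python 3, so list() is needed to get a list back.
    return list(filter(None, tokens))


# Hypothetical input: empty strings, None and 0 are all falsy.
sample = ["full", "", "text", None, "search", 0]

print(drop_falsy_comprehension(sample))
print(drop_falsy_filter(sample))
```

Both calls should print the same list of the three non-empty strings. Note that 0 is dropped too, which may or may not be what you want if the tokens can be numbers.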
dkga
It grew on me, too. Originally I frowned upon this kind of python shenanigans but now I must confess it makes me a bit happier inside whenever I have to type a similar thing.
graemep
I think it's lovely if it's reasonably readable (which this is), but they can be convoluted. I have written Python list comprehensions that I could not read myself the next day, so I am more careful now.
pastage
Yeah, but what is worse? I tried to understand the Haskell way in an article last week[0]
pure (n, guard (factor /= n) $> factor)
Which I think is more or less the same as this Python line:
return [factor for factor in factors if factor and not factor == n]
The article does fancy stuff with memory caches, which I believe is easy to do in Python, but I need to understand the Haskell code better.
[0] Haskell: A Great Procedural Language https://entropicthoughts.com/haskell-procedural-programming#... https://news.ycombinator.com/item?id=42754098
mrkeen
> pure (n, guard (factor /= n) $> factor)
Returns a tuple: Left side is n, right side is (Just factor) if factor is not n, or Nothing if it is.
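A rough Python rendering of that description, in case it helps. This is only a sketch of the tuple behaviour described above, not a translation of the article's code: the function name is hypothetical, and None stands in for Nothing. Unlike the list comprehension quoted earlier, it always returns a pair and does not drop anything.

```python
from typing import Optional, Tuple


def pair_with_other_factor(n: int, factor: int) -> Tuple[int, Optional[int]]:
    # Left side is always n. Right side is the factor when it differs from n
    # (like Just factor), otherwise None (standing in for Nothing).
    return (n, factor if factor != n else None)


print(pair_with_other_factor(12, 3))
print(pair_with_other_factor(12, 12))
```

The first call should give a pair with 3 on the right, the second a pair with None on the right.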
This is a good intro to text search. Shameless plug: If you throw in a bit more, ca. 250 SLOC, you can have BM25 search: https://github.com/jankovicsandras/bm25opt