Falsify: Hypothesis-Inspired Shrinking for Haskell (2023)
11 comments
April 20, 2025
thesz
This is fascinating!
If I understand correctly, they approximate the language of a function's inputs to discover minimal (in some sense, such as "shortest description length") inputs that violate relations between the inputs and outputs of the function under scrutiny.
mjw1007
I've found in practice that shrinking to get the "smallest amount of detail" is often unhelpful.
Suppose I have a function which takes four string parameters, and I have a bug which means it crashes if the third is empty.
I'd rather see this in the failure report:
("ldiuhuh!skdfh", "nd#lkgjdflkgdfg", "", "dc9ofugdl ifugidlugfoidufog")
than this:
("", "", "", "")
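To make the scenario above concrete, here is a minimal sketch of a greedy shrinker over a tuple of strings. It is not how falsify or Hypothesis actually shrink; `crashes` and `shrink` are hypothetical names introduced only for illustration, and `crashes` stands in for the buggy four-string function. It shows why a naive "smallest input" strategy collapses every argument to the empty string.

```python
from typing import Callable, Tuple

Args = Tuple[str, ...]


def crashes(args: Args) -> bool:
    """Hypothetical bug: the function under test fails when the third argument is empty."""
    return args[2] == ""


def shrink(args: Args, fails: Callable[[Args], bool]) -> Args:
    """Greedily shorten each argument for as long as the failure still reproduces."""
    current = list(args)
    progress = True
    while progress:
        progress = False
        for i, s in enumerate(current):
            # Candidates: the empty string first, then every single-character deletion.
            candidates = [""] + [s[:j] + s[j + 1:] for j in range(len(s))]
            for cand in candidates:
                if len(cand) < len(s):
                    trial = current[:i] + [cand] + current[i + 1:]
                    if fails(tuple(trial)):
                        current = trial
                        progress = True
                        break
    return tuple(current)


original = ("ldiuhuh!skdfh", "nd#lkgjdflkgdfg", "", "dc9ofugdl ifugidlugfoidufog")
print(shrink(original, crashes))  # ('', '', '', '')
```

Because the other three arguments play no part in the failure, nothing stops the shrinker from emptying them too, which is exactly the report being objected to.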
gwern
Really? Your examples seem the opposite. I am left immediately thinking, "hm, is it failing on a '!', some sort of shell issue? Or is it truncating the string on '#', maybe? Or wait, there's a space in the third one, that looks pretty dangerous, as well as noticeably longer so there could be a length issue..." As opposed to the shrunk version where I immediately think, "uh oh: one of them is not handling an empty input correctly." Also, way easier to read, copy-paste, and type.
dullcrisp
Their point is that in the unshrunk example the “special” value stands out.
I guess if we were even more clever we could get to something more like (…, …, "", …).
tybug
The Hypothesis explain phase [1][2] does this!
fails_on_empty_third_arg(
a = "", # or any other generated value
b = "", # or any other generated value
c = "",
d = "", # or any other generated value
)
[1] https://hypothesis.readthedocs.io/en/latest/reference/api.ht...
gwern
The special value doesn't stand out, though. All three examples I gave were what I thought skimming his comment before my brain caught up to his caveat about an empty third argument. The empty string looked like it was by far the most harmless part... Whereas if they are all empty strings, then by definition the empty string stands out as the most suspicious possible part.
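The "(…, …, "", …)" style of report discussed above can be approximated with a simple sketch: starting from a failing example, swap each argument for a few alternative values and see whether the failure persists. This is only an illustration of the idea, not the algorithm Hypothesis uses in its explain phase; `irrelevant_arguments`, `report`, and the `alternatives` pool are all hypothetical.

```python
import json
from typing import Callable, List, Sequence, Tuple

Args = Tuple[str, ...]


def irrelevant_arguments(
    failing: Args,
    fails: Callable[[Args], bool],
    alternatives: Sequence[str],
) -> List[bool]:
    """For each position, report whether every tried alternative still fails.

    True is sampled evidence that the argument does not matter, not a proof.
    """
    flags = []
    for i in range(len(failing)):
        varied = [
            failing[:i] + (alt,) + failing[i + 1:]
            for alt in alternatives
            if alt != failing[i]
        ]
        flags.append(bool(varied) and all(fails(v) for v in varied))
    return flags


def report(failing: Args, flags: List[bool]) -> str:
    """Render the example, eliding arguments that appear not to matter."""
    parts = ["…" if free else json.dumps(v) for v, free in zip(failing, flags)]
    return "(" + ", ".join(parts) + ")"


def fails_on_empty_third_arg(args: Args) -> bool:
    return args[2] == ""


minimal = ("", "", "", "")
flags = irrelevant_arguments(minimal, fails_on_empty_third_arg, ["", "x", "a b", "#!"])
print(report(minimal, flags))
```

The check costs one extra run of the test per argument per alternative, so it only makes sense to do after shrinking has finished.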
shae
I care about the edge between "this value fails, one value over succeeds". I wish shrinking were fast enough to tell me if there are multiple edges between those values.
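For a single integer input, the "edges" described above can be found by brute force, as in the sketch below. The `edges` helper and the example property are hypothetical. The scan is linear in the size of the range, which is exactly the speed problem being pointed out.

```python
from typing import Callable, List, Tuple


def edges(fails: Callable[[int], bool], lo: int, hi: int) -> List[Tuple[int, int]]:
    """Return every adjacent pair (x, x + 1) in [lo, hi] where pass/fail flips."""
    found = []
    previous = fails(lo)
    for x in range(lo + 1, hi + 1):
        current = fails(x)
        if current != previous:
            found.append((x - 1, x))
        previous = current
    return found


# Hypothetical property with more than one failing region.
def fails(n: int) -> bool:
    return 10 <= n < 20 or n >= 50


print(edges(fails, 0, 60))  # [(9, 10), (19, 20), (49, 50)]
```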
evertedsphere
newtype Parser a = Parser ([Word] -> (a, [Word])
missing a paren here
moomin
I’m honestly completely failing to understand the basic idea here. What does this look like for generating and shrinking random strings,
null
How do Hedgehog and Hypothesis differ in their shrinking strategies?
The article uses the words "integrated" vs. "internal" shrinking.
> the raison d’être of internal shrinking: it doesn’t matter that we cannot shrink the two generators independently, because we are not shrinking generators! Instead, we just shrink the samples that feed into those generators.
Besides that, it seems falsify has many of the same features, such as a choice of ranges and distributions.
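A rough way to picture the quoted point about internal shrinking: the generator is a plain function from a list of raw samples to a value, and the shrinker only ever edits that list. The sketch below is an assumption-laden toy, not falsify's or Hypothesis's implementation; `gen_pair` and `shrink_samples` are hypothetical names, and real libraries use richer sample structures than a flat list of integers.

```python
from typing import Callable, List, Tuple

Samples = List[int]


def gen_pair(samples: Samples) -> Tuple[int, int]:
    """Hypothetical dependent generator: the second value is built from the first."""
    padded = samples + [0, 0]        # missing samples read as zero
    x = padded[0] % 100
    y = x + padded[1] % 100          # y >= x holds by construction
    return x, y


def shrink_samples(
    samples: Samples,
    gen: Callable[[Samples], Tuple[int, int]],
    fails: Callable[[Tuple[int, int]], bool],
) -> Samples:
    """Shrink the raw samples, re-running the generator after every change."""
    current = list(samples)
    progress = True
    while progress:
        progress = False
        for i, s in enumerate(current):
            for cand in (0, s // 2, s - 1):
                if 0 <= cand < s:
                    trial = current[:i] + [cand] + current[i + 1:]
                    if fails(gen(trial)):
                        current = trial
                        progress = True
                        break
            if progress:
                break                # restart the scan from the first sample
    return current


# Hypothetical property "y - x < 10", so a gap of 10 or more is a failure.
smallest = shrink_samples([73, 58], gen_pair, lambda p: p[1] - p[0] >= 10)
print(smallest, gen_pair(smallest))  # [0, 10] (0, 10)
```

The invariant `y >= x` survives shrinking for free, because every candidate is produced by running the same generator on smaller samples rather than by shrinking `x` and `y` separately.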