Skip to content(if available)orjump to list(if available)

Fun with -fsanitize=undefined and Picolibc

zero_k

Wow, this: "random() was returning values in int range rather than long." is a very nice bug find. Randomness is VERY hard to check for humans. For example, Python's binomial distribution is very bad on some inputs [1], giving widely wrong values, but nobody found it. I bumped into it when I implemented an algorithm to compute the approximate volume of solutions to a DNF, and the results were clearly wrong [2]. The algorithm is explained here by Knuth, in case you are interested [3]

[1] https://www.cs.toronto.edu/~meel/Slides/meel-distform.pdf [2] https://github.com/meelgroup/pepin [3] https://cs.stanford.edu/~knuth/papers/cvm-note.pdf

nasretdinov

> String to float conversion had a table missing four values. This caused an array access overflow which resulted in imprecise values in some cases.

I've once wrote a function to parse the date format from log files that Go doesn't natively support, and forgot to add November. I quit that job in April, so I never saw any issues. However when 1st of November came my ex-colleagues saw no logs for this day, and when they found out the reason they created a hash tag #nolognovember which you can probably find somewhere to this day :)

bad_username

> when 1st of November came my ex-colleagues saw no logs for this day

Faced with this symptom I would bet there was a "No" in a yaml somewhere :-)

lionkor

Who needs unit tests when you have "squints lgtm"

wavemode

This is kind of bug unit tests don't catch. The only way to exhaustively test that you're handling every possible input, is to loop over every possible input (which may tend to infinity).

Therein lies the importance of runtime assertions (so we can sanity check that parsing actually succeeded rather than silently failing) and monitoring (so we can sanity check that, for example, we don't ever go 24 hours without receiving data from the parsing job).

Joker_vD

Pfft, you know how tests for functions like this:

    func MonthToString(month int) string {
        switch month {
        case 1: return "January"
        case 2: return "February"
        ...
        case 10: return "October"
        case 12: return "December"
        default: panic(fmt.Errorf("invalid month number: %d", month))
        }
    }
are usually written? You take the switch's body, shove it into the test function, and then replace "case/return" with regexp to "assert.Equal" or something:

    func TestMonthToString(t *testing.T) {
        assert.Equal(t, "January", MonthToString(1))
        assert.Equal(t, "February", MonthToString(2))
        ...
        assert.Equal(t, "October", MonthToString(10))
        assert.Equal(t, "December", MonthToString(12))
        assert.PanicsWithError(t, "invalid month number: 13", func() { MonthToString(13) })
    }
Look ma, we got that sweet 100% code coverage!

BobbyTables2

I feel attacked! (:>

nasretdinov

Well I guess you can have a unit test that checks that there are 12 entries :). Repeating the same list probably wouldn't have found an issue unfortunately

moefh

> Passing pointers to the middle of a data structure. For example, free takes a pointer to the start of an allocation. The management structure appears just before that in memory; computing the address of which appears to be undefined behavior to the compiler.

To clarify, the undefined behavior here is that the sanitizer sees `free` trying to access memory outside the bounds of what was returned by `malloc`.

It's perfectly valid to compute the address of a struct just before memory pointed to by a pointer you have, as long as the result points to valid memory:

    void not_free(void *p) {
      struct header *h = (struct header *) (((char *)p) - sizeof(struct header));
      // ...
    }
In the case of `free`, that resulting pointer is technically "invalid" because it's outside what was returned by `malloc`, even though the implementation of `malloc` presumably returned a pointer to memory just past the header.

bestouff

> the vast bulk of sanitizer complaints came from invoking undefined or implementation-defined behavior in harmless ways

This is patently false. Any Undefined Behavior is harmful because it allows the optimizer to insert totally random code, and this is not a purely theoretical behavior, it's been repeatedly demonstrated happening. So even if your UB code isn't called, the simple fact it exists may make some seemingly-unrelated code behave wrongly.

almostgotcaught

> optimizer to insert totally random code

What are you even saying - what is your definition of "random code". FYI UB is exactly (one of) the places where an optimizer can insert optimized code.

arnsholt

To take an example from the post: in some cases a value was computed that could overflow, but it was not used because of a later overflow check. The optimizer would be fully within its rights to delete the code inside the overflow check, because the computation implicitly asserts that it won't overflow (since overflow is undefined). I think this is a more or less useful way of thinking around UB: any operation you put in your program implicitly asserts that the values are such that UB won't happen. For example, dereferencing a pointer implicitly means it cannot be NULL, because derefing NULL is UB, and anything downstream of that deref which checks for NULL can be deleted.

flohofwoe

Unfortunately UB is an umbrella term for all sorts of things, and some of those can be very harmful/unexpected, while others are (currently) harmless - but that may change in new compiler versions.

The typical optimization showcase (better code generation for signed integer loop counts) only works when the (undefined behaviour) signed integer overflow doesn't actually happen (e.g. the compiler is free to assume that the loop count won't overflow). But when the signed integer overflow happens all bets are off what will actually happen to the control flow - while that same signed integer overflow in another place may simply wrap around.

Another similar example is to specifically 'inject' UB by putting a `std::unreachable` into the default case of a switch statement. This enables an optimization that the compiler omits a range check before accessing the switch-case jump table. But if the switch-variable isn't handled in a case-branch, the jump table access may be out-of-bounds and there will be a jump to a random location.

In other situations the compiler might even be able to detect at compile time that the UB is triggered and simply generate broken code (usually optimizing away some critical part), or if you're lucky the compiler inserts an ud instruction which crashes the process.

moefh

Not OP, but here's an example of "random code" inserted by the compiler[1]: note the assembly instruction "ud2" ("invalid opcode exception" in x86 land) instead of "ret" in not_ok().

You might think this code would be fine if address 0 were mapped to RAM, but both gcc and clang know it's undefined behavior to use the null pointer like that, so they add "random code" that forces a processor exception.

[1] https://godbolt.org/z/sK55YsGz1

juliangmp

> [...] detect places where the program wanders into parts of the C language specification [...]

Small nitpick, the UB sanitizer also has some checks specific for C++ https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html