Binary modding a water dispenser to save me from pressing a button (2021)
3 comments
January 9, 2025 · Evidlo
beng-nl
Decompilation output is primarily intended to be consumed by humans, i.e. to make the assembly more understandable because it’s written in functions, ifs, and for loops; it’ll likely not compile at all, and if it does, not to the original firmware. I don’t believe it’s typically an explicit goal of decompilation to be that high-fidelity (i.e. to allow recompilation). It’s probably possible to hack the source to compile to byte-for-byte the same firmware if desired, with a little patience.
Other than that, you’d be right :-)
poincaredisk
It won't work. Source code from native code decompilers is:
* not designed to be compiled again. It most likely won't compile without serious manual fixing. For example, decompilers often insert "pseudo-functions" to denote that something not easily representable in C is happening. For instance, CONCAT(var1, var2) may mean that var1 and var2 are used together as a single variable obtained by concatenating their bits (in practice: AX is used when AH and AL were already determined to be variables). Similar intrinsics exist for carry bits, jumps to arbitrary pointers, etc. This sometimes means that type inference went wrong somewhere, which brings us to...
* not perfect, and not designed to be perfect. A decompiler has no idea whether a variable on the stack is a qword or an 8-byte array; it can only guess from local usage. In many cases an array will be incorrectly decompiled as a single variable, or a pointer parameter as an integer. This is not a huge problem when reversing, but catastrophic for recompilation. Automated struct identification is even harder, often almost impossible when you take unions into account. As a reverse engineer you are supposed to fix that interactively during analysis.
* missing global data definitions. Decompilers in general don't decompile them; you interpret memory using a separate view. And for a good reason: what may look to the decompiler like three consecutive independent variables may actually be an (unrecognized) structure that must be kept in the exact same layout. Defining them as three independent C variables would almost surely not work.
* missing layout information. For firmware in particular, the binary may be required to follow a specific layout or have other unusual characteristics (like volatile memory regions). No decompiler will give you a correct linker script to use for recompilation.
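To make the first two points concrete, here is a rough Python sketch (not output from any real decompiler). The `concat` helper is a hypothetical model of a CONCAT-style pseudo-function, assuming the first operand supplies the high bits; the byte values and the name `stack_slot` are made up for illustration.

```python
import struct


def concat(hi: int, lo: int, lo_bits: int) -> int:
    """Hypothetical model of a CONCAT pseudo-function.

    Assumption: `hi` fills the upper bits and `lo` the lower `lo_bits` bits.
    """
    return (hi << lo_bits) | (lo & ((1 << lo_bits) - 1))


# AH and AL were tracked as two separate 8-bit variables,
# then the code used them together as the 16-bit register AX.
ah, al = 0x12, 0x34
ax = concat(ah, al, 8)

# Eight raw bytes from a stack slot (made-up values).
stack_slot = bytes([1, 2, 3, 4, 5, 6, 7, 8])

# Reading 1: a single little-endian qword.
as_qword = struct.unpack("<Q", stack_slot)[0]

# Reading 2: an array of eight independent bytes.
as_array = list(struct.unpack("<8B", stack_slot))

print(hex(ax))
print(hex(as_qword))
print(as_array)
```

Both readings of `stack_slot` are consistent with the same bytes, so the decompiler has to pick one from how the code touches them. A wrong pick is harmless while reading the output but changes what a compiler would emit.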
These problems are just off the top of my head. Worth noting that an up-and-coming reversing tool called rev.ng actually has working recompilation from C as one of its planned features, so we'll see what they come up with (and if they succeed).
When hacking a firmware like this, why can't you just rebuild the whole binary after modifying the decompiled source instead of patching a specific section of it?