Inserting a 0 bit in the middle of a value
18 comments
·October 25, 2024pavlov
For smaller integers, SIMD bit permute instructions can be used to do this kind of thing rapidly and with arbitrary reordering of the bits.
explodingwaffle
The more and more bit-hacks I see, the more convinced I am that they should be handled by a cleverer compiler. Some intrinsic(s) where you say “do x, y, to these bits” and it just figures out the optimum way of doing it for whatever platform you compile for.
jsheard
This tool is handy if you want a specific bit shuffle:
https://programming.sirrida.de/calcperm.php
It doesn't always find the optimal solution but it usually gets close.
PhilipRoman
That's a really cool side. Sad that it doesn't allow duplicate indices. I tried implementing the same thing but with support for duplicates and it was definitely a humbling exercise. I'm curious if there is any formal mathematical theory for this (shifted-off bits would probably make it less elegant than pure rotation)
explodingwaffle
that page + the book Hacker's Delight are a big part of the reason I am so convinced of this :)
If compilers/languages/standard libraries provided these bit permutations, and it was just something ~everyone had learned, it would be a lot easier to work with bits without needing to come up with the bitwise ops (or use that generator). In addition it would probably make better use of the hardware: sure, people like to pretend that we’re still programming C for PDP11, but modern hardware supports more operations than C has operators for (RISC-V B extension and co have the right idea <3)
Modern compilers are probably pretty good? but I doubt they are perfect at turning code like that in OP into the best instructions. It is probably a bit late for C/C++ though. maaaybe possible to get it into LLVM and Rust.
sumtechguy
I always wanted to see stuff like this website put into the compiler. https://graphics.stanford.edu/~seander/bithacks.html
The very tough problem is recognizing to emit the correct code.
sixthDot
Well it is already the case, for example things like ROR or ROL are generated by optimizers[1] from certain patterns of bitops.
jsheard
That's more like the opposite, OP wants to tell the compiler what transform to do and have it figure out a solution, but in your example you give the compiler a solution and it works backwards to figure out the intent and rewrite it to a completely different solution.
Most languages do have intrinsics for ROL/ROR at least, which you should generally use instead of relying on optimizer magic. I've certainly run into cases where those magic patterns don't get optimized (looking at you MSVC) but intrinsics always work.
badmintonbaseba
Nice bit twiddling hack! There are also the PDEP and PEXT instructions on x86 that could be used for this purpose, but it's possibly not worth it here.
Retr0id
I wonder if the compiler is smart enough to emit those instructions with the right optimization flags
tubs
On arm you can use https://developer.arm.com/documentation/dui0489/i/arm-and-th...
which is pretty convenient for this use-case.
dan-robertson
Am I missing something or does that instruction only clear bits in the field and not shift the adjacent bits to contract the field to be empty?
fmbb
> there’s only ever between 1 and 3 anchor bits
How often are these texturers transfered and how small are they?
Doing all this to avoid sending at most three bits for a texture file sounds like a colossal waste of human life and of money.
elpocko
Not per file, per block of 16 pixels. They are transferred quite often - BCn is a texture compression scheme for realtime use in GPUs. Those textures are stored and processed in compressed form.
null
null
AXbsk14228
[flagged]
But is easily generalized to inserting k bits: