That XOR Trick (2020)

praptak

Fun fact: the xor swap fails when the variables are aliases. This was the trick used in one of the underhanded code competitions.

Basically xor swapping a[i] with a[j] triggered the evil logic when i was equal to j.

akovaski

The partitioning algorithm to find two missing/duplicate numbers is clever, I wouldn't have thought of that. It should also work if you have a list with 1 missing and 1 duplicate, yeah? You'd probably have to do an extra step to actually find out which number is missing and which is a duplicate after you find the two numbers.

> If more than two elements are missing (or duplicated), then analyzing the individual bits fails because there are several combinations possible for both 0 and 1 as results. The problem then seems to require more complex solutions, which are not based on XOR anymore.

If you consider XOR to be a little bit more general, I think you can still use something like the partitioning algorithm. That is to say, considering XOR on a bit level behaves like XOR_bit(a,b)=a+b%2, you might consider a generalized XOR_bit(a,b,k)=a+b%k. With this I think you can decide partitions with up to k missing numbers, but I'm too tired to verify/implement this right now.

analog31

The first thing that occurred to me is that if a number is missing from a list, the sum of that list will fall short. But I like XOR's.

anitil

It really tickles my brain in a lovely way that it avoids all overflow risk as well

repiret

There is no overflow risk. The trick works on any Abelian group. N-bit values form an Albanian group with xor where 0 is the identity and every element is its own inverse. But N-bit values also form an Abelian group under addition with overflow, where 0 is the identity and 2s-compliment is the inverse.

If you’re working on an architecture where a single multiplication and a bit shift is cheaper than N xor’s, and where xor, add, and sub are all the same cost, then you can get a performance win by computing the sum as N(N+1)/2; and you don’t need a blog post to understand why it works.

null

[deleted]

analog31

True, I hadn't thought of that. I'm spoiled by Python. ;-)

mzs

I like the 'store prev ^ next' trick for lists that can be walked from the front or from the back.

mrbluecoat

PTSD for me on this topic due to a week wasted cleaning up PHP malware using XOR for obfuscation and encryption: https://www.godaddy.com/resources/news/php-malware-and-xor-e...

hsfzxjy

To derive "The XOR trick" I think both *associativity* and communitativity are needed.

That is, one should also prove a ^ (b ^ c) = (a ^ b) ^ c. Instinctive, but non-trivial.

burnt-resistor

In ye olden days, bit manip operations were faster than algebraic operations.

And sometimes even faster than a load immediate, hence XOR AX, AX instead of MOV AX, 0.

GuB-42

"xor ax, ax" is still in use today. The main advantage is that it is shorter, just 2 bytes instead of 3 for the immediate, the difference is bigger in 32 and 64 bit mode as you have to have all these zeroes in the instruction.

Shorter usually mean faster, even if the instruction itself isn't faster.

sparkie

In long mode, compilers will typically emit `xor eax, eax`, as it only needs 2 bytes: The opcode and modrm byte. `xor ax, ax` takes 3 bytes due to the operand size override prefix (0x66), and `xor rax, rax` takes 3 bytes due to the REX.W prefix. `xor eax, eax` will still clear the full 64-bit register.

Shorter basically means you can fit more in instruction cache, which should in theory improve performance marginally.

tyfighter

Modern x86 implementations don't even do the XOR. It just renames the register to "zero".

burnt-resistor

Barely. x86 is fading. Arm doesn't do this in GCC or Clang.

> Shorter usually means faster

It depends, so spouting generalities doesn't mean anything. Instruction cache line filling vs. cycle reduction vs. reservation station ordering is typically a compiler constraints optimization problem(s).

userbinator

Arm doesn't do this in GCC or Clang.

Because Arm64 has a zero register, and Arm32 has small immediates, and all instructions are uniformly long.

ameliaquining

Ah, my least favorite technical interview question. (I've been asked it, but only after I first read about it online.)

phendrenad2

Indeed, it kind of feels like asking if someone knows what the number 5318008 means.

XeO3

Apart from these applications of XOR, a favourite one is using Bitwise AND to find Even/Odd numbers.

nullc

Generalizing an 'xor accumulator' support set difference of more than one element is interesting: https://github.com/bitcoin-core/minisketch

st0le

Another fun trick I've discovered.

`XOR[0...n] = 0 ^ 1 .... ^ n = [n, 1, n + 1, 0][n % 4]`

nullc

Tables yuck :P, maybe

XOR[0...x] = (x&1^(x&2)>>1)+x*(~x&1)

tialaramex

Right, or in summary, no you don't need to all that extra work up front.

makeset

Fun fact: you can show that there is another binary operator that performs the same triple assignment swap.

HN

That XOR Trick (2020)

That XOR Trick (2020)