
We'd be better off with 9-bit bytes

115 comments · August 6, 2025

duskwuff

Non-power-of-2 sizes are awkward from a hardware perspective. A lot of designs, e.g. optimized multipliers, depend on the operands being divisible into halves; that doesn't work with units of 9 bits. It's also nice to be able to describe a bit position using a fixed number of bits (e.g. 0-7 in 3 bits, 0-31 in 5 bits, 0-63 in 6 bits), for example to represent a shift amount or to select a bit from a byte; this also falls apart with 9, where you'd have to use four bits and have a bunch of invalid values.
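To make the bit-selection point concrete, a minimal Python sketch (purely illustrative, not tied to any particular hardware):

    # With an 8-bit byte, a bit index is exactly 3 bits: values 0-7, all valid.
    def select_bit_8(byte, idx3):
        assert 0 <= idx3 <= 7            # every encodable index is meaningful
        return (byte >> idx3) & 1

    # With a 9-bit byte, the index needs 4 bits (0-15), but 9-15 are invalid
    # and the hardware has to trap or define away those encodings.
    def select_bit_9(byte, idx4):
        if not 0 <= idx4 <= 8:
            raise ValueError("7 of the 16 encodable positions are unused")
        return (byte >> idx4) & 1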

willis936

Plato argued that 7! was the ideal number of citizens in a city because it was a highly factorable number. Being able to cut numbers up is a time-tested favorite. That's why there are 360 degrees.

gxs

Or 60 minutes in an hour

1, 2, 3, 4, 5, 6, 10, 12, 15, 20, 30, and 60

AnotherGoodName

Factorio logic applies as always - powers of 2 for trains, belts, etc. make evenly splitting resources trivial.

phkahler

Good points! I was going to say I think 12 bits would have been a nice choice, but yeah optimizing for circuits is kind of important.

ForOldHack

Brilliant, so 36 bits would be three bytes.

"DEC's 36-bit computers were primarily the PDP-6 and PDP-10 families, including the DECSYSTEM-10 and DECSYSTEM-20. These machines were known for their use in university settings and for pioneering work in time-sharing operating systems. The PDP-10, in particular, was a popular choice for research and development, especially in the field of artificial intelligence. "

"Computers with 36-bit words included the MIT Lincoln Laboratory TX-2, the IBM 701/704/709/7090/7094, the UNIVAC 1103/1103A/1105 and 1100/2200 series, the General Electric GE-600/Honeywell 6000, the Digital Equipment Corporation PDP-6/PDP-10 (as used in the DECsystem-10/DECSYSTEM-20), and the Symbolics 3600 series.

Smaller machines like the PDP-1/PDP-9/PDP-15 used 18-bit words, so a double word was 36 bits."

Oh wait. It's already been done.

Taniwha

Not really - I worked on a DSP with 9-bit bytes in the 90's, focused on MPEG decode for DVDs (new at the time), largely because memory was still very expensive and MPEG2 needed 9-bit frame difference calculations (most people do this as 16 bits these days, but back then, as I said, memory was expensive and you could buy 9-bit parity RAM chips).

It had 512 72-bit registers and was very SIMD/VLIW, and was probably the only machine ever with 81-bit instructions.

falcor84

We just need 3 valued electronics

skissane

The Soviets had ternary computers: https://en.wikipedia.org/wiki/Setun

Then they decided to abandon their indigenous technology in favour of copying Western designs

pif

It already exists: True, False and FileNotFound.

If you don't believe me, just ask Paula Bean.


percentcer

on, off, and the other thing

tyingq

hi-z is one choice. Though I don't know how well that does past a certain speed.

mzajc

null :)

mouse_

Times have changed. Gnome people will yell at you for mentioning things as innocuous as pixel measurements. You'd probably be crucified for suggesting there's a hardware-correct way of handling address space.

zamadatix

Because we have 8-bit bytes, we are familiar with the famous or obvious cases where multiples of 8 bits ran out, and those cases sound a lot better with 12.5% extra bits. What's harder to see in this kind of thought experiment is which cases of multiples of 9 bits running out would have become the famous, obvious ones. The article starts to think about some of these towards the end, but it's hard, as it's not immediately obvious how many others there might be (or, alternatively, why the total number of issues would be significantly different from what 8-bit bytes had). ChatGPT particularly isn't going to have a ton of training data about the problems with 9-bit multiples running out to hand-feed you.

It also works in the reverse direction. E.g. knowing networking headers don't even care about byte alignment for sub-fields (e.g. a VID is 12 bits because it's packed with a few other fields in 2 bytes), I wouldn't be surprised if IPv4 would have ended up with 3-byte addresses = 27 bits, instead of 4*9 = 36, since they were more worried about small packet overheads than matching specific word sizes in certain CPUs.
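For reference on the VID example: in an 802.1Q tag the 2-byte TCI field packs a 3-bit priority (PCP), a 1-bit DEI flag, and the 12-bit VID with no regard for byte alignment. A rough Python sketch of that packing:

    # Pack/unpack 802.1Q Tag Control Information: PCP (3 bits), DEI (1 bit),
    # VID (12 bits) squeezed into 2 bytes with no internal byte alignment.
    def pack_tci(pcp, dei, vid):
        assert 0 <= pcp < 8 and dei in (0, 1) and 0 <= vid < 4096
        return (pcp << 13) | (dei << 12) | vid

    def unpack_tci(tci):
        return (tci >> 13) & 0x7, (tci >> 12) & 0x1, tci & 0xFFF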

oasisbob

The IPv4 networking case is especially weird to think about because the early internet used classful addressing before CIDR.

Thinking about the number of bits in the address is only one of the design parameters. The partitioning between network masks and host space is another design decision. The decision to reserve class D and class E space yet another. More room for hosts is good. More networks in the routing table is not.

Okay, so if v4 addresses were composed of four 9-bit bytes instead of four 8-bit octets, how would the early classful networks have shaken out? It doesn't do a lot of good if a class C network is still defined by the last byte.

marcosdumay

Well, there should be half as many cases of multiples of 9 bits running out as of multiples of 8 bits.

I don't think this is enough of a reason, though.

foxglacier

If you're deciding between using 8 bits or 16 bits, you might pick 16 because 8 is too small. But making the same decision between 9 and 18 bits could lead to picking 9 because it's good enough at the time. So no I don't think there would be half as many cases. They'd be different cases.

mhandley

If we had 9-bit bytes and 36-bit words, then for the same hardware budget we'd have roughly 11% fewer bytes/words of memory. Despite the examples in the article, in most cases we'd very likely not make use of the extra range, as 8/32 bits is enough for most common cases. And so in all those cases where 8/32 is enough, the tradeoff isn't actually an advantage but a disadvantage - 9/36 gives you less memory for the same hardware, with the upper bits generally unused.
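Spelled out as a small calculation (assuming the "hardware budget" means a fixed number of memory bits):

    BITS = 2**33                     # some fixed silicon budget, in bits
    bytes_8 = BITS // 8              # 1,073,741,824 eight-bit bytes
    bytes_9 = BITS // 9              #   954,437,176 nine-bit bytes
    print(1 - bytes_9 / bytes_8)     # ~0.111 -> roughly 11% fewer bytes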

layer8

It would also make Base64 a bit simpler (pun intended), at the cost of a little more overhead (50% instead of 33%).
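The grouping arithmetic behind those numbers, as a quick sketch (each output character occupies one byte of its respective size):

    from math import lcm

    for byte_bits in (8, 9):
        group     = lcm(byte_bits, 6)     # bits per Base64 encoding group
        in_bytes  = group // byte_bits    # bytes consumed per group
        out_chars = group // 6            # characters produced per group
        print(f"{byte_bits}-bit bytes: {in_bytes} bytes -> {out_chars} chars, "
              f"{out_chars / in_bytes - 1:.0%} overhead")
    # 8-bit bytes: 3 bytes -> 4 chars, 33% overhead
    # 9-bit bytes: 2 bytes -> 3 chars, 50% overhead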

adrianmonk

> IPv4 would have had 36-bit addresses, about 64 billion total. That would still be enough right now, and even with continuing growth in India and Africa it would probably be enough for about a decade more. [ ... ] When exhaustion does set in, it would plausibly be at a time where there's not a lot of growth left in penetration, population, or devices, and mild market mechanisms instead of NATs would be the solution.

I think it's actually better to run out of IPv4 addresses before the world is covered!

The later-adopting countries that can't get IPv4 addresses will just start with IPv6 from the beginning. This gives IPv6 more momentum. In big, expensive transitions, momentum is incredibly helpful because it eliminates that "is this transition even really happening?" collective self-doubt feeling. Individual members of the herd feel like the herd as a whole is moving, so they ought to move too.

It also means that funds available for initial deployment get spent on IPv6 infrastructure, not IPv4. If you try to transition after deployment, you've got a system that mostly works already and you need to cough up more money to change it. That's a hard sell in a lot of cases.

elcritch

And nothing like FOMO of developing markets not being able to access a product to drive VPs and CEOs to care about ensuring IPv6 support works with their products.

PaulHoule

I thought the PDP 10 had 6-bit bytes, or at least 6-bit characters

https://en.wikipedia.org/wiki/Six-bit_character_code#DEC_SIX...

Notably the PDP 8 had 12 bit words (2x6) and the PDP 10 had 36 bit words (6x6)

Notably the PDP 10 had addressing modes where it could address a run of bits inside a word, so it was adaptable to working with data from other systems. I've got some notes on a fantasy computer that has 48-bit words (fits inside a Javascript double!) and a mechanism like the PDP 10's where you can write "deep pointers" that have a bit offset and length and can even hang into the next word; with the length set to zero bits this could address UTF-8 character sequences. Think of a world where something like the PDP 10 inspired microcomputers, was used by people who used CJK characters, and had a video system that would make the NeoGeo blush. Crazy, I know.
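A purely illustrative Python sketch of what loading through such a "deep pointer" might look like, using the 48-bit words from the comment (the field layout and names are invented for the example):

    WORD_BITS = 48   # the comment's fantasy word size

    def load_deep(memory, word_addr, bit_offset, length):
        # Read `length` bits starting `bit_offset` bits into memory[word_addr],
        # allowing the field to hang over into the next word.
        window = (memory[word_addr] << WORD_BITS) | memory[word_addr + 1]
        shift = 2 * WORD_BITS - bit_offset - length
        return (window >> shift) & ((1 << length) - 1)

    mem = [0b1011 << 44, 0]
    print(bin(load_deep(mem, 0, 0, 4)))   # 0b1011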

jacquesm

You are correct. The Sperry-Univac 1100 series though did have 36 bit words and 9 bit bytes.

xdennis

This is what happens when you write articles with AI (the article specifically mentions ChatGPT).

The article says:

> A number of 70s computing systems had nine-bit bytes, most prominently the PDP-10

This is false. If you ask ChatGPT "Was the PDP-10 a 9 bit computer?" it says "Yes, the PDP-10 used a 36-bit word size, and it treated characters as 9-bit bytes."

But if you ask any other LLM or look it up on Wikipedia, you see that:

> Some aspects of the instruction set are unusual, most notably the byte instructions, which operate on bit fields of any size from 1 to 36 bits inclusive, according to the general definition of a byte as a contiguous sequence of a fixed number of bits.

-- https://en.wikipedia.org/wiki/PDP-10

So the PDP-10 didn't have 9-bit bytes, but it could support them. Characters were typically 6 bits, but 7-bit and 9-bit characters were also sometimes used.

vincent-manis

Actually, the PDP-10 didn't have any byte size at all, it was a word-addressed machine. (An early attempt to implement C on this machine came a cropper because of this.) It did have a Load Byte and a Store Byte instruction, which allowed you to select the byte size. Common formats were Sixbit (self-explanatory), ASCII (5 7-bit bytes and an unused bit), and (more rarely, I think), 9-bit bytes.
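The three packings mentioned, spelled out for a 36-bit word (a small sketch of the layouts, not the actual DEC byte instructions):

    WORD = 36

    def pack(chars, bits_per_char):
        # Left-justify fixed-width characters into one 36-bit word.
        word, used = 0, 0
        for c in chars:
            assert 0 <= c < (1 << bits_per_char)
            word = (word << bits_per_char) | c
            used += bits_per_char
        return word << (WORD - used)      # any leftover low bits stay unused

    sixbit = pack([1, 2, 3, 4, 5, 6], 6)    # 6 x 6-bit characters, no waste
    ascii7 = pack([65, 66, 67, 68, 69], 7)  # 5 x 7-bit ASCII + 1 spare bit
    nine   = pack([300, 301, 302, 303], 9)  # 4 x 9-bit bytes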

My first machines were the IBM 7044 (36-bit word) and the PDP-8 (12-bit word), and I must admit to a certain nostalgia for that style of machine (as well as the fact that a 36-bit word gives you some extra floating-point precision), but as others have pointed out, there are good reasons for power-of-2 byte and word sizes.

qwerty2000

Not a very good argument. Yes, more bytes in situations where we’ve been constrained would have relieved the constraint… but exhaustion would still eventually come. Even IP addresses… we don’t need an IP per person… IPv6 will be IPs for every device… multiple even… including an interplanetary network.

bawolff

> But in a world with 9-bit bytes IPv4 would have had 36-bit addresses, about 64 billion total.

Or we would have had 27 bit addresses and ran into problems sooner.

bigstrat2003

That might've been better, actually. The author makes the mistake of "more time would've made this better", but we've had plenty of time to transition to IPv6. People simply don't because they are lazy and IPv4 works for them. More time wouldn't help that, any more than a procrastinating student benefits when the deadline for a paper gets extended.

But on the other hand, if we had run out sooner, perhaps IPv4 wouldn't be as entrenched and people would've been more willing to switch. Maybe not, of course, but it's at least a possibility.

dmitrygr

> simply don't because they are lazy and IPv4 works for them

Or because IPv6 was not a simple "add more bits to address" but a much larger in-places-unwanted change.

zamadatix

Most of the "unwanted" things in IPv6 aren't actually required by IPv6. Temporary addresses, most of the feature complexity in NDP, SLAAC, link-local addresses for anything but the underlying stuff that happens automatically, "no NAT, you must use PD", probably more I'm forgetting. Another large portion is things related to trying to be dual stack like concurrent resolutions/requests, various forms of tunneling, NAT64, and others.

They're almost always deployed though because people end up liking the ideas. They don't want to configure VRRP for gateway redundancy, they don't want a DHCP server for clients to be able to connect, they want to be able to use link-local addresses for certain application use cases, they want the random addresses for increased privacy, they want to dual stack for compatibility, etc. The people that don't care see all of this being deployed and think "oh damn, that's nuts", not realizing you can still just deploy it almost exactly the same as IPv4 with longer addresses if that's all you want.

codebje

If you "simply" added more bits to IPv4, you'd have a transition every bit (ahaha, ahem, sorry) as complex as the transition to IPv6 anyway, because IPv4+ would be a new protocol in exactly the same way as IPv6. A new DNS response record. Updates to routing protocols. New hardware. New software.

And no interoperability between the two without stateful network address translation.

bigstrat2003

I've run IPv6 on both corporate and home networks. Whether or not the additions were merited, they are not a formidable challenge for any reasonably-skilled admin. So no, I don't think that the reason you gave suffices as an excuse for why so many still refuse to deploy IPv6.

ay

The first transition was to IPv4, and it was reportedly (I wasn’t in the workforce yet :-) relatively easy…

https://www.internetsociety.org/blog/2016/09/final-report-on...

Some more interesting history reading here:

https://datatracker.ietf.org/doc/html/rfc33

smallstepforman

The elephant in the room nobody talks about is silicon cost (wires, gates, multiplexers, AND and OR gates, etc.). With a 4th lane, you may as well go straight to 16 bits to a byte.

pratyahava

This must be the real reason for using 8 bits. But then why did they make 9-bit machines instead of 16-bit?

AlotOfReading

The original meaning of byte was a variable number of bits to represent a character, joined into a larger word that reflected the machine's internal structure. The IBM STRETCH machines could change how many bits per character. This was originally only 1-6 bits [1] because they didn't see much need for 8-bit characters and it would have forced them to choose 64-bit words, when 60-bit words were faster and cheaper. A few months later they had a change of heart after considering how addressing interacted with memory paging [2] and added support for 8-bit bytes for futureproofing and 64-bit words, which became dominant with the 360.

[1] https://web.archive.org/web/20170404160423/http://archive.co...

[2] https://web.archive.org/web/20170404161611/http://archive.co...

NelsonMinar

This is ignoring the natural fact that we have 8 bit bytes because programmers have 8 fingers.

classichasclass

No, we still have 10. Real programmers think in octal. ;)

mkl

Most have 10. That's the reason we use base 10 for numbers, even though 12 would make a lot of things easier: https://en.wikipedia.org/wiki/Duodecimal

alserio

ISO reserves programmers thumbs to LGTM on pull requests

Keyframe

Yeah, but hear me out - 10-bit bytes!

pdpi

One of the nice features of 8 bit bytes is being able to break them into two hex nibbles. 9 bits breaks that, though you could do three octal digits instead I suppose.

10 bit bytes would give us 5-bit nibbles. That would be 0-9a-v digits, which seems a bit extreme.
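What those 0-9a-v digits would look like in practice, as a tiny sketch (plain base 32 here, not the Crockford variant mentioned downthread):

    DIGITS = "0123456789abcdefghijklmnopqrstuv"   # 32 symbols: 0-9 then a-v

    def show_10bit(byte):
        # Render a 10-bit byte as two 5-bit "nibbles".
        assert 0 <= byte < 1024
        return DIGITS[byte >> 5] + DIGITS[byte & 0x1F]

    print(show_10bit(1023))   # 'vv'
    print(show_10bit(42))     # '1a'  (42 = 1*32 + 10)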

tzs

10-bit has sort of been used. The General Instrument CP1600 family of microprocessors used 16-bit words but all of the instruction opcodes only used 10 bits, with the remaining 6 bits reserved for future use.

GI made 10-bit ROMs so that you wouldn't waste 37.5% of your ROM space storing those 6 reserved bits for every opcode. Storing your instructions in 10-bit ROM instead of 16-bit ROM meant that if you needed to store 16-bit data in your ROM you would have to store it in two parts. They had a special instruction that would handle that.

The Mattel Intellivision used a CP1610 and used the 10-bit ROM.

The term Intellivision programmers used for a 10-bit quantity was "decle". Half a decle was a "nickel".
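One illustrative way a 16-bit constant could be split across two 10-bit ROM entries and recombined (just a sketch of the idea, not the CP1610's actual encoding or its special instruction):

    def split_to_decles(value):
        # Store a 16-bit value as two ROM entries, 8 payload bits in each
        # 10-bit decle (the top 2 bits of each decle go unused here).
        assert 0 <= value < (1 << 16)
        return value >> 8, value & 0xFF

    def join_decles(hi, lo):
        return (hi << 8) | lo

    hi, lo = split_to_decles(0xBEEF)
    assert join_decles(hi, lo) == 0xBEEF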

jacquesm

That's a very interesting bit of lore. I knew those were peculiar CPUs, but I never knew about these details, thank you!

int_19h

Clearly it should be 12 bits, that way you could use either 3 hex digits or 4 octal ones. ~

monocasa

Alternate world where the pdp-8 evolved into our modern processors.

pratyahava

Crockford Base32 would be great. It is 0–9, A–Z minus I, L, O, U.

pdpi

The moment you feel the need to skip letters due to propensity for errors should also be the moment you realise you're doing something wrong, though. It's kind of fine if you want a case insensitive encoding scheme, but it's kind of nasty for human-first purposes (e.g. in source code).

phpnode

why stop there? 16-bit bytes would be so much cleaner

titzer

10 bit bytes would be awesome! Think of 20 bit microcontrollers and 40 bit workstations. 40 bits makes 5 byte words, that'd be rad. Also, CPUs could support "legacy" 32 bit integers and use a full 8 bits for tags, which are useful for implementing dynamic languages.

iosjunkie

No! No, no, not 10! He said 9. Nobody's comin' up with 10. Who processing with 10 bits? What’s the extra bit for? You’re just wastin’ electrons.

Waterluvian

Uh oh. Looks like humanity has been bitten by the bit byte bug.

pratyahava

deleted

relevant_stats

I really don't get why some people like to pollute conversations with LLM answers. Particularly when they are as dumb as your example.

What's the point?

svachalek

Same, we all have access to the LLM too, but I go to forums for human thoughts.

pratyahava

umm, i guess most of the article is made by llm, so i did not see it as a sin, but for other cases i agree, copy-pasting from llm is crap

monocasa

Ohh, and then we could write the digits in octal.

Interestingly, the N64 internally had 9-bit bytes; it's just that accesses from the CPU ignored one of the bits. This wasn't a parity bit, but a true extra data bit that was used by the GPU.

ethan_smith

The N64's Reality Display Processor actually used that 9th bit as a coverage mask for antialiasing, allowing per-pixel alpha blending without additional memory lookups.

monocasa

As well as extra bits in the Z buffer to give it a 15.3 fixed point format.