
I believe 6502 instruction set is a good first assembly language

kazinator

I cannot agree that the simplicity of the 6502 wins over the 68000.

The 68000 has more registers and wider data types, but the registers are all uniformly the same. It's really just two registers, A and D, copied a bunch of times into D0 to D7 and A0 to A7. Whatever you can do with D0 can be done with D3 and so on. Some A's are dedicated, like A7 for the stack pointer.

Simplicity of structure has to be balanced with simplicity of programming.

It's not that easy to program the 6502, because in moderately complex programs, you're constantly running into its limitations.

The best way to learn how to hack around the limitations of a tiny machine is to completely ignore them and become a seasoned software engineer. A seasoned engineer will read the instruction set manual in one sitting, and other pertinent documents, and the clever hacks will start brewing in their head.

You don't have to start with tiny systems to get this skill. When you're new, you don't have the maturity for that; get something that's a bit easier to program, with more addressing modes, easier handling of larger arrays, more registers, and wider integers.

Keyframe

I come from a similar but different angle. I've been at 6502, 68k, Z80, and some x86/64 over the past few decades (mostly demos). I also wonder if this article/post was written by AI - he argues for the 6502 that it has 6 registers, and thus it's simple. ANYONE who has ever touched the 6502 for more than a week will know you're pretty much dealing with three, and that's where the nickname of a semaphore CPU comes from (three registers).

6502 is fun and neat, especially if you want to program those machines. If you want a more modern approach, it's not really all that suitable. Then do what we did in the 90's and start with MIPS if you want to follow the path of the glory days, or just start out with NEON. I'd even argue the Z80, with all its registers and complexities, is more similar to what you'll find today (far from it, but let's entertain the connection between past and present as OP does).

There's nothing inherently complicated about modern asm. Take FASM, start. It's not complicated, but it gets complex rather quickly, which was and always will be true for such a low-level approach. Then you start macroing and sooner or later you have your own poor man's C on top.

markus_zhang

I think one of the values the 6502 provides is that it has so many retro platforms. Apple ][, NES, Atari 2600, BBC, C64 -- they all use the 6502. If you want to do cool retro stuff that a lot of people today still enjoy, that's the best bet.

But then again the Z80 looks like a very good option too because it has the GB/GBC plus the ZX Spectrum. The same goes for the 68K -- Mac 68K, Sega Genesis and Neo Geo are popular hits.

When I was writing my reply to another question (https://news.ycombinator.com/item?id=42891200), I was thinking about a new software engineering education system which starts from a retro platform and goes up from there. Maybe all three can fit the bill. I'm very much against the "elegant/pure" way of teaching.

BTW totally agree 6502 essentially has 3 registers. You don't get to touch the others.

Keyframe

> If you want to do cool retro stuff that a lot of people today still enjoy, that's the best bet.

This shouldn't be discounted, since it has great pedagogical value in itself. On the other hand, personal opinion here: not many things are transferable to modern ISAs. Even back in the 90's we were pointed to MIPS instead. 68k is very nice to read and write, it almost feels like a higher-level language, but for ROI on newer stuff maybe just jump into NEON, x86_64, or, as someone said, RV directly. It's not _THAT_ hard. It gets complex as programs grow though.

vardump

6502 really has 262 registers: 256 zero page plus those standard ones mentioned. Or 260 for 6510, as zero page addresses 0 and 1 are for I/O.

Practically anything non-trivial uses those zero page addresses as extra registers.
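
For example, the classic idiom is keeping a 16-bit pointer in two zero-page bytes and dereferencing it with the (zp),Y indirect indexed mode. A rough sketch, with the zero-page location $fb and the label names chosen arbitrarily (the #< / #> low/high-byte operators vary a little between assemblers):

    ptr = $fb            ; two consecutive zero-page bytes used as a pointer "register"
        lda #<buffer     ; low byte of the buffer's address
        sta ptr
        lda #>buffer     ; high byte
        sta ptr+1
        ldy #0
        lda (ptr),y      ; load the byte the zero-page pointer points at

Only zero-page locations can serve as the base of (zp),Y, which is exactly why they end up being treated like extra registers.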

Keyframe

I get what you're saying but I wouldn't personally call zeropage bytes _registers_, even though they are physically connected to internal ones. I treat them as a special zone of memory that has special properties. There are many of these things and just like you would start with any of the old machines you'd first take a look at the memory map to see where's what and how to use it. Registers in a traditional sense would be those few mentioned of which you'd only ever really touch A, X, and Y.

richrichardsson

Definitely a bit of a stretch to include the program counter, status register and stack pointer as "registers".

Of course they all are technically registers, but they really can't be used for anything useful the way the index registers and accumulator can.

kazinator

One definition is that if it has to be saved and restored by an interrupt handler, it's a register. :)

musicale

> The best way to learn how to hack around the limitations of a tiny machine is to completely ignore them and become a seasoned software engineer

Learning to hack around the limitations of a tiny machine is a good step toward becoming a seasoned software (and possibly hardware) engineer!

The advantage of tiny systems is that you can understand them completely, from silicon to OS to software.

The 6809 is simpler than the 68K, and more powerful and orthogonal than the 6502, but there isn't the same software base for it. I think Motorola was onto something with the 6809 and 68K instruction sets and programmer-facing architecture. IIRC PDP-11/VAX and NS32K are similarly orthogonal.

DonHopkins

The 6809 is the Chrysler Cordoba of 8 bit microprocessors, the personal sized luxury processor with styling in timeless good taste, rich Corinthian leather, and an efficient sophisticated multiply instruction.

https://youtu.be/tfKHBB4vt4c?t=9

geoelectric

I learned assembly on a 6809 (TRS-80 CoCo) platform. It was only later that I appreciated how cool a CPU it really was.

It’s a shame that Tandy missed the boat on including coprocessors for game support in their computers, especially that one. If they’d just included decent audio and maybe something for sprite management it would’ve been highly competitive.

musicale

Apple II had primitive graphics and sound, but was incredibly successful.

Atari 800 featured powerful video processing with display lists and sprites, 4-channel audio, etc., but was much less successful.

As I understand it, Radio Shack did not encourage third-party software and support for its systems, not realizing that it was the key to success.

OS-9, originally a 6809 OS, seems to have survived for quite some time.

https://en.wikipedia.org/wiki/OS-9

hcfman

Totally! I once wrote an assembler for the 6809; it was a lovely architecture.

russdill

I love the 6809 and friends, but knowing the 6502 just seems so much more relevant, and it's easier to get real hardware as well.

groby_b

Came here to say that. If you must start learning with 8-bit, the 6809 is the processor of choice.

(Unless you want to really dive into "how can I work around limitations", and then you pick an F8 and pull your hair as much as you like)

zozbot234

6502 is a better "toy" asm than the Z80, but that's not saying much. It's not even obviously better than the AVR 8-bit insn set. As far as more modern platforms, I think there is a strong case for teaching RISC-V over something like MC68k. RISC-V can be very simple and elegant (in the base integer insn set) while still being very similar to other modern architectures like ARM, AArch64 and MIPS. It's also available in both 32- and 64-bit varieties, and the official documentation is very accessible.

MC68k just has tons of quirks which aren't really going to be relevant in the modern day and age. (About the only thing it has going for it is the huge variety of hardware platforms that happened to make use of it, many of which still have thriving retro communities around them. But that's more of a curiosity as opposed to something genuinely relevant.)

adrian_b

Some people believe that RISC-V is simple and elegant, other people believe that RISC-V is one of the worst and ugliest instruction-set architectures that have ever been conceived. Usually the first class of people consists of people with little experience in assembly programming and the second consists of those who had experience in using multiple ISAs for assembly programming.

Using RISC-V for executing programs written in high-level programming languages is fine. It should be noted that the research papers that coined the name RISC listed properties defining what RISC means, and one of them was that RISC ISAs were intended to be programmed exclusively in high-level languages and never in assembly language. From this point of view, RISC-V has certainly followed the original definition of the term RISC, by making programming in assembly language difficult and error prone.

On the other hand, learning assembly programming with RISC-V is a great handicap, because it does not expose the programmer to some of the most fundamental features of programming in an assembly language: RISC-V lacks even many features that existed in the Zilog Z80 or in vacuum-tube computers from 70 years ago, like integer overflow detection and indexed addressing.

Someone who has learned only RISC-V has an extremely incomplete picture of what a normal instruction set architecture looks like. When faced with almost any of the several thousand different ISAs that have been used during the last three quarters of a century, a RISC-V assembly programmer will be puzzled.

6502 is also too quirky and unusual. I agree that among the old ISAs, the DEC PDP-11 or Motorola MC68000 are better ISAs for someone learning to program in an assembly language from scratch. Writing assembly programs for either of these two is much simpler than writing good assembly programs for RISC-V, where things as simple as doing a correct addition are quite difficult (by a correct operation I mean one where any errors are handled appropriately).

layer8

This is the first time I’ve heard of this, but RISC-V not providing access to carry and overflow status seems insane. E.g. for BigNum implementations and constant-time (branchless) cryptography.

nerdralph

I'm surprised RISC-V doesn't expose carry. Thanks for pointing this out. For embedded programming with AVR and ARM I often prefer ASM over C because I have access to the carry flag and I don't have to worry about C/C++ overflow undefined behavior.

I also agree the 6502 is not a simple ISA. However after learning 6502 machine code, MIPS, AVR, and ARMv6-M were all easy to learn.

brucehoult

Lol. I've been programming assembly language since 1980, on at least (that I can remember) 6502, 6800, 6809, 680x0, z80, 8086, PDP-11, VAX, Z8000, MIPS, SPARC, PA-RISC, PowerPC, Arm32, Arm64, AVR, PIC, MSP430, SuperH, and some proprietary ISAs on custom chips.

RISC-V is simply the best ISA I've ever used. It's got everything you need, without unnecessary complexity.

> RISC ISAs were intended to be programmed exclusively in high-level languages and never in assembly language.

That's a misunderstanding. RISC was designed to include only the instructions that high level language compilers found useful. There was never any intention to make assembly language programming difficult.

Some early RISC ISAs did make some of the housekeeping difficult for assembly language programmers, for example by having branch delay slots, or no hardware interlocks between loads or long-running instructions such as multiply or divide and the instructions that used their results. So if you counted wrong and tried to access the result register too soon, you probably silently got the previous value.

That all went completely out the window as soon as there was a second implementation of the same ISA with a different number of pipeline stages, or more or less latency to cache, or a faster or slower divide instruction. And it was just completely untenable as soon as you got CPUs executing 2 or 3 instructions in each clock cycle instead of 1. The compiler could not calculate when it was safe to use a result because it didn't know what CPU version the code would be running on.

Modern RISC -- anything designed since 1990 -- is completely fine to program in assembly language.

bjourne

Great flamebait ;) What makes RISC-V suitable for teaching is that the ISA is trivial to map to hardware. This is because it is so regular. Students with zero experience can design simple RISC-V CPUs in their first asm course. More experienced students can design CPUs with pipelining, branch prediction, out-of-order execution, etc. Old ISAs from the 1980s are way more difficult.

wk_end

The 68K is still a tiny bit awkward, with its 24-bit bus and alignment restrictions and middling support for indexing into arrays (no scale factor). The 68020 is about as close to C as an ISA can get - it’s extraordinarily pleasant.

adrian_b

While I agree that 68020 felt like a great improvement over 68000 and 68010, scaled indexed addressing is an unnecessary feature in any ISA that has good support for post-incremented and pre-decremented addressing.

Scaled indexed addressing is useful in 80386 and successors only because they lack general support for addressing modes with register update, and also because they have INC/DEC instructions that are one-byte shorter than ADD/SUB, so it is preferable to add/subtract 1 to an index register, instead of adding/subtracting the operand size.

Scaled indexed addressing allows writing, with a minimum number of instructions, some loops where multiple arrays are accessed, even when those arrays have elements with different sizes. When all array elements have the same size, non-scaled indexed addressing is sufficient (because you increment the index register by the common operand size, not by 1).

However there are many loops where scaled index addressing is not enough for executing them with a minimum number of instructions, while using post-incremented/pre-decremented addressing still allows the minimum number of instructions (e.g. for arrays of structures or for multi-dimensional arrays).

Unfortunately not even MC68020 has complete support for auto-incremented addressing modes, because besides auto-incrementing with the operand size there are cases when one needs auto-incrementing with the increment in a register, like provided by CDC 6600, IBM 801, ARM, HP PA-RISC, IBM POWER and their successors (i.e. when the increment is an array stride that is unknown at compile-time).

On x86-64, using scaled indexed addressing is a necessity for efficient programs. On the other hand on ISAs like ARM, which have both scaled indexed addressing and auto-indexed addressing, it is possible to never use scaled indexed addressing without losing anything, so in such ISAs scaled indexed addressing is superfluous.

kazinator

The 24-bit bus means that you can use the top bits of a pointer as a tag. In a small system we don't need that much memory, so this can actually be a great advantage. We are rediscovering the value of tag bits in 64-bit systems.

giamma

While I learned programming with the 6510, I agree that the 68000 instruction set was much nicer and easier to read and learn. I would also choose the 68000.

This said, the 65XX on the early Commodore computers was extremely rewarding to use because there was no memory protection and you could write code altering video memory, mess with sprites, fonts, borders, interrupts, write self-modifying code etc etc. 68000 assembly on the Amiga was safer and more controlled.

marssaxman

68000 assembly on Macintosh was the wild west, nothing protected at all, and it was so much fun.

renewedrebecca

When I was in college, they taught the Computer Architecture course using the 68000. Coded GUI stuff on the Mac 128k with assembler, and it was surprisingly easy, especially compared to doing anything with the 8086.

eschneider

6502 and 68000 are both perfectly good assembly languages, but I think the best remains PDP-11 assembly. It scans so nicely in octal. :)

cbm-vic-20

For anyone interested, learn the PDP-11 instruction set from the first model, the PDP-11/20, before DEC added protection modes, memory management hardware, etc. A summary of the instruction set is in Appendix A, and it's worth taking a look at chapters 3 and 4 of the PDP-11(/20) Handbook to get more detail on each instruction and the addressing modes.

https://bitsavers.org/pdf/dec/pdp11/1120/PDP-11_Handbook_Sec...

sippndipp

I've always loved the Z80 and felt home with the 68k.

WorldMaker

My school started with a 6502 lab and then followed it with a 68k lab. That still seems like the right order to me all these years later. Starting with the smaller, more cramped chip and learning its limitations makes it all the more interesting when you can "spread your wings" into the larger, more capable one.

You scale up the complexity of the programs you need to build with the complexity of the chips. In the 6502 lab it was a lot about learning the basics of how things like an MMU work and building basic ones out of TTL logic gates. In the 68k lab you take an MMU for granted and do more ambitious things with that. There are useful skills both in knowing what the low-level hardware is like, as intimately as a 6502 can get, and in working where it is easier to program because you have more modern advantages and fewer limitations.

The other thing about that order was that breadboarding the 6502 hardware was a lot more complex, but it made up for it in that writing and debugging 6502 assembly was a lot easier. There are a ton of useful software emulators for the 6502 and you can debug your code easily before ever testing it on lab hardware. At one point in that lab I even just wrote my own mini-6502 emulator specific to our (intended) breadboard hardware design. On the other side, there are a lot fewer software emulators for the 68k and the debug cycle was a lot rougher. The 68k breadboard hardware was a bit more "off the shelf" so it was easier to find an existing emulator that matched our (intended) hardware design, but the emulator itself was buggier and more painful to use and ultimately testing on the real hardware was the only trustworthy way to deal with it. I also wasn't going to try to write my own 68k emulator. (Though some of the complexity there was the realities of hardware labs in that the hardware itself starts to pick up its own localized quirks from being run only on breadboards for years.)

crest

I would argue that the 6502 is a bad first ISA to learn if you want to learn assembly. You'll spend most of your time fighting the quirks of this clever but deeply flawed architecture. The idioms you learn to work around them don't translate to any better-designed architecture that wasn't constrained by the tools and budget available to MOS at the time.

If you want to learn a small yet powerful instruction set with a few quirks, go for ARM v6M. It's still in meaningful production (no, the Monster 6502 doesn't count) and has good platform support in the latest open source toolchains (debuggers, compilers, assemblers, linkers, etc.).

If you value openness of the architecture enough to deal with a less mature platform (as of early 2025), then pick a RISC-V MCU instead. If you can't decide, pick an RP2350 :-).

The ARMv6M instruction set is small, and no, loading constants doesn't require long-winded instruction sequences if you do it as documented (a PC-relative load instead of shifting in immediate data). You don't have to deal with self-modifying code and/or the zero page to index memory. Your registers are the same width as the address space. Yes, it's 32 bits, but that makes it simpler to learn, use and teach than all the 8-bit and most 16-bit instruction sets I've seen, because you don't have to work around too-narrow registers for common operations. To anyone who thinks this sounds boring: don't worry, ARMv6 still has enough quirks you can use for code golfing.

systems_glitch

Agree, I like the 6502 architecture but tend to steer newcomers in vintage computers away from it if they have no previous ASM experience.

I started on PIC16 assembly, and dabbled in a bunch of other architectures, but my favorite in terms of cleanness has been MIPS32.

Nitpick: you can still buy newly made 6502s, MCUs with 6502 cores, and peripheral chips from Western Design Center. They sell through e.g. Mouser:

https://www.mouser.com/c/?m=Western%20Design%20Center%20%28W...

tlb

I wrote a lot of 6502 assembly once, and you spend a lot of time dealing with the 8-bittedness of the architecture. Multiplying two 16 bit numbers is a whole blob of code. That doesn't seem useful for new programmers to struggle with.
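
For a sense of scale, even the 8-bit-by-8-bit case is already a shift-and-add loop, roughly like the sketch below (the zero-page names are made up, factor2 gets destroyed, and the 16x16 version is considerably longer):

        lda #0
        sta prod_lo
        sta prod_hi
        ldx #8           ; one pass per multiplier bit
    loop:
        lsr factor2      ; next multiplier bit into the carry
        bcc skip         ; bit clear: nothing to add
        clc
        lda prod_hi
        adc factor1      ; add the multiplicand into the high byte
        sta prod_hi
    skip:
        ror prod_hi      ; shift the 16-bit product right,
        ror prod_lo      ; pulling in the carry from the add
        dex
        bne loop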

The early ARM ISAs are good, as you say.

Lerc

It depends on what you are trying to teach I guess. If you want to impress upon students that everything is made of bytes, that could be useful.

I don't think the 6502 would be a good fit because zero-page doesn't really translate to any useful modern concept. Quite a lot of 6502 coding is Zero-page management.

I went with the 8-bit AVR as an instruction set for my silly fantasy console project. It has an in-browser editor and assembler to let people write 8-bit code. The AVR has the best 8-bit instruction set I have found. It's still not perfect (constants can only be loaded into some of the registers), but it was definitely built with the hindsight provided by its predecessors.

If you wanted to avoid the management of data types, I would suggest an instruction set with floating point registers. The same management of bytes into words and dwords, signed and unsigned etc. has to happen on a CPU without floating point support. It's an added complication, which you may or may not want to expose students to.

If the intent is to use Asm to teach from the point of view of "every instruction is a clearly defined action" I would use something with 32-bit ints and 32-bit floats.

If you wanted people to feel our pain, go with 6502, Z80, or PIC depending how sadistic you are.

6510

I believe you should be able to sort-of simulate higher level languages at the editor level. Writing c = a * b could just be a representation for the large blob of code or subroutine.*

What is particularly funny about your example is that 40 years later you still can't do multiplication or addition in JS. You have to install packages (that are multiple computers in size!) after you make up your mind which of the 20+ different modules (read: "dependencies") has the right kind of multiplication for you! I'm not bitter, it's objectively ridiculous :)

* It should actually be c = a × b, without the asterisk. We did actually have arithmetic reasonably standardized before IT de-standardized it.

astrobe_

I think this is true for any bus/register size. Various high-level language "bignum" libraries generalize to any size.

But it's true that with only 8 bits, the problem happens very soon. However, if one doesn't want to solve basic problems "the hard way", one probably won't be interested in assembly programming anyway.

wvenable

Multiplying (or even adding) two 16bit numbers is a good learning experience for assembler. I think it really depends on your goal. I learned x86 assembly way back when it was relevant but now I do 6502 assembly for fun. The simplicity and the limitations are what make it interesting.

russdill

ARM does have a few pain points for newcomers related to how immediates are embedded within instructions. It can be non-obvious why a certain immediate or offset can be used but not another.

Granted, this extends across a wide range of assembly languages, but it'd be nice to start with an instruction set without this issue.

jmull

> deeply flawed

I can't even guess what you might be referring to.

NobodyNada

The 6502 is designed to be a simple, minimal, low-cost CPU. From that perspective, it's a brilliant design, and it's fun to write small programs for.

Where it becomes "deeply flawed" is if you're trying to develop large, complex programs for it, mainly because there's no way to efficiently implement pointers or local variables -- you only have 3 registers, none of which are truly "general purpose", and none of which are wide enough to hold a pointer; and stack operations are severely limited. (This also means that writing a C compiler that targets 6502 and generates efficient code is almost impossible).

So, in idiomatic 6502 code, all variables are global; and if you need dynamically managed objects, you keep track of them using indices into statically-allocated arrays. This is difficult to scale as your programs get large and complex, because at some point you're going to waste an afternoon finding out that your "tmp7" variable in one routine is getting clobbered by another routine 5 levels deep down the call chain that also uses "tmp7" for some unrelated purpose.
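
A rough sketch of that idiom (all names invented): an "object" is just an index into a set of parallel, statically allocated arrays, and absolute,X addressing stands in for pointer dereferencing:

        ldx current_enemy    ; X holds the index (the "handle"), not a pointer
        lda enemy_x,x        ; absolute,X addressing into a parallel array
        clc
        adc enemy_dx,x
        sta enemy_x,x        ; move this enemy horizontally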

It was a perfect processor for its time and its target market, but it made heavy design compromises to achieve its goals, and Moore's law quickly made those compromises obsolete.

xp84

>Moore's law quickly made those compromises obsolete.

Your overall argument is perfect, but I'd question the use of "quickly" since people were selling essentially 6502-based stuff for how long, 20 years at least? From the Apple II in 1977 to the SNES which was released in 1990 and still getting games in 1995, so that's 18 years of that CPU being highly relevant and worth learning.

Of course, you're 100% right that, assuming money was no object and you could buy and use whatever hardware you wanted to, it wasn't long before you could escape those compromises since the x86 and the 68000 (and probably others I don't know) brought much better architectures to those who could afford them.

leptons

It's not "deeply flawed" at all, OP is being overly dramatic with that statement.

I've coded on a bunch of embedded 8-bit platforms over the decades, and 6502 is great. A, X, Y registers - it's really quite simple. It has various standard and useful addressing modes. It has pretty much the same status register that exist in modern 8-bit MCUs. There's nothing "deeply flawed" about it.

32-bit MCUs are probably a bit too complex for a beginner, 8-bit MCUs will teach a newcomer a lot about the basics of computing in an easy to learn way. It will teach them the significance of "a byte" and working with raw data that maybe you don't exactly get with 32-bit MCUs. There isn't that much to master with a 6502, it's pretty simple, but amazing things can still be done with it.

sususu

I kept thinking about Ben Eater's 65C02-on-a-breadboard series [1]: is there any way to replicate what he is doing in his videos with an ARM CPU?

[1] https://youtube.com/playlist?list=PLowKtXNTBypFbtuVMUVXNR0z1...

Lerc

It wouldn't surprise me if you could do this with a RP2350, connecting GPIOs to the same location as the 65C02 does on the breadboard and even running emulated 6502 code.

It's totally not the same thing of course. A whole lot of transistors and clock speed go to making that feat possible.

gustavopezzi

As someone who has been teaching assembly to undergrads for many years, I have a couple of things to say about this. First of all, I agree. The 6502 is great for beginners, but that is not just the merit of the 6502 language, and I want to explain why.

I have taught 68K, MIPS, ARM, x86, etc., and the overall good student feedback I got by teaching 6502 is mostly because of the surrounding context that comes with the CPU. The reason 6502 clicked better than more modern alternatives (MIPS, ARM, x86, etc.) was because we use it to program a real machine that is simple to understand (i.e. the Nintendo Entertainment System). Rudimentary memory-mapped IO, no operating system, no pipelined instructions, no delay slots, no network, no extra noise... it's just a simple box with a clock inside, a CPU, some memory addresses, some helper chips, IO mapped to memory addresses, and that's pretty much it!!! So, even though I agree that the 6502 is not the simplest instruction set out there, THIS simplicity of the system helped a lot.
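
To give a flavor of that memory-mapped IO, here is the usual NES idiom for poking one palette entry through the PPU ports (a sketch that ignores timing details like waiting for vblank):

        lda $2002        ; read PPUSTATUS to reset the PPU's address latch
        lda #$3f
        sta $2006        ; PPUADDR, high byte of the target address
        lda #$00
        sta $2006        ; PPUADDR, low byte: palette RAM starts at $3F00
        lda #$21
        sta $2007        ; PPUDATA: the write lands in video memory, not RAM

The CPU only ever sees plain loads and stores to addresses; the helper chips give those addresses their meaning.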

And about the limitations of the 6502 CPU, these limitations were also important for students to understand that these instructions have a reason to be the way they are. CPUs were designed and wired given the constraints of that time, and that reflects on how we programmed for them.

So, even though this was mostly empirical, I have to say picking the 6502 and the NES to teach beginners was successful. Once again, not really because it was the 6502, but because the 6502 forced us to go simple in terms of the system we were moving bits around in.

Once students played around with the 6502 and saw NES tiles moving on the screen, then it was super cool to evolve and show them how the 68000 did things differently, and then evolve more and show how MIPS came, show how pipelining works, how to take advantage of delay slots, and being able to compare the differences of RISC and CISC. It's super simple to evolve once the basics are there.

markus_zhang

Gustav, thanks for giving us Pikuma.

gustavopezzi

My pleasure! :)

julian55

I'm slightly surprised that no one has suggested PDP-11 assembler as a good starting point if you're not going to learn a current instruction set. Perhaps it's because it was the first one I learnt properly, but all the early microprocessors felt like a step backwards. I did spend a few years writing Z80 assembler, but I wouldn't recommend it nowadays as it's not a very orthogonal instruction set, and the 6502 doesn't have enough registers to give you a proper feel for writing assembler.

brucehoult

It's kind of hard to get hold of a PDP-11 these days. Even getting an OS, compiler etc is not that easy.

If you like the PDP-11 then you get the same qualities slightly restricted in the MSP430 and slightly enhanced in the 68000.

But, really, just forget all those relics and learn either RISC-V (the best answer) or else one of the half-dozen Arm variations. I'm partial to the ARM7TDMI myself, for sentimental reasons, having done a lot with it in the mid-2000s. The Thumb mode is probably slightly easier to learn than the original Arm mode, but neither is as satisfactory as RISC-V.

shakna

> Even getting an OS, compiler etc is not that easy.

There's a GCC fork [0], macro11 [1] (GCC and clang also both have macro11 backends), ack [2] and more.

Getting hold of a modern compiler is trivial.

> It's kind of hard to get hold of a PDP-11 these days.

The PiDP-11 [3] emulator that runs on a Pi, is fairly popular among the retro crowd. So sourcing something hardware wise that behaves that way is easily possible.

The Computer History Simulation Project [4] will give you easy access to simulating a PDP-11 on just about anything that you own.

But if you want the original hardware, then they're in the $400-500 range, in my area. Easy to source.

[0] https://github.com/JamesHagerman/gcc-pdp11-aout

[1] https://gitlab.com/Rhialto/macro11

[2] https://github.com/davidgiven/ack

[3] https://obsolescence.wixsite.com/obsolescence/pidp-11

[4] https://github.com/simh/simh

snovymgodym

You can find original PDP-11s in the $400-$500 range?

snovymgodym

I mean you can grab SIMH and trivially have a working PDP11 emulator on pretty much any system that has a C compiler.

Then just get 2.11 BSD (or V7 Unix if you're a minimalist/masochist), install it, and you're free to write/run/debug all the PDP11 assembly you want.

But I do question the value of doing this. I'm a big believer in the notion that you can't train skills by proxy, so if your goal is to become proficient in writing assembly for modern architectures, then you might as well do so by learning a modern architecture.

aap_

I agree. PDP-11 is so much more pleasant, it's not even close. One could make the argument that the addressing modes are conceptually more complicated than what you'd have on a RISC, but the 6502 addressing modes are probably actually harder to understand than the PDP-11's.

systems_glitch

Plus, learning PDP-11 ASM explains some of the idioms from C as they map directly onto the architecture! "Pointer to a pointer" is just a native addressing mode, for instance.

WalterBright

Yah, C's pre-increment and post-increment map right onto the -11 addressing modes. It's a brilliantly conceived minimal instruction set. Just a joy to code in.

I rewrote EMPIRE into PDP-11 assembler.

https://github.com/DigitalMars/Empire-for-PDP-11

WalterBright

The -11 instruction set is an engineering marvel. DEC had everything needed to utterly dominate the microcomputer business.

But DEC spurned that opportunity, while IBM took it over with the clumsy 8086 instruction set.

There's no purpose to learning the -11 anymore.

serviceberry

I really don't see how. When students are first exposed to computer programming, it might make sense to start with toy / compact languages that don't have any real-world use. But assembly is not the first language you're supposed to learn!

It's very utilitarian and most commonly just used for debugging and reverse engineering. So why would you waste time on the assembly language of a long-obsolete platform?

Plus, the best way to learn assembly is to experiment. Write some code in your favorite language and look at the intermediate assembler output, or peek under the hood with objdump or gdb. Try to make changes and see what happens. Yes, you can do that with an emulator of a vintage computer, but it's going to be harder. You need to learn the architecture of that computer, including all the hardware and ROM facilities the assembly is interacting with to do something as simple as putting text on the screen... and none of this is going to be even remotely applicable to Linux (or Windows) on x86-64.

Salgat

I struggled with C until I learned to program in hex and assembly on the 68HC11. Maybe I'm just a moron, but things like pointers seemed so abstract and obtuse to a complete beginner, until I learned how to do indirect addressing in assembly; then suddenly it was painfully obvious why C had pointers and how they worked. Before that I had mostly used Python, where it's far more abstracted away. People forget that many features like pointers exist to address hardware/performance limitations; the "why and what" of what you're actually doing inside the CPU is not immediately obvious to new devs, which limits your intuitive understanding.

gxd

That's why some people see the C language as "portable assembly". I think C is at its best when you want Assembly-like memory addressing flexibility but don't want to deal with instruction-set specific idiosyncrasies.

chungy

This is my experience as well, for a lot of features that seem like magic in high level languages. I could somewhat accept pointers even without understanding them, but object-oriented programming with its classes and all was such a mystery to me that I was scared to even try using it.

It's not until I learned assembly language that I understood pointers, and from there, I could implement a basic OOP system in C and finally understand what objects are all about. It only clicked when I learned how to do it from scratch.

tcbawo

I have a hypothesis that to move beyond being a consumer of technology to being a producer of technology, modern education in software development should build fundamentals up from first principles — especially with kids and young adults. It should be analogous to how we learn counting, arithmetic, and higher level maths. One of the best features of obsolete, constrained architectures was their simplicity. Recovery from something going wrong is quick and there is (generally) no permanent damage. All of this makes it much easier to understand what is happening at a lower level. Once you have a basic understanding, then you are ready for the next level of abstraction. I assume that openness and availability of IP is the biggest challenge to putting some kind of curriculum together. I would be highly interested in whether anyone has curated this into a cohesive educational approach, especially one that targets childhood development.

privatemonkey

Spending a couple of months tinkering with hardware and programming assembly will make you understand the basics of how a computer works in a totally different way than other high level languages will. I haven't programmed a single line of assembly after high school (back in the 80s...), but the fundamental understanding of how operations are executed and how registers work makes it so much easier to understand the whys, the ifs and buts of programming and optimization. And you'll certainly start to appreciate clean, effective code.

pjmlp

We managed alright as 10-year-old kids, as a second language following up those BASIC programming comic books.

For a small taste of the past,

https://www.atariarchives.org/

biofox

I wish there were books like this today, that could lead a kid from knowing virtually nothing to a graduate level understanding of computer architecture, digital storage, logic and set theory, graphics, mathematical modelling, networking, numerical methods, signal processing, electrical engineering, software design, ...

hn_acc1

Wow, so many memories there! Yeah, BASIC (TI-99), then Atari Basic, then Basic XL (along with reading a 6502 book), then GFA BASIC, then Megamax / Laser C, then Pascal and 68K and Ansi C at uni. RISC was part of the 4th year digital architecture/design course.

jcarrano

There are roughly two ways to learn programming: top down (start with abstract concept and go down to actual implementation, e.g. SICP) and bottom up (start with the concrete low-level code and let the abstractions naturally emerge).

I studied electronics, so naturally we began with assembly (Motorola HC11). By the end of the course everyone had independently developed their macros to do things like for-loops, so it was a natural progression from there to C. By the end of the C course "C-style OOP" had also emerged naturally, which led to the next course in C++.

The downside of this approach is that there is no gradual route from there to functional paradigms (or non-imperative in general). Also, one develops the habit of always thinking of how the language works under the hood, which can be counterproductive. E.g. when I was trying to learn Haskell, my mind was trying to understand how the interpreter worked.

Learning assembler is not just about the language, but understanding how the machine works (buses, memory-mapped peripherals, etc). In older platforms this is much simpler, so while ARM instructions can be easier to learn than the CISC instructions of the HC11, everything else is much friendlier for the beginner in the HC11.

WalterBright

With the dmd compiler, compiling with -vasm will show the generated assembly as it compiles. It's been poo-pooed because why not use objdump or -S? But once you try it, you'll know why it's so convenient, as it just emits the assembler, and not the huge pile of boilerplate needed to make an object file.

For example, I'm working on an AArch64 code generator, more specifically, generating floating point code. I have a function:

    float test(float a, float b) { return a * b; }
Compiling it with:

    dmd -c test.c -arm -vasm
yields:

    test:
    0000:   A9 BE 7B FD  stp       x29,x30,[sp,#-32]!    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#ldstpair_pre
    0004:   91 00 03 FD  mov       x29,sp    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#addsub_imm
    0008:   BD 00 0F A0  str       s0,[x29,#12]    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#ldst_pos
    000c:   B9 40 1B A0  ldr       w0,[x29,#0x18]    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#ldst_pos
    0010:   1E 21 08 00  fmul      s0,s0,s1    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#floatdp2
    0014:   A8 C2 7B FD  ldp       x29,x30,[sp],#0x20    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#ldstpair_post
    0018:   D6 5F 03 C0  ret    // https://www.scs.stanford.edu/~zyedidia/arm64/encodingindex.html#branch_reg
It emits the address, the hex instruction, the instruction mnemonic, and the URL to the instruction specification.

Yes, I know the code isn't quite correct, did I mention I was working on it? :-)

WalterBright

You might think that the compiler was generating assembler code, and then assembling the code to binary. Nope. It generates the binary instructions directly. The compiler has a builtin disassembler for the display. This makes for a fantastic way to make sure the correct binary is generated. It has saved me an enormous amount of debugging time.

xorcist

Learning assembly is what finally made programming "click" for me. With a solid intuition for instruction sets, pointers and addressing modes I could suddenly reason about programs on another level.

I have found good results with this model of teaching since then and wish that more people tried it.

anta40

Understandable... but that is not the point of the article. The article says "...a good first assembly language".

Nowadays, people normally learn assembly after they learn a higher-level language like C or Java.

scarface_74

I started programming in AppleSoft BASIC in 1986. But quickly became fascinated with 65C02 assembly. By the time I got to college and started taking different programming language classes, I quickly fell in love with C. My knowing assembly helped me understand C and even though I haven't learned any other assembly language since then, when I read articles about low level architecture of processors, I can follow along.

Let me take that back, I did learn 16 bit x86 assembly and the int21 DOS commands in college.

lutusp

> I believe 6502 instruction set is a good first assembly language

It was for me. In 1977 I lived in a little cabin in Oregon. I bought an Apple II as a diversion. Within a year I was working on a program that later became "Apple Writer" (https://en.wikipedia.org/wiki/Apple_Writer), written entirely in assembly.

Some in this conversation identify 6502 assembly as rather clunky and difficult to use. In retrospect I have to agree, but in 1977 I didn't have a basis for comparison.

There were no fast, high-level languages available for the Apple II, so my little program became an Apple product, primarily from the lack of alternatives.

Consider this -- Apple Writer lived in 8 kilobytes of RAM, but actually did things. It even had a macro language people used to process address lists.

I just boosted my main system to 96 gigabytes of RAM to be able to more easily host DeepSeek locally (I have an RTX 4090). I just realized that's enough RAM to hold almost 12 million copies of Apple Writer.

This is all pretty surreal ... but I've had occasion to say that a number of times since 1977.

hippospark

I prefer RISC-V as a starting assembly language because: it has a good design, it's more intuitive, it has modern language and tool support (GCC, LLVM, Rust, etc.), and it runs on QEMU and real available hardware.

pjc50

ARMv7 also meets these criteria, and is very nice to use. Available inline in BBC Basic on the Acorn Archimedes.

rlpb

> Available inline in BBC Basic on the Acorn Archimedes.

The 32-bit ARM ISA, yes, but ARMv7 came much later. You're basically limited to ARM2 on the Archimedes, according to https://en.wikipedia.org/wiki/Acorn_Archimedes

pjmlp

Last time I checked there was yet to exist a macro assembler worth its name for RISC-V.

musicale

> real available hardware

Last I checked the 65C02 was still being manufactured and sold.

It can also run at a blazing 14 MHz.

brucehoult

It's $8 and can't run without external ROM and RAM and a clock circuit and some glue logic.

A RISC-V CH32V003 32 bit processor with 2k RAM and 16k of flash running at 48 MHz costs $0.10 for an 8 pin package (6 available I/O pins) or $0.20 for the 20 pin package. Once programmed, it needs only electricity between 2.8V and 5.5V to be applied to start running.

https://www.aliexpress.com/item/1005005036714708.html

You can get it on a ready-made board or kit for as little as $1.

https://www.youtube.com/watch?v=dfXWs4CJuY0

https://www.olimex.com/Products/Retro-Computers/RVPC/open-so...

The recommended dev board, with a USB programmer/debugger interface for your PC, plus 5 of the 20 pin chips, costs $8.35, about the same as a bare 65C02 chip.

https://www.aliexpress.com/item/1005004895791296.html

You can make your own cool shit like this, which uses the $0.10 8 pin chip.

https://www.youtube.com/watch?v=1W7Z0BodhWk

It's as easy to learn the instructions of as a 6502 -- there are fewer of them! -- and far easier to write useful code for.

MUCH easier to learn than a 68000, and no more difficult to use. TBH the A and D register split always annoyed the hell out of me in early Mac days.

hakfoo

For a "first dive" into a programming paradigm, I could see the appeal of something more "PC" shaped than "MCU shaped", just because it offers recognizable, easy ways to deal with primitive debugging.

If you have a memory-mapped frame buffer, you can write a single byte to it if you need a status checkpoint or tracking a variable. If you have a keyboard, you can probably read its buffer or use it for triggers.

Maybe modern debugging environments and tools make it easier, but I tend to think of my university assembly language class which featured original Intel SDK-86 boards with LEDs and hex keypads to interact with.

musicale

Sounds nice, but the 6502 is certainly real, available hardware that you can buy and set up on a breadboard.

There are tutorials on youtube for it as well, such as Ben Eater's tutorial.

mytailorisrich

> It's $8 and can't run without external ROM and RAM and a clock circuit and some glue logic.

True but I've always found that there was a special 'charm' to these old CPUs because you have to build the circuit you described as a bare minimum, which is not difficult and makes you learn a range of skills.

lanstin

All my old assembly games would be 14x too fast. I'd have to adjust the between loop sleep constant?

pizza234

The topic of the 6502 ISA's simplicity is a pet peeve of mine, because to me it's clear that anybody thinking that such simplicity is a good thing never progressed past a hello world.

Programming anything of moderate complexity on the 6502 is hard. 8 bits are way too restrictive (e.g. screen addressing on the Commodore 64). Multiplications/divisions need to be hand-rolled. And even 16-bit sums/subtractions are simple but still not trivial to perform efficiently.
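
Even the "simple" case takes some care with the carry chain; a 16-bit add looks roughly like this sketch (the zero-page labels are made up):

        clc              ; clear the carry before the low bytes
        lda num1_lo
        adc num2_lo      ; add with carry
        sta res_lo
        lda num1_hi
        adc num2_hi      ; the carry from the low bytes propagates here
        sta res_hi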

The 8086+DOS platform is way easier to work on, in comparison (if one wants to learn basic assembly).

apricot

> Multiplications/divisions need to be hand-rolled.

In the context of someone's first contact with assembly language, that's a good thing! Translating various multiplication and division algorithms to assembly is a great way to learn it.

kccqzy

I learned to implement my own multiplication and division as part of learning bignum with a high level language (in my case Pascal). By the time I learned assembly language I simply focused on how to translate high level language to assembler; basically be a manual compiler and compare my output against a real compiler.

It seems unnecessary to learn any algorithm including the multiplication algorithm in assembly language.

apricot

Implementing shift-and-add multiplication and division algorithms in assembly, and analyzing them, is a very formative exercise in my experience.

Besides, if assembly language is good enough for Knuth to express all the algorithms in TAOCP, it's good enough for me...

wvenable

The restrictiveness is part of the fun. 8086 assembly isn't as much fun. If you're already going for something irrelevant (not x86-64 or RISC-V or ARM) then I don't see the advantage of 8086.

PaulHoule

I probably wrote more 8086 (really 80286 in real mode) assembly than any other assembly. I loved every minute of it, I didn't even mind the segment registers.

nickyisonline

I go to a compsci-centered technical institute and my informatics professor (a retro nerd who loves to show us his arcade machines) went against the set program for our class, which is learning 8088 assembly language, and instead focused on the 6502. It was honestly one of the best learning experiences of my life and I wouldn't have it any other way. He even got us to build the Ben Eater breadboard computer, so it was particularly hands-on and really interesting!

WalterBright

My first exposure to assembly language was the PDP-10. All I had was the DEC-10 processor manual. I was completely baffled by it. It had hundreds of instructions, with completely opaque descriptions. What was a register? What was an accumulator? What was an address? What was a stack? I had no idea. David Rolfe wrote some subroutines I needed for my Fortran version of Empire, and that helped a little, but I was lost.

One day, I asked my friend Shal Farley, what was a stack? He said "Imagine a stack of plates. You added a plate (pushed it on the stack), and took off a plate (popped it off the stack)." Suddenly, Shal had turned the lights on! I instantly understood it.

Then, I began working with a 6800 microprocessor on a little board. It had something like 40 instructions, that all fit on a card. 40 instructions were simple to learn, and suddenly, I got it.

I went back to the -10 manual, and it all made sense.

dhosek

I’ve only ever learned two assembly languages: 6502 (when I was in elementary school in the early 80s) and 370 (my senior year of high school when I had access to the UIC mainframe thanks to taking night classes there). I can roughly follow the output of godbolt on those rare occasions that I’m curious enough to bother, but the complexity of modern CPUs and computer architectures are such that I don’t know that I really need to deal with assembly language now. That said, my mental model of how a computer works at a low level is very much based on how the Apple II was set up.

WhitneyLand

Same, learned as a kid. What was your motivation?

Mine was wanting to make games and hitting the performance wall in basic.

kstrauser

That's what did it for me, too. The magazines at the time said that if you wanted to make your game run faster, you had to write parts of it in assembly. I was fortunate that no one told me that was supposed to be harder. I just knew that it was different, and it never occurred to me to be leery of it.

dhosek

Pretty much all kids want to write games.

toolslive

6502 was my first assembly language (back in the 80s). It was fine. However, when I later had to do some Z80 assembly programming I considered that instruction set a lot nicer. It had more registers. It had a swappable register set. It even had a few 16 bit instructions. Really nice.

I guess most people just give up once they see the ugliness of what they got from Intel.

masswerk

I guess, it's a matter of perspective. The Z80 is derived from the 8008 (actually the 8080), which was an on-a-chip implementation of the processor of the Datapoint 2200. Notably, the DP2200 was designed around an early Intel shift-register for memory (meaning, sequential memory), hence the multitude of internal registers in order to minimize memory access. (It's probably only for the 8080, which provided better support for direct memory access, that there's this notion of luxury about these internal registers. Before this, it had been a necessity.)

The 6502, on the other hand, is a derivative of the 6800, a genuine microprocessor design that takes fast random access into account. And thus, taking fast memory for granted, it became viable to outsource some of the internals. From this perspective, the 6502 provides a plenitude of 256 slightly slower registers in the zeropage. – However, if you're using the 6502 like this, you may discover that you lose some of the much-acclaimed advantage in cycle count per instruction.
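
To put rough numbers on "slightly slower" (standard NMOS 6502 cycle counts):

        inx              ; 2 cycles: a real register
        inc $10          ; 5 cycles: a zero-page "register"
        inc $0210        ; 6 cycles: ordinary absolute memory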

adrian_b

Zilog Z80 had very significant improvements over Intel 8080. From many points of view it can be considered midway between Intel 8080/8085 and Intel 8086/8088.

Besides increasing the number of registers, Z80 has added many features that were standard in any general-purpose computer (e.g. signed integer support and indexed addressing), but which were missing in Datapoint 2200/Intel 8008/Intel 8080, simply because Datapoint 2200 had not been designed to be used in general-purpose computers but only for implementing the logic required inside a serial terminal.

However many programs for Z80 did not make good use of its additional functions, because they were intended to remain compatible with the legacy Intel 8080 systems.

Intel 8086 did not have this problem, because it was only source-code compatible with 8080, not binary compatible, so any application had to be recompiled for it, and fully using the new ISA did not have any additional disadvantage.

Unlike 6502, Intel 8080 and Z80 had a few 16-bit operations, which were intended for address computations, while data operations were expected to be handled using the 8-bit accumulator.

Despite the intended use and the limited set of 16-bit operations, implementing complicated arithmetic operations, e.g. floating-point arithmetic, was still faster using the 16-bit address registers and operations for handling data. With properly optimized programs, Z80 and even 8080 could be much faster than 6502 for number crunching. (Though faster is only relative, because FP64 floating-point operations took many milliseconds per operation on any 8-bit microprocessor, many billions of times slower than on a modern laptop or desktop CPU.)

masswerk

I think this is much about another major difference in the history of both designs: while originally for a different architecture, the DP2200 processor / Intel 8008 was meant to be a CPU for a small computer system from the beginning. The 6800, and in consequence even more so the 6502, was more about a microcontroller for implementing in software what wasn't economically viable to be put in silicon. Notably, it was not meant to be a computer CPU. Thus, the 6502 falls short on many things like support for large stacks, as required for higher languages, or efficient 16-bit operations. (Its philosophy may be better described as, "if it runs, it's good enough, even better so if it runs cost effectively.")

PS: regarding the DP2200 not being meant as a small system, I'm not so sure about this based on my own reading. But, certainly, it wasn't marketed as such.

And, regarding the educational merits of the 6502, it may be a good second language, as it requires you to think about your implementation. (Personally, I'm more for the PDP-1, which Ed Fredkin – "world's best programmer", no less – once claimed to have inspired IBM's RISC architecture. ;-) )