
The Sudoku Affair

44 comments · February 5, 2025

varunneal

Hilarious post. I recommend people read to the end

> If Jeffries started with a different core representation, then it's likely his subsequent design decisions would also change. The bookkeeping for constraint propagation might push him towards Norvig's relational approach to the rules of Sudoku; rather than continually recomputing the rows, columns, and boxes, he could simply have a map of each cell onto its peers. He could distill every lesson of the previous posts, creating something simpler and faster.

> But Jeffries isn't in the business of starting over. He not only believes in incremental design, but in using the smallest possible increments. In his posts, he regularly returns to GeePaw Hill's maxim of "many more much smaller steps." He is only interested in designs that are reachable through a series of small, discrete steps.
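The "map of each cell onto its peers" idea from the quote can be sketched in a few lines of Python (an illustrative sketch in the spirit of Norvig's solver, not his actual code):

```python
# Precompute, once, the 27 units (rows, columns, boxes) and from them
# the 20 peers of every cell. Names here are illustrative.
rows = "ABCDEFGHI"
cols = "123456789"
cells = [r + c for r in rows for c in cols]

unitlist = (
    [[r + c for c in cols] for r in rows] +                  # 9 rows
    [[r + c for r in rows] for c in cols] +                  # 9 columns
    [[r + c for r in rs for c in cs]                         # 9 boxes
     for rs in ("ABC", "DEF", "GHI") for cs in ("123", "456", "789")]
)
units = {cell: [u for u in unitlist if cell in u] for cell in cells}
peers = {cell: set(sum(units[cell], [])) - {cell} for cell in cells}
```

With this map in hand, constraint propagation never recomputes rows, columns, or boxes; it just looks up `peers[cell]`.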

Jeffries has radicalized me. This sort of puttering-around with "incremental design" is too pervasive in the corporate world. In software we have the luxury of rethinking from first principles, and we must use it. Death to MMMSS

taurknaut

> In software we have the luxury of rethinking from first principles

What does this even mean? I'm aware of the meaning of "first principles," but it doesn't seem to relate to software development at all. I can imagine using it when trying to figure out how code works, but I can't figure out how it relates to building the software to begin with. What questions are you trying to answer where first principles would even come up?

fragmede

One example is with databases. In the olden days, if you needed performance, you bought a big Oracle server, and that was that. If you needed more performance, you just bought a bigger one. Without revisiting the database from first principles, we'd still be using ACID-compliant relational SQL databases for everything. By starting from first principles, asking what the database software is for, and considering a changed hardware landscape (i.e., the cloud), among other factors, we now have other types of databases that are more appropriate for different kinds of workloads. Key-value stores, document stores, wide-column stores, graph databases, time-series databases, NewSQL databases, vector databases: all of these are examples of first-principles rethinking of the database, and of being willing to throw out the rules and ACID compliance in favor of what's actually necessary for a given workload instead of continuing on with existing conventions.

kthejoker2

I find this post ironic in that it (unintentionally) mirrors the article's point.

Graph (hierarchical) databases and time-series (SCADA) databases predate your (implicit) definition of "database" as a "relational database."

Key-value stores also existed before Oracle came to be.

It's never really been a matter of "rethinking from first principles".

Like Norvig's contraint propagation in the post, it has been about choosing the right design for the problem at hand, instead of trying to fit square pegs into round holes.

taurknaut

[flagged]

srpablo

It's a thing tech (and tech-adjacent) people say when they rub their temples and decide to try to work out the solution to a problem from their existing ideas and biases instead of considering what anyone else has said about it.

abetusk

Good article. For me, this is the fundamental concept:

> Both Norvig and Jeffries are genre programmers; they have spent most of their career solving a specific kind of problem. And that problem, inevitably, lends itself to a particular kind of solution.

I wish more people would be circumspect about the genre of problem they're trying to solve and how their framing railroads them into a particular type of solution that might have a better method.

JadeNB

> I wish more people would be circumspect about the genre of problem they're trying to solve and how their framing railroads them into a particular type of solution that might have a better method.

Do you mean "explicit?" "Circumspect" seems like not what you want.

abetusk

From Wiktionary [0]:

> Carefully aware of all circumstances; considerate of all that is pertinent.

"I wish more people would put more consideration of all that is pertinent about the genre of problem they're trying to solve and how their framing railroads them ..."

[0] https://en.wiktionary.org/wiki/circumspect

JadeNB

Ah, I thought you meant that you wished people who talked in public about their approaches to a problem (as Norvig and Jeffries did) would talk more about their framings. I think I now understand that you mean that you wish individual people would analyze their own framing for their own use, not (just) for the purposes of other people learning from them.

(I also am more familiar with the sense of "circumspect" from the macOS dictionary, "wary and unwilling to take risks," which seemed totally out of line. The Wiktionary meaning makes more sense.)

frankfrank13

It is interesting to basically critique a series of blog posts like it's a film.

But re: design vs. increment, I do think incremental TDD is pretty useful in domains where you have low confidence that you could come up with a good design. If you asked me to implement an LLM today, 0% I could design a good one. But given enough time I could slowly start to implement one (maybe?).

The two quotes that got me are

Jeffries:

> So I try to make small decisions, simple decisions, decisions that will be easy to change when, not if, a better idea comes along.

Norvig, about Jeffries:

> He didn't know that so he was sort of blundering in the dark even though all his code "worked" because he had all these test cases.

I fear someone could say this about me, about almost everything I've ever built. I guess something like "I just kept crawling and it just kept working!"

copypasterepeat

The way I think of this is that there is no shame in Jeffries not having as good and intuitive a grasp of the problem as Norvig. After all, this kind of problem is almost tailor-made for Norvig's skill set. Like many of us who haven't done a lot of work in this space, Jeffries spends a lot of time early on just exploring the problem space. He starts with a seemingly intuitive (naive?) data structure and goes from there. So far so good; no shame in any of that. All of us have been in a similar situation.

I think the key moment comes at blog post number 12. As the author of this post describes, this is where Jeffries basically recognizes that his approach needs to change radically. A lot of people at this point would throw away what they have, it having served its function of helping them understand the problem, and start from scratch with different assumptions, data structures, etc. Jeffries didn't do that, probably because he firmly believes in the TDD dogma, and probably because he felt pressured to make this work since he was publicly working through it. That's where Jeffries goes wrong. He should have admitted that in some cases you should just start from scratch instead of tinkering at the edges of something that's clearly suboptimal.

karmakaze

I do something similar where even for something substantial, I'll start out with "what's the simplest thing that might work".

Of course, with past experience, my first 'simple' thought is usually quite practical or sensible. In the same way, it's hard to deliberately invent an inefficient algorithm that does useless busywork; we've wired our brains toward good solutions.

gavinhoward

As someone who wrote 60 pages of design for a new VCS before writing a single line of code, I appreciate this post.

Design is crucial. The optimal time for that design may be different for different things, but I do know that incremental design can, and often does, fail.

jcranmer

One of the most useful skills I gained from programming competitions is the ability to write and debug an algorithm in my head. And I suspect your 60-page design document was written by essentially doing the same thing: coming up with a prototype design, and mentally testing it against the use cases to figure out if the design is workable, and committing the notes to paper only after you confirmed that it worked in your head.

gavinhoward

Interesting. Perhaps that is what I did.

I did do that to design the UX, but for algorithms, I am not so sure. I am really bad at coding competitions, for example.

aidenn0

Was that for Yore?

gavinhoward

Um, yeah. How do you know the name?

I have never mentioned it by name, I think. Only linked to it occasionally.

Edit: Another reason I am surprised is that Yore became its name only a few months ago.

aidenn0

I'll admit I had to search for the name. I recognized your username as "that Yao guy" and remembered that you were also working on a build system and a VCS. Since yore is in the same repository as yao, it wasn't hard to find.

Archit3ch

Your method is still incremental design, unless you started at page 1 and stopped at page 60.

gavinhoward

That is, in fact, what I did.

But then again, you know I was referring to Jeffries' SOP of designing while coding and refactoring incrementally.

taeric

Sadly, link isn't loading for me. I'm assuming this is the attempt to TDD into a Sudoku solver?

Sucks, as it is largely a dunk on the author. It really is a sobering experience, to attempt something like that and use what are advertised as good tools, only to fall flat on your face.

I think what people often fail to appreciate is that if you see ANY strategy work, it has almost certainly been rehearsed. Many, many times. Even when you are using the exact correct tools, for it to work smoothly pretty much requires rehearsal.

And this is exactly why you do gamedays for how to react to failures. If you have not practiced it, then you should not expect success.

aidenn0

It's specifically about the limits of incremental design.

TFA's thesis is roughly that incremental design dooms you to a local maximum:

Since Jeffries (the TDD/Sudoku guy you seem to be aware of) starts out with a suboptimal representation for the board, there is no small change that can turn the bad code into good code. At some point along the line, he makes motions in the direction of the design that Norvig used, but as there is no incremental way to get there (maintaining two representations was a dead-end since it hurt performance so much), he never does.

taeric

Thanks! Annoyed that the link still isn't loading for me.

I'm curious on the thesis. I'm assuming "locked in by tests" increments are the problem? I'm curious why you couldn't treat this like any learning task where you can take efforts that are effectively larger steps to see where they can get you?

I should also note that I am not clear I understand how bad of a representation of the board you could get locked with. I got a working solver years ago with what is almost certainly a poor representation. https://taeric.github.io/Sudoku.html

aidenn0

> I'm curious on the thesis. I'm assuming "locked in by tests" increments are the problem? I'm curious why you couldn't treat this like any learning task where you can take efforts that are effectively larger steps to see where they can get you?

Here's a quote from TFA on this (using >> for quotes from TFA)

>> But Jeffries isn't in the business of starting over. He not only believes in incremental design, but in using the smallest possible increments. In his posts, he regularly returns to GeePaw Hill's maxim of "many more much smaller steps." He is only interested in designs that are reachable through a series of small, discrete steps:

and later

>> Jeffries, however, does not believe in bigger pictures; his approach to software design is proudly myopic. He prevents himself from seeing the forest by pressing his face against the trees.

> I should also note that I am not clear I understand how bad of a representation of the board you could get locked with. I got a working solver years ago with what is almost certainly a poor representation. https://taeric.github.io/Sudoku.html

First, a point of clarification: Jeffries also gets a working solver, not in the original episode you may have heard of, but in a series of forty-five articles 18[1] years after the infamous incident; TFA focuses almost entirely on this (successful) attempt.

Once Jeffries has a working solver he attempts to simplify it, and TFA makes the claim that these attempts are hindered by the choice of Option[Int] for each cell rather than a Set[Int] (i.e. a set of remaining legal values). This results in Norvig's code being significantly more succinct than Jeffries' code, even when implementing the same heuristic.

1: This originally read "two" due to quick skimming on my part.
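The Option[Int]-versus-Set[Int] choice described above can be sketched in Python (names and helpers here are mine, for illustration; this is neither solver's actual code):

```python
# Jeffries-style cell: either a known value or "not yet known".
option_board = {(0, 0): 5, (0, 1): None}  # None plays the role of Option's None

# Norvig-style cell: the set of values still legal for it.
set_board = {(0, 0): {5}, (0, 1): {1, 2, 3, 4, 6, 7, 8, 9}}

# With sets, narrowing a cell is a single operation, which is what makes
# constraint propagation succinct: removing a candidate either shrinks
# the set or exposes a contradiction (an empty set).
def eliminate(board, cell, value):
    board[cell] = board[cell] - {value}
    return len(board[cell]) > 0  # False means contradiction
```

Under the Option-style representation, the same step requires separate bookkeeping for "what this cell cannot be," which is roughly the extra machinery TFA says Jeffries kept recomputing.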

jsnell

It's about that, as I too expected from the title, but there's a twist.

The twist is that Jeffries started working on that project again in 2024 and wrote 45 more posts on it. And that's what this article is about.

taeric

Thanks! Annoyed that the link is still not loading for me. :(

dfe

I once wrote a Sudoku solver in SQL in an afternoon because I was playing a Sudoku (a very newb player at that time) and it spontaneously occurred to me that all I was doing was a join or anti-join query repeatedly.

My data model was X,Y,V. Nothing nullable. Separately you need a table (possibly generated on the fly) of range 1 to 9. You wind up joining that a lot.

The whole program consisted of running INSERT INTO ... SELECT ... in a loop until 0 rows were inserted, indicating you were either done or you hit a point where no cell had a single solution. I'll spare everyone the rest of the details.

Incomplete, I know, but it fired the neurons, particularly with respect to the utility of the EXISTS expression in SQL.

I had no idea about things like "naked pairs" at that time, but I'm sure I could extend it to support that.

It's interesting that if I translate that to a more traditional language, I independently came up with what is a cousin to Norvig's solution. I sure don't have his background; in fact, my background is probably closer to Jeffries'.

The main difference is that Norvig pre-enumerates all 9 possible values for all 81 cells then sieves them out, whereas my SQL constructs the 9*4 matrix from a temporary range 1..9 table, discovers the "must be" values, inserts those, then just repeats the process. Basically I'm Map<Coordinate, Integer> whereas Norvig is Map<Coordinate, Set<Integer>> and the algorithm is slightly different.
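The repeat-until-no-rows loop described above translates roughly into Python as follows (an illustrative sketch, not dfe's SQL): keep a map of solved cells and, on each pass, fill in every cell that has exactly one remaining legal value, stopping when a pass adds nothing.

```python
def naked_singles_pass(solved, peers_of):
    """One pass: fill every cell whose peers force a single value.

    solved:   {cell: value} for cells solved so far
    peers_of: {cell: set of peer cells}
    Returns the number of cells filled (0 = done, or stuck).
    """
    added = 0
    for cell, peers in peers_of.items():
        if cell in solved:
            continue
        candidates = set(range(1, 10)) - {solved[p] for p in peers if p in solved}
        if len(candidates) == 1:
            solved[cell] = candidates.pop()
            added += 1
    return added

def run(solved, peers_of):
    # Mirrors the SQL loop: repeat until an INSERT adds 0 rows.
    while naked_singles_pass(solved, peers_of):
        pass
    return solved
```

Like the SQL version, this is incomplete: it stops when no cell is forced, which is fine for easy puzzles but needs search or more heuristics for hard ones.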

My experience agrees with the author's. Incremental design does not work. Prototyping and being willing to throw away your prototype and do something wildly different has always been the better approach in my experience.

You wouldn't believe how many less experienced engineers I help with problems by coming in and approaching the problem with a fresh set of eyes. It takes years to build the skill and willpower to incinerate days or weeks worth of work in favor of an alternative solution you can write in an afternoon. But it's not a waste! If you gained enough insight from doing it the wrong way to be able to write it the right way, then you maximized the value of your time. It is actually a waste of your time to keep iterating on a fundamentally flawed design.

integricho

Not trying to defend Jeffries, but Norvig's solution, despite looking so elegant and in hindsight seeming obvious to everyone here, is not at all obvious, and I doubt someone without prior experience exploring the problem area would come up with it. Just no way.

yifanl

To me it seems like it comes from experience solving sudokus. If you've ever tried it (or watched Cracking the Cryptic, if you haven't), you don't just store the solution of each square; you keep track of what you've ensured cannot be in that square.

From that mental model, the choice of data structure would seem to follow directly, which would tie nicely with the subthesis of programming within your genre.

zellyn

I dunno. When I first read Norvig's beautiful solution, I immediately thought, "Only someone coming from a Lisp and (old school) AI background (or maybe a super-Mathy background where everything is a Set) would have picked that representation, but oh man does it work so, so well."

gotski

Slightly off topic, but this made me think of an example from the PuLP (Python linear-programming library) documentation that solves a Sudoku using LP constraints.

https://coin-or.github.io/pulp/CaseStudies/a_sudoku_problem....

One nice thing about this approach is that by adding each solution in as a constraint and re-running, you can exhaustively enumerate all possible solutions for a given puzzle.
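The enumerate-all-solutions trick works with any solver that returns one solution at a time. Here is a generic pure-Python sketch of the solve / forbid / re-solve loop (illustrative only, not PuLP code; in PuLP the "forbid" step would be an added linear constraint rather than a set lookup):

```python
from itertools import product

def all_solutions(domains, is_valid):
    """Enumerate every solution by repeatedly solving and then
    excluding the solution just found.

    domains:  one iterable of candidate values per variable
    is_valid: predicate over a full assignment (tuple of values)
    """
    forbidden = set()  # solutions already found, now excluded
    found = []
    while True:
        # "Solve": find any assignment satisfying the constraints,
        # including the no-good constraints added so far.
        sol = next(
            (s for s in product(*domains)
             if is_valid(s) and s not in forbidden),
            None,
        )
        if sol is None:        # infeasible: every solution is excluded
            return found
        found.append(sol)
        forbidden.add(sol)     # the constraint that excludes this solution
```

The brute-force `next(...)` stands in for one solver call; the loop structure (solve, record, exclude, repeat until infeasible) is the same one the PuLP example uses.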

gotski

This makes me think of an example from the PuLP docs of solving Sudoku using constraint-based linear programming.

https://coin-or.github.io/pulp/CaseStudies/a_sudoku_problem....

This example helped me enormously in developing my understanding of how to use binary variables in an LP solver.

spoonjim

Norvig is a world-class genius, and he outdid Jeffries because of that. No shame in losing a basketball match to LeBron James.

mjd

Leaving Norvig out of it, it seems to me that there is some shame in taking a reasonably straightforward problem like this one, thrashing around with it for months or years without producing an effective solution, and then being unable to see that your methodology wasn't working.

If the account is accurate, this Jeffries guy wasn't getting the ball through the hoop, whether or not LeBron was around.

paulmooreparks

Exactly. This wasn't a game of one-on-one. It was two guys standing at foul lines taking shots, and one was sinking most of them.

Sesse__

A Sudoku solver isn't a hard problem. It's more like scoring 30% on a couple of simple layup shots, then LeBron James comes along and dunks it out of the sky, but you still tell everyone that your way of training basketball is the best.

vincenthwt

Since Sudoku is a logic-based game, why not use Prolog to create a solver? An example in Prolog can be found here: https://www.metalevel.at/sudoku/.

sinuhe69

A practical programmer will use a SAT solver. And bang!, problem solved! :)