Zero-knowledge proofs, encoding Sudoku and Mario speedruns without semantic leak

26 comments

·March 18, 2025

CJefferson

This is really interesting!

I also like how it shows the 'power' of NP-completeness. Explaining zero-knowledge proofs for colouring is fairly easy. Explaining how to go from 3-SAT to colouring is some nice pictures. Explaining how to go from Sudoku to 3-SAT is 5 minutes work (assuming you understand both 3-SAT and Sudoku already).

Together, these things let you do zero-knowledge proofs for Sudoku with no more work, and by a similar process, zero-knowledge proofs for any problem in P.

eru

Also for any problem in NP!

That's useful in practice, because producing zero knowledge proofs is sooo slow. But getting non-deterministic 'hints' can speed up many computations. Crucially, the hints are arbitrary data and do not have to be computed inside the computation-to-be-proven.

The most cliche example is probably verifying that a number is compound: you could either run a complicated test, or you could just 'guess' the prime factors and verify your guess via multiplication.

Slightly more practical: you can sort in O(n log n) deterministic time. But that's easily beaten by Bogosort: you 'guess' a permutation, apply it to your data, and check if it's sorted. Not only does that finish in O(n) if you 'guess' right, the constant factors are also much better than for a standard sorting algorithm.

Of course, your zero-knowledge computation manages to 'guess' the right permutation right away, because it gets a hint from a deterministic computer outside the 'zero-knowledge box' running the classic O(n log n) algorithm.

That's still an advantage over running the O(n log n) algorithm directly, because proving computation is so expensive.

pixelpoet

Agreed, it covers a remarkable amount of ground in kind of far-out (to me) computer science topics in an accessible way; I love teaching and have to say, the exposition is excellent. These guys seem really young too, very impressive.

I'd definitely recommend watching the video before reading the blog post, ideally!

jstanley

> For a clause like x_1 OR x_2 OR x_3, we think of it as “the only forbidden combination is x_1 = \text{FALSE},\; x_2 = \text{FALSE},\; x_3 = \text{FALSE}.” Our task is to construct a gadget that eliminates this one forbidden combination while allowing all others. Here’s how: if both x_1 and x_2 are false (represented as the edge being colored red-blue), the node labeled x_1 OR x_2 must be blue. Similarly, the node labeled x_1 OR x_2 OR x_3 must be blue. You can verify that any other combination is acceptable.

I didn't follow this. I also looked at the graph and found that the node labeled "x1 OR x2 OR x3" is in fact connected to a blue node, so it can't be blue?

I didn't manage to work out how the clause gadget is meant to work so I can't tell if the graph is wrong or the explanation.

> Our zero-knowledge proof was interactive—a back-and-forth conversation between the prover and the verifier.

I think possibly the section containing your zero knowledge proof got edited out by accident?

CJefferson

The graph explanation is saying:

The 'x1 or x2 or x3' node can't be blue (as you say, it's connected to a blue node!)

It doesn't take too long to convince yourself that if we coloured the right-hand node of x1, x2 and x3 all blue, there is no valid colouring of the 'clause 1' bit of the graph where the 'x1 or x2 or x3' node is not blue -- which means there is no colouring of the whole graph.

On the other hand, if I make at least one of the right-hand nodes of x1, x2 or x3 red, then I can colour the 'clause 1' bit of the graph such that the 'x1 or x2 or x3' node isn't blue, so all is fine!

They are trying to explain that this graph correctly represents the SAT problem, because it has a valid colouring if and only if the SAT problem has a solution -- and we do this by checking the clauses one at a time.

jstanley

Thanks.

pixelpoet

Supplementary material for their video at https://www.youtube.com/watch?v=Otvcbw6k4eo

jkaptur

I didn't quite follow how you can actually prove that you've solved a sudoku via reduction to graph coloring. If I understand correctly, an important part of the graph coloring protocol is that the prover permutes the colors between each round (otherwise the verifier can just iteratively learn the color of every node).

But all sudoku puzzles have the same graph structure - a puzzle instance is a partial assignment of colors to nodes.

So can't a verifier can gain knowledge about the prover's solution by asking for edges that correspond to known values?

(I found a different ZKP protocol for sudoku, but I don't think it relates to the graph coloring protocol: https://www.wisdom.weizmann.ac.il/~naor/PAPERS/SUDOKU_DEMO/)

benediktwerner

The way the conversion is done here, different sudokus produce different graphs. Besides the regular sudoku graph structure, there are nine additional nodes, each corresponding to one number. They are all connected to each other to ensure they must be different and each one is connected to each cell where the corresponding number is present as a clue from the start. This way, the graph doesn't need any pre-coloring to still encode the sudoku including the given clues.

jkaptur

Ah ha! Thank you for explaining.

tdb7893

I haven't read this particular blog but the solution I remember seeing is you randomly swap the colors each edge verification so each is independent. All the edges are numbers that are required to be different so when you verify they are different you gain no information.

jkaptur

How does the verifier gain confidence that the prover has solved a particular instance of a sudoku puzzle?

heinrich5991

They can ask whether the pre-filled fields have the correct equal/unequal relationship by testing two of them at a time.

behnamoh

Can someone explain why we care so much about graphs in various branches of science? Like, why model the problems mentioned in the video as graphs that need to be colored?

teraflop

It's not so much that we want to model problems as graphs, it's that many problems and situations naturally correspond to graphs. Anything with a collection of objects, where some subset of pairs of objects are "related" to each other, is a graph. In mathematical terms, graphs are essentially the same as binary relations on a set (a generalization of functions).

And any time you can think of something as a graph, you can benefit from the wealth of available mathematical and computational tools that apply to all graphs.

eru

Graphs are also really simple structures mathematically.

After all, an (undirected) graph is nothing more than a ground set and a collection of its two-element subsets.

It's the same reason why groups crop up so often: groups are also really simple structures, so it's really easy to satisfy their axioms 'by accident'. Same for numbers in general.

whatshisface

I don't think that's a full explanation for groups, because groups other than the free or permutation groups are rather "unlikely," in the sense that they're much smaller.

ForTheKidz

Graphs don't really capture the entirety of hashmaps and other indexing concerns. Other than that yea I agree.

magicalhippo

GP was talking about modelling your problem. A hashmap is seldom a good model for a given problem. Typically they're "just" an implementation detail.

For example, in some cases it can be useful to consider triangles in a 3D model as cyclic graphs of vertices. The edges of the triangle correspond to the edges in the graph.

However I can't think of any case where it's useful to think of a triangle as a hashmap.

Ar-Curunir

Because in this case we have a relatively simple ZKP for 3-coloring.

However what you gain in the simplicity of the ZKP, you lose in the reduction to 3-coloring. So nowadays people use ZKPs that work with more realistic computation representations, like arithmetic circuits

jon_richards

There is a huge category of computer science problems that can be solved with the same algorithm (and a conversion). Graph coloring is probably the easiest of those problems to explain.

jrndcnfh

Not sure if I grasp your question. It's like asking "why do we care about lists or any data structure for that matter"

Graphs are just a very simple generalization of lists. And many problems can be easily modelled as graphs

carra

I'd say this was done mainly to help visualize the problems better, since this is a learning video. But often problems are converted into graphs because it is a well studied field and that way you can apply many theorems and traversal algorithms.

null

[deleted]

crtasm

Neither Sudoku or Mario are mentioned, .. oh there's a linked video - got it.

HN

Zero-knowledge proofs, encoding Sudoku and Mario speedruns without semantic leak

Zero-knowledge proofs, encoding Sudoku and Mario speedruns without semantic leak