Why cryptography is not based on NP-complete problems
58 comments
February 13, 2025 · pclmulqdq
honzaik
AFAIK the "right kind of code" does a lot of heavy lifting for practical implementations, such as Classic McEliece.
Correct me if I am wrong, as I haven't spent much time looking into it, but the security analysis essentially says "we assume the Goppa code is indistinguishable from a random code, so the best attack is generic decoding of a random code (an NP-hard problem)". But there is no reduction to some NP-hard problem showing that the Goppa code (the specific code used in Classic McEliece) is indistinguishable from a random code.
The assumption is reasonable, as nobody has been able to find a distinguisher for decades. Also, even if a distinguisher exists, it doesn't translate into a direct attack against the system; it just means you cannot rule out "structural attacks" and simply appeal to the NP-hard problem.
less_less
Yeah, that's right: there are no known cryptosystems whose security is based on the difficulty of solving an NP-hard problem. It's not known even in theory whether P != NP implies that one-way functions exist: for example, it might be that all NP problems are easy on average, or that there are problems that are hard on average but where you can't sample a problem together with its solution.
(And this is even with the simplification that polytime = practical and not-polytime = infeasible.)
aleph_minus_one
> It's not known even in theory whether P != NP implies that one-way functions exist: for example, it might be that all NP problems are easy on average, or that there are problems that are hard on average but that you can't sample the problems and their solution at the same time.
Relevant paper:
Impagliazzo, R. A personal view of average-case complexity theory. In Proceedings of the 10th Annual Conference on Structure in Complexity Theory. IEEE Computer Society Press (1995), 134–147.
https://ieeexplore.ieee.org/document/514853
Non-paywall links:
- https://www.karlin.mff.cuni.cz/~krajicek/ri5svetu.pdf
- https://gwern.net/doc/cs/cryptography/1995-impagliazzo.pdf
Popular scientific articles on this topic:
- https://www.quantamagazine.org/which-computational-universe-...
- https://cacm.acm.org/research/fifty-years-of-p-vs-np-and-the...
krackers
The article is a bit confusing (to me as a layman). As I understand it, it's saying that the reason we can't build cryptosystems out of arbitrary NP-complete problems is that there exist randomized algorithms that work for 99% of inputs (average vs. worst-case complexity, e.g. why the simplex algorithm works in practice). But there are some things missing:
* We know that randomized algorithms exist that handle most instances of NP-complete problems. But couldn't we have some problem in NP that, while reducible to e.g. 3SAT, always reduces to "very hard" instances?
* Conversely, there's no guarantee that there doesn't exist a randomized algorithm for something that's NP-hard (and not in NP), right?
The argument along the lines of NP-complete vs. NP-hard seems to be a non sequitur, since the actual distinction is average- vs. worst-case hardness, right? It may just so happen that all currently known problems in NP are easy "on average", but I don't think that's formally guaranteed.
Edit: https://theory.stanford.edu/~trevisan/average/slides.pdf seems to imply that (1) is an open question.
nemoniac
There has been quite some research on finding, in layman's terms, where the difficult instances of an NP-complete problem are.
What it comes down to is that there are phase transitions in computational systems that depend on how constrained the problem is.
Consider a sudoku puzzle. If the board is almost empty of numbers, then it's easy to solve. If the board is almost full of numbers, it's easy to solve. But there's some critical amount of numbers filled in (say half of them) that makes the sudoku most difficult to solve. That's where the phase transition is.
The notion of a phase transition comes from thermodynamics, where ice becomes water becomes gas.
Here's an early paper on phase transitions in AI.
https://www.sciencedirect.com/science/article/pii/0004370287...
Here's a later one on phase transitions for the satisfiability problem (SAT).
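To make the phase-transition idea concrete, here's a minimal sketch of my own (not from either paper): generate random 3-SAT instances at various clause-to-variable ratios and brute-force their satisfiability. At large sizes the transition sharpens around a ratio of roughly 4.2; at the toy sizes used below it is visible but blurry, and the run takes a few seconds.

```python
# Toy illustration of the SAT phase transition: the fraction of satisfiable
# random 3-SAT instances drops from ~1 to ~0 as the clause/variable ratio
# crosses roughly 4.2, and instances near the crossover are the hardest.
# Brute force over all assignments, so only tiny sizes are feasible here.
import itertools
import random

def random_3sat(n_vars, n_clauses):
    # a clause = 3 distinct variables, each negated with probability 1/2
    clauses = []
    for _ in range(n_clauses):
        chosen = random.sample(range(1, n_vars + 1), 3)
        clauses.append([v if random.random() < 0.5 else -v for v in chosen])
    return clauses

def satisfiable(n_vars, clauses):
    for bits in itertools.product([False, True], repeat=n_vars):
        if all(any(bits[abs(lit) - 1] == (lit > 0) for lit in clause)
               for clause in clauses):
            return True
    return False

n, trials = 10, 20
for ratio in (2.0, 3.0, 4.0, 4.3, 5.0, 6.0):
    m = int(ratio * n)
    sat = sum(satisfiable(n, random_3sat(n, m)) for _ in range(trials))
    print(f"clauses/vars = {ratio:.1f}: {sat}/{trials} instances satisfiable")
```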
Vervious
1) It'd be really, really awesome to show average-case hardness of a natural problem (e.g. 3SAT on a natural distribution) from the worst-case hardness of that problem (e.g. 3SAT). I believe it's wide open in the context of building one-way functions.
2) Randomized polynomial-time algorithms don't exist for NP-hard problems unless NP ⊆ BPP (iirc).
I think you have the right intuition. The issue is that right now cryptography is built on top of relatively "fragile" "average-case hard" problems -- discrete log, lattice stuff, etc. They are hard by assumption and we really don't know how to study them. (I would bet on some of this stuff being broken in our lifetimes).
It'd be really nice to instead base cryptography on an assumption that we are much more confident is true, e.g. the worst-case hardness of 3SAT. (I mean, who is going to prove P = NP?) There are two steps:
1) First, show cryptography from average-case hardness of a "more believable" problem.
2) Second, show that the average-case hardness of that problem is implied by the worst-case hardness of an NP-complete problem.
(1) is sort of better studied now, with a line of work on meta-complexity. (2) I believe is wide open, as previously mentioned.
Anyways I'm not really an expert in this area, but it's really quite fascinating.
CJefferson
In general, it's very hard, and a constantly moving target, to make "hard" instances of NP-complete problems.
For example, it's really nice to have a problem generator which makes a series of hard problems, for things like benchmarking. It's very common that someone comes up with such a generator, and then someone else finds a way of solving "almost everything" it produces, by finding a good way of finding solutions to the solvable problems, and/or finding a way to easily prove the unsolvable problems have no solutions.
The second category tends to be more interesting: it ends up being really hard to make large random unsolvable problems which don't, with high probability, include some kind of "flaw" that is easy to find and easily proves the problem is unsolvable.
strulovich
Generally, all problems used for crypto are in NP, since, given the secret, you need to be able to compute the answer efficiently.
It's just that being NP-hard or NP-complete is not enough.
The theoretical definition of one-way functions is used to define cryptography, so reading up on that may clarify this some more.
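As a concrete toy illustration of that point (my own example, using factoring): with the secret witness in hand, checking is one multiplication; without it, you are left searching.

```python
# Toy illustration: factoring. Given the secret factors, verification is one
# multiplication (this is what "the problem is in NP" buys you); recovering
# them without the secret means search, hopeless at cryptographic sizes.
p, q = 10007, 10009          # the "secret" witness (two small primes here)
N = p * q                    # the public value

# Easy direction: verify the witness.
assert p * q == N and 1 < p < N and 1 < q < N

# Hard direction: recover a factor by naive search.
def smallest_factor(n):
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d
        d += 1
    return n

print(smallest_factor(N))    # finds 10007, but only because N is tiny
```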
zeroonetwothree
> For example, it could be possible that, if we generated a random map in some way, 99% percent of the time, the three-color map coloring problem would be easy, but 1% of the time, it would be very difficult.
> This would still be NP-complete, but would make for a terrible cryptosystem - a scheme based on this problem would be easily broken 99% of the time!
I don't think this explanation is great. We could just generate instances until we found a hard one and then use it. So the answer is more complex than what's written here.
cobbal
The point is not that all NP-hard problems make bad cryptosystems, but rather that the NP-hardness guarantee fails to translate into a useful guarantee for a cryptosystem.
It's about the proof, not the computation.
ethanwillis
Can you explain your usage of NP hard in your comment?
saghm
A problem being NP-hard is one of the two requirements of being NP-complete (the other being that it's in NP). If I remember correctly, problems that are NP-hard without being in NP are undecidable, so in contexts like "what problems can we base cryptography on?", it's basically the same as saying NP-complete, since we aren't able to base cryptography on undecidable problems.
Vervious
How do you know if an instance you picked is hard?
Another problem with your suggestion: do we even know that a 1/polynomial-sized fraction of 3-coloring instances is hard? (Otherwise you'll never sample one.) NP-hardness certainly does not give such a guarantee, even though the requirement sounds weak.
ulrikrasmussen
You don't know if the one you generated is hard. Graph coloring is easy for large but trivially connected graphs, and many, many problem instances turn out to have structure that can be exploited to solve them efficiently. You would basically need a procedure for lower-bounding the runtime of all known algorithms on the problem instance, which is most likely very difficult, and even if that were possible, you don't know whether your particular class of generated problems turns out to be easily solvable by a yet-to-be-discovered algorithm.
Amateur observation, take with a grain of salt: On the other hand, I guess we are doing something similar when generating a new public/private key pair on an elliptic curve. It just turns out that finite fields have very little (known) structure to exploit compared to e.g. colored graphs, so we have more confidence in the assumption that the generated discrete logarithm problem is indeed hard.
goalieca
RSA is kind of shaky like this. You generate a key and then run some tests hoping it's "good". Most of the time it isn't, so you try again. Many products have failed because their keys were not good.
adgjlsfhk1
This mostly isn't true. For much smaller keys there were risks of "weak primes", but as general-purpose factoring algorithms got better and pushed keys bigger, a higher and higher percentage of primes ends up safe. With 2048-bit primes, the odds of your prime being significantly weaker than average drop to something like 1 in 2^1024.
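For one classic sense of "weak prime" (my own toy illustration of what I take this to mean, not necessarily the parent's): if p - 1 has only small prime factors, Pollard's p-1 method factors N = p·q quickly, which is one reason older standards asked for "strong" primes. Toy sizes below.

```python
# Toy sketch of Pollard's p-1 factoring method: if p - 1 is "smooth" (all its
# prime factors are small), N = p*q can be factored far faster than by general
# methods. Here p = 65537, so p - 1 = 2^16 is maximally smooth.
import math

def pollard_p_minus_1(n, bound=1000):
    a = 2
    for j in range(2, bound):
        a = pow(a, j, n)              # after step j, a = 2^(j!) mod n
        g = math.gcd(a - 1, n)
        if 1 < g < n:
            return g                  # nontrivial factor found
    return None

p = 65537                             # "weak": p - 1 = 2^16 has only tiny factors
q = 1000003                           # an ordinary prime
print(pollard_p_minus_1(p * q))       # prints 65537 almost instantly
```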
aleph_minus_one
What is a "weak prime" here (i.e. by which properties is such a prime number characterized)?
ars
How do you know which ones are the hard ones? I don't think it's easy to figure that out.
alexey-salmin
I'd also note that zero-knowledge proofs (which are considered a branch of cryptography) often do rely on NP-complete problems like three-coloring of maps.
Vervious
That's completely unrelated; the reliance there exists because you want to be able to prove "any statement in NP" in zero knowledge. It then suffices to give a protocol for just 3-colorings, because any statement in NP can be reduced to a 3-coloring instance.
saurik
This seems like a definition issue, with respect to the set from which we are randomly sampling... what constitutes the problem? If you randomly select numbers and try to factor them--which to me seems like a fair way to define the problem: factoring, not two-primes factoring--the vast majority of numbers are, in fact, very easy to factor ;P. Instead, we say we only want to randomly select a number which is only two primes multiplied together in order to find the juicy hard versions of the problem. (And, even then, finding those numbers isn't precise! We rely on probabilistic primality tests to weed out numbers that are probably not prime.)
Similarly, who is to say that there isn't some way to define what the hard instances of NP-complete problems are, and we just should be sampling from that difficult region of the problem space instead of all possible problems, the same way we sample only from the difficult-to-factor semiprimes instead of from all possible numbers? There is definitely something interesting going on with the harder instances of these problems, and researchers have managed to discover a kind of "phase transition" among instances of some of the problems between ones with so many solutions they are easy and ones with so few solutions they are also once again easy (and my knowledge on this is about 20 years rusty and outdated... maybe this is better understood and characterized now).
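A quick sketch of the first point (my own toy illustration, parameters arbitrary): sample uniform integers and check for small prime factors; almost all of them have one, which is exactly why RSA has to sample from the two-large-primes corner of the space.

```python
# Toy check of "most random integers are easy to factor": the overwhelming
# majority of uniform 64-bit integers have a prime factor below 10,000
# (about half the samples are simply even).
import random

def has_small_factor(n, limit=10_000):
    # plain trial division, short-circuiting at the first divisor found
    return any(n % d == 0 for d in range(2, limit))

random.seed(0)
samples = [random.randrange(2, 2**64) for _ in range(200)]
easy = sum(has_small_factor(n) for n in samples)
print(f"{easy}/200 random 64-bit integers have a factor below 10,000")
```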
delifue
The RSA problem in the article doesn't mention that RSA's difficulty is based on factoring a modulus that is the product of two large primes, not on factoring arbitrary numbers.
LegionMammal978
Are there any known attack methods that don't involve factoring the public key into two primes?
bmenrigh
Yeah but they usually require RSA to be used in some rather unusual and bad way.
For example, encrypting one message with many different public keys can be broken with the Chinese remainder theorem and Nth roots. This reveals the message without factoring any key. This is why randomized padding (among other things) is a must with RSA.
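A minimal sketch of that attack (commonly called Håstad's broadcast attack), with toy primes of my own choosing: the same unpadded message encrypted under e = 3 with three unrelated moduli is recovered by CRT plus an integer cube root, no factoring required.

```python
# Håstad-style broadcast attack sketch: c_i = m^3 mod N_i for three coprime
# moduli, and m^3 < N_1*N_2*N_3, so CRT recovers m^3 exactly and a plain
# integer cube root recovers m. Toy-sized primes for illustration only.
from functools import reduce

def crt(residues, moduli):
    # Chinese remainder theorem for pairwise-coprime moduli
    N = reduce(lambda a, b: a * b, moduli)
    x = 0
    for r, n in zip(residues, moduli):
        m = N // n
        x += r * m * pow(m, -1, n)   # pow(m, -1, n): modular inverse (Python 3.8+)
    return x % N

def icbrt(n):
    # exact integer cube root by binary search
    lo, hi = 0, 1 << (n.bit_length() // 3 + 2)
    while lo < hi:
        mid = (lo + hi) // 2
        if mid ** 3 < n:
            lo = mid + 1
        else:
            hi = mid
    return lo

e = 3
moduli = [10007 * 10009, 10037 * 10039, 10061 * 10067]   # three unrelated toy RSA moduli
message = 1234567                                         # same message sent to all three
ciphertexts = [pow(message, e, N) for N in moduli]        # textbook RSA, no padding

m_cubed = crt(ciphertexts, moduli)   # equals message**3, since message**3 < product of moduli
print(icbrt(m_cubed) == message)     # True: message recovered, no key factored
```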
fenomas
There's a footnote along these lines that links to the actual algorithm.
stncls
> Instead, cryptography needs problems that are hard in the average case, like the RSA problem, the discrete logarithm for elliptic curves, and the shortest vector problem for lattices. We don’t technically know whether these are NP-complete
But we do know that the shortest vector problem is NP-hard (in Linf norm), and so is its decision version [1]. We don't have a reduction in the other direction, but we know that SVP is as-hard-as-NP-complete-or-harder.
This goes against the general argument of the article, but only weakly so, because I think lattice-based systems are less broadly deployed than elliptic curves or factoring (not a crypto person, though).
[1] H. Bennett, "The Complexity of the Shortest Vector Problem.", 2022 https://simons.berkeley.edu/sites/default/files/docs/21271/s...
less_less
Yeah, SVP is NP-complete for certain parameters. But lattice cryptography uses other parameters where SVP might not be NP-complete. Also, lattice crypto is usually based on some other lattice problem like LWE, MLWE, MLWR, SIVP, etc.
Lattice crypto is going to be broadly deployed because it hopefully can resist quantum attack, unlike ECDLP and factoring.
ccppurcell
The keyword not mentioned here is one-way function. Cryptography is for the most part based on candidates for one-way functions: functions that are easy to compute on any input but difficult to invert on average. The existence of such a function would imply that P is not NP, but the converse is not known to hold.
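For a concrete candidate (my own toy illustration, with a toy-sized prime): modular exponentiation is easy to compute, while inverting it is the discrete logarithm problem, believed hard on average for well-chosen groups.

```python
# Candidate one-way function: f(x) = g^x mod p. The forward direction is
# polynomial time (square-and-multiply); the inverse is the discrete log,
# for which no classical polynomial-time algorithm is known. Toy sizes.
p = 2**61 - 1              # a Mersenne prime; real systems use far larger groups
g = 3

def f(x):
    return pow(g, x, p)    # easy direction: fast modular exponentiation

def brute_force_dlog(y, limit=10**7):
    # the only generic approach shown here: try exponents one by one
    acc = 1
    for x in range(limit):
        if acc == y:
            return x
        acc = acc * g % p
    return None            # gives up; infeasible once exponents are hundreds of bits

y = f(1_000_003)
x = brute_force_dlog(y)
print(x is not None and f(x) == y)   # True, but only because the exponent was tiny
```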
praptak
Another problem with NP-hardness is that it only covers deterministic algorithms.
Even if most instances of a problem are deterministically hard, they might still have a randomized algorithm which solves them feasibly.
tyilo
It might also be the case that P=NP
userbinator
Aren't hash functions a counterexample? There have been attempts at using SAT solvers to find preimage and collision attacks on them.
elchananHaas
Yes, this article is only talking about asymmetric cryptography. Symmetric cryptography and hashes are based on NP-complete problems. This holds only when generalized appropriately; most individual hashes/ciphers are fixed-size, so there isn't really a way to talk about their asymptotic complexity. The broader class of block-cipher-like problems is NP-complete, and very hard.
Vervious
No, the article is talking about symmetric-key crypto too (i.e. one-way functions). You're talking about cryptosystems that don't have proofs of security or reductions to heuristic assumptions.
less_less
Breaking hash functions is in NP, but isn't expected to be NP-complete, meaning that you can't easily encode e.g. a SAT problem into a hash function such that breaking the hash solves the SAT problem.
You can do the reverse (encode the hash into a SAT problem), but that's possible for any NP problem, because SAT is NP-complete.
memkit
Your last statement is misleading. A problem being NP-complete isn't exactly the property that allows you to reduce any NP problem to it. Suppose there were a complexity class MP-hard with no efficient reduction to an NP-complete problem. Then a problem being in MP isn't what allows me to write a reduction from it to the MP-hard problem; I could just as easily write a reduction from a problem in P to the MP-hard problem. Your statement is misleading but incidentally correct, because the true condition (NP-hard or easier) happens to be equivalent to your stated condition (in NP) for this particular complexity class. It would be clearer to simply state that you can always reduce an easy problem to a harder one.
less_less
What?
Being in NP doesn't exclude being in P. Every problem in P is also in NP.
The definition of "NP-complete" is "in NP, and also NP-hard". The definition of "NP-hard" is that you can reduce any NP problem to it using a poly-time mapping reduction (also known as a Karp reduction). So yes, SAT being NP-complete does mean that you can reduce any NP problem to SAT, using a poly-time mapping reduction.
Breaking a hash (e.g. collision finding) is in NP, because you can easily check a proposed solution. Well, with an obvious quibble: P and NP are about asymptotic complexity, but most hash functions are fixed-size. Also if you're looking at complexity theory you might want to talk about targeted collision finding, or first or second preimage resistance, but same deal there. But anyway, supposing you choose a keyed hash that does scale so that you can talk about its asymptotic complexity at all, and has a poly-time cost to evaluate it, breaking that hash would be in NP. Therefore it can be reduced to SAT using a poly-time mapping reduction.
westurner
SAT Solvers, GA, LLMs, brute force
Birthday paradox probability changes with memoization.
Rainbow tables trade CPU for storage at lookup time; but salting, double hashing, and key derivation functions with many rounds, like PBKDF2 and Argon2, defend against them.
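A sketch of those last two mitigations using only the Python standard library (illustrative parameters, not a recommendation):

```python
# Salting makes precomputed rainbow tables useless (each salt would need its
# own table), and a slow KDF like PBKDF2 multiplies the attacker's per-guess
# cost by the iteration count.
import hashlib
import os

password = b"correct horse battery staple"
salt = os.urandom(16)          # random per-password salt, stored next to the hash

stored = hashlib.pbkdf2_hmac("sha256", password, salt, 600_000)

# Verifying a login attempt repeats the derivation with the stored salt.
attempt = b"correct horse battery staple"
ok = hashlib.pbkdf2_hmac("sha256", attempt, salt, 600_000) == stored
print(ok)   # True
```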
impossiblefork
Surely because of Goldreich & Goldwasser's result that public-key encryption based on NP-completeness necessitates NP = co-NP?
cubefox
Does this mean there is a relationship between the NP-completeness of a problem and the worst-case time complexity of algorithms which solve it?
There is one cryptosystem that uses an NP-hard problem for its security: the McEliece cryptosystem. The problem used here is decoding a generalized version of an error-correcting code, which can be made NP-hard both in theory and in practice if you use the right kind of code. It relies on very specific algorithmic parameters to stay hard, though. However, the versions that are not broken are likely post-quantum ready.