What does “Undecidable” mean, anyway
165 comments
· May 28, 2025
mreid
rstuart4133
> This is a really nice explanation of decidability.
I'm not as enthused about it as you. It doesn't mention that every undecidability involves an infinity. What makes a problem undecidable is not that you can't write an algorithm for it; it's that you can't guarantee the algorithm spits out an answer in a finite number of steps.
Take the halting problem. He defines it as "does [the Turing] machine M halt on input i?". Famously, no algorithm can answer that in general. But you can write an algorithm for it, and if you restrict the Turing machine it's analyzing to a finite tape, it will always spit out the correct answer. The problem is that the algorithm constructs a power set of the Turing machine's states. If you run it on a machine with infinitely many states, it potentially needs to construct the power set of an infinity. That's not just an infinity; it's a new infinity, distinguishable from, and in some sense of a higher order than, the one you started with.
I can't call an explanation nice when it omits the central idea that explains what is going on under the hood. His description of universal Turing machines, on the other hand, was nice.
pdpi
> The result becomes less weird when you realize that that almost all functions from string -> boolean are not expressible as a program.
I think this is one of those cases where a maths background makes computer science much easier. It only takes enough calculus to get you to entry level differential equations before you’re confronted with the fact that most functions ℝ → ℝ aren’t elementary functions (or admit any closed-form expression at all). In a certain sense, “program” is really just a weird word for “closed-form expression”.
tliltocatl
IMHO, what just referring to uncountability misses is that not only are most functions inexpressible, many __useful ones__ are too. Most ℝ→ℝ functions (and even most real numbers) are so unremarkable that there is no way to uniquely name them. The fact that some useful functions are undecidable is quite a bit less trivial than "there are more functions than there are programs to evaluate them".
SkyBelow
Isn't this related to why the busy beaver is the fastest-growing function? At some number of states N, a Turing machine would be able to emulate any given closed-form function, and at some greater number of states N + M it could use that function inside more complex functions. (That's only a lower bound: the actual busy beaver of N + M states would likely produce an even larger output than a same-sized Turing machine emulating whichever fast-growing function is used for comparison.)
cyberax
I wonder if this is a correct argument.
Isn't a function string -> boolean always expressible? Simply because the set of all possible mappings from all possible finite strings to booleans is countable.
It's better to say that some functions like "does this program halt?" simply don't exist.
hwayne
`S = {str -> bool}` is actually uncountable. `S` is isomorphic to the power set (set of all subsets) of `str`, and `2^str` is at least as big as the set of real numbers, as any real number (like π) can be mapped to the set of string prefixes (like `{"3", "3.1", "3.14", ...}`). Since the reals are uncountable, so is `2^str` and `S`.
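The prefix-set injection can be made concrete for finite decimal expansions -- a quick Python sketch (purely illustrative; the function name is made up):

```python
# Map a number's decimal expansion (as a string) to its set of prefixes.
# Distinct expansions yield distinct prefix sets, which is the heart of
# the injection from the reals into 2^str.
def prefix_set(digits: str) -> set[str]:
    return {digits[:i] for i in range(1, len(digits) + 1)}

assert prefix_set("3.14") == {"3", "3.", "3.1", "3.14"}
```

Since two different expansions disagree at some position, their prefix sets differ, so the map is injective.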
cyberax
> `S` is isomorphic to the power set
D'oh. I missed that.
turboponyy
> It's better to say that some functions like "does this program halt?" simply don't exist.
Let f : (p: String) -> Boolean equal the function that returns True if p is a halting program, and False otherwise
aeneasmackenzie
This is really just the constructive/classical argument but I want to be specific.
You just named a function and specified a property you want it to have. However, no function with this property meaningfully exists. We can manipulate the symbol just fine, but we can never look inside it, because it's not real. Classical mathematics was developed before computation was relevant, and the question of decidability arose fairly late; it makes more sense to consider it an early attempt at what is now called intuitionistic mathematics. The halting problem disproved excluded middle.
mreid
I think you are experiencing the same confusion I felt when I first started thinking about the difference between a program and a function.
The set of all possible mappings from all possible finite strings to booleans is definitely *not* countable.
What I (and the article) mean by a "function" `f : string -> boolean` here is any arbitrary assignment of a single boolean value to every possible string. Let's consider two examples:
1. Some "expressible" function like "f(s) returns True if the length of s is odd, and returns False otherwise".
2. Some "random" function, where some magical process has listed out every possible string and then, for each string, flipped a coin: it assigned that string True if the coin came up heads and False if it came up tails, wrote down all the results in an infinite table, and called that the function f.
The first type of "expressible" function is the type that we most commonly encounter and that we can implement as programs (i.e., a list of finitely many instructions to go from string to boolean).
Nearly all of the second type of function -- the "random" ones -- cannot be expressed using a program. The only way to capture such a function's behavior is to write down the infinite look-up table that assigns each string a boolean value.
Now you are probably asking, "How do you know that the infinite tables cannot be expressed using some program?" and the answer is because there are too many possible infinite tables.
To give you an intuition for this, consider all the ways we can assign boolean values to the strings "a", "b", and "c": there are 2^3 = 8. For any finite set X of n strings, there will be 2^n possible tables that assign a boolean value to each string in X. The critical thing here is that 2^n is always strictly larger than n for all n >= 1.
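That count is easy to check mechanically -- a throwaway Python sketch enumerating all 2^3 tables:

```python
from itertools import product

strings = ["a", "b", "c"]
# One table per way of assigning a boolean to each string.
tables = [dict(zip(strings, bits))
          for bits in product([False, True], repeat=len(strings))]
assert len(tables) == 2 ** len(strings)  # 8 tables
```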
This fact -- that there are more tables mapping strings to booleans than there are strings -- still holds even when there are infinitely many strings. What exactly we mean by "more" here is what Cantor and others developed. They said that a set A has more things than a set B if, over all possible ways you can pair a thing from A with a thing from B, there will always be things in A that are left over (i.e., not paired with anything from B).
Cantor's diagonalization argument applied here is the following: let S be the set of all finite strings and F be the set of all functions/tables that assign a boolean to each element of S (this is sometimes written F = 2^S). Now suppose F were countable. By definition, that would mean there is a pairing that assigns each natural number to exactly one function from F, with none left over. The set S of finite strings is also countable, so there is also a pairing from natural numbers to all elements of S. This means we can pair each element of S with exactly one element of F by looking up the natural number n assigned to s \in S and pairing s with the element f \in F that was also assigned to n. Crucially, what assuming the countability of F means is that if you give me a string s then there is always a single f_s that is paired with s. Conversely, if you give me an f \in F there must be exactly one string s_f that is paired with that f.
We are going to construct a new function f' that is not in F. The way we do this is by defining f'(s) = not f_s(s). That is, f'(s) takes the input string s, looks up the function f_s that is paired with s, calculates the value of f_s(s) then flips its output.
Now we can argue that f' cannot be in F, since it is different from every other function in F. Why? Well, suppose f' actually were some f \in F. Then, since F is countable, we know it must have some paired string s; that is, f' = f_s for some string s. Now if we look at the value of f'(s), it must be the same as f_s(s), since f' = f_s. But also f'(s) = not f_s(s) by the way we defined f', so we get that f_s(s) = not f_s(s), which is a contradiction. Therefore f' cannot be in F, and thus our premise that F was countable must be wrong.
Another way to see this is that, by construction, f' is different from every other function f in F, specifically on the input value s that is paired with f.
Thus, the set F of all functions from strings to boolean must be uncountable.
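The diagonal construction above can be illustrated on a finite (and therefore necessarily incomplete) enumeration -- a toy Python sketch:

```python
# A finite illustration of the diagonal argument: given any list of
# functions f_0, f_1, ... over strings s_0, s_1, ..., build f' that
# differs from every f_i on input s_i.
strings = ["a", "b", "c", "d"]
# An attempted "complete" enumeration of functions, as lookup tables.
table = [
    {"a": True,  "b": True,  "c": False, "d": False},
    {"a": False, "b": False, "c": True,  "d": True},
    {"a": True,  "b": False, "c": True,  "d": False},
    {"a": False, "b": True,  "c": False, "d": True},
]

# f'(s_i) = not f_i(s_i): flip the diagonal.
f_prime = {s: not table[i][s] for i, s in enumerate(strings)}

# f' disagrees with every listed function on at least one input,
# so it cannot appear anywhere in the enumeration.
assert all(any(f_prime[s] != f[s] for s in strings) for f in table)
```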
drewhk
> The set of all possible mapping from all possible finite strings to booleans is definitely not countable.
Just to add another perspective: this is one of the places where classical and constructive mathematics diverge. Do those functions that are not expressible by algorithms (algorithms are countable) even exist? Of course you can define them into existence, but what does that mean?
Another bit of food for thought: consider whether the limits imposed on computation by the Turing machine are a law of physics. Is this an actual physical limit? If so, what does that mean about the functions not expressible by algorithms? What is so exciting about programs/algorithms is that they are both a well-defined mathematical object suitable for formal analysis and actual machines, fully realizable physically, with properties that are physically falsifiable.
Before anyone starts to nit-pick, I just put this comment here as a conversation starter, not a precise train of thought: this is a deep rabbit hole that I think is worth exploring for everyone interested in the computational world. I am pretty sure other commenters can add more accurate details!
yusina
I appreciate your lengthy explanation and I largely agree with it, even though it probably doesn't help much, because anybody who has not understood this yet will have stopped reading 20% in at the latest. That's not your fault, just an observation that attention spans are short and people strongly prefer to spend their time reading things they are interested in. And anybody interested in this subject who has this much time to spare has likely already done that elsewhere. But I'd be happy to be wrong, and would welcome it if just a single person gained better understanding through your text.
What I would suggest though is to avoid using the term "random" for this. I know you put it in quotes, but that term is so over-misused that we do it a disservice by adding more misuse. Numbers (or other objects) are not random, it's the process that produced them in some context that's random. Random processes have very specific, mathematically well-defined properties, just like the concept of decidability has, and throwing around the term with other meanings really does it a disservice, similar to doing the same for decidability.
What you are probably looking for is a term like "arbitrary".
ummonk
"This to me is a strong "intuitive" argument for why the halting problem is undecidable: a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's too powerful, so we should expect it to be impossible."
I don't buy this. Bcrypt cracking and solving chess are easy already if you don't care about runtime complexity (both have rather trivial finite -- albeit exponential -- runtime algorithms), and it wouldn't be any easier to transform them into halting problems.
jonahx
What do you not buy? He's saying it could be used to do anything for free. He gave four examples. The first two don't have general solutions, and never will. You don't like the last 2 examples because we have algorithms for them, ok. It doesn't undermine the argument. And in any case both things were developed with effort and ingenuity... they weren't spit out of a halting problem calculator.
cvoss
Having a decision algorithm does not make a computation free. It may take exponential time in the input size, doubly exponential time, some time involving Graham's number, ... etc. It may also take up an unreasonable amount of space. Indeed, some problems have such complex algorithms that you should expect the universe to end before the answer is determined and/or the computing device is doomed to collapse into a black hole, making the answer irretrievable.
So the blog post's claims that a hypothetical halting algorithm would solve anything "overnight" are exaggerated and naive.
jonahx
"Free" in the sense of thought, cleverness, insight. You have an "everything" calculator. That is the sense in which it might be intuitive that it couldn't exist.
> So the blog post's claims that a hypothetical halting algorithm would solve anything "overnight" are exaggerated and naive.
I agree "overnight" is misleading in this context. However, I am fairly sure the author is aware of the point you are making.
Animats
Any deterministic system with a finite number of states is decidable, in the halting problem sense. Either it halts or repeats a state. The halting problem only applies for infinite memory.
Now, there are finite state systems where the halting problem is arbitrarily hard. But that's not undecidability. That's complexity. That's a problem in the space where P vs. NP lives.
The article does not make this distinction, and it's important.
yusina
The halting problem applies also for finite-but-unbounded memory.
If you give me a decider that can tell me, for any program with a state space up to size N, whether it halts or not, then I can produce another program with a larger state space which your decider won't be able to decide. This new program doesn't use infinite space -- just more than your decider can handle. You can't produce a single decider that works for all inputs.
It is often OK to approximate "finite but unbounded" with "infinite", and surely anything that can handle infinite inputs will be able to handle finite-but-unbounded inputs too, but in this context the two are not the same.
The term "infinite" is often misused to mean "finite but unbounded" but it is an important distinction.
dooglius
That doesn't sound right. All my decider needs to do is wait for a repeated state (no halt) or a halt. If the state space is finite then one of these is guaranteed to happen.
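That decider is easy to sketch for any finite-state deterministic system -- here in Python, with `step` and the state encoding as illustrative assumptions:

```python
# Decide halting for a finite-state deterministic system by running it
# until it either halts or revisits a state.
def halts_finite(step, state):
    """step(state) -> next state, or None when the machine halts."""
    seen = set()
    while state is not None:
        if state in seen:
            return False  # revisited a state: it loops forever
        seen.add(state)
        state = step(state)
    return True

# A counter that halts at 0...
assert halts_finite(lambda n: None if n == 0 else n - 1, 3)
# ...and a 2-cycle that never halts.
assert not halts_finite(lambda n: (n + 1) % 2, 0)
```

Note the `seen` set is exactly the memory requirement yusina's reply points at: it can need as much space as the system being analyzed has states.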
yusina
And how does your decider recognize that a state has been attained twice? If I make the input system large enough then you don't have enough space in your decider to save all the states which it has observed.
bubblyworld
I don't think this is an important distinction. The point of Turing machines is that you can ask questions like "can all instances of this problem be solved uniformly, and how relatively expensive is it?". This requires infinite memory in a formal sense, because instances of (most) problems can be arbitrarily large.
Yes, if you ask the same question with a fixed (finite) memory restriction on everything, the answer is uninteresting. To me, this is... uninteresting. It tells you nothing about the underlying logical structure of your problem, which is what mathematicians are really trying to get at!
(first and foremost Turing machines are a tool for analysing problems mathematically, not a model of your laptop)
Also note that "states" in your sense are not the same as "states" in a Turing machine (which are just one of the inputs to the transition function). There are Turing machines with fewer than ten thousand states whose behaviour is undecidable in ZFC (https://arxiv.org/abs/1605.04343).
Ukv
> Yes, if you ask the same question with a fixed (finite) memory restriction on everything, the answer is uninteresting. To me, this is... uninteresting. It tells you nothing about the underlying logical structure of your problem, which is what mathematicians are really trying to get at!
> (first and foremost Turing machines are a tool for analysing problems mathematically, not a model of your laptop)
I think what makes Animats's distinction worth stressing is that a lot of people miss this and fall into the trap of making false/misleading claims about the impact of the halting problem on actual computers/software.
For instance, claiming that the halting problem proves there are problems humans can solve that computers fundamentally can't, giving "the halting problem makes this impossible" as reason against adding a halt/loop detector to a chip emulator, making all sorts of weird claims about AI[0][1], or arguably even this author's claim that a halt detector would make bcrypt cracking trivial.
[0]: https://mindmatters.ai/2025/02/agi-the-halting-problem-and-t...
Xcelerate
To be fair to [0], Leonid Levin (co-discoverer of NP-completeness) makes a similar but slightly weaker claim in a more formal sense: that no algorithm or even random process can increase mutual algorithmic information between two strings (beyond O(1)), which includes any process that attempts to increase mutual information with the halting sequence (https://cs-web.bu.edu/fac/lnd/dvi/IIjacm.pdf).
Nevertheless, we clearly do have some finite amount of information about this sequence, evident in the axioms of PA or ZFC or any other formal system that proves an infinite number of programs as non-halting (hence why the Busy Beaver project has been able to provably confirm BB(5)). We presume these systems are truly consistent and sound even if that fact is itself unprovable, so then where exactly did the non-halting information in the axioms of these systems “come from”?
Levin stops short of speculating about that, simply leaving it at the fact that what we have so far cannot be extended further by either algorithmic or random means. But if that’s the case, then either AI is capped at these same predictive limits as well (i.e., in the sense of problems AI could solve that humans could not, both given unlimited resources), or there is additional non-halting information embedded in the environment that an AI could “extract” better than a human. (I suppose it’s also possible we haven’t fully exploited the non-halting information we do have, but I think that’s unlikely since we’re not even sure whether BB(6) is ZFC-provable and that’s a rather tiny machine).
bubblyworld
That's fair, I've run into a lot of misconceptions like that before. And probably believed some of them at earlier points in my life =P
(Roger Penrose himself makes a version of that claim in "the Emperor's New Mind", so we're in good company...)
About the bcrypt thing... yeah, so one way you can do it is to write a program that generates all possible keys starting with the letter 'a' and tests them, halting if one matches (and looping forever otherwise). Then you run your halting oracle on that program, and if it returns true you know the first letter of the key is 'a'.
Do this for each index/letter combination and you can crack bcrypt keys in linear time (in the number of calls to your halting oracle).
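A toy Python sketch of that loop, with Python's built-in `hash` standing in for bcrypt and a bounded brute-force search standing in for the (impossible) halting oracle -- all names here are made up for illustration:

```python
# Recover a key one letter at a time using a "halting oracle".
# search_halts(prefix) answers: would a program that brute-forces all
# keys starting with `prefix` (looping forever on failure) ever halt?
# We can only answer because this toy key space is finite; a real
# oracle for unbounded programs cannot exist.
ALPHABET = "abc"
TARGET = hash("cab")  # stand-in for a bcrypt digest

def search_halts(prefix, max_len=5):
    def candidates(p):
        yield p
        if len(p) < max_len:
            for c in ALPHABET:
                yield from candidates(p + c)
    return any(hash(k) == TARGET for k in candidates(prefix))

def crack():
    key = ""
    while hash(key) != TARGET:
        # Extend the key by the first letter whose search would halt.
        key = next(key + c for c in ALPHABET if search_halts(key + c))
    return key
```

Each oracle call pins down one more letter, so the number of calls is linear in the key length.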
ummonk
Yes, the article is clearly confused about the concepts, given its mention of bcrypt cracking and chess engines.
dmurray
Yes, I thought the article was really good until it got to that point.
> a halt detector can be trivially repurposed as a program optimizer / theorem-prover / bcrypt cracker / chess engine. It's too powerful, so we should expect it to be impossible.
A Turing machine can be trivially repurposed as a bcrypt cracker or chess engine (for the same definition of trivial the author is using), so if it's not intuitive that a computer can do this, then your intuition is wrong.
Program optimizers and theorem provers also exist, of course, but in those cases the problems really are undecidable so no Turing machine is guaranteed to work on any given program or theorem.
bubblyworld
A solution to the Halting problem can be repurposed as a general-purpose theorem prover. The author is correct. You simply write a program that searches all possible valid proofs till it finds the one you are looking for (or maybe doesn't and runs forever). Then you check whether it halts with your Halting solution - if that returns true, you know that a proof exists, otherwise you know that one doesn't.
In other words, you can use the undecidability of first-order logic to prove the undecidability of the halting problem if you like, although it's a bit of a chicken-and-egg thing historically (I believe).
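A sketch of that reduction in Python, with a toy "proof system" and a step-limited runner standing in for the impossible oracle (all names are made up for illustration):

```python
from itertools import count, product

SYMBOLS = "01"

def is_proof_of(candidate, statement):
    # Toy proof checker: a "proof" of a statement is the statement itself.
    return candidate == statement

def searcher(statement):
    # The program handed to the oracle: enumerate all candidate proofs
    # in length order, halting only if one checks out.
    for n in count(1):
        for cand in map("".join, product(SYMBOLS, repeat=n)):
            yield  # one "step" of the program
            if is_proof_of(cand, statement):
                return

def bounded_halts(program, limit=10_000):
    # Stand-in oracle: a step-limited runner. A true halting oracle
    # would need no limit -- and cannot exist, which is the point.
    it = program()
    for _ in range(limit):
        try:
            next(it)
        except StopIteration:
            return True
    return False
```

With a real oracle, `bounded_halts(lambda: searcher(s))` would tell you whether `s` is provable without ever running the possibly-infinite search.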
taeric
I'm curious if you could make an analogy to the idea of "underspecified" in FreeCAD. It isn't that the drawing doesn't exist. It is that you still have some freedom in how long certain parts could be. Crucially, not all. It could be that parts of the drawing are indeed fully specified.
Same can go with programs. You can constrain parts of it enough that you can answer some pretty specific questions. And similar to how a drawing with fewer lines is easier to constrain, a program that has restricted constructs can be easier to constrain.
constantcrying
>I'm curious if you could make an analogy to the idea of "underspecified" in FreeCAD.
That is just solution theory of a non-linear system of equations. The solution space can have exactly one solution (fully constrained), more than one (under constrained) or zero (over constrained).
To be honest I do not really see a connection.
taeric
I meant more as how to mentally model it in a way that can get someone across the line. My assertion is that caring about undecidability is almost certainly a waste of time for most people. That said, the reason we work in small chunks is often so that we can more easily answer questions about programs.
Moving the graphical drawing and constraints over to a symbolic system also helps see how many symbols it can take to cover a simple system.
Of course, the real reason for me thinking on this is that I'm playing with FreeCAD for the first time in a long time. :D
nialv7
One of my favorite insights is that the existence of undecidable problems is the same thing as the uncountability of real numbers.
Too bad the author didn't get into it.
mreid
I had exactly the same reaction, which prompted me to write this comment: https://news.ycombinator.com/item?id=44122045
bubblyworld
...which is the same thing as Rice's theorem, and many other mind-bending results. It's all diagonalization under the hood =)
godelski
I think it helps to understand my namesake's Incompleteness Theorem[0]. They are actually connected [1,2]
1) no consistent system of axioms whose theorems can be listed by an effective procedure (i.e. an algorithm) is capable of proving all truths about the arithmetic of natural numbers.
For any such consistent formal system, there will always be statements about natural numbers that are true, but that are unprovable within the system.
2) the system cannot demonstrate its own consistency.
Any axiomatic system will have true statements that are unprovable. Which means basically any system of logic. If you've made an assumption, you'll end up with this. You're always making assumptions, even if you don't realize it. It's usually a good idea to figure out what those are. I hear a lot of people say "from first principles". If you're not starting from your axioms, you're not starting from first principles. Think Carl Sagan making a pie from scratch.
[0] https://en.wikipedia.org/wiki/G%C3%B6del%27s_incompleteness_...
[1] https://scottaaronson.blog/?p=710
[2] https://cstheory.stackexchange.com/questions/10635/halting-p...
brap
I sometimes wonder if the concept of “intelligence” is going to benefit from a formal model the way “computation” benefited from Turing Machines.
Are there classes of intelligence? Are there things that some classes can and cannot do? Is it a spectrum? Is the set of classes countable? Is it finite? Is there a maximum intelligence? One can dream…
bawolff
Philosophers have been trying to define what it means to be conscious since forever. I think that is informally what you mean here.
If you just mean what problems can it solve, and how quickly, we already have a well developed theory of that in terms of complexity classes -https://complexityzoo.net/Complexity_Zoo
joe_the_user
I don't think either philosophical conceptions of consciousness or theories of computational complexity count as even "efforts to formalize intelligence". They are each focused on something significantly different.
The closest effort I know of as far as characterizing intelligence as such is Stephen Smale's 18th problem.
bawolff
The Wikipedia article is pretty useless here.
The original paper is better, but still seems to be too vague to be useful. Where it isn't vague, it seems to point pretty strongly to computability/complexity theory.
Intelligence means many different things to different people. If we just gesture vaguely at it we aren't going to get anywhere; everyone will just talk past each other.
mystified5016
I think this is more about levels or classifications of intelligence.
If you've ever interacted with a very smart animal, it's easy to recognize that their reasoning abilities are on par with a human child in a very subjective and vague way. We can also say with extreme confidence that humans have wildly different levels of intelligence and intellectual ability.
The question is, how do we define what we mean by "Alice is smarter than Bob". Or more pertinently, how do we effectively compare the intelligence and ability of an AI to that of another intelligent entity?
Is ChatGPT on par with a human child? A smart dog? Crows? A college professor? PhD level?
Of course we can test specific skills. Riddles, critical thinking, that sort of thing. The problem is that the results from a PhD will be indistinguishable from the results of a child with the answer key. You can't examine the mental state of others, so there's no way to know if they've synthesized the answer themselves or are simply parroting. (This is also a problem philosophers have been thinking about for millennia.)
Personally, I doubt we'll answer these questions any time soon. Unless we do actually develop a science of consciousness, we'll probably still be asking these questions in a century or two.
bawolff
> Is ChatGPT on par with a human child? A smart dog? Crows? A college professor? PhD level?
That presumes a total ordering of intelligence. I think the balance of evidence is that no such total ordering exists.
There are things ChatGPT can do that children (or adults) cannot. There are things that children can do that ChatGPT cannot.
At best maybe the Turing test can give us a partial ordering.
I don't think there is much value in viewing "intelligence" as a whole. It's a combination of a multitude of factors that need to be dealt with independently.
skydhash
Intelligence is often a measure of how quickly we can embrace a new model and how effectively we can use it. Building such a model can be done haphazardly or be guided by skill-transfer methodology. Once that's done, intelligence is how well we can select the correct model, filter out the relevant parameters, and then produce a correct answer.
There are a lot of factors there, and more that I haven't specified. But one thing that I believe is essential is the belief that an answer is correct or uncertain.
ryandamm
Purely intuitively, it seems like there should be a connection between the two (computation and intelligence). But I have not formally studied any relevant fields, so I would be interested to hear thoughts from those who have.
wat10000
The known laws of physics are computable, so if you believe that human intelligence is purely physical (and the known laws of physics are sufficient to explain it) then that means that human intelligence is in principle no more powerful than a Turing machine, since the Turing machine can emulate human intelligence by simulating physics.
If there's a nonphysical aspect to human intelligence and it's not computable, then that means computers can never match human intelligence even in theory.
brap
I was thinking the same thing, maybe a level above in the Chomsky Hierarchy…
ninkendo
Kinda related: I've had a hunch for a while that we're going to eventually learn that "the singularity" (AIs improving themselves ad infinitum) is impossible for reasons similar to why the halting problem is impossible. I can't really articulate why, though. It just seems as naive to think "if the AI becomes smarter than us, surely it can make a better AI than we could" as it is to think "if a computer can compute anything, surely it can compute whether this program will halt."
My bet is there is some level of “incompleteness” to what an intelligence (ours or a machine’s) can do, and we can’t just assume that making one means it can become a singularity. More likely we’re just going to max out around human levels of intelligence, and we may never find a higher level than that.
pixl97
>More likely we’re just going to max out around human levels of intelligence, and we may never find a higher level than that.
I've seen people state this before, but I don't think I've seen anyone make a scientific argument for why this could be the case. The human body has a rather tight power and cooling envelope itself. On top of that, we'd have to ask how and why our neural algorithm somehow found the global maximum of intelligence, when we can see that other animals have reached higher local maxima of sensory processing.
What's more, machine intelligence has more room to explore the problem space of survival (aka Mickey7), in that the death of any attached sensors or external network isn't the death of the AI itself. How does 'restore from backup' affect the evolution of intelligence?
Granted, there are limits somewhere, and maybe those limits are just a few times what a human can do. Traversing the problem space in networks much larger than human-sized capabilities might explode in time and memory complexity, or something weird like that.
Dylan16807
For that definition, including the phrase "ad infinitum", it's pretty unlikely.
But a lack of infinities won't prevent the basic scenario of tech improving tech until things are advancing too fast for humans to comprehend.
mystified5016
This is one of those ideas that great minds have been chewing on for all of recorded history.
In modern thinking, we do certainly recognize different classes of intelligence. Emotional intelligence, spatial reasoning, math, language, music.
I'd say the classes of intelligence are finite. Primarily because different types of intelligence (generally) rely on distinct parts of our finite brains.
As for maximum intelligence, I like the angle taken by several sci-fi authors. Generally, when an AI is elevated to the extremes of intelligence, it detaches from reality and devolves into infinite navel-gazing, building simulated realities or forever lost pondering some cosmic mystery. Humans brought to this state usually just die.
For physical, finite entities, there's almost certainly an upper limit. It's probably a lot higher than we think, though. Biological entities have hard limits on energy, nutrients, and physical size beyond which it's just impossible to maintain any organism. Infinities are rarely compatible with biology, and we will always be limited.
Even an AI is still beholden to the practicalities of moving energy and resources around. Your processor can only get so large and energy/information can only move so fast. Intelligence in AI is probably also bounded by physical constraints, same as biological entities.
joe_the_user
I agree the idea of a formal view of intelligence is appealing. The hurdle that any such view faces is that most intelligence seems to involve rough, approximate reasoning.
I think this approximate nature of intelligence is essentially why neural nets have been more successful than earlier Gofai/logic based system. But I don't think that means formal approaches to intelligence are impossible, just they face challenges.
skydhash
One of the biggest boosts to my SWE career was studying the theory of computation and programming language theory. My major was electronic engineering, so I didn't touch those at university. But I used some books to at least grasp the introductory knowledge.
The boost was how easy it became to find the mechanism behind some abstractions. Instead of being amazing and complex tools that you have to use carefully, they became easier to use and reason about. I would say it's very much like the study of form, value, and perspective for an artist, or music theory for a musician. It just makes everything easier. Especially the understanding of automata (regex), context-free grammars (programming languages), and the Turing machine (CPU).
nitwit005
I have to vote the opposite direction. I don't think it's ever been relevant.
As an example, a practical walk-through of what a simple CPU is doing was quite useful for getting an idea of what's going on. The concept of a Turing machine, which was originally intended for mathematical proofs, was not.
skydhash
The CPU is an implementation of theory. It just happens that several of the Turing machine's axioms can be realized physically, and the rest hold true. The nice thing about notation is that it's more flexible than building/creating/executing the thing it describes.
nitwit005
If you look back at the history, CPUs were often designed for specific tasks, and some of the early ones lacked the conditional jumps that you'd normally consider necessary for a Turing machine.
We've made them more general purpose as that sells better, not because people cared about making Turing machines.
andoando
A Turing machine works by going left or right, writing/reading on a tape of 1s and 0s. Your CPU works on RAM, writing and reading 1s and 0s.
A CPU in principle isn't that different; it's largely just using a more sophisticated instruction set: move 5 vs. right, right, right, right, right; swap vs. read, write, right, write; etc.
Definitely not necessary for programming but it's not some completely theoretical mumbo jumbo.
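To make the comparison concrete, here's a minimal sketch of a Turing-machine interpreter in Python. The encoding (transition table as a dict, `None` for blank cells) is an ad hoc illustration, not the standard formal definition:

```python
# Minimal Turing-machine sketch: a transition table maps
# (state, symbol) -> (new_symbol, move, new_state). The machine halts
# when no rule applies to the current (state, symbol) pair.
def run_tm(table, tape, state="start", max_steps=1000):
    cells = dict(enumerate(tape))  # sparse tape; absent cells are blank
    head = 0
    for _ in range(max_steps):
        sym = cells.get(head)            # None means a blank cell
        if (state, sym) not in table:    # no rule: the machine halts
            break
        new_sym, move, state = table[(state, sym)]
        cells[head] = new_sym
        head += 1 if move == "R" else -1
    return [cells[i] for i in sorted(cells)]

# A machine that walks right, flipping each bit, and halts on the blank:
flipper = {
    ("start", 0): (1, "R", "start"),
    ("start", 1): (0, "R", "start"),
}
# run_tm(flipper, [1, 0, 1]) -> [0, 1, 0]
```

Everything a CPU does with named registers and random access is done here by shuffling the head around — the same computation, with vastly more primitive moves.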
Moreover, what's really interesting to me is that Turing came up with the idea by following what he himself was doing as he did computations on graph paper.
DonaldPShimoda
> Definitely not necessary for programming but it's not some completely theoretical mumbo jumbo.
While I'm a big fan of teaching theory, I regret to inform you that the Turing machine is kind of completely theoretical mumbo jumbo. The theoretical equivalent of the modern processor is the Von Neumann machine. Certainly there is a direct connection to be made to the Turing machine, as all computation can be framed as a program run on such a theoretical device, but there is not really much practical use in teaching students that modern computers are Turing machines. There's just too much difference, I think.
The utility of the Turing machine as a pedagogical device stems more from, like, being able to formalize a decidable problem in a particular way, following a rigorous algorithmic approach to computation, thinking about abstraction, etc. I think the lambda calculus is also useful to teach for similar (though also different) pedagogical reasons, but I would never tell students that the lambda calculus "in principle isn't that different" from how their favorite language works. Practically all programming languages can be framed as lambda calculi, but some of them are sufficiently far removed from the theory for that comparison to be mostly useless to the majority of students.
Dylan16807
If the only principle you look at is "does this compute?" then they're not that different. Otherwise they're about as far apart as you can get.
A Turing machine (at least one that isn't designed in some super wacky way to prove a point) has a few bits of internal state and no random access memory. If you want RAM you have to build a virtual machine on top of the Turing machine. If you try to program it directly it's going to be a byzantine nightmare.
Being restricted to tape, and in particular a single tape, makes Turing machines absolutely awful for teaching programming or how a CPU implements an algorithm. Instructions, data, and status all have to fit in the same place at the same time.
It's so bad that Brainfuck is an order of magnitude better, because it at least has separate instruction and data tapes.
From your other comment:
> But as far as understanding computer science, computational theory, etc certainly you'd want to study Turing machines and lambda calculus. If you were say, writing a programming language, it would be nice to understand the fundamentals.
Turing machines are not fundamental. They're just one way to achieve computation that is close to minimal. But they're not completely minimal, and I strongly doubt studying them is going to help you make a programming language. At best it'll help you figure out how to turn something that was never meant to compute into a very very bad computer.
tialaramex
I completely disagree. For example understanding the theory gives me a very powerful bullshit detector because entire categories of problem which might sound merely difficult are in fact Undecidable.
Knowing that actual regular expressions can be recognised by a finite automaton, but that PCRE and similar "LOL, just whatever neat string matching" extensions cannot, means you can clearly draw the line and not accidentally ship a product that promises arbitrary computation for $1 per million queries.
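As a concrete illustration, using Python's `re` module (which supports PCRE-style extensions): a backreference lets you match the language { a^n b a^n }, which the pumping lemma shows no finite automaton can recognise:

```python
import re

# A true regular expression fits a finite automaton. Backreferences
# like \1 below do not: they require remembering an unbounded amount
# of input, so this pattern recognises a non-regular language.
pattern = re.compile(r"(a+)b\1")

# Matches exactly the strings a^n b a^n (equal runs of a's):
assert pattern.fullmatch("aaabaaa") is not None  # n = 3
assert pattern.fullmatch("aabaaa") is None       # unequal runs
```

This is also why PCRE-style engines can exhibit exponential-time "catastrophic backtracking" while a DFA-based matcher is always linear in the input.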
I agree that understanding something of how the CPU works is also useful, but it's no substitute for a grounding in theory. The CPU is after all obliged to only do things which are theoretically possible, so it's not even a real difference of kind.
DonaldPShimoda
> One of the biggest boost to my SWE career was studying theory of computation and programming languages theory.
I totally agree with you.
> My major was electronic engineering, so I didn't touch those at the university.
Unfortunately, ever more computer science graduates lack the same belief, even if they go to "top" university programs.
Of course, I think part of the problem is the way the material is presented; "learn this to pass the exam" without fundamental motivation is pretty much always a bad set-up to get students interested. But, speaking anecdotally, a large part also seems to be a semi-recent surge of undergraduate students who genuinely believe that their academic career is nothing more than a hurdle to be muddled through in service of getting a piece of paper that lets them get a high-paying, low-labor-intensity job. They just don't engage with theory-focused material beyond what's strictly necessary to graduate, and then they dump the thoughts from their minds the second the semester is over.
This manifests in all sorts of ways, but a common one is undergrads telling one another how little their professors know about the "real world" because they use (not even that) outdated technology in their lectures. Or maybe the profs use some less popular language and the students think of it as a waste of time. There's comparatively little self-motivation among modern CS students, where historically I think there was rather a lot.
I suppose I'm not really going anywhere specific with this other than complaining, so maybe I can ask: What do you think it was that helped you realize that learning fundamental theory would be beneficial? (I don't have any indication that you were the same kind of student as these that I mention, since you were in a different major altogether, but I'm always looking for insights to help motivate students to study theory more, and insights can come from anywhere.)
Gruzzy
Would you have book recommendations / resources to share?
skydhash
Introduction to the Theory of Computation by Michael Sipser. There's also his course based on the book on YouTube[0].
Language Implementation Patterns by Terence Parr. It avoids the theory in other books, going for a more practical approach.
Then it was just trying a lot of programming paradigms like functional programming with Common Lisp and Clojure, logic programming with Prolog, array and stack programming with Uiua. And reading snippets of books and papers. It was chaotic.
But the most enlightening lesson was: Formalism is the key. You define axioms, specify rules and as long as you stay consistent, it's ok. The important thing is to remember the axioms and understanding the rules.
[0]: https://www.youtube.com/playlist?list=PLUl4u3cNGP60_JNv2MmK3...
soulofmischief
I'd recommend codifying such axioms, rules and grammar into a domain-specific language, and then writing your logic in that DSL.
It will keep things consistent and allow newcomers to quickly understand the domain, enabling them to contribute without need for deep institutional knowledge.
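As a toy sketch of that idea (the rule names and order fields here are invented examples, not any real system): domain logic codified as data, run by a tiny engine that newcomers can read top to bottom:

```python
# Hypothetical internal DSL: each rule is (name, predicate, action).
# The domain knowledge lives in this table, not scattered in code.
RULES = [
    ("bulk_discount", lambda o: o["total"] > 100,
     lambda o: {**o, "total": o["total"] - 10}),
    ("free_shipping", lambda o: o["total"] > 50,
     lambda o: {**o, "shipping": 0}),
]

def apply_rules(order, rules=RULES):
    """Run each rule whose predicate matches, in table order."""
    for name, applies, action in rules:
        if applies(order):
            order = action(order)
    return order
```

A newcomer only needs to understand the table format to add or audit a rule; the engine stays fixed.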
nixpulvis
Might be hard to approach, but my favorite is Pierce's "Types and Programming Languages": https://www.cis.upenn.edu/~bcpierce/tapl/ (known as TAPL). I was grateful to take a graduate-level class using it in college.
tel
I really like TAPL but would recommend Harper’s Practical Foundations of Programming Languages (PFPL) first (though skip the first 2 chapters I think?).
https://www.cs.cmu.edu/~rwh/pfpl.html
It’s far more directed than TAPL, so I think it’s easier to read from start to finish. TAPL feels better as a reference.
skydhash
TAPL is nice, and also very dense. I haven't fully read it, but my current understanding is the following: both the Turing machine and the lambda calculus only describe how to act on data. They don't actually describe what the data is. So you can craft an algorithm that works well, but it can be supplied with the wrong type of data and go haywire. Your only recourse is to trust the user of your algorithm.
What typing does is define a set of classes of data, then rules between those classes. You then annotate your code with those classes (combining them where possible), and your type system can verify that the relations between them hold. To the type system, your program is data.
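A toy sketch of that idea — the expression format and the two classes ("int", "bool") are invented for illustration — where the checker walks the program as data and enforces the rules before anything runs:

```python
# Tiny type checker over expression trees like ("+", ("lit", 1), ("lit", 2)).
def check(expr):
    """Return the type of an expression, or raise TypeError."""
    op = expr[0]
    if op == "lit":
        # bool is checked first since Python's bool subclasses int
        return "bool" if isinstance(expr[1], bool) else "int"
    if op == "+":
        if check(expr[1]) == check(expr[2]) == "int":
            return "int"
        raise TypeError("+ needs two ints")
    if op == "if":
        if check(expr[1]) != "bool":
            raise TypeError("condition must be bool")
        t = check(expr[2])
        if t != check(expr[3]):
            raise TypeError("branches must agree")
        return t
    raise ValueError(f"unknown op {op}")
```

The rules (e.g. "both branches of an `if` must have the same type") are checked purely structurally, without ever evaluating the program.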
Aicy
Really? I studied it in my undergrad degree, and since then have worked as a SWE and not found it relevant at all really
skydhash
It's the recursive nature of it.
Almost any program can be written in a DSL that solves the class of problems the program solves. So if you consider the user stories of your project as that class of problems, you can come up with a DSL (mostly in the form of pseudocode) that can describe the solution. But first you need to refine the terms down to some primitives (a context-free grammar). Then you need to think about the implementation of the execution machine, which will be a graph of states (automata). The transitions between the states will be driven by an execution machine. The latter need not be as basic as the Turing machine.
But often, you do not need to do all this stuff. You can just use common abstractions like design patterns, data structures, and basic algorithms as a ready-made solution. But you still have to compose them, and if you understand how everything works, it's easier to do so.
drdrey
it depends what you're working on; if you do any form of program analysis (security, compiler stuff) you bump into undecidability every day
theodorethomas
Can someone explain to me why, although we've known for at least 100 years that classical physics is not a correct description of the hardware of the universe, we are still hooked on 90-year-old Turing machines, which cannot physically exist (they violate quantum mechanics) and whose theoretical limitations are, consequently, irrelevant?
irthomasthomas
Ha, I bought https://undecidability.com, but haven't decided what to do with it yet, so for now it just redirects to my public bookmarks.
Aziell
The first time I encountered the halting problem, I was honestly confused. I kept thinking there had to be another solution. But over time, I came to realize it wasn't a technical issue, it was about the limits of computation itself. This article explains the concept of undecidability really well. It breaks things down in a very practical way. That last part, about the limits of how we think, really hit me.
cubefox
There are at least two meanings of "undecidable". The one from computer science is discussed in the blog post. The other, from formal logic, is a synonym for "independent". A proposition (not a property) is independent of some axiomatic theory, with respect to a proof system, if and only if the proposition can neither be proved nor disproved in that theory.
For example, the continuum hypothesis is independent of ZFC. Another way to express this is to say that the continuum hypothesis is undecidable (in ZFC).
Kurt Gödel used this sense of "undecidable" in his famous paper "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I". ("On formally undecidable propositions in Principia Mathematica and related systems I")
Are these two meanings of "undecidable" related in some way? I guess probably yes. But I'm not sure how.
ttctciyf
> Are these two meanings of "undecidable" related
I can't claim to answer this, but you might like to look at A Relatively Small Turing Machine Whose Behavior Is Independent of Set Theory[0] which (amongst other interesting things) does discuss the proposition "ZFC is consistent", which is independent from ZFC, in terms of a Turing Machine which enumerates all possible proofs in ZFC and halts when it finds a contradiction (e.g. a proof of 0=1).
But I don't think there's a simple equivalence here, all the same.
The logician's undecidable is always relative to a set of axioms, whereas the question of whether a property of strings can be decided by some TM doesn't place any constraints on the TM, which is to say the deciding TM is not required to, nor prohibited from, implementing any particular axiom set.
(It's tempting to try and show "provable in ZFC" to be a (CS)undecidable property by having a TM enumerate ZFC proofs and halt on finding a proof or disproof of the input string, and imagine the TM running forever when given a statement of the Continuum Hypothesis. But the TM could proceed by other means - for example [1] shows a proof of CH's independence in a theorem prover, so that a TM incorporating this construction could indeed reject CH statements as unprovable in ZFC. Which is not to say "ZFC-provable" *isn't* (CS)undecidable, just that showing this isn't as simple as constructing a ZFC-proof-enumerator and giving it the CH as input.)
tel
Independent and undecidable aren't quite the same, even in formal logic. Or rather, sometimes they are but it’s worth being specific.
A proposition P being independent of a theory T means that both (T and P) and (T and not P) are consistent. T has nothing to say about P. This may very well be what Gödel was indicating in his paper.
On the other hand, undecidable has a sharper meaning in computation contexts as well as constructive logics without excluded middle. In these cases we can comprehend the “reachability” of propositions. A proposition is not true or false, but may instead be “constructively true”, “constructively false”, or “undecidable”.
So in a formal logic without excluded middle we have a new, more specific way of discussing undecidability. And this turns out to correspond to the computation idea, too.
alok-g
Could I say that 'P is Undecidable' is defined as: It is False that {There exists T such that [(T and P) and (T and not P) are both consistent]}?
tel
Quantifying over T is probably not going to work. In informal terms that reads like "No logic exists where P is independent", which probably wasn't quite what you wanted, but also we can trivially disprove that with T = {}. As long as P is self-consistent, then "not P" should be too.
We're interested in a proposition's status with respect to some theory that we enjoy (i.e. Zermelo–Fraenkel set theory).
cubefox
I still think what I said was correct.
> On the other hand, undecidable has a sharper meaning in computation contexts as well as constructive logics without excluded middle. In these cases we can comprehend the “reachability” of propositions. A proposition is not true or false, but may instead be “constructively true”, “constructively false”, or “undecidable”.
Yes, but that just means that independence/undecidability depend on the proof system, as I said before. So when using a constructive proof system, more statements will turn out to be undecidable/independent of a theory than with a classical one, since the constructive proof system doesn't allow non-constructive proofs, but the classical one does.
tel
Yeah, I agree. "Independence" is fundamentally a property of the formal system you're working within (or really, it's a property of the system you're using and of the axiomatic system under test, the system a proposition would be independent from). I'm holding out a bit to unify that with "undecidability" because undecidability takes on a particular character in constructive systems that happens to align with Turing's notion.
So at some level, this was just an acknowledgement that "undecidability" in this form is well represented in formal logic. In that sense, at least in constructive logics, it's not just a synonym for "independence".
This is a really nice explanation of decidability. One extra thing it might be worth mentioning is that there are many more functions `f : string -> boolean` than there are programs that implement those functions.
When I first encountered this topic I had trouble intuitively understanding how there could not exist an `IS_HALTING` function when it is also just a function that takes in a string (representing a program plus its inputs) and outputs True or False depending on whether it halts or not.
The argument in the article does a great job of showing that `IS_HALTING` cannot exist because it is in some sense "too powerful", but that means there is a mapping f : string -> boolean that cannot be represented as a program, which seems weird if you've been programming for ages and every function you encounter is expressed as a program.
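That "too powerful" contradiction can be sketched in Python. This is a simplification, not a real interpreter: `source` stands in for program text, and `is_halting(src, inp)` is the hypothetical decider:

```python
# If a total, correct IS_HALTING existed, we could build a program that
# does the opposite of whatever the decider predicts about it running
# on its own source -- so no such decider can be correct everywhere.
def make_contrarian(is_halting):
    def contrarian(source):
        if is_halting(source, source):  # predicted to halt on itself...
            while True:                 # ...so loop forever instead
                pass
        return "halted"                 # predicted to loop -> halt
    return contrarian

# Any concrete candidate decider is refuted somewhere. For example, a
# decider that always answers "loops forever" is wrong on contrarian:
always_loops = lambda src, inp: False
contrarian = make_contrarian(always_loops)
assert contrarian("contrarian_source") == "halted"  # it halted after all
```

Whatever candidate decider you plug in, `contrarian` is a counterexample to it, which is exactly the shape of the article's proof.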
The result becomes less weird when you realize that almost all functions from string -> boolean are not expressible as a program. Why? Well, there are countably many programs, since there are only countably many finite-length strings and every program, by definition, is a finite-length string. However, there are uncountably many functions from string -> boolean, since these functions map one-to-one to sets of strings (just let the set be all inputs that map to True), and the set of sets of strings has uncountable cardinality.
This is essentially due to Cantor's diagonalization argument which shows you cannot put all elements in a set X into a 1-1 correspondence with all the subsets of X, even when X is countably infinite. This fact is at the heart of a lot of these computability results since it shows there is a gap between all functions (= any arbitrary subset of finite strings) and a program (= a finite string).
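The diagonal construction itself can be mimicked in code. This is a toy, finite illustration (the real argument quantifies over all strings; the three inputs here are just for show): given any attempted enumeration of string -> bool functions, the diagonal function disagrees with every entry at its own index, so it cannot appear in the enumeration:

```python
# Cantor-style diagonalization over an enumeration of string -> bool
# functions indexed by strings.
def diagonal(funcs):
    """Return a function that differs from funcs[s] on input s."""
    def d(s):
        return not funcs[s](s)
    return d

# A hypothetical (and doomed) attempt at enumerating all such functions:
funcs = {
    "0":  lambda s: True,
    "1":  lambda s: len(s) == 2,
    "00": lambda s: s.startswith("0"),
}
d = diagonal(funcs)
# d disagrees with each enumerated function at its own index, so no
# enumeration can ever contain it:
assert all(d(s) != funcs[s](s) for s in funcs)
```

Applied to the enumeration of all programs, this is exactly why some string -> boolean functions (like `IS_HALTING`) have no program computing them.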