Collecting All Causal Knowledge

60 comments

·September 2, 2025

tgv

This makes little sense to me. Ontologies and all that have been tried and have always been found to be too brittle. Take the examples from the front page (which I expect to be among the best in their set): human_activity => climate_change. Those are such broad concepts that it's practically useless. Or disease => death. There's no nuance at all. There isn't even a definition of what "disease" is, let alone a way to express that myxomatosis is lethal only for European rabbits, not for humans or goldfish.

dr_dshiv

Democritus (b 460BCE) said, “I would rather discover one cause than gain the kingdom of Persia,” which suggests that finding true causes is rather difficult.

s1mplicissimus

"According to the Greek historian Herodotus, Xerxes's first attempt to bridge the Hellespont ended in failure when a storm destroyed the flax and papyrus cables of the bridges. In retaliation, Xerxes ordered the Hellespont (the strait itself) whipped three hundred times, and had fetters thrown into the water."

Not so sure one should take stories about who said something in ancient times at face value ;)

[1] https://en.wikipedia.org/wiki/Xerxes_I

hugh-avherald

Or is less of a hassle.

DrScientist

I totally agree that in the past, years of hammering out an ontology for a particular area just resulted in a common understanding between those who wrote the ontology and a large gulf between them and the people they wanted to use it (everyone else).

What's perhaps different is that the machine, via LLMs, can also have an 'opinion' on meaning or correctness.

Going full circle, I wonder what would happen if you got LLMs to define the ontology....

Xmd5a

>what would happen if you got LLM's to define the ontology.

https://deepsense.ai/resource/ontology-driven-knowledge-grap...

>hammering out an ontology for a particular area just results in a common understanding between those who wrote the ontology and a large gulf between them and the people they want to use it

This is the other side of the bitter lesson, which is just the empirical observation of a phenomenon that was to be expected from first principles (algorithmic information theory): a program of minimal length must get longer if the reality it models becomes more complex.

For ontologists, the complexity of the task increases as the generality is maintained while model precision is increased (top down approach), or conversely, when precision is maintained the "glue" one must add to build up a bigger and bigger whole while keeping it coherent becomes more and more complex (bottom up approach).

tomaskafka

But “disease => death” + AI => surely at least a few billion in VC funding.

taneq

The best thing about this statement is that it can be read as 'the fact that disease causes death, plus the application of AI, will surely lead to billions in VC funding' but it can also be read as 'disease is to death as AI is to a few billion in VC funding'. :D

notrealyme123

Koller and Friedman write in "Probabilistic Graphical Models" about the "clarity test": the value of a state variable should be unambiguous to an all-seeing observer.

States like "human_activity" are not objectively measurable.

To be fair, PGMs and causal models are not the same, but this way of thinking about state variables is an incredibly good filter.

eru

> States like "human_activity" are not objectively measurable.

Well, or at least they would need a heavy dose of operationalisation.

koliber

Exactly. In some cases disease causes death. In others it causes immunity which in turn causes “good health” and postpones death.

Nevermark

Contradictory cause-effect examples, each backed up with data, are a reliable indicator of a class of situations that need a higher chain-effect resolution.

Which is directly usable knowledge if you are building out a causal graph.

In the meantime, a cause and effect representation isn't limited to only listing one possible effect. A list of alternate disjoint effects, linked to a cause, is also directly usable.

Just as an effect may be linked to different causes. Which if you only know the effect, in a given situation, and are trying to identify cause, is the same problem in reverse time.
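As a minimal sketch of that idea (the class name, the `condition` field, and the example conditions are my own assumptions, not anything taken from the dataset), a cause can map to several alternative effects, and the same store can be queried in reverse, from an effect to its candidate causes:

```python
from collections import defaultdict


class CausalGraph:
    """Toy store of cause/effect links; all names here are illustrative."""

    def __init__(self):
        # cause -> list of (effect, condition) pairs
        self.effects_of = defaultdict(list)
        # effect -> list of (cause, condition) pairs, for reverse lookup
        self.causes_of = defaultdict(list)

    def add(self, cause, effect, condition=None):
        """Record one link, indexed in both directions."""
        self.effects_of[cause].append((effect, condition))
        self.causes_of[effect].append((cause, condition))


g = CausalGraph()
# One cause with alternative, mutually exclusive effects.
g.add("disease", "death", condition="no effective immune response")
g.add("disease", "immunity", condition="host recovers")
# One effect with more than one possible cause.
g.add("old_age", "death")

disease_effects = [effect for effect, _ in g.effects_of["disease"]]
death_causes = sorted(cause for cause, _ in g.causes_of["death"])
print(disease_effects)
print(death_causes)
```

Seemingly contradictory links such as disease => death and disease => immunity then sit side by side under the same cause, which is exactly the place where a finer-grained chain would need to be filled in.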

asplake

Agreed. About the strongest we can hope for are causal mechanisms, and most of those will be at most hypotheses and/or partial explanations that only apply under certain conditions.

Honestly, I don’t understand how these so-called ontologies have persisted. Who is investing in this space, and why?

tossandthrow

Ontology, not ontologies, have been tried.

We have quite a good understanding that a system cannot be both sound and complete; regardless, people went straight in to make a single model of the world.

kachnuv_ocasek

> a system cannot be both sound and complete

Huh, what do you mean by this? There are many sound and complete systems – propositional logic, first-order logic, Presburger arithmetic, the list goes on. These are the basic properties you want from a logical or typing system. (Though, of course, you may compromise if you have other priorities.)

lemonwaterlime

My take is that the GP was implicitly referring to Gödel’s Incompleteness Theorems with the implication being that a system that reasons completely about all the human topics and itself is not possible. Therefore, you’d need multiple such systems (plural) working in concert.

Xmd5a

Could you define sound and complete in this context? IIRC Rust's borrow checker is sound (will not mark something dysfunctional as functional) but not complete: some programs would take too long to verify, the checker times out, and compilation fails even though the program is potentially correct.

tossandthrow

The meaning of the word person is ~sound (i.e., well defined) when two lawyers speak.

But when a doctor tells the lawyer that they operated on a person, the lawyer can reasonably say "huh?" because the concept of a person has shifted with the context.

jiggawatts

Even more importantly, it's not even a simple probability of death, or a fraction of a cause, or any simple one-dimensional aspect. Even if you can simplify things down to an "arrow", the label isn't a scalar number. At a bare minimum, it's a vector, just like embeddings in LLMs are!

Even more importantly, the endpoints of each such causative arrow are also complex, fuzzy things, and are best represented as vectors. I.e.: diseases aren't just simple labels like "Influenza". There are thousands of ever-changing variants of just the Flu out there!

A proper representation of a "disease" would be a vector also, which would likely have interesting correlations with the specific genome of the causative agent. [1]

Next thing is that you want to consider the "vector product" between the disease and the thing it infected to cater for susceptibility, previous immunity, etc...

A hop, skip, and a small step and you have... Transformers, as seen in large language models. This is why they work so well, because they encode the complex nuances of reality in a high-dimensional probabilistic causal framework that they can use to process information, answer questions, etc...

Trying to manually encode a modern LLM's embeddings and weights (about a terabyte!) is futile beyond belief. But that's what it would take to make a useful "classical logic" model that could have practical applications.

Notably, expert systems, which use this kind of approach, were worked on for decades and were almost total failures in the wider market because they were mostly useless.

[1] Not all diseases are caused by biological agents! That's a whole other rabbit hole to go down.

Nevermark

That was very well said.

One quibble, and really mean only one:

> a high-dimensional probabilistic causal framework

Deep learning models, aka neural-network-type models, are not probabilistic frameworks. While we can measure on the outside a probability of correct answers across the whole training set, or any data set, there is no probabilistic model.

Like a Pachinko game, you can measure statistics about it, but the game itself is topological. As you point out very clearly, these models perform topological transforms, not probabilistic estimations.

This becomes clear when you test them with different subsets of data. It quickly becomes apparent that the probabilities of the training set are only that. Probabilities of the exact training set only. There is no probabilistic carry over to any subset, or for generalization to any new values.

They are estimators, approximators, function/relationship fitters, etc. In contrast to symbolic, hard numerical or logical models. But they are not probabilistic models.

Even when trained to minimize a probabilistic performance function, their internal need to represent things topologically creates a profoundly "opinionated" form of solution, as opposed to being unbiased with respect to the probability measure. The measure never gets internalized.

bckr

What’s the relationship between what you’re saying and the concepts of “temperature” and “stochasticity”? The model won’t give me the same answer every time.
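For reference, here is a minimal sketch of how temperature is commonly applied at decoding time, assuming plain softmax sampling over made-up scores (the function names and the three logits are hypothetical). The scores themselves are computed deterministically; the randomness enters only when a token is drawn from the rescaled distribution:

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities after dividing by the temperature."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample(logits, temperature, rng):
    """Draw one index at random, weighted by the softmax probabilities."""
    probs = softmax(logits, temperature)
    return rng.choices(range(len(logits)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
rng = random.Random(0)    # seeded so the run is repeatable

low = softmax(logits, temperature=0.1)    # sharply peaked on the top score
high = softmax(logits, temperature=10.0)  # close to uniform
draws = [sample(logits, 1.0, rng) for _ in range(5)]
print(low, high, draws)
```

At low temperature nearly all the mass lands on the highest score, so repeated runs tend to agree; at high temperature the choices are close to a coin flip among the candidates.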

rwmj

Isn't this like Cyc? There have been a couple of interesting articles about that on HN:

https://news.ycombinator.com/item?id=43625474 "Obituary for Cyc"

https://news.ycombinator.com/item?id=40069298 "Cyc: History's Forgotten AI Project"

HarHarVeryFunny

Seems like a subset of CYC - attempting to gather causal data rather than declarative data in general.

It's a bit odd that their paper doesn't even mention CYC once.

pavlov

The sample set contains:

    {
        "causal_relation": {
            "cause": {
                "concept": "boom"
            },
            "effect": {
                "concept": "bust"
            }
        }
    }

It's practically a hedge-fund-in-a-box.

kolektiv

Plus, regardless of what you might think of how valid that connection is, what they're actually collecting, absent any kind of mechanism, is a set of all apparent correlations...

TofuLover

This reminds me of an article I read that was posted on HN only a few days ago: Uncertain<T>[1]. I think that a causality graph like this necessarily needs a concept of uncertainty to preserve nuance. I don't know whether this would be practical in terms of compute, but I'd think combining traditional NLP techniques with LLM analysis may make it so?

[1] https://github.com/mattt/Uncertain
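A rough sketch of what an edge with uncertainty could look like, assuming each link simply carries counts of supporting and contradicting observations. This is not the Uncertain<T> API; the class, its fields, and the counts are hypothetical, and the Beta distribution is just one convenient choice:

```python
import random


class UncertainEdge:
    """Causal link whose strength is a distribution, not a single number."""

    def __init__(self, cause, effect, supporting, contradicting):
        self.cause = cause
        self.effect = effect
        # Beta(supporting + 1, contradicting + 1), i.e. a uniform prior
        # updated with the observed counts.
        self.alpha = supporting + 1
        self.beta = contradicting + 1

    def mean(self):
        """Expected strength of the link."""
        return self.alpha / (self.alpha + self.beta)

    def sample(self, rng):
        """Draw one plausible strength value from the Beta distribution."""
        return rng.betavariate(self.alpha, self.beta)


rng = random.Random(42)
edge = UncertainEdge("disease", "death", supporting=8, contradicting=2)
samples = [edge.sample(rng) for _ in range(1000)]
print(edge.mean(), min(samples), max(samples))
```

Downstream queries can then propagate samples rather than point estimates, at the cost of extra compute per edge.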

notrealyme123

I get some vibes of fuzzy logic from this project.

Currently a lot of research goes in the direction of separating "data uncertainty" from "measurement uncertainty", or "aleatoric/epistemic" uncertainty.

I found this tutorial (though it is for computer vision) very intuitive; it gives a good understanding of how to use those concepts in other fields: https://arxiv.org/abs/1703.04977

9dev

Right. The first example on the site shows disease as a cause, and death as an effect. This is wrong on several levels: There is no such thing as healthy or sick. You’re always fighting off something, it just becomes obvious sometimes. Also, a disease doesn’t necessarily lead to death, obviously.

kaashif

Since you're always going to die, the problem is solved - the implication is true by the right side always being true, and the left side doesn't matter.

9dev

Then it’s correlation instead of causation and the entire premise of a causation graph is moot.

refactor_master

Might as well go ahead and add https://tylervigen.com/spurious-correlations?page=135 from the looks of it.

larodi

Why not use Prolog then? It is the essence of cause and effect in programming, and it can also expound syllogisms.

orobus

The conditional relation represented in Prolog, and in any deductive system, is material implication (¬P ∨ Q), not causation. You can encode causal relationships with material implication but you’re still going to need to discover those causal relationships in the world somehow.
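To make the gap concrete, here is a small sketch that just enumerates the truth table of material implication (the helper name `implies` is my own). The implication holds whenever Q is true, whatever P is, which is why "you always die eventually" makes disease => death vacuously true without saying anything about causes:

```python
from itertools import product


def implies(p, q):
    """Material implication: P -> Q is defined as (not P) or Q."""
    return (not p) or q


# Enumerate all four combinations of truth values for P and Q.
table = {(p, q): implies(p, q) for p, q in product([False, True], repeat=2)}

for (p, q), value in table.items():
    print(f"P={p!s:5} Q={q!s:5} P->Q={value}")
```

The only row where the implication fails is P true and Q false; none of the rows say that P brings Q about.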

cubefox

Conditional statements don't really work because "if A, then B" means that A is sufficient for B, but "A causes B" doesn't imply that A is sufficient for B. E.g. in "Smoking causes cancer", where smoking is a partial cause for cancer, or cancer partially an effect of smoking.

"A causes B" usually implies that A and B are positively correlated, i.e. P(A and B) > P(A)×P(B), but even that isn't always the case, namely when there is some common cause which counteracts this correlation.

Thinking about this, it seems that if A causes B, the correlation between A and B is at least stronger than it would have been otherwise.

This counterfactual difference in correlation strength is plausibly the "causal strength" between A and B. Though it doesn't indicate the causal direction, as correlation is symmetric.
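A quick numeric sketch of the inequality above, using a made-up joint distribution over two binary events (the numbers and the function name are assumptions for illustration only). It computes P(A and B) - P(A)×P(B) and shows that swapping the roles of A and B gives the same value, so the quantity cannot reveal direction:

```python
def correlation_gap(joint):
    """Return P(A and B) - P(A) * P(B) for a joint over (a, b) in {0, 1}^2."""
    p_a = joint[(1, 0)] + joint[(1, 1)]  # marginal P(A)
    p_b = joint[(0, 1)] + joint[(1, 1)]  # marginal P(B)
    return joint[(1, 1)] - p_a * p_b


# Made-up joint probabilities; keys are (a, b) and the values sum to 1.
joint = {(0, 0): 0.5, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.2}

gap = correlation_gap(joint)

# Swap the roles of A and B: the same table read as (b, a).
swapped = {(b, a): p for (a, b), p in joint.items()}
gap_swapped = correlation_gap(swapped)

print(gap > 0, abs(gap - gap_swapped) < 1e-12)
```

Here P(A) = 0.4 and P(B) = 0.3, so the product is 0.12 against a joint probability of 0.2: a positive association, but one that reads identically in both directions.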

bbstats

Causality is literally impossible to deduce...

jack_riminton

Reminds me of the early attempts at hand categorising knowledge for AI

koliber

I wonder how they will quantize causality. Sometimes a particular cause has different, and even opposite, effects.

Alcohol causes anxiety. At the same time it causes relaxation. These effects depend on time frame, and many individual circumstances.

This is a single example but the world is full of them. Codifying causality will involve a certain amount of bias and belief. That does not lead to a better world.

rhizome

"The map is not the territory" ensures that bias and mistakes are inextricable from the entire AI project. I don't want to get all Jaron Lanier about it, but they're fundamental terms in the vocabulary of simulated intelligence.

lwansbrough

I was hoping this would be actual normalized time series data and correlation ratios. Such a dataset would be interesting for forecasting.