
IRS open sources its fact graph

49 comments · October 15, 2025

vineyardmike

Am I being dumb or does this not actually contain the facts about the tax code? Is the /demo/all-facts file supposed to be the “real” facts? Are the XML fact files provided in another location?

It’s pretty cool to see the way that the IRS handles defining and maintaining its tax calculations, but also a machine-readable tax code seems cool too.

ronbenton

I believe the actual IRS tax code implementation is in a separate repo here: https://github.com/IRS-Public/direct-file while the originally linked repo is the fact graph tooling decoupled from the tax implementation.

tyingq

Looks like many of them are the XML files here:

https://github.com/IRS-Public/direct-file/tree/e0d5c84451cc5...

ronbenton

I was just reading through those! A bit dizzying

MangoToupe

As far as I am aware, fact just means shared assumption. This seems entirely reasonable for a tax code.

hedayet

I’ve had frustrating experiences with TurboTax due to its overly complex interface, aggressive data collection under the guise of saving money (which it doesn’t deliver), and a convoluted pricing structure that rivals the IRS’s own complexity.

I hope this initiative is good enough to enable domain experts and good people to build transparent, user-friendly alternatives to challenge TurboTax’s market grip.

Has anyone encountered promising tools or approaches that tackle these pain points?

willis936

DirectFile was quite good for the one year I was able to use it and addressed your concerns. Don't worry, that's since been taken care of.

https://apnews.com/article/irs-direct-file-tax-returns-free-...

j_bum

Just a heads up, your URL 404s

willis936

Thanks. Fixed. I stripped what I thought was a tracker without testing.

somehnguy

TurboTax’s advertising is borderline fraudulent in my opinion.

Freetaxusa.com (no affiliation) is just as good and legitimately free.

babelfish

FreeTaxUSA is legitimately fantastic!

Spooky23

The H&R Block software is better imo.

aliljet

I wonder how this can be used with an LLM to provide interesting tax advice? I'd love to regularly ask questions of the tax code...

Jach

patio11's already saved over $2k apparently, maybe he'll do a more formal write-up at some point. (A couple threads here https://x.com/patio11/status/1977425626584711668 and here https://x.com/patio11/status/1978168404793037087 )

koolba

Any idea what the actual deduction it supposedly found for private school was?

You can pay for K-12 with 529 or Coverdell ESA funds. But neither allows deductions for contributions. Only growth in either is tax free (assuming it’s spent on education expenses).

ryandrake

I guess as long as it's for entertainment purposes only. I'm going to file "actually following tax/legal advice from a potentially hallucinating LLM" under NOPE.

hahahacorn

The super obvious workflow is to query for an idea in natural English and then verify or ask the LLM to provide the paths it was following.

It raises the question of why you assume the parent comment was going to blindly follow the LLM's output.

ronbenton

Makes me wonder if someone has already trained a model on the tax code. Would be interesting for sure.

astrange

Model training data already contains all the text there is[0], so they can already answer questions like this (especially with web search), but they aren't good at tax calculations.

https://arxiv.org/abs/2507.16126v1

[0] but it's quite possible the conversion from HTML to text is bad

kevin_thibedeau

The problem is that the text of US tax code isn't enough to know the correct action to take. The IRS has semi-formal policies based on how it has chosen to interpret the statutes. There are areas of gray that they don't clearly specify. Some of this is in supplementary publications but it still has subjective elements. One example is that settlements for "serious injuries" are regarded as non-taxable income. What constitutes serious is a squishy concept.

TZubiri

You can technically use the language model as a data model. That was the quick hack that started it all: autocomplete on a question produces the answer.

However, it's clear that we are moving towards separating the data from the language model. Even base ChatGPT is given search tools and Python tools instead of producing their results as text, though the tool call itself may be generated by the model.

You can for sure use a pure LLM to ask questions about the tax code, but we'll probably see specific tools that contain only canon law and kosher case law and cite their sources properly. Y'know, instead of hallucinating.

tallowen

It's nice to see an open sourced implementation of the US tax code! This was part of the IRS Direct File codebase that allowed people to file their taxes for free, directly with the IRS. It was canceled earlier this year by the Trump administration. It looks like the Fact Graph was already open sourced a couple months ago, and that version of the Fact Graph lives here: https://github.com/IRS-Public/direct-file/tree/main/direct-f...

I'm curious why a second repository was created for this.

ronbenton

I wonder too. Perhaps the intent is for it to be standalone for general usage and not just as a part of the direct file project?

Twisol

Seems so, according to this file: https://github.com/IRS-Public/fact-graph/blob/main/docs/from...

> The main changes are: [...] converting the fact-graph to a standalone library [...]

infotainment

I'm still disappointed that they got rid of Direct File, such a promising start...

ronbenton

Big W for the tax lobby, big L for the rest of us

astrange

It's still there. They like saying things and not doing them.

https://directfile.irs.gov

So it's always possible they'll just forget to shut it off.

shrinks99

Having talked at length at a conference with one of the Direct File developers, who was fired along with many of the other folks that worked on it, I can assure you that it's no longer being worked on.

The 2024 site remains up so people can file their taxes for that year, but it will no longer be updated.

beej71

I'm far beyond disappointed for that. I'm fucking pissed. Such stupid politicking that makes all of our lives shittier.

mensetmanusman

Build it and release for free.

hk1337

My eyes read Scala but my brain was thinking Clojure, so for the first couple of seconds looking at the source I was a bit confused about why there weren’t any parentheses.

alberth

> As a work of the United States Government, this project is in the public domain within the United States.

What does it mean for the license to say "within the US"?

Does this mean this software cannot be used outside the US?

dragonwriter

> What does it mean for the license to say "within the US"?

It means exactly what it says; you have to read the whole thing (or at least the two sentences before the CC0 1.0 Universal text, which is the operative mechanism by which the second sentence is effected), not a fraction of the first sentence.

> Does this mean this software cannot be used outside the US?

No. The license explains two things:

(1) Without any license, this is automatically public domain in the US because it is a federal government work.

(2) The federal government (as the owner of the copyright at creation outside the United States, at least anywhere that applies the common rules underlying the Berne Convention) waives copyright worldwide, and does so via the CC0 1.0 Universal declaration (the text of which is then included).

So, it is, to the extent that this is legally possible, copyright-free globally.

jandrewrogers

Some countries don't recognize the concept of Public Domain works. In the US, many government works are Public Domain as a matter of law. This creates complications internationally in those countries that don't recognize the legitimacy of Public Domain as a legal concept. Nonetheless, the US still wants to make it available internationally.

To satisfy these conflicting requirements, the US government places it in the Public Domain in the US to satisfy US law. Additionally, they make it available internationally under a license that approximates the intent of Public Domain while still being recognized as a legally valid thing.

ronbenton

Good question. Copyright laws are country-specific, right? So perhaps it is just trying to be clear that there is no license being asserted outside of the US.

dragonwriter

Licenses are offered or granted (they are permissions from the copyright holder), not asserted.

jauntywundrkind

This was such a fun neat part of the Direct File code drop 5 months ago. https://news.ycombinator.com/item?id=44131901

In particular there's a pretty nice inline tutorial that's still there in that release: https://github.com/IRS-Public/direct-file/blob/main/direct-f...

bickfordb

Surprised to learn we still have an IRS

rvitorper

Scala mentioned

ok123456

Why would I want to use this over Prolog/Datalog?

NoahZuniga

Because Prolog/Datalog don't offer a list of questions that you can ask based on context to calculate someone's US taxes.

ok123456

That's the database you consult(). Doing income taxes is well-suited to traditional logic programming.

akerl_

This is a bit like asking "why would I use my car's schematics instead of a wrench".

These are the rules engine's details. You could use them to build the logic and traversal in whatever language you like.
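
To make the "rules engine" idea concrete, here is a minimal sketch in Python. It assumes nothing about the real fact-graph API: `FactGraph`, `define`, `missing`, the fact paths, and the flat 10% rate are all made up for illustration. It only shows the general shape: derived facts are computed from written (user-supplied) facts, and the graph can report which inputs are still unanswered, which is roughly the "list of questions that you can ask based on context" mentioned upthread.

```python
from typing import Callable, Dict, List, Tuple


class FactGraph:
    """Toy fact graph (hypothetical, not the IRS library).

    Written facts are user inputs; derived facts are computed from other facts.
    """

    def __init__(self) -> None:
        self.written: Dict[str, int] = {}
        # path -> (dependency paths, function over the dependency values)
        self.derived: Dict[str, Tuple[List[str], Callable[..., int]]] = {}

    def define(self, path: str, deps: List[str], fn: Callable[..., int]) -> None:
        """Register a derived fact and the facts it depends on."""
        self.derived[path] = (deps, fn)

    def set(self, path: str, value: int) -> None:
        """Record a written fact, e.g. an answer the user typed in."""
        self.written[path] = value

    def missing(self, path: str) -> List[str]:
        """Return the written facts still needed before `path` can be computed."""
        if path in self.written:
            return []
        if path not in self.derived:
            return [path]  # an unanswered input
        deps, _ = self.derived[path]
        needed: List[str] = []
        for dep in deps:
            for m in self.missing(dep):
                if m not in needed:
                    needed.append(m)
        return needed

    def get(self, path: str) -> int:
        """Compute a fact by walking its dependencies; raise if inputs are missing."""
        if path in self.written:
            return self.written[path]
        if path not in self.derived:
            raise KeyError(f"incomplete fact: {path}")
        deps, fn = self.derived[path]
        return fn(*(self.get(dep) for dep in deps))


# Made-up rules in whole dollars; the numbers are not real tax law.
g = FactGraph()
g.define("/taxableIncome", ["/wages", "/deduction"], lambda w, d: max(0, w - d))
g.define("/tax", ["/taxableIncome"], lambda t: t // 10)  # hypothetical flat 10%

g.set("/wages", 50_000)
still_needed = g.missing("/tax")  # the questions left to ask the user

g.set("/deduction", 10_000)
tax = g.get("/tax")
print(still_needed, tax)
```

After only `/wages` is set, `missing("/tax")` reports that `/deduction` is still unanswered; once it is filled in, `/tax` can be computed. A logic-programming system could express the same rules; the sketch just illustrates the dependency traversal being discussed.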