Announcing the data.gov archive

23 comments

·February 7, 2025

black_puppydog

Great to see there's some resistance. What I'm missing from this announcement though is any mention of how they intend to secure this "vault" against the current government. I'm assuming good intentions on the part of Harvard, but keeping this data online against the express will of the government is gonna cost (political) capital. And from what I can see, the archive is hosted by US entities on US-controlled servers on US soil?

This is the same thing that's been bothering me with archive.org lately, by the way. I haven't found a good way to simply (for some reasonable definition of "simple") contribute 10 TiB or so of redundant storage on my (European) home server either. That kind of thing might (have to) serve to ensure tamper-resistance for that data, given the current political climate on both sides of the pond. Any pointers welcome.

bjackman

I think a fully distributed storage system must be the way here. There must be some IPFS type system where Harvard could say "we designated a set of data that we can add to as needed but only delete from with a critical mass of storage providers' consent, here are some instructions for you to add your spare capacity to become a storage provider".
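To make the "add freely, delete only with critical mass" idea concrete, here is a minimal in-memory sketch. It is not how IPFS or any existing system works; the `ReplicatedSet` class, the two-thirds threshold, and the provider names are all hypothetical, and a real system would also need signatures, networking, and actual replication.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class ReplicatedSet:
    """Toy model: items can always be added, but removal needs a quorum of providers."""

    providers: set
    # Hypothetical threshold: at least quorum_num/quorum_den of providers must agree.
    quorum_num: int = 2
    quorum_den: int = 3
    items: dict = field(default_factory=dict)
    delete_votes: dict = field(default_factory=dict)

    def add(self, data: bytes) -> str:
        """Store data under its SHA-256 digest (content addressing) and return the digest."""
        cid = hashlib.sha256(data).hexdigest()
        self.items[cid] = data
        return cid

    def vote_delete(self, provider: str, cid: str) -> bool:
        """Record one provider's consent; delete only once the quorum is reached."""
        if provider not in self.providers:
            raise PermissionError(f"unknown provider: {provider}")
        if cid not in self.items:
            raise KeyError(cid)
        votes = self.delete_votes.setdefault(cid, set())
        votes.add(provider)
        # Integer comparison avoids floating-point rounding at the threshold.
        if len(votes) * self.quorum_den >= self.quorum_num * len(self.providers):
            del self.items[cid]
            del self.delete_votes[cid]
            return True
        return False
```

With three providers and the assumed two-thirds threshold, a single vote leaves the item in place and a second vote removes it. Because items are keyed by their own hash, any provider can also check that what it stores has not been altered.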

EnnEmmEss

If I remember correctly, Harvard has immunity to eminent domain under the Massachusetts constitution. Maybe it has a similar right which would make it immune to such attacks?

lou1306

I beg you, please stop applying rule-of-law mindset against might-makes-right adversaries. It creates blind spots giving the illusion that the attack surface is way smaller than it actually is.

Muskolites are taking on the SSN system without any Congressional oversight as we speak. The President is attacking ius soli which is a Constitutional right. If they decide that sending their sleuths to Cambridge MA to physically destroy this data is in their best interest, they will do so and handle the courts later. Just stop pretending they will play by the book.

globalnode

I'm predicting no more elections for you guys in 4 years. Something makes me think there's gonna be some "reason" to turn them off.

squigz

How do you recommend engaging with the situation then? Throw out the book too?

black_puppydog

That may be so, but given what Trump and Musk have been up to, the situation of the courts, and how they blatantly don't give a f*k about what's constitutional or not, I wouldn't rely on this so-called "immunity".

0xEF

I really hope I am wrong, but I'm planning on seeing some headlines about Musk shutting down Harvard next week over this.

For anyone who still thinks the existing laws, constitutions and policies mean anything to this current regime, prepare to get some whiplash. They are proving that none of that matters if they simply ignore it and do what they want to do anyway.

zombot

> ...how they intend to secure this "vault" against the current government.

Yup, I was about to ask whether Trump could still force them to delete what he doesn't like. Time will tell, I guess.

LadyCailin

In general, no. By withholding federal funds, and by owning Congress and SCOTUS, yes.

lisp2240

Fourth paragraph

fredoliveira

Honestly a shame it has to come to this. Sure, people elected this administration, and I guess that comes with a bunch of things I disagree with. But the removal of years of scientific research and data from the web (paid for by citizens with their taxes) is absolutely unacceptable. Ravaging CDC data, climate data, etc. is horrendous and unforgivable.

zombot

It's today's equivalent of book burnings.

"where they burn books, they will ultimately burn people as well."

Those who delete research will ultimately delete people as well.

https://en.wikiquote.org/wiki/Heinrich_Heine

govideo

From the post: Today we released our archive of data.gov on Source Cooperative. The 16TB collection includes over 311,000 datasets harvested during 2024 and 2025, a complete archive of federal public datasets linked by data.gov. It will be updated daily as new datasets are added to data.gov. This is the first release in our new data vault project to preserve and authenticate vital public datasets for academic research, policymaking, and public use.
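The quoted post does not say how the archive authenticates datasets. As a generic illustration only, one common way for a mirror or downloader to check a copy is to compare its SHA-256 digest with a published one. The function names below are hypothetical, and the assumption that a hex digest is published alongside each file is mine, not the post's.

```python
import hashlib
import hmac


def sha256_of_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_digest(path: str, expected_hex: str) -> bool:
    """Return True if the local copy hashes to the (assumed) published digest."""
    return hmac.compare_digest(sha256_of_file(path), expected_hex.strip().lower())
```

A matching digest only shows the copy equals whatever was hashed; trusting the digest itself still requires getting it from a source independent of the file.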
