Show HN: New search engine and free-FOIA-by-fax-via-web for US veteran records
74 comments
·January 13, 2025
ldoughty
Neat. Submitted a request for my grandfather's records. Some comments:
1) May want to automagically handle input for things like apostrophes, e.g. "O'Hare". It looks like somewhere in the process this data was not preserved/saved/sent, but people will probably try to search with it. Might also want to handle accent marks and whatnot too.
2) The terms & conditions for Step 3, the checkbox at the bottom doesn't have enough contrast when checked. I do not have a disability, and I still found it very faint. Someone with a disability would likely have a lot of trouble (not to mention, it requires scrolling to the bottom to check it in the first place, which isn't awesome for accessibility)
3) I appreciate the warning on the terms and conditions about seeing things you might not want to see. A good reminder for those that might not want to tarnish a memory of someone... Reminds me of the DNA tests for Christmas, or learning about Punnett Squares and genetics, sometimes you might not want to go looking :-)
owenmarshall
> 3) I appreciate the warning on the terms and conditions about seeing things you might not want to see.
I'd echo this. I found that to be exceptionally well-written and helped me understand the records I'd receive were unlikely to be the records I was interested in, so I cancelled at that point.
Your abandon rate at that step could make for interesting reading!
tivert
>> 3) I appreciate the warning on the terms and conditions about seeing things you might not want to see. A good reminder for those that might not want to tarnish a memory of someone... Reminds me of the DNA tests for Christmas, or learning about Punnett Squares and genetics, sometimes you might not want to go looking :-)
> I'd echo this. I found that to be exceptionally well-written and helped me understand the records I'd receive were unlikely to be the records I was interested in, so I cancelled at that point.
I was curious, so clicked through the form far enough to get those warnings:
> ...
> The specific type of FOIA request that you can make through this website is one that asks the VA for a copy of a deceased veteran's Claims File (C-File). This file primarily contains a record of the veteran's contact with the VA (or the veteran's heirs' and family's contact with the VA) specifically regarding veterans' benefits. It may include copies of some of the veteran's service-related records, including entry/induction and separation/discharge documents, but often only to the extent that those records were considered necessary in order to establish their identity or to make a claim for a benefit. A C-File is not the same as an Official Military Personnel File (OMPF), although it sometimes may contain parts of the OMPF within it.
> Many of these C-Files will include medical information and medical claims that were brought by the veteran (or their family or heirs) from before, during, and/or after their service. This often includes basic physical and health information about the veteran, including their height, weight, descriptions of childhood illnesses, past surgeries, notations of scars or distinctive markings, and so on. However, these files might also include medical information that would otherwise be considered private or sensitive, including graphic depictions of injuries, illnesses or diseases, and/or wartime trauma suffered while the veteran was in the service, or after service, or concerning end-of-life care. The file may also include discussions of disabilities, service-related or not, only some of which may have been covered by veterans' benefits, while others may have been denied by the VA, possibly unfairly.
> The file may also include sensitive information about the veteran's mental health, including their experiences with, treatment for, and/or claims for disability for psychological trauma or for mental illnesses. This may include descriptions of what we would today recognize as service-related Post-Traumatic Stress Disorder (PTSD), but which may be listed in the veteran's file with outdated phrases such as "shell shock" or "psychoneurosis, anxiety state" or even more overtly disparaging terms from former military or VA medical personnel. It may also include archaic medical terminology, depictions, presumed causes, or treatments for various types of mental illness. This information may be troubling to read, not just for the sometimes graphic depiction of the veteran's trauma or mental state, but also because of the way it was often treated (or negligently untreated) in their official VA file.
> (For example, in one C-File we've seen, an Army doctor officially "diagnosed" a hospitalized World War I veteran with "hypochondriasis on a constitutionally inferior basis" [sic] before discharging him from the service. This record remained in his file, even after his death in 1973.)
> The file may also include information about the veteran's alcohol use, drug use, and/or tobacco use, or at least to the extent that the veteran reported their "habits" to the Armed Services or to VA personnel or to medical doctors.
> The file may also include information about the veteran's sexual behavior or sexual orientation, including possible military discharge or punishment for same-sex relations or non-heterosexual identity, whether actual or perceived.
> The file may also make explicit note of any venereal diseases or sexually transmitted infections experienced by the veteran, including documentation of ongoing or past treatments for what may have (at the time) been a chronic incurable illness such as advanced or tertiary syphilis. This information may therefore be medically relevant or potentially damaging to the veteran's spouse(s) or partner(s) or other family members.
> The file may also have information about the veteran's financial or educational information, or other typically-private information, particularly if they were using or attempting to qualify for a pension, a disability benefit, a VA Home Loan, the GI Bill of Rights for educational benefits, or other benefits.
> In addition to the veteran's own information, the veteran's C-File may sometimes include information about their non-veteran family members, including their parents and spouses and siblings and sometimes even extended family members, even a veteran's spouse's former spouse. This is generally just basic information, including for example the parents' places of birth or a spouse's date of marriage, but it may include family medical information (including reports of physical or mental health conditions that might be genetic), financial information, educational information, or other details of their lives. The file sometimes contains actual copies of family members' vital records such as birth, marriage, divorce, and death records. While the veteran is deceased, making their C-File largely open to the public under FOIA (as the Privacy Act of 1974 only refers to the files of living people), it is still possible that some of the other people mentioned in that now-open file may still be alive. If you come across sensitive information about a third party referenced in a deceased veteran's file that was (wrongly) not redacted by the VA, you are strongly urged to not disseminate, re-publish, or misuse any part of that information which could affect a living person's privacy.
> You, the FOIA requester, therefore understand that these files might contain all sorts of information which might be considered sensitive, objectionable, upsetting, disparaging, invasive, or otherwise cause you or the veteran's family members or heirs distress. If you are not okay with the possibility of learning this kind of information, then you should not make a FOIA request for this kind of file, and you should hit the cancel button now.
> ...
Asparagirl
Thanks. The original data set, as provided by the VA, has all sorts of data errors and oddities in it. The major ones involving surnames include the inconsistent use of apostrophes in names like O’BRIEN, often written as O BRIEN, and/or vice versa — or the inconsistent formatting of MC and MAC names like MCMAHON as MC MAHON, and/or vice versa. There are also some names where the VA includes an errant dash, not meant to be a hyphen, and other mistakes, as well.
So we try our best to help a user find the veteran even with the dirty data we have. For example, there is code here (using a common NPM package) to convert a user’s potential typed accent marks to a non-accented version of the same letter. In compound surnames we will also break up the surname on a space or a hyphen and search both parts, but not if a surname part is three letters or fewer. It’s imperfect but we have to work with the data we’ve got and can’t and shouldn’t normalize or clean the underlying file.
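For readers curious what that kind of query-side cleanup can look like, here is a minimal sketch in Python. It is not the site's actual code (the site uses an NPM package for the accent step). The function names, the uppercasing, and the choice to skip only the short part rather than skip splitting altogether are all assumptions made for illustration.

```python
import re
import unicodedata

# Assumption: "three letters or fewer" means a part needs at least four
# letters before it is searched on its own.
MIN_PART_LENGTH = 4


def strip_accents(text: str) -> str:
    """Replace accented letters with their unaccented base letters."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def surname_search_terms(surname: str) -> list[str]:
    """Return the full surname plus any long-enough parts of a compound surname.

    Hypothetical helper: the underlying data is left untouched and only the
    user's typed query is expanded.
    """
    # Uppercasing is an assumption, based on the uppercase names quoted above.
    cleaned = strip_accents(surname).upper().strip()
    terms = [cleaned]
    # Break compound surnames on spaces or hyphens.
    parts = [p for p in re.split(r"[ \-]+", cleaned) if p]
    if len(parts) > 1:
        for part in parts:
            if len(part) >= MIN_PART_LENGTH and part not in terms:
                terms.append(part)
    return terms
```

With this sketch, a compound surname is searched as typed and by each long part, while a short prefix such as VAN is never searched alone.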
Suppafly
>names like MCMAHON as MC MAHON, and/or vice versa.
My mom has a Van name and it's hell trying to use government and insurance websites, because they'll take the space out or add one in regardless of what you use when signing up and then fail to find the account when doing a lookup for things like password resets or for activating the account that they created for her.
It'd seem logical that some sort of fuzzy matching for apostrophe and spaces would be built in, but I've yet to find a government site where that's the case.
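A rough idea of what such matching could look like, as a minimal Python sketch: compare names on a key that ignores spaces, apostrophes, and hyphens. The function names are hypothetical, and a real system would probably store this key alongside the original name rather than replace it.

```python
import re

# Characters that tend to be added or dropped inconsistently in surnames:
# whitespace, straight and curly apostrophes, backticks, and hyphens.
_SEPARATORS = re.compile(r"[\s'\u2019`\-]+")


def match_key(name: str) -> str:
    """Build a comparison key that ignores spacing, apostrophes, and case."""
    return _SEPARATORS.sub("", name).casefold()


def same_name(a: str, b: str) -> bool:
    """Treat two spellings as the same name if their keys are equal."""
    return match_key(a) == match_key(b)
```

Under this scheme "Van Buren", "VANBUREN", and "van buren" all share one key, so an account lookup would still succeed after a form silently adds or removes the space.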
patwolf
I previously tried getting military records for my deceased grandfather. From what I can recall, it was complicated by the fact that I wasn't "next of kin", which would limit the data I'd have access to. Even my parent, who was next of kin, would have needed additional paperwork as proof, e.g. a death/marriage certificate for my grandmother.
On one hand, if this works then I'll be happy to have the information I otherwise wouldn't have. But on the other hand, all these processes, no matter how convoluted, exist for a reason. It feels weird bypassing those.
Asparagirl
The processes for getting these very particular records (C-Files, as opposed to something like an OMPF or other better known military records) have been horrendously broken for years. They were almost completely inaccessible from this specific agency (the VBA, inside the VA) for their entire existence. Only 5% of the files have been turned over to NARA, even for records that are very old.
And even now, the “processes” to get the records, as defined by a 58+-year-old law (FOIA) are not really being followed. An agency refusing to process any FOIA requests except by fax (!) is insane, in this day and age. But more specifically, it’s against the law. A letter AND an e-mail are supposed to work. Hence our use of a fax API on this website…
Furthermore, the “requirement” that a FOIA requester must hand-sign the paperwork is absolutely made up by this agency. Hence our signature widget on this website…
Point being, if they’re going to shamelessly ignore or misinterpret the federal law, we are going to just jump through those hoops and say no, we want the files, please do your jobs.
greentxt
But that may be bad actually.
Suppafly
>From what I can recall, it was complicated by the fact that I wasn't "next of kin", which would limit the data I'd have access to.
My state has a process for claiming unclaimed funds that banks and such report to the government and that is what is keeping me from claiming some funds my grandmother has listed on the site. It's not even clear to me what constitutes 'next of kin' legally, presumably it'd be one of her kids, but it's not like we have laws designating the oldest male heir and then on down the list.
toomuchtodo
Your state should have information on their unclaimed property page to contact someone who can explain what documentation is necessary to establish the legal next-of-kin chain from your ancestor to whoever is claiming the property the state is acting as custodian for. Call them! Most are usually very helpful.
wtfssn38
Are you aware that the API appears to be publishing the SSN of each individual? Although I’m aware most SSNs are leaked in one breach or another, I still thought it was customary in the U.S. to attempt to keep such information somewhat protected.
Asparagirl
Yes, that’s on purpose. SSNs of *deceased* people are public, not private. They are never reused. They are available under FOIA from other sources as well, such as the SSDMF.
ldoughty
I agree it probably isn't a great idea to publish it... I'm guessing some malicious actor could find a way to use this information to fiddle with the remaining benefits their family might be receiving...
That said, the _typical_ things an SSN is used for would not be terribly useful for someone that's been dead >2-4 years... Automated checks should flag e.g. credit applications as being for a dead person :-)
toomuchtodo
Social Security Admin publishes a master death file containing a list of SSNs of those who have died. This is how institutions know to close credit, deposit, and brokerage accounts; they subscribe to and consume the file. Some copies of it have been uploaded to the Internet Archive when someone has obtained a copy (cost to obtain from NTIS is prohibitive for individuals, but reasonable for commercial customers).
Broadly speaking, right to privacy evaporates at death, and when the Machine is working properly, SSNs have no value once they have been marked as deceased in SSA's system of record (as that flows through to various gov and commercial systems to ensure benefits cease and next of kin processes kick off for anything of value).
rgrieselhuber
Do voting registration boards check for this too?
hoppyhoppy2
>49 states plus the District of Columbia have voter registration lists and in all of them there is a process for removing deceased voters from the list. (Note: North Dakota does not require people to register to vote.)
https://tracker.votingrightslab.org/issues/voter-list-mainte...
mattw2121
As someone trying to piece together family history, after most of my family has died, I really appreciate this. Any and all efforts to make records available helps with clues. Building an accurate family history is a process of "one more document". This effort is definitely helpful to me. I've already utilized your service to submit a request for my grandfather's records. I'll be spending time searching for other relatives as well. Thanks!
necovek
I believe you should work to limit exposure of sensitive information like SSN: while it's ok to allow search by an exact SSN, you should probably not display it unless the requestor already knows what it is.
OTOH, if you have really successfully worked to make this database public domain and do publish it somewhere (and you did, as I can see at https://archive.org/details/BIRLS_database), this wouldn't be of much help against any malicious actors out there.
But really, it seems the burden is on VA if there are non-deceased persons in the database since they have done a bad job of maintaining the data, and they would be liable for any leakage of information (unless Reclaim the Records was aware of any in particular). Even so, RTR might have put themselves out on the fence for some lawsuits against them too.
Asparagirl
The VA worked to confirm that everyone in this dataset is deceased, in order to satisfy the judge’s order, and produced an internal document about how they did it, which we then FOIAed and posted online too. (It’s up on the site, next to the legal paperwork.) The veterans and their SSNs are believed to have been deceased prior to mid-2020, checked by the VA’s internal datasets as well as public data sets such as the SSDMF. And SSNs of deceased people are *not private*, since they are never reused. The Social Security Administration also makes copies of all deceased people’s original SS-5 applications available to the public under FOIA.
greentxt
Have you ever worried about your impact on veterans? Maybe not a concern?
Asparagirl
The veterans in the data set are all deceased, and I have not heard any complaints from them so far.
fergbrain
Thanks for your efforts in liberating this data so that Ancestry.com isn’t the only one with it!
Reminds me a bit of muckrock.com as well.
Asparagirl
We love MuckRock! And we made the original FOIA request to the VA for this dataset via MuckRock’s platform. You can see the actual screenshot in the “Reclaiming These Records” legal papers section. They also get a shoutout in our colophon for indirectly inspiring the FOIA-by-fax-via-web method, although I believe their site uses e-mail, including interfacing with agency FOIA portals when possible.
ungreased0675
This site feels icky and I’m not quite sure how to articulate why. What is the purpose of this service? Why is it good for the public to have access to detailed records of individual, recently deceased veterans? Isn’t this a gold mine for scammers? Is this project LDS affiliated?
tivert
> This site feels icky and I’m not quite sure how to articulate why. What is the purpose of this service?
Sounds like genealogy, and a small fraction of the documents in a veteran's file would probably be very helpful in fleshing out some basic details of their military service (especially given a fire destroyed many of the original copies of those documents).
The actual medical records part seems inappropriate, though.
> Why is it good for the public to have access to detailed records of individual, recently deceased veterans? Isn’t this a gold mine for scammers? Is this project LDS affiliated?
It seems like it's a gap in FOIA. These records should be available, just not to everyone in the whole world (at least not before, say, 60 years after the veteran's death). It seems legitimate that an appropriately-close family member should be able to request them (similar to restrictions in requesting birth certificates).
irunmyownemail
Agree completely with your comments and the parent's comment. It feels like this shouldn't have been allowed by the courts without better scrutiny and wisdom on the part of the courts.
asacrowflies
Why exactly should arguably one of the MOST important parts of a state get exceptions to the law and transparency based on feelings?
asacrowflies
I need to get a copy of the database asap just for my data archiving neurosis lol. The amount of vitriolic comments here from entitled "service" members is astounding and makes me wish to safeguard it personally. I mean we have people in the comments calling people "extremist" and bringing up founding fathers bullshit ....
Only jarheads seem to think the parental tone of "you don't know what freedom is" actually works.... Maybe because they have been thru boot camp idk.
tivert
> The amount of vitriolic comments here from entitled "service" members is astounding and makes me wish to safeguard it personally.
As someone who's never been in the military and isn't even acquainted with that many people who have been, I think I should give you a heads-up that you are being an entitled asshole.
I think most people would be mad if they found out some of the most private and personal records about them (https://news.ycombinator.com/item?id=42685002) would be made public to anyone who'd care to request them. It's a pretty terrible violation of privacy. Try to think about how you'd feel if something like that was going to happen to you (say, your complete browsing history would be made public, because I'm sure you have one and you almost certainly want to keep it private).
Bjartr
Spent a few minutes trying to find an answer to "why do this?" beyond just implying that it should be done, and the most I was able to find was one sentence buried amongst paragraphs and paragraphs of "what" and "how".
> these materials were largely unknown and inaccessible to historians, journalists, and genealogists
I think it would be worthwhile to lead with that and include a little more detail too.
If there isn't a clear motivation, people will assume the worst.
draftsman
I think it’s critically important to mention that the VA provided all this data to Ancestry.com years ago. According to the newsletter OP linked, Ancestry.com charges $300/year for access to this data. This unfairness is what prompted the lawsuit and ultimate release of data.
Asparagirl
Indeed. This is ABSOLUTELY not the first time we’ve dealt with a government agency (at the local, state, or federal levels) providing a copy of a public dataset to Ancestry.com and not to the general public. Our taxpayer-funded data keeps ending up solely behind a $300/year paywall. It’s not fair.
(Also, the stripped-down version of BIRLS that has been on the Ancestry website for a while now is much smaller and older.)
irunmyownemail
How recent was the last death of a veteran, given to Ancestry.com, compared to what your efforts have now exposed?
toomuchtodo
Tremendous work on this, thanks for the work to unlock these records for the public.
Suppafly
I think it's pretty obvious why this material should be available.
>If there isn't a clear motivation, people will assume the worst.
This is just a weird assumption.
archerjax
My father, grandfathers and all my uncles are here. All deceased. I’m failing to find a use case for social security numbers to be present here. Or any of it to be honest.
hoppyhoppy2
Dead people's SSNs are published on a regular basis by the Social Security Administration (the "Death Master File" or "Social Security Death Index"). Once you're dead it's not really private information.
I agree with you re: living people's SSNs, though.
neilv
This project might be entirely well-intentioned, but some possibilities to be careful of, with this kind of effort:
* Intent is to sell the data, or otherwise "monetize" it, in the techbro sense.
* "Shell" effort of a specific company that wants the data.
* Shell effort of an organized crime group.
* Shell effort of a foreign intelligence agency, or terrorist group.
A while ago, there was a different project, which had the effect of making different US records, which were already reasonably accessible to US citizens and journalists, easily available to foreign adversaries, such as for espionage profiling and blackmail. When that project was promoted on HN, I caught the promoter seeming to use a sockpuppet account in the comments (accidentally using the wrong account to respond to themself), which I found additionally suspicious.
Even when a project is fully honest and with good intentions, we also have to consider the risks of likely other consumers of the data, which include all the possibilities above.
tivert
Somewhat related: there was a fire in 1973 that destroyed the military records of a large fraction of former military personnel at the time: https://en.wikipedia.org/wiki/National_Personnel_Records_Cen....
Asparagirl
Yes, a terrible fire, although there are efforts ongoing to restore some of those files, even reading the data from charred papers and edges with newer technology.
However, these particular files (benefits claims files, or C-Files) are a different type of file and never burned. Better yet, they often have some parts of the veteran’s OMPF that were copied *into* the C-File, to establish eligibility for those benefits — copies that were made before the fire! In other words, these files could serve as partial backups…
Hi HN. I'm the president and founder of a small non-profit called Reclaim The Records that identifies historical and genealogical materials and data sets held by government agencies, archives, and libraries -- and then returns them to the public domain, for free public use.
Back in September 2017, our organization made a Freedom of Information Act (FOIA) request to the US Department of Veterans Affairs (the VA) asking for a copy of a database they maintain called "BIRLS", which stands for the Beneficiary Identification Records Locator Subsystem. While it's not exactly an index of every single post-Civil-War veteran of every branch of the US military, it's possibly the closest thing that exists to it.
BIRLS is a database that indexes all the known-to-the-VA-in-or-after-the-1970s *veterans' benefits claims files*, also called C-Files or sometimes XC-Files. Older veterans' claims files have been moved to the National Archives (NARA), such as the famous Civil War pension files. But 95% of the later benefits claim files, from the late nineteenth century up to today, are still held at the VA, in their warehouses, and still haven't been sent to NARA.
And even if you know these files exist, the VA really doesn't make it easy to get them. The Veterans Benefits Administration (VBA) group within the VA only seems to accept FOIA requests for copies of C-Files by fax (!) and also seems to have made up a whole new rule whereby you have to have an actual wet ink signature on your FOIA request, not just a typed letter.
Well, seven years and one very successful FOIA lawsuit in SDNY against the VA later, we at Reclaim The Records are very proud to announce the acquisition and first-ever free public release of the BIRLS database, AND that we built a new website to make the data freely and easily searchable AND that we even built a free FOIA-by-FAX-API system (with a signature widget, to get around the dumb new not-FOIA rules!) built into our website's search results, that makes it much, much easier for people to finally get these files out of the VA warehouses and into your mailbox. :-)
We also added the ability to do searches through the data for soundalike names, abbreviated names, common nicknames, wildcards, searches by date of birth or death, or ranges of birth and death years, or search by SSN, or by branch(es) of services, or by gender...
For a lot more information about our FOIA lawsuit against the VA for the database, including copies of our court papers and the SDNY judge's order:
https://mailchi.mp/reclaimtherecords/the-birls-database-goes...
As for the tech stuff, actually building the website, the search engine, and its FOIAing capability...well, it has been a pretty fun project to build.
The BIRLS dataset was eventually provided to us by the VA (several years after we originally asked for it...) as a large zip file which, when decompressed via the command line, yielded the hilarious file name of *Redacted_Full.csv*. I then loaded the cleaned CSV data into a MySQL database, and then used a modified version of the Apache Solr search engine to index the data, so that it could become searchable by soundalike names (using Beider-Morse Phonetic Matching), nicknames (using Solr's synonyms feature), partial names (using wildcards), with dates converted to ISO 8601 format to enable both exact date and date range searches, and various other search criteria.
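To illustrate why the ISO 8601 conversion helps with both exact-date and date-range searches, here is a minimal Python sketch. The MM/DD/YYYY source format and the `birth_date` field name are assumptions for illustration only, since the raw layout of the CSV and the real Solr schema are not described here. The range string follows Solr's standard `[start TO end]` syntax for date fields.

```python
from datetime import date, datetime


def to_iso8601(raw: str, source_format: str = "%m/%d/%Y") -> str:
    """Convert a raw date string to ISO 8601 (YYYY-MM-DD).

    The default source_format is a guess; the real CSV may use another layout.
    """
    return datetime.strptime(raw, source_format).date().isoformat()


def year_range_filter(field: str, first_year: int, last_year: int) -> str:
    """Build a Solr-style range filter covering whole calendar years."""
    start = date(first_year, 1, 1).isoformat()
    end = date(last_year, 12, 31).isoformat()
    return f"{field}:[{start}T00:00:00Z TO {end}T23:59:59Z]"
```

Because ISO 8601 dates sort the same way as plain strings and as calendar dates, one stored value can serve an exact match and a range comparison alike.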
The front-end of the website is built with Nuxt and hosted on Digital Ocean's App Platform, with backups of the FOIA request data on the cloud storage service Wasabi. The fax interface for submitting FOIA requests is powered by the Notifyre API. We use Mailchimp to send e-mail newsletters, and their product Mandrill for programmatic e-mail sending. We use Sentry for error monitoring, Better Stack for server logging, and TinyBird to collect FOIA submission analytics.
Enjoy!