Find the oldest line in your repo

42 comments

·January 27, 2025

lb1lf

Not on Git, but I was curious and grepped through the Siemens S7 repository I maintain at work; we've been using the same comment practice since forever, with the date in ISO 8601 format. (Since before ISO 8601 was even a thing!)

Oldest I found?

1986-06-17: Trygve glemte å sjekke om vi deler på null. Fikset.

(Trygve forgot to check whether we divide by zero. Fixed.)

loeber

Incredible. This is like graffiti from Pompeii.

akoboldfrying

Ugh. That is sooo Trygve.

NoMoreNicksLeft

You've got me beat... at a previous job, there were comments from 1991 complaining about how it was ported/rewritten from COBOL.

jamesfinlayson

Ported to what?

js2

You don't need to blame every file. Use `git rev-list` to find your oldest commit.

  git rev-list --reverse --date-order HEAD | head -1  # or
  git rev-list --reverse --author-date-order HEAD | head -1
That's it. Now you have the commit whose tree contains your oldest files. Now use:

  git ls-tree -lr <tree-ish> # where <tree-ish> is the commit ID from above.
To see a particular file:

  git show <commit-id>:path/to/file  # path relative to the repo root; you could also use `git cat-file` here
Caveat: I suppose this doesn't account for files which no longer exist or that have been completely re-written.

https://git-scm.com/docs/git-rev-list

https://git-scm.com/docs/git-show

https://git-scm.com/docs/git-cat-file

https://git-scm.com/docs/gitrevisions
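
The steps above can be wrapped into a few helpers. This is only a sketch: the function names `oldest_commit`, `oldest_files` and `show_oldest_version` are made up for illustration, and it assumes you run them inside a repository with at least one commit on `HEAD`.

```shell
#!/bin/sh
# Hypothetical helper names; run from inside a Git work tree.

# Print the ID of the oldest commit reachable from HEAD (by commit date order).
oldest_commit() {
    git rev-list --reverse --date-order HEAD | head -n 1
}

# List every file recorded in that commit's tree, with blob sizes.
oldest_files() {
    git ls-tree -lr "$(oldest_commit)"
}

# Show one file as it was in that commit.
# $1 is a path relative to the repository root, without a leading slash.
show_oldest_version() {
    git show "$(oldest_commit):$1"
}
```

For example, `oldest_files` followed by `show_oldest_version README.md` would print the first recorded version of a file with that name, if the oldest commit contains one. The same caveat applies: files that were deleted or fully rewritten later still show up here.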

lutherqueen

A similar one-liner to paste into a macOS terminal to get the oldest line for each file extension:

for ext in $(git ls-files | grep -vE 'node_modules|\.git' | awk -F. '{if (NF>1) print $NF}' | sort -u); do echo -e "\n.$ext:"; git ls-files | grep "\.$ext$" | xargs -I {} git blame -w {} 2>/dev/null | LC_ALL=C sort -t'(' -k2 | head -n1; done

OJFord

It's probably almost always going to be a boring config line(s) in the initial commit?

A section header in a pylintrc or Cargo.toml, a Django settings.py var, etc. Or even an import/var in a file that's core enough to still exist, import logging and LOGGER = ... for example.

lionkor

You underestimate the amount of software that starts with CRUFT

skeptrune

I like leaving something like gitlens on so I can see the super old lines ad-hoc when I naturally come across them. It's fun to get glimpses of the past.

cmgriffing

Take my upvote :)

zellyn

In our monorepo (of 101470 Java files, according to

    find . -name '*.java' | wc -l
), I shudder to think how long that would take. For large repos, I imagine you could get quite a bit faster by only considering files created before the oldest date you've found so far.
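
A rough sketch of that pruning idea, for what it's worth. The function name `oldest_line_pruned` is invented here, and the shortcut rests on two assumptions that will not hold in every repo: files were never renamed (blame follows renames, so a line can be older than its current path), and author dates only move forward along history.

```shell
#!/bin/sh
# Print "<epoch-seconds> <path>" for the oldest surviving line among tracked files,
# skipping `git blame` for files that were added after the oldest line found so far.
# Sketch only: paths with newlines or characters Git quotes in `ls-files` output
# are not handled.
oldest_line_pruned() {
    git ls-files | {
        best=""
        best_file=""
        while IFS= read -r f; do
            # Author timestamp of the earliest commit that added this path.
            created=$(git log --diff-filter=A --format=%at -- "$f" | tail -n 1)
            # Assumption: no line is older than the file's first commit,
            # so a file created after the current best cannot win.
            if [ -n "$best" ] && [ -n "$created" ] && [ "$created" -ge "$best" ]; then
                continue
            fi
            # Smallest author-time over all lines of this file.
            oldest=$(git blame --line-porcelain -- "$f" 2>/dev/null |
                awk '/^author-time / { print $2 }' | sort -n | head -n 1)
            [ -n "$oldest" ] || continue
            if [ -z "$best" ] || [ "$oldest" -lt "$best" ]; then
                best=$oldest
                best_file=$f
            fi
        done
        [ -n "$best" ] && printf '%s %s\n' "$best" "$best_file"
    }
}
```

The `git log` call per file is much cheaper than a blame, so on a large repo most files should fall out at the `continue` once an early line has been found.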

verytrivial

Our code base still has ghost comments about code being just so because the NeXT compiler won't accept it any other way. No one has the heart to remove them.

jamesfinlayson

I picked up a project from 1999 a few years ago that still had far pointer macros - I didn't think they were still a thing in 1999 so I'm not sure why they were there to start with. I think I've left them though.

nortlov

I imagine future engineers as archaeologists of software development, in a way, digging through ghost comments like fossils in the code.

chrisweekly

future engineers? archaeology is an essential part of virtually every real-world software project

JensRantil

Not sure why all the lines of code. This is much shorter:

    git ls-files|xargs -n 1 git blame --date=format:%Y%m%d -f |grep -Eo '[0-9]{8}.*' |sort | head -n 1 | sed 's/^[^)]*) \t//'
(on macOS)

js2

Get in the habit of using `ls-files -z` and `xargs -0` to prevent surprises. But there's no need to blame every file:

https://news.ycombinator.com/item?id=42883340
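
If you do want to blame every file anyway, here is a minimal sketch of the NUL-delimited version. The function name `oldest_line_time` is made up; it sorts on the `author-time` field of `git blame --line-porcelain` rather than on a formatted date.

```shell
#!/bin/sh
# Print the author timestamp (epoch seconds) of the oldest surviving line
# in any tracked file. `ls-files -z` and `xargs -0` pass paths NUL-delimited,
# so file names containing spaces or quotes do not break the pipeline.
oldest_line_time() {
    git ls-files -z |
        xargs -0 -n 1 git blame --line-porcelain -- 2>/dev/null |
        awk '/^author-time / { print $2 }' |
        sort -n | head -n 1
}
```

The result is a Unix timestamp; `date -d @<epoch>` (GNU) or `date -r <epoch>` (macOS/BSD) turns it into a readable date.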

jamesfinlayson

Hm, tried this on a Mac but something must be askew - it returned a commit by me in 2022 in a repo that has existed since at least 2017.

pc86

Formatting strikes again

JensRantil

Thanks. Fixed.

hoten

"Initial Commit", 9 years ago (transferred an at-the-time 15-year-old SVN repo)

sigh..

jamesfinlayson

Sigh indeed... at a previous job there was a project that was a port of an Algol project that began in 1992. I have no idea what version control systems were used in its history (wouldn't be surprised if it started with no version control) but the last version control migration was from Team Foundation Service to GitHub and of course it was just a single commit of the then current master. 23 years of history gone.

krick

FWIW, when I ported SVN repos at work, I converted the commit history as well.

notwhereyouare

we hired contractors to move us from SourceGear to git and they said "moving the history would be too hard, so we didn't"

lost probably 10+ years of history

almostnormal

Renaming the company is also fun when its name is used for folders / package paths. The history isn't lost, just unusable.

JensRantil

facepalm

hoten

yeah... wasn't me that did it though. Same group of people did it with a git repository they "recreated" more recently. They just don't know how to software.

I got my hands on the old SVN but it's a few TBs and so I had some trouble unzipping it. Maybe someday I'll patch a branch for blame archeology.

abejfehr

doesn't seem to work on macOS, I get:

  find: illegal option -- t
  usage: find [-H | -L | -P] [-EXdsx] [-f path] path ... [expression]
         find [-H | -L | -P] [-EXdsx] -f path [path ...] [expression]

spatten

I ran into that too. Turns out that you need to add in the path you want to search when invoking the command. If it's your current working directory, use `.`
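
To illustrate: the failing command itself isn't quoted in this thread, so this is only a guess at its shape, assuming it started with something like `find -type f`. The helper name `count_files` is made up.

```shell
#!/bin/sh
# BSD find (macOS) requires an explicit starting path; GNU find defaults to ".".
# Passing the path explicitly works on both.
count_files() {
    # $1 is the directory to search; defaults to the current directory.
    find "${1:-.}" -type f | wc -l
}
```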

lionkor

At my old job, I remember it was some time at the beginning of the 1990s. I was born like 8 years after the code I was working on was written.

kridsdale3

Oldest file I ever had to fix was the same age as me: 1986. Found a bug in 2013. Timezone math.

rozenmd

> README.md 2021-01-28 17:27:57 +1100

Huh, TIL the birthdate of my business was actually a couple of days ago.