Converting a Git repo from tabs to spaces (2016)
82 comments
· May 2, 2025 · mmastrac
js2
It does. See `--ignore-revs-file`:
https://git-scm.com/docs/git-blame
You can configure a default:
git config blame.ignoreRevsFile .git-blame-ignore-revs
GitHub supports it too: https://docs.github.com/en/repositories/working-with-files/u...
I'm really curious though. This is a feature you've wished for: have you never bothered to run `man git-blame`, `git blame --help`, or Google for it? Git has supported it for ages and it's a trivially easy feature to find. Using your own description:
https://www.google.com/search?hl=en&q=git%20skip%20commit%20...
McP
Nice to see ignore-revs getting some love :)
I originally wrote it because I wanted to do a mass-refactoring to llvm-project to change its weird naming convention and "it will mess up git blame" was an objection that was raised. Getting ignore-revs landed took many iterations over several months (thanks Barret!) and at the end of it I felt so drained that I didn't have the energy to do the mass refactoring I originally planned. Oh well. Maybe someday.
mabster
A big thank you! Blame history being correct is something I care quite a bit about and I always add one of these files when I do formatting changes. I think I'm probably the only developer on my teams with this configured on though haha!
jayd16
The annoying thing about git is that you can't really set this kind of stuff up globally for a project w/o digging into some custom hook solutions. They should really have some kind of default config file with all these things. I really don't understand why everything needs to be per user settings ONLY.
npendleton
Whoa, it looks like my old patch is getting fixed up properly now! We might be getting this feature https://lore.kernel.org/git/20250501214057.371711-4-gitster@...
IshKebab
It would be a lot more usable if you could put that info in the commit.
prepend
No. I don’t want the author to make that decision for me. I’d rather git record everything and then I can choose how to view or render it.
Different people have different view preferences.
OptionOfT
That on its own is a security risk, as it would introduce means to hide a commit in the commit itself.
At least with the dotfile you have to make 2 separate transactions.
mmastrac
I've been using Git since the early 2010s and this feature was released in Aug 2019 (https://github.blog/open-source/git/highlights-from-git-2-23...).
You don't think I looked for it for the first 7-8 years of using Git at least a few times and came up empty? Seems a little uncharitable. Hacker News is a place to learn about stuff, not be chided for missing a point note in a release.
Come on man, you've been using HN for almost as long as I have. Be curious, treat people's comments with charity, continue the life-long learning tradition.
Obligatory XKCD lucky 10,000 link: https://xkcd.com/1053/
js2
You're right. My apologies. It wasn't meant as a critique. I've been using git even longer and my memory was that the feature had been there way before 2019. Time flies. Relevant commit:
https://github.com/git/git/commit/ae3f36dea16e51041c56ba9ed6...
joshstrange
`.git-blame-ignore-revs` is probably what you are looking for
Example: https://gist.github.com/kateinoigakukun/b0bc920e587851bfffa9...
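For reference, the ignore-revs file is plain text: one full (unabbreviated) commit hash per line, with `#` starting a comment. Below is a minimal Python sketch that builds such a file's contents. The function name `format_ignore_revs` is made up for illustration, and it assumes a SHA-1 repository (40-character hashes).

```python
import re

# Assumption: SHA-1 repository, so a full object name is 40 hex characters.
FULL_SHA1 = re.compile(r"[0-9a-f]{40}")


def format_ignore_revs(entries):
    """Build the text of a .git-blame-ignore-revs file.

    entries: iterable of (commit_hash, description) pairs.
    Each description becomes a '#' comment line above its hash.
    Abbreviated hashes are rejected, since the file is expected to
    list unabbreviated object names.
    """
    lines = []
    for commit_hash, description in entries:
        commit_hash = commit_hash.strip().lower()
        if not FULL_SHA1.fullmatch(commit_hash):
            raise ValueError(f"not a full commit hash: {commit_hash!r}")
        lines.append(f"# {description}")
        lines.append(commit_hash)
    return "\n".join(lines) + "\n"
```

The returned string would be written to `.git-blame-ignore-revs` at the repository root and committed. Each developer still has to opt in locally with `git config blame.ignoreRevsFile .git-blame-ignore-revs`.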
y-curious
My one gripe with this is that devs need to point their IDE to the file in the IDE settings. When I implemented .git-blame-ignore-revs, I got a lot of people complaining about blame disappearing completely and I had to point them all to editing IDE settings
braiamp
`blame -w` ignores the ones that are described in the article.
PhilipRoman
git-blame-ignore-revs is great, but ultimately a half measure. Replace blame with log -L
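To make the `log -L` suggestion concrete: `git log -L<start>,<end>:<file>` shows the history of a line range instead of only the last commit that touched it. Here is a small Python sketch that builds that command line. The helper name and the path in the usage note are hypothetical.

```python
def log_line_range_cmd(path, start, end):
    """Return the argv for `git log -L<start>,<end>:<path>`.

    start and end are 1-based line numbers, inclusive.
    """
    if start < 1 or end < start:
        raise ValueError("need 1 <= start <= end")
    return ["git", "log", f"-L{start},{end}:{path}"]
```

For example, `subprocess.run(log_line_range_cmd("app/models.py", 10, 20), check=True)` run inside a repository would walk back through every change to lines 10 to 20 of that file, so a reformatting commit is just one entry in the list rather than the end of the trail.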
patrickthebold
Is .git-blame-ignore-revs what you are looking for?
kwk1
See also `git blame -w`
zzo38computer
Note that some files will need tabs such as Gopher menus and Makefile.
joshstrange
> Then, commit! As per Yelp tradition when rewriting every single file in the whole codebase, I attributed the commit to Yelp’s lovable mascot Darwin. It stands out better in git blame, and it preserved the extremely critical integrity of my commit stats.
Interesting, I fully expected this blog post to touch on `.git-blame-ignore-revs` as a way to not "pollute" the git history but I'm not sure when that "came out". I found a GitHub issue from 2021 asking for support to be added to GitHub so it may just be newer.
How do other people feel about this? Massive code changes across the codebase? Where I work some people are (understandably) concerned about it "ruining" `git blame` or IDE tools to blame. It's not useful to see "Converting to spaces!" on every line you want more context on. Yes, you can step further back but that's always been a little awkward for me (at least in IntelliJ) but maybe I'm missing something. I just find it incredibly helpful to understand the context of why a line was last changed and I'd want to skip over any edits like tabs->spaces.
johnmaguire
Added to Git in 2019: https://github.com/git/git/commit/209f0755934a0c9b448408f9b7...
Supported on GitHub in 2022: https://github.blog/changelog/2022-03-24-ignore-commits-in-t...
zahlman
> I fully expected this blog post to touch on `.git-blame-ignore-revs` as a way to not "pollute" the git history but I'm not sure when that "came out".
Per https://news.ycombinator.com/item?id=43869828, it appeared August 2019 - so, indeed too late for OP.
e: Also, FTA:
> Blame is not, in fact, permanently ruined. git blame -w ignores whitespace-only changes.
matsemann
What if one instead rewrote the last commit for each line to use spaces for that line? Or just rewrite the whole history to have used spaces. Might break something in the history if one were to check out an old commit, though. And makes it hard to revert if something breaks due to changing to spaces (impossible to find the offender in the diff).
_Algernon_
>Or just rewrite the whole history to have used spaces.
Ah, yes. The 1984 approach to coding
woodrowbarlow
`git blame -w` ignores whitespace-only changes, for what it's worth.
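A rough way to picture why `-w` helps here: a tabs-to-spaces commit changes nothing on a line except whitespace, so a whitespace-insensitive comparison sees the old and new line as the same. The Python sketch below only illustrates that idea. It is not how Git implements `-w`, and the function name is invented.

```python
def same_ignoring_whitespace(old_line, new_line):
    """True if the two lines are identical once all whitespace is removed.

    Illustration only: this mimics the idea behind a
    whitespace-insensitive comparison, not Git's actual algorithm.
    """
    def squash(line):
        # str.split() with no arguments splits on any run of whitespace.
        return "".join(line.split())

    return squash(old_line) == squash(new_line)
```

A line that went from `"\treturn x"` to `"    return x"` compares equal under this check, so blame can keep attributing it to whoever wrote `return x`. A line that went from `return x` to `return y` does not.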
Alifatisk
Why would you want to convert from tabs to spaces?
diggan
> their mostly-Python codebase had always been indented with tabs
Tabs VS spaces isn't usually very important, but what's more important is that all the stuff is the same way. So if all the other codebases (in the same language) are using tabs, then make everything (in that language) use tabs. Consistency basically :)
gwbas1c
I used to agree with that, until I read this article. I would always use the IDE's default and "not care" as long as the code was consistent.
The problem with tabs is that they render as different widths in different contexts. For example, Visual Studio shows them as 4 spaces by default, but GitHub shows them as 8.
Puts me firmly in the spaces camp now.
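The rendering difference is easy to reproduce with Python's built-in `str.expandtabs`, which pads each tab out to the next tab stop. This is a minimal sketch; the wrapper name and the widths are arbitrary examples.

```python
def render_with_tab_width(line, tab_width):
    """Show how a line looks when tab stops fall every tab_width columns."""
    return line.expandtabs(tab_width)
```

With this, `render_with_tab_width("\tx", 4)` indents `x` by four columns and `render_with_tab_width("\tx", 8)` by eight, from the same bytes on disk. A tab in the middle of a line only advances to the next stop, so `"ab\tc"` at width 4 gets two columns of padding rather than four. That is why tab-based alignment after the indentation tends to break when the width changes.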
InsideOutSanta
> The problem with tabs is that they render as different widths in different contexts
The funny thing is that this is why I prefer them. It means I control how indents render rather than the person who wrote the code.
diggan
> I would always use the IDE's default and "not care" as long as the code was consistent.
I mean, "just use the IDE's default" isn't really agreeing, unless that's what your entire organization does too, and you all use the same IDE :)
Alifatisk
I agree.
mcdonje
Because they're deranged control freaks who need to convert a single character that is semantically a tab into multiple characters that are an opinionated representation of a tab.
Devs: We need to separate concerns and split the view from the model.
Also devs: Someone might view the code differently!!1!
maw
A codebase that's formatted notgivingashittily is an accessibility issue. It's not just deranged control freakism.
Maybe Yelp's codebase was otherwise clean, but aside from golang projects (and the Linux kernel) I've come to associate tabs with unreadable slop code. Maybe your experience is different.
smrq
Forcing a single opinionated tab width is an accessibility issue -- a real one, not a weird heuristic that boils down to "tab fans can't format". I've read multiple accounts from people who need either very small tab widths (to accommodate unusually large font sizes for eyesight reasons without cascading off the side of the screen), or very large tab widths (to accommodate difficulty in seeing indentation differences, again for eyesight reasons).
Defletter
I've been firmly pro-spaces ever since I discovered there was an everlasting war over this, and it came about primarily over documentation. Say you're writing documentation within a /***/ block, so each line is prefixed by three characters. Now say your documentation includes a code snippet. Or let's say that particular sections of the documentation (such as JavaDoc's @see) are indented so each line always starts after the @see. You end up with documentation indented with spaces because it's the only way to ensure consistency. And if you're doing it with your documentation... why not your code too?
However, my conviction has since been tested by Dart which opinionatedly forces you to use two-space indentation. There's no way to disable this and its IDE plugins enforce the style. I just find it so difficult to read, even with Rainbow Brackets. I yearn for Dart to use tabs just so I can configure the tabs to appear as four-space indentation. Or better yet, stop trying to coerce how people write their own code.
ooterness
To make more money:
https://stackoverflow.blog/2017/06/15/developers-use-spaces-...
zahlman
My best guess: using spaces selects for developers who understand how their editor works (which correlates with higher overall cluefulness), because they'd go insane otherwise.
david2ndaccount
Tabs are a control character and have no business being in a text file. Do you use ascii record separator characters too?
yjftsjthsd-h
> Do you use ascii record separator characters too?
The only reason I don't use them is because nothing supports/expects/shows them. The alternate history where we properly use them is a world where CSV isn't needed and we're better off for it.
IvyMike
Galaxy brain: indent using U+001F Unit Separator
imiric
No, but I do use Line Feed and Carriage Return.
rascul
This is the discussion I came here for.
mmastrac
Because it's the one true way, and tabs are WRONG.
Also Vim > Emacs, the new BSG was better than the old BSG, TNG is the best Trek, and all the other hashed-out flamewars of the 90's and 2000's. :)
evbogue
There's a debate about new BSG being better than old BSG?
mmastrac
I posit:
For every topic of A vs B where A and B are related in some way, no matter how small, there exists an argument C where two people take increasingly opposed positions about which is better.
HideousKojima
I actually love the original BSG. And the new one started out strong but the writers clearly didn't have a plan for where they wanted things to go despite the opening credits insisting the Cylons have a plan.
baobun
I would just approach this like text. Something like:
find -type f -name '*.py' -exec sed -i 's/^\t/ /' {} \+
, until you don't see a diff.
Seems simpler to adjust that general approach to whatever codebase and replacement.
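The same text-level approach can be sketched in Python in a single pass that handles nested indentation. This is not the article's tool and not the command above. It assumes a 4-column indent, and the function name is made up. Only leading whitespace is touched, so tabs elsewhere on a line (for example inside string literals) are left alone.

```python
import re

# Leading run of spaces and tabs at the start of a line (may be empty).
LEADING_WS = re.compile(r"^[ \t]*")


def convert_leading_tabs(text, width=4):
    """Replace tabs in each line's indentation with spaces.

    Assumption: one tab corresponds to `width` columns. Mixed
    indentation such as two spaces followed by a tab is expanded
    the way it would render at that width.
    """
    def expand(match):
        return match.group(0).expandtabs(width)

    # Split on "\n" only, so other characters (including "\r") pass through.
    return "\n".join(
        LEADING_WS.sub(expand, line, count=1) for line in text.split("\n")
    )
```

Running it a second time on its own output changes nothing, which is a quick check that the conversion is complete.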
gwbas1c
FYI: If you're in the .net ecosystem, you can choose your tabbing style (tabs or spaces) with an .editorconfig file. Then running "dotnet format" will change everything for you. (And, if you use GitHub, you can create actions to assert that the .editorconfig is followed.)
diggan
FWIW: EditorConfig isn't a ".net ecosystem" thing but works across a ton of languages, editors and IDEs: https://editorconfig.org/
Also, rather than using GitHub Actions to validate if it was followed (after branch was pushed/PR was opened), add it as a Git hook (https://git-scm.com/docs/githooks) to run right before commit, so every commit will be valid and the iteration<>feedback loop gets like 400% faster as you don't have to wait for Actions to finish.
gwbas1c
Git hooks require environment-specific configuration. CI enforcement makes sure that everyone follows the rules, even if they "forget" to set up the git hook.
Also: dotnet format is kind of slow, which is why git hooks aren't used where I work.
diggan
> CI enforcement makes sure that everyone follows the rules, even if they "forget" to set up the git hook.
Yeah, my wording was a bit poor (shouldn't have said "rather"), both are needed, one just helps you fix stuff faster :)
And if you write your hook in a language that can cross-compile and can easily deal with multiple platforms (Go, Rust, NodeJS, many options [probably .net too?]), it's really easy. Just need to make the setup of them part of the onboarding.
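As a sketch of what such a check could look like, here is a small Python version that can back both a pre-commit hook and a CI step. The function names and the output format are assumptions for illustration, not an existing tool.

```python
def tab_indented_lines(text):
    """Return 1-based numbers of lines whose indentation contains a tab."""
    bad = []
    for number, line in enumerate(text.split("\n"), start=1):
        # Indentation is whatever precedes the first non-space, non-tab character.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            bad.append(number)
    return bad


def check_files(paths):
    """Print offending lines and return a shell-style exit status (0 = clean)."""
    failed = False
    for path in paths:
        with open(path, encoding="utf-8", errors="replace") as handle:
            for number in tab_indented_lines(handle.read()):
                print(f"{path}:{number}: tab in indentation")
                failed = True
    return 1 if failed else 0
```

A hook would pass the staged file names to `check_files` and exit with its return value, and CI would do the same over the whole tree. Files that legitimately need tabs, such as Makefiles, would have to be left out of the list.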
s09dfhks
what is this furry tomfoolery
gwbas1c
One funny anecdote: I once did a similar cleanup on a codebase that was mostly spaces, but a few tabs slipped in. (I just did a find and replace on \t -> " ")
Suddenly, one unit test broke. On closer inspection, whoever wrote it put a tab character into a string. I changed the test to use \t.
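One way to catch this before a blanket find-and-replace is to list the lines where a tab appears after the indentation, since those are the places where the tab may be data rather than formatting. This is a minimal Python sketch with a made-up function name.

```python
def lines_with_embedded_tabs(text):
    """Return 1-based numbers of lines that contain a tab past the indentation."""
    hits = []
    for number, line in enumerate(text.split("\n"), start=1):
        body = line.lstrip(" \t")  # drop the indentation, keep the rest
        if "\t" in body:
            hits.append(number)
    return hits
```

Each reported line can then be reviewed by hand, for instance to swap a literal tab inside a string for the `\t` escape before converting the indentation.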
user9999999999
whitespace is a terrible block scope definition, it's literally using 'invisible' characters to determine block scope! just use semicolons. LONG LIVE SEMICOLONS ;;;;;;;;;;;
kgwxd
I never understood why programmers universally like fixed width fonts, but then about half want just 1 of those characters to be batshit crazy.
gwbas1c
> One way or another, you must get this block in your devs’ Git configuration
Uhm, things like this should be enforced in CI. IE, as a rule that must pass in order for a pull request to be merged.
I wish Git had a way to "skip" a commit for blame for mechanical changes like this. It's the one big shortcoming I keep running into. A commit should be able to be marked as "blame-free" and git blame should then walk up to the parent commit.
It might be expensive to compute but man it would be so useful.
Edit: TIL about .git-blame-ignore-revs. I am the 1 in 10000 for this one today, thanks.