Lesson Learned - I Can't Get My Git Repo Clean!

Published March 7, 2013

I recently ran into an issue on one of our projects with a Git repository that stumped me for a few days. It was a small project: only three developers committing to a single repository hosted on Pantheon. I kept running into an issue where I (or any of the other developers) could never get a local repository to a “clean” state.

That is, I’d run “git status” and see:

$ git status
# On branch master
#
# Changes not staged for commit:
# (use "git add/rm <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: image.JPG
#
# no changes added to commit (use "git add" and/or "git commit -a")

Hmmm, that’s odd, as I didn’t modify image.JPG. No worries, let me just do a “git checkout” on it...

$ git checkout image.JPG
$ git status
# On branch master
#
# Changes not staged for commit:
# (use "git add/rm <file>..." to update what will be committed)
# (use "git checkout -- <file>..." to discard changes in working directory)
#
# modified: image.JPG
#
# no changes added to commit (use "git add" and/or "git commit -a")

Whaaa? But I just checked it out? How can it be modified?

Well, as you may be guessing by the capitalized “.JPG” extension, the root of this problem was with case-sensitivity.

I called upon my Git mentor, David Rogers, to help me figure out where the problem was and how to remedy it. David explained that the three of us developers were on case-insensitive machines (Macs), while some of the commits were being made on a case-sensitive server. Pantheon has a feature where files SFTP’d up to the server are automatically committed, and our client-developer was making his commits this way.

The problem was caused because one of our developers committed the same file with two different (according to case-sensitive Linux) filenames to the repo, “image.jpg” and “image.JPG”. So, as David eloquently explained:

So Linux sees "FOO.JPG" and "foo.jpg" as two separate files... But Mac / Win see them as the SAME file... And Git, being a Linux tool, sees them as different, because it includes the name of the file in the hashing algorithm. If the hashes don't match, there's a difference. If the name is not the exact same, the hashes won’t match.
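To check whether a repository already contains paths that differ only by case, a minimal sketch along these lines may help. The function name find_case_collisions is made up for illustration; it lowercases each path it reads on stdin and reports any that then appear more than once.

```shell
# Hypothetical helper: report paths that collide when case is ignored.
# Reads one path per line on stdin and prints the lowercased form of
# every path that occurs more than once after lowercasing.
find_case_collisions() {
  tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

Inside a repository, piping the tracked paths through it with `git ls-files | find_case_collisions` should list one line per collision; an empty result suggests there are no case-only duplicates.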

Brilliant - we had our cause, so what was the solution? It was actually a two-parter.

First, we needed to correct the repo on a case-sensitive machine so that there was only one copy of the (lowercase-named) image. We did this by using “git rm” to remove the duplicate, upper-cased file, then committed the change, then had our developers pull the commit.
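On a case-sensitive machine, that first part might look roughly like the sketch below. The function name remove_case_duplicate and the commit message are placeholders rather than the exact commands used on the project, and the sketch assumes you are inside the repository and that the upper-cased path is the one to drop.

```shell
# Hypothetical helper: remove one case-duplicate from the index and the
# working tree, then commit the removal.
# Run on a case-sensitive filesystem (e.g. Linux), inside the repository.
# $1: path of the duplicate to drop, e.g. image.JPG
remove_case_duplicate() {
  git rm --quiet -- "$1" &&
    git commit --quiet -m "Remove case-duplicate file $1"
}
```

Calling `remove_case_duplicate image.JPG` and then running `git push` would make the fix available for the other developers to pull.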

If their local repos were still “dirty”, all that needed to be done was to locally drop the offending file, then check it back out using “git checkout”.
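That local cleanup can be sketched as below, assuming the upstream fix has already been pulled. The helper name restore_tracked_file is hypothetical; it deletes the working copy and asks Git to write the tracked version back out.

```shell
# Hypothetical helper: discard the local copy of a tracked file and
# restore it from what Git has recorded.
# $1: the path as Git tracks it, e.g. image.jpg
restore_tracked_file() {
  rm -f -- "$1" &&
    git checkout -- "$1"
}
```

After `restore_tracked_file image.jpg`, a fresh “git status” should no longer list the file as modified.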

The key step here is fixing the upstream repository on a case-sensitive machine. Without doing this, we’d never fix the core issue. Or, as David explained, we’d have a classic “who’s on first” routine on all of our case-insensitive machines:

user: git, what's our status?
git: the file named FOO has changed...
user: what's changed about FOO?
git: it looks different.
user: uh, okay... mac, delete FOO.
mac: (silently) deleted "foo"
user: okay, git, what's our status?
git: you're missing two files: "foo" and "FOO"...
user: wait, what? Okay, check out "foo" for me...
git: (silently) checked out "foo"
user: okay, great now what's our status?
git: the file named FOO has changed...
user: this again!? mac, what files are in this directory?
mac: there's a file called "foo"
user: git, what's the status of file "foo"...?
git: the file named FOO has changed...
user: MUST KILL ROBOT!!!!

So, the moral of the story - always keep your filenames lowercase to avoid issues like this, and if you do encounter an issue, fix it on a case-sensitive machine.

David Rogers is a professional software developer, speaker, trainer, and organizer of the OrlandoPHP user group. If you’d like more git-based comedy routines or at least an entertaining solution to your development problems, you can find him on the internets.

Comments

When I got my MacBook and read about the filesystem, I started out by shrinking my partition and putting a case-sensitive file system on the second half for all my "unix-y/webdevelopment" stuff, on a hunch that problems like these would pop up.

I'm pretty happy with having the same case sensitivity that's present on my webservers. It isolates me from problems, and sometimes introduces new problems too - but I think they're fewer.

(Like having to debug MacPorts scripts which reference the same file as uppercase in one place, and lowercase in another place. That creates errors on my system which are not encountered on a standard MacOS partition.)

Thanks for the post... Looking for a way to delete all uppercase duplicates in Linux. Closest I've found online:
find . -maxdepth 1 -print0 | sort -z | uniq -diz

Submitted by Devin (not verified) on Fri, 07/01/2016 - 14:38
