Gitignore Files: The Filesystem MVP

If you’ve seen me talk about Drupal or are an avid reader of the blog, you probably know I do a lot of work with Acquia’s Build and Launch Tools (BLT). Recently BLT launched its 12th major release to support Drupal 9.x. During that release process there was a significant re-architecting of some of the functionality in BLT (and as a result, a fair amount of functionality was broken out into separate plugins). Dane Powell, BLT’s current maintainer, wrote up a great update guide that covers this in more detail if you’re interested.

As part of this re-architecting, the long-used blt-project was sunsetted in favor of Acquia’s drupal-recommended-project. As with any new project, a few minor tweaks have been made along the way, and one of them had to do with the .gitignore file. You know what? An improperly formatted .gitignore file makes life really difficult.

So in this post I wanted to talk a little bit about why the .gitignore file is your filesystem MVP.

What is a .Gitignore File?

There are git commands for just about anything you could possibly want to do. That’s one of the reasons git is so popular and so powerful. But there isn’t actually a command to ignore something. If you have something you don’t want git to manage for you, you have to create a rule. And that rule lives in a special dot file called .gitignore.
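To make that concrete, here is a minimal sketch of creating a rule and checking that git honors it. It assumes git is installed and works in a throwaway directory; the file names (README.md, notes.tmp) and the *.tmp pattern are made up for illustration.

```shell
# Work in a throwaway repository so nothing real is touched.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Two hypothetical files: one we want tracked, one we do not.
echo "hello" > README.md
echo "scratch" > notes.tmp

# The "rule" is just a pattern on its own line in .gitignore.
echo "*.tmp" > .gitignore

# List the untracked files git still sees (notes.tmp is not among them).
untracked="$(git status --porcelain --untracked-files=all)"
echo "$untracked"

# Ask git which rule, if any, causes a given path to be ignored.
matched="$(git check-ignore -v notes.tmp)"
echo "$matched"
```

The last command is handy when a rule isn’t behaving the way you expect, since it reports the file and line number of the pattern that matched.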

For more information, check out Atlassian’s documentation (which is pretty thorough). Keep in mind that .gitignore files are a git construct and not specific to any particular git host. So GitHub, GitLab, Bitbucket, your local git, etc. all respect .gitignore files (and respect them in exactly the same way). So that’s pretty cool!

Why Should You Use a .Gitignore File?
(Why is it the MVP?)

I’ve talked a bit in previous posts about critical files that should be included in your repositories. It’s just as important (if not more so) to keep the wrong files out of your git repo and the .gitignore is the last line of defense for this.

In general, I would say the files you shouldn’t commit fall into a few categories:

  1. They’re redundant

  2. They need to be regenerated as part of another process

  3. They’re actually not useful

  4. They are risky

Redundant Files

Many of the files you “could” commit in a repository are redundant due to other processes. Examples of these might be dependencies managed by dependency managers like composer or npm. You want these tools to manage which versions of your dependencies are built / deployed and not git. So you shouldn’t store the files in your repo! By excluding these files you also cut down on your repository size by a significant amount. Like tens or hundreds of thousands of files. This makes cloning and pushing significantly faster and it makes the management of these dependencies significantly simpler.

Common examples of this might be:

# Ignore drupal core.
docroot/core
# Ignore contrib modules. These should be created during build process.
docroot/modules/contrib
docroot/themes/contrib
docroot/profiles/contrib
docroot/libraries
drush/Commands
# Dependencies
vendor/
docroot/themes/custom/*/node_modules

They Should be Regenerated

Examples here are files that are compiled as part of a build process, like minified JavaScript and CSS files. It’s a best practice to never commit these files, as you want to re-compile / re-minify them anytime you’re doing a build or deployment to ensure that you have the most current version.

Common examples of this might be:

# Ignore custom theme build artifacts
docroot/themes/custom/*/css
docroot/themes/custom/*/styleguide
docroot/themes/custom/*/js/dist

Note that in this case you probably would commit the unminified JavaScript and the SCSS files (if you’re using SCSS). You just don’t want to commit the minified JS and compiled CSS.
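If you want to confirm that the wildcard rules catch the compiled output but leave the source files alone, a quick check along these lines can help. This is only a sketch: it assumes git is installed, and the theme name (mytheme) and the file names inside it are hypothetical.

```shell
# Throwaway repository for the experiment.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Two of the rules from the list above.
printf '%s\n' \
  'docroot/themes/custom/*/css' \
  'docroot/themes/custom/*/js/dist' > .gitignore

# "mytheme" is a made-up theme name; so are the files below.
theme="docroot/themes/custom/mytheme"
mkdir -p "$theme/css" "$theme/scss" "$theme/js/dist" "$theme/js/src"
touch "$theme/css/style.css" "$theme/scss/style.scss" \
      "$theme/js/dist/app.min.js" "$theme/js/src/app.js"

# check-ignore prints only the paths that match an ignore rule.
ignored="$(git check-ignore \
  "$theme/css/style.css" "$theme/scss/style.scss" \
  "$theme/js/dist/app.min.js" "$theme/js/src/app.js")"
echo "$ignored"
```

Only the compiled CSS and the minified JS should come back as ignored; the SCSS and the unminified source stay eligible for commit.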

Another example here might be a complete build artifact. Meaning you want to build out something that has all the dependencies, all of the compiled front end code, sanitized for actual production use, etc. Again, in this case, you want to generate the build artifact without committing it into your development repository.

Not Useful Files

The next category is files that are actually not useful. Prime examples might be a .DS_Store generated by macOS or configuration files/folders generated by applications (like your IDE). Does it strictly speaking hurt to have these files in the repo? Probably not. But these files represent a lot of noise that will make your git repo larger and clutter your pull requests / git diffs. They should never be in the repo. Ever.

Common examples of this might be:

# OS X
.DS_Store
.AppleDouble
.LSOverride
# Thumbnails
._*
# Files that might appear on external disk
.Spotlight-V100
.Trashes
# Windows image file caches
Thumbs.db
ehthumbs.db
# Folder config file
Desktop.ini
# Recycle Bin used on file shares
$RECYCLE.BIN/

Risky Files

The final category is less easy to prescribe but represents one of the most critical. Many projects rely on API keys or other credentials that are highly sensitive. Having these in git history could lead to a compromised application / webserver / etc. It’s definitely a best practice to gitignore any files that contain these secrets and might be “accidentally” committed.

An example from the Cloudflare Drupal Module:

# Ignore cloudflare CMI export file because it includes API keys.
config/default/cloudflare.settings.yml

In Conclusion

Obviously your mileage may vary depending on what application you’re building, but the principles here should apply quite broadly to software development. Here’s a complete .gitignore example for a Drupal project. Yes, there are a lot of rules there. Also yes, many of these rules might be unnecessary (e.g. why have the $RECYCLE.BIN/ rule in there if no one on the team is developing on Windows?).

The critical thing here with a gitignore is you want to anticipate files before they end up in your repo. It’s much better to have an overly restrictive gitignore that you can gradually loosen the requirements on than to have a blank one where you are constantly realizing “oh we shouldn’t have added this.” Remember, once you commit something into git and push it, it’s incredibly challenging to get it back out again. This is particularly important to remember for those items under the “risky” category. If you push a password into a public git repository, consider it compromised.
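Here is a small sketch of why “after the fact” is so much harder. It assumes git is installed and runs in a throwaway repository; the file name (secrets.yml), its contents, and the commit identity are all made up for illustration.

```shell
# Throwaway repository for the experiment.
repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# A hypothetical secrets file gets committed by mistake.
echo "api_key: not-a-real-key" > secrets.yml
git add secrets.yml
git -c user.name="Example" -c user.email="example@example.com" \
  commit --quiet -m "Add settings"

# Adding the rule afterwards does NOT untrack the file.
echo "secrets.yml" > .gitignore
still_tracked="$(git ls-files secrets.yml)"
echo "tracked after adding rule: $still_tracked"

# Stop tracking it going forward; the copy on disk is kept.
git rm --cached --quiet secrets.yml
now_tracked="$(git ls-files secrets.yml)"
echo "tracked after git rm --cached: ${now_tracked:-<none>}"

# The earlier commit still contains the file's contents.
in_history="$(git show HEAD:secrets.yml)"
echo "still readable from history: $in_history"
```

The rule only affects files git isn’t tracking yet, and un-tracking a file doesn’t touch the commits that already contain it. That’s why a pushed credential should be rotated rather than just deleted.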

As with many architectural (and devops) conversations, this is one that will be ongoing as your project continues! Definitely spend some time considering it up front. And if possible, rely on things like BLT to provide you with a recommended default so that you don’t have to try and think of all the things.

Photo by Giorgio Trovato on Unsplash
