Git continues to be a technology that I see folks struggle with, especially if they haven’t come from a computer science or software development background. Why is this? I have some theories.
First and foremost, you have to remember that in many git workflows you have not one but THREE different ‘things’ to keep track of:
your local environment - this is whatever you’re working on right now and the local git repo that tracks it
your personal cloud repository - while there is debate over what to call this, I tend to name this one origin. This is your “fork” and should only contain “your work”
the main repository - again, some debate on this one, but I tend to call it upstream. This is the central repository that everyone has forked from and is ultimately doing pull requests to.
I cannot tell you how many times I see folks with their local out of sync with the upstream or get confused about which branch they are on, why they can’t merge code, etc.
I’ll tell you right now, a big part of this is a lack of visualization. I admit freely that I learned how to use version control (originally Subversion, not Git) using a GUI. It took me years to learn to do everything on the command line and I still do some tasks using a GUI. Why? It’s EASIER IN A GUI.
When I’m managing 10 developers and my own fork plus the upstream I want to be able to rapidly switch between branches, see where things are in terms of the commit history, etc. Can you do that with a terminal? Sure. Is it easier in a GUI? YES. You tell me which is easier to read.
Terminal:
git log
commit 5c60eda513a97e408a4ab172471d9f244140390f (HEAD -> develop, upstream/develop, hotfix)
Merge: 940e9910 e258748c
Author: Mike Madison <22753451+mikemadison13@users.noreply.github.com>
Date: Thu Oct 8 13:45:12 2020 -0700
Merge pull request #913 from mikemadison13/hotfix
DGC-000: correcting deprecated function user in theme.
commit e258748c7a137e71d46498010636ef5a5db64f29 (origin/hotfix)
Author: Mike Madison <mike.madison@acquia.com>
Date: Thu Oct 8 13:32:30 2020 -0700
DGC-000: correcting deprecated function user in theme.
commit 940e9910d307030f553a704f3c57dcb73a3e7aff
Merge: ea735636 3cd134fc
Author: Mike Madison <22753451+mikemadison13@users.noreply.github.com>
Date: Thu Oct 8 12:40:08 2020 -0700
Merge pull request #912 from mikemadison13/cleanup
DGC-000: config and theme cleanup.
commit 3cd134fcaf5815ad7838abf68cfbae78653035a9 (origin/cleanup, cleanup)
Author: Mike Madison <mike.madison@acquia.com>
Date: Thu Oct 8 11:59:23 2020 -0700
DGC-000: updating menu config, removing old menu.
commit b4d848fbf46cb752f6e90ed9fba14119cfe7a9b8
Author: Mike Madison <mike.madison@acquia.com>
Date: Thu Oct 8 11:54:57 2020 -0700
DGC-000: removing old themes.
GUI:
Having said that, there are still some really critical terminal commands that you should learn and start using ASAP (and at least 1 that you should unlearn and stop using). Also, just as a disclaimer, this is by no means intended to be a comprehensive “all the Git commands” guide. I’m making some assumptions that you’ve already cloned a codebase and are actively working with a local repo.
1. Git Fetch
Any time something is merged upstream, you need to make sure that your local repo knows about the change. Git fetch only updates your local copies of the remote’s branches (the remote-tracking branches, such as upstream/develop); it doesn’t actually “change” anything in your working files or in your own branches. This commonly causes confusion. Typically you would run this command like:
git fetch upstream
The expected result is either nothing (meaning you have the most up to date version) or a list of branches that have changed. From here you still need to take action!
Why Use Git Fetch:
You want to make sure that at any given time, the local branch you are working on is tracking your upstream integration branch (e.g. upstream/develop). Any time a pull request is merged upstream you want to rebase your code (see the next item on the list). Git fetch will ensure that you can rebase. Without a fetch, running a rebase may not actually get you the most recent commit(s) from the upstream.
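Here’s a rough sketch of what fetch does and doesn’t touch. It builds a throwaway sandbox in a temp directory, so the "teammate" repo and the local bare "upstream.git" are hypothetical stand-ins for a merged pull request and your real upstream, not anything from a real project.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical sandbox: a bare repo plays "upstream", "teammate" plays a merged PR.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"
git init -q "$sandbox/teammate"
git -C "$sandbox/teammate" symbolic-ref HEAD refs/heads/develop
git -C "$sandbox/teammate" commit -q --allow-empty -m "initial commit"
git -C "$sandbox/teammate" push -q "$sandbox/upstream.git" develop

# Your local repo, with the bare repo registered as the "upstream" remote.
git init -q "$sandbox/local"
git -C "$sandbox/local" remote add upstream "$sandbox/upstream.git"
git -C "$sandbox/local" fetch -q upstream
git -C "$sandbox/local" checkout -q -b develop upstream/develop
before=$(git -C "$sandbox/local" rev-parse upstream/develop)

# Something gets merged upstream while you are working.
git -C "$sandbox/teammate" commit -q --allow-empty -m "merged pull request"
git -C "$sandbox/teammate" push -q "$sandbox/upstream.git" develop

# Until you fetch, your local copy of upstream/develop is stale.
stale=$(git -C "$sandbox/local" rev-parse upstream/develop)

# Fetch moves upstream/develop forward but leaves your own develop branch alone.
git -C "$sandbox/local" fetch -q upstream
after=$(git -C "$sandbox/local" rev-parse upstream/develop)
mine=$(git -C "$sandbox/local" rev-parse develop)

echo "upstream/develop before fetch: $stale"
echo "upstream/develop after fetch:  $after"
echo "local develop (unchanged):     $mine"
```

The first and last hashes printed should match each other, and the middle one should differ. That’s the "you still need to take action" part: fetch told your repo about the new commit, but your branch hasn’t moved.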
2. Git Rebase
Git rebasing is one of the most complicated Git tasks to explain. So I’ll try to do it with an image first.
Here’s the scenario:
You have done some work and made commits. Hooray!
While you’ve been doing so, a pull request has been merged upstream in the primary branch.
Before you open a pull request from your feature branch, you want to make sure you get the changes from the upstream.
By doing a rebase, you will effectively pick up your 2 new commits from the feature branch and replay them on top of the latest commit in the integration branch. In other words, you “change the base” commit that your work starts from.
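If you’d rather see the scenario as commands, here’s a minimal sketch that recreates it in a throwaway sandbox. The commit_file helper, the file names, and the "teammate" repo are all made up for illustration; only the fetch and rebase lines are the ones you’d actually run in your project.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
sandbox=$(mktemp -d)

# Hypothetical helper: write one file and commit it in the given repo.
commit_file() {
  echo "$3" > "$1/$2"
  git -C "$1" add "$2"
  git -C "$1" commit -q -m "$3"
}

# A bare repo plays "upstream"; "teammate" seeds it with a first commit.
git init -q --bare "$sandbox/upstream.git"
git init -q "$sandbox/teammate"
git -C "$sandbox/teammate" symbolic-ref HEAD refs/heads/develop
commit_file "$sandbox/teammate" base.txt "initial commit"
git -C "$sandbox/teammate" push -q "$sandbox/upstream.git" develop

# Your local repo starts a feature branch from upstream/develop.
git init -q "$sandbox/local"
git -C "$sandbox/local" remote add upstream "$sandbox/upstream.git"
git -C "$sandbox/local" fetch -q upstream
git -C "$sandbox/local" checkout -q -b feature upstream/develop

# You do some work and make 2 commits.
commit_file "$sandbox/local" one.txt "feature commit 1"
commit_file "$sandbox/local" two.txt "feature commit 2"

# Meanwhile, a pull request is merged upstream.
commit_file "$sandbox/teammate" other.txt "merged pull request"
git -C "$sandbox/teammate" push -q "$sandbox/upstream.git" develop

# Fetch first, then replay your 2 commits on top of the new upstream commit.
git -C "$sandbox/local" fetch -q upstream
git -C "$sandbox/local" rebase -q upstream/develop

ahead=$(git -C "$sandbox/local" rev-list --count upstream/develop..HEAD)
merges=$(git -C "$sandbox/local" rev-list --count --merges HEAD)
git --no-pager -C "$sandbox/local" log --oneline
```

After the rebase, the log should show your 2 feature commits sitting on top of the merged pull request, with no merge commits anywhere in the history.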
Why Use Git Rebase:
The short answer is that it makes the git history much cleaner and easier to read. Rebasing keeps a single, straight line of history: a series of commits in order over time. Without rebasing, you would be doing a “git pull”, which by default merges and will create a merge commit whenever both your branch and the upstream have new commits.
Over time if merge commits get created, a few things will happen:
Your history gets “really” hard to read
You will start losing commit order (things that happened days ago could get moved around and re-circulated)
Feature branches may actually start “undoing” work without the developer realizing it because the merges are happening behind the scenes! This is bad!
The TLDR is that you want git history as clean and concise as possible and as easy to read as possible. The best way to do this is to always rebase and never pull.
The one command to forget? git pull.
3. Git Rebase -i
An interactive rebase is a little different than just running a rebase. To run it, you need the hash of the commit just before the first commit you want to change. Here’s an example:
git rebase -i 36f29df8960a02643c3d1dc9933b26b462cc7dbd
Git then opens a list of your commits in an editor. Each line with a “pick” is a commit in my git history.
Why do an interactive rebase?
The short answer is that sometimes you need to rewrite git history. Disclaimer: you should try to avoid rewriting git history whenever possible.
Let’s say, for instance, that my second commit 00b05d1a ended up being a mistake. Maybe I made a change, but it caused a build to fail. Maybe I missed something and had to do another config commit immediately after it. Maybe it was so bad of a mistake that I had to undo/redo work from the commit.
Regardless of the scenario, let’s assume that 00b05d1a is a “bad” commit. We have some options:
Commit a knowingly bad commit (might be ok, might not). This can really clutter up the git history.
Delete the commit (might be ok, depending on what you do later)
Combine the commit with another commit.
An interactive rebase lets you do both options 2 and 3.
Want to delete a commit entirely? Just remove its line from the list and save.
Want to combine commits? Change the word pick to squash on the commit you want folded into the one above it.
Want to change the order of commits? (helpful when you want to squash commits that aren’t right next to each other) Then just cut and paste lines and save.
The big takeaway here is that you can really clean up your git history to reorganize, remove, and combine. But this can be bad, too. You should NEVER do this on an integration branch. You should only do it on your own branches prior to merging pull requests. And obviously, you should be very careful not to accidentally delete things that need to be there.
If you have a pull request with like 15 commits though… there’s a good chance I’m going to ask you to do an interactive rebase and try to clean up that history.
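Here’s a small sketch of a squash you can run without touching your real repo. Normally you’d edit the list by hand in your editor. To keep this runnable as a script, I’m assuming a sed one-liner (set through Git’s GIT_SEQUENCE_EDITOR variable) stands in for that hand edit, and GIT_EDITOR=true just accepts the combined commit message. The repo, file names, and commit messages are made up.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repo with three commits, one new file each.
repo=$(mktemp -d)
git init -q "$repo"
for n in 1 2 3; do
  echo "change $n" > "$repo/file$n.txt"
  git -C "$repo" add "file$n.txt"
  git -C "$repo" commit -q -m "commit $n"
done

# The hash to pass is the commit just before the ones you want to edit.
base=$(git -C "$repo" rev-parse HEAD~2)

# The list for this rebase has two "pick" lines (commit 2, then commit 3).
# sed changes the second "pick" to "squash", folding commit 3 into commit 2.
GIT_SEQUENCE_EDITOR="sed -i.bak '2s/^pick/squash/'" GIT_EDITOR=true \
  git -C "$repo" rebase -q -i "$base"

count=$(git -C "$repo" rev-list --count HEAD)
echo "commits after squash: $count"
git --no-pager -C "$repo" log --oneline
```

Three commits go in and two come out, and the files from both squashed commits are still there. Nothing was lost, the history is just shorter.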
4. Git Commit --amend
For the same reason as above, amending a prior commit can be super useful to keep your git history clean. I run into this frequently! I’ll do some development and make a commit. Then I’ll use my code validation tools and find a standards issue in my code. Sure, I could do a second commit to change what I put in my first commit after I fix the standards. OR I could just amend the first commit so that I’m committing quality code the first time.
git commit --amend
Reminder: to change the contents of a commit, you first have to add (stage) at least one file as if you were going to do a new commit. By passing --amend you change the prior commit instead of creating a new one.
You can only amend your most recent commit, so to amend an older one you can first use git rebase -i to change the order of commits. You can also use git rebase -i to update the commit messages on previous commits.
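A quick sketch of that fix-then-amend loop, in a throwaway repo. The file name and contents are invented for the example, and I’m adding --no-edit so Git keeps the existing commit message instead of opening an editor.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Throwaway repo with a single commit.
repo=$(mktemp -d)
git init -q "$repo"
echo "first attempt" > "$repo/theme.txt"
git -C "$repo" add theme.txt
git -C "$repo" commit -q -m "DGC-000: example theme change."
before=$(git -C "$repo" rev-parse HEAD)

# Your validation tools flag a problem, so you fix the file...
echo "first attempt, now passing validation" > "$repo/theme.txt"

# ...stage it as if for a new commit, then fold it into the prior commit.
git -C "$repo" add theme.txt
git -C "$repo" commit -q --amend --no-edit

after=$(git -C "$repo" rev-parse HEAD)
count=$(git -C "$repo" rev-list --count HEAD)
echo "commits: $count"
echo "hash before amend: $before"
echo "hash after amend:  $after"
```

There is still only one commit, but its hash has changed. That’s why amending counts as rewriting history, and why you should only do it on your own branches.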
5. Git Reset --hard
Warning: this is the most destructive command on this list. It will wipe away any commits or file changes you have locally. So why run it?
Whenever I start work on a new ticket, I want to be 100% sure that my local environment precisely matches the upstream integration branch. So I nuke it with this command.
git reset --hard upstream/develop
Obviously, this is one that you should always run after a git fetch (and NEVER EVER run unless you’re really, truly sure you want to be destructive.) But it’s a great way to make sure that your local is exactly tracking the upstream.
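To see exactly what gets thrown away, here’s a sketch in a sandbox. The bare "upstream.git" repo, the file names, and the leftover commit are hypothetical; the point is just what happens to a stray local commit and an uncommitted edit.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Sandbox: a bare repo plays "upstream"; local develop starts out in sync with it.
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"
git init -q "$sandbox/local"
git -C "$sandbox/local" symbolic-ref HEAD refs/heads/develop
echo "base" > "$sandbox/local/base.txt"
git -C "$sandbox/local" add base.txt
git -C "$sandbox/local" commit -q -m "initial commit"
git -C "$sandbox/local" remote add upstream "$sandbox/upstream.git"
git -C "$sandbox/local" push -q upstream develop

# Leftovers from an old ticket: one stray commit and one uncommitted edit.
echo "scratch" > "$sandbox/local/scratch.txt"
git -C "$sandbox/local" add scratch.txt
git -C "$sandbox/local" commit -q -m "leftover commit"
echo "half-finished edit" >> "$sandbox/local/base.txt"

# Fetch first, then make local develop exactly match upstream/develop.
git -C "$sandbox/local" fetch -q upstream
git -C "$sandbox/local" reset -q --hard upstream/develop

head=$(git -C "$sandbox/local" rev-parse HEAD)
target=$(git -C "$sandbox/local" rev-parse upstream/develop)
dirty=$(git -C "$sandbox/local" status --porcelain)
echo "HEAD:             $head"
echo "upstream/develop: $target"
```

Both the leftover commit and the half-finished edit are gone, and HEAD now points at the same commit as upstream/develop. If either of those had been work you cared about, this is where you’d be sad.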
A Note About Forks
You may have noticed that in the intro to this article I talked about 3 repositories, but throughout this article I’ve really only touched on the upstream and the local. That’s because the fork / origin should never really be used for anything other than submitting a PR.
There is literally no reason to keep origin/develop in sync with upstream/develop. Why? Well, you’d have to run an extra command every time you did anything. You shouldn’t be using origin/develop for anything, you aren’t going to be merging code into it. So use the fork as a place to open pull requests, and remember: all the commands that interact with or retrieve code from your cloud repo(s) should be run against the upstream. Git push obviously should still go to your fork/origin. The rest should be with the upstream.
My Recommended Git Workflow:
when starting a new ticket, run these commands:
git checkout develop
git fetch upstream
git reset --hard upstream/develop
git checkout -b <ticket-number, eg. DGC-000>
do your development work
prior to opening a pull request, run these commands
git fetch upstream
git rebase upstream/develop
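If you find yourself typing these a lot, the two command lists could be wrapped up as shell functions. This is only a sketch: start_ticket and prepare_pull_request are names I made up, and the two bare repos in the sandbox are local stand-ins for your real upstream and fork.

```shell
#!/bin/sh
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Hypothetical wrapper: reset develop to upstream and branch off for a ticket.
start_ticket() {  # usage: start_ticket <ticket-number>
  git checkout -q develop
  git fetch -q upstream
  git reset -q --hard upstream/develop
  git checkout -q -b "$1"
}

# Hypothetical wrapper: pick up anything merged upstream before opening a PR.
prepare_pull_request() {
  git fetch -q upstream
  git rebase -q upstream/develop
}

# Sandbox: two bare repos stand in for the upstream and your fork (origin).
sandbox=$(mktemp -d)
git init -q --bare "$sandbox/upstream.git"
git init -q --bare "$sandbox/origin.git"
git init -q "$sandbox/local"
cd "$sandbox/local"
git symbolic-ref HEAD refs/heads/develop
git commit -q --allow-empty -m "initial commit"
git remote add upstream "$sandbox/upstream.git"
git remote add origin "$sandbox/origin.git"
git push -q upstream develop

# The workflow: start the ticket, do the work, rebase, push to the fork.
start_ticket DGC-000
echo "work" > work.txt
git add work.txt
git commit -q -m "DGC-000: example change."
prepare_pull_request
git push -q origin DGC-000

pushed=$(git -C "$sandbox/origin.git" rev-parse refs/heads/DGC-000)
echo "branch on fork: $pushed"
```

Notice that everything that retrieves code talks to upstream, and the only command that touches origin is the push. From there you’d open the pull request from your fork.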