Welcome to the second part of Working with Git branches. In case you haven’t read the first part yet, you might want to do that first. While the first part covers the very basics of Git, this post is going to focus on merging and rebasing in more detail.
Rebase, what is it and why would I want to use it?
In my last post I showed a simple example of why branching might become necessary during an application’s development cycle. In that example all development happened on the master branch and additional branches were created to provide fixes for older versions before the next version was ready for release. The formal description of this process is known as git-flow and is best suited for applications using semantic versioning. However, I made no mention of rebasing at all. Hold on for a bit longer, we’re going to get there soon.
While branching and merging is good enough for a lot of scenarios, there is more than one way to develop software. Semantic versioning is fine for software that has a major release every so often, with a few bugfix releases in between, but it’s not particularly well suited for more agile development styles.
We’re in the process of transitioning to a more streamlined release cycle, known as “continuous delivery”. The main advantage of continuous delivery is that it allows us to get features out more quickly, while at the same time reducing the number of changes introduced with each release. Put another way, releases happen more often and each one is smaller.
Git-flow is not well suited to this kind of release cycle, so our usage of Git needed to be adapted to this. One possible alternative to git-flow is known as GitHub-flow. Here’s a quick list of bullet points that illustrates the main differences:
- The “master” branch is never modified directly. Ever.
- The “master” branch is always in a “ready-to-release” state.
- Development work is done on feature branches.
- Feature branches are merged back into master only once they are ready for release.
- Merging a feature branch into master must never result in a merge conflict. Ever.
- When a feature branch is merged into master all other branches need to be updated.
The final bullet point is the most important one from this article’s point of view. The rest of the article is going to focus on how to update any open feature branches once another feature branch has been merged into master. Updating all other branches from master once a feature branch has been merged will also take care of the second to last point. There are two basic strategies for keeping a feature branch up to date once master has been updated:
- By merging the current state of the master branch into the feature branch
- By rebasing the feature branch onto the current state of the master branch
The reason why a merge back into master must never create a merge conflict is simple: Solving a merge conflict introduces changes to code. Those changes have never been tested. This immediately breaks the second rule: Master must always be ready to release. If merge conflicts are instead solved when master is merged into a feature branch, those changes can be tested (and possibly fixed) before they are merged back into master.
A new feature “measure” was developed and subsequently merged back into master for deployment. Two new feature branches “measure-with-margin” and “measure-with-padding” were created that both address specific issues that were found after deployment. It is important to note that continuous delivery makes no distinction between features that introduce new functionality and features that fix existing bugs.
To give a more visual representation of the above repository, imagine this to be the repository of an internal library. The measure feature introduced a method that can be used to calculate the size of a view. After the library was released some users found that the method doesn’t work for views that have a margin. This prompted the creation of the measure-with-margin branch that aims to address this shortcoming. Some time later other users noticed that the method does not work for views that use a padding, which prompted the creation of the measure-with-padding branch.
In a real world application both fixes would probably be done on the same branch, since they concern the same method. For the sake of this example I have separated both changes into different branches to create a simple scenario that results in a merge conflict. A merge conflict happens when changes introduced by one branch conflict with the changes introduced by another branch. Simply put, a conflict occurs whenever two branches modify the same line of code (unless both branches changed the same line the same way).
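Here’s a minimal sketch of such a conflict, assuming a throwaway repository created just for this illustration. The file name measure.txt and its contents are made up and merely stand in for the real measure method. Both branches change the same line in different ways, so Git cannot combine them on its own:

```shell
# Throwaway repository; all names and file contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the initial branch "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

# master: the released "measure" feature (measure.txt stands in for real code)
echo "size = width" > measure.txt
git add measure.txt
git commit -q -m "Add measure"

# Both feature branches start from the same commit and change the same line
git checkout -q -b feature/measure-with-margin master
echo "size = width + margin" > measure.txt
git commit -q -am "Account for margin"

git checkout -q -b feature/measure-with-padding master
echo "size = width + padding" > measure.txt
git commit -q -am "Account for padding"

# Bringing the two lines of work together stops with a conflict
if git merge --no-edit feature/measure-with-margin >/dev/null 2>&1; then
    merge_result="clean"
else
    merge_result="conflict"
fi
echo "merge result: $merge_result"

# List the files Git could not merge automatically
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted files: $conflicted"

# Back out of the half-finished merge again
git merge --abort
```

For brevity the sketch merges one feature branch straight into the other. In the workflow described here the same conflict would surface when master, which by then contains the first feature, is brought into the second feature branch.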
First, let’s take a look at how to bring these feature branches into master, using merges only. The order in which the branches are merged isn’t important here. In reality it is rare that two feature branches are completed at exactly the same time. No matter which branch is merged first, the other branch will have to be updated before it can be merged.
For continuous delivery, feature branches are merged by issuing a merge request to another developer. This developer will review the changes before either rejecting or accepting the branch for merging. Merges through a merge request will never result in a fast-forward merge. To replicate this when doing a merge on the command line, the “--no-ff” flag is added to the merge command below. Without this flag Git would do a fast-forward merge whenever it detects it can do so. In this case a fast-forward merge would be possible, since master contains no other changes since the code was branched. A fast-forward merge would make the history look as if the commits were done directly on master.
    git checkout master
    git merge feature/measure-with-margin --no-ff
    git branch -d feature/measure-with-margin
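To see what the flag changes, here’s a small sketch in a throwaway repository (the file name and commit messages are invented for illustration). It merges a branch that could have been fast-forwarded and then counts the parents of the commit at the tip of master. A real merge commit has two parents, whereas after a fast-forward the tip would simply be the feature commit with a single parent:

```shell
# Throwaway repository; all names and file contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the initial branch "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "size = width" > measure.txt
git add measure.txt
git commit -q -m "Add measure"

# One commit on a feature branch; master does not move in the meantime
git checkout -q -b feature/measure-with-margin
echo "size = width + margin" > measure.txt
git commit -q -am "Account for margin"

# Force a merge commit even though a fast-forward would be possible
git checkout -q master
git merge -q --no-ff --no-edit feature/measure-with-margin

# %P prints the parent hashes of a commit, separated by spaces
parent_count=$(git log -1 --format=%P | awk '{print NF}')
echo "parents of master's tip: $parent_count"

# The graph shows the feature branch as a separate line of development
git log --graph --oneline
```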
Since the feature branch is no longer needed to track changes after it was merged, it can be safely deleted. After the first branch is merged the history will now look as follows (the feature branch has not been deleted yet):
After a feature branch has been merged into master all other branches need to be updated. Note how the measure-with-padding branch originates from an older commit of the master branch. The master branch is now ahead of the feature/measure-with-padding branch. To fix this we need to update the second feature branch to contain the changes that were just merged into master:
    git checkout feature/measure-with-padding
    git merge master
The above commands have checked out the second feature branch and then merged master into it. Once we have fixed all potential merge conflicts the history now looks as follows:
Assuming the second feature branch is ready for release as well, it can also be merged into master. In reality the second branch would need to go through QA again first, to ensure that the changes brought in by merging master have not caused any issues.
    git checkout master
    git merge feature/measure-with-padding --no-ff
As with git-flow the workflow above can result in a complex history fairly easily. The example currently only contains two branches with a single commit each. Yet, there are already several intersecting lines of development that make it hard to see which changes have gone where. Imagine the same with 30 or more concurrent branches. To make history less complex a rebase of open feature branches is preferable. Here’s how:
As before, the first feature branch has been merged back into the master branch. But instead of merging the master branch into the second feature branch, the second feature branch is rebased on top of the master branch. To rebase a branch we first need to switch to the branch we want to rebase, in this case the measure-with-padding branch. After switching to that branch we tell Git which branch we want to rebase on top of:
    git checkout feature/measure-with-padding
    git rebase master
This rebase will take the commits made on the measure-with-padding feature branch and replay them on top of the current tip of the master branch. Once the changes have been replayed, Git moves the feature branch pointer to the last of the commits it just created. The master branch itself is not changed. This will turn our history into something resembling this image:
It should be easy to see that this will result in a much cleaner history that is almost linear.
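The whole rebase-based flow can be tried out with the following sketch. It assumes a throwaway repository, and to keep the run free of conflicts the two branches touch different, made-up files (margin.txt and padding.txt), which differs from the conflicting scenario described earlier. After the rebase, the tip of master is an ancestor of the feature branch, which is exactly what “up to date” means here:

```shell
# Throwaway repository; all names and file contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the initial branch "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "size = width" > measure.txt
git add measure.txt
git commit -q -m "Add measure"

# Two feature branches, each with a single commit, both started from the same commit
git checkout -q -b feature/measure-with-margin master
echo "margin handling" > margin.txt
git add margin.txt
git commit -q -m "Account for margin"

git checkout -q -b feature/measure-with-padding master
echo "padding handling" > padding.txt
git add padding.txt
git commit -q -m "Account for padding"

# The first feature is merged into master
git checkout -q master
git merge -q --no-ff --no-edit feature/measure-with-margin

# The second feature is rebased onto the updated master instead of merging master into it
git checkout -q feature/measure-with-padding
git rebase -q master

# master's tip is now part of the feature branch's history
if git merge-base --is-ancestor master feature/measure-with-padding; then
    up_to_date="yes"
else
    up_to_date="no"
fi
echo "feature branch contains master: $up_to_date"

# Finally the second feature is merged as well
git checkout -q master
git merge -q --no-ff --no-edit feature/measure-with-padding
git log --graph --oneline
```

In the printed graph each feature shows up as a small bump on an otherwise straight line, with no crossing lines between the two branches.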
A word of warning
Rebasing is not without its own set of pitfalls. A rebase does not “move” commits from one place to another. Instead it creates a new set of commits, by replaying the changes found in the original commits on top of another branch. This makes it look like those changes were made on top of the new branch to begin with. After the new commits have been created the branch pointer is updated to point to the head of the new set of commits, orphaning the original commits. This will essentially remove the original commits from history (there are ways to restore them, but that is beside the point of this article).
So long as nobody else was referring to these commits this is not that big of an issue. However, as soon as those commits are referenced by somebody else, or somebody else has a branch based off of those commits, things can get ugly.
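The fact that a rebase creates new commits is easy to check. The following sketch, again in a throwaway repository with invented file names, records the hash of the feature branch’s tip before and after a rebase. The two hashes differ, and the original commit object still exists in the repository even though the branch no longer points to it:

```shell
# Throwaway repository; all names and file contents are hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # call the initial branch "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo "size = width" > measure.txt
git add measure.txt
git commit -q -m "Add measure"

git checkout -q -b feature/measure-with-margin master
echo "margin handling" > margin.txt
git add margin.txt
git commit -q -m "Account for margin"

git checkout -q -b feature/measure-with-padding master
echo "padding handling" > padding.txt
git add padding.txt
git commit -q -m "Account for padding"

# master moves ahead of the second feature branch
git checkout -q master
git merge -q --no-ff --no-edit feature/measure-with-margin

# Record the tip of the feature branch before and after the rebase
git checkout -q feature/measure-with-padding
before=$(git rev-parse HEAD)
git rebase -q master
after=$(git rev-parse HEAD)
echo "before rebase: $before"
echo "after rebase:  $after"

# The original commit has not been deleted, it is just no longer on the branch
old_type=$(git cat-file -t "$before")
echo "object type of the original commit: $old_type"
```

Anyone who based work on the old hash is now building on a commit that is no longer part of the branch, which is where the trouble starts.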
On top of that, if the rebased feature branch has already been pushed to a remote repository, updating the remote repository requires a force push. As long as that feature branch is “private” this will not be an issue. Private in this case means it is clear to others that they should not be working on that branch and should not base their own work on top of it. If this is not the case, a force push is out of the question. This means that if a feature branch is “free for all”, rebasing is never an option. In this case merging is the only viable solution.
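Here’s a sketch of what that looks like in practice, assuming a local bare repository that plays the role of the remote (all paths and file names are made up). For brevity master is advanced with a direct commit, which the workflow above would not allow. The sketch uses “--force-with-lease”, which is my suggestion rather than something this workflow prescribes: it is a more careful variant of a plain force push that refuses to overwrite the remote branch if it has changed since it was last seen locally:

```shell
# A local bare repository stands in for the remote; all names are hypothetical.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master   # call the initial branch "master" regardless of local defaults
git config user.name "Example Dev"
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"

echo "size = width" > measure.txt
git add measure.txt
git commit -q -m "Add measure"
git push -q origin master

# The feature branch is pushed before it gets rebased
git checkout -q -b feature/measure-with-padding
echo "padding handling" > padding.txt
git add padding.txt
git commit -q -m "Account for padding"
git push -q origin feature/measure-with-padding

# master moves on (simplified: a direct commit instead of a merged feature branch)
git checkout -q master
echo "margin handling" > margin.txt
git add margin.txt
git commit -q -m "Account for margin"
git push -q origin master

# Rebasing rewrites the commits that were already pushed
git checkout -q feature/measure-with-padding
git rebase -q master

# A plain push is rejected because the remote branch cannot be fast-forwarded
if git push -q origin feature/measure-with-padding 2>/dev/null; then
    plain_push="accepted"
else
    plain_push="rejected"
fi
echo "plain push: $plain_push"

# A force push replaces the remote branch with the rebased one
if git push -q --force-with-lease origin feature/measure-with-padding 2>/dev/null; then
    forced_push="accepted"
else
    forced_push="rejected"
fi
echo "force push: $forced_push"
```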
That’s all for now. Check back next week for the last part in this series. It’ll cover an area I’ve completely ignored until now, namely conflicts.