Branches and code merging at scale — automated branch merging
One day we decided to stop wasting time on tasks that can be automated, and we started with our merging process.
Today’s development teams use various branching models, ranging from straightforward ones, such as the trunk-based model, to a completely autonomous continuous deployment process. Many consider continuous deployment a utopia for large projects, achievable only for small startup projects.
We realized that with a sophisticated model built around multiple development loops, such as the one we use, automation is a necessity, not an option. For example, at any given moment we might have a standard feature development loop living next to an emergency bug-fixing loop (which needs to go to production as soon as possible) and a standard bug-fixing loop (which can wait two or more weeks). For various reasons, each loop is deployed at some unknown time in the future. Every loop usually comes with a dedicated branch, and developers create short-lived branches that are eventually merged into that dedicated branch.
Pretty complex with multiple teams working towards one common regular deployment, isn’t it? Well, it was, until we decided to solve it with an internal tool: the Automerger. We created it during our “branching story” in the EcoVadis Engineering team, and in this article we’ll show you how it helped us.
Let’s examine a situation where the development team has three active branches. This is common with the GitFlow branching pattern:
- The Master branch represents the current production state.
- The first working branch is Develop. The Develop branch contains all the features that are going to be included in a future release, no matter when that release happens.
- The second branch, let’s name it Release-8, is going to be deployed to production very soon. No feature development is happening on this branch at the moment. Release-8 is currently in the stabilization phase. Think of it as being in QA’s hands: only bug fixes can go there.
What’s the dilemma? Imagine that a bug fix goes to Release-8 but not to Develop. The next release, Release-9, would then reintroduce the bug, which is clearly not desired. Sure, you can ask developers to apply every change to two branches, once to Release-8 and a second time to Develop. But humans make mistakes, and someone can forget. Develop can also be way ahead of Release-8, which makes it hard to apply the same change to both branches, so a developer may simply skip this duty. That’s life. What can we do about it? Automate it.
This is where the Automerger comes in. It is intended to detect the branches assigned to the development loops and systematically perform in-place merges between them. Whenever it finds a conflict during a merge, it notifies the appropriate team that user action is required.
How can Automerger help my team/teams?
If your teams perform merges that are required by the branching strategy, or technical merges that have to be made periodically for any reason, Automerger comes into play. It takes this duty off your team and lets them concentrate on delivering value to your company rather than completing boring merges. Do not lose time on tasks that can be done automatically.
What is Automerger?
Technically, it is a tool built with Cake. Sources can be found here: GitHub. It makes extensive use of the git command line output. Please feel free to fork the repo (or contribute to it via pull requests) and put your changes there. We will be delighted if you notify us 😉
Flow. How does it work inside?
One of the goals of the tool was to keep it as simple as possible. As input, it takes two required parameters, the source branch and the target branch, and executes the following steps:
- Create a temporary branch based on the source branch. The naming of the temporary branch can be customized if needed.
- Merge the target branch into the temporary branch.
- Parse the merge output to determine whether there are conflicts. If there are, notify a particular developer to perform a manual merge.
- When there is no conflict on the temporary branch, run the Cake task “Default”.
- After the Default target finishes successfully, merge the temporary branch into the destination branch.
- Push changes to the origin.
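The steps above can be sketched roughly as follows. This is a minimal illustration in Python, not the Automerger's actual Cake script: the function names, the temporary branch naming, and the `validate` callback (standing in for the Cake “Default” target) are all assumptions made for the example. Only the git commands themselves are standard.

```python
import subprocess
from typing import Callable, List, Tuple

# A runner takes an argv list and returns (exit_code, combined_output).
Runner = Callable[[List[str]], Tuple[int, str]]


def run_command(argv: List[str]) -> Tuple[int, str]:
    """Default runner: execute the command and capture stdout and stderr."""
    done = subprocess.run(argv, capture_output=True, text=True)
    return done.returncode, done.stdout + done.stderr


def has_conflicts(merge_output: str) -> bool:
    """git merge prints a line starting with 'CONFLICT' for each conflicting path."""
    return any(line.startswith("CONFLICT") for line in merge_output.splitlines())


def automerge(source: str, target: str,
              run: Runner = run_command,
              validate: Callable[[], bool] = lambda: True) -> str:
    """Hypothetical sketch of the flow; returns 'merged', 'conflict' or 'validation-failed'."""

    def must(argv: List[str]) -> None:
        code, out = run(argv)
        if code != 0:
            raise RuntimeError(f"{' '.join(argv)} failed: {out}")

    # Assumed naming scheme for the temporary branch.
    temp = "automerge/" + source.replace("/", "-") + "-to-" + target.replace("/", "-")

    must(["git", "checkout", "-b", temp, source])         # temp branch from source
    code, out = run(["git", "merge", "--no-edit", target])  # merge target into temp
    if has_conflicts(out):
        run(["git", "merge", "--abort"])
        return "conflict"                                   # caller notifies a developer
    if code != 0:
        raise RuntimeError(out)
    if not validate():                                      # stands in for the Cake "Default" task
        return "validation-failed"
    must(["git", "checkout", target])
    must(["git", "merge", "--no-edit", temp])               # temp into the destination branch
    must(["git", "push", "origin", target])
    return "merged"
```

Passing the runner in as a parameter keeps the flow easy to try out without touching a real repository, for example by supplying a function that only records the commands it receives.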
There is one external dependency in this flow: the fourth step, which executes the “Default” target from the main Cake script. Its purpose is to validate the codebase after the merge. It is up to that script to build the whole solution and run the unit tests.
The Automerger is supposed to be an automatic tool that needs as few manual interventions as possible. That assumption calls for an automatic way to validate the correctness of the performed merges. The easiest way to achieve that is to build all projects with as many checks as possible, for example unit, integration or acceptance tests. Of course, more checks do not come for free: they increase the execution time significantly. Bear in mind that several executions of the Automerger script usually run at the same time.
The platform, only .NET?
Cake is based on Roslyn and .NET. It allows you to write scripts in plain C#.
Convention over configuration
Usually, the branching model defines the naming of the branches created during development. The Automerger takes advantage of such conventions. Based on branch names, the Automerger knows which branches should be merged into which. Given branches named:
master, develop, release/release-8.0, release/release-8.1, release/release-9.0
The script parses the names and sorts the branches in order:
Depending on how many active release branches there are, it performs merges as follows: Master -> Release-9.0 -> Develop. A new branch that follows the convention is picked up by the tool automatically. No additional configuration is needed.
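To make the convention concrete, here is a small Python sketch of how branch names could be sorted into a merge chain. It is an illustration only: the regular expression, the function names, and the assumption that every matching release branch is chained in ascending version order between master and develop are ours, not taken from the Automerger source.

```python
import re
from typing import List, Tuple

# Assumed naming convention: release/release-<major>.<minor>
RELEASE = re.compile(r"^release/release-(\d+)\.(\d+)$")


def merge_order(branches: List[str]) -> List[str]:
    """Order branches as master, releases by ascending version, develop."""
    releases = []
    for name in branches:
        match = RELEASE.match(name)
        if match:
            version = (int(match.group(1)), int(match.group(2)))
            releases.append((version, name))
    ordered = []
    if "master" in branches:
        ordered.append("master")
    # Compare versions numerically so that 10.0 sorts after 9.0.
    ordered += [name for _, name in sorted(releases)]
    if "develop" in branches:
        ordered.append("develop")
    return ordered


def merge_pairs(branches: List[str]) -> List[Tuple[str, str]]:
    """Each branch is merged into the next one in the chain."""
    ordered = merge_order(branches)
    return list(zip(ordered, ordered[1:]))
```

With a single active release branch, `merge_pairs(["master", "develop", "release/release-9.0"])` yields the pairs master to release/release-9.0 and release/release-9.0 to develop, which matches the chain described above.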
Automerger needs a way to check whether changes on one branch are already present on the target branch. If you perform proper merges from one branch to another rather than cherry-picks, you are in a safe place: Git checks this for you, based on the hash of each commit. Automerger relies on this assumption. If your team instead moves changes between branches with cherry-picks, creating new commits each time, Automerger gets confused. It reports a vast list of changes and probably many conflicts.
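The difference between a merge and a cherry-pick can be shown with a toy model. The Python sketch below is a deliberate simplification: it derives a commit id only from the parent id and a change label, whereas a real Git hash also covers the tree, author, committer and message. All names here are hypothetical. In Git itself, `git log target..source` lists the commits reachable from the source but not from the target.

```python
import hashlib
from typing import List, Optional


def commit_id(parent: Optional[str], change: str) -> str:
    """Toy commit id: depends on the parent id and the change."""
    return hashlib.sha1(f"{parent}\n{change}".encode()).hexdigest()


def build_history(changes: List[str], base: Optional[str] = None) -> List[str]:
    """Create a chain of commits on top of `base`, returning the new ids."""
    ids = []
    parent = base
    for change in changes:
        parent = commit_id(parent, change)
        ids.append(parent)
    return ids


def missing_on_target(source_ids: List[str], target_ids: List[str]) -> List[str]:
    """Commits on the source that the target does not contain, compared by id."""
    seen = set(target_ids)
    return [c for c in source_ids if c not in seen]


root = build_history(["initial"])
release = build_history(["fix-login"], base=root[-1])   # bug fix made on the release branch
develop = build_history(["feature-a"], base=root[-1])   # unrelated work on develop

# Proper merge: develop's history now contains the release commit itself.
after_merge = root + develop + release
# Cherry-pick: the same change is replayed on develop and gets a new id.
after_pick = root + develop + build_history(["fix-login"], base=develop[-1])

print(len(missing_on_target(root + release, after_merge)))  # nothing left to merge
print(len(missing_on_target(root + release, after_pick)))   # the fix still looks unmerged
```

In this model, the merged history reports nothing missing, while the cherry-picked history still reports the original fix as missing even though its content is already there.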
Scalability and maintainability
The Automerger comes in the form of a script, consisting of at least two separate files. These files need to be in the repository the script works on. This limitation makes the Automerger a bit tricky to scale out and maintain in the long run: each new repository needs its own copy of the Automerger files. As a solution, you can create a template for empty repositories that contains the Automerger files along with a .gitignore file.
Putting it all together
Automerger in CI/CD
The Automerger achieves full usability when it goes hand in hand with tools like Azure DevOps or any other CI/CD platform. The CI part makes it possible to schedule a build that executes the Automerger periodically. You can find an out-of-the-box build definition for VSTS here: GitHub.
After two years of using the tool, it has proved itself: around 90% of merges succeed automatically. Roughly nine times out of ten, the Automerger merges changes between branches without any merge conflict, and a set of changes is propagated between branches without a single person knowing. The remaining 10% are the cases where the Automerger informs the author that an intervention is needed.
We do not keep data from the beginning of time because of our data retention policy. The last 7 days of data at the time this article was published look like this:
Is there a minimum requirement to run Automerger?
As long as you fulfil the requirements of the Cake build tool, you are fine. The minimum requirements do not come from the Automerger itself; they depend on your build process. Make sure all required external dependencies are in place for a proper build. For example, if you have a smoke test that calls Elasticsearch as part of artefact verification, make sure the Elasticsearch instances are reachable from the build machine.
How often do you run the script?
Over time, we have learned that synchronizing as often as possible is crucial. Postponing merges significantly increases the risk of a merge conflict and widens the difference between the two branches. Huge merges are generally not recommended; it is much better to have more, smaller ones. We decided to run the script every ~3–4 hours on working days, usually at 5am, 8am, 12pm, 3pm and 6pm. The hours were chosen empirically. We found this to be a good balance between execution time, the development speed of the engineering teams and build agent occupation. You may want to pick your own hours. I suggest running it as often as possible.
You use Slack notifications, but our team uses email.
No problem. Please feel free to add a mailing mechanism to the code and share a pull request. I will be more than happy to cooperate. Also note that the Cake build tool provides a wide array of out-of-the-box add-ins. You can find them here: https://cakebuild.net/addins/
We have more than one team, 20 teams with 10 devs each.
No problem. Take a look at the .automergeConfig file. It contains a simple mapping between developers and teams. Whenever the Automerger finds a conflict during a merge, it looks for the last person who edited the conflicting file. Then the script finds the right team and notifies only that team, so there is no need to notify everyone in your IT department.
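The lookup described here could look roughly like the Python sketch below. The real format of .automergeConfig is not shown in this article, so the dictionary, the email addresses, the team names and the "everyone" fallback are all hypothetical. The `git log -1 --format=%ae -- <path>` command is standard Git and prints the author email of the last commit that touched a file.

```python
import subprocess
from typing import Dict, List


def last_editor(path: str) -> str:
    """Author email of the most recent commit that touched `path`."""
    done = subprocess.run(
        ["git", "log", "-1", "--format=%ae", "--", path],
        capture_output=True, text=True, check=True,
    )
    return done.stdout.strip()


def teams_to_notify(editors: List[str],
                    developer_team: Dict[str, str],
                    fallback: str = "everyone") -> List[str]:
    """Map the last editors of conflicting files to teams, without duplicates.

    Developers missing from the mapping go to the (hypothetical) fallback channel.
    """
    teams: List[str] = []
    for editor in editors:
        team = developer_team.get(editor, fallback)
        if team not in teams:
            teams.append(team)
    return teams


# Hypothetical mapping, standing in for the content of .automergeConfig.
mapping = {
    "anna@example.com": "team-payments",
    "bob@example.com": "team-search",
}
```

In a real run, the editor list would come from calling `last_editor` on each conflicting file reported by the merge.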
Sure, but I want to notify everyone.
The default configuration will do it then 😉
Does the Automerger produce a lot of branches over time?
It depends on the configuration. The script can go both ways. It can leave the branch with the conflict for your team to follow up on later. In that case, depending on how often you execute the script, it creates a new temporary branch on each run. We do not recommend this; we have found that it is better to let the Automerger clean up after itself. That makes the work more asynchronous: the developer does not need to resolve conflicts before the Automerger’s next run, and the Automerger’s build can run as often as it wants without being blocked by the developer.
Automation is crucial. Do not let your team or yourself get comfortable repeating the same task day after day. Building a tool that automates a task allows you to move on and focus on work that brings value to the company. The Automerger tackles one of these daily struggles: repeatable merges.