Git Internals 101: The Fundamentals of Git's Internal Architecture (Part 3 of 3)

Git Internals 101: The Fundamentals of Git's Internal Architecture (Part 3 of 3)

In the previous blogs, we were introduced to two new concepts, namely Git Objects and how Git SHA works with it. A blob object stores the contents of a file, a tree contains other trees and blobs, and a commit is a snapshot of the main working tree. All of them are referenced by a hash of the SHA-1 Algorithm in Git. In this blog, we'll talk about branches, understand how they are implemented and what the commands git branch or git checkout actually do behind the scenes.

What is a Git branch?

We usually think about branches as a sequence of changes that are related to each other in an independent environment so that these changes don't affect the other branches until merged. For example, one branch is dedicated to a single feature and another branch to a specific bug fix.

img

Under the hood, A branch is just a name referring to a commit. We could always reference a commit by SHA-1 Hash but humans usually prefer other forms to name an object. A branch is one way to reference a commit, but it's just that. In most repositories, the main line of development is done in a branch called "main" (or "master"). This is just a name and it is created when we use git init making it widely used. However, by no means it is special and we could use any other name we like.

Under the Hood

Typically, the branch points to the latest commit in the line of development we are currently working on. So I created one commit and then I made some changes and created another commit pointing to the previous commit. I'd usually have the branch point to the second commit which is currently the latest commit in the branch.

img

To create another branch, we usually use the git branch command. By doing that we actually create another pointer. So if we called another branch called test by using git branch test, we are actually creating another pointer by the name of test that points to the same commit as the branch we are currently on (i.e main).

img

But there is still an unanswered question here, how does Git know what branch we are currently on? After all, when we commit, we commit to the current branch so Git has to know which branch it is.

Here comes the HEAD

Git knows what branch we are currently on by a special pointer called HEAD, usually HEAD points to a branch which then successively points to a commit.

img

In some cases, HEAD can also point to a commit directly but we won't focus on that in this blog. To switch the active branch to test we can use the command git checkout test now I believe you can already guess what this command does. It just changes HEAD to point to test.

img

By the way, we can also use git checkout -b test before creating the test branch. This command where "-b" is shorthand for "branch" is equivalent of git branch test to create the branch and then git checkout test to move HEAD to point to the test branch. Let's consider the current state, what happens if we make some changes and create a new commit using git commit which branch will be the new commit added to? The answer is test branch, as this is the active branch since HEAD points to it. So after the new commit object is created, the branch test will point it. Note, that HEAD still points to the test branch and this branch points to the new commit.

img

So if we go back to the main branch now by using git checkout main what we actually do is move HEAD to point to the main branch again. Now that we moved HEAD to point towards the main branch, what happens if we create another commit, what branch will it be added to? I'll leave this question for you to answer. (Hint: the branch name starts with "m" :p)

Git Merge

Git merge allows us to combine multiple branches of code into a single branch. This is often used when two or more developers are working on different parts of a project simultaneously, and need to combine their changes into a single branch.

Git merge works by comparing the code in the two branches that are being merged. If there are no conflicts, git will simply combine the two branches, creating a new commit known as "Merge Commit" that contains all of the changes from both branches.

img

As you can see the test branch is merged into the main branch resulting in a new merge commit.

Conclusion

So in this blog, we learned not only what Git branches are but also how they operate behind the scenes. With this, I complete the 3 part Git Internal series. Take your time to go through all the parts again, and understand how each term relates to each other to form the complete Git picture in your head.

In conclusion, the Git objects, Git SHA, and Git branches are all important concepts in the world of version control. Understanding how these concepts work together can help us better collaborate on projects and maintain a clear and organized codebase. By learning how to use these tools effectively, we can improve our productivity and efficiency when working with Git. Thank you for following along with this three-part series on Git Internals :)

Did you find this article valuable?

Support Hamees Sayed by becoming a sponsor. Any amount is appreciated!