Implement Git by yourself (1: Introduction)
Git is arguably the most popular version control system (VCS). As a developer, you probably use Git porcelain commands in your daily work and treat Git as a black box. But
If you can’t make one, you don’t know how it works.
The Git source code is publicly available, but reading through the repository as it stands today is quite a challenge. What's worse, Git is written in C, which can be hard to follow for many of us. Git is also Linus Torvalds's masterpiece, and it contains a lot of tricky code.
So we are going to implement Git ourselves. This time we will use C# on .NET Core, which is cross-platform.
This blog series has two major references
1 Structure of .git folder
The first step in using Git is to run git init in the target folder. It creates a sub-folder named .git.
We're going to dive into the HEAD, objects and refs items, which are the core parts of Git and the focus of our first version.
In general, the .git folder is a file-based database, which means we can restore the codebase as long as the .git folder is intact.
2 Git Object
The plumbing command git hash-object takes some data, stores it in the .git/objects directory (when the -w option is given), and prints the unique key that maps to this data object.
$ echo 'test content' | git hash-object -w --stdin
Let’s see what happened in the .git/objects directory
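It creates a file named 70460b4b4aece5915caf5c68d12f560a9fe3e4 in the d6 folder: the first two characters of the key become the folder name and the remaining 38 become the file name. Where does the value come from?
It's the SHA-1 digest of the data, which consists of a header followed by the content. The header is the object type, a space, the content size in bytes, and a NUL byte.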
The type can be:
- blob: the content of a regular file
- tree: a folder
- commit: a commit record
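The key calculation described above can be sketched in a few lines. This is a minimal Python illustration rather than the C# code the series will build; the function name hash_object is my own choice, and it only computes the key without writing anything to disk.

```python
import hashlib

def hash_object(data: bytes, obj_type: str = "blob") -> str:
    """Return the SHA-1 key Git assigns to `data` stored as `obj_type`."""
    # Header: "<type> <size in bytes>" followed by a NUL byte.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()

# `echo 'test content'` appends a newline, so the hashed bytes include it.
key = hash_object(b"test content\n")
print(key)               # d670460b4b4aece5915caf5c68d12f560a9fe3e4
print(key[:2], key[2:])  # folder name and file name under .git/objects
```

The printed key splits into d6 and 70460b4b4aece5915caf5c68d12f560a9fe3e4, which matches the folder and file seen in .git/objects.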
With the SHA-1 value, we can also restore the blob content easily, for example with git cat-file -p followed by the key.
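Restoring a blob is the reverse of storing it. Git keeps loose objects zlib-compressed, so reading one back means decompressing the file and splitting off the header. The sketch below is a Python illustration under that assumption; write_object and read_object are hypothetical helper names, it works on a temporary folder instead of a real .git/objects, and it ignores packfiles.

```python
import hashlib
import os
import tempfile
import zlib
from typing import Tuple

def write_object(objects_dir: str, data: bytes, obj_type: str = "blob") -> str:
    """Store `data` as a loose object and return its key."""
    store = f"{obj_type} {len(data)}".encode("ascii") + b"\x00" + data
    key = hashlib.sha1(store).hexdigest()
    folder = os.path.join(objects_dir, key[:2])
    os.makedirs(folder, exist_ok=True)
    with open(os.path.join(folder, key[2:]), "wb") as f:
        f.write(zlib.compress(store))  # header + content, deflated
    return key

def read_object(objects_dir: str, key: str) -> Tuple[str, bytes]:
    """Return (type, content) of the loose object identified by `key`."""
    with open(os.path.join(objects_dir, key[:2], key[2:]), "rb") as f:
        store = zlib.decompress(f.read())
    header, _, data = store.partition(b"\x00")  # split at the first NUL
    obj_type, size = header.decode("ascii").split(" ")
    if int(size) != len(data):
        raise ValueError("size in header does not match content length")
    return obj_type, data

# A temporary folder stands in for .git/objects.
with tempfile.TemporaryDirectory() as objects_dir:
    key = write_object(objects_dir, b"test content\n")
    obj_type, data = read_object(objects_dir, key)

print(key, obj_type, data)
```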
As we know, a blob does not include the file name or any attribute information. All of that is tracked in tree objects.
From the above output, the structure looks like this.
With this nested structure, it's possible to restore each file with the correct file name and folder.
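To make the nesting concrete, here is a Python sketch of how a tree body can be serialized and parsed. It assumes the usual layout of a tree entry: the mode, a space, the name, a NUL byte, then the 20 raw bytes of the SHA-1 of the blob or sub-tree. The names build_tree and parse_tree and the file names README.md and src are made up for illustration.

```python
import hashlib
from typing import List, Tuple

Entry = Tuple[str, str, str]  # (mode, name, hex SHA-1 of a blob or sub-tree)

def build_tree(entries: List[Entry]) -> bytes:
    """Serialize entries into the body of a tree object."""
    def sort_key(entry: Entry) -> bytes:
        mode, name, _ = entry
        # Git orders folders as if their name ended with "/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")
    body = b""
    for mode, name, sha in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode("utf-8") + b"\x00" + bytes.fromhex(sha)
    return body

def parse_tree(body: bytes) -> List[Entry]:
    """Split a tree body back into (mode, name, hex SHA-1) entries."""
    entries, pos = [], 0
    while pos < len(body):
        nul = body.index(b"\x00", pos)
        mode, name = body[pos:nul].decode("utf-8").split(" ", 1)
        entries.append((mode, name, body[nul + 1:nul + 21].hex()))
        pos = nul + 21  # skip the NUL and the 20-byte digest
    return entries

entries = [
    ("40000", "src", "4b825dc642cb6eb9a060e54bf8d69288fbee4904"),         # hypothetical folder
    ("100644", "README.md", "d670460b4b4aece5915caf5c68d12f560a9fe3e4"),  # hypothetical file
]
body = build_tree(entries)
tree_id = hashlib.sha1(f"tree {len(body)}".encode("ascii") + b"\x00" + body).hexdigest()
print(parse_tree(body))
print(tree_id)
```

Walking a tree recursively (a 40000 entry points to another tree, a 100644 entry to a blob) is what lets a checkout rebuild files under the right names and folders.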
Every commit command creates an object as well, which includes:
- The tree object of the current working directory
- The previous (parent) commit object
- Author and committer information
- The commit message
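The items listed above map directly onto the text of a commit object. The following Python sketch assembles one; build_commit and object_id are hypothetical names, and the tree id is a placeholder (the empty tree), so the resulting commit id will not match any id shown in this post.

```python
import hashlib
from typing import List

def build_commit(tree: str, parents: List[str], author: str,
                 committer: str, message: str) -> bytes:
    """Assemble the body of a commit object."""
    lines = ["tree " + tree]
    lines += ["parent " + p for p in parents]  # no parent line for a first commit
    lines += ["author " + author, "committer " + committer]
    # A blank line separates the headers from the message.
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")

def object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over the "<type> <size>" header, a NUL byte and the body."""
    header = f"{obj_type} {len(body)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + body).hexdigest()

identity = "gaufung <email@example.com> 1602422297 +0800"
body = build_commit(
    tree="4b825dc642cb6eb9a060e54bf8d69288fbee4904",  # placeholder: the empty tree
    parents=["28cd27cccf7ed33b4556e2ea66d06cdbbac038fc"],
    author=identity,
    committer=identity,
    message="second commit",
)
commit_id = object_id("commit", body)
print(body.decode("utf-8"))
print(commit_id)
```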
3 Git Reference
We can travel through the commit history by commit id (the SHA-1 value), but such a long value is hard to remember. Git provides a readable mechanism, references, to reach a specific commit.
$ find .git/refs/
There are two directories in the .git/refs directory. Each file in .git/refs/heads represents an individual branch, and each file in .git/refs/tags represents a tag you created.
Let's look into the content of the master branch file:
$ cat .git/refs/heads/master
$ git cat-file -p 584ad834b71a95161ee79d237b730c30a06a080a
author gaufung <email@example.com> 1602422297 +0800
committer gaufung <firstname.lastname@example.org> 1602422297 +0800

second commit
The same goes for tags:
$ git tag v1.0 28cd27cccf7ed33b4556e2ea66d06cdbbac038fc
$ cat .git/refs/tags/v1.0
$ git cat-file -p 28cd27cccf7ed33b4556e2ea66d06cdbbac038fc
author gaufung <email@example.com> 1602421464 +0800
committer gaufung <firstname.lastname@example.org> 1602421464 +0800

first commit
How does Git know which branch we are on? The answer is the HEAD file:
$ cat .git/HEAD
When you make a new commit, the HEAD file tells Git which branch head needs to be updated.
What if you type git checkout <commit id>?
$ git checkout 28cd27cccf7ed33b4556e2ea66d06cdbbac038fc
Note: checking out '28cd27cccf7ed33b4556e2ea66d06cdbbac038fc'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

  git checkout -b <new-branch-name>

HEAD is now at 28cd27c first commit
$ cat .git/HEAD
HEAD no longer points to a ref, just to a commit id. We call this state detached HEAD. It's risky: commits made in this state are not referenced by any branch, so they are hard to find again once you switch to another branch.
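Both cases, HEAD pointing to a branch and HEAD holding a bare commit id, can be handled by one small lookup. The Python sketch below assumes that an attached HEAD file contains a line of the form "ref: refs/heads/master"; resolve_head and update_head are hypothetical names, the demo runs against a temporary folder standing in for .git, and packed refs are ignored.

```python
import os
import tempfile
from typing import Optional, Tuple

def resolve_head(git_dir: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (ref name, commit id); the ref name is None when HEAD is detached."""
    with open(os.path.join(git_dir, "HEAD")) as f:
        head = f.read().strip()
    if not head.startswith("ref: "):
        return None, head  # detached HEAD: the file holds a commit id
    ref = head[len("ref: "):]
    ref_path = os.path.join(git_dir, *ref.split("/"))
    if not os.path.exists(ref_path):
        return ref, None  # branch without any commit yet
    with open(ref_path) as f:
        return ref, f.read().strip()

def update_head(git_dir: str, commit_id: str) -> None:
    """Record a new commit: move the current branch, or HEAD itself if detached."""
    ref, _ = resolve_head(git_dir)
    parts = ["HEAD"] if ref is None else ref.split("/")
    target = os.path.join(git_dir, *parts)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w") as f:
        f.write(commit_id + "\n")

FIRST = "28cd27cccf7ed33b4556e2ea66d06cdbbac038fc"
SECOND = "584ad834b71a95161ee79d237b730c30a06a080a"

with tempfile.TemporaryDirectory() as git_dir:
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write("ref: refs/heads/master\n")
    update_head(git_dir, FIRST)   # first commit on master
    update_head(git_dir, SECOND)  # second commit on master
    attached = resolve_head(git_dir)
    # Simulate `git checkout <commit id>`: HEAD now holds the id directly.
    with open(os.path.join(git_dir, "HEAD"), "w") as f:
        f.write(FIRST + "\n")
    detached = resolve_head(git_dir)

print(attached)
print(detached)
```

In the detached case update_head would rewrite only the HEAD file, so no branch file records the new commit. That is why such commits are easy to lose track of.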
This is the basic knowledge of Git internals. It's a good starting point for implementing the basic features ourselves.