Git
Git is a free and open-source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
Unlike SVN (Subversion), Git is distributed. That is, it doesn’t require a centralized server to manage the files (although hosting services like GitHub can be convenient). This provides flexibility in management and makes it extremely good for open-source projects.
- Suggested Chinese References
- Official Guide
Note: Git works best with plain text. Try not to put too many binary or compressed files (such as .docx, .paper, .jpg) into the system.
Necessary binaries are better stored elsewhere (e.g. on image hosts), with only their paths kept in the Git files.
Frequently changed binary files can make the repo blow up in size and slow Git down.
Markdown is recommended for writing text in Git.
Installation
Git can be easily installed with sudo apt-get install git on most Debian-based systems.
The same package can be installed with most package managers in other Linux systems.
On the official site, a very handy shell, Git Bash, is provided for Windows. It comes with many frequently used tools, including Git.
Terminologies
To understand the functionality of Git, it is important to define some terminology.
Repository
Repository, or repo in short, is the core of Git. As the name suggests, it stores all the files (including the history) of a project.
A repo looks like a regular folder containing the files you are working on.
All the dirty work is hidden in the .git/ folder inside the repo, and that is what the version control is built on.
Workspace/Working Tree
In a repo, everything outside the .git/ folder is your current workspace.
The workspace can be used as if Git does not exist.
Other than the workspace, the physical structure of a repo is not important. All the “places” beyond this point are not related to real “folders” or “paths”. They are just abstract objects.
Stage/Index
The index is the place where Git holds a temporary version of your project. It is very simple to stage your current progress and roll back later if you want to discard the changes after the stage.
Commit
Once you reach a certain point of your project, you can archive it for later references. To do so, you need to commit your current stage (files in the index) to the repo history. A commit includes the current version of the project and what has been changed since the last commit. The commits will form a chain according to the dependency history. This provides a well-established version control system.
Each commit contains its hash as the id, the name of the author, and the time stamp. It is also recommended to attach a message to the commit, so that the history is easier to understand. Commits can also be signed with GPG keys, which prevents secret, unauthorized modifications by others.
Head
The head is a pointer to the current commit. It is used to check out each commit in the chain and roll back the project to a historical version.
Branch
Git can store multiple versions of your projects in parallel. Each version is called a branch. The workspace only shows one of the branches at a time, and you may jump between the branches with Git. It is the branch that gives Git the magic power of cooperation.
Note that uncommitted changes do not belong to any branch: when you jump to another branch, Git either carries them along or refuses to switch if they would be overwritten.
Remember to commit or stash them before checking out another branch.
Start with Git
Create a Repo
A repo can be set up with git init in any folder:
:path_to_repo $ git init
Initialized empty Git repository in [path_to_repo]
It is empty as no files are stored in the stage or commits at init.
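A minimal sketch of what init leaves behind, assuming a throwaway folder created with mktemp (the folder itself is only an illustration):

```shell
repo=$(mktemp -d)     # throwaway folder used only for this demo
cd "$repo"
git init -q           # -q hides the "Initialized empty Git repository" message
ls -A                 # the only entry is the .git folder
git status --short    # prints nothing: no file is staged or committed yet
```

Deleting the .git folder turns the repo back into a regular folder, which is an easy way to start over while experimenting.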
Temporary Version Control
To add files to the index, simply use
$ git add [file] [another file]
You can add multiple files at once, or stage all files by
$ git add .
You can use git status to check which files are modified (or untracked, or deleted) after staging.
Similarly, files can be removed from the index (not the workspace) with git rm --cached.
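The staging steps above can be tried in a scratch repo. This is only a sketch: the temporary folder and the second file, notes.txt, are assumptions made for the demo.

```shell
repo=$(mktemp -d)                 # throwaway folder used only for this demo
cd "$repo"
git init -q
echo "first" > demo.txt
echo "draft" > notes.txt          # hypothetical second file
git status --short                # both files are untracked (??)
git add demo.txt notes.txt        # stage both files
git status --short                # both files are staged (A)
git rm -q --cached notes.txt      # unstage notes.txt; the file stays in the workspace
git status --short                # demo.txt is staged, notes.txt is untracked again
```

Without --cached, git rm would also try to delete the file from the workspace, so the flag matters when you only want to unstage.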
Commit to the tree
After the files are staged, the index (not the workspace) can be committed to the tree. This is done with
$ git commit -m "some notes"
[master (root-commit) b2df984] 'some notes'
1 file changed, 1 insertion(+)
create mode 100644 demo.txt
where b2df984 is the id of the commit.
If the flag -m is not provided, a text editor will open and ask for your commit message.
You can sign the commit with a GPG signature using the flag -S:
$ git commit -S -m "sign the commit"
[master 7e45c33] sign the commit
1 file changed, 1 insertion(+), 1 deletion(-)
The same information can also be found in the git log:
$ git log
commit 7e45c334a3f60be6fa3d31cb91305ab0bd383376 (HEAD -> master)
Author: Demo <demo@demo>
Date:   Thu Apr 25 23:12:15 2019 +0800

    sign the commit

commit b6b7e26f3b2089cec745e90e7c073a7cd6a39695
Author: Demo <demo@demo>
Date:   Thu Apr 25 23:06:15 2019 +0800

    Another Commit
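The commit and log steps can be reproduced end to end in a scratch repo. This sketch assumes a temporary folder and sets the demo identity locally, so it does not depend on your global Git configuration; signing with -S is left out because it needs a configured GPG key.

```shell
repo=$(mktemp -d)                 # throwaway folder used only for this demo
cd "$repo"
git init -q
git config user.name "Demo"       # local identity, same as in the log above
git config user.email "demo@demo"
echo "first" > demo.txt
git add demo.txt
git commit -q -m "some notes"     # first commit
echo "second" > demo.txt
git add demo.txt
git commit -q -m "Another commit" # second commit
git log --oneline                 # newest first: abbreviated id and message per line
git rev-parse --short HEAD        # the id of the current commit
```

The ids will differ from the ones shown in this page, since they depend on the content, author, and time of each commit.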
Show the difference
One of the powerful functions of Git is diff.
It can show the difference between any two versions of a file.
This is especially useful for checking what has changed between two commits or between the index and the workspace.
$ git diff
diff --git a/demo.txt b/demo.txt
index 0f22871..e019be0 100644
--- a/demo.txt
+++ b/demo.txt
@@ -1 +1 @@
-extra
+second
a and b are prefixes that label the two versions being compared.
As shown in the output, - identifies what is deleted from a, and + identifies what is added to b.
The numbers between the @@ markers give the position of the change in each version (the starting line and, when it is not 1, the number of lines), and the text that follows is the explicit difference between the files.
Some tools, such as GitHub Desktop, may help you read this output with a more human-friendly interface.
This powerful tool only works well on plain text, and that is why binary files are not recommended for the Git system.
For the detailed usage of diff, see the cheat sheet at the end of the page.
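A short sketch of the three diff targets, using a scratch repo (the temporary folder and the local demo identity are assumptions for the demo):

```shell
repo=$(mktemp -d)                 # throwaway folder used only for this demo
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@demo"
echo "extra" > demo.txt
git add demo.txt
git commit -q -m "first commit"
echo "second" > demo.txt          # change the workspace only
git diff                          # workspace vs. index: shows -extra / +second
git add demo.txt                  # stage the change
git diff                          # prints nothing: workspace and index agree again
git diff --cached                 # index vs. HEAD: shows the staged change
git diff HEAD                     # workspace vs. HEAD
```

An empty git diff after staging is a common source of confusion; git diff --cached is the command that shows what the next commit will contain.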
Roll Back
As a version control system, Git provides several methods to roll back to old versions.
There are two main methods for this task:
- revert
- reset
The revert command undoes the changes of a commit without touching the existing history.
Rolling back with revert creates a new commit, which records this step.
If you want to roll back the commit history as well, you have to use the reset command.
To reset to a commit, its id is required, which can be found in the log.
If no commit is specified, the default value is HEAD.
The commit can also be indicated with HEAD^^^, where each ^ steps to the parent of the previous commit, and HEAD is the current commit.
Since Git is a two-step version control system (index first, then commit), you need to specify what happens to the index and the workspace. To keep things simple, the effects of the different flags are shown in the cheat sheet.
reset only resets the position of the head. The commits are not deleted after reset. You can find the history of all head movements with reflog:
$ git reflog
e475afc HEAD@{1}: reset: moving to HEAD^
1094adb (HEAD -> master) HEAD@{2}: commit: sign the commit
e475afc HEAD@{3}: commit: Another commit
eaadf4e HEAD@{4}: commit (initial): Some notes
which provides all commit ids, including those left behind by a reset, e.g. 1094adb here.
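To compare the two roll-back methods side by side, here is a sketch in a scratch repo with three small commits (the folder, the identity, and the commit messages are made up for the demo):

```shell
repo=$(mktemp -d)                 # throwaway folder used only for this demo
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@demo"
for n in 1 2 3; do                # three commits: version 1, 2, 3
  echo "version $n" > demo.txt
  git add demo.txt
  git commit -q -m "commit $n"
done
git revert --no-edit HEAD         # new commit that undoes "commit 3"; history grows to 4
cat demo.txt                      # version 2
git reset -q --hard HEAD^^^       # move the head three commits back; history shrinks to 1
cat demo.txt                      # version 1
git log --oneline                 # only "commit 1" is left in the log
git reflog                        # still lists the ids of the commits left behind
```

Any id shown by reflog can be passed to reset again, which is how a mistaken reset is usually undone.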
Change the Workspace
Git can maintain multiple branches in parallel.
This lets the people in a team work independently on different parts of the project and assemble the whole program later.
That is, each member can create their own branch and merge it into the main branch, usually called master, after the job is done.
You can download the branches from others and check out their progress.
To jump between different branches, use checkout:
$ git checkout master
Switched to branch 'master'
M demo.txt
The flag M here means that this file has local modifications, which checkout carries over to the branch you switch to.
If checkout is given the flag -b, a new branch will be created if it doesn’t exist yet.
This is equivalent to git branch [branch] followed by git checkout [branch].
(The flag -B works similarly, but it is dangerous because it resets the branch if it already exists.)
checkout can also be used to restore files from the last staged version:
$ git checkout demo.txt
Updated 1 path from the index
If no commit is specified, the default is the index.
Unlike reset, checkout doesn’t move the head (unless you are checking out another branch, which moves the head to that branch).
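Both uses of checkout, switching branches and restoring a file, can be sketched in a scratch repo. The temporary folder and the demo identity are assumptions; the symbolic-ref line only makes sure the first branch is called master, whatever the local default is.

```shell
repo=$(mktemp -d)                         # throwaway folder used only for this demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo"
git config user.email "demo@demo"
echo "stable" > demo.txt
git add demo.txt
git commit -q -m "first commit"
git checkout -q -b dev                    # create branch dev and switch to it
echo "experiment" > demo.txt
git commit -q -am "commit in dev"
git checkout -q master                    # the workspace shows master again
cat demo.txt                              # stable
echo "oops" > demo.txt                    # an unwanted edit in the workspace
git checkout -- demo.txt                  # restore the file from the index
cat demo.txt                              # stable
```

The -- separator tells Git that what follows is a path, which avoids ambiguity when a file and a branch share a name.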
Remote Repo
As a distributed system, a Git repo can be copied to different devices, which together form a network. There is no such thing as a ‘central server’, and each copy is equally important. However, people prefer to rely on a single server, which avoids the difficulty of synchronization.
Repos are commonly transferred between devices over SSH.
To connect to a Git server this way, you need to send your SSH public key to the server.
All major servers like GitHub, GitLab, and Gitee provide very simple instructions to do that.
Major servers may provide https:// access to the repos as well, which might be simpler for beginners.
But you may need to enter your credentials every time you push your branches.
After SSH is set up, the only information you need is the address of the remote repo.
It should be something like git@github.com:UserName/Demo.git.
You can connect the remote repo to your local repo with
$ git remote add origin git@github.com:UserName/Demo.git
origin is the conventional default name for a remote repo.
After setting up the remote repo, you can sync between the remote and local repos.
To push the local repo to the remote, simply use the alias origin defined above:
$ git push -u origin master
Counting objects: 20, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (15/15), done.
Writing objects: 100% (20/20), 1.64 KiB | 560.00 KiB/s, done.
Total 20 (delta 5), reused 0 (delta 0)
remote: Resolving deltas: 100% (5/5), done.
To git@github.com:UserName/Demo.git
* [new branch] master -> master
Branch 'master' set up to track remote branch 'master' from 'origin'.
The branch master must be specified the first time.
The flag -u is used to set up the link between the local and remote master branches, as indicated in the last line of the output.
After the link is set up, the branch name can be omitted in later pushes.
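The same workflow can be tried without any server. In this sketch a bare repo in a temporary folder stands in for the remote; that stand-in, the folder names, and the demo identity are all assumptions, and a real address such as the GitHub one above would take the place of the local path.

```shell
base=$(mktemp -d)                         # throwaway folder used only for this demo
git init -q --bare "$base/remote.git"     # stand-in for the server
git init -q "$base/local"
cd "$base/local"
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo"
git config user.email "demo@demo"
echo "first" > demo.txt
git add demo.txt
git commit -q -m "first commit"
git remote add origin "$base/remote.git"  # name first, then address
git remote -v                             # lists origin for fetch and push
git push -q -u origin master              # first push: name the branch, link it with -u
echo "second" >> demo.txt
git commit -q -am "second commit"
git push -q                               # later pushes can omit remote and branch
```

Because of the link created by -u, git status also starts reporting whether the local branch is ahead of or behind origin/master.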
Merge
In real projects, new features are usually developed in independent new branches, e.g. dev.
Once the feature is finished, we want to merge it into the master branch.
This can be done with merge on the master branch.
If master has no new commits of its own since dev was created, there is nothing to reconcile and the merge can be done in Fast-forward mode:
$ git merge dev
Updating d46f35e..b17d20e
Fast-forward
demo.txt | 1 +
1 file changed, 1 insertion(+)
Otherwise, if there are conflicts, you will be notified and Git will enter the merging mode:
$ git merge dev
Auto-merging demo.txt
CONFLICT (content): Merge conflict in demo.txt
Automatic merge failed; fix conflicts and then commit the result.
In this mode, git status will show you where the conflicts are.
You can abort the merge and roll back to normal mode with git merge --abort, or edit the files listed and resolve the conflicts.
In the merging mode, the conflicting lines are marked in the files as
<<<<<<< HEAD
This is master branch.
=======
This is develop branch.
>>>>>>> dev
The lines between <<<<<<< HEAD and ======= are the lines in HEAD.
The lines from the dev branch are below =======, up to >>>>>>> dev.
Replace this whole chunk, markers included, with what you want to keep, and then save the files.
Once you are done, you can add the files to the index and commit the merge.
The bifurcations and merges of the branches form loops in the git log.
You can see the loops in a graph view with git log --graph.
$ git log --graph --pretty=oneline --abbrev-commit
* 2f128c4 conflict solved
|\
| * 1c85f25 a conflict commit
| * dd37a3b commit in dev
* | a22211f second commit
|/
* 1c7e0d8 first commit
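The conflict scenario can be reproduced from scratch. The sketch below assumes a temporary folder and a local demo identity, and resolves the conflict by simply overwriting the file with a made-up merged line.

```shell
repo=$(mktemp -d)                         # throwaway folder used only for this demo
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # name the first branch master
git config user.name "Demo"
git config user.email "demo@demo"
echo "base" > demo.txt
git add demo.txt
git commit -q -m "first commit"
git checkout -q -b dev
echo "This is develop branch." > demo.txt
git commit -q -am "commit in dev"
git checkout -q master
echo "This is master branch." > demo.txt
git commit -q -am "second commit"
git merge dev || true                     # both branches changed the same line: conflict
git status --short                        # UU demo.txt marks the conflicting file
echo "This is the merged text." > demo.txt   # replace the marked chunk
git add demo.txt                          # mark the conflict as resolved
git commit -q -m "conflict solved"        # finish the merge
git log --graph --pretty=oneline --abbrev-commit
```

The || true is only there so that a script keeps running after the expected conflict; on the command line you would just read the message and start editing.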
However, it is relatively hard to read and fix files on the command line.
Git has a built-in GUI, which can be opened with git gui.
It makes reading logs and status easier.
To better merge and view differences, a GUI tool is recommended.
Personally, I prefer VSCode, as it lists the differences in a more human-friendly interface than the native Git output.
The vanilla VSCode is good enough for light duties.
To configure the tools, simply paste the code below into your Git config file:
[merge]
tool = vscode
[mergetool "vscode"]
cmd = "code --wait $MERGED"
[diff]
tool = vscode
[difftool "vscode"]
cmd = "code --wait --diff $LOCAL $REMOTE"
This config file can be edited with git config -e.
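The same settings can also be written from the command line instead of editing the file by hand. This is a sketch in a scratch repo: it only stores the settings, and actually launching the tools assumes the code command of VSCode is on your PATH.

```shell
repo=$(mktemp -d)                 # throwaway folder used only for this demo
cd "$repo"
git init -q
# Single quotes keep $MERGED, $LOCAL and $REMOTE literal; Git fills them in later.
git config merge.tool vscode
git config mergetool.vscode.cmd 'code --wait $MERGED'
git config diff.tool vscode
git config difftool.vscode.cmd 'code --wait --diff $LOCAL $REMOTE'
git config --local --get-regexp tool   # list what was written to .git/config
```

Adding --global to each git config call would store the settings for the current user instead of this one repo.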
Other Features
Git also provides other handy tools for version control.
Stash
Stash provides you a place to temporarily save your workspace.
If you need to switch to another branch in an emergency, e.g. to fix a serious bug on the master branch, your uncommitted changes may block the switch or get mixed into the other branch.
In case you don’t want to commit half-finished work, you can use git stash to put it aside.
git stash list lists all stashes, and git stash pop recovers the latest one.
If you need to recover an older stash, you need to specify its name (e.g. stash@{1}), which is shown in the stash list.
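A minimal stash round trip, sketched in a scratch repo (the temporary folder, the demo identity, and the file content are assumptions):

```shell
repo=$(mktemp -d)                         # throwaway folder used only for this demo
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@demo"
echo "stable" > demo.txt
git add demo.txt
git commit -q -m "first commit"
echo "half-finished work" >> demo.txt     # uncommitted change
git stash -q                              # put it aside; the workspace is clean again
git status --short                        # prints nothing
git stash list                            # one entry, named stash@{0}
# ... switch branches, fix the bug, come back ...
git stash pop -q                          # bring the latest stash back and drop it
cat demo.txt                              # stable, then half-finished work
```

git stash apply does the same as pop but keeps the entry in the list, which is safer when you are not sure the stash will apply cleanly.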
Tag
It is annoying to memorize commit ids.
Without help, you need to read through the commit history every time and carefully pick out the one you need in order to check out an old version.
For important milestone commits, you may give them tags, e.g. v0.1.
You can use git tag to list all the tags in the repo, and git show [tag] to check the information attached to one.
$ git tag -a v0.1 -m "Some notes"
adds a tag v0.1 to the current commit.
The flag -a indicates that you want an annotated tag, which records your information and the time stamp.
The flag -m puts a note into the tag.
You can also use -s instead of -a to sign the tag with your GPG key.
It is also possible to tag an older commit by specifying its id, for example: $ git tag -a v0.1 ffffff.
Tags are not pushed to the remote automatically.
You can push them by specifying the flag --tags when you push.
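The tagging commands can be tried together in a scratch setup. In this sketch a local bare repo stands in for the remote, and the tag v0.2, the folder names, and the demo identity are assumptions made for the demo.

```shell
base=$(mktemp -d)                         # throwaway folder used only for this demo
git init -q --bare "$base/remote.git"     # stand-in for the server
git init -q "$base/local"
cd "$base/local"
git config user.name "Demo"
git config user.email "demo@demo"
echo "first" > demo.txt
git add demo.txt
git commit -q -m "first commit"
first=$(git rev-parse HEAD)               # remember the id of the older commit
echo "second" > demo.txt
git commit -q -am "second commit"
git tag -a v0.2 -m "Some notes"           # annotated tag on the current commit
git tag -a v0.1 -m "First version" "$first"   # tag an older commit by its id
git tag                                   # lists v0.1 and v0.2
git show -s v0.1                          # tagger, date, note, and the tagged commit
git remote add origin "$base/remote.git"
git push -q origin --tags                 # tags are only pushed when asked for
```

A single tag can also be pushed by name, e.g. git push origin v0.1, when you do not want to publish all of them.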
Cheat Sheet
- Configuration
  - Use git config -e to edit configuration files, where -e is shorthand for --edit.
  - The default is the --local configuration, which is for the current repo. The --global flag is for the current user, and --system is for all users.
For diff, the default target is the index.
Command | Apply to |
---|---|
diff | Workspace vs. Index |
diff HEAD | Workspace vs. HEAD |
diff --cached | Index vs. HEAD |
For reset, the default target is HEAD.
Command | Apply to |
---|---|
reset --soft | HEAD -> HEAD |
reset --mixed | HEAD -> Index |
reset --hard | HEAD -> Index -> Workspace |
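The three reset modes in the table can be compared in a scratch repo. This sketch re-creates the second commit between the steps so that each mode starts from the same state; the folder, identity, and messages are made up for the demo.

```shell
repo=$(mktemp -d)                 # throwaway folder used only for this demo
cd "$repo"
git init -q
git config user.name "Demo"
git config user.email "demo@demo"
echo "one" > demo.txt
git add demo.txt
git commit -q -m "first"
echo "two" > demo.txt
git commit -q -am "second"

git reset -q --soft HEAD^         # head moves back; index and workspace keep "two"
git status --short                # "M  demo.txt": the change is still staged
git commit -q -m "second again"

git reset -q --mixed HEAD^        # head and index move back; workspace keeps "two"
git status --short                # " M demo.txt": the change is unstaged
git commit -q -am "second once more"

git reset -q --hard HEAD^         # head, index, and workspace all move back
cat demo.txt                      # one
```

--mixed is what reset does when no mode is given, and --hard is the only one of the three that can discard work in the workspace.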
For checkout, the default target is HEAD.
Command | Apply to |
---|---|
checkout | - |
checkout [branch] | [branch].HEAD -> Workspace |
checkout [commit] | [commit] -> Workspace |