Git is a version control system that lets you track and manage the history of your source code and projects.
GitHub is a cloud-based hosting service that lets you manage Git repositories remotely, and more.
You can use Git without GitHub, but you cannot use GitHub without Git.
After installing Git, you can run git commands from a command-line interface (CLI). Please have your CLI ready to follow the hands-on sessions below.
1. Git first setup
In this step, you will configure your Git username and email address. These values identify you as the author of every commit made on your local machine. Note that they are not used to log in to GitHub; they are simply recorded in each commit. Still, using your GitHub account email is recommended, because GitHub uses the commit email to link commits to your account.
git config --global user.name "Maverickmiaow"
git config --global user.email "ycheng0116.us@gmail.com"
2. Create git repository on a local machine
Next, you are going to create a local Git repository. You can either create a new project folder for testing or use an existing folder.
On your CLI, move to the project folder so that the next git-related actions will be performed on that folder.
cd /Users/maverickmiaow/Desktop/test-repo
Initialize git repository.
git init
Initialized empty Git repository in /Users/maverickmiaow/Desktop/test-repo/.git/
After you initialize a Git repository, Git classifies every file in the project folder into one of three categories: tracked, untracked, or ignored. Ignored files are those you tell Git not to watch at all, such as log files generated automatically at runtime. Git reads the list of files to ignore from the .gitignore file, a plain text file in which each line contains a pattern for files or directories to ignore. It is generally placed in the root folder of the repository, which is what I recommend.
To understand tracked and untracked files, you first need to know what the staging environment is. The staging environment (also called the index) is where you tell Git which files to watch. In other words, Git does not keep track of changes to a file unless you add it to the staging environment. Files that have been added to the staging environment (or included in a previous commit) are called tracked files. Files that are neither ignored nor tracked are untracked files.
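The three categories can be illustrated with a small Python sketch. This is a simplified model, not how Git is implemented: the function name classify is hypothetical, the file names are made up for illustration, and fnmatch only approximates the pattern rules of a real .gitignore file.

```python
from fnmatch import fnmatch


def classify(paths, tracked, ignore_patterns):
    """Sort paths into tracked, ignored, and untracked (simplified model)."""
    result = {"tracked": [], "ignored": [], "untracked": []}
    for path in paths:
        if path in tracked:
            # A file that is already tracked stays tracked,
            # even if an ignore pattern happens to match it.
            result["tracked"].append(path)
        elif any(fnmatch(path, pattern) for pattern in ignore_patterns):
            result["ignored"].append(path)
        else:
            result["untracked"].append(path)
    return result


# Hypothetical project folder: one staged file, one log file, one new file.
project_files = ["myscript.py", "run.log", "notes.txt"]
print(classify(project_files, tracked={"myscript.py"}, ignore_patterns=["*.log"]))
```

In this model, notes.txt is what git status would list under "Untracked files", while run.log would not be listed at all.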
Check the status of your Git repository.
git status
On branch master

No commits yet

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .idea/
        .ipynb_checkpoints/
        img/
        master-git-github-slides-v2.ipynb
        myscript.py

nothing added to commit but untracked files present (use "git add" to track)
Add an individual file to the staging environment.
git add myscript.py
or add all files to the staging environment
git add .
Now, let's see what happens to your git repository.
git status
On branch master

No commits yet

Changes to be committed:
  (use "git rm --cached <file>..." to unstage)
        new file:   myscript.py

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        .idea/
        .ipynb_checkpoints/
        img/
        master-git-github-slides-v2.ipynb
A new file has been added to the staging environment. As the output suggests, you can use git rm --cached <file> to unstage it.
3. First commit
The git commit command captures a snapshot of the project's currently staged changes, that is, everything you have staged since the last commit.
Let's say you want to commit new files added to the project folder. You attach a message to the commit describing what you changed. The message should be descriptive enough that your collaborators can understand it when you work on a team project. If you do not have a naming strategy, this is the one I am using for now.
git commit -m "docs: add myscript.py"
[master (root-commit) d15cbac] docs: add myscript.py
 1 file changed, 0 insertions(+), 0 deletions(-)
 create mode 100644 myscript.py
Now, let's check what changes we have made to this Git repository.
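The "type: summary" shape of the commit message above can be checked mechanically. The following is a minimal sketch, assuming a fixed list of prefixes: only docs and test appear in this tutorial, and the other entries in ALLOWED_TYPES, as well as the function name is_valid_message, are hypothetical.

```python
import re

# Hypothetical list of allowed prefixes; only "docs" and "test" are used in this tutorial.
ALLOWED_TYPES = ("docs", "test", "feat", "fix")
PATTERN = re.compile(r"^(%s): \S.*$" % "|".join(ALLOWED_TYPES))


def is_valid_message(message):
    """Return True if the first line of the message looks like 'type: summary'."""
    first_line = message.splitlines()[0] if message else ""
    return PATTERN.match(first_line) is not None


print(is_valid_message("docs: add myscript.py"))
print(is_valid_message("changed some stuff"))
```

A check like this could be run before committing, but whether to enforce a convention at all is a team decision.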
git log --pretty=format:"%h - %an, %ar : %s" --graph
* d15cbac - Maverickmiaow, 1 second ago : docs: add myscript.py
This log entry shows the abbreviated unique ID of the commit, the Git username of the author, how long ago it was committed, and the commit message. You can include more information in the log output by changing the --pretty format string.
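To make the four fields of this log format concrete, here is a small Python sketch that splits one line of the output into its parts. It assumes the exact format string used above ("%h - %an, %ar : %s"); the function name parse_log_line is hypothetical, and the parsing would break for author names that contain a comma followed by a space.

```python
import re

# Matches one line produced by: git log --pretty=format:"%h - %an, %ar : %s" --graph
# The leading \W* skips graph characters such as "* " or "| * ".
LOG_LINE = re.compile(
    r"^\W*(?P<hash>[0-9a-f]+) - (?P<author>.+?), (?P<when>.+?) : (?P<subject>.*)$"
)


def parse_log_line(line):
    """Return the fields of one log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line)
    return match.groupdict() if match else None


print(parse_log_line("* d15cbac - Maverickmiaow, 1 second ago : docs: add myscript.py"))
```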
A frequently asked question is how often to commit. The short answer is that it depends and is up to you; you probably already have your own version control habits. Think of how you worked with Word documents before knowing any version control system. What I used to do was duplicate, rename, modify, and save. With Git, the workflow becomes modify, add to the staging environment, and commit. In this regard, a Git commit is roughly equivalent to the save button in Microsoft Word, so it is recommended to commit often. My strategy is to plan the versions of the project ahead and make a major commit for each version. On top of that, I make a minor commit whenever the changes can be summarized by a descriptive message.
4. Understand branch
A branch represents an independent line of development. Every Git repository has at least one primary (root) branch, which is usually named master or main and is created automatically when the repository is initialized.
By running git branch, you can check which branches exist in your Git repository and which branch you are working on.
git branch
* master
To keep the master branch stable, it is highly recommended to create a new branch and commit there until you consider the changes final. Branches can be used in different ways depending on your needs. Say you want to implement a feature with several different approaches and keep the one that performs best. You can create one branch per approach, develop and test them in parallel, and later decide which one to merge into the master branch. The other branches are then simply left out of the final version of your project.
Next, you are going to create a new branch named test and list all branches.
git checkout -b test
Switched to a new branch 'test'
Let's see what branches exist now. The star indicates the branch you are currently working on.
git branch
  master
* test
This tree shows how branches and commits develop in your Git repository. Each line represents a branch, with the branch name written at the end of the line. The highlighted text is the unique ID of the commit that each branch points to.
d15cbac  master
   ||
   \/
        test
       /
      /
d15cbac  master
You can now make some changes to the myscript.py file in the project folder. Then add it to the staging environment and commit the change.
git add myscript.py
git commit -m 'test: test branch'
[test e45cbbc] test: test branch
 1 file changed, 1 insertion(+)
This is what happened to your Git repository:
git log --pretty=format:"%h - %an, %ar : %s" --graph
* e45cbbc - Maverickmiaow, 2 seconds ago : test: test branch
* d15cbac - Maverickmiaow, 42 seconds ago : docs: add myscript.py
        e45cbbc  test
       /
      /
d15cbac  master
5. Branch manipulation
Next, you will learn how to integrate a branch into the master branch using git merge.
First, switch back to the master branch.
git checkout master
Switched to branch 'master'
Merge the test branch into the current branch (the master branch) with a merge commit.
git merge test --commit -m 'finalize merge' --no-ff
Merge made by the 'recursive' strategy.
 myscript.py | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
Now, let's check the commit tree.
git log --pretty=format:"%h - %an, %ar : %s" --graph
*   ec20fd3 - Maverickmiaow, 1 second ago : finalize merge
|\
| * d249a89 - Maverickmiaow, 12 seconds ago : test: test branch
|/
* 138fd54 - Maverickmiaow, 43 seconds ago : docs: add myscript.py
This shows how your git branches and commits were developed.
        d249a89  test
       /
      /
138fd54  master
   ||
   \/
        d249a89  test
       /       \
      /         \
138fd54 -------- ec20fd3  master
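The commit tree above can also be modeled in a few lines of Python. This is only a toy model for intuition, assuming a branch is just a name that points to one commit and a merge commit is a commit with two parents; the function names are hypothetical, and the commit IDs are copied from the log output above.

```python
commits = {}   # commit ID -> list of parent commit IDs
branches = {}  # branch name -> ID of the commit the branch points to


def commit(branch, commit_id):
    """Add a commit on top of the branch and move the branch pointer to it."""
    parent = branches.get(branch)
    commits[commit_id] = [parent] if parent else []
    branches[branch] = commit_id


def create_branch(name, start):
    """A new branch starts out pointing to the same commit as the start branch."""
    branches[name] = branches[start]


def merge_no_ff(target, source, commit_id):
    """Like 'git merge --no-ff': always create a merge commit with two parents."""
    commits[commit_id] = [branches[target], branches[source]]
    branches[target] = commit_id


commit("master", "138fd54")               # docs: add myscript.py
create_branch("test", "master")           # git checkout -b test
commit("test", "d249a89")                 # test: test branch
merge_no_ff("master", "test", "ec20fd3")  # finalize merge

print(commits["ec20fd3"])  # ['138fd54', 'd249a89']
print(branches)            # {'master': 'ec20fd3', 'test': 'd249a89'}
```

Note that the test branch still points to d249a89 after the merge; merging moves only the branch you merge into.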
1. Create a new repository on GitHub
Please follow this tutorial.
2. Push local repository to GitHub
Before linking the GitHub repository with your local Git repository, you need to configure how Git stores your GitHub credentials. On Windows, run git config --global credential.helper wincred. On macOS, run the following command.
git config --global credential.helper osxkeychain
If you skipped the previous hands-on section, please run the following commands.
cd /Users/maverickmiaow/Desktop/test-repo
git init
git add .
git commit -m 'initialize repo'
You have already initialized your local Git repository and made some commits. Now you need to tell Git where to find the corresponding GitHub repository by passing its URL to the git remote add command.
git remote add origin https://github.com/YanCheng-Env/test-repo.git
Now, the URL has been configured and can be retrieved by running git remote show origin
git remote show origin
* remote origin
  Fetch URL: https://github.com/YanCheng-Env/test-repo.git
  Push  URL: https://github.com/YanCheng-Env/test-repo.git
  HEAD branch: (unknown)
You are now ready to push the local repository to the corresponding GitHub repository.
git push -u origin master
Branch 'master' set up to track remote branch 'master' from 'origin'.
To https://github.com/YanCheng-Env/test-repo.git
 * [new branch]      master -> master
Next, you will learn how to collaborate on GitHub repositories that are shared by others. The figure below shows the workflow. Upstream refers to the original repository, which is most likely created by the project manager and hosted on their GitHub account. Origin is a copy of the Upstream repository on your own GitHub account. Local is a copy of the Origin repository on your local machine.
1. Fork a repository on GitHub
First, you need to fork a repository. This repository is the Upstream in the workflow graph. It can be a public or private repository shared by others. If it is a private repository, the owner has to grant you access. Please follow this instruction.
This step is performed on the GitHub web page. This is the instruction on how to fork a repository in GitHub. After that, you should see the forked repository in your own GitHub account. This repository is what we call Origin in the workflow graph.
2. Clone the forked repository locally
Next, you are going to make a local copy of the forked repository. This copy corresponds to Local in the workflow graph.
On your CLI, move to the folder where you want to save the clone.
cd /Users/maverickmiaow/Desktop/clone
Then create a local clone using git clone. After that, you should find the forked repository in that folder.
git clone https://github.com/YanCheng-Env/fork-demo.git
Cloning into 'fork-demo'...
3. Track the original repository
How do you keep your local copy of the Upstream up to date? You need to tell Git the URL of the Upstream repository. Don't be confused: this is not the URL of your Origin (forked) repository.
On the CLI, move to the local copy of the Upstream.
cd /Users/maverickmiaow/Desktop/clone/fork-demo
Then configure the URL of Upstream.
git remote add --track master upstream https://github.com/YanCheng-go/fork-demo.git
Now, let's look at the URLs of Origin and Upstream.
git remote -v
origin    https://github.com/YanCheng-Env/fork-demo.git (fetch)
origin    https://github.com/YanCheng-Env/fork-demo.git (push)
upstream  https://github.com/YanCheng-go/fork-demo.git (fetch)
upstream  https://github.com/YanCheng-go/fork-demo.git (push)
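The relationship between the two remotes is easier to see as a lookup table. The sketch below parses text in the shape of the git remote -v output into a dictionary; it assumes each line has exactly three whitespace-separated fields, and the function name parse_remotes is hypothetical.

```python
def parse_remotes(text):
    """Turn 'git remote -v' style output into {remote name: {fetch/push: URL}}."""
    remotes = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or unexpected lines
        name, url, kind = parts
        remotes.setdefault(name, {})[kind.strip("()")] = url
    return remotes


# Sample text copied from the output above.
sample = """\
origin    https://github.com/YanCheng-Env/fork-demo.git (fetch)
origin    https://github.com/YanCheng-Env/fork-demo.git (push)
upstream  https://github.com/YanCheng-go/fork-demo.git (fetch)
upstream  https://github.com/YanCheng-go/fork-demo.git (push)
"""
remotes = parse_remotes(sample)
print(remotes["origin"]["push"])     # your fork
print(remotes["upstream"]["fetch"])  # the original repository
```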
You are now ready to sync the repository with the Upstream repository.
Let's say you want to sync any changes on the master branch. You need to 1) check out the master branch, 2) fetch Upstream, and 3) rebase the local master branch on top of the upstream master branch. Any changes made to the upstream master branch since the last synchronization are then synchronized to your local copy.
git checkout master
git fetch upstream
git rebase upstream/master
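If you repeat this sync often, the three commands above can be generated by a small helper. The following is a sketch under stated assumptions: the function names sync_commands and sync are hypothetical, the remote is assumed to be named upstream as configured earlier, and actually calling sync requires Git to be installed and the current directory to be the cloned repository.

```python
import subprocess


def sync_commands(branch="master", remote="upstream"):
    """Return the three sync commands as argument lists."""
    return [
        ["git", "checkout", branch],
        ["git", "fetch", remote],
        ["git", "rebase", f"{remote}/{branch}"],
    ]


def sync(branch="master", remote="upstream", run=subprocess.run):
    """Run the sync commands in order, stopping at the first failure."""
    for argv in sync_commands(branch, remote):
        run(argv, check=True)  # check=True raises if a command fails


# Only print the commands here; call sync() yourself inside a repository.
for argv in sync_commands():
    print(" ".join(argv))
```

Stopping at the first failure matters here: if the fetch fails, rebasing onto a stale upstream/master would give a misleading result.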
Let's look at what changes have been made on Upstream.
git log --pretty=format:"%h - %an, %ar : %s" --graph
* 85ac979 - Yan Cheng, 22 minutes ago : Create projectscript.py
On branch main
Your branch is up to date with 'origin/main'.

nothing to commit, working tree clean
So far, you have only synchronized the local copy of the Upstream repository. The GitHub copy has not been updated yet. By calling git status, you should see that your local branch is ahead of 'origin/master' by a number of commits.
git status
Next, update the GitHub copy (the forked repository) as well by running git push origin master.
git push origin master
To https://github.com/YanCheng-Env/master-git-github.git
   9a7821f..7b68d4a  master -> master
4. Contribute to repositories
Next, you will be learning how to contribute to the repositories. The following figure shows the workflow. You will need to
The first three steps are performed on your local machine with git commands. Steps 4 through 6 are done on GitHub.
GitHub Flow
Let's say that you have made some changes on the local repository.
Next, you will need to make a new branch on your local repository.
git checkout -b test-commit-fork upstream/master
On branch test-commit-fork
Your branch is ahead of 'upstream/master' by 1 commit.
  (use "git push" to publish your local commits)

nothing to commit, working tree clean
Branch 'master' set up to track remote branch 'master' from 'origin'.
Everything up-to-date
Then, you need to add files to the staging environment and make commits.
git add test-commit-fork.txt
git commit -m "test: test commit fork"
Once you think it is ready, you can use git push to push changes to the GitHub copy (forked repository).
git push -u origin test-commit-fork
Branch 'test-commit-fork' set up to track remote branch 'test-commit-fork' from 'origin'.
remote:
remote: Create a pull request for 'test-commit-fork' on GitHub by visiting:
remote:      https://github.com/YanCheng-Env/master-git-github/pull/new/test-commit-fork
remote:
To https://github.com/YanCheng-Env/master-git-github.git
 * [new branch]      test-commit-fork -> test-commit-fork
Open your forked repository on GitHub. You should see a message saying something like "hey, we noticed that you have made some changes to the forked repository. Would you like to integrate your changes into the Upstream? Then submit a pull request."
Next, submit a pull request. The owner of the Upstream repository will receive the pull request and start a review process. Once the pull request is approved, your changes will be integrated into the Upstream repository.
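# Load local files
from google.colab import files
src = list(files.upload().values())[0]  # contents (bytes) of the first uploaded file
with open('mylib.py', 'wb') as f:
    f.write(src)
import mylib  # import the file saved above; module names cannot contain hyphens

# Mount Google Drive
from google.colab import drive
drive.mount('/content/gdrive')
import sys
sys.path.append('/content/gdrive/my-python-directory')
import python_scripts  # placeholder name for a module stored in that directory

# Connect to GitHub repositories
!git clone https://github.com/YanCheng-go/master-git-github.git
import sys
sys.path.insert(0, '/content/master-git-github')  # git clone creates a folder named after the repository
import python_scripts  # placeholder name for a module inside the cloned repository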
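All three approaches rely on the same mechanism: once a folder is on sys.path, any .py file inside it can be imported by its file name. Here is a minimal sketch that runs outside Colab too, using a temporary folder in place of the Drive or repository folder. The module name demo_helper and its greet function are made up for illustration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as folder:
    # Write a tiny module into the folder (stands in for a script in Drive or a cloned repo).
    Path(folder, "demo_helper.py").write_text("def greet():\n    return 'hello'\n")

    sys.path.insert(0, folder)     # make the folder importable
    importlib.invalidate_caches()  # make sure the newly created file is noticed
    demo_helper = importlib.import_module("demo_helper")
    print(demo_helper.greet())     # hello

    sys.path.remove(folder)        # clean up the search path again
```

Because the import works on file names, a script called python-scripts.py could not be imported with a plain import statement; renaming it to use an underscore avoids the problem.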
Review