A version control system is a tool for managing changes to computer code over time by efficiently tracking all modifications. One of the most popular version control systems is “git”. Other version control systems, such as Subversion (SVN), are also widely used, but “git” has become a de facto standard. Git is generally already installed on most Linux-like systems.
Version control is essential for projects with multiple people editing the same collection of code. However, it can also help even a single user more effectively and efficiently manage and document their own scripts and programs.
Read more about version control and git here. In particular, you should read:
To begin tracking an existing project, move to the top folder in the project tree and type:
git init
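As a minimal sketch of what typically follows git init, the commands below create a scratch project, stage one file, and record a first commit. The temporary directory, the file name analysis.R, and the identity values are placeholders chosen for illustration.

```shell
# Work in a throwaway directory so no existing project is touched.
proj=$(mktemp -d)
cd "$proj"

git init -q                      # create the .git folder; the folder is now a repository

# Identity for this repository only (placeholder values).
git config user.name "Your Name"
git config user.email "you@example.com"

echo 'x <- 1' > analysis.R       # hypothetical file to track
git add analysis.R               # stage the file
git commit -q -m "Initial commit"

git log --oneline                # one line per commit recorded so far
```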
To create a local copy of an existing repository:
git clone "url for repo"
For example, to clone the “Stats506_F17” repository from github.com:
git clone https://github.com/jbhender/Stats506_F17
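To try cloning without network access, the sketch below builds a small repository in a temporary folder and clones it by path, since git clone accepts a local path as well as a URL. The folder names, the README.md file, and the identity values are assumptions made for illustration.

```shell
# Build a small source repository to stand in for a remote one.
src=$(mktemp -d)
git init -q "$src"
git -C "$src" config user.name "Your Name"
git -C "$src" config user.email "you@example.com"
echo "notes" > "$src/README.md"
git -C "$src" add README.md
git -C "$src" commit -q -m "Add README"

# Clone it into a new folder named "copy".
copy=$(mktemp -d)/copy
git clone -q "$src" "$copy"

# The clone holds the files and the full commit history.
ls "$copy"
git -C "$copy" log --oneline
```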
Git is a “distributed version control system” meaning that all copies are “local”. When you begin a version controlled project using git, the project folder is itself a repository. You do not need a remote repository to use git.
However, git is used most effectively with a remote repository. Remote git repositories such as github or bitbucket are effective means for both backing up and sharing the code you write. It is also a great way to manage work across multiple computers, such as a personal laptop and one or more servers.
The following commands are useful for working with a remote repository.
To link a local and remote repository, use git remote add, then git push -u to upload your commits and set the upstream branch (this is based on github):
git remote add origin url_to_remote_repo.git
git push -u origin master
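The linking step can be rehearsed locally. In this sketch a bare repository in a temporary folder stands in for the remote (an assumption, in place of a github URL), and the file name and identity values are placeholders.

```shell
# A bare repository in a temp folder plays the role of the remote.
remote=$(mktemp -d)/remote.git
git init -q --bare "$remote"

# A local project with one commit.
local_repo=$(mktemp -d)
cd "$local_repo"
git init -q
git config user.name "Your Name"
git config user.email "you@example.com"
echo "notes" > README.md
git add README.md
git commit -q -m "Add README"
git branch -M master             # make sure the branch is named master

git remote add origin "$remote"  # register the remote under the name "origin"
git push -q -u origin master     # upload and set origin/master as the upstream

git remote -v                    # list the configured remotes
git status -sb                   # first line shows master...origin/master
```

Once the upstream is set, a plain git push or git pull on this branch knows which remote branch to use.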
If you haven’t made any local commits and want to bring your local copy up to date with the remote repository, use git pull. To retrieve remote changes without merging them into your local branch, use git fetch followed by git status or git diff (for example, git diff master origin/master) to see the changes that have been made. You can then use git merge or git pull to finish merging.
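The fetch, inspect, merge sequence might look like the following sketch. It assumes two working copies (named laptop and server here purely for illustration) sharing a bare repository in a temporary folder as their remote; the file name and identity values are also placeholders.

```shell
base=$(mktemp -d)

# Seed a repository and publish it as a bare "remote" (a local stand-in).
git init -q "$base/seed"
cd "$base/seed"
git config user.name "Your Name"
git config user.email "you@example.com"
echo "line 1" > notes.txt
git add notes.txt
git commit -q -m "First commit"
git branch -M master
git clone -q --bare "$base/seed" "$base/remote.git"

# Two working copies of the same remote.
git clone -q "$base/remote.git" "$base/laptop"
git clone -q "$base/remote.git" "$base/server"

# The laptop adds a commit and pushes it.
cd "$base/laptop"
git config user.name "Your Name"
git config user.email "you@example.com"
echo "line 2" >> notes.txt
git commit -q -am "Add a second line"
git push -q origin master

# The server retrieves the change without merging, inspects it, then merges.
cd "$base/server"
git fetch -q origin
git status -sb                   # reports that master is behind origin/master
git diff master origin/master    # shows the incoming change
git merge -q origin/master       # fast-forward the local branch
```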
When using a remote code repository like github or bitbucket, be careful not to submit any protected or sensitive data. When working with such data, it is especially important to use git status before git push to ensure you know exactly what will be uploaded to a remote repository.
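Only committed files can be uploaded, so one way to check is to review what is staged before committing. The sketch below uses a scratch repository with two hypothetical files, model.R and patients.csv, and stages only the script.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "code" > model.R             # hypothetical script, safe to share
echo "id,value" > patients.csv    # hypothetical sensitive data

git add model.R                   # stage only the script, not the data
git status --short                # "A  model.R" is staged, "?? patients.csv" is untracked
git diff --cached --name-only     # files the next commit would contain
```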
Similarly, when working on a project with someone else, be sure to check whether there is any concern about sharing the code in a public or private repository.
As with most programs, you can set preferences in git to improve your workflow. Specifically, this can be done from the command line using git config. Here are a few options you may want to set everywhere you use git:
user.name - the name recorded as the author of your commits,
user.email - the email address recorded with your commits (use the one associated with your remote repository account),
core.editor - your preferred text editor (defaults to vi).
Setting your user name and email lets git label each commit you make, and hosting services such as github use the email to link commits to your account.
git config --global user.name "jbhender"
git config --global user.email "jbhender@umich.edu"
To set your default editor to emacs (substitute your choice of editor) use:
git config --global core.editor emacs
This could also be done in the shell by exporting the EDITOR environment variable,
export EDITOR=emacs
echo $EDITOR
but doing so will affect all programs that rely on the EDITOR variable.
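If both are set, git prefers core.editor over EDITOR. A quick way to see which editor git would launch is git var GIT_EDITOR, sketched here in a scratch repository with a repository-level setting so your global configuration is left alone; the editor names are only examples.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
unset GIT_EDITOR                  # this variable, if set, would override everything below

git config core.editor emacs      # repository-level setting (no --global)
EDITOR=nano git var GIT_EDITOR    # prints emacs: core.editor wins over EDITOR
```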
.gitignore
At times you may find you have certain files that you want to keep in your local project folder but not in the remote repository, for privacy or other reasons.
You can tell git to ignore such files by including a .gitignore file at the top of your repository’s file tree. For example, running
cd ~/My_Repo
echo "*.csv" > .gitignore
from the command line will create a .gitignore file in My_Repo telling git to ignore all files with the .csv extension.
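To confirm that a pattern behaves as intended, git status and git check-ignore can be used. This sketch works in a scratch repository; the file names data.csv and script.R are hypothetical.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "*.csv" > .gitignore
echo "a,b" > data.csv             # hypothetical data file
echo "x <- 1" > script.R          # hypothetical script

git status --short                # lists .gitignore and script.R, but not data.csv
git check-ignore -v data.csv      # shows which pattern matched the file
```

Note that > overwrites an existing .gitignore; use >> to append a pattern to a file that already exists.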
Markdown is a plain text formatting language that is designed to create (x)html documents using special characters and indentation to specify the desired format while remaining easy to read in its raw format.
There are many “flavors” of markdown. Let’s look at a quick tutorial on Github flavored markdown.
This document was written in Rmarkdown using RStudio. You can see the source at my Stats 506.
You will learn more about Rmarkdown later in the course. For now, you should familiarize yourself with the basic formatting options specified at the page we viewed earlier.
Read more about configuring git here.
For routine use of git from your personal computer, you may want to connect to your remote repository using ssh. Using ssh keys you can connect to GitHub without supplying your password for each interaction. Read more here.
git for homework
I encourage you to create an account at github or bitbucket and to use git for your work in this class. However, please refrain from posting solutions to problem sets for this course prior to the due date. After the due date, I suggest you do post your (possibly corrected) code and include a concise description of what you’ve learned in a Readme.md written in markdown. Your repository can serve as a sort of “digital portfolio” for your computing and analysis skills.