Git Commands _ DataCamp
Richie Cotton
Webinar & podcast host, course and book author, spends all day chit-chatting about data
Version control tools like Git are essential for real-world data science work. Git
helps data scientists track their work, collaborate with others, experiment with
changes to their existing work in a safe environment, and more. This cheat sheet gives
you the lowdown on all things Git.
Key Definitions
Throughout this cheat sheet, you’ll find Git-specific terms and jargon. Here’s a
rundown of the terms you may encounter.
Basic definitions
Local repo or repository: A local directory containing code and files for the project
Remote repository: An online version of the local repository hosted on services like
GitHub, GitLab, and BitBucket
Branch: A copy of the project used for working in an isolated environment without
affecting the main project
Staging area: A cache that holds the changes you want to commit next
Git stash: Another type of cache that holds changes you are not ready to commit but may
want to come back to later
Commit ID or hash: A unique identifier for each commit, used for switching to different
save points
HEAD (always written in capital letters): A reference name for the latest commit on the
current branch, which saves you from typing commit IDs. The HEAD~n syntax refers to older
commits (e.g. HEAD~1 is the second-to-last commit and HEAD~2 is the one before that).
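The HEAD~n syntax can be checked with a minimal sketch like the one below. It assumes a throwaway repository; the temporary directory, the "Demo User" identity, and the commit messages are all hypothetical and chosen only for illustration.

```shell
# Build a throwaway repo with three empty commits (all names are illustrative).
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git commit -q --allow-empty -m "third"

# HEAD is the latest commit; each step after the tilde moves one commit further back.
latest=$(git log -1 --format=%s HEAD)
one_back=$(git log -1 --format=%s HEAD~1)
two_back=$(git log -1 --format=%s HEAD~2)
echo "$latest / $one_back / $two_back"
```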
Installing Git
On macOS (using an installer)
Download the installer for Mac
On Linux (Debian/Ubuntu)
$ sudo apt-get install git
On Windows
Download the latest Git For Windows installer
Setting Up Git
If you are working in a team on a single repo, it is important for others to know who made
which changes to the code. Git therefore lets you set identifying details such as your name
and email.
Commit with a message (gc is assumed here to be a user-defined alias for git commit)
$ gc -m "New commit"
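As a rough sketch of the identity settings described above: the name and email below are placeholders, and the example writes to the local config of a throwaway repo rather than using --global, purely so that it does not alter the settings on your machine.

```shell
# Throwaway repo so the example does not touch your real configuration.
cd "$(mktemp -d)"
git init -q

# Adding --global would apply these to every repo for your user account;
# without it they apply to this repository only.
git config user.name "Your Name"
git config user.email "you@example.com"

# Read the values back to confirm what future commits will record.
configured_name=$(git config user.name)
configured_email=$(git config user.email)
echo "$configured_name <$configured_email>"
```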
What is a Branch?
Branches are special “copies” of the code base which allow you to work on different parts
of a project and new features in an isolated environment. Changes made to the files in a
branch won’t affect the “main branch” which is the main project development channel.
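The isolation described above can be illustrated with a small sketch. The file name app.txt, the branch name feature, and the identity are hypothetical; the starting branch name is read from Git rather than assumed, since it may be main or master depending on configuration.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# One commit on the starting branch.
echo "stable" > app.txt
git add app.txt
git commit -q -m "Initial version"
main_branch=$(git rev-parse --abbrev-ref HEAD)

# Change the file on a separate branch.
git checkout -q -b feature
echo "experimental" > app.txt
git commit -q -am "Experiment on feature branch"
on_feature=$(cat app.txt)

# Back on the starting branch, the file is untouched.
git checkout -q "$main_branch"
on_main=$(cat app.txt)
echo "feature: $on_feature, main: $on_main"
```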
Git Basics
What is a repository?
A repository, or repo, is any location that stores code and the files it needs to run
without errors. A repo can be local or remote. A local repo is typically a
directory on your machine, while a remote repo is hosted on a service like GitHub.
$ git init
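A minimal sketch of what git init does to a directory, assuming a hypothetical project folder named my_project inside a temporary location:

```shell
# Create an empty project directory (the name is illustrative).
project_dir="$(mktemp -d)/my_project"
mkdir -p "$project_dir"
cd "$project_dir"

# Turn the directory into a local repository.
git init -q

# Git keeps the history in a hidden .git directory at the top of the project.
[ -d .git ] && echo "found .git directory"
inside_repo=$(git rev-parse --is-inside-work-tree)
echo "inside a work tree: $inside_repo"
```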
A note on cloning
There are two primary ways to clone a repository: over HTTPS and over SSH.
SSH cloning is generally considered a bit more secure because it authenticates with an
SSH key, while HTTPS cloning is much simpler and is the option GitHub
recommends.
HTTPS
SSH
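The two URL forms look roughly like the comments in the sketch below, where <user> and <repo> are placeholders rather than a real project. To stay runnable without network access or credentials, the executable part clones a local bare repository instead; that stand-in is an assumption of this example, not part of the cheat sheet.

```shell
# URL forms for a hosted repository (placeholders, not a real project):
#   git clone https://github.com/<user>/<repo>.git    # HTTPS
#   git clone git@github.com:<user>/<repo>.git        # SSH

# Offline stand-in: a local bare repository plays the role of the remote.
workdir=$(mktemp -d)
git init -q --bare "$workdir/origin.git"
git clone -q "$workdir/origin.git" "$workdir/clone" 2>/dev/null

# A clone remembers where it came from under the remote name "origin".
cd "$workdir/clone"
remote_name=$(git remote)
echo "remote: $remote_name"
```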
$ git remote
Add all untracked and tracked files inside the current directory to the staging area
$ git add .
$ git rm <filename_or_dir>
$ git status
Save staged and unstaged changes to the stash for later use (see below for an
explanation of the stash)
$ git stash
$ git stash -u
$ git diff
Show the differences between two commits (you must provide both commit IDs)
$ git diff <id_1> <id_2>
A note on stashes
Git stash lets you temporarily save edits you've made to your working copy so you can
return to them later. Stashing is especially useful when you are not yet ready to commit
changes you've made but would like to revisit them at a later time.
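A minimal sketch of the stash round trip, assuming a throwaway repository with a hypothetical notes.txt file. Note that git stash pop, used here to bring the edits back, is a standard Git command but is not listed elsewhere in this cheat sheet.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Make an edit that is not ready to be committed.
echo "work in progress" >> notes.txt

# Stash it: the working copy returns to the last committed state.
git stash -q
after_stash=$(cat notes.txt)

# Later, restore the stashed edit and drop it from the stash.
git stash pop -q
after_pop=$(tail -n 1 notes.txt)
echo "after stash: $after_stash, after pop: $after_pop"
```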
Branches
List all branches
$ git branch
Create a new local branch named new_branch without checking out that
branch
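A small sketch of creating a branch without checking it out, assuming the command in question is git branch followed by the new name. The repository and identity are throwaway values for illustration.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
start_branch=$(git rev-parse --abbrev-ref HEAD)

# Create the branch; this does not switch to it.
git branch new_branch

# The current branch is unchanged, and new_branch now appears in the list.
current_branch=$(git rev-parse --abbrev-ref HEAD)
git branch --list
```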
Pulling changes
Download all commits and branches from the <remote> without applying them
on the local repo
A more aggressive version of fetch, which runs fetch and then merge in a single step
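The difference between the two descriptions above can be sketched as follows, assuming they refer to git fetch and git pull. A local bare repository stands in for the remote, and the alice and bob clones, the identity, and the commit messages are hypothetical.

```shell
workdir=$(mktemp -d)

# Seed a "remote": a bare repository containing one commit.
git init -q "$workdir/seed"
cd "$workdir/seed"
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Initial commit"
git clone -q --bare "$workdir/seed" "$workdir/remote.git"

# Two collaborators clone it.
git clone -q "$workdir/remote.git" "$workdir/alice"
git clone -q "$workdir/remote.git" "$workdir/bob"

# Alice publishes a new commit.
cd "$workdir/alice"
git config user.name "Demo User"
git config user.email "demo@example.com"
git commit -q --allow-empty -m "Change from alice"
git push -q origin HEAD

# Bob fetches: the remote-tracking branch updates, his local branch does not.
cd "$workdir/bob"
branch=$(git rev-parse --abbrev-ref HEAD)
git fetch -q origin
local_after_fetch=$(git log -1 --format=%s)
remote_after_fetch=$(git log -1 --format=%s "origin/$branch")

# Bob pulls: fetch plus merge, so his local branch now has the commit too.
git pull -q origin "$branch"
local_after_pull=$(git log -1 --format=%s)
echo "$local_after_fetch -> $local_after_pull"
```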
$ git log
List one commit per line (the -n flag can be used to limit the number of commits
displayed, e.g. -5)
Log commits after some date (a sample value is October 4, 2020, written as
"2020-10-04", or a keyword such as "yesterday" or "last month")
$ git log --oneline --after="YYYY-MM-DD"
Log commits before some date (the --after and --before flags can be combined for
date ranges)
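The log options above can be tried out with a sketch like this one. It assumes a throwaway repository with three commits made just now; the date range in the last command is an arbitrary year chosen so that it matches none of them.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
for message in one two three; do
  git commit -q --allow-empty -m "$message"
done

# One line per commit, limited to the two most recent.
git log --oneline -2
limited=$(git log --oneline -2 | wc -l | tr -d ' ')

# Commits made since yesterday (all three, since they were just created).
recent=$(git log --oneline --after="yesterday" | wc -l | tr -d ' ')

# Combining --after and --before gives a date range; nothing falls in this one.
in_range=$(git log --oneline --after="2000-01-01" --before="2000-12-31" | wc -l | tr -d ' ')
echo "limited=$limited recent=$recent in_range=$in_range"
```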
Reversing changes
Checking out (switching to) older commits
Undo the latest commit but leave the working directory unchanged
You can undo as many commits as you want by changing the number after the tilde.
Instead of HEAD~n, you can provide a commit hash as well. Changes made after that commit
will be destroyed.
Undo a single given commit, without modifying commits that come after it (a
safe reset)
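A hedged sketch of the two kinds of undo described above. It assumes the commands meant are git reset (shown here with --soft, which keeps the undone changes staged; plain git reset HEAD~1 also leaves files untouched but unstages them) and git revert. The file names and identity are hypothetical.

```shell
cd "$(mktemp -d)"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"
for name in a b c; do
  echo "$name" > "$name.txt"
  git add "$name.txt"
  git commit -q -m "Add $name"
done

# Undo the latest commit; the working directory still contains c.txt.
git reset -q --soft HEAD~1
after_reset=$(git log -1 --format=%s)
[ -f c.txt ] && echo "c.txt is still on disk"

# Commit it again so the next step has three commits to work with.
git commit -q -m "Add c"

# Revert "Add b" only: a new commit undoes it, and "Add c" is left alone.
git revert --no-edit HEAD~1 > /dev/null
commit_count=$(git rev-list --count HEAD)
echo "last commit after reset: $after_reset, total commits now: $commit_count"
```

Because revert adds a new commit instead of rewriting history, it is generally the safer choice for commits that have already been shared with others.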