A complete, unshortened guide to Git: from how it works under the hood to the most advanced
real-world usage — branching, rebasing, worktrees, submodules, hooks, recovery, team workflows,
GitHub/GitLab groups, and interview Q&A. Real commands, real examples, nothing skipped.
1. How Git Works Under the Hood (the mental model)
Most Git confusion disappears once you understand that Git is a content-addressable filesystem
with a version-control UI on top. Internally, Git stores four kinds of objects in .git/objects/,
each named by the SHA-1 (now also SHA-256) hash of its content:
Object | What it is
blob | The raw contents of a file (no name, no metadata, just bytes).
tree | A directory listing: maps filenames → blob/tree hashes (like a folder).
commit | A snapshot: points to one tree + parent commit(s) + author + message.
tag | An annotated tag object: points to a commit + tagger + message.
Key insight: A commit is not a diff — it’s a full snapshot of the whole tree. Git is
efficient because identical files share the same blob (deduplicated by hash).
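A minimal sketch of the deduplication described above, assuming a scratch repository in a temporary directory; the file names a.txt and b.txt are hypothetical. Two files with identical bytes should hash to the same blob:

```shell
# Scratch repository in a temporary directory (the path is arbitrary).
repo=$(mktemp -d) && cd "$repo"
git init -q

# Two files with different names but identical bytes.
printf 'hello\n' > a.txt
cp a.txt b.txt

# hash-object -w writes the blob into .git/objects and prints its object ID.
hash_a=$(git hash-object -w a.txt)
hash_b=$(git hash-object -w b.txt)
obj_type=$(git cat-file -t "$hash_a")

echo "a.txt -> $hash_a"
echo "b.txt -> $hash_b"
echo "type  -> $obj_type"
```

Both names resolve to one object because a blob stores only content, never the filename.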
Refs are just pointers (text files) to commit hashes:
A branch (refs/heads/main) = a movable pointer to the latest commit.
A tag (refs/tags/v1.0) = usually a fixed pointer.
HEAD = a pointer to “where you are now” (usually points to a branch).
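The pointer relationships above can be checked directly. This is a small sketch in a throwaway repository; the identity values and the tag name are placeholders:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main      # name the initial branch "main"
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "Initial commit"

head_target=$(git symbolic-ref HEAD)       # the ref HEAD points to
branch_sha=$(git rev-parse refs/heads/main)
head_sha=$(git rev-parse HEAD)

git tag v1.0                               # lightweight tag: another ref to the same commit
tag_sha=$(git rev-parse refs/tags/v1.0)

echo "HEAD -> $head_target -> $branch_sha"
echo "v1.0 -> $tag_sha"
```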
The three areas you constantly move files between:
Working Directory ──git add──▶ Staging Area (Index) ──git commit──▶ Repository (.git)
(your files) (the next snapshot) (permanent history)
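To watch one change move through the three areas, a short sketch (notes.txt is a hypothetical file) can capture git status --short at each step:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "v1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

echo "v2 with more text" > notes.txt
unstaged=$(git status --short)   # " M notes.txt": changed in the working directory only
git add notes.txt
staged=$(git status --short)     # "M  notes.txt": change is now in the index
git commit -q -m "Update notes"
clean=$(git status --short)      # empty: nothing left to record

printf '[%s]\n[%s]\n[%s]\n' "$unstaged" "$staged" "$clean"
```

The column the M appears in shows where the change currently lives: the second column is the working directory, the first is the index.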
Real example — peek inside Git:
git cat-file -p HEAD             # show the commit object (tree, parent, author)
git cat-file -p HEAD^{tree}      # show the directory listing it points to
git rev-parse HEAD               # the full SHA of the current commit
git add -p                       # stage *parts* of files interactively (hunk by hunk)
git add .                        # stage everything in the current dir
git commit -m "message"          # snapshot the staged changes
git commit -am "message"         # add (tracked files) + commit in one step
git log --oneline --graph --all  # readable history of all branches
git diff                         # unstaged changes
git diff --staged                # staged changes (what will be committed)
Pro tip: git add -p is the single most underused command. It lets you split one messy file
of changes into clean, logical commits by staging only the hunks you choose (y to stage a
hunk, n to skip it, s to split it into smaller hunks).
Rebase = replay your commits on top of another base. It produces a clean, linear history (no
merge commits) — but it rewrites commit hashes, so never rebase shared/public history that others
have pulled.
git switch feature/login
git rebase main                  # replay feature commits on top of latest main
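A sketch of the hash rewrite, under the assumption of a toy repository with one commit on each side; the branch and file names are illustrative only:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo base > base.txt && git add base.txt && git commit -q -m "Base"

git switch -q -c feature/login
echo login > login.txt && git add login.txt && git commit -q -m "Add login form"
old_tip=$(git rev-parse HEAD)

git switch -q main
echo fix > fix.txt && git add fix.txt && git commit -q -m "Fix on main"

git switch -q feature/login
git rebase -q main                          # replay the feature commit onto main
new_tip=$(git rev-parse HEAD)
new_parent=$(git rev-parse HEAD~1)
main_tip=$(git rev-parse main)

echo "before rebase: $old_tip"
echo "after rebase:  $new_tip"
```

The commit keeps its message and changes but gets a new hash because its parent changed, which is why rebasing commits that others already pulled causes trouble.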
Rewrite the last N commits — reorder, squash, edit, drop, reword:
git rebase -i HEAD~5
You get an editor:
pick a1b2c3 Add login form
squash 4d5e6f Fix typo # fold into previous commit
reword 7g8h9i Add validation # change the message
edit 0j1k2l Refactor auth # pause here to amend
drop 3m4n5o Debug print # remove this commit entirely
Commands: pick (keep), reword (change message), edit (pause to amend), squash (merge + keep
both messages), fixup (merge + discard message), drop (delete), and reorder lines to reorder
commits.
Real example — clean up before a PR: You made 8 messy commits (“wip”, “fix”, “oops”). Run
git rebase -i main, mark the noise as fixup, reword the main one, and present one clean
commit.
git commit --fixup=<commit-sha>   # marks a commit as a fixup of an older one
git rebase -i --autosquash main   # automatically orders & folds the fixups
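The fixup flow can be tried without an editor opening. This sketch assumes a scratch repository and sets GIT_SEQUENCE_EDITOR=true so the generated todo list is accepted as-is; app.txt and the commit messages are hypothetical:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo base > app.txt && git add app.txt && git commit -q -m "Base"
echo "login form" >> app.txt && git commit -q -am "Add login form"
target=$(git rev-parse HEAD)

# A later correction, recorded as "fixup! Add login form".
echo "typo fixed" >> app.txt
git commit -q -a --fixup="$target"
before=$(git rev-list --count HEAD)

# Accept the autosquash todo list unchanged instead of opening an editor.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash HEAD~2
after=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)

echo "commits: $before -> $after, tip: $subject"
```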
Safety: When you must push rewritten history to your own branch, use
git push --force-with-lease (refuses to clobber someone else’s new work) instead of the
dangerous --force.
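A sketch of the lease check, assuming a local bare repository standing in for the server and two hypothetical clones, alice and bob. Bob rewrites history without fetching Alice's new push, so the lease should fail:

```shell
root=$(mktemp -d) && cd "$root"
# Placeholder identity for every repository in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q --bare remote.git
git init -q alice
git -C alice symbolic-ref HEAD refs/heads/main
git -C alice remote add origin "$root/remote.git"
git -C alice commit -q --allow-empty -m "Base"
git -C alice push -q origin main

git clone -q -b main "$root/remote.git" bob

# Alice pushes new work that Bob never fetches.
git -C alice commit -q --allow-empty -m "Alice's new work"
git -C alice push -q origin main

# Bob rewrites his local history and tries to push it.
git -C bob commit -q --amend --allow-empty -m "Base (reworded)"
if git -C bob push -q --force-with-lease origin main 2>/dev/null; then
  lease_result=accepted
else
  lease_result=rejected
fi
remote_subject=$(git -C remote.git log -1 --format=%s main)

echo "push: $lease_result, remote tip: $remote_subject"
```

A plain --force in the same position would have replaced Alice's commit on the remote.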
git restore file.txt                   # discard unstaged changes to a file
git restore --staged file.txt          # unstage a file (keep the edits)
git restore --source=HEAD~2 file.txt   # bring back an old version of one file
revert vs reset (interview classic): Use revert on commits already pushed/shared (it adds a
new undo commit). Use reset only on local, unshared commits (it rewrites history).
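The difference is visible in the commit count. A minimal sketch with a hypothetical config.txt: revert grows the history, reset shrinks it.

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo good > config.txt && git add config.txt && git commit -q -m "Add config"
echo broken > config.txt && git commit -q -am "Break config"
bad_commit=$(git rev-parse HEAD)

# revert: adds a new commit that undoes the bad one; the bad commit stays in history.
git revert --no-edit HEAD >/dev/null
after_revert=$(git rev-list --count HEAD)
content_after_revert=$(cat config.txt)
if git merge-base --is-ancestor "$bad_commit" HEAD; then kept=yes; else kept=no; fi

# reset: moves the branch pointer back, dropping the revert commit from the branch.
git reset -q --hard HEAD~1
after_reset=$(git rev-list --count HEAD)

echo "after revert: $after_revert commits, file says '$content_after_revert', bad commit kept: $kept"
echo "after reset:  $after_reset commits"
```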
Temporarily set aside uncommitted changes to switch context quickly.
git stash                          # shelve tracked changes, clean the working dir
git stash -u                       # include untracked files
git stash push -m "wip: login"     # named stash
git stash list                     # see all stashes
git stash show -p stash@{0}        # view a stash's diff
git stash pop                      # re-apply the latest stash AND remove it
git stash apply stash@{1}          # re-apply a specific stash, KEEP it in the list
git stash branch fix-x stash@{0}   # turn a stash into a new branch
git stash drop stash@{0}           # delete one stash
git stash clear                    # delete all stashes
Real example: You’re mid-feature when an urgent bug comes in. git stash, fix the bug on a
clean tree, commit, then git stash pop to continue exactly where you left off.
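A compact version of that round trip, assuming a scratch repository and a hypothetical app.txt:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "stable" > app.txt && git add app.txt && git commit -q -m "Add app"

echo "half-finished feature" >> app.txt     # uncommitted work in progress
git stash push -q -m "wip: feature"
during=$(git status --porcelain)            # empty while the work is shelved
stash_count=$(git stash list | wc -l | tr -d ' ')

git stash pop -q                            # bring the work back and drop the stash
restored=$(tail -n 1 app.txt)
remaining=$(git stash list | wc -l | tr -d ' ')

echo "stashes while shelved: $stash_count, after pop: $remaining, last line: $restored"
```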
The reflog records every place HEAD has been — even after a “lost” reset, rebase, or deleted
branch. This is how you recover from almost any mistake.
git reflog                       # list everywhere HEAD has pointed, with timestamps
git reflog show feature/login    # reflog for a specific branch
Example output:
a1b2c3d HEAD@{0}: reset: moving to HEAD~3
f4g5h6i HEAD@{1}: commit: Add validation ← the work you thought you lost
Recover it:
git reset --hard f4g5h6i         # jump back to that exact state
# or recover a deleted branch:
git switch -c recovered f4g5h6i
Real example: “I hard-reset and lost 3 commits!” Run git reflog, find the hash from before the
reset, then git reset --hard <hash>. Crisis averted. Reflog entries expire after 90 days by
default, or 30 days once the commit is no longer reachable from the branch tip.
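The recovery can be rehearsed safely in a scratch repository. This sketch uses empty commits and relies on HEAD@{1} being the position HEAD held just before the reset:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

for n in 1 2 3; do
  git commit -q --allow-empty -m "Commit $n"
done
lost_tip=$(git rev-parse HEAD)

git reset -q --hard HEAD~2                  # the "mistake": two commits vanish from the branch
after_reset=$(git log -1 --format=%s)

previous=$(git rev-parse 'HEAD@{1}')        # where HEAD was before the reset
git reset -q --hard "$previous"
recovered=$(git log -1 --format=%s)

echo "after reset: $after_reset, after recovery: $recovered"
```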
A worktree lets you check out multiple branches at once into separate folders, all sharing the
same .git repo — no second clone, no stashing to switch context.
git worktree add ../project-hotfix hotfix/urgent   # new folder on the hotfix branch
git worktree add ../project-review pr-123          # review a PR in its own folder
git worktree list                                  # show all worktrees
git worktree remove ../project-hotfix              # clean up when done
git worktree prune                                 # tidy stale entries
Real example — why this is gold: You’re deep in a feature with a dirty working dir, and a
production bug lands. Instead of stashing and switching branches, run
git worktree add ../app-hotfix main, fix and deploy from that folder, then delete it — your
feature folder stays completely untouched. Great for running tests on one branch while coding
on another.
Rules: the same branch can’t be checked out in two worktrees simultaneously; all worktrees share
branches, stashes, and config from the one repo.
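A sketch of both rules in action, assuming a scratch repository; the folder names project-hotfix and project-dup are hypothetical:

```shell
root=$(mktemp -d) && cd "$root"
git init -q project && cd project
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "Base"

# Create a new branch and check it out in a sibling folder.
git worktree add -b hotfix/urgent ../project-hotfix >/dev/null 2>&1
wt_branch=$(git -C ../project-hotfix rev-parse --abbrev-ref HEAD)
wt_count=$(git worktree list | wc -l | tr -d ' ')

# The same branch cannot be checked out in a second worktree.
if git worktree add ../project-dup hotfix/urgent >/dev/null 2>&1; then
  second_checkout=allowed
else
  second_checkout=refused
fi

git worktree remove ../project-hotfix
echo "worktrees: $wt_count, branch there: $wt_branch, second checkout: $second_checkout"
```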
git fetch origin        # download new commits, DON'T touch your working branch
git pull                # = fetch + merge (or rebase) into your branch
git pull --rebase       # fetch, then replay your commits on top (linear history)
Fetch is safe and read-only — it just updates your knowledge of the remote. Pull modifies
your branch. Many teams default to pull --rebase to avoid noisy merge commits:
git config --global pull.rebase true.
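To see that fetch leaves the local branch alone, here is a sketch using a local bare repository as a stand-in remote; the clone names upstream and local are hypothetical:

```shell
root=$(mktemp -d) && cd "$root"
# Placeholder identity for every repository in this sketch.
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="Example User" GIT_COMMITTER_EMAIL="user@example.com"

git init -q --bare remote.git
git init -q upstream
git -C upstream symbolic-ref HEAD refs/heads/main
git -C upstream remote add origin "$root/remote.git"
git -C upstream commit -q --allow-empty -m "Base"
git -C upstream push -q origin main

git clone -q -b main "$root/remote.git" local

# New work lands on the remote after the clone.
git -C upstream commit -q --allow-empty -m "New upstream work"
git -C upstream push -q origin main

before=$(git -C local rev-parse main)
git -C local fetch -q origin
after_fetch=$(git -C local rev-parse main)       # unchanged by fetch
tracking=$(git -C local rev-parse origin/main)   # updated by fetch

git -C local pull -q --rebase origin main
after_pull=$(git -C local rev-parse main)        # now matches the remote

echo "main moved by fetch: $([ "$before" = "$after_fetch" ] && echo no || echo yes)"
```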
git tag -a v1.0.0 -m "Release 1.0"   # annotated tag (recommended: has author/date/message)
git tag -s v1.0.0 -m "Signed"        # GPG-signed tag
git tag                              # list tags
git show v1.0.0                      # view a tag
git push origin v1.0.0               # push one tag
git push origin --tags               # push all tags
git tag -d v1.0.0                    # delete locally
git push origin --delete v1.0.0      # delete on remote
git checkout v1.0.0                  # check out the code at that tag (detached HEAD)
Annotated vs lightweight: Use annotated tags for releases — they’re real objects with
metadata and can be signed. Lightweight tags are just bookmarks for private/temporary use.
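The "real object" distinction shows up in the object type. A short sketch, with hypothetical tag names:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git commit -q --allow-empty -m "Release commit"

git tag -a v1.0.0 -m "Release 1.0"          # annotated: creates a tag object
git tag v1.0.0-scratch                      # lightweight: just a ref to the commit

annotated_type=$(git cat-file -t v1.0.0)
lightweight_type=$(git cat-file -t v1.0.0-scratch)
tag_message=$(git for-each-ref --format='%(contents:subject)' refs/tags/v1.0.0)

echo "v1.0.0 is a $annotated_type, v1.0.0-scratch is a $lightweight_type"
echo "message: $tag_message"
```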
11. Submodules vs Subtrees (projects inside projects)
Pros: no special commands for cloners (files are just there); simpler for teammates.
Cons: bigger repo; contributing changes back upstream is more involved.
Rule of thumb: Use submodules when the embedded repo is independently versioned and you want
a precise pinned commit. Use subtrees when you mostly want the files vendored in with minimal
hassle for everyone cloning.
git log --oneline --graph --all --decorate   # the everyday "map" of the repo
git log -p file.txt                          # full diff history of one file
git log --follow file.txt                    # follow a file across renames
git log -S "functionName"                    # "pickaxe": commits that added/removed that text
git log -G "regex"                           # commits whose diff matches a regex
git log --author="Saeid" --since="2 weeks ago"
git log main..feature                        # commits on feature not in main
git shortlog -sn                             # contributor commit counts
git blame file.txt                           # who last changed each line
git blame -L 10,20 file.txt                  # blame a line range
git show <commit>                            # full details + diff of a commit
git diff main..feature                       # compare two branches
git diff --stat                              # summary of changed files
Real example (find when a bug text appeared): git log -S "buggyConfig = true" jumps straight
to the commit that introduced that exact string. The “pickaxe” is a lifesaver.
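A sketch of the pickaxe on a three-commit history; settings.conf and the commit messages are made up for illustration:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "timeout = 30" > settings.conf
git add settings.conf && git commit -q -m "Add settings"

echo "buggyConfig = true" >> settings.conf
git commit -q -am "Enable experimental config"

echo "retries = 3" >> settings.conf
git commit -q -am "Add retries"

# -S lists commits where the number of occurrences of the string changed.
found=$(git log -S"buggyConfig = true" --format=%s)
echo "introduced by: $found"
```

Only the middle commit is reported, because the later commit leaves the string's occurrence count unchanged.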
Binary-search your history to find the exact commit that introduced a bug.
git bisect start
git bisect bad                   # current commit is broken
git bisect good v1.2.0           # this old version worked
# Git checks out a commit halfway between. Test it, then tell Git:
git bisect good                  # or:
git bisect bad
# ...repeat; Git narrows it down in log2(N) steps...
git bisect reset                 # finish & return to where you were
Automate it with a test script (exit 0 = good, non-zero = bad):
git bisect start HEAD v1.2.0
git bisect run ./run_tests.sh    # Git finds the culprit commit unattended
Real example: A test broke somewhere in the last 300 commits. git bisect run npm test finds
the exact breaking commit in ~9 steps instead of you checking hundreds by hand.
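An unattended bisect can be rehearsed on a toy history. In this sketch the "bug" is a line planted in the fifth of eight commits, and a one-line grep stands in for the test script; all names are hypothetical:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Eight commits; the fifth one introduces a line that says "bug".
for n in 1 2 3 4 5 6 7 8; do
  if [ "$n" -eq 5 ]; then echo "bug" >> app.txt; else echo "line $n" >> app.txt; fi
  git add app.txt
  git commit -q -m "Commit $n"
done
git tag known-good "$(git rev-list --max-parents=0 HEAD)"   # the first commit

git bisect start HEAD known-good >/dev/null
# The test exits 0 (good) when no "bug" line exists, non-zero (bad) otherwise.
git bisect run sh -c '! grep -qx bug app.txt' >/dev/null
culprit=$(git log -1 --format=%s refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "first bad commit: $culprit"
```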
Team tip: .git/hooks isn’t shared by Git. Use a tool like Husky (JS), pre-commit
(Python), or git config core.hooksPath .githooks to commit shared hooks into the repo.
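A minimal sketch of the core.hooksPath approach. The policy in this hook (rejecting a DO-NOT-COMMIT marker) is invented for illustration, not a standard check:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

mkdir .githooks
cat > .githooks/pre-commit <<'EOF'
#!/bin/sh
# Hypothetical policy: refuse staged changes that add a DO-NOT-COMMIT marker.
if git diff --cached | grep -q '^+.*DO-NOT-COMMIT'; then
  echo "pre-commit: remove DO-NOT-COMMIT markers first" >&2
  exit 1
fi
EOF
chmod +x .githooks/pre-commit

# Commit the hook itself first, then switch Git over to the shared folder.
git add .githooks && git commit -q -m "Add shared hooks"
git config core.hooksPath .githooks

echo "DO-NOT-COMMIT debug key" > scratch.txt
git add scratch.txt
if git commit -q -m "Should be blocked" 2>/dev/null; then blocked=no; else blocked=yes; fi

echo "commit blocked by hook: $blocked"
```

Each teammate still has to run the git config line once per clone, since Git does not copy local config on clone.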
15. Large repos: LFS, partial clone, sparse checkout
Git LFS (Large File Storage) — store big binaries (videos, models, PSDs) outside the main repo,
keeping a small pointer in Git.
git lfs install
git lfs track "*.psd"            # writes a rule to .gitattributes
git add .gitattributes
git add design.psd && git commit -m "Add design"
Partial clone — skip downloading all blobs upfront (huge repos):
git clone --filter=blob:none <url>   # fetch blobs lazily on demand
git clone --depth=1 <url>            # shallow clone: only the latest commit (CI builds)
Sparse checkout — only materialize part of a giant monorepo:
git clone --filter=blob:none --sparse <url>
cd repo
git sparse-checkout set apps/web libs/ui   # only these folders appear on disk
Real example — monorepo: A 40 GB monorepo where you only work on one app:
--filter=blob:none --sparse + sparse-checkout set apps/payments gives you a fast, tiny working
directory with only what you need.
rerere (reuse recorded resolution): once enabled, Git remembers how you resolved a conflict
and auto-applies the same resolution next time the same conflict appears — invaluable during long
rebases of long-lived branches.
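A sketch of rerere replaying a resolution, assuming a scratch repository where the same merge conflict is hit twice; branch and file names are hypothetical:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git config rerere.enabled true              # turn on recording for this repo

echo "original" > file.txt && git add file.txt && git commit -q -m "Base"
git switch -q -c topic
echo "topic version" > file.txt && git commit -q -am "Topic change"
git switch -q main
echo "main version" > file.txt && git commit -q -am "Main change"

# First merge: conflict, resolved by hand; committing records the resolution.
git merge topic >/dev/null 2>&1 || true
echo "merged version" > file.txt
git add file.txt
git commit -q -m "Merge topic"

# Undo the merge and hit the same conflict again.
git reset -q --hard HEAD~1
git merge topic >/dev/null 2>&1 || true
replayed=$(cat file.txt)                    # rerere has already rewritten the file

echo "file after second merge: $replayed"
```

The second merge still stops so you can review the result; you finish it with git add and git commit as usual.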
# force LF line endings for shell scripts
*.sh text eol=lf
# don't try to diff/merge binaries
*.png binary
*.psd filter=lfs diff=lfs merge=lfs -text
Line-ending gotcha (Windows/Mac/Linux teams): set core.autocrlf appropriately, or better,
pin endings in .gitattributes so the repo is consistent for everyone.
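Whether a rule actually applies to a path can be checked with git check-attr. A small sketch; deploy.sh and logo.png are hypothetical paths and need not exist on disk:

```shell
repo=$(mktemp -d) && cd "$repo"
git init -q

# Comments in .gitattributes must sit on their own lines.
cat > .gitattributes <<'EOF'
# force LF line endings for shell scripts
*.sh text eol=lf
# don't try to diff/merge binaries
*.png binary
EOF

sh_eol=$(git check-attr eol -- deploy.sh)
png_diff=$(git check-attr diff -- logo.png)
png_text=$(git check-attr text -- logo.png)

echo "$sh_eol"
echo "$png_diff"
echo "$png_text"
```

The binary macro expands to -diff -merge -text, which is why both queries on the PNG report unset.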
18. Team Workflows (Git Flow, Trunk-based, Forking)
GitHub Organizations (and Teams within them) / GitLab Groups (and subgroups) let you
manage many repos and people together, with shared permissions and visibility.
GitLab subgroups can nest (e.g., company/backend/payments), and permissions inherit
downward — a powerful way to model real org structure.
Roles/permissions: typically Read → Triage/Reporter → Write/Developer → Maintain/Maintainer →
Admin/Owner. Grant the least privilege needed.
Require PR review (e.g., 1–2 approvals) and passing CI before merge.
Forbid force-pushes and direct pushes.
Require linear history or signed commits if desired.
Real example (team setup): main is protected (no direct push, 2 approvals, CI must pass).
Devs branch feature/*, push, open a PR, get review + green CI, then Squash and merge for one
clean commit per feature. A GitLab group acme/ holds subgroups acme/web, acme/api with
inherited Maintainer access for the platform team.
Shrinking a bloated repo / removing a leaked secret from ALL history: use
git filter-repo (the tool the Git project recommends in place of the old filter-branch;
BFG Repo-Cleaner is an alternative):
git filter-repo --path secrets.env --invert-paths   # scrub a file from entire history
Then force-push and have everyone re-clone. (Rotate the leaked secret regardless.)
Q1. Merge vs Rebase — when do you use each?
Merge preserves true history and creates a merge commit; it’s safe for shared branches. Rebase
replays your commits to produce a clean, linear history but rewrites hashes, so only use it on
local/unshared branches. Typical flow: rebase your feature branch onto main to stay current,
then merge (or squash-merge) the PR.
Q2. git reset vs git revert?
reset moves the branch pointer and rewrites local history (--soft keeps the changes staged,
--mixed keeps them unstaged, --hard discards them). revert creates a new commit that undoes an
old one without rewriting history, which makes it the safe choice for commits already pushed/shared.
Q3. How do you recover commits after a bad git reset --hard?
Use git reflog to find the commit hash from before the reset, then git reset --hard <hash> (or
git switch -c recovered <hash>). The reflog keeps HEAD movements for about 90 days by default
(30 days for commits no longer reachable from the branch tip).
Q4. What actually is a commit?
A commit is an object pointing to a full tree snapshot of the project, plus parent commit(s),
author/committer, and a message — identified by the hash of its content. It’s a snapshot, not a diff.
Q5. git fetch vs git pull?
fetch only downloads remote changes into remote-tracking refs (safe, read-only). pull = fetch +
merge (or rebase) into your current branch, modifying your working branch.
Q6. What is a fast-forward merge?
When the target branch hasn’t diverged, Git just moves its pointer forward to the source branch’s tip
— no merge commit. Use --no-ff to force a merge commit and keep the feature grouping visible.
Q7. How do you squash commits / clean up a messy branch?
git rebase -i <base> and mark commits as squash/fixup, or git reset --soft <base> then one
git commit. For PRs, “Squash and merge” achieves the same on the platform.
Q8. How do you safely rewrite and push shared-feature history?
Rewrite locally (rebase/amend), then git push --force-with-lease, which refuses to overwrite if
someone else pushed in the meantime — unlike the blunt --force.
Q9. cherry-pick use case?
Apply a specific commit (e.g., a hotfix that landed on main) onto another branch (e.g.,
release/1.4) without merging everything: git cherry-pick <sha>.
Q10. Submodule vs subtree?
Submodule = a pinned pointer to another repo’s commit (separate history, fiddly, needs
--recurse-submodules). Subtree = the other repo’s files merged into yours (simpler for cloners,
bigger repo).
Q11. How do you find which commit introduced a bug?
git bisect: binary search between a known-good and known-bad commit; automate with
git bisect run <test-script>. Or git log -S "text" (pickaxe) to find when specific code appeared.
Q12. What’s the staging area / index for?
It’s the proposed next commit — it lets you craft a commit precisely (e.g., git add -p to stage
only some hunks), separating what you changed from what you’re about to record.
Q13. How do you work on two branches at once without stashing?
git worktree add ../folder <branch> checks out another branch into a separate directory sharing the
same repo, which is ideal for a hotfix while a feature is in progress.
Q14. How do you remove a committed secret from the whole history?
Use git filter-repo to strip the file/string from all commits, force-push, have everyone re-clone —
and rotate the secret, since it was already exposed.