Understanding the .git Folder in Git

Introduction (Why This Matters)

Git is a version control system that helps developers collaborate and keep track of every change made to the code over time. Alongside the content itself, the repository records metadata for each commit, such as who authored it and when.

Most people use Git as a magic box: they know a few commands and get their work done without understanding what happens under the hood. How Git stores changes, and what the .git folder actually contains, stays a mystery to most developers who only work at the surface level.

This article is meant to build a mental model of how Git works; it is not a command memorization cheat sheet. By the end of this article, you’ll know what Git stores, where it stores it, and how it tracks the changes.

What Is Git Really?

Git is a content-addressable file system. What does that mean? It means that at the core of Git is a basic key-value data store. You can add any type of content to a Git repository, and it provides a unique key that you can use later to retrieve the content.
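To make the key-value idea concrete, here is a minimal sketch of a content-addressable store in Python. The `ContentStore` class and its method names are made up for illustration and are not part of Git. It also hashes the raw bytes directly, which is a simplification of what Git does.

```python
import hashlib


class ContentStore:
    """A toy content-addressable store: the key is derived from the value."""

    def __init__(self):
        self._objects = {}  # key (hex digest) -> raw bytes

    def put(self, data: bytes) -> str:
        # The key is computed from the content, never chosen by the caller.
        key = hashlib.sha1(data).hexdigest()
        self._objects[key] = data  # storing the same bytes twice changes nothing
        return key

    def get(self, key: str) -> bytes:
        return self._objects[key]


store = ContentStore()
key = store.put(b"hello")
same_key = store.put(b"hello")

print(key == same_key)  # identical content yields the identical key
print(store.get(key))   # the key retrieves the original bytes
```

The point of the design is that you never pick a key yourself: the content decides it, so the same content can only ever live under one key.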

Git doesn’t record changes to the codebase as a list of line-by-line differences. Instead, it takes a snapshot of the whole project whenever you ask it to, and works out the differences by comparing snapshots when you need them.

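Everything in Git is identified by a hash. By default, Git uses the SHA-1 hash function to generate identifiers (hashes) for all of its objects, such as commits, files (blobs), and directories (trees). These hashes are 40-character hexadecimal strings, computed over a small header (the object type and size) followed by the object’s content, and they are unique for all practical purposes.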
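As a small sketch of how a blob's identifier can be reproduced outside Git, the function below hashes the header `blob <size>` plus a NUL byte, followed by the content. The function name `git_blob_hash` is made up for this example; for the same bytes, the result should match what `git hash-object` prints.

```python
import hashlib


def git_blob_hash(content: bytes) -> str:
    """Return the SHA-1 identifier Git would assign to a blob with this content."""
    # Git hashes "blob <size in bytes>\0" followed by the raw content.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


digest = git_blob_hash(b"test content\n")
print(digest)
print(len(digest))  # 40 hexadecimal characters
```

Because the header is part of the input, this value differs from a plain SHA-1 of the file, which is a common source of confusion when people try to verify a hash by hand.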

The commands in Git serve as a user-friendly interface that allows you to interact with the repository. However, the real magic happens in the background, where Git manages object storage. This means that when you use Git commands, you are actually communicating with an internal database that efficiently handles the storage and retrieval of data.

Understanding the `.git` folder:

Why the .git Folder Exists

.git is the hidden subfolder that Git creates in the current directory when you run git init, the command that initializes a repository. It is the entire repository of the project: all the data related to the project's history is stored inside it.

Deleting the .git folder means losing the entire version history of the project. It works as the internal database of the project. To understand Git, you must understand what’s inside .git.

Structure of the .git Directory

objects/ → The heart of the Git database. It stores all your content (blobs, trees, and commits) as zlib-compressed objects, each one named by its hash.
refs/ → Short for “references.” Keeps track of which commits the branches and tags point to.
HEAD → A file that sits next to refs/ and records what is currently checked out. Normally it is a reference to the current branch, which in turn points to that branch's latest commit.
index → This is the staging area. It is a single file (not a folder) that records the changes you have chosen for the next commit. It acts as a middleman between your working directory and your actual commit history.

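config → This file holds the repository's own settings, such as the URLs of its remotes and any name, email, or command shortcuts (aliases) you have set for this repository only. Settings that apply to all your repositories usually live in your global Git configuration instead. It tweaks how Git works to suit the project’s needs.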
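The layout above can be checked with a few lines of Python. This is only a sketch: `describe_git_dir` and `read_head` are hypothetical helper names, and the folder they inspect here is a hand-built stand-in created in a temporary directory, not a repository made by Git. Pointing `describe_git_dir` at a real `.git` folder should work the same way.

```python
import tempfile
from pathlib import Path


def describe_git_dir(git_dir: Path) -> dict:
    """Report whether each well-known entry is a directory, a file, or missing."""
    report = {}
    for name in ("objects", "refs", "index", "config", "HEAD"):
        entry = git_dir / name
        if entry.is_dir():
            report[name] = "directory"
        elif entry.is_file():
            report[name] = "file"
        else:
            report[name] = "missing"
    return report


def read_head(git_dir: Path) -> str:
    """Return the raw content of the HEAD file."""
    return (git_dir / "HEAD").read_text().strip()


# Hand-built stand-in for a freshly initialised repository (assumption: the
# branch name "main" and the config content are placeholders).
with tempfile.TemporaryDirectory() as tmp:
    fake_git = Path(tmp) / ".git"
    (fake_git / "objects").mkdir(parents=True)
    (fake_git / "refs" / "heads").mkdir(parents=True)
    (fake_git / "config").write_text("[core]\n\tbare = false\n")
    (fake_git / "HEAD").write_text("ref: refs/heads/main\n")

    layout = describe_git_dir(fake_git)
    head = read_head(fake_git)

print(layout)  # index shows up as missing: nothing has been staged yet
print(head)
```

In a real repository the index file only appears once something has been staged, which is why a brand-new repository has none.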

Git Objects: Blob, Tree, Commit

Blob:

Blob stands for Binary Large Object. It is the simplest object type: it stores the content of a file and nothing else. A blob does not know its own name or where it lives in your folders. It only knows the data stored inside it. Whenever you change even a single character in a file and stage it, Git creates a new blob for that version; if the content has not changed, no new blob is created. The same content always has the exact same hash, so if you have 10 identical files in different folders, Git creates only one blob. This is one of the reasons Git is so space-efficient.
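The deduplication described above can be sketched with a toy object store on disk. The sketch assumes Git's "loose object" layout, where the first two characters of the hash name a subfolder and the remaining 38 name the file, and the file holds the zlib-compressed header plus content. The helper names `write_blob` and `read_blob` are made up for illustration, and the store lives in a temporary directory rather than a real repository.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path


def write_blob(objects_dir: Path, content: bytes) -> str:
    """Store content as a loose blob object and return its hash."""
    store = b"blob " + str(len(content)).encode("ascii") + b"\0" + content
    digest = hashlib.sha1(store).hexdigest()
    path = objects_dir / digest[:2] / digest[2:]
    if not path.exists():  # identical content is never written twice
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(zlib.compress(store))
    return digest


def read_blob(objects_dir: Path, digest: str) -> bytes:
    """Read a loose blob back and strip its header."""
    raw = zlib.decompress((objects_dir / digest[:2] / digest[2:]).read_bytes())
    _header, _, content = raw.partition(b"\0")
    return content


with tempfile.TemporaryDirectory() as tmp:
    objects_dir = Path(tmp) / "objects"
    # Ten "files" with identical content, as in the example above.
    hashes = {write_blob(objects_dir, b"same text\n") for _ in range(10)}
    unique_hashes = len(hashes)
    files_on_disk = len([p for p in objects_dir.rglob("*") if p.is_file()])
    round_trip = read_blob(objects_dir, next(iter(hashes)))

print(unique_hashes, files_on_disk)
print(round_trip)
```

Note that nothing in the stored object mentions a filename, which is exactly why ten differently named copies can share one blob.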

Tree:

If a blob is a file, then a tree is a directory. It acts as a map that organizes your files. A tree object lists names and maps each one to the hash of a blob (for a file) or of another tree (for a subdirectory). It also stores each entry's file mode, for example whether a file is executable. If a directory has not changed between commits, Git just points to the existing tree object rather than creating a new one.
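A tree can be sketched in the same way as a blob. The example below assumes Git's tree format: each entry is the mode, a space, the name, a NUL byte, and the 20-byte binary form of the hash, with entries sorted by name (directories sort as if their name ended in a slash). The function names and the file names `README.md`, `src`, and `app.py` are placeholders chosen for this illustration.

```python
import hashlib


def hash_object(kind: str, body: bytes) -> str:
    """Hash an object body together with its '<kind> <size>\\0' header."""
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_body(entries) -> bytes:
    """Serialise (mode, name, hex_hash) entries into a tree object body."""

    def sort_key(entry):
        mode, name, _ = entry
        # Subdirectories (mode 40000) are ordered as if named "name/".
        return (name + "/" if mode == "40000" else name).encode("utf-8")

    body = b""
    for mode, name, hex_hash in sorted(entries, key=sort_key):
        body += f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(hex_hash)
    return body


readme_blob = hash_object("blob", b"hello\n")
app_blob = hash_object("blob", b"print('hi')\n")

# The subdirectory gets its own tree; the root tree points to it by hash.
src_tree = hash_object("tree", tree_body([("100644", "app.py", app_blob)]))
root = tree_body([("100644", "README.md", readme_blob), ("40000", "src", src_tree)])
root_hash = hash_object("tree", root)
empty_tree_hash = hash_object("tree", b"")

print(root_hash)
print(empty_tree_hash)
```

Since the root tree's hash depends on the hash of `src`, a change deep inside a subdirectory changes every tree above it, while untouched sibling directories keep their hashes and are reused.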

Commits:

The commit acts as the “wrapper” that ties everything together in a snapshot of the project at a specific moment. Every commit points to exactly one root tree (representing your main project folder). It stores data like who made the change, when, and why (the commit message). Most importantly, every commit points to its parent commit (the previous commit). The very first commit has no parent, and a merge commit has more than one. This creates a chain of all the commits made so far, which acts as the history of the project.
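Here is a rough sketch of what a commit object contains and how the parent pointer forms a chain. It assumes Git's plain-text commit layout (a `tree` line, zero or more `parent` lines, `author` and `committer` lines, a blank line, then the message). The author, timestamp, and tree hashes are invented placeholder values, and `commit_body` is a made-up helper name.

```python
import hashlib


def hash_object(kind: str, body: bytes) -> str:
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree: str, parents: list, author: str, when: str, message: str) -> bytes:
    """Build the text of a commit object."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # none for the first commit
    lines.append(f"author {author} {when}")
    lines.append(f"committer {author} {when}")
    return ("\n".join(lines) + "\n\n" + message + "\n").encode("utf-8")


# Placeholder values for illustration only.
author = "Ada Example <ada@example.com>"
tree_v1 = hash_object("tree", b"")
tree_v2 = "a" * 40  # stands in for the root tree of a later snapshot

first_body = commit_body(tree_v1, [], author, "1700000000 +0000", "Initial commit")
first = hash_object("commit", first_body)

second_body = commit_body(tree_v2, [first], author, "1700000600 +0000", "Second commit")
second = hash_object("commit", second_body)

print(second_body.decode("utf-8"))
```

Because the parent's hash is part of the text that gets hashed, changing anything in an earlier commit would change the hash of every commit after it.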

How Git Tracks Changes

Git works as a version control system that tracks changes made to a project and helps developers work together. These changes are recorded in the .git folder when the git commit command is run after changes have been added to the staging area using the git add command.

What Happens Internally During `git add`:

Git has three stages:
1. Working Directory – your files on disk
2. Staging Area (index) – what you intend to commit
3. Repository (.git) – immutable history (objects)

git add records a snapshot of the files in the staging area (the files themselves stay in your working directory)
git commit turns the staged data into permanent history

Step-by-step: what git add does internally:

Git reads the file content:

Git looks at the actual bytes of the file (say, file.txt).
At this point it ignores the filename and the path; it looks purely at the content.

Git hashes the object:

Git runs SHA-1 over a short header (the object type and size) followed by the file’s content. This hash becomes the identity of the content.
Same content → Same hash
Different content → Different hash

Git creates a blob object (if needed):

If a blob with this hash already exists → Git reuses it
If not → Git writes a new blob into the object database

Git updates the index (staging area):

Git writes an entry in the index:
File path
Permission
Pointer → blob hash

Here, we are just telling Git: for the next commit, I want this version.
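The steps of git add can be sketched with two dictionaries standing in for the object database and the index. This is a simplified in-memory model, not Git's real on-disk index format; the function name `git_add` and the file names are made up for illustration.

```python
import hashlib

objects = {}  # blob hash -> stored bytes (header + content)
index = {}    # file path -> (mode, blob hash)


def git_add(path: str, content: bytes, mode: str = "100644") -> str:
    """Mimic the internal steps of staging one file."""
    # Steps 1 and 2: read the bytes and hash them (header + content).
    store = f"blob {len(content)}".encode("ascii") + b"\0" + content
    digest = hashlib.sha1(store).hexdigest()
    # Step 3: write a blob only if this content has never been seen.
    if digest not in objects:
        objects[digest] = store
    # Step 4: record path, permission, and pointer in the index.
    index[path] = (mode, digest)
    return digest


v1 = git_add("file.txt", b"first version\n")
copy = git_add("docs/copy.txt", b"first version\n")  # same content, other path
v2 = git_add("file.txt", b"second version\n")        # same path, new content

print(len(objects))        # two distinct contents were stored
print(index["file.txt"])   # now points at the second version
```

Notice that the old blob for the first version stays in the store even after the index entry moves on; staging never rewrites or deletes existing objects.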

What Happens Internally During `git commit`:

Git reads the index:

Git looks at:
- all staged files
- their blob hashes
- directory structure implied by paths

Git creates Tree objects:

Git builds:
- trees for subdirectories
- one root tree for the project

Each tree:
- maps filenames → blob hashes (and subdirectory names → tree hashes)
- stores permissions

Trees are also hashed and stored in .git/objects.

Git creates a Commit object:

Now Git creates one commit object containing:
- pointer to the root tree
- pointer to parent commit
- author, timestamp
- commit message

This commit is hashed and stored.

Git updates the branch and HEAD:
- The current branch ref is updated to point to the new commit
- HEAD, which normally points to the current branch, now resolves to the new commit as well

Result of git commit:

History grows
Snapshot becomes permanent
Previous commits remain untouched.
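The whole git commit sequence can be sketched as a small in-memory model: build trees from the index, wrap the root tree in a commit, then move the branch ref. The names `store`, `write_tree`, and `commit`, the branch `refs/heads/main`, and the author details are assumptions made for this illustration; real Git also handles many details this leaves out.

```python
import hashlib

objects = {}              # hash -> (kind, body)
refs = {}                 # ref name -> commit hash
HEAD = "refs/heads/main"  # HEAD points at the current branch (placeholder name)


def store(kind: str, body: bytes) -> str:
    header = f"{kind} {len(body)}".encode("ascii") + b"\0"
    digest = hashlib.sha1(header + body).hexdigest()
    objects.setdefault(digest, (kind, body))  # existing objects are reused
    return digest


def write_tree(index: dict, prefix: str = "") -> str:
    """Build tree objects from index entries of the form path -> (mode, hash)."""
    entries, subdirs = {}, set()
    for path, (mode, blob_hash) in index.items():
        if not path.startswith(prefix):
            continue
        name, sep, _ = path[len(prefix):].partition("/")
        if sep:
            subdirs.add(name)  # the path continues into a subdirectory
        else:
            entries[name] = (mode, blob_hash)
    for name in subdirs:
        entries[name] = ("40000", write_tree(index, prefix + name + "/"))
    body = b""
    for name in sorted(entries, key=lambda n: n + "/" if entries[n][0] == "40000" else n):
        mode, hex_hash = entries[name]
        body += f"{mode} {name}".encode("utf-8") + b"\0" + bytes.fromhex(hex_hash)
    return store("tree", body)


def commit(index: dict, message: str, when: str) -> str:
    author = "Ada Example <ada@example.com>"  # placeholder identity
    lines = [f"tree {write_tree(index)}"]
    parent = refs.get(HEAD)
    if parent:
        lines.append(f"parent {parent}")
    lines += [f"author {author} {when}", f"committer {author} {when}", "", message, ""]
    digest = store("commit", "\n".join(lines).encode("utf-8"))
    refs[HEAD] = digest  # the branch moves; HEAD follows because it names the branch
    return digest


index = {
    "README.md": ("100644", store("blob", b"hello\n")),
    "src/app.py": ("100644", store("blob", b"print('hi')\n")),
}
first = commit(index, "Initial commit", "1700000000 +0000")

index["README.md"] = ("100644", store("blob", b"hello again\n"))
second = commit(index, "Update README", "1700000600 +0000")

tree_count = sum(1 for kind, _ in objects.values() if kind == "tree")
print(tree_count)  # the unchanged src tree is shared by both commits
```

Only the root tree is rebuilt for the second commit, because the `src` directory did not change; the first commit and everything it points to stay exactly as they were.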

Conclusion:

Git might seem tricky because a lot of its processes happen in the background. However, once you understand what's inside the .git folder, it becomes clearer. Git is essentially a content-addressable database that keeps snapshots of your project using blobs, trees, and commits, all connected by hashes.

Commands like git add and git commit aren't magical; they are simply ways to interact with this structure. When you know what Git is doing internally, errors are easier to fix, debugging becomes more straightforward, and workflows feel more dependable. Instead of just memorizing commands, you gain a clear understanding of how Git tracks changes, maintains history, and keeps everything consistent. This understanding is what makes Git truly powerful.
