Mini-git, Understanding How Files Are Stored in Git Objects

WBOY

Release: 2024-08-22 18:45:03

Yesterday, I set out to implement one of Git's core features on my own: how files are stored, what Git objects are, and how hashing and compression work. It took me 4 hours to develop, and in this article I'll walk you through my thought process and approach.

What Happens When You Commit a File?

When you commit a file in Git, several important steps occur under the hood:

File Compression:

The content of the file is compressed with zlib to reduce its size. This compressed content is what gets stored in the Git object database.
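
As a rough illustration of the compression step, here is a minimal Python sketch using the standard zlib module. The sample content is made up for the example; it only shows that compressing and decompressing round-trips the bytes.

```python
import zlib

# Hypothetical file content, repeated so the size reduction is visible.
content = b"hello mini-git\n" * 100

compressed = zlib.compress(content)     # bytes that would be written to the object file
restored = zlib.decompress(compressed)  # bytes recovered when the object is read back

print(len(content), "->", len(compressed), "bytes")
```

Very small files may not shrink at all, because the zlib wrapper adds a few bytes of overhead.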

Hash Calculation:

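A SHA-1 hash is generated to serve as the identifier for the file in the Git object database. Note that Git itself computes this hash from the uncompressed content prefixed with a short header (the word blob, the size in bytes, and a null byte), not from the compressed bytes. Compression only affects how the object is stored on disk.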
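
The hashing step can be sketched as follows. This follows Git's own blob format (a `blob <size>` header, a null byte, then the uncompressed content), which may differ from what the mini-git code does; `blob_hash` is a hypothetical helper name.

```python
import hashlib


def blob_hash(content: bytes) -> str:
    """Return the SHA-1 hex digest Git assigns to a blob with this content."""
    # Git hashes "blob <size>\0" followed by the raw (uncompressed) bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# The empty blob has a well-known ID in every Git repository.
print(blob_hash(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Because the header is part of the hashed bytes, the result matches `git hash-object` for the same content.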

Storing the Object:

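The object file is stored in the .mygit/objects directory (the mini-git counterpart of Git's .git/objects), inside a subdirectory named after the first two characters of the hash. Splitting objects across subdirectories this way keeps any single directory from accumulating a very large number of files.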
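
A small sketch of how a hash maps to a location on disk under this layout. The function name `object_path` and the default root are assumptions made for the example.

```python
from pathlib import Path


def object_path(object_id: str, root: str = ".mygit") -> Path:
    """Map a 40-character hash to <root>/objects/<first 2 chars>/<remaining 38 chars>."""
    return Path(root) / "objects" / object_id[:2] / object_id[2:]


print(object_path("e69de29bb2d1d6434b8b29ae775ad8c2e48c5391").as_posix())
# .mygit/objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391
```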

Updating Commit Information:

To demonstrate how files are stored in Git, I implemented the commit functionality, taking a single file into consideration:

For every file, I calculated a hash.
Inside the objects folder, a new folder is created whose name is the first two characters of the hash.
A file is created inside that folder, named with the remaining characters of the hash. This file stores the compressed form of the committed file.
Changes are detected by comparing the newly calculated hash with the last calculated hash of the file.
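
The steps above can be sketched end to end as follows. This is not the author's code: `commit_file` is a hypothetical function, the hash uses Git's blob header, and the stored bytes are the zlib-compressed header plus content, which is an assumption borrowed from real Git.

```python
import hashlib
import tempfile
import zlib
from pathlib import Path
from typing import Optional, Tuple


def commit_file(repo: Path, content: bytes, last_hash: Optional[str]) -> Tuple[str, bool]:
    """Store content as an object under repo/objects; return (hash, changed)."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    object_id = hashlib.sha1(header + content).hexdigest()

    # Folder = first two characters of the hash, file = remaining characters.
    folder = repo / "objects" / object_id[:2]
    folder.mkdir(parents=True, exist_ok=True)
    target = folder / object_id[2:]
    if not target.exists():  # identical content is stored only once
        target.write_bytes(zlib.compress(header + content))

    # The file changed if its new hash differs from the last recorded one.
    return object_id, object_id != last_hash


with tempfile.TemporaryDirectory() as tmp:
    repo = Path(tmp) / ".mygit"
    first, changed_first = commit_file(repo, b"v1\n", None)
    again, changed_again = commit_file(repo, b"v1\n", first)
    second, changed_second = commit_file(repo, b"v2\n", again)
    print(changed_first, changed_again, changed_second)  # True False True
```

Comparing hashes only tells you whether the file changed, not which lines did; that is what the next section addresses.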

Detecting Changes

I implemented this step with my own simple approach. Git itself uses more efficient diff algorithms for these operations (its default is the Myers algorithm).

Extracted an array of lines from oldContent and from newContent.
Created a Map that stores each line as the key and its index as the value.
Created two new arrays that store the indexes of the common lines in oldContent and newContent.
For example, if OldCommonarray = [0, 3] and oldContent has four lines, then the deleted lines are [1, 2].
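
One way to read the steps above, as a minimal Python sketch. The function and variable names are hypothetical, and the handling of repeated lines (the last occurrence wins in the map) is an assumption, since the article does not say how duplicates are treated.

```python
from typing import List, Tuple


def diff_lines(old_content: str, new_content: str) -> Tuple[List[int], List[int]]:
    """Return (deleted old-line indexes, added new-line indexes)."""
    old_lines = old_content.splitlines()
    new_lines = new_content.splitlines()

    # Map each old line to its index; a repeated line keeps its last index.
    index_of_old = {line: i for i, line in enumerate(old_lines)}

    # Indexes of lines that appear in both versions.
    old_common, new_common = [], []
    for j, line in enumerate(new_lines):
        if line in index_of_old:
            old_common.append(index_of_old[line])
            new_common.append(j)

    # Anything not common was deleted (old side) or added (new side).
    old_seen, new_seen = set(old_common), set(new_common)
    deleted = [i for i in range(len(old_lines)) if i not in old_seen]
    added = [j for j in range(len(new_lines)) if j not in new_seen]
    return deleted, added


print(diff_lines("a\nb\nc\nd", "a\nd\ne"))  # ([1, 2], [2])
```

Here the common old indexes are [0, 3], so lines 1 and 2 are reported as deleted, matching the example above. This approach ignores line order and moved lines, which is part of why real diff algorithms are more involved.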

GitHub Repo
LinkedIn

Thanks a lot for your time.
