Merkle Trees in Git and Bitcoin - What They Are and Use Cases

What is a Merkle Tree?

Merkle trees, also known as “binary hash trees”, are hash-based data structures widely used in computer science applications.

It’s a tree structure where each leaf node is the hash of a block of data and each non-leaf node is the hash of its children. In a typical Merkle tree, each non-leaf node has two children.
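
As a rough illustration of that structure, the sketch below builds every level of a binary hash tree from a list of data blocks. It assumes SHA-256 as the hash function and carries an unpaired node up to the next level unchanged; both are choices made for this example, and the names `sha256` and `build_levels` are hypothetical. Real systems differ in these details.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    """Return the raw 32-byte SHA-256 digest of data."""
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Return every level of the tree, leaves first and root last."""
    if not blocks:
        raise ValueError("need at least one data block")
    level = [sha256(block) for block in blocks]  # leaf = hash of a data block
    levels = [level]
    while len(level) > 1:
        parents = []
        for i in range(0, len(level), 2):
            pair = level[i:i + 2]
            if len(pair) == 2:
                # non-leaf = hash of its two children, concatenated
                parents.append(sha256(pair[0] + pair[1]))
            else:
                # assumption: an unpaired node is carried up unchanged
                parents.append(pair[0])
        level = parents
        levels.append(level)
    return levels


levels = build_levels([b"block-0", b"block-1", b"block-2", b"block-3"])
root = levels[-1][0]
print(len(levels), root.hex())
```

With four blocks the tree has three levels: four leaves, two intermediate nodes, and one root.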

Merkle trees enable efficient data verification in distributed systems. They are efficient because peers compare hashes instead of entire files. A hash function maps input of any size to a short, fixed-size value called a hash (or digest). It is a one-way fingerprint of the data rather than an encrypted copy of it, so the original file cannot be recovered from the hash. Merkle trees are widely used in distributed and peer-to-peer systems such as Bitcoin and other blockchains, Git, and Tor.

Why are Merkle Trees important?

Merkle trees are used to verify data efficiently. For example, if Bitcoin didn’t have Merkle trees, every participant who wanted to verify a transaction would have to keep a complete copy of all the data and each transaction that ever occurred on Bitcoin. Can you imagine how tedious that would be?

Any verification request would require an extremely large amount of data to be sent over the network. And because the raw data would have to be compared directly rather than by hash, each computer would need a lot of computing power to process and verify it.

Merkle trees solve this issue. They summarize large data records in small hashes, so proving that a piece of data is valid only requires sending a small number of hashes across the network.
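
To give a feel for how little data is needed, here is a minimal sketch of a Merkle inclusion proof: the prover sends one sibling hash per tree level, and the verifier recomputes the root from them. It assumes SHA-256 and a power-of-two number of leaves to stay short; `h`, `merkle_proof`, and `verify` are hypothetical names, not part of any particular library.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_proof(leaves: list[bytes], index: int) -> tuple[bytes, list[tuple[bytes, bool]]]:
    """Return (root, proof) for leaves[index].

    Each proof step is (sibling_hash, sibling_is_left).
    Assumption: len(leaves) is a power of two.
    """
    if not leaves or len(leaves) & (len(leaves) - 1):
        raise ValueError("leaf count must be a power of two")
    level = list(leaves)
    proof = []
    while len(level) > 1:
        sibling = index ^ 1  # the other node in the same pair
        proof.append((level[sibling], sibling < index))
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof


def verify(leaf: bytes, proof: list[tuple[bytes, bool]], root: bytes) -> bool:
    """Recompute the root from a leaf hash and its sibling hashes."""
    node = leaf
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


records = [f"record-{i}".encode() for i in range(8)]
leaves = [h(r) for r in records]
root, proof = merkle_proof(leaves, 5)

print(len(proof))                         # 3 sibling hashes for 8 leaves
print(verify(leaves[5], proof, root))     # True
print(verify(h(b"forged"), proof, root))  # False
```

The proof grows with the height of the tree rather than with the number of records, which is why only a small amount of data has to cross the network.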

After the collapse of FTX (a centralized crypto exchange) in November 2022, Merkle tree based verification of proof-of-reserves has become a pressing need.

Chart-based explanation of Merkle Trees

In distributed and peer-to-peer systems, data verification is very important because the same data exists in many places. If a piece of data is modified in one location, the change must be applied everywhere. This is why it’s important to be able to check that every copy is the same.

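In a Merkle tree, the topmost node is called the “root”. The nodes at the bottom, which have no children, are called “leaves”, and every node above them has two child branches, as shown in the image above. The leaves hold the hashes of the data blocks, and every other node holds a hash derived from its children, so the root hash commits to all the data in the tree.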

An example of a hash code
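
To make this concrete, the short sketch below hashes three made-up inputs with SHA-256 from Python's standard `hashlib` module. SHA-256 is an assumption here, chosen only because it is widely used.

```python
import hashlib

inputs = [
    b"hello",              # a tiny input
    b"hello" * 1_000_000,  # about 5 MB of data
    b"hellp",              # the tiny input with one character changed
]

digests = [hashlib.sha256(data).hexdigest() for data in inputs]

for data, digest in zip(inputs, digests):
    # every digest is 64 hex characters (256 bits), whatever the input size
    print(len(data), digest)
```

Each digest has the same length regardless of how large the input is, and changing a single character produces a completely different digest.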

The intention is to limit the amount of data sent over the network, so instead of sending an entire file we just send the hash of the file to check whether it matches. The protocol of a Merkle tree is as follows:

  1. Computer A sends the root hash of its Merkle tree for the file to computer B.

  2. Computer B checks that hash against the root of its own Merkle tree.

  3. If there is no difference, the job is done! If not, go to step 4.

  4. If there is a difference in a hash, computer B requests the hashes of the two children of that node.

  5. Computer A finds the necessary hashes and sends them back to computer B.

  6. Steps 4 & 5 are repeated until computer B has found the data block(s) that are inconsistent. It's possible to discover more than one incorrect data block.
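
The steps above can be sketched as a recursive comparison of two trees. In this simplified version both trees live in the same process, so "requesting a hash" is just a list lookup rather than a network round trip. It also assumes SHA-256 and a power-of-two number of blocks, and `build_levels` and `find_mismatches` are hypothetical names.

```python
import hashlib


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_levels(blocks: list[bytes]) -> list[list[bytes]]:
    """Levels of a Merkle tree, root level first.

    Assumption: the number of blocks is a power of two.
    """
    level = [h(b) for b in blocks]
    levels = [level]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        levels.insert(0, level)
    return levels


def find_mismatches(mine, theirs, depth=0, index=0) -> list[int]:
    """Return the indices of data blocks whose hashes differ."""
    if mine[depth][index] == theirs[depth][index]:
        return []          # step 3: this subtree matches, nothing to do
    if depth == len(mine) - 1:
        return [index]     # reached a leaf: this data block differs
    # steps 4 and 5: compare the hashes of the two children
    left = find_mismatches(mine, theirs, depth + 1, 2 * index)
    right = find_mismatches(mine, theirs, depth + 1, 2 * index + 1)
    return left + right


blocks_a = [f"chunk-{i}".encode() for i in range(8)]
blocks_b = list(blocks_a)
blocks_b[2] = b"corrupted"
blocks_b[7] = b"also corrupted"

tree_a = build_levels(blocks_a)
tree_b = build_levels(blocks_b)
print(find_mismatches(tree_b, tree_a))  # [2, 7]
```

Subtrees whose hashes match are never descended into, so the comparison only touches the branches that lead to a differing block.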

Checking the entirety of a file for issues is inefficient. Merkle trees are therefore useful in a peer-to-peer network for verifying information, especially since some of it comes from untrusted sources (which is a concern in peer-to-peer systems).

Use Cases

As discussed earlier, Merkle trees are especially useful in distributed systems where the same data should exist in multiple locations.

Git

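Git is a version control system popular among programmers. Every user keeps a full copy of the repository's saved files and history on their own computer. Hence, it’s imperative to check that changes are consistent and applied across everyone’s computers. Git does this by naming every file, directory, and commit after the hash of its contents, so two repositories that agree on a commit hash agree on everything that commit contains.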
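
As a small illustration of content hashing in Git, the sketch below reproduces the ID Git gives to a file's contents (a "blob") in a repository that uses SHA-1: the hash of the header `blob <size>` followed by a zero byte and then the contents. The function name `git_blob_id` is hypothetical; the result should match what `git hash-object` prints for the same bytes. Directory ("tree") objects list the IDs of the blobs and subtrees they contain, and commits point at a tree, which is what gives Git its Merkle-style structure.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """ID Git assigns to a file's contents in a SHA-1 repository."""
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


print(git_blob_id(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391 (the empty blob)
print(git_blob_id(b"hello\n"))
```

Because the ID depends only on the contents, two users with identical files compute identical IDs without exchanging the files themselves.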

Bitcoin

Bitcoin is a blockchain-based pseudonymous currency. All Bitcoin transactions are stored in blocks, and the blocks are chained together to form the blockchain. A copy of this blockchain exists on the computer of every user who runs a full node.

The leaves of a Bitcoin Merkle tree are hashes of the individual transactions in a block, and the root of the tree is stored in the block's header. Every time someone wants to alter the blockchain, for example to add a transaction to the chain, this change needs to be reflected everywhere.
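
In Bitcoin, each block header stores the root of a Merkle tree built over the hashes of that block's transactions. The sketch below follows the two conventions Bitcoin is known for: nodes are combined with double SHA-256, and when a level has an odd number of hashes the last one is paired with itself. The transaction hashes are made-up stand-ins, the name `bitcoin_style_root` is hypothetical, and the byte-order conventions used when Bitcoin hashes are displayed are ignored, so treat this as an approximation rather than a consensus-accurate implementation.

```python
import hashlib


def double_sha256(data: bytes) -> bytes:
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def bitcoin_style_root(tx_hashes: list[bytes]) -> bytes:
    """Fold a list of 32-byte transaction hashes into a single root hash."""
    if not tx_hashes:
        raise ValueError("a block contains at least one transaction")
    level = list(tx_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # odd count: pair the last hash with itself
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# made-up stand-ins for transaction hashes, not real Bitcoin data
txs = [double_sha256(f"tx-{i}".encode()) for i in range(5)]
print(bitcoin_style_root(txs).hex())
```

Changing any one transaction changes its hash, which changes every node on the path up to the root and therefore the value stored in the block header.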

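Tampering is particularly difficult in Bitcoin’s blockchain. If an intruder wants to modify the chain for their personal benefit, changing a single transaction changes the Merkle root of its block and therefore the block's hash, so the intruder would have to redo the proof-of-work for that block and every block after it. As Satoshi Nakamoto argued in the Bitcoin whitepaper, this is impractical unless the attacker controls the majority of the network's computing power (a so-called 51% attack).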

Conclusion

Merkle trees are a significant part of blockchain technology and a concept used by many developers in the Web3 ecosystem. Learn how to build a blockchain, do your own research, and try applying the Merkle tree concept in your own blockchain.

I hope you benefited from this explanation. In case of any doubts, feel free to drop your questions in the comment section below.