Today we’re going to talk about the two key cryptographic features that maintain the integrity of all blockchains: cryptographic hash functions and digital signatures. Before we get into this topic, allow us to clarify one thing. You do not have to know how these cryptographic features work to purchase, store and/or use cryptocurrencies. Most users have little to no understanding of the underlying technology that protects the integrity of blockchains. If the technical aspects of blockchain don’t interest you, feel free to skip this post.
The main benefit of understanding blockchain cryptography is that you will develop a more complete understanding of how blockchains work and will be better able to explain it to friends and family, who will undoubtedly have tons of questions. Also, it’s a super awesome technology!
Before we begin let me introduce you to the guy who makes all of this possible.
While their exact origin is debated, hash functions started to gain traction back in the 1960s when researchers realized their potential in the budding field of computing. A hash function is a type of function that takes an input of arbitrary size and returns a string of characters of a fixed size. Cryptographic hash functions are hash functions that are used in cryptography and have some unique properties that make them extremely useful:
- They are deterministic. That is, the same input always produces the same output.
- Small changes in the input result in dramatic changes in the output.
- They are quick to compute.
- They are “one-way functions”. That is, it is extremely difficult to determine an input if you are given the output.
SHA (Secure Hash Algorithm) is a family of hash functions published by the US National Institute of Standards and Technology. Most blockchains use either SHA or similar cryptographic hash functions to secure the flow of information. Bitcoin uses SHA-256. You can check out how SHA-256 works and play with this hash function on this website. We highly recommend you try this out for a few minutes as it will make the next section a lot easier to follow.
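If you would rather poke at SHA-256 on your own machine, here is a minimal Python sketch using the standard library's hashlib module. It illustrates three of the properties listed above (deterministic, fixed-size output, small changes causing big differences). The helper names sha256_hex and differing_hex_chars, and the sample inputs, are our own choices for illustration.

```python
import hashlib


def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of a string as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def differing_hex_chars(a: str, b: str) -> int:
    """Count the positions at which two equal-length hex strings differ."""
    return sum(1 for x, y in zip(a, b) if x != y)


first = sha256_hex("blockchain")
second = sha256_hex("blockchain")               # exactly the same input again
third = sha256_hex("Blockchain")                # only the first letter changed
long_input = sha256_hex("blockchain" * 10_000)  # a much larger input

print(first == second)                    # deterministic: prints True
print(len(first), len(long_input))        # fixed size: prints 64 64
print(differing_hex_chars(first, third))  # how many of the 64 characters changed
```

The last number will typically be close to 64, which is the "dramatic change" in action. The one-way property cannot be shown this directly: there is simply no function to call that turns a hash back into its input.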
Cryptography and Passwords
To understand one of the key use cases of hash functions, let’s start with a simple example. Let’s say 2017 Average Jane and Joe are both members of a secret cult and the only way to prove their membership to each other is by stating a secret password.
Jane could check if Joe is a member of the secret cult by asking him to state the password, but there’s a problem with this: Joe doesn’t want to give away the password since he has no way of knowing if Jane is a member or not!
The solution to their problem is a hash function like SHA-256. Let’s say the actual password is:
“Support the creators of this blog if you find their content educational”
To prove their secret cult membership to each other, Jane and Joe could both insert the passphrase into the hash function (try it out for yourself) and compare the resulting hash:
By comparing the hashes instead of the actual passwords, Jane and Joe can check each other’s passwords without actually revealing them. If one of their passwords is even slightly different, the resulting hash will be completely different:
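Here is a small Python sketch of Jane and Joe's comparison, using the passphrase from this post. The function name passphrase_hash and the "impostor" variation are made up for illustration.

```python
import hashlib
import hmac


def passphrase_hash(passphrase: str) -> str:
    """Hash a passphrase with SHA-256 and return the hex digest."""
    return hashlib.sha256(passphrase.encode("utf-8")).hexdigest()


SECRET = "Support the creators of this blog if you find their content educational"

jane_hash = passphrase_hash(SECRET)
joe_hash = passphrase_hash(SECRET)
impostor_hash = passphrase_hash(SECRET + "!")  # one extra character

# compare_digest compares two strings in constant time, a common habit
# when the values being compared are secrets.
print(hmac.compare_digest(jane_hash, joe_hash))       # prints True
print(hmac.compare_digest(jane_hash, impostor_hash))  # prints False
```

Only the two hashes are ever exchanged, so neither Jane nor Joe has to say the passphrase out loud.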
Cryptographic Hash Functions in Blockchains
You’ll remember from our first post that unconfirmed transactions that are broadcast to a blockchain network go into the unconfirmed transaction cloud.
From here, thousands of nodes in the blockchain network compete by rearranging the transactions in the unconfirmed transaction cloud in an effort to solve a special puzzle.
Note: the description below covers how the Proof of Work (PoW) consensus algorithm works. PoW is used by the Bitcoin blockchain, and was used by Ethereum until it switched to Proof of Stake in 2022.
What these nodes are actually doing is packaging up unconfirmed transactions into a block (like adding rows to a spreadsheet), then adding a random number to the end of the block.
Let's have a closer look at each part of the spreadsheet:
- Hash of previous block: Each block in a blockchain has to contain the hash of the previous block in the chain, i.e. the output if you were to put all the information in the previous block through the hash function.
- List of unconfirmed transactions: Miners are free to include any number of transactions from the unconfirmed transaction cloud. If there are more transactions than can fit inside a block, miners usually pick the ones that pay the highest transaction fees.
- Miner's address: Miners also include their own address. All mining rewards are paid to this address.
- Random number: The random number completely changes the hash of the block and is essential to solving the secret puzzle.
This entire package is then fed through a cryptographic hash function like SHA-256.
“A4377BE8EE61C2EF05C0BEFC217213; Joe sends $3 to Jane; Simon sends $34 to Thomas; Ellie sends $4 to Frank; Ron sends $8 to Tom; Patrick sends $45 to Frank; 0x352d351dBEdBCC8D964892D112A967176227101d; Random number = 567,598,105,436”
Let's feed all of this into the SHA-256 hash function.
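To try this yourself, here is a rough Python sketch that joins the example block's parts into one string and hashes it. The "; " separator and UTF-8 encoding are our assumptions about how the text is serialized, so the digest you get depends on those choices. Real blockchains hash a compact binary block header, not readable text like this.

```python
import hashlib

# The parts of the example block, in the order described above.
block_fields = [
    "A4377BE8EE61C2EF05C0BEFC217213",              # hash of previous block
    "Joe sends $3 to Jane",                        # unconfirmed transactions
    "Simon sends $34 to Thomas",
    "Ellie sends $4 to Frank",
    "Ron sends $8 to Tom",
    "Patrick sends $45 to Frank",
    "0x352d351dBEdBCC8D964892D112A967176227101d",  # miner's address
    "Random number = 567,598,105,436",             # the random number
]

block_text = "; ".join(block_fields)
block_hash = hashlib.sha256(block_text.encode("utf-8")).hexdigest()
print(block_hash)

# Changing only the random number gives a completely different hash.
block_fields[-1] = "Random number = 567,598,105,437"
other_hash = hashlib.sha256("; ".join(block_fields).encode("utf-8")).hexdigest()
print(other_hash)
```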
To solve the secret puzzle, all you have to do is find a hash with a certain number of zeros at the start of the hash. The more zeros at the start of a hash, the "smaller" the hash is said to be. Let's try again.
Success! When a node finds a number that produces a sufficiently small hash, the node is said to have "solved" the block.
Well, we didn't!
Literally the only way to find a small hash is through trial and error. The smallest Bitcoin hash ever found was for Block #125,552 (blockchain.info). Look at this sexy beast:
Blocks can only be added to a blockchain if the hash is sufficiently small. If a node broadcasts a block with a larger hash to the network it will be rejected by the other nodes. The difficulty of the secret puzzle is continuously adjusted so that each block takes a set amount of time to be solved. For Bitcoin, this is roughly 10 minutes. For Ethereum under PoW, it was roughly 15 seconds.
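The trial and error can be sketched in a few lines of Python. This is a toy version under simplifying assumptions: the block is plain text, the difficulty is "four leading zeros in the hex digest", and the function name mine and the shortened block body are our own. Bitcoin itself applies SHA-256 twice to a binary block header and compares the result, as a number, against a target value.

```python
import hashlib
from itertools import count
from typing import Tuple


def mine(block_body: str, leading_zeros: int) -> Tuple[int, str]:
    """Try numbers 0, 1, 2, ... until the block's hash starts with enough zeros."""
    target_prefix = "0" * leading_zeros
    for number in count():
        candidate = f"{block_body}; Random number = {number}"
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest.startswith(target_prefix):
            return number, digest


# A shortened, hypothetical block body for illustration.
body = (
    "A4377BE8EE61C2EF05C0BEFC217213; Joe sends $3 to Jane; "
    "0x352d351dBEdBCC8D964892D112A967176227101d"
)

winning_number, winning_hash = mine(body, leading_zeros=4)
print(winning_number, winning_hash)
```

Each extra zero demanded makes the search about 16 times longer on average, which gives a feel for how a network can tune the difficulty up or down. Checking a solution, on the other hand, takes a single hash.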
Occasionally, two nodes find two different solutions to the same block at the same time.
When this happens, there is a temporary split in the blockchain where some nodes have accepted one solution while others have accepted the other.
In order to resolve this ambiguity at the end of the chain, we utilize the golden rule of blockchain:
“The longest blockchain is the true blockchain”
So instead of arguing about which blockchain is correct, each node just accepts whatever block it received first and continues to build on that. Next time a block is solved, one chain will be longer than the other.
This triggers every node still working on the shorter chain to ditch it and accept the new longest chain as the true blockchain.
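The golden rule is simple enough to sketch in Python. Here blocks are just labels in a list and choose_chain is a hypothetical helper of ours. Real Bitcoin clients compare the total work behind each chain instead of a raw block count, but the idea is the same.

```python
def choose_chain(local_chain: list, received_chain: list) -> list:
    """Keep the local chain unless the received one is strictly longer."""
    if len(received_chain) > len(local_chain):
        return received_chain
    return local_chain


shared_history = ["block 1", "block 2"]

# Two different solutions for block 3 are found at the same time.
node_a = shared_history + ["block 3 (solution A)"]
node_b = shared_history + ["block 3 (solution B)"]

# Equal length: each node sticks with the block it received first.
node_a = choose_chain(node_a, node_b)
node_b = choose_chain(node_b, node_a)
print(node_a == node_b)  # prints False, the network is temporarily split

# The next block happens to be solved on top of solution B.
longer_chain = node_b + ["block 4"]
node_a = choose_chain(node_a, longer_chain)
node_b = choose_chain(node_b, longer_chain)
print(node_a == node_b)  # prints True, both nodes agree again
```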
There is usually full agreement within the blockchain network about all but the final few blocks in the chain. At first this may sound like an awfully difficult and unnecessarily complicated way of adding new blocks to the blockchain. That is until you realize how ingeniously difficult it is to tamper with a blockchain system.
To explore exactly how this cryptography protects the integrity of the blockchain, let's have another look at a simple block.
A blockchain with six blocks would look something like this:
- Each block contains the hash of the previous block in the chain
- The hash of each block is very small (contains lots of zeros at the start)
Both of these conditions have to be true for every block in the blockchain in order for other nodes on the network to accept it. Now let's say a malicious hacker wanted to add an extra block to the middle of the blockchain (to benefit herself).
Immediately, there are a couple of problems:
- The hash of the new block is not small enough to be accepted
- The hash listed at the start of Block #4 no longer corresponds to the hash of the previous block
Other computers will not accept this blockchain until these two problems are fixed. So, to get around this, the malicious hacker uses her computing power to solve her new block and changes the number at the start of Block #4 to match the hash of her new block.
This changes the hash of Block #4, which means the malicious hacker will have to use her computing power to solve Block #4 before any other computer will accept this blockchain. She will also have to change the number at the start of Block #5 to match the new hash of Block #4, which will in turn change the hash of Block #5... You see where this is going. In order to make any retrospective changes to a blockchain, every single subsequent block has to be solved all over again. By the time the malicious hacker has solved all the blocks in her blockchain, the other nodes on the network will have already found several new blocks. The malicious hacker’s blockchain is therefore rejected since it's now shorter than the true blockchain.
For a malicious hacker to succeed at altering a blockchain, they would have to control more computing power than all the other nodes on the network combined!
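To see the knock-on effect for yourself, here is a toy Python sketch of a six-block chain and the two checks every node performs. Everything in it is a simplified assumption of ours: blocks are small dictionaries, the puzzle only asks for two leading zeros (DIFFICULTY_PREFIX), and names such as solve and first_invalid_block are hypothetical.

```python
import hashlib

DIFFICULTY_PREFIX = "00"  # toy difficulty so the example runs instantly


def block_hash(prev_hash: str, data: str, nonce: int) -> str:
    return hashlib.sha256(f"{prev_hash}; {data}; {nonce}".encode("utf-8")).hexdigest()


def solve(prev_hash: str, data: str) -> dict:
    """Find a number that makes the block's hash small enough."""
    nonce = 0
    while not block_hash(prev_hash, data, nonce).startswith(DIFFICULTY_PREFIX):
        nonce += 1
    return {"prev_hash": prev_hash, "data": data, "nonce": nonce}


def build_chain(entries: list) -> list:
    chain, prev = [], "0" * 64
    for data in entries:
        block = solve(prev, data)
        chain.append(block)
        prev = block_hash(**block)
    return chain


def first_invalid_block(chain: list):
    """Return the index of the first block that breaks a rule, or None."""
    prev = "0" * 64
    for index, block in enumerate(chain):
        digest = block_hash(**block)
        if block["prev_hash"] != prev or not digest.startswith(DIFFICULTY_PREFIX):
            return index
        prev = digest
    return None


chain = build_chain([f"transactions of block {i}" for i in range(1, 7)])
intact = first_invalid_block(chain)
print(intact)  # prints None, every block passes both checks

# The hacker rewrites the third block (index 2) ...
chain[2]["data"] = "Hacker sends herself $1,000,000"
after_tamper = first_invalid_block(chain)

# ... and re-solves it, which only pushes the problem one block further.
chain[2] = solve(chain[2]["prev_hash"], chain[2]["data"])
after_resolving = first_invalid_block(chain)
print(after_resolving)  # prints 3, the next block no longer links up
```

Fixing block 3 would break block 4, and so on to the end of the chain, all while the honest network keeps adding new blocks.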
This is where the second key cryptography feature comes in: digital signatures. Digital signatures allow a recipient of a message to confirm two things:
- That a message came from the right person
- That a message hasn’t been tampered with in transit
When a node receives a new block from another node, the first thing it does is confirm that all transactions in the block are authentic. Blockchains use a system of public and private keys to achieve this. Your private key is just a very large random number, usually written as a long string of numbers and letters. All anyone needs to gain access to all the funds, assets and information stored on your account is your private key. Keeping your private key private is therefore essential. Each private key has a corresponding public key, which is derived from the private key using one-way elliptic curve mathematics, so the private key cannot be worked out from it. Your account address is in turn a shortened hash of the public key. The below is an example of an Ethereum private key and its address that I just generated:
- Private Key: b13a1d831cd311673c4a2c04932b96d09326ec29dc8b885c5deb266c4f623656
- Corresponding Address: 0x0FD20Bdce5F2072b89FD2099070BB4cD8cA18Aa3
The address can be shared with anyone, as can the public key it is derived from. The address is what people would enter as the destination if they wanted to send a transaction to you. The goal of a digital signature is for anyone who receives a transaction/message to be able to confirm that the person who wrote the message knew the corresponding private key.
The way this works is rather ingenious. As the owner of the private key you need to
- Feed your message through a cryptographic hash function
- Feed the output of the cryptographic hash function and your private key through a signature algorithm
Then send out
- Your public key
- Your message
- Your digital signature
For a recipient to authenticate your message, they have to
- Feed the incoming message through a cryptographic hash function
- Feed the incoming signature and public key through a signature verification algorithm that produces a hash
- Compare the two hashes; if they are the same, the message is authentic
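The signing and checking steps above can be sketched in pure Python. One big caveat: Bitcoin and Ethereum sign with ECDSA on the secp256k1 elliptic curve, which needs a third-party library and checks signatures in a slightly different way. As a stand-in we use textbook RSA, because its verification step literally recovers a hash that can be compared, just like the steps above. The key below is built from two famous, publicly known primes, so it is a toy and offers no security. All names (sign, verify, message_hash) are ours.

```python
import hashlib

# Toy RSA key. P and Q are well-known Mersenne primes, so anyone could
# rebuild this "private" key; never use anything like this for real.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                           # public modulus
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def message_hash(message: str) -> int:
    """Step 1: feed the message through SHA-256 (as a number)."""
    return int.from_bytes(hashlib.sha256(message.encode("utf-8")).digest(), "big")


def sign(message: str, private_exponent: int, modulus: int) -> int:
    """Step 2: combine the hash with the private key to get a signature."""
    return pow(message_hash(message), private_exponent, modulus)


def verify(message: str, signature: int, public_exponent: int, modulus: int) -> bool:
    """Recover a hash from the signature and compare it to the message's hash."""
    recovered_hash = pow(signature, public_exponent, modulus)
    return recovered_hash == message_hash(message)


message = "Joe sends $3 to Jane"
signature = sign(message, D, N)

print(verify(message, signature, E, N))                   # prints True
print(verify("Joe sends $300 to Jane", signature, E, N))  # prints False
print(verify(message, signature + 1, E, N))               # prints False
```

Notice that verify never touches D: the public values E and N are enough to check a signature, but not to create one.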
In other words, your public key is used to verify that your digital signature can produce the same hash as your message. If either your message or your digital signature changed along the way, the two hashes would be vastly different. For someone to hack this system, they would have to change the message first and then find a digital signature that matches the new message. Because hash functions are “one-way functions” (i.e. the input cannot be determined using the output), the only way to do this is through trial and error.
Let's do a quick calculation to determine roughly how long it would take to generate a new digital signature if you don't know the corresponding private key. Let's assume every single computer on the Bitcoin network worked together to find a new digital signature:
- The hash rate (the number of hashes computed each second) of the entire Bitcoin network is ~7,000,000,000,000,000,000 or ~10^19 hashes per second (blockchain.info)
- The total number of possible solutions to a SHA-256 hash function is 2^256 or ~10^77
- To guess every possible solution to a SHA-256 hash function would take ~10^58 seconds or ~10^50 years
To have a 1% chance of guessing a digital signature the entire Bitcoin network would have to work together for ~5,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000,000 years...
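If you want to check the arithmetic, the back-of-the-envelope calculation above can be reproduced in a few lines of Python. The hash rate is the ~7 x 10^18 figure quoted above; the variable names are ours.

```python
HASHES_PER_SECOND = 7e18                  # network hash rate quoted above
POSSIBLE_OUTPUTS = 2**256                 # number of possible SHA-256 outputs
SECONDS_PER_YEAR = 365.25 * 24 * 60 * 60

seconds_for_all = POSSIBLE_OUTPUTS / HASHES_PER_SECOND
years_for_all = seconds_for_all / SECONDS_PER_YEAR
years_for_one_percent = 0.01 * years_for_all

print(f"{POSSIBLE_OUTPUTS:.2e}")       # prints 1.16e+77
print(f"{seconds_for_all:.2e}")        # prints 1.65e+58
print(f"{years_for_all:.2e}")          # prints 5.24e+50
print(f"{years_for_one_percent:.2e}")  # prints 5.24e+48
```

The last figure is where the "5 followed by 48 zeros" years comes from.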
As you can see, the distributed nature of blockchains protects the integrity of the system by pitting nodes of computational power against each other. The system will therefore remain safe as long as there is adequate incentive for people to devote their computing power to solving new blocks. The fact that Bitcoin has run a public blockchain since 2009 without a single breach is an excellent testimony to the power and reliability of this system.
Understanding how blockchain cryptography works behind the scenes has given us additional confidence in the technology. Hopefully it will do the same for you. Did you enjoy this blog post? What other technical aspects of blockchain technology would you like to hear about in the future? We would love to hear from you on our message board below! Our next post will be a step-by-step guide on how to buy your first cryptocurrency. Hope to see you there!