In the context of Bitcoin, the blockchain is a shared public ledger on which the entire Bitcoin network relies. It is structured as a linked list, with each block containing a hash of the previous block. Each block is produced by a proof-of-work algorithm, and the distributed system reaches consensus by accepting the longest valid chain. The blockchain provides the basis for Bitcoin's trustless distributed system, and it can be extended in many ways by modifying the parameters of the chain.
What is Blockchain?
Put simply, a blockchain is a 'chain' of blocks.
A block is an aggregated set of data. Data are collected and processed to fit into a block through a process called mining. Each block can be identified by a cryptographic hash (also known as a digital fingerprint). Every new block contains the hash of the previous block, so the blocks form a chain running from the very first block (known as the Genesis Block) to the most recently formed block. In this way, all the data are connected in a linked-list structure.
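The hash linking described above can be illustrated with a minimal sketch. It is not Bitcoin's actual block format: the dictionary layout, the JSON encoding, and the names `block_hash`, `build_chain`, and `links_are_intact` are assumptions chosen only for illustration.

```python
import hashlib
import json

# Assumption: the first block points at an all-zero "previous hash".
GENESIS_PREVIOUS_HASH = "0" * 64


def block_hash(block):
    """Fingerprint a block by hashing a canonical JSON encoding of it."""
    encoded = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def build_chain(items):
    """Wrap each item in a block that records the hash of the block before it."""
    chain = []
    previous_hash = GENESIS_PREVIOUS_HASH
    for item in items:
        block = {"data": item, "previous_hash": previous_hash}
        chain.append(block)
        previous_hash = block_hash(block)
    return chain


def links_are_intact(chain):
    """Check that every block stores the hash of its predecessor."""
    expected = GENESIS_PREVIOUS_HASH
    for block in chain:
        if block["previous_hash"] != expected:
            return False
        expected = block_hash(block)
    return True


chain = build_chain(["genesis", "second", "third"])
print(links_are_intact(chain))
```

Editing the data of an earlier block changes its hash, so the link stored in the following block no longer matches and `links_are_intact` reports the break.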
What is inside a block?
In simple terms, a block contains data together with an arbitrary integer (called a nonce) that is needed to produce the proof-of-work.
In the Bitcoin blockchain, a block contains a header and the relevant transaction data. A Merkle tree is built from the transactions, and the hash at its root is included in the header.
A Merkle tree is a binary tree of hash values. At the bottom level of the tree, each transaction has a leaf node containing its hash. Above that, each parent node holds the hash of its two children's values concatenated together. In Bitcoin, when a level has an odd number of nodes, the last hash is paired with a copy of itself.
The Merkle tree allows fast validation: to prove that a transaction belongs to a block, it is enough to supply the sibling hashes along the path from that transaction's leaf up to the root, so the work grows with the logarithm of the number of transactions. Since each Bitcoin transaction output can be spent only once, a transaction whose outputs have all been spent can be pruned from the tree, leaving only the hashes needed to recompute the root. In this way, disk usage can be reduced while the ability to validate is preserved.
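As a rough sketch of this construction, the function below computes a Merkle root from a list of leaf hashes. It uses double SHA-256 and duplicates the last hash on odd-sized levels, as Bitcoin does, but it ignores Bitcoin's byte-order conventions and real transaction serialization; the byte strings standing in for transactions and the names `double_sha256` and `merkle_root` are illustrative assumptions.

```python
import hashlib


def double_sha256(data):
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_root(leaf_hashes):
    """Fold a list of leaf hashes pairwise until a single root remains."""
    if not leaf_hashes:
        raise ValueError("at least one leaf is required")
    level = list(leaf_hashes)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Odd number of nodes: pair the last hash with a copy of itself.
            level.append(level[-1])
        # Each parent is the hash of its two children concatenated.
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Placeholder byte strings stand in for serialized transactions.
transactions = [b"tx-a", b"tx-b", b"tx-c"]
leaves = [double_sha256(tx) for tx in transactions]
root = merkle_root(leaves)
print(root.hex())
```

Changing any single transaction changes its leaf hash and therefore every hash on the path up to the root, which is why the root alone is enough to commit the header to the full transaction set.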
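The path-based validation described above could look roughly like the following sketch. It assumes the same simplified tree as before (double SHA-256, last hash duplicated on odd levels); `merkle_proof` and `verify_proof` are hypothetical helper names, and the `(sibling_hash, sibling_is_left)` path format is an assumption made for this example rather than a wire format used by Bitcoin.

```python
import hashlib


def double_sha256(data):
    """SHA-256 applied twice."""
    return hashlib.sha256(hashlib.sha256(data).digest()).digest()


def merkle_proof(leaves, index):
    """Return (root, path); path lists (sibling_hash, sibling_is_left) per level."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    path = []
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        sibling = index ^ 1  # neighbour within the same pair
        path.append((level[sibling], sibling < index))
        level = [
            double_sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
        index //= 2  # position of the parent on the next level up
    return level[0], path


def verify_proof(leaf, path, root):
    """Recompute the root from one leaf and its sibling hashes."""
    current = leaf
    for sibling, sibling_is_left in path:
        if sibling_is_left:
            current = double_sha256(sibling + current)
        else:
            current = double_sha256(current + sibling)
    return current == root


leaves = [double_sha256(b"tx-%d" % i) for i in range(5)]
root, path = merkle_proof(leaves, 3)
print(verify_proof(leaves[3], path, root))
```

The verifier needs only the leaf, the root, and one hash per tree level, not the other transactions in the block.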
How are blocks chained together?
Each block contains a hash of its previous block.
In the Bitcoin blockchain, the block header has a field for the previous block's hash. Every block therefore holds a reference to its predecessor, and these references link the blocks into a chain.
Sometimes a fork in the blockchain occurs. This happens when two blocks are found within a very short interval of each other. Subsequent blocks may build on either block, and both branches remain valid. As mining continues, one branch becomes longer than the other; the network then accepts the longer branch, and the shorter one is not used unless it later overtakes the longer one.
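The fork rule can be sketched with a small block tree in which each block records its parent. This is a simplification under stated assumptions: blocks are represented by short labels rather than hashes, ties are resolved in favour of the block seen first, and branch length is counted in blocks, whereas real Bitcoin nodes compare accumulated proof-of-work. The names `chain_to` and `best_chain` are illustrative.

```python
def chain_to(tip, parent_of):
    """Walk parent links from a tip back to the first block; return oldest-first."""
    chain = []
    current = tip
    while current is not None:
        chain.append(current)
        current = parent_of[current]
    chain.reverse()
    return chain


def best_chain(parent_of):
    """Pick the longest branch; on a tie, keep the branch whose tip was seen first."""
    parents = set(parent_of.values())
    tips = [block for block in parent_of if block not in parents]
    best = []
    for tip in tips:  # dicts preserve insertion order, i.e. arrival order here
        candidate = chain_to(tip, parent_of)
        if len(candidate) > len(best):
            best = candidate
    return best


# Two blocks, B1 and B2, are found at almost the same time on top of A.
parent_of = {"G": None, "A": "G", "B1": "A", "B2": "A"}
print(best_chain(parent_of))

# A new block extends B2, so that branch becomes the longer one.
parent_of["C2"] = "B2"
print(best_chain(parent_of))
```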
Features of Blockchain
Controlled Block Generation Time
The blockchain is designed so that the average time needed to generate a block remains fairly constant. In the Bitcoin blockchain, this average is 10 minutes. Other blockchains may use a different interval, e.g. 30 seconds or 5 minutes.
The controlled block generation time is achieved by encoding a difficulty value in the block. In Bitcoin, the hash of the block header must not exceed a given target value for the block to be accepted. The target is adjusted periodically (every 2016 blocks in Bitcoin) according to the total computing power of the network. The more powerful the network, the smaller the target, and hence the harder it is to generate a block.
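The adjustment can be approximated by scaling the target by the ratio of the time a batch of blocks actually took to the time it was expected to take. The sketch below uses Bitcoin's parameters (2016 blocks per adjustment period, 10 minutes per block, change limited to a factor of four per period) but leaves out details such as the compact target encoding and the maximum-target cap; `retarget` and `meets_target` are illustrative names.

```python
TARGET_BLOCK_SECONDS = 10 * 60
RETARGET_INTERVAL = 2016  # blocks per adjustment period
EXPECTED_TIMESPAN = TARGET_BLOCK_SECONDS * RETARGET_INTERVAL  # two weeks


def retarget(old_target, actual_timespan):
    """Scale the target by actual/expected time, limited to a factor of 4."""
    clamped = max(EXPECTED_TIMESPAN // 4,
                  min(actual_timespan, EXPECTED_TIMESPAN * 4))
    return old_target * clamped // EXPECTED_TIMESPAN


def meets_target(block_hash, target):
    """Interpret the hash as a big-endian integer and compare it to the target."""
    return int.from_bytes(block_hash, "big") <= target


old_target = 1 << 224
# Blocks arrived twice as fast as intended, so the target is halved
# and a qualifying hash becomes twice as hard to find.
new_target = retarget(old_target, EXPECTED_TIMESPAN // 2)
print(new_target == old_target // 2)
```

A smaller target means fewer hash values qualify, so more attempts are needed on average; this is how extra computing power is offset and the average interval is pulled back towards 10 minutes.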
Bitcoin transactions allow a short message to be sent along with the bitcoins in a transaction (via the OP_RETURN operator in the locking script of an output). This feature extends the Bitcoin blockchain to uses beyond handling payments. A sender can choose to include a text in a transaction, and the text is recorded when the transaction is included in a block attached to the blockchain. Anyone can then retrieve the message from the block, and it can hardly be modified unless the whole block is rewritten (see below). This provides reliable storage for short texts.
For example, the hash of a file could be included in a transaction. Users of the file can then compare it against the attached message to verify that their copy has not been compromised and is still the original.
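The file check in this example comes down to recomputing the file's hash and comparing it with the value stored in the transaction. The sketch below covers only that comparison; how the stored value is read back from the chain is out of scope, so `recorded` simply stands in for it. The choice of SHA-256 and the function names are assumptions for illustration.

```python
import hashlib
import hmac


def file_fingerprint(content):
    """Hash the file's bytes; 32 bytes is short enough to embed as a message."""
    return hashlib.sha256(content).digest()


def matches_recorded_fingerprint(content, recorded):
    """Compare a local copy of the file against the fingerprint from the chain."""
    return hmac.compare_digest(file_fingerprint(content), recorded)


original = b"contract v1: Alice pays Bob 5 units"
# Assumption: this value was embedded in a transaction when the file was published.
recorded = file_fingerprint(original)

print(matches_recorded_fingerprint(original, recorded))
print(matches_recorded_fingerprint(original + b" (edited)", recorded))
```

Only the fingerprint is published, so the file itself can stay private while its integrity remains publicly checkable.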
The Bitcoin blockchain is a shared public chain. This means that everyone has access to it, not only to read the information on the chain but also to append new blocks, i.e. everyone has full access to the chain. This is known as an unpermissioned chain. The chain could also be modified so that stricter access control applies. Under the strictest form, only the owner of the chain has full access, whereas others have no access at all. This resembles the way a central database stores confidential data.
However, many real-world uses fall somewhere between a shared public chain and a private chain. Through public key cryptography, access control can be built in when the chain is set up, so that different parties receive different rights. An example is an individual's health information: it should be readable only by the patient or by anyone to whom the patient has granted access, and only a trusted body should be able to append new data to the chain. This is known as a permissioned chain.
The Bitcoin blockchain uses a proof-of-work algorithm to reach consensus. The cryptographic hash of each block must not exceed a specific target value for the block to be considered valid. A nonce is included in the block so that miners can vary the hash until this condition is met. Under proof-of-work, changing the data in one block requires that block and all of its successors to be re-mined, which takes a huge amount of computation. In addition, when the chain branches, the network accepts the longest chain and discards the shorter ones. This makes the data in blocks practically unmodifiable; moreover, the more blocks are built on top of the block containing the data, the harder it is to overwrite the data.
However, a blockchain may use other methods of consensus. For example, a blockchain may use Scrypt as its proof-of-work function instead of SHA-256. In addition, the blockchain could be extended to scientific computation, where a correct solution to a certain problem acts as the validation method. In this way, the computing power could be used to help solve scientific problems and contribute to scientific research.
The Bitcoin blockchain is a shared public ledger. Each user running a full node downloads a full copy of the whole blockchain, which includes the data of every bitcoin transaction recorded on it. After that, each node can run independently to process incoming transactions and propagate them further. A node can also contribute to establishing consensus by mining, that is, by including transaction data in a block and then finding a proof-of-work for that block. There is no central node that processes and distributes the data; instead, every node runs independently and broadcasts any work that it has proved. This model of distributed computation could be extended to many other services, such as the Domain Name System.
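A toy version of the nonce search may make this concrete. The sketch below is not Bitcoin's header format: the header is assumed to be just the previous hash, some data, and an 8-byte nonce, and `TARGET` is set far easier than any real network target so the search finishes almost instantly. `pow_hash`, `mine`, and `is_valid` are illustrative names.

```python
import hashlib

# Toy target: roughly 1 in 4096 hashes qualifies.
TARGET = 1 << 244


def pow_hash(previous_hash, data, nonce):
    """Double SHA-256 over a simplified header: previous hash, data, nonce."""
    header = previous_hash + data + nonce.to_bytes(8, "big")
    return hashlib.sha256(hashlib.sha256(header).digest()).digest()


def mine(previous_hash, data):
    """Try nonces in order until the hash does not exceed the target."""
    nonce = 0
    while True:
        digest = pow_hash(previous_hash, data, nonce)
        if int.from_bytes(digest, "big") <= TARGET:
            return nonce, digest
        nonce += 1


def is_valid(previous_hash, data, nonce):
    """Checking a proof-of-work takes one hash, however long the search took."""
    digest = pow_hash(previous_hash, data, nonce)
    return int.from_bytes(digest, "big") <= TARGET


genesis_previous = bytes(32)
nonce1, hash1 = mine(genesis_previous, b"block 1")
nonce2, hash2 = mine(hash1, b"block 2")  # block 2 commits to hash1
print(is_valid(genesis_previous, b"block 1", nonce1))
print(is_valid(hash1, b"block 2", nonce2))

# Editing block 1 changes its hash, so block 2's stored link no longer matches:
# block 1 and every block after it would have to be mined again.
print(pow_hash(genesis_previous, b"block 1 (edited)", nonce1) != hash1)
```

Finding a nonce is expensive and checking one is cheap. That asymmetry is what makes rewriting a deeply buried block so costly while leaving verification easy for every node.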