At its peak in December 2017, one Bitcoin was pushing $20K CDN. Yes, if you held a single one of these virtual coins, you could sell it over the internet for $20K. By December 2018 its value had dropped to around $5K. At the time of this article, its price is more than $10K. Where it will end the year doesn’t concern me as much as where blockchain technology will end up managing data. Cryptocurrencies are probably the first application of this technology, but other applications are gaining traction over centralized flat files and databases.
To prepare for this future, I’ve been reviewing some material online. Some concepts made sense theoretically, while others still felt like science fiction: I had a theoretical understanding, but also many questions. I decided to build a dummy blockchain with Python to gain a practical understanding of some of the blockchain concepts, all the while getting some Python practice (specifically object-oriented programming, as opposed to data science / analysis scripting). This article and my project cover the basic structure of a blockchain.
Now onto our pretend Blockchain…
Picture a box with information in it. This box has an ID label, and also carries the ID label of the box sequenced before it. This box would be a block on a blockchain. The information contained within would be the stored data, and the ID labels are called “hashes”. Each box’s “previous box hash” is what links the information in sequence. This is the mechanism that secures the order of events, bolstering the integrity of the story the blockchain tells.
A hash, as defined on Investopedia, is “a function that converts an input of letters and numbers into an encrypted output of a fixed length…”. In other words, it’s a way of turning an input of any length into an output of a fixed length, which is 64 hexadecimal characters for the SHA-256 method used here. For example, hashing the input “a” will always produce the output “ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb”.
>>> import hashlib
>>> hashlib.sha256('a'.encode('utf-8')).hexdigest()
'ca978112ca1bbdcafac231b39a23dc4da786eff8147c4e72b9807785afee48bb'
I happen to use the “SHA256” method of hashing, but there are others. The algorithms in the SHA (Secure Hash Algorithm) family involve no randomness, so the same input always produces the same output, and the output is not meant to be decoded back into the input. Take another example:
>>> hashlib.sha256('Tuesday October 8, 2019: the dish ran away with the spoon.'.encode('utf-8')).hexdigest()
'643e250ff91b8a5ee3060b6074d1892f9576ca34f7096a267fedb0ecf956619a'
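To make the properties above concrete, here is a small sketch. The helper name `sha256_hex` and the altered “fork” sentence are my own illustrations and not part of the project code.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 digest of `text` as a hex string."""
    return hashlib.sha256(text.encode('utf-8')).hexdigest()

short_hash = sha256_hex('a')
long_hash = sha256_hex('Tuesday October 8, 2019: the dish ran away with the spoon.')
# Hypothetical variation of the sentence above, changed by one word.
tweaked_hash = sha256_hex('Tuesday October 8, 2019: the dish ran away with the fork.')

# Same input, same output: hashing involves no randomness.
print(short_hash == sha256_hex('a'))      # True
# The output length does not depend on the input length.
print(len(short_hash), len(long_hash))    # 64 64
# Changing one word of the input gives a different hash.
print(long_hash != tweaked_hash)          # True
```

The last check is the property the rest of this article leans on: a block’s ID can’t stay the same if its contents are edited.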
These very specific, hard-to-forge IDs, each pointing from one box to the previous, create the linkage for the chain of information. They are a direct product of the information stored within the block, so any change to the contents would change the hash, breaking the lineage of the chain. The following is my code for creating the Block object:
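from datetime import datetime

class Block:
    def __init__(self, record):
        self.timestamp = datetime.now()   # when the block was created
        self.record = record              # the data stored in the block
        self.prev_blk_id = ''             # previous block's hash, filled in by the Chain
        self.blk_id = ''                  # this block's own hash, filled in by the Chain

    def view(self):
        print('Timestamp: ' + str(self.timestamp) + '\n' +
              'Record: ' + self.record + '\n' +
              'Prev Hash: ' + self.prev_blk_id + '\n' +
              'This Hash: ' + self.blk_id)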
As defined in the “__init__” method, creating a Block simply records a timestamp (timestamp), stores the contents passed in by the caller (record), and leaves both hashes (prev_blk_id and blk_id) blank for the time being. The Block object can also present this information about itself with the “view” method.
The following creates the Chain object. The “addBlock” method is where the hashes are generated and populated:
import hashlib

class Chain:
    def __init__(self):
        self.currBlk_id = 'genesisBlk'   # placeholder "previous hash" for the first block
        self.query = []                  # Blocks, in the order they were added

    def addBlock(self, record):
        blk = Block(record)
        # hash the record together with the block's timestamp
        blk.blk_id = hashlib.sha256((str(record) + str(blk.timestamp))
                                    .encode('utf-8')).hexdigest()
        blk.prev_blk_id = self.currBlk_id   # last block's hash stamped to added block
        self.currBlk_id = blk.blk_id        # last block's hash updated to new block's
        self.query.append(blk)

    def view(self):
        for blk in self.query:
            blk.view()
            print('\n')
I’ve chosen to give the responsibility of creating blocks and hashing them to Chains. When a Chain is initiated, an empty List called “query” is created in preparation to store Blocks.
As calls are made to “addBlock”, it initiates a new Block object containing a user-specified “record”. A hash for that Block is generated from the concatenation of the record contents and the timestamp, and the previous block’s hash is stamped onto it. The Chain keeps track of its latest hash in the “currBlk_id” variable.
The first Block in any blockchain is called the “Genesis Block”. Here, the Chain’s “currBlk_id” starts out as the placeholder “genesisBlk”, which becomes the first Block’s previous hash. Lastly, the newly created Block is appended to the Chain’s “query” List. Blocks of the chain can be viewed by calling the Chain’s “view” method, which prints every Block in “query”.
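The Chain class stores the hashes but never checks them. As a rough sketch of how the linkage could be verified, assume blocks are plain dictionaries with the same four fields as a Block. The helper names `make_id`, `build_chain` and `is_valid` are hypothetical and not part of the project.

```python
import hashlib
from datetime import datetime

def make_id(record, timestamp):
    """Hash a record and its timestamp the same way addBlock does."""
    return hashlib.sha256((str(record) + str(timestamp)).encode('utf-8')).hexdigest()

def build_chain(records):
    """Build a list of block dicts linked by hash, starting from 'genesisBlk'."""
    chain, prev_id = [], 'genesisBlk'
    for record in records:
        timestamp = datetime.now()
        blk_id = make_id(record, timestamp)
        chain.append({'timestamp': timestamp, 'record': record,
                      'prev_blk_id': prev_id, 'blk_id': blk_id})
        prev_id = blk_id
    return chain

def is_valid(chain):
    """Return True if every hash matches its contents and every link is intact."""
    prev_id = 'genesisBlk'
    for blk in chain:
        if blk['blk_id'] != make_id(blk['record'], blk['timestamp']):
            return False   # contents no longer match the stored hash
        if blk['prev_blk_id'] != prev_id:
            return False   # link to the previous block is broken
        prev_id = blk['blk_id']
    return True

chain = build_chain(['Patrick ate one burger.', 'Gary ate three burgers.'])
print(is_valid(chain))    # True

chain[0]['record'] = 'Patrick ate ten burgers.'   # tamper with history
print(is_valid(chain))    # False
```

If whoever tampered with the first block also recomputed its hash, the first check would pass but the second would fail, because the next block still carries the old hash as its “previous” ID.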
Next is the blueprint of a pretend person who will add blocks to a blockchain.
class Member:
    def __init__(self):
        pass

    def contrib(self, record, chain):
        chain.addBlock(record)
Not much at all needs to be specified about the person for this demonstration. The only thing this person needs to do is contribute some information to a specified blockchain (the “record” and “chain” arguments of “contrib”). They don’t need to know anything about hashes or hashing.
Here’s a short script to simulate members, Patrick, Gary and Bob, contributing to the “burgerChain” blockchain. This could be thought of as a ledger of who ate how many burgers and when.
import random, time   # Utilities to randomize time gap between each entry

burgerChain = Chain()

patrick = Member()
gary = Member()
bob = Member()

patrick.contrib('Patrick ate one burger.', burgerChain)
time.sleep(random.randint(1, 5))   # Pause for 1 to 5 seconds
gary.contrib('Gary ate three burgers.', burgerChain)
time.sleep(random.randint(1, 5))
bob.contrib("Bob doesn't eat burgers. He ate none.", burgerChain)
time.sleep(random.randint(1, 5))
patrick.contrib('Patrick ate one more burger.', burgerChain)
Calling “burgerChain.view()” afterwards prints each block’s timestamp, record, previous hash and own hash. The first block’s previous hash is “genesisBlk”, and every later block’s previous hash matches the hash of the block before it.
Some basic mechanisms of blockchains have been developed here: blocks, (block)chains, hashing, and members’ interaction with them. More meaningful information to store would be inventory counts in supply chains, transactions, or even entire documents.
This was an over-simplified blockchain demonstration to solidify an understanding of a couple of key features of blockchains. Below are some of the characteristics where a real blockchain would differ.
Blockchains are decentralized. Contributors to a blockchain are called “nodes”, and typically each holds a copy of the blockchain, with ownership shared among them (hence the term “distributed ledger”). This is another mechanism that makes blockchains incredibly hard to compromise, all the while allowing the transparency the technology often boasts. There are actually Centralized and Decentralized blockchains. Both types have nodes, and neither implies that one node stores the information while all others read it. The difference is that a Centralized one restricts who can be a node, while a Decentralized one allows any node to participate. Sometimes Centralized vs Decentralized blockchains are also referred to as Private vs Public.
Mining might pertain more to public blockchains and the cryptocurrency application of the technology. In my simulation, adding a block to a blockchain was straightforward. The chain assigned your block of information a hash, and linked the block right on. A real blockchain puts the onus on the node to generate a hash that meets certain standards set by the network. This work the nodes are tasked with, generating a valid hash, is the “mining”.
For example, a chain could have a set requirement that hashes must start with “00”. Nodes would then have a piece of information they want to add to the chain. This piece of information’s hash probably won’t start with “00” itself, so some wildcard information must be added to generate another hash. Trying different wildcards will eventually land you a hash that begins with “00”, and once it does, the chain would accept your block contribution. This wildcard is what’s called a “nonce”.
My code for this used a random integer between 1 and 1000 as the nonce, along with the information I wanted to contribute, to mine for a hash that starts with “00”. In the run I captured, it took 2000 tries.
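A minimal sketch of such a mining loop follows. It is not the original code: it assumes a hypothetical `mine` function, and it counts the nonce up from 0 instead of drawing random integers, so that a run is repeatable and guaranteed to try a new nonce each time.

```python
import hashlib

def mine(record, prefix='00'):
    """Try nonces 0, 1, 2, ... until the hash of record + nonce starts with `prefix`."""
    nonce = 0
    while True:
        digest = hashlib.sha256((record + str(nonce)).encode('utf-8')).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest   # nonce + 1 hashes were tried in total
        nonce += 1

nonce, digest = mine('Patrick ate one burger.')
print('nonce:', nonce)
print('hash: ', digest)
```

Each hexadecimal character has 16 possible values, so a “00” prefix should take roughly 16 × 16 = 256 attempts on average, and every extra required zero multiplies that by about 16. That is the knob a network can turn to make mining harder.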
The difficulty on cryptocurrency blockchains is much higher and more complex, and solving for it is probably much more sophisticated than generating random integers, but that’s the theoretical background on block mining.