Building a Blockchain

We've all heard about blockchains, but what are they? It can be a complicated concept. The best way to learn, of course, is to roll up your sleeves and build a blockchain engine. That's what I did.

I followed the example from Daniel van Flymen called Learn Blockchains by Building One on HackerNoon. Now I'm not going to do what he did in this article; if you want to follow the code line by line, go there.

What I'm going to describe here is what I learned. If you read it you won't know blockchains. But if I'm sufficiently clear you should come away knowing something.

Current Transactions

A transaction is exactly what you think it is. "John pays Fred five dollars." The blockchain I created kept transactions very simply, but you could imagine how they can be more complex. "University of Downes awards Fred McGuire a Logic badge." Or anything.

Our blockchain engine keeps a list of current transactions. Each item in the list is a separate transaction. It's nothing fancier than that.

Block

Once our list of current transactions gets long enough (however you want to define that) we gather them together into a block. A block is, then, a list of transactions.

We also add some other properties to the block so we can keep track of it. We give the block an index number. We give it a timestamp depending on then it was created. When we put our transactions into a block, we'll have a record of them.

We could just keep a list of blocks. But then anyone could change the previous transactions. So we want to prevent people from changing earlier transactions by locking them into our block, and we want to create a chain of transactions so that new transactions are only added to the end of the chain.

Starting

We create an empty block as a starting point. We give it an index number of zero and a value we'll call a 'proof of work' of 1 (this number doesn't matter; it could be anything) and a 'previous hash' of 100 (again, the number doesn't matter). We give it a timestamp and an empty list of transactions.

Mining

Next, I want to create some transactions and store them in a new block. It will be the second block in our chain. By creating the second block, I will lock in all the changes in the first block so nobody can change it.

So here's how it works: I get the previous block, and I get the value for the 'proof of work' from the previous block. Then, using the last proof, I create a new proof of work for this block. I will create it by taking the old proof of work and solving a cryptographic puzzle for it. This is called 'mining'.

In my blockchain the puzzle is very simple. I take the old proof of work and attach a new value to the end of it. I start with '0'. So if the old proof of work was '1' then I create a string '10'. Then I encrypt my string, which produces a long output string, like this: b5ua881ui4pzws3O03/p9ZIm4n0=

The puzzle I'm trying to solve is this: find a new proof of work so that encrypting the string produces an output string that begins with '0000'. My very simple system keeps trying numbers until it gets the answer. If '0' didn't work, try '1'. If '1' didn't work, try '2'. And so on. In my system, the new proof of work was something like 726,095.

That's a lot of computing. I had to create 726,095 different encryptions until I found the right one. In real blockchains, the challenge might be even more difficult. That's why it takes so much power to mine bitcoins. They're solving problems like this to create proofs of work.

Hashing

Once I've obtained my new proof of work, I can create my new block. I give the block an index number. I give it a timestamp depending on then it was created. I put the list of transactions into it. Then I add the proof of work to it.

Finally, I want to link it to the previous block. I do this by creating a 'hash' of the previous block. I do this by encrypting the block using the same sort of encryption I used to create the proof of work. The encryption produces a long string, just like before, say,
47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=

This long string is called a 'hash'. It's a one-way encryption; I can generate a hash out of my content, but I can't generate my content out of my hash. The hash is also unique to my content. If you changed one digit of my content, you'd change the hash. So that locks the previous content. I take this string, call it the 'previous hash', and add it to my block.

My block is done. But the story isn't over.

Validating

Although it's very hard to create a new block, it's very easy to prove that the new block is valid.

First we check the proof of work. It's very easy to check because we know what it is. We create the string, encrypt the string, and check to make sure it starts with '0000'.

Then we check the hash. We do this by encrypting the previous block and checking our result against the one in the block. If they match, the has is valid.

If the block passes both tests, it's valid.

Nodes

If I were building a one-server system, I'd be done. But the list of transactions needs to be shared with other people. Also, we want other people to be able to create their own transactions.

To share my blockchain, I create a service so that anyone who requests it can receive the full structure of the blockchain from me. It will contain one or more blocks, where each block has a list of transaction, and where each block is chained to the previous block by means of the proof of work and the hash of the previous block.

Now I need a list of other servers. They will each have the same blockchain engine I have. They will set up the same way I did. And when they're ready they send a command to my system to 'register' as a blockchain engine. In my system, I keep a list of these nodes (if I wanted, I could also share it).

Notice that there's no centralized list of nodes. We don't need it. Each node (including me) keeps its own list of nodes.

We also each have our own blockchain. So which one is the right one?

Resolving Conflicts

There are different ways of resolving conflicts. We chose a very simple one: compare two blockchains. Whichever blockchain is the longest is the official blockchain.

So when I want to make sure I have the most up-to-date blockchain, therefore, I check the chain at each node, one by one. If the chain is longer at the node, I replace my blockchain with the one I found at the node. I keep going until I've finished nodes.

If everybody does this, we arrive at a consensus. This consensus reflects the history of all the transactions in the system - the exchanges of money, the recording of contracts, whatever.

Coins

In a financial system like Bitcoin, all transactions are executed in 'coins'. Where do the coins come from? They are created as a reward for mining. In our system, for each block you mine, you receive one coin.

We can also exchange coins (and record these as transactions) for anything we want. So I could sell my coins, I could trade my coins for goods or services, whatever. This depends only on the willingness of someone else to accept my coins.

What makes my coins valuable is that they're scarce. They can only be obtained by mining, and mining gets harder and harder as the blocks get bigger (it can also be made harder by design, which is what Bitcoin did). So instead of setting up a computer and mining your own coins, you might find it easier to buy one from me.

But note that we don't have to use coins. We could simply agree to use this as a distributed record-keeping system. But we'd also have to agree to pay our fair share of the costs of keeping it running.

Complexity

As I mentioned, this is a simple blockchain system. We could make it a lot more complex in a variety of ways.

For one thing, we need to make sure we don't release our current transactions too early. We need to make sure they're locked into the blockchain somewhere, even if someone else's blockchain was longer and took precedence. One way to do this is to notify other nodes of the transactions as they occur. In some blockchains they wait until the transactions are ten blocks deep into the system before releasing their local copies of current transactions. 

Another thing we need to do is to create identities for the people conducting the transactions. So a node might have a list of accounts, such that these accounts are listed as the ones conducting the transaction. I might have an identity like @downes@downes.blockchain or some such thing.

I might also want to check to be sure whether a person is entitled to make a transaction before entering it into my system. If, for example, @downes@downes.blockchain wants to transfer 1,000 coins to someone, there needs to be a history of transactions for  @downes@downes.blockchain such that he has more than 1,000 coins to transfer.

In the case of other types of transactions, other types of checks would need to be made. Only accredited institutions should grant degrees, for example. Only reputable agencies should grant badges. How is this accomplished? Ideally, it is built into the system, so that participation in a biven blockchain network constitutes agreement regarding the constraints on transactions.

The nature of the encryption challenge in the mining process could also be changed. The main condition is that it should be hard to solve but easy to check, and any number of one-way functions and algorithms fit the bill. There needs, however, to be a balance between how hard it is to solve and how quickly we can create new blocks.

Also, to arrive at a consensus, I could check every other node, but why would I want to? If I know that Frank checks all the other nodes, I could just check with Frank. In a network we can propagate from end to end by checking only a few nodes, provided we do it right.

Finally

This is just the tip of the iceberg. There are many issues with my own application, as well as the Flymen example cited above. But if you've understood this article up to this point, you understand blockchains.

If you want to see my code you can check it out at GitHub: https://github.com/Downes/blockchain

I have an endpoint running just for fun at http://www.downes.ca/cgi-bin/app.cgi and yes you could add your node to the system. But remember. Just for fun. :)



Comments

  1. This comment has been removed by a blog administrator.

    ReplyDelete
  2. This comment has been removed by a blog administrator.

    ReplyDelete
  3. This comment has been removed by a blog administrator.

    ReplyDelete

Post a Comment

Your comments will be moderated. Sorry, but it's not a nice world out there.

Popular Posts