Verifying Symbol account state proofs for fun and profit (part 1)

But mostly for fun.

Public network has magic feature called state proofs. I don’t think this has been described with enough details or I don’t think most users realize what this feature gives.

State proofs allow to PROVE to some third party, that at height X, your account state (or mosaic state, or namespace state, or any other state) was equal to Y.

There is one small catch here: right now there is no easy way to retrieve via APIs the state at given height, so assumption is the state was saved at proper height. Best part is, that such state can be validated by verifier in a trustless manner.

Serializing state

Let’s take a look at following Symbol testnet account: TAKUOCDJH4KUDYP7ZH5HMW43QXYXTLFEDQUR4MI

Symbol block explorer screenshot
Symbol block explorer screenshot
TAKUOCDJH4KUDYP7ZH5HMW43QXYXTLFEDQUR4MI account details as seen in testnet block explorer

Account details can be retrieved by querying REST API endpoint /accounts/<account-id>, at the time of writing this, the data returned looks like this:

{
"account": {
"version": 1,
"address": "98154708693F1541E1FFC9FA765B9B85F179ACA41C291E31",
"addressHeight": "119481",
"publicKey": "13182CF21C0C13BA3FB5401EEED94D274C4305A54017ABCA7CDDEB85A173F765",
"publicKeyHeight": "119490",
"accountType": 0,
"supplementalPublicKeys": {},
"activityBuckets": [],
"mosaics": [
{
"id": "091F837E059AE13C",
"amount": "8539426394"
}
],
"importance": "0",
"importanceHeight": "0"
},
"id": "<node-based, not important>"
}

Currently symbol python core-sdk, does not allow to serialize account state, but account state has it’s definition inside catbuffer-schemas. So let’s do pretty ugly, manual serialization of the data above.

The account above has importance = 0, which means this is not a high-value accounts, which makes things much easier.

Python code for serializing account state (naive version, only for simple cases)
Python code for serializing account state (naive version, only for simple cases)
serializing account state like there’s no tomorrow

Last thing needed before verifying account state, is to calculate sha3 hash of the serialized data

010098154708693f1541e1ffc9fa765b9b85f179aca41c291e31b9d201000000000013182cf21c0c13ba3fb5401eeed94d274c4305a54017abca7cddeb85a173f765c2d20100000000000000000001003ce19a057e831f095a4efdfc01000000

Resulting account state sha3 hash is: 9532475d9217848ca629e690c89470f18c95497d8913233e74fa32b802b3c5d5
It will be verified later.

Verifying state tree

Let’s query current merkle tree path of that account. The REST API endpoint for this is /accounts/<account-id>/merkle, returned data looks like following:

REST API account merkle path response
REST API account merkle path response
merkle path of account in question (click to zoom in)

REST API returns the tree both in “raw” format — which needs parsing and in “parsed” format which is easier to process.

Returned merkle tree path is from the top of the tree to the bottom (leaf). The verification can be in either direction — either from bottom to the top, or from top to the bottom.

When visualized, the tree will look like so (I deliberately shortened the links==hashes . This is full tree, so missing links are exchanged with 00-hashes)

Merkle tree path: 16 branches at first level, branch taken at 0xe, 16 branches at second level, branch taken at 0x2, 16 branches at 3rd level, branch taken at 0x0, finally a leaf.
Merkle tree path: 16 branches at first level, branch taken at 0xe, 16 branches at second level, branch taken at 0x2, 16 branches at 3rd level, branch taken at 0x0, finally a leaf.
Full merkle tree path visualized

Let’s start with a path. Path is simply hash of a key. In case of account state cache, key is simply account address. So in case of account TAKUOCDJH4KUDYP7ZH5HMW43QXYXTLFEDQUR4MI the path is:

>>> sha3_256(unhexlify('98154708693F1541E1FFC9FA765B9B85F179ACA41C291E31')).hexdigest()E220E4E4BCF533F0E0082F28E65B8EA5A278D255DB22D61D097A2F9008676F3A

Every branch in a tree might contain non-empty path, but in example above it’s always empty, so going from the top to the bottom, proper nibbles (half of a byte is called a nibble) from account path must be taken. In order: E, 2, 2, 0, so “leftover” path is E4E4BCF533F0E0082F28E65B8EA5A278D255DB22D61D097A2F9008676F3A

Note: if it’s not possible to traverse the tree from the top to the bottom using proper branches, it means tree has been manipulated.

Now, it’s easy to notice that the leaf (“type”: 255) value is equal to sha3 hash calculated previously, yay! one check done.

{
"type": 255,
"path": "E4E4BCF533F0E0082F28E65B8EA5A278D255DB22D61D097A2F9008676F3A",
"encodedPath": "20E4E4BCF533F0E0082F28E65B8EA5A278D255DB22D61D097A2F9008676F3A",
"nibbleCount": 60,
"value": "9532475D9217848CA629E690C89470F18C95497D8913233E74FA32B802B3C5D5",
"leafHash": "39D61E874BE3A301D1FEE4DDA1D2F6D53030004A51E6399BFB6F58EE35218EC4"
}

Now, what verifier needs to do is to check all the hashes on the way. Let’s start from the bottom as it will be easier.

leafHash is a hash of concatenated encodedPath(*) and value, so:
(* — how to obtain encodedPath from path is bit outside of the scope of this post, curious readers should take a look at chapter 4 in catapult whitepaper)

>>> sha3_256(unhexlify('20E4E4BCF533F0E0082F28E65B8EA5A278D255DB22D61D097A2F9008676F3A') + unhexlify('9532475D9217848CA629E690C89470F18C95497D8913233E74FA32B802B3C5D5')).hexdigest()39D61E874BE3A301D1FEE4DDA1D2F6D53030004A51E6399BFB6F58EE35218EC4

Yay! It matches.
Path nibble that took us to this leaf had value 0. That means that “link” at 0th element at level 4 must equal to that hash and indeed it does.

In a similar fashion, branchHash — is hash of concatened encodedPath and all the links at given level. As noted earlier missing elements must be filled with 00-hash. So for level 4, encodedPath is equal to ‘00’ and that gives us:

>>> zero = unhexlify('00000000000000000000000000000000000000000000000000000000000000')
>>> sha3_256(unhexlify('00') + u('39D61E874BE3A301D1FEE4DDA1D2F6D53030004A51E6399BFB6F58EE35218EC4') + zero * 3 + unhexlify('6A69ADA0538FC9582D813619288B400038286522206FEA14D231F0D9775900E3') + zero * 11).hexdigest()
B1508023DFF34155479B81C93EE43926F590E59E0CEA66F4633D5929B809AC34

This can also be verified here. And you can check that this matches branchHash at level 4.

Path nibble that leaded here was 2 and link at level 3 at position 2, matches calculated hash.

Now we need to tediously repeat the process, I’ll just paste links that calculate proper branchHashes:

Calculated root hash is B6B9F48079B27914FFE030B8CBBE6C81ED30FEB4E007946962C4283CBD99C581

Finally, prover can check, that the calculated hash matches state tree hash at that concrete height, for which merkle tree path was obtained:

http://explorer.testnet.symboldev.network/blocks/173056

Block explorer information for block at height 173056

Now, I took a shortcut and referenced block explorer, but to do this fully trustlessly, the verifier should gather all block headers — only headers, no need to actually verify their contents — check that they form a chain, and then can validate that account state hash in that block actually matches computed one.

That last part can be optimized a lot, by checking only blocks at multiples of finalization epoch.

Summary:

  • account state
  • merkle tree path at given height

Verifier:

  • serializes account state,
  • validates serialized state against merkle tree path
  • validates state root hash against block header

The title is blatant reference to (in)famous 25 years old article.

References:

mijin/nem developer