Verifying Symbol account state proofs for fun and profit (part 1)
Public network has magic feature called state proofs. I don’t think this has been described with enough details or I don’t think most users realize what this feature gives.
State proofs allow to PROVE to some third party, that at height X, your account state (or mosaic state, or namespace state, or any other state) was equal to Y.
There is one small catch here: right now there is no easy way to retrieve via APIs the state at given height, so assumption is the state was saved at proper height. Best part is, that such state can be validated by verifier in a trustless manner.
What is inside the state? Short answer is: “probably too much”. But let’s dive in.
Let’s take a look at following Symbol testnet account:
Account details can be retrieved by querying REST API endpoint
/accounts/<account-id>, at the time of writing this, the data returned looks like this:
"id": "<node-based, not important>"
Currently symbol python core-sdk, does not allow to serialize account state, but account state has it’s definition inside catbuffer-schemas. So let’s do pretty ugly, manual serialization of the data above.
The account above has importance = 0, which means this is not a high-value accounts, which makes things much easier.
Last thing needed before verifying account state, is to calculate sha3 hash of the serialized data
Resulting account state sha3 hash is:
It will be verified later.
Verifying state tree
There’s a pretty detailed description of merkle trees in chapter 4 in catapult-whitepaper, so I’m not gonna repeat that here.
Let’s query current merkle tree path of that account. The REST API endpoint for this is
/accounts/<account-id>/merkle, returned data looks like following:
REST API returns the tree both in “raw” format — which needs parsing and in “parsed” format which is easier to process.
Returned merkle tree path is from the top of the tree to the bottom (leaf). The verification can be in either direction — either from bottom to the top, or from top to the bottom.
When visualized, the tree will look like so (I deliberately shortened the links==hashes . This is full tree, so missing links are exchanged with 00-hashes)
Let’s start with a path. Path is simply hash of a key. In case of account state cache, key is simply account address. So in case of account
TAKUOCDJH4KUDYP7ZH5HMW43QXYXTLFEDQUR4MI the path is:
Every branch in a tree might contain non-empty
path, but in example above it’s always empty, so going from the top to the bottom, proper nibbles (half of a byte is called a nibble) from account path must be taken. In order:
0, so “leftover” path is
Note: if it’s not possible to traverse the tree from the top to the bottom using proper branches, it means tree has been manipulated.
Now, it’s easy to notice that the leaf (
“type”: 255) value is equal to sha3 hash calculated previously, yay! one check done.
Now, what verifier needs to do is to check all the hashes on the way. Let’s start from the bottom as it will be easier.
leafHash is a hash of concatenated encodedPath(*) and value, so:
(* — how to obtain
path is bit outside of the scope of this post, curious readers should take a look at chapter 4 in catapult whitepaper)
>>> sha3_256(unhexlify('20E4E4BCF533F0E0082F28E65B8EA5A278D255DB22D61D097A2F9008676F3A') + unhexlify('9532475D9217848CA629E690C89470F18C95497D8913233E74FA32B802B3C5D5')).hexdigest()39D61E874BE3A301D1FEE4DDA1D2F6D53030004A51E6399BFB6F58EE35218EC4
Yay! It matches.
Path nibble that took us to this leaf had value
0. That means that “link” at 0th element at level 4 must equal to that hash and indeed it does.
In a similar fashion,
branchHash — is hash of concatened encodedPath and all the links at given level. As noted earlier missing elements must be filled with 00-hash. So for level 4, encodedPath is equal to ‘00’ and that gives us:
>>> zero = unhexlify('00000000000000000000000000000000000000000000000000000000000000')
>>> sha3_256(unhexlify('00') + u('39D61E874BE3A301D1FEE4DDA1D2F6D53030004A51E6399BFB6F58EE35218EC4') + zero * 3 + unhexlify('6A69ADA0538FC9582D813619288B400038286522206FEA14D231F0D9775900E3') + zero * 11).hexdigest()B1508023DFF34155479B81C93EE43926F590E59E0CEA66F4633D5929B809AC34
This can also be verified here. And you can check that this matches
branchHash at level 4.
Path nibble that leaded here was
2 and link at level 3 at position 2, matches calculated hash.
Now we need to tediously repeat the process, I’ll just paste links that calculate proper branchHashes:
Calculated root hash is
Finally, prover can check, that the calculated hash matches state tree hash at that concrete height, for which merkle tree path was obtained:
Now, I took a shortcut and referenced block explorer, but to do this fully trustlessly, the verifier should gather all block headers — only headers, no need to actually verify their contents — check that they form a chain, and then can validate that account state hash in that block actually matches computed one.
That last part can be optimized a lot, by checking only blocks at multiples of finalization epoch.
- account state
- merkle tree path at given height
- serializes account state,
- validates serialized state against merkle tree path
- validates state root hash against block header
The title is blatant reference to (in)famous 25 years old article.
- account information — https://docs.symbolplatform.com/symbol-openapi/v1.0.1/#operation/getAccountInfo
- account merkle path — https://docs.symbolplatform.com/symbol-openapi/v1.0.1/#operation/getAccountInfoMerkle