I published a new RSKIP, RSKIP173 “Chunk-Based Code Merkleization using the Unitrie”, based on Ethereum’s EIP-2926. The objective is to enable stateless clients, and eventually sharding over RSK. This is a very promising research direction now that our research on syncchains is public.
RSKIP173 makes heavy use of the properties of the Unitrie to streamline the implementation. The main benefit is that the design is clean and simple. The downside may be lower execution efficiency: loading all of a contract’s code at once (even if it is 24 Kbytes long) may take about the same time as loading a few bytes, because load time is dominated by SSD access time. Splitting the code into chunks on SSD therefore will not speed up loading, and execution that touches several chunks may need several accesses instead of one.
It would be great to run some performance tests to get better insight into this.
I’ll be working on a stateless client proposal for RSK in the coming months, which I hope becomes the basis of a sharding system. It would be great to have more people working with me on this.
Why save the merkleization in persistent storage? It could be calculated on the fly. The light client is not the most popular use case now. Having code merkleization on the fly and cached is the “baby step” way to REALLY collect data about its need and added value. BTW, the same functionality could be added with the Ethereum tries.
The reason to store in the persistent storage is stated in the motivation of the RSKIP.
I copy it here: By using the Unitrie to store the chunks, we simplify considerably the stateless client wire protocol and stateless execution logic.
Secondly, I suppose that by “Having code merkleization on the fly and cached” you mean that the code Merkle root should not even be part of the world state. But that prevents stateless nodes from synchronizing without receiving the full contract code database from peers.
The original EIP-2926 computes the root and stores only the root in the world state to let stateless clients synchronize.
Finally, you say “the functionality could be added with Ethereum Tries”. I’m not sure what you mean by that. RSK does not have Ethereum tries anymore. It has unitries. Did you mean unitries?
Also, you say “The light client is not the most popular use case now.”, but this category is for the discussion of stateless clients, which can’t be a popular use case yet because the RSK node doesn’t implement them. It cannot be a popular use case in Ethereum either, because Ethereum doesn’t have stateless clients.
If you mean that there won’t be any use case for stateless clients in the future, then, if that were true, the whole Ethereum 2.0 community would be wrong.
I think sharding has many use cases, but it won’t solve the problem of high fees in the finance-heavy shards.
I also mean that the Merkle tree could be built without using any trie structure. Merkle trees are simpler than tries, and the proof is simpler.
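To make the "plain Merkle tree, no trie" idea concrete, here is a minimal Python sketch. It is only an illustration, not the EIP-2926 or RSKIP173 construction: the 32-byte chunk size, SHA-256, the zero-hash padding and all function names are assumptions chosen for the example.

```python
import hashlib

CHUNK_SIZE = 32      # assumed chunk length in bytes, for illustration only
EMPTY = bytes(32)    # assumed padding node used when a level has an odd length


def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def split_code(code: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split contract code into fixed-size chunks (the last may be shorter).
    Empty code yields a single empty chunk so the tree always has a leaf."""
    return [code[i:i + size] for i in range(0, len(code), size)] or [b""]


def build_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from the leaf hashes up to the root."""
    level = [h(b"\x00" + c) for c in chunks]  # 0x00 prefix marks a leaf
    levels = [level]
    while len(level) > 1:
        padded = level + [EMPTY] * (len(level) % 2)
        # 0x01 prefix marks an inner node (domain separation from leaves)
        level = [h(b"\x01" + padded[i] + padded[i + 1])
                 for i in range(0, len(padded), 2)]
        levels.append(level)
    return levels


def prove(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hash at each level for the chunk at `index`."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        proof.append(level[sibling] if sibling < len(level) else EMPTY)
        index //= 2
    return proof


def verify(chunk: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the path from one chunk to the root using only the proof."""
    node = h(b"\x00" + chunk)
    for sibling in proof:
        if index % 2 == 0:
            node = h(b"\x01" + node + sibling)
        else:
            node = h(b"\x01" + sibling + node)
        index //= 2
    return node == root
```

A verifier needs only the chunk, its index, the sibling hashes and the root. There is no key encoding or node-type handling as in a trie proof, which is the simplicity being argued for here.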
“On the fly” means that the tree need not be stored; only an additional hash is added to the account state. The original code hash is still there. The additional hash is computed only at some points of execution and for some contracts (the first time a contract with length > X is executed).
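A rough sketch of that "compute on first use, then cache" behaviour, in Python. Everything here is hypothetical: the class name, the threshold value standing in for X, and the pluggable `root_fn` (any function that maps code bytes to a Merkle root). How the resulting hash would be committed to the account state is a consensus question this sketch does not address.

```python
from typing import Callable, Dict, Optional

MIN_CODE_LENGTH = 1024  # hypothetical value for the threshold "X"


class LazyCodeRootCache:
    """Computes a code Merkle root the first time a large contract is used."""

    def __init__(self, root_fn: Callable[[bytes], bytes],
                 min_length: int = MIN_CODE_LENGTH) -> None:
        self._root_fn = root_fn          # assumed: code bytes -> Merkle root
        self._min_length = min_length
        self._roots: Dict[bytes, bytes] = {}
        self.computations = 0            # counts actual root computations

    def root_for(self, code_hash: bytes, code: bytes) -> Optional[bytes]:
        """Return the cached root, computing it on first use.
        Contracts at or below the threshold get no root at all (None)."""
        if len(code) <= self._min_length:
            return None
        root = self._roots.get(code_hash)
        if root is None:
            root = self._root_fn(code)
            self._roots[code_hash] = root
            self.computations += 1
        return root
```

The cache is keyed by the existing code hash, so two contracts with identical code share one computation, and the `computations` counter is the kind of REAL usage data that could be collected before any consensus change.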
I mean: any consensus change should be evaluated. One of the criteria would be the added value to the blockchain involved. In this case, only with REAL data on light client use (without code merkleization) and consumption could we evaluate the REAL gain of such an implementation (code merkleization). Any Ethereum analysis is, in some way, not so relevant: they have different use cases, different patterns of use, different contracts.
In short: I mean any consensus change should be evaluated with REAL data.
Thanks @ajlopez for the clarifications. You propose to use something similar to EIP-2926 but, instead of computing all the trees at once, to compute each Merkle tree the first time the code needs to be used. That seems to be a good optimization for saving the one-time cost of iterating over the whole state trie. However, I’m not worried much about the one-time cost, because the RSK state trie is not so big now. We could run a simple test that traverses the whole trie (skipping storage cells) to check how much time it takes. I suppose it will take a few minutes. In Ethereum I suppose it would take more like an hour.
Using Merkle trees is good, and not changing the structure of the code in the trie has advantages for code loading, but disadvantages for the stateless block transfer protocol, which must do special things related to the code witness part.
I think that coding and testing both approaches would help. It’s an interesting research topic.
But you should not confuse light clients with stateless clients. Light clients are based on an inferior security model (SPV) while stateless clients are fully validating.
Stateless clients can be used for sharding (efficiently verifying parallel RSK sidechains), while light clients should not be used for that purpose.
Stateless clients could be used as light clients, but generally they are not efficient enough on smartphones. Stateless light clients on PCs and IoT appliances, on the contrary, do make sense.
So here we’re talking about the future: basically, sharding. Will sharding be used? It’s the selling point of several blockchains, such as Polkadot and Ethereum 2.0.