How does Snapsync work?

I am trying to run an RSKJ node for the first time, and naturally I am more interested in a faster sync than in validating historical blocks. I believe sidechains should be checkpointed by their bridge operations, since any reorg reaching back before a successful pegout breaks the peg and can be considered double spending.

But I can’t find any detailed documentation about the algorithm behind snapsync, so I can’t judge how far it is from what I would hope to see:

  1. Find the latest successful pegout transaction on Bitcoin
  2. Find a checkpoint block associated with it
  3. Sync the state at that checkpoint
  4. Start syncing headers from that block to calculate the accumulated proof of work (hopefully with improvements similar in effect to FlyClient).
  5. Once the fork choice is decided, download the blocks since the checkpoint to update the state locally.
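
Steps 4 and 5 above could be sketched roughly as follows. This is only an illustration of the flow I am hoping for, not what snapsync or RSKJ does. The `Hdr` type, its fields, and `choose_fork` are hypothetical names, and real proof-of-work validation of each header is omitted.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Hdr:
    """Hypothetical, stripped-down header: just enough to link and weigh."""
    hash: str
    parent_hash: str
    difficulty: int  # work contributed by this header


def is_linked(chain: Sequence[Hdr], checkpoint_hash: str) -> bool:
    """True if the chain starts right after the checkpoint and every
    header points at the one before it."""
    parent = checkpoint_hash
    for header in chain:
        if header.parent_hash != parent:
            return False
        parent = header.hash
    return True


def choose_fork(checkpoint_hash: str,
                candidates: Sequence[List[Hdr]]) -> Optional[List[Hdr]]:
    """Pick the candidate header chain with the most accumulated work
    among those that build on the checkpoint. Returns None if no
    candidate builds on the checkpoint."""
    best: Optional[List[Hdr]] = None
    best_work = -1
    for chain in candidates:
        if not chain or not is_linked(chain, checkpoint_hash):
            continue
        work = sum(header.difficulty for header in chain)
        if work > best_work:
            best, best_work = chain, work
    return best
```

Only after `choose_fork` has settled on a chain would the client download the full blocks for it and replay them on top of the checkpoint state.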

I am sure this is not what snapsync does, but I would love to read any documentation on how it actually works.


Hey Nuh! I’ve pinged some of the core contributors about this so they can help out.


After some practice (GitHub - Nuhvi/rsk: WIP Rust implementation of rskj for educational purposes... for now.) I have now realized that unless the bridge commits to a previous block in the pegout transaction, I can’t use the Bitcoin transaction as a checkpointing device. That is because anyone can create a Rootstock block that mentions the pegout transaction; what I need is the opposite.

I don’t know why the bridge is not doing that. Is there a fundamental problem with it, or is it just that it requires a fork? If so, I would love to see that happen as soon as possible, and it would be great if you welcomed RSKIPs to that end.

For now, I think the only option I have is exactly what the snapshot sync does, which means downloading all headers, finding a “good” header (according to some criteria), syncing the state at that snapshot and going forward. Or, for a light client, simply downloading all headers, which is very unfortunate because there are many more Rootstock headers than Bitcoin headers, and they are much bigger than Bitcoin’s.

Alternatively, maybe there could be a Rootstock version of https://www.raito.wtf/ compressing all headers to a STARK proof.

The only way for your light client to decide which checkpoint is the correct one would be to find one transaction that spends bitcoins from the RSK peg address. But how do you validate that the peg address is the current active one? You need to download a rskj node and sync it, then query the bridge contract.

So you’re back at the same problem: syncing the rsk chain. There are however several ways to sync faster:

  1. Flyclient
  2. Superchain
  3. Ephemeral Blockchain (RSKIPs/IPs/RSKIP215.md at master · rsksmart/RSKIPs · GitHub)

The first allows probabilistic proofs of cumulative work, so you download fewer headers.
The second allows super-headers of higher difficulty, so you can advance faster on the chain.
The third allows pruning old headers and not downloading them, while still having a high assurance that it’s the correct chain based on enough cumulative work.

Actually you can do pruning-light-client on the client side without RSKIP215. You ask for the latest block height X, you go back 10k blocks, then start downloading the headers from X-10k up to X, checking the cumulative work. If you’re satisfied with the amount of work collected, you stop. If not, keep going back 10k blocks until you are.
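
A minimal sketch of that client-side loop, under stated assumptions: `Header`, `fetch_headers` and `min_work` are hypothetical names, `fetch_headers(low, high)` is assumed to return the headers for the inclusive height range, and the per-header proof-of-work and parent-hash checks a real client needs are left out.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Header:
    """Hypothetical minimal header: height plus the work it contributes."""
    number: int
    difficulty: int


def collect_work(tip: int,
                 fetch_headers: Callable[[int, int], List[Header]],
                 min_work: int,
                 step: int = 10_000) -> Tuple[int, int]:
    """Walk back from `tip` in windows of `step` blocks, summing work,
    until `min_work` is reached or genesis is hit.

    Returns (lowest height downloaded, accumulated work)."""
    total = 0
    high = tip
    while high >= 0:
        low = max(0, high - step)
        # Download headers low..high (inclusive) and add up their work.
        for header in fetch_headers(low, high):
            total += header.difficulty
        if total >= min_work:
            return low, total
        # Not satisfied yet: move the window further back.
        high = low - 1
    return 0, total
```

The caller chooses `min_work`; if the whole chain carries less work than that, the function ends up having downloaded every header back to genesis.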


But how do you validate that the peg address is the current active one? You need to download a rskj node and sync it, then query the bridge contract.

This is admirably conservative, but I am not convinced that it is a real limitation. As far as I understand, changing the peg address is a hard fork; even if it isn’t an actual hard fork according to the protocol, it is one according to me as a user.

To explain that, imagine the point of view of a user who is trying to peg in to Rootstock. That user couldn’t care less what happened between genesis and now; their relationship to Rootstock is defined by the peg address they are told to deposit to. They can be introduced to that address through a hardcoded constant in a wallet they downloaded or an address in a bridge web app, but they can always go check the funds in that address, and hopefully read the script and compare it with the advertised federation, or in the future the Union bridge or a true smart contract. It doesn’t matter; what matters is that the user decides where they are going first.

Now if that address is supposed to change while the user still has their money in it, then at worst they need to do a linear sync as a stateless client from the moment they deposited their money to the moment they are trying to sync or withdraw, in which case they can learn the new peg address accordingly.

The only changes I wish to see here are:

  1. The pegout transaction points to the latest Bitcoin block number that is an ancestor of the Rootstock block where the releaseRequest occurred.
  2. Rootstock block headers contain a pointer to the latest Bitcoin block in which the balance of the peg address matches the balance of RBTC.
  3. Rootstock block headers contain the total balance of RBTC

I think these are sufficient for a new joining user to:

  1. find a checkpoint and ignore older data, including stateless clients and full nodes trying to skip IBD.
  2. be fairly confident, as a light client, that the peg is 1-1
  3. verify, as a stateless client, that the peg is 1-1 by downloading the entire state.
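
To illustrate the light-client check I have in mind, here is a small sketch. The header fields are the ones proposed above and do not exist in today's Rootstock headers. `ProposedHeader` and `peg_balance_sat_at` are hypothetical names, and the sketch assumes the header reports RBTC in wei while the Bitcoin side reports satoshis.

```python
from dataclasses import dataclass
from typing import Callable

# 1 RBTC = 10**18 wei and 1 BTC = 10**8 satoshi, so 1 satoshi = 10**10 wei.
WEI_PER_SATOSHI = 10**10


@dataclass(frozen=True)
class ProposedHeader:
    """Hypothetical header fields from the proposal above."""
    rsk_number: int
    btc_block_hash: str   # pointer to the matching Bitcoin block
    total_rbtc_wei: int   # total RBTC balance committed in the header


def peg_is_one_to_one(header: ProposedHeader,
                      peg_balance_sat_at: Callable[[str], int]) -> bool:
    """Compare the RBTC total in the header with the peg address balance
    (in satoshis) at the Bitcoin block the header points to."""
    btc_side_wei = peg_balance_sat_at(header.btc_block_hash) * WEI_PER_SATOSHI
    return header.total_rbtc_wei == btc_side_wei
```

A light client would get `peg_balance_sat_at` from whatever Bitcoin client it already trusts; a stateless client could additionally recompute `total_rbtc_wei` from the downloaded state instead of trusting the header.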

Bonus: add the Cumulative Work in the RSK info as you suggested in an earlier RSKIP + use the pointer I suggested in #2 to revive the syncchain proposal :slight_smile:

I might be entirely wrong, but I would rather be wrong and be corrected than remain wondering why things aren’t done the way that makes more sense to me.

Very thankful for your patience.

Here is another way to know what is the current peg address:

  1. we start from a hardcoded checkpoint in the software about the current peg address.
  2. consensus rule forcing any transactions generated by the bridge contract for the federation to sign, to create only one utxo per peg address (I believe Syncchain suggested some of that).
  3. consensus rule forcing any transactions generated by the bridge contract for the federation to sign, to include an OP_RETURN <Rsk Block hash> <cumulative work> where the <Rsk Block Hash> is the block where the bridge authorizes this transaction in the first place.

Then I don’t have to sync the RSK chain. I can depend only on Bitcoin clients to find the latest transaction spending from the peg address, read the <Rsk Block Hash> and consider that the checkpoint.
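
Reading such a commitment from a Bitcoin output script could look like the sketch below. The encoding is purely my assumption, since no such consensus rule exists: `OP_RETURN`, then a direct push of the 32-byte Rootstock block hash, then a direct push of the cumulative work as a big-endian integer. `parse_checkpoint` is a hypothetical helper name.

```python
from typing import Optional, Tuple

OP_RETURN = 0x6A  # Bitcoin Script opcode for provably unspendable data outputs


def parse_checkpoint(script: bytes) -> Optional[Tuple[bytes, int]]:
    """Parse `OP_RETURN <32-byte hash> <work>` from a raw output script.

    Only direct pushes (length byte 1..75) are handled. Returns
    (rsk_block_hash, cumulative_work), or None if the script does not
    match the assumed layout."""
    if len(script) < 2 or script[0] != OP_RETURN:
        return None
    pos = 1
    items = []
    while pos < len(script):
        length = script[pos]
        if not 1 <= length <= 75:
            return None
        pos += 1
        if pos + length > len(script):
            return None  # truncated push
        items.append(script[pos:pos + length])
        pos += length
    if len(items) != 2 or len(items[0]) != 32:
        return None
    return items[0], int.from_bytes(items[1], "big")
```

The client would run this over the outputs of the latest transaction spending from the peg address and treat the returned hash as its checkpoint.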

I am aware that Rootstock wasn’t designed to rely on Bitcoin that way, and that this is closer to Syncchain, but that was always how I thought sidechains work, and I would still like to see Rootstock embrace that direction.

@SergioDemianLerner I think FlyClient doesn’t contradict my goal of using the pegout as a checkpointing mechanism. For example, if we had GSR in Bitcoin Script, we could keep the peg address stable forever and require any spend from a peg UTXO to include a FlyClient proof that passes a static or dynamic minimum-work threshold. In that case the FlyClient proof is available for anyone to download from Bitcoin itself before connecting to any peers, and it gives them a checkpoint that no future FlyClient proof can revert. That irreversibility is already assumed and de facto true (if the work threshold is high enough), but clients could then also enforce it, in the same way that releases of Electrum wallets ship with checkpoints that can’t be reverted.

If we add FlyClient to Rootstock, we can even use a Handoff Covenant, and post the FlyClient proof with every pegout and wait until miners enforce that modern version of OP_SIDECHAINPROOFVERIFY … it will make pegouts more expensive, but I welcome that as I would like Rootstock to become where most users stay and do what they need and rarely go back to L1, similar to how Drivechains envisioned that dynamic.

Either way, I will study more about FlyClient, and try to contribute to Rskj and RSKIPs and argue for that addition.

If you think that “superchain” is simpler and as likely to work in a modern version of OP_SIDECHAINPROOFVERIFY, and doesn’t suffer from the same issues of NiPoPow that FlyClient was meant to solve, please let me know and point me to any reading material that could help me understand it better.

I dug deeper, and my suggestion for option #4 (a light client in zkVM) seems to already be happening in Union bridge, so that is what I will build on top of, even though I think FlyClient could be much simpler to audit than zkVMs that have a new soundness bug almost every year.

I plan to add a Chain State proof like Raito did for Bitcoin, so I can have payment verification or receipt verification and not just pegout + state.

I think for an offchain client, zkVMs are less risky than they are for pegouts in BitVM, as clients can be updated much easier than BitVM when a bug in the verifier gets discovered.

I am not sure yet about the prover cost, but maybe if I did a good job, the Union Bridge client can be updated to use my work, and then we get proofs whenever someone generates a pegout proof. But either way spending some proving work once a month is already good enough, no need to make realtime proofs or even daily, as long as the size of headers I need to download in an IBD remains bounded.

I would also like to note that I have no idea what the purpose of Superchain is at this point, because the zkVM could generate a succinct STARK proof that makes Superchain irrelevant. Moreover, Superchain (as far as I understand) is a one-level NiPoPow, which doesn’t reduce the size of the proof logarithmically but only by a factor of 20. That is not sufficient, and it is also not clear how Superchain solves NiPoPow’s problem with dynamic difficulty, which was the vulnerability that made FlyClient a replacement for NiPoPow.
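
To put rough numbers on the difference between a constant factor and a logarithm, here is a back-of-the-envelope sketch. The factor of 20 is my understanding from above. The chain length in the note below is a made-up round number, and the logarithmic figure ignores constants and security parameters, so it only shows the order of magnitude.

```python
import math


def one_level_count(n_headers: int, factor: int = 20) -> int:
    """Headers left after a single constant-factor reduction."""
    return math.ceil(n_headers / factor)


def log_count(n_headers: int) -> int:
    """Order of magnitude of a logarithmic-size proof: ceil(log2(n))."""
    return math.ceil(math.log2(n_headers))
```

For a hypothetical chain of 8,000,000 headers, `one_level_count` gives 400,000 while `log_count` gives 23, and the first number keeps growing linearly with the chain.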

I don’t need to learn the answer and I might be able to safely ignore Superchain… but it is also a sign about how opaque development is in Rootstock at the moment, since I can’t find any specification for it, nor an RSKIP.