optimistic-specs: Safely generating L2 blocks based purely on L1 blockdata
Current State of L2s
In general, the nature of a rollup is such that the L2 state can be entirely defined by the L1 blocks. That is, all possible L2s can be expressed by state transitions taking the form `f(old_l2_state, new_l1_block) -> new_l2_state`.
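To make the shape of this function concrete, here is a minimal Python sketch. It is purely illustrative: the toy `f` below just hashes the old state together with the block data, standing in for whatever the real derivation function does.

```python
import hashlib

# Toy model: the L2 state is a 32-byte commitment and each L1 block is
# opaque bytes, so the transition f(old_l2_state, new_l1_block) is a
# pure function of exactly those two inputs and nothing else.

def f(old_l2_state: bytes, new_l1_block: bytes) -> bytes:
    # Illustrative transition: commit to the previous state plus the
    # full L1 block data (a real rollup would execute the block here).
    return hashlib.sha3_256(old_l2_state + new_l1_block).digest()

def derive_l2_state(genesis: bytes, l1_blocks: list) -> bytes:
    # The entire L2 state is a fold of f over the L1 chain; no data
    # outside the L1 blocks is ever consulted.
    state = genesis
    for block in l1_blocks:
        state = f(state, block)
    return state
```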
In reality, most (all?) rollups today do not act on raw blockdata, but instead use an L1 smart contract to store a hash of the calldata sent to a system contract during L1 block execution. This function lived in `OVM_CanonicalTransactionChain` for our v1, for example. Here’s where Arbitrum does it as well.
The Ideal
Our goal of EVM equivalence should make the idea of acting on raw L1 block data more feasible. In reality, a commitment to input calldata is already stored within the L1 blockhash itself, and we could use this as the verification source during the fraud proof. This has been proposed by @lightclient and others. It would give us optimal gas costs, and also set us up for ETH2p1, in which the place that calldata is submitted to will not have any smart contract capabilities.
Problem Statement
The above post proves that it’s feasible to give the L2 virtual machine access to some `get_l1_blockhash(number)` function. However, it does not tell us how we can safely use it. Geohot has proposed one way to use this blockhash in a very generalized form: give the L2 VM a function, effectively a `get_known_preimage(bytes32 hash)`, which allows it to recover the preimage of a hash into memory. The L2 VM could use this to unpack the block body from the block hash, then unpack the transaction root’s children from that, and so on, until it unpacks the relevant transaction data itself.
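A rough Python sketch of that walk, under stated assumptions: `get_known_preimage` and the `PREIMAGES` map are hypothetical (populated off-chain by the prover), and the trie descent is simplified to branch nodes only (real Merkle-Patricia traversal also handles extension/leaf nodes, embedded children, and deriving the key path from the transaction index).

```python
import rlp  # third-party RLP codec, used here only for illustration

PREIMAGES: dict = {}  # hash -> preimage, populated off-chain by the prover

def get_known_preimage(h: bytes) -> bytes:
    # Hypothetical oracle: every step below only ever asks for the
    # preimage of a hash the VM already holds.
    return PREIMAGES[h]

def load_transaction(block_hash: bytes, key_nibbles: bytes) -> bytes:
    header = rlp.decode(get_known_preimage(block_hash))
    tx_root = header[4]  # transactionsRoot field of the block header
    node = get_known_preimage(tx_root)
    # Descend the transactions trie one node at a time, turning each
    # child hash into its preimage via the oracle (branch nodes only).
    for nibble in key_nibbles:
        children = rlp.decode(node)
        node = get_known_preimage(children[nibble])
    return node  # the raw transaction bytes
```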
However, there is a problem: the preimages, even if known off-chain, may be too big to do any verification on-chain. For example, imagine that a block’s gas limit was entirely filled by a single, massive transaction (identified by `txhash`) to a dummy account. It should be clear that there is no way to verify the `get_known_preimage(txhash)` step on-chain, because there’s not enough gas. How should we deal with this?
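Some back-of-the-envelope arithmetic (my own numbers, assuming a 30M block gas limit and post-EIP-2028 calldata pricing) makes the problem concrete:

```python
BLOCK_GAS_LIMIT = 30_000_000
CALLDATA_GAS_PER_NONZERO_BYTE = 16  # EIP-2028 pricing

# A single L1 tx can carry roughly this many nonzero calldata bytes:
max_preimage = BLOCK_GAS_LIMIT // CALLDATA_GAS_PER_NONZERO_BYTE  # ~1.87 MB

# Verifying get_known_preimage(txhash) on-chain means resubmitting those
# bytes as calldata, which alone costs a full block of gas again:
resubmit_cost = max_preimage * CALLDATA_GAS_PER_NONZERO_BYTE  # 30M gas

# On top of that, KECCAK256 costs 30 + 6 gas per 32-byte word, and
# memory expansion adds 3*words + words**2 // 512:
words = max_preimage // 32
hash_cost = 30 + 6 * words + 3 * words + words**2 // 512  # several million gas

print(max_preimage, resubmit_cost, hash_cost)
```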
Solution 1: Upper bound on preimage sizes
The most obvious solution to this is just to enforce a maximum preimage size that can be successfully loaded into memory by the L2 VM. This would mean that `get_known_preimage` returns `null_bytes32` if the real `preimage.length > UPPER_BOUND`.
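A minimal sketch of this bounded oracle, where `UPPER_BOUND`, `NULL_BYTES32`, and the `PREIMAGES` map are all assumed names rather than proposed values:

```python
UPPER_BOUND = 100_000  # bytes; an illustrative limit, not a proposed value
NULL_BYTES32 = b"\x00" * 32
PREIMAGES: dict = {}  # hash -> preimage, known off-chain

def get_known_preimage(h: bytes) -> bytes:
    preimage = PREIMAGES[h]
    # Oversized preimages are unrecoverable by construction: the VM
    # gets a sentinel value instead of the data.
    if len(preimage) > UPPER_BOUND:
        return NULL_BYTES32
    return preimage
```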
To detect this on-chain, an honest party would simply let their challenger run out of time in the dispute game, because it is impossible to prove anything else. This is the easiest solution to implement, but it has the downside that the fraud proofs can no longer validate arbitrary L1 blocks – only those below some size constraint.
Solution 2: Secondary game on preimage chunks
The keccak algorithm itself can be broken down into smaller individual steps which each absorb a fixed-size chunk of the input into a constant-sized state. So another solution is to have the two parties submit the preimage they claim is correct in smaller chunks as calldata. Once all chunks are submitted, they can play a bisection game on the hash substeps to prove that their preimage results in the requested hash. This adds complexity, but does allow us to validate arbitrary L1 blocks in their entirety.
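A sketch of the chunked-absorption view this relies on, assuming keccak-256 parameters (136-byte rate, 200-byte state); `keccak_f` is the real 1600-bit permutation, elided here, and the message padding step is omitted for brevity:

```python
RATE = 136   # bytes absorbed per permutation call for keccak-256
STATE = 200  # total sponge state in bytes

def keccak_f(state: bytes) -> bytes:
    """The Keccak permutation (implementation elided)."""
    raise NotImplementedError

def absorb_steps(preimage: bytes) -> list:
    # Hashing is a chain of constant-size steps; the list of intermediate
    # states is what the two parties bisect over to find the first step
    # at which their claims diverge.
    states = [b"\x00" * STATE]
    for i in range(0, len(preimage), RATE):
        chunk = preimage[i : i + RATE].ljust(STATE, b"\x00")
        xored = bytes(a ^ b for a, b in zip(states[-1], chunk))
        states.append(keccak_f(xored))
    return states
```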
About this issue
- Original URL
- State: closed
- Created 3 years ago
- Reactions: 4
- Comments: 21 (14 by maintainers)
It’s less about the feature and more about protocol complexity. We have the choice of highly encapsulated complexity (implement a pre-image verifier for any size preimage) vs. leaky complexity across multiple parts of our protocol. We would need to add logic in Geth to reject txs over a specific size, make the preimage oracle return `nil` for large preimages, and add a special step in our dispute game to handle resolutions which time out. In our last codebase we suffered death by 1,000 cut corners that bloated protocol complexity beyond belief.
Some other places where I’m afraid of complexity popping up: the preimage oracle returning `nil`. This is a weird property and may be able to be used in tricky ways.
TLDR: I agree that supporting large txs is not a feature that users are clamoring for; however, I think using this seemingly simple timeout approach will actually introduce gnarly complexity. + ofc EVM equivalence.
@karlfloersch this statement isn’t obviously true to me, but perhaps I’m missing something?
I can’t think of many on-chain applications that in practice would depend on being able to receive 30MM gas worth of calldata. Until recently they were all making do with more like 13MM on L1 anyways.
IMO, implementing an on-chain sponge function (as fun as it sounds!) is a major endeavour in itself. If it is essential to the mission of creating an EVM-equivalent ORU then we must do it. But if it results in something like a 20% to 30% increase in maximum TxData size, I’m less certain of its value.
Okay, so I spent some time thinking about this, and I’ve convinced myself it isn’t possible.
The reason is actually pretty interesting. It turns out that what I called `f` (warning: it’s not what the literature calls `f`) is reversible. This is actually great: if you agree on a suffix of all internal states, you can always prove the input that you disagree upon. The problem is that if you don’t agree on the last internal state, you cannot retrieve (or even commit to) that state from the hash, as the hash is essentially a truncation of it (and so finding state collisions is trivial).
Running all the computation on-chain works, but it’s very costly (a full L1 block or more!), so let’s consider the alternatives as well.
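A tiny sketch of those two facts, continuing the absorption sketch from above (`keccak_f` and its inverse `keccak_f_inverse` are both assumed, elided stubs):

```python
def keccak_f(state: bytes) -> bytes: ...          # the permutation (elided)
def keccak_f_inverse(state: bytes) -> bytes: ...  # its inverse (elided)

def recover_chunk(state_before: bytes, state_after: bytes) -> bytes:
    # Reversibility: given two adjacent internal states that both parties
    # agree on, the absorbed input chunk falls out by inverting the
    # permutation and XORing away the earlier state.
    xored = keccak_f_inverse(state_after)
    return bytes(a ^ b for a, b in zip(state_before, xored))[:136]

def digest(final_state: bytes) -> bytes:
    # Truncation: the 32-byte hash keeps only part of the 200-byte final
    # state, so the full final state cannot be recovered from (or
    # uniquely committed to by) the digest alone.
    return final_state[:32]
```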
One other approach is to do a fraud proof over the thing that generated the large pre-image (L1 execution) at the same time, i.e. execute the L1 source block before executing the derived L2 block(s). Then keep the receipt (or whatever the large input is) in the authenticated memory, to avoid using the pre-image oracle for it later.
The pros:
The cons:
Or another alternative is to merkleize receipts in some better way on L1, and see if it can be adopted during the Merge (transactions are already going to be merkleized nicely, and will be accessible through a binary merkle proof via the beacon-block hash-tree-root)
Ah, I forgot that each interaction needs to be disputable. I think in that case my example doesn’t work, because an honest party could dispute my fraudulent fraud proof.
@ben-chain can we write this in a way such that it is non-interactive? IMO that would simplify it conceptually a lot. Non-interactive but expensive. Even though it sounds somewhat nightmarish to do, it would be super isolated & seems relatively easy to test.