HIP-260: Smart Contract Traceability
Author | Danno Ferrin |
---|---|
Discussions-To | https://github.com/hashgraph/hedera-improvement-proposal/discussions/262 |
Status | Replaced ⓘ |
Needs Council Approval | Yes ⓘ |
Review period ends ⓘ | Tue, 21 Dec 2021 07:00:00 +0000 |
Type | Standards Track ⓘ |
Category | Service ⓘ |
Created | 2021-12-02 |
Updated | 2021-12-07, 2021-12-14, 2022-08-15 |
Requires | 206 |
Superseded by | 513 |
Table of Contents
Abstract
Proposes additional data placed in transaction records to enable full replay of transactions by mirror nodes and downstream users of Hedera.
Motivation
Enabling full replay of smart contract service transactions will improve the usability, auditability, and debuggability of smart contract service transactions. Current facilities are insufficient to inspect “internal” transactions executed across contracts, and provide insufficient error messages when correcting errors during smart contract development.
Rationale
Creating a traceable output is always the classic balancing act of time vs. memory. The service node could easily put an entire operation-by-operation trace in the record stream but the storage and data transmission costs would quickly become unmanageable.
Instead, we add the minimum needed data that allows a user to fully replay the transaction and derive the same set of results. We also need to presume that the user does not have access to the full historical state of the chain. However, we do presume some access to select immutable data such as the bytecode of the contracts that are called as part of the transaction.
This will allow for the same level of transaction inspection that popular ethereum chains have on their block explorer sites without requiring speculative calculation of traces from transient execution data.
User stories
As a user of a mirror node I want to be able to see a list of state changes, a full representation of the calls between contracts, and a step by step evaluation of all EVM operations. I want to be able to create these records disconnected from the Hedera Network with only the record streams of the transactions.
Specification
Add ContractStateChange to ContractFunctionResult
Add and populate a new repeated field stateChanges
of
type ContractStateChange
to the ContractFunctionResult
message. This will
be populated whenever a transaction causes contract storage states to change.
This field will include the changes across all contract storage and not just the
initial contract in the transaction.
// existing protobuf
message ContractFunctionResult {
// existing fields
ContractID contractID = 1;
bytes contractCallResult = 2;
string errorMessage = 3;
bytes bloom = 4;
uint64 gasUsed = 5;
repeated ContractLoginfo logInfo = 6;
repeated ContractID createdContractIDs = 7;
// new field
repeated ContractStateChange stateChanges = 8;
}
// new messages
message ContractStateChange {
ContractID contractID = 1;
repeated StorageChange storageChanges = 2;
}
message StorageChange {
bytes slot = 1;
bytes valueRead = 2;
google.protobuf.BytesValue valueWritten = 3;
}
For each slot read or written by the smart contract transaction
a ContactStateChange
record should be added to the transaction record. As
there can be multiple contracts called in one transaction the storage is grouped
by contract.
The slot
, valueRead
, and valueWritten
values are byte strings of up to 32
bytes. Semantically they are uint256
stored in big endian notation and they
will have leading zero bytes stripped. Hence, any value less than 32 bytes is
semantically identical to a value left padded with zeros.
A slot that is only read and not updated MUST omit the valueWritten
field.
Values that are updated without first being read must have their valueRead
field set regardless. This is required as gas calculation is dependent on the
original value of the field updated. A message without a valueRead
field will
be treated as though a value of zero had been read, as that is what will be
encoded.
Canonically the fields are sorted by the contractID
then by slot. Contracts
addressed by account alias (when supported) will sort after accounts addressed
by contractID
. Canonically the slot
, valueRead
, and valueWritten
are
stripped of preceding zero bytes. Because of this zero values will not be
written by the protobuf encoder. Hence, a contract that reads and writes zeros
to slot zero will have a message with only an empty valueWritten
field and a
contract that only reads a zero value from slot zero will have an empty message.
The value in valueRead
reflects the storage value prior to the execution of
the smart contract. The value in valueWritten
, if present, represents the
final updated value of the storage slot after the completion of the smart
contract call. Transient states between the start and finish of the contract are
not stored in the record stream.
Populating contractCallResult
field in Precompiled Contract Records
As an update to HIP-206 smart contracts will need to populate
the contractCallResult
for all transaction records generated by precompiled
contracts recording key facts about the smart contract call. The semantics of
each field will be as follows:
contractID
will be set to the address of the precompile that generated the record.gasUsed
will be set to the gas that the precompiled contract charged for its execution.contractCallResult
will be populated with the return of the call, if any.bloom
will be populated if the precompile produces EVM Logs.logInfo
will be populated if the precompiled contract produces logs.stateChanges
will be populated if the precompiled contract changes state in any contract.createdContractIDs
will not be populated as it is not anticipated any precompiled contracts will create new contracts.
Replaying the Transaction
When replaying the transaction it is presumed the program has access to the smart contracts deployed on the chain or can quickly address them on demand. From this byte code along with precompiled system contract results a replay can be performed.
First, all the read values of the storage changes are loaded into the local world state. The entire state of the contracts involved are not needed and for large contracts is not desirable for efficient execution.
Next the EVM operations are executed against this world state. For calls that go to other contracts the bytecode is provided by the mirror node. For calls that go to precompiled system contracts the results are mocked out by the related record stream.
Finally, the execution can compare the record stream results with its own results. This includes gas used and storage value changes.
A tracer can be attached to the EVM to capture intermediate data for
presentation by block explorers. Three kinds of data are most relevant. State
changes can be extracted from the record stream directly without replay.
Internal Transactions or “flat traces” can be calculated by storing select data
prior to a CALL
, DELEGATECALL
, CALLCODE
, or STATICCALL
operation and
then using that after the call returns to produce the flat trace. A full EVM
operation trace can be extracted in a similar manner.
Clients are encouraged to produce standard trace outputs, such as traces the OpenEthereum Trace Module would produce or that EIP-3155 would produce.
Backwards Compatibility
The added fields can be ignored by clients not wishing to trace or replay smart contract transactions.
Prior transactions lacking these fields cannot be replayed efficiently, and will require historical state reconstruction. This reflects current practice.
Security Implications
All data stored can be recreated by anyone who already has full access to the historical records. This EIP makes the data more efficient and economical to access.
In adversarial conditions it is estimated the record stream can have 30 KB of data added per million gas spent. This does not make smart contracts the most efficient attack surface for record stream bloating.
Because we are not matching external gas pricing for our precompiled system contracts we will be able to adjust the gas price of precompiled contracts if they become an economical way to bloat the record stream size.
How to Teach This
For creating a transaction replay clients should be pointed to the reference implementation.
For end users looking for the benefits of state, flat, and operation traces user help documentation and possibly a technical blog post should be sufficient.
Reference Implementation
None yet. Expected no sooner than 0.22.0
Rejected Ideas
Trace Cross-contract calls in the record stream
The initial proposal included adding the full “flat trace” as a part of the record. Early development showed that in adversarial interactions the record stream would produce about one byte of data per unit of gas spent on the transaction. When targeting millions of gas per second this would result in megabytes of record data generated per second. This load in the record stream is not sustainable for further scaling efforts.
Open Issues
References
- OpenEthereum Trace Module
- EIP-3155 EVM trace specification
Copyright/license
This document is licensed under the Apache License, Version 2.0 – see LICENSE or (https://www.apache.org/licenses/LICENSE-2.0)
Citation
Please cite this document as: