GitHub Logo HIP-639: Errata records

Author Michael Tinker
Working Group Steven Sheehy, Nick Poorman, Neeharika Sompalli, Xin Li
Discussions-To https://github.com/hashgraph/hedera-improvement-proposal/discussions/634
Status Accepted
Needs Council Approval Yes
Review period ends Thu, 19 Jan 2023 07:00:00 +0000
Type Standards Track
Category Service
Created 2022-11-29
Updated 2023-01-19

Abstract

A bug in the consensus nodes can cause them to export bad data in the record stream, so we need a way to issue errata records. There many categories of bugs and resulting bad data, and not all of them can be fixed by simply correcting the record stream. Consider the following examples:

  1. (Missing record) The network handles a transaction correctly, but fails to write its record to the stream.
  2. (Inconsistent record) The network fails while handling a transaction, with no effect on state, but still exports a complete record.
  3. (Incorrect record) The network handles a transaction and exports a consistent record; but the handling itself was wrong.

In this HIP we propose a council-only issueRecordStreamErrata RPC to publish corrections to the record stream, including “compensating” actions that mirror nodes can take to bring their databases back in sync with the consensus node state. This will support “fixing” problems of type (1) and (2) above.

We do NOT propose any way for council accounts to change the consensus node state. So this HIP does not provide any way to fix problems of type (3) above. As an extreme example, if some account 0.0.A is supposed to receive 10B hbar in a transaction, but a bug redirects those funds to account 0.0.B, then it is cold comfort if the record stream consistently reflects the transaction. But preserving such comforts is all we aspire to here.

Motivation

Although it should be extremely rare, it will always remain possible for bugs to cause consensus nodes to export bad data in the record stream. If this happens, we need a precise, highly visible way to issue an errata for mirror nodes. The proposal is that the super-majority of the council approves the errata file proposal, and the super-majority of consensus nodes (by stake) ratify the errata proposal.

Rationale

The two main decision points in the design are:

  1. Whether to support “decorating” the record stream with errata sidecars at the time errors occurred.
  2. How to instruct mirror nodes to “compensate” for the side effects of an error.

Errata sidecars

The benefit of errata sidecars is that if a mirror node syncs from a database snapshot before the error, it can avoid the error entirely, and simply use the correct record based on the sidecar. But mirror node sync times are sufficiently long that there is no practical reason to ever start a sync from any but the latest snapshot. This means there is no practical benefit to decorating the stream with an errata at some time in the distant past, since no syncing node will ever “be there” to appreciate it. (It is also unclear how a syncing node would be able to process signatures on an errata sidecar, since the signatures could come from keys in a future address book that the syncing node knows nothing about.) Therefore we chose not to support errata sidecars.

Compensating actions

There are many techniques we could use to instruct mirror nodes to correct their databases at the consensus time of the RecordStreamErrataTransactionBody. The ideal technique would have the expressiveness of ANSI SQL. But mirror nodes are free to use any form of database and schema they like, so some mirror node developers might face great difficulty in supporting a SQL-like language that assumes a different schema. We chose to use synthetic transactions to define compensating actions because mirror nodes must already support transactions. The weakness of this choice is that we can imagine a bug whose side effects cannot be “fixed” by any sequence of synthetic transactions. If such a bug occurred, we would be forced to add a more general compensating strategy to the RecordStreamErrataTransactionBody.

User stories

  • As a network operator, I want to publish a precise, maximally visible errata to the record stream after a bug in the consensus nodes has caused an error.
  • As a mirror node operator, I want a trustworthy source of record stream errata that will automatically bring my mirror node database back in sync with consensus node state.

Specification

We propose a new TransactionBody type,

message RecordStreamErrataTransactionBody {
    // The (past) consensus time at which an export error occurred
    Timestamp consensus_time_of_error = 1;

    // The data that should have been exported (unset if no data should have 
    // been exported); allows mirror nodes to fix transaction history tables
    RecordStreamItem correct_item = 2;

    // One or more synthetic transactions whose side effects, when incorporated
    // in the mirror node database in the normal way, will bring its present state
    // back in alignment with the consensus node state; these items must NOT
    // be inserted in any transaction history tables
    repeated RecordStreamItem compensating_items = 3;
}

We suggest this transaction type be submitted using a NetworkService RPC named issueRecordStreamErrata; and that it require a council payer account number (either 2 or 50).

Backwards Compatibility

There is no existing errata feature to maintain backward compatibility with.

Security Implications

It would be very bad if an unauthorized account was able to issue record stream errata. Even when the network limits this ability to council, there is a material risk of human error in constructing the RecordStreamErrataTransactionBody. This risk should be acceptable, however, since if operators issue an incorrect errata for time T1 at time T2, they can correct that mistake at T3 by issuing another errata for time T1 whose compensating_items at T3 take into account the effects of the compensating_items applied at T2.

How to Teach This

Reference Implementation

TBD

Rejected Ideas

As noted in the Rationale section, we considered but did not propose:

  1. Inserting an errata side car into the stream at the consensus time of an error.
  2. Defining a general way to specify mirror nodes database changes.

Open Issues

TBD

References

TBD

Copyright/license

This document is licensed under the Apache License, Version 2.0 – see LICENSE or (https://www.apache.org/licenses/LICENSE-2.0)

Citation

Please cite this document as: