GitHub Logo HIP-435: Record Stream V6

Author Stoyan Panayotov
Discussions-To https://github.com/hashgraph/hedera-improvement-proposal/discussions/436
Status Final
Needs Council Approval Yes
Review period ends Tue, 31 May 2022 07:00:00 +0000
Type Standards Track
Category Service
Created 2022-04-13
Updated 2023-02-01
Release v0.28.0

Abstract

Updates the record file definition to be done via a protobuf.

Introduces sidecar records - a generic way for Hedera nodes to externalize additional information about transactions that is not part of TransactionRecords.

Includes BlockNumber changes for the record files.

Motivation

Using a protobuf definition for the record file schema is far preferable to having a custom parser or random specification someplace (currently a word doc). The goal is to be able to externalize verbose information about transaction executions without bloating the main TransactionRecord object with potentially unnecessary information and requiring mirror nodes to always download and process that information.

Rationale

There are use cases requiring a detailed and verbose output for a HAPI transaction execution. Users of Hedera should have the option to query a mirror node for this information. Meanwhile, bloating the TransactionRecords with information that would be unnecessary in most cases is not an ideal option. It is a much better design decision to create sidecar record files that can be downloaded and processed only by interested parties.

Design Goal #1 Mirror nodes should not be required to download and process verbose transaction information. The sidecar transaction records will be externalized in separate record files.

Design Goal #2 One transaction can have 0, 1, or more than 1 sidecar record related to it located in 0, 1, or more record files.

Design Goal #3 It should be possible and relatively easy to add new types of sidecar transaction records.

User stories

As a Hedera user, I want to have the ability to access verbose information about transactions.

As an operator of a mirror node, I want the ability to download and process only the most important record-related information that is externalized as part of the main RecordFile.

As an operator of a mirror node, I want the ability to download sidecar record files containing all the information about records execution that a Hedera node can provide.

Specification

Define new protobuf messages to support creating record files, sidecar record files, and signature files.

Protobuf definitions

Version 6 of the record stream file format is defined via protobuf messages. The types used for files creation are RecordStreamFile, SignatureFile, and SidecarFile. RecordStreamFile and SignatureFile will have their leading 4 bytes reserved for file version. The version field is not part of the protobuf message since protobufs don’t guarantee the order of fields serialisation. Record files and sidecar files will be compressed. The SidecarFiles will be uploaded into a separate folder in the cloud bucket. Doing so will reduce the number of rows Mirror nodes have to search in S3, reduce S3 list response size, and avoid having to filter them out locally. Probably recordstreams/record0.0.3/sidecar.

Record Stream File

/**
 * RecordStreamFile is used to serialize all RecordStreamItems that are part of the
 * same period into record stream files.
 * This structure represents a block in Hedera (HIP-415).
 */
message RecordStreamFile {
  /**
   * Version of HAPI that was used to serialize the file.
   */
  SemanticVersion hapi_proto_version = 1;

  /**
   * Running Hash of all RecordStreamItems before writing this file.
   */
  HashObject start_object_running_hash = 2;

  /**
   * List of all the record stream items from that period.
   */
  repeated RecordStreamItem record_stream_items = 3;

  /**
   * Running Hash of all RecordStreamItems before closing this file.
   */
  HashObject end_object_running_hash = 4;

  /**
   * The block number associated with this period.
   */
  int64 block_number = 5;

  /**
   * List of the hashes of all the sidecar record files created for the same period.
   * Allows multiple sidecar files to be linked to this RecordStreamFile.
   */
  repeated SidecarMetadata sidecars = 6;
}

RecordStreamFile is used to serialize all RecordFileObjects that are part of the same period into record stream files. Periods are intervals of hedera.recordStream.logPeriod seconds used to group RecordFileObjects into record stream files. The block structure is similar to the current structure of record files, block number and sidecars are new fields.

/**
 * A RecordStreamItem consists of a Transaction and a TransactionRecord,
 * which are already defined protobuf messages.
 */
message RecordStreamItem {
  Transaction transaction = 1;
  TransactionRecord record = 2;
}

A RecordStreamItem consists of a Transaction and a TransactionRecord, which are already defined protobuf messages. No changes will be required for these messages.

/**
 * Information about a single sidecar file.
 */
message SidecarMetadata {
  /**
   * The hash of the entire file.
   */
  HashObject hash = 1;

  /**
   * The id of the sidecar record file
   */
  int32 id = 2;

  /**
   * The types of sidecar records that will be included in the file.
   */
  repeated SidecarType types = 3;
}

Information about a single sidecar file: the hash of the entire file, the id that will be apended to the file name, the types of sidecar records that will be included in the file.

/**
 * The type of sidecar records contained in the sidecar record file
 */
enum SidecarType {
  SIDECAR_TYPE_UNKNOWN = 0;
  CONTRACT_STATE_CHANGE = 1;
}

Lists the possible sidecar record types. New types of sidecar records can be added in the future.

Sidecar File

/**
 * A SidecarFile contains a list of TransactionSidecarRecords that are all created
 * in the same period and related to the same RecordStreamFile.
 */
message SidecarFile {

  /**
   * List of sidecar records
   */
  repeated TransactionSidecarRecord sidecar_records = 1;
}

A SidecarFile contains a list of TransactionSidecarRecords that are all created in the same period and related to the same RecordStreamFile.

/**
 * TransactionSidecarRecord is used to create sidecar records complementing
 * TransactionRecord and storing additional information about a transaction's execution.
 */
message TransactionSidecarRecord {
  /**
   * Consensus timestamp will be the same as the consensus timestamp of the
   * transaction the side car is related to. This offers a convenient
   * way to match record to sidecar.
   */
  Timestamp consensus_timestamp = 1;

  /*
  * List of sidecar types.
  * In future there will be other categories.
  */
  oneof sidecar_records {
    ContractStateChanges state_changes = 2;
  }
}

TransactionSidecarRecord is used to create sidecar records complementing TransactionRecord and storing additional information about a transaction’s execution.

Signature File

/**
 * The record signature file which is created for each record stream file
 * that signs the hash of the entire corresponding stream file.
 */
message SignatureFile {

  /**
   * Signature for the file
   */
  SignatureObject file_signature = 1;

  /**
   * Metadata signature
   */
  SignatureObject metadata_signature = 2;
}

The record signature file that is created for each record stream file signs the hash of the entire corresponding stream file. The list of sidecar file hashes is included in the record stream file. This way mirror nodes or any interested party can download the record stream file and all sidecar files and verify that:

  1. repeated SidecarMetadata sidecars is correct;
  2. the signature file signed the correct hash of the entire record stream file;
/**
 * A Signature defined by its type, length, checksum and signature bytes and the hash that is signed
 */
message SignatureObject {
  /**
   * The signature type
   */
  SignatureType type = 1;

  /**
   * Signature length
   */
  int32 length = 2;

  /**
   * Signature checksum
   */
  int32 checksum = 3;

  /**
   * Signature bytes
   */
  bytes signature = 4;

  /**
   * The hash that is signed by this signature
   */
  HashObject hash_object = 5;
}
/**
 * The signature type
 */
enum SignatureType {
  SIGNATURE_TYPE_UNKNOWN = 0;
  SHA_384_WITH_RSA = 1;
}

Hash Object

/**
 * Encapsulates an object hash so that additional hash algorithms
 * can be added in the future without requiring a breaking change.
 */
message HashObject {

  /**
   * Specifies the hashing algorithm
   */
  HashAlgorithm algorithm = 1;

  /**
   * Hash length
   */
  int32 length = 2;

  /**
   * Specifies the result of the hashing operation in bytes
   */
  bytes hash = 3;
}
/**
 * List of hash algorithms
 */
enum HashAlgorithm {
  HASH_ALGORITHM_UNKNOWN = 0;
  SHA_384 = 1;
}

Record File Names

A record file name is created using a string representation of the Instant of consensus timestamp of the first transaction in the file using ISO-8601 representation, with colons converted to underscores for Windows compatibility.

Record files and sidecar record files will have the .rcd extension.

The nano-of-second always outputs nine digits with padding when necessary, to ensure same length filenames and proper sorting.

Examples of a record stream file name with corresponding signature file and sidecar record files:

Example:

Record File: 2022-10-19T21_35_39.000000000Z.rcd.gz

Sidecar Record File 1: 2022-10-19T21_35_39.000000000Z_01.rcd.gz

Sidecar Record File 2: 2022-10-19T21_35_39.000000000Z_02.rcd.gz

Record Signature File: 2022-10-19T21_35_39.000000000Z.rcd_sig

Record Stream Flow

  1. Transactions are executed sequentially on the Hedera node.
  2. A TransactionRecord is created for each Transaction execution.
  3. TransactionSidecarRecords are also created when appropriate.
  4. RecordStreamItems are created grouping Transactions to their corresponding TransactionRecords.
  5. When a period ends, all RecordStreamItems from that period are added to a RecordStreamFile.
  6. Zero, one or more SidecarFiles are created, containing all the TransactionSidecarRecords from that period.
  7. A SignatureFile is also created by the node, signing the RecordStreamFile which also contains information about the SidecarFiles created during the period.

Backwards Compatibility

The new v6 record file is not backward compatible with v5 and clients will need to be updated to support both. Adding support for both concurrently will allow them to ingest historical information and it will allow clients to roll out v6 support ahead of the cutover date.

The added files can be ignored by clients not wishing to trace or replay smart contract transactions.

Sidecar records for prior transactions will not be created and the verbose information related to them will not be available. This reflects current practice.

Security Implications

The new Record Stream format doesn’t provide any risk in itself. Care should be taken about what kind of information is externalized by each added type of sidecar record file.

How to Teach This

Reference Implementation

Rejected Ideas

Enhance V5 records with extra fields

This idea was rejected in order to move to the protobufs format which is more standardized and has native support and GRPC compilers for different programming languages.

Open Issues

References

Copyright/license

This document is licensed under the Apache License, Version 2.0 – see LICENSE or (https://www.apache.org/licenses/LICENSE-2.0)

Citation

Please cite this document as: