FastPQ
FastPQ is Iroha's STARK proof path for selected execution effects. It does not replace normal transaction execution or consensus. Transactions still run through ISI, IVM, and Sumeragi as usual; FastPQ consumes the deterministic execution witness and turns supported effects into proof batches.
The current host integration has three main paths:
- transparent numeric asset transfers recorded during block execution
- Nexus verified lane relays whose AXT proof envelope carries a FastPQ binding
- SCCP transparent message proof helpers that wrap a FastPQ proof in an open-verification envelope
Transfer Witness Path
Transparent numeric transfers create a structured transfer transcript when the instruction mutates balances. The transcript records:
- the source account, destination account, asset definition, and amount
- sender and receiver balances before and after the transfer
- the transaction entrypoint hash used as the batch hash
- an authority digest derived from the submitting account
- a Poseidon digest for single-delta transcripts
Batch transfers use one transcript with multiple deltas. In that case the single-delta Poseidon digest is omitted until per-delta digests are available.
At block finalization, Iroha groups these transcripts by entrypoint hash. The execution witness then carries both the original transcript bundles and the FastPQ transition batches prepared for the prover.
Each transfer delta becomes two transition rows:
| Row | Key shape | Pre-value | Post-value |
|---|---|---|---|
| Sender debit | asset/<asset-definition>/<source-account> | sender balance before | sender balance after |
| Receiver credit | asset/<asset-definition>/<destination-account> | receiver balance before | receiver balance after |
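The table above can be illustrated with a small sketch that expands one transfer delta into its two rows. The `TransitionRow` type and the `transfer_rows` helper are hypothetical names used only for illustration; only the key shape and the pre/post values come from the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransitionRow:
    # Hypothetical row type: key plus balance before and after.
    key: str
    pre: int
    post: int


def transfer_rows(asset_definition, source, destination,
                  sender_before, receiver_before, amount):
    """Expand one transfer delta into a sender debit and a receiver credit."""
    debit = TransitionRow(
        key=f"asset/{asset_definition}/{source}",
        pre=sender_before,
        post=sender_before - amount,
    )
    credit = TransitionRow(
        key=f"asset/{asset_definition}/{destination}",
        pre=receiver_before,
        post=receiver_before + amount,
    )
    return [debit, credit]
```

The account and asset strings passed in are placeholders; real keys use Iroha's own identifier encoding.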
Numeric values are normalized into integer witness units. A value is rejected for FastPQ batching if it cannot be represented as a non-negative u64 at the selected decimal scale.
Public Inputs
Every FastPQ transition batch carries public inputs that bind the proof to the block and execution context:
| Input | Meaning |
|---|---|
| dsid | Dataspace identifier encoded as little-endian bytes |
| slot | Block creation time converted to nanoseconds |
| old_root | Parent state root derived from the execution witness |
| new_root | Post-state root derived from the execution witness |
| perm_root | Poseidon commitment over active role permissions |
| tx_set_hash | Hash over sorted transaction and time-trigger entrypoint hashes |
The host uses fastpq-lane-balanced as the canonical parameter set for these batches.
Mathematical Model
This section describes the arithmetic implemented by the current Rust prover and verifier. All field operations below are over the Goldilocks prime field F = F_p, with p = 2^64 - 2^32 + 1.
FastPQ uses Poseidon2 over F for field commitments. The sponge has width t = 3, rate r = 2, and capacity 1. The hash absorbs field elements in rate-2 blocks and appends a single field element 1 before the final permutation:
Byte strings are packed into 7-byte little-endian limbs so every limb is strictly below p:
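A minimal sketch of the limb packing follows. It assumes that a short final chunk is read as-is (equivalent to zero-extension) and that no length prefix is added; the source does not state how the tail is handled, and `pack_bytes` is an illustrative name.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime
LIMB_BYTES = 7


def pack_bytes(data):
    """Split data into 7-byte little-endian limbs.

    Each limb is below 2**56, which is strictly below P, so every limb
    is already a canonical field element.
    """
    limbs = []
    for offset in range(0, len(data), LIMB_BYTES):
        chunk = data[offset:offset + LIMB_BYTES]
        limbs.append(int.from_bytes(chunk, "little"))
    return limbs
```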
Domain-separated field hashes are represented as:
For hashes that start from byte-domain digests, FastPQ maps the first eight little-endian bytes into the field:
Here Hash means Iroha's iroha_crypto::Hash::new, a 32-byte Blake2bVar digest, unless a formula explicitly names Poseidon2 or SHA-256.
Field Arithmetic
The Rust code represents field elements as canonical u64 values in [0,p). Addition and subtraction are:
Multiplication first computes the 128-bit product:
Goldilocks reduction then uses the identity:
If:
then the reducer computes:
The implementation conditionally adds or subtracts p until the result is canonical. Signed integers, such as balance deltas, are embedded by:
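The field operations in this section can be sketched in Python, where arbitrary-precision integers stand in for the 128-bit product. The split of the product into a low 64-bit word and two 32-bit high halves is one common way to apply 2^64 ≡ 2^32 - 1 (mod p) and is an assumption here, not a transcription of the Rust reducer. The signed embedding is assumed to be the usual x mod p.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime
MASK64 = 2**64 - 1
MASK32 = 2**32 - 1


def add(a, b):
    s = a + b
    return s - P if s >= P else s


def sub(a, b):
    return a - b if a >= b else a + P - b


def reduce128(z):
    """Reduce a product below 2**128 using 2**64 = 2**32 - 1 and 2**96 = -1 (mod P)."""
    lo = z & MASK64
    hi = z >> 64
    hi_lo = hi & MASK32
    hi_hi = hi >> 32
    r = lo + hi_lo * MASK32 - hi_hi
    # Conditionally add or subtract P until the result is canonical.
    while r < 0:
        r += P
    while r >= P:
        r -= P
    return r


def mul(a, b):
    return reduce128(a * b)


def embed_signed(x):
    """Assumed embedding of a signed integer: negative x maps to P - |x|."""
    return x % P
```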
Poseidon2 Permutation
The Poseidon2 permutation state is:
Its S-box is:
FastPQ uses four full rounds, fifty-seven partial rounds, then four more full rounds. A full round with round constants c_r = (c_{r,0}, c_{r,1}, c_{r,2}) is:
A partial round is:
All additions and multiplications are in F. The canonical MDS matrix is:
The field hash starts from zero state. For every complete rate-2 block (u,v):
The final block appends the 1 padding element before one last permutation. The output is x_0.
Public Input Binding
The host encodes a dataspace id by writing its u64 value into the first eight little-endian bytes of the 16-byte field:
The block creation time is converted from milliseconds to nanoseconds:
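The two encodings described here are simple enough to sketch directly. The helper names are illustrative, and the sketch assumes the upper eight bytes of the dataspace field are left as zero, which the source implies but does not state outright.

```python
def encode_dsid(dataspace_id):
    """Write a u64 dataspace id into the first 8 bytes of a 16-byte field."""
    if not 0 <= dataspace_id < 2**64:
        raise ValueError("dataspace id must fit in u64")
    # Assumption: the remaining 8 bytes stay zero.
    return dataspace_id.to_bytes(8, "little") + bytes(8)


def slot_from_millis(creation_time_ms):
    """Convert a block creation time from milliseconds to nanoseconds."""
    return creation_time_ms * 1_000_000
```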
The transaction-set hash is a byte-domain hash over the sorted entrypoint hashes:
where h_i are sorted transaction and time-trigger entrypoint hashes. In the proof public IO, if perm_root or tx_set_hash is all zero, the prover fills fallback values:
Numeric Normalization
For each transfer delta, the target decimal scale is the maximum trimmed scale across the amount and both balance snapshots:
A Numeric value with mantissa m and scale q is accepted only when m >= 0 and q <= s. Its FastPQ witness value is:
The normalized result must fit in u64.
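The normalization rule can be sketched as follows. The sketch assumes each value is trimmed of trailing decimal zeros before it is rescaled, and it represents a Numeric as a plain `(mantissa, scale)` tuple; both choices are illustrative rather than taken from the Rust code.

```python
U64_MAX = 2**64 - 1


def trim(mantissa, scale):
    """Drop trailing decimal zeros: (150, 2) -> (15, 1)."""
    while scale > 0 and mantissa % 10 == 0:
        mantissa //= 10
        scale -= 1
    return mantissa, scale


def normalize(values):
    """Rescale (mantissa, scale) pairs to a shared scale s.

    Returns (s, witness_values) or raises ValueError when a value is
    negative or does not fit in u64 after scaling.
    """
    trimmed = [trim(m, q) for m, q in values]
    s = max(q for _, q in trimmed)
    witness = []
    for m, q in trimmed:
        if m < 0:
            raise ValueError("negative mantissa")
        w = m * 10 ** (s - q)
        if w > U64_MAX:
            raise ValueError("value does not fit in u64")
        witness.append(w)
    return s, witness
```

For example, an amount of 1.50 with balances 10 and 0.025 shares scale 3 and becomes 1500, 10000, and 25.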
Canonical Ordering
Before trace construction, the batch is sorted by transition key, operation rank, and original insertion index:
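A sketch of the three-part sort key is shown below. The numeric operation ranks and the dictionary layout are hypothetical, and keys are assumed to compare as raw bytes; only the ordering criteria (key, then operation rank, then insertion index) come from the text.

```python
# Hypothetical ranks; the real rank values are defined by the prover.
OP_RANK = {
    "transfer": 0,
    "mint": 1,
    "burn": 2,
    "role_grant": 3,
    "role_revoke": 4,
    "meta_set": 5,
}


def canonical_order(transitions):
    """Sort by (transition key, operation rank, original insertion index)."""
    indexed = list(enumerate(transitions))
    indexed.sort(key=lambda item: (item[1]["key"], OP_RANK[item[1]["op"]], item[0]))
    return [transition for _, transition in indexed]
```

Including the insertion index in the key makes the order fully deterministic even if the sort implementation were not stable.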
The ordering commitment is a Poseidon2 field hash over the domain fastpq:v1:ordering and the Norito encoding of the sorted transitions:
where P is 7-byte packing, E is Norito encoding, D_o is fastpq:v1:ordering, and T* is the sorted transition list.
Transfer Equations
For a transfer amount a, sender balance f, and receiver balance t, FastPQ validates the normalized witness values before building the trace:
The transition rows then encode:
Inside the trace, signed deltas are reduced into F:
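The checks and the signed-delta reduction can be sketched together. The function name is illustrative, and the sketch assumes the sender delta is -a and the receiver delta is +a, each reduced with the natural embedding x mod p.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime


def check_transfer(amount, sender_before, sender_after,
                   receiver_before, receiver_after):
    """Validate one normalized transfer and return its field deltas."""
    if amount > sender_before:
        raise ValueError("sender balance would underflow")
    if sender_after != sender_before - amount:
        raise ValueError("sender_after mismatch")
    if receiver_after != receiver_before + amount:
        raise ValueError("receiver_after mismatch")
    sender_delta = (-amount) % P   # negative delta wraps to P - amount
    receiver_delta = amount % P
    return sender_delta, receiver_delta
```

Under these assumptions the two deltas of a transfer cancel in F, which is consistent with the later statement that only mint and burn rows update the supply counter.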
The optional single-delta transfer digest commits the encoded transfer preimage:
For multi-delta transfer transcripts, the current code requires this top-level digest to be absent until per-delta digest plumbing is available.
The host authority digest for transfer transcripts is:
Trace Rows
Let the sorted transition list contain n real rows. The trace length is the next power of two:
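A small sketch of the padding rule, assuming n is at least 1 and that there is no larger minimum trace length; the source does not state how an empty batch is handled.

```python
def trace_length(n):
    """Smallest power of two N with N >= n, for n >= 1."""
    if n < 1:
        raise ValueError("expected at least one real row")
    return 1 << (n - 1).bit_length()


def padding_rows(n):
    """Indices n..N-1 that the trace fills with padding rows."""
    return range(n, trace_length(n))
```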
Rows 0..n-1 are active; rows n..N-1 are padding rows. Each real row has one operation selector set:
All selector columns are Boolean:
Permission lookup rows are exactly role grant and role revoke rows:
For numeric operation rows:
The builder also tracks running per-asset deltas:
Only mint and burn rows update the supply counter:
Metadata and dataspace trace columns are field hashes derived before row materialization:
The metadata hash, dataspace hash, and slot are stable across adjacent trace rows:
Transfer Merkle Columns
Transfer rows carry a 32-level sparse Merkle path. If a host proof is missing, the prover synthesizes a deterministic path from the row key, pre-balance, and whether the row is the sender or receiver side.
For synthetic paths, the flavor salt is fastpq:smt:from for sender rows and fastpq:smt:to for receiver rows:
The synthetic leaf and internal nodes are:
The trace records the bit b_l, sibling s_l, input node x_l, and output node x_{l+1} at every level. With the code's branch convention:
Permission Hashes
Role grant and revoke rows hash the permission witness:
The host permission table root sorts entries by role bytes, permission bytes, and epoch bytes, then builds a Poseidon2 Merkle tree:
Odd-width levels duplicate the final element.
Trace Commitment
For each trace column c, FastPQ first interpolates the column values over the trace domain and hashes the coefficient vector:
The trace root is a Poseidon2 Merkle root over column commitments:
The final trace commitment is a byte hash over the domain, parameter set, trace shape, column digests, and trace root:
where D_c is fastpq:v1:trace_commitment.
AIR Composition
The V1 AIR composition value is a linear combination of row-local residues. The transcript samples two challenges:
For each adjacent row pair (i,i+1), the prover computes:
The residues rho are, in code order:
For rows with numeric columns:
And for stable batch context columns:
The verifier recomputes A_i for sampled row openings and checks it against the composition value committed under the AIR composition Merkle root.
Lookup Product
The permission lookup accumulator uses the Fiat-Shamir challenge gamma. Over the low-degree extension evaluations of s_perm and perm_hash, the running product is:
The proof records:
Low-Degree Extension
Let omega_T be the trace-domain generator, omega_E the evaluation-domain generator, and g the configured coset offset. For a trace column with values v_i, interpolation produces coefficients a_j such that:
The low-degree extension evaluates the same polynomial on the coset:
The implementation computes this by multiplying coefficients by powers of the coset offset before FFT:
and then evaluating a' on the evaluation domain.
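The coefficient-shift trick can be checked with a naive quadratic-time sketch. The roots of unity here are derived from 7, a quadratic non-residue modulo the Goldilocks prime, so 7^((p-1)/n) has exact order n for power-of-two n. The real prover uses the catalogue roots and coset offset listed under Parameter Sets, so the constants below are assumptions for illustration only.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime
NON_RESIDUE = 7        # assumption: used only to derive demo roots of unity


def root_of_unity(n):
    """Primitive n-th root of unity for a power-of-two n."""
    assert n > 0 and n & (n - 1) == 0
    return pow(NON_RESIDUE, (P - 1) // n, P)


def interpolate(values):
    """Naive inverse DFT: coefficients a_j with f(omega_T**i) == values[i]."""
    n = len(values)
    w_inv = pow(root_of_unity(n), P - 2, P)
    n_inv = pow(n, P - 2, P)
    return [
        n_inv * sum(v * pow(w_inv, i * j, P) for i, v in enumerate(values)) % P
        for j in range(n)
    ]


def evaluate(coeffs, size):
    """Naive DFT of zero-padded coefficients over a domain of `size` points."""
    w = root_of_unity(size)
    return [
        sum(c * pow(w, j * k, P) for j, c in enumerate(coeffs)) % P
        for k in range(size)
    ]


def coset_lde(values, blowup, g):
    """Evaluate the interpolant on the coset g * <omega_E>."""
    coeffs = interpolate(values)
    shifted = [c * pow(g, j, P) % P for j, c in enumerate(coeffs)]  # a'_j = a_j * g**j
    return evaluate(shifted, len(values) * blowup)
```

With g = 1 the extension contains the original column at every `blowup`-th index, which is a convenient sanity check.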
The CPU FFT is an iterative radix-2 Cooley-Tukey transform over bit-reversed inputs. At stage length L, half length H=L/2, and stage root:
each butterfly computes:
The inverse FFT runs the same transform with omega^{-1} and scales by the inverse domain size:
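The transform described above can be sketched as a textbook iterative radix-2 FFT. The stage root is taken as omega^(n/L), the standard choice for this layout, and the inverse scales by n^{-1}; both are assumptions in place of the formulas used by the Rust code. The demo root of unity is derived from 7 rather than from the catalogue.

```python
P = 2**64 - 2**32 + 1  # Goldilocks prime


def root_of_unity(n):
    """Demo primitive n-th root of unity (assumption: not the catalogue root)."""
    return pow(7, (P - 1) // n, P)


def bit_reverse(i, bits):
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r


def fft(values, omega):
    """Iterative Cooley-Tukey transform; returns X_k = sum_j x_j * omega**(j*k)."""
    n = len(values)
    bits = n.bit_length() - 1
    a = [0] * n
    for i, x in enumerate(values):
        a[bit_reverse(i, bits)] = x % P
    length = 2
    while length <= n:
        half = length // 2
        stage_root = pow(omega, n // length, P)
        for start in range(0, n, length):
            w = 1
            for j in range(half):
                u = a[start + j]
                v = a[start + j + half] * w % P
                a[start + j] = (u + v) % P          # butterfly sum
                a[start + j + half] = (u - v) % P   # butterfly difference
                w = w * stage_root % P
        length *= 2
    return a


def ifft(values, omega):
    """Same transform with omega**-1, scaled by the inverse domain size."""
    n_inv = pow(len(values), P - 2, P)
    return [x * n_inv % P for x in fft(values, pow(omega, P - 2, P))]
```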
Catalogue roots are validated before use:
For smaller domains derived from the catalogue root, the generator is:
Row and Leaf Hashes
After LDE, FastPQ hashes each row across all LDE columns. For m columns:
If row hashes are still on the trace domain rather than the evaluation domain, the prover interpolates and extends that single row-hash column with the same coset LDE process.
Merkle Openings
LDE values are grouped into chunks of:
Each chunk leaf is:
Merkle parents are:
Odd levels duplicate the last node. Query paths verify by hashing left or right according to the query leaf index parity at each level.
For a leaf at index i, a path (s_0,\ldots,s_{d-1}) verifies against root R by the recurrence:
The check passes only when:
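The path recurrence can be sketched with a stand-in hash. FastPQ's actual leaf and parent hashes are defined by the formulas in this section; the Blake2b call below is only a placeholder so the example runs, and the function names are illustrative.

```python
import hashlib


def node_hash(left, right):
    # Placeholder parent hash (assumption); not FastPQ's real node hash.
    return hashlib.blake2b(left + right, digest_size=32).digest()


def merkle_levels(leaves):
    """Build all levels bottom-up; odd levels duplicate the last node."""
    levels = [list(leaves)]
    while len(levels[-1]) > 1:
        current = levels[-1]
        if len(current) % 2:
            current.append(current[-1])
        levels.append([
            node_hash(current[i], current[i + 1])
            for i in range(0, len(current), 2)
        ])
    return levels


def open_path(levels, index):
    """Collect the sibling at each level for the leaf at `index`."""
    path = []
    for level in levels[:-1]:
        path.append(level[index ^ 1])
        index >>= 1
    return path


def verify_path(leaf, index, path, root):
    """Hash left or right according to the index parity at each level."""
    node = leaf
    for sibling in path:
        node = node_hash(sibling, node) if index & 1 else node_hash(node, sibling)
        index >>= 1
    return node == root
```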
AIR trace row leaves are:
AIR composition leaves are:
The LDE query opening also checks that the value opened at evaluation index i is present in its authenticated chunk:
FRI Folding
FRI commits to AIR composition evaluations. For each round l, the transcript samples a challenge beta_l. The layer is padded to a multiple of the arity by repeating the last value. Each arity-sized group folds to:
where a is the FRI arity. The verifier checks, for every sampled query chain, that:
and authenticates each opened FRI group against the corresponding FRI layer root.
Fiat-Shamir Transcript
The canonical parameter catalogue labels the transcript hash as SHA3-256. The current prover and verifier implementation derives challenge bytes with iroha_crypto::Hash::new, which is a 32-byte Blake2bVar digest, then reduces the first eight little-endian bytes into F:
Challenge calls append the full digest to the transcript state. The replay order is:
- public IO, protocol version, parameter version, and parameter name
- LDE root and trace root
- gamma
- AIR composition challenges alpha_0,alpha_1
- AIR trace root and AIR composition root
- lookup grand product
- FRI layer roots and beta_l challenges
- sampled query indices
Query sampling keeps drawing 32-byte challenge digests and reading them as little-endian u64 chunks until it has the requested number of unique indices:
The sampled set is returned in sorted order.
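The sampling loop can be sketched as follows. Python's `hashlib.blake2b` with a 32-byte digest stands in for iroha_crypto::Hash::new, and reducing each u64 chunk modulo the domain size is an assumption, since the exact index derivation is not reproduced here. The seed bytes are arbitrary.

```python
import hashlib


def sample_queries(transcript_state, count, domain_size):
    """Draw unique query indices from successive 32-byte challenge digests."""
    if count > domain_size:
        raise ValueError("cannot sample more unique indices than the domain holds")
    state = bytes(transcript_state)
    picked = set()
    while len(picked) < count:
        digest = hashlib.blake2b(state, digest_size=32).digest()
        state += digest  # each challenge call appends the full digest to the state
        for offset in range(0, 32, 8):
            chunk = int.from_bytes(digest[offset:offset + 8], "little")
            picked.add(chunk % domain_size)  # assumption: reduce modulo domain size
            if len(picked) == count:
                break
    return sorted(picked)
```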
Verifier Replay
The verifier first recomputes the batch commitment:
and requires:
It also rebuilds public IO:
Every field must match the proof's public IO byte-for-byte. The verifier then reconstructs the same transcript and derives the same:
For each sampled query q, it checks:
and:
The AIR composition opening must authenticate under R_air_composition. The FRI chain then starts from the same A_q and must end in an authenticated final FRI leaf under the terminal FRI root.
What The Prover Checks
Before building the trace, the FastPQ prover canonicalizes the batch order by transition key, operation rank, and insertion order. Transfer rows also require transcript metadata. A batch with transfer rows but no transfer transcripts is invalid.
For transfer transcripts, the prover-side checks include:
- the sender balance must not underflow
- sender_after must equal sender_before - amount
- receiver_after must equal receiver_before + amount
- the transcript must cover every transfer row in the batch
- a single-delta Poseidon digest, when present, must match the transcript preimage
- provided sparse-Merkle proofs must decode as version 1; missing paths are filled with deterministic synthetic proofs
The trace contains selector columns for transfer, mint, burn, role grant, role revoke, metadata set, and permission lookup rows. Numeric operation rows also carry signed deltas, running per-asset deltas, and supply counters.
Prover Lane
irohad starts the FastPQ prover lane at startup if the prover backend can be initialized. The lane is a background task with a bounded queue. After a block produces an execution witness, the commit path submits a prover job containing the block hash, height, view, and witness.
If the lane is not running or the queue is full, the job is skipped and normal block processing continues. This means the background prover lane is not a transaction admission or consensus gate. It is a proof-production path over state that has already been executed.
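The skip-on-full behaviour can be modelled with a bounded queue and a non-blocking submit. The `ProverLane` class below is a hypothetical model for illustration, not the irohad implementation.

```python
import queue


class ProverLane:
    """Hypothetical model of a background lane with a bounded job queue."""

    def __init__(self, capacity):
        self.jobs = queue.Queue(maxsize=capacity)
        self.running = True

    def submit(self, job):
        """Return True if the job was queued, False if it was skipped."""
        if not self.running:
            return False
        try:
            self.jobs.put_nowait(job)  # never blocks the commit path
        except queue.Full:
            return False
        return True
```

The caller ignores a `False` result and continues block processing, which is why the lane cannot act as an admission or consensus gate.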
The lane constructs a prover with:
parameter = "fastpq-lane-balanced"
execution_mode = auto | cpu | gpu
poseidon_mode = auto | cpu | gpu
auto lets the prover choose the available backend. cpu pins execution to the CPU. gpu prefers GPU execution, with CPU fallback where the backend cannot use the requested kernels.
Verification
FastPQ proof verification rebuilds the canonical batch commitment and replays the public transcript. The verifier checks the protocol version, parameter-set version, replay limits, trace commitment, public inputs, sampled Merkle openings, AIR openings, and FRI query chain.
Default replay limits include:
| Limit | Default |
|---|---|
| Transition rows | 256 |
| Batch payload size | 256 KiB |
| FRI layers | 16 |
| Query openings | 128 |
Nexus Verified Relays
Nexus AXT proof envelopes can embed an AxtFastpqBinding. When RegisterVerifiedLaneRelay executes, Iroha:
- verifies the lane relay envelope and FastPQ proof material
- checks the dataspace and manifest root
- decodes the AXT proof envelope
- requires a fastpq_binding
- rebuilds the FastPQ batch from that binding
- decodes the embedded FastPQ proof
- calls the FastPQ verifier on the rebuilt batch and proof
If verification succeeds, Iroha stores a VerifiedLaneRelayRecord containing the relay reference, original envelope, proof payload hash, verification height, manifest root, and FastPQ binding.
Lane relay envelopes also carry compact FastPQ proof material. The material is a digest over the lane id, dataspace id, block height, verification height, block header hash, settlement hash, and manifest root. A relay is merge admissible only when it has both a QC and valid FastPQ proof material.
AXT Binding Math
For Nexus AXT envelopes, AxtFastpqBinding is canonicalized before proof replay. Empty parameter values default to fastpq-lane-balanced; empty verifier id and version default to fastpq and v1; claim type is trimmed and lowercased.
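The canonicalization defaults can be sketched over a plain dictionary. The field names (`parameter`, `verifier_id`, `verifier_version`, `claim_type`) are hypothetical stand-ins for the AxtFastpqBinding fields; only the default values and the trim-and-lowercase rule come from the text.

```python
def canonicalize_binding(binding):
    """Return a copy of the binding with defaults and claim type normalized."""
    canonical = dict(binding)
    # Empty or missing values fall back to the documented defaults.
    canonical["parameter"] = binding.get("parameter") or "fastpq-lane-balanced"
    canonical["verifier_id"] = binding.get("verifier_id") or "fastpq"
    canonical["verifier_version"] = binding.get("verifier_version") or "v1"
    canonical["claim_type"] = binding.get("claim_type", "").strip().lower()
    return canonical
```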
The AXT FastPQ public inputs are deterministic byte hashes:
AXT transition keys are:
The authorization claim inserts a role-grant row:
and a metadata row binding the authorization policy. The compliance claim inserts two metadata rows: one for policy and one for target dataspaces.
For tx_predicate and value_conservation, an explicit effect amount is used when the binding contains a positive source or destination amount. Otherwise the code derives a bounded deterministic amount:
Then the same transfer equations are used:
The synthetic sender and receiver account ids are generated from key seeds:
The transfer batch hash is:
The AXT batch manifest digest is SHA-256 over the Norito encoding of the canonical binding:
SCCP Transparent Message Proofs
The SCCP helper crate also uses FastPQ for transparent cross-chain message proofs. This path is separate from the irohad background prover lane. It builds a FastPQ batch directly from an SCCP message proof bundle and manifest, then wraps the resulting proof for open verification.
The SCCP batch uses fastpq-lane-balanced and three metadata transitions:
| Key | Operation |
|---|---|
| sccp:transparent:v1:statement | MetaSet |
| sccp:transparent:v1:context | MetaSet |
| sccp:transparent:v1:payload | MetaSet |
Its public inputs are derived from the SCCP transparent inner proof:
| FastPQ input | SCCP source |
|---|---|
| dsid | First 16 bytes of a Blake2b digest over the statement hash |
| slot | Finality height |
| old_root | Payload hash |
| new_root | Commitment root |
| perm_root | Finality block hash |
| tx_set_hash | Statement hash |
The SCCP canonical encoders write integers little-endian and encode variable-length byte arrays as:
The transparent public input byte string is:
The transparent statement bytes are the concatenation of version, chain family, local and counterparty domains, security model, anchor governance, account codec, finality model, verifier target, verifier backend family, length-prefixed chain/backend/manifest fields, destination binding hash, account codec key, payload kind, public input bytes, and payload hash. The statement hash is:
The FastPQ dataspace id for this proof path is the first sixteen bytes of another prefixed Blake2b digest:
The SCCP FastPQ batch is exactly:
then sorted by the same FastPQ ordering rule.
The OpenVerify verifier commitment is SHA-256 over the SCCP message backend name and the canonical FastPQ verifier descriptor:
The raw FastPQ proof is Norito-encoded into a StarkFriOpenProofV1, then wrapped in an OpenVerifyEnvelope with backend Stark. SCCP verification rebuilds the same FastPQ batch from the bundle and manifest, checks the open verification envelope metadata, and calls the FastPQ verifier on the rebuilt batch and proof.
Parameter Sets
The canonical parameter catalogue exposes two parameter sets. The host prover lane currently uses fastpq-lane-balanced.
| Parameter | Purpose | Field | Hashes | FRI |
|---|---|---|---|---|
| fastpq-lane-balanced | balanced prover throughput | Goldilocks quadratic extension | Poseidon2 commitments, catalogue SHA3 label | arity 8, blowup 8, 46 queries |
| fastpq-lane-latency | latency-sensitive lanes | Goldilocks quadratic extension | Poseidon2 commitments, catalogue SHA3 label | arity 16, blowup 16, 34 queries |
Both target 128-bit security and use a trace domain size of 2^16. The Rust V1 transcript replay code currently derives Fiat-Shamir challenge bytes with iroha_crypto::Hash::new rather than directly invoking SHA3-256.
The exact catalogue constants used by the Rust prover are:
| Constant | fastpq-lane-balanced | fastpq-lane-latency |
|---|---|---|
| target_security | 128 | 128 |
| grinding_bits | 23 | 21 |
| trace_log_size | 16 | 16 |
| trace_root | 0x002a247f81c6f850 | 0x6a9f4eb38fb9b892 |
| lde_log_size | 19 | 20 |
| lde_root | 0x60263388dbbf9b2a | 0x9c9c3a571b6f89ac |
| permutation_size | 65,536 | 65,536 |
| lookup_log_size | 19 | 20 |
| omega_coset | 0x6af325e825ad5c18 | 0x3a5fd4171e3c3a4d |
| fri_arity | 8 | 16 |
| fri_blowup | 8 | 16 |
| fri_max_reductions | 8 | 6 |
| fri_queries | 46 | 34 |
Configuration
FastPQ configuration is nested under zk.fastpq.
[zk.fastpq]
execution_mode = "auto"
poseidon_mode = "auto"
# Optional telemetry labels.
device_class = "apple-m4"
chip_family = "m4"
gpu_kind = "integrated"
# Optional Metal backend tuning.
metal_queue_fanout = 3
metal_queue_column_threshold = 24
metal_max_in_flight = 5
metal_threadgroup_width = 128
metal_trace = false
metal_debug_enum = false
metal_debug_fused = false
The same execution and telemetry labels can be overridden from irohad:
irohad --fastpq-execution-mode auto
irohad --fastpq-poseidon-mode cpu
irohad --fastpq-device-class apple-m4
irohad --fastpq-chip-family m4
irohad --fastpq-gpu-kind integrated
Environment variables are also supported for the configuration fields. The FastPQ-specific variables include:
- FASTPQ_EXECUTION_MODE
- FASTPQ_POSEIDON_MODE
- FASTPQ_DEVICE_CLASS
- FASTPQ_CHIP_FAMILY
- FASTPQ_GPU_KIND
- FASTPQ_METAL_QUEUE_FANOUT
- FASTPQ_METAL_COLUMN_THRESHOLD
- FASTPQ_METAL_MAX_IN_FLIGHT
- FASTPQ_METAL_THREADGROUP
- FASTPQ_METAL_TRACE
- FASTPQ_DEBUG_METAL_ENUM
- FASTPQ_DEBUG_FUSED
Metrics
When telemetry is enabled, FastPQ exports metrics for backend selection and Metal runtime behavior:
| Metric | Meaning |
|---|---|
| fastpq_execution_mode_total | Requested and resolved execution mode by backend and device labels |
| fastpq_poseidon_pipeline_total | Requested and resolved Poseidon pipeline path |
| fastpq_metal_queue_depth | Metal queue limit, max in-flight count, dispatch count, and sampling window |
| fastpq_metal_queue_ratio | Metal queue busy and overlap ratios |
| fastpq_zero_fill_duration_ms | Host zero-fill duration for Metal runs |
| fastpq_zero_fill_bandwidth_gbps | Derived zero-fill bandwidth |
For general performance triage, use these with the consensus and queue signals listed in Performance and Metrics.
Related Reference
- Data Model Schema for generated type details
- FastpqTransitionBatch
- FastpqPublicInputs
- TransferTranscript
- AxtFastpqBinding
- LaneFastpqProofMaterial
- irohad FastPQ options