Introduction — What is GroveDB?
The Core Idea
GroveDB is a hierarchical authenticated data structure — essentially a grove (tree of trees) built on Merkle AVL trees. Each node in the database is part of a cryptographically authenticated tree, and each tree can contain other trees as children, forming a deep hierarchy of verifiable state.
graph TD
subgraph root["Root Merk Tree"]
R_contracts[""contracts"<br/><i>Tree</i>"]
R_identities[""identities"<br/><i>Tree</i>"]
R_balances[""balances"<br/><i>SumTree</i>"]
R_contracts --- R_identities
R_contracts --- R_balances
end
subgraph ident["Identities Merk"]
I_bob[""bob"<br/><i>Tree</i>"]
I_alice[""alice"<br/><i>Tree</i>"]
I_carol[""carol"<br/><i>Item</i>"]
I_bob --- I_alice
I_bob --- I_carol
end
subgraph contracts["Contracts Merk"]
C_c2[""C2"<br/><i>Item</i>"]
C_c1[""C1"<br/><i>Item</i>"]
C_c3[""C3"<br/><i>Item</i>"]
C_c2 --- C_c1
C_c2 --- C_c3
end
subgraph balances["Balances SumTree — sum=5300"]
B_bob[""bob"<br/>SumItem(2500)"]
B_al[""alice"<br/>SumItem(2000)"]
B_eve[""eve"<br/>SumItem(800)"]
B_bob --- B_al
B_bob --- B_eve
end
subgraph alice_merk["Alice Merk"]
A_name[""name" → Alice"]
A_bal[""balance" → 1000"]
end
subgraph bob_merk["Bob Merk"]
Bo_name[""name" → Bob"]
end
R_identities -.->|subtree| ident
R_contracts -.->|subtree| contracts
R_balances -.->|subtree| balances
I_alice -.->|subtree| alice_merk
I_bob -.->|subtree| bob_merk
style root fill:#e8f4fd,stroke:#2980b9,stroke-width:2px
style ident fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style contracts fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style balances fill:#fdedec,stroke:#e74c3c,stroke-width:2px
style alice_merk fill:#eafaf1,stroke:#27ae60,stroke-width:2px
style bob_merk fill:#eafaf1,stroke:#27ae60,stroke-width:2px
Each colored box is a separate Merk tree. Dashed arrows show the subtree relationship — a Tree element in the parent contains the root key of the child Merk.
In a traditional database, you might store data in a flat key-value store with a single Merkle tree on top for authentication. GroveDB takes a different approach: it nests Merkle trees inside Merkle trees. This gives you:
- Efficient secondary indexes — query by any path, not just primary key
- Compact cryptographic proofs — prove the existence (or absence) of any data
- Aggregate data — trees can automatically sum, count, or otherwise aggregate their children
- Atomic cross-tree operations — batch operations span multiple subtrees
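These properties rest on one structural idea — trees containing trees. A minimal sketch makes the nesting concrete (plain Rust, not GroveDB's API; Value, get, and demo_root are illustrative names): a path like identities → alice → balance resolves by descending through nested tree values.

```rust
use std::collections::BTreeMap;

// Toy model of a grove: each subtree is a sorted map, and a Value::Tree
// holds a child map. Real GroveDB stores each subtree as a Merk tree and
// authenticates it with hashes; this sketch only shows the nesting.
enum Value {
    Item(Vec<u8>),
    Tree(BTreeMap<Vec<u8>, Value>),
}

// Resolve a path such as ["identities", "alice", "balance"] by walking
// Tree values from the root.
fn get<'a>(root: &'a BTreeMap<Vec<u8>, Value>, path: &[&[u8]]) -> Option<&'a Value> {
    let (first, rest) = path.split_first()?;
    let value = root.get(*first)?;
    if rest.is_empty() {
        return Some(value);
    }
    match value {
        Value::Tree(child) => get(child, rest),
        _ => None, // tried to descend through a non-tree element
    }
}

fn demo_root() -> BTreeMap<Vec<u8>, Value> {
    let mut alice = BTreeMap::new();
    alice.insert(b"balance".to_vec(), Value::Item(b"1000".to_vec()));
    let mut identities = BTreeMap::new();
    identities.insert(b"alice".to_vec(), Value::Tree(alice));
    let mut root = BTreeMap::new();
    root.insert(b"identities".to_vec(), Value::Tree(identities));
    root
}

fn main() {
    let root = demo_root();
    match get(&root, &[b"identities" as &[u8], b"alice", b"balance"]) {
        Some(Value::Item(v)) => assert_eq!(v, &b"1000".to_vec()),
        _ => panic!("path should resolve to an Item"),
    }
    // A missing key (or descending through a plain Item) yields nothing.
    assert!(get(&root, &[b"identities" as &[u8], b"bob"]).is_none());
}
```

The real system adds what this sketch omits: each nested map is a Merk tree whose root hash is bound into its parent, so a single root hash authenticates every path.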
Why GroveDB Exists
GroveDB was designed for Dash Platform, a decentralized application platform where every piece of state must be:
- Authenticated: Any node can prove any piece of state to a light client
- Deterministic: Every node computes exactly the same state root
- Efficient: Operations must complete within block time constraints
- Queryable: Applications need rich queries, not just key lookups
Traditional approaches fall short:
| Approach | Problem |
|---|---|
| Plain Merkle Tree | Only supports key lookups, no range queries |
| Ethereum MPT | Expensive rebalancing, large proof sizes |
| Flat key-value + single tree | No hierarchical queries, single proof covers everything |
| B-tree | Not naturally Merklized, complex authentication |
GroveDB solves these by combining the proven balance guarantees of AVL trees with hierarchical nesting and a rich element type system.
Architecture Overview
GroveDB is organized into distinct layers, each with a clear responsibility:
graph TD
APP["<b>Application Layer</b><br/>Dash Platform, etc.<br/><code>insert() get() query() prove() apply_batch()</code>"]
GROVE["<b>GroveDB Core</b> — <code>grovedb/src/</code><br/>Hierarchical subtree management · Element type system<br/>Reference resolution · Batch ops · Multi-layer proofs"]
MERK["<b>Merk Layer</b> — <code>merk/src/</code><br/>Merkle AVL tree · Self-balancing rotations<br/>Link system · Blake3 hashing · Proof encoding"]
STORAGE["<b>Storage Layer</b> — <code>storage/src/</code><br/>RocksDB OptimisticTransactionDB<br/>4 column families · Blake3 prefix isolation · Batched writes"]
COST["<b>Cost Layer</b> — <code>costs/src/</code><br/>OperationCost tracking · CostContext monad<br/>Worst-case & average-case estimation"]
APP ==>|"writes ↓"| GROVE
GROVE ==>|"tree ops"| MERK
MERK ==>|"disk I/O"| STORAGE
STORAGE -.->|"cost accumulation ↑"| COST
COST -.-> MERK
MERK -.-> GROVE
GROVE ==>|"reads ↑"| APP
style APP fill:#f5f5f5,stroke:#999,stroke-width:1px
style GROVE fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style MERK fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style STORAGE fill:#fdebd0,stroke:#e67e22,stroke-width:2px
style COST fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
Data flows down through these layers during writes and up during reads. Every operation accumulates costs as it traverses the stack, enabling precise resource accounting.
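The cost-accumulation arrow can be made concrete with a toy sketch. This is not the real costs crate API — the actual OperationCost tracks more dimensions and CostContext has more combinators — but it shows the monadic shape: every operation returns its value together with its cost, and composing operations merges their costs automatically.

```rust
// Illustrative cost threading, loosely modeled on GroveDB's cost layer.
// Field and function names here are assumptions for the sketch.
#[derive(Debug, Clone, Copy, PartialEq)]
struct OperationCost {
    seek_count: u32,
    hash_node_calls: u32,
}

impl OperationCost {
    fn add(&mut self, other: OperationCost) {
        self.seek_count += other.seek_count;
        self.hash_node_calls += other.hash_node_calls;
    }
}

// A value paired with the cost of computing it.
struct CostContext<T> {
    value: T,
    cost: OperationCost,
}

impl<T> CostContext<T> {
    // Monadic bind: run the next step and fold its cost into ours.
    fn flat_map<U>(self, f: impl FnOnce(T) -> CostContext<U>) -> CostContext<U> {
        let mut next = f(self.value);
        next.cost.add(self.cost);
        next
    }
}

// A pretend storage read: costs one seek.
fn read(key: u32) -> CostContext<u32> {
    CostContext { value: key * 2, cost: OperationCost { seek_count: 1, hash_node_calls: 0 } }
}

// A pretend hash step: costs one hash call (XOR is self-inverse, so two
// applications return the original value).
fn hash(v: u32) -> CostContext<u32> {
    CostContext { value: v ^ 0xabcd, cost: OperationCost { seek_count: 0, hash_node_calls: 1 } }
}

fn main() {
    // Costs accumulate as the operation traverses the stack.
    let result = read(21).flat_map(hash).flat_map(hash);
    assert_eq!(result.value, 42);
    assert_eq!(result.cost, OperationCost { seek_count: 1, hash_node_calls: 2 });
}
```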
The Merk Tree — A Merkle AVL Tree
The Merk tree is the fundamental building block of GroveDB. Every subtree in the grove is a Merk tree — a self-balancing binary search tree where every node is cryptographically hashed, producing a single root hash that authenticates the entire tree's contents.
What is a Merk Node?
Unlike many Merkle tree implementations where data lives only at the leaves, in a Merk tree every node stores a key-value pair. This means there are no "empty" internal nodes — the tree is both a search structure and a data store simultaneously.
graph TD
subgraph TreeNode
subgraph inner["inner: Box<TreeNodeInner>"]
subgraph kv["kv: KV"]
KEY["<b>key:</b> Vec<u8><br/><i>e.g. b"alice"</i>"]
VAL["<b>value:</b> Vec<u8><br/><i>serialized Element bytes</i>"]
FT["<b>feature_type:</b> TreeFeatureType<br/><i>BasicMerkNode | SummedMerkNode(n) | ...</i>"]
VH["<b>value_hash:</b> [u8; 32]<br/><i>H(varint(value.len) ‖ value)</i>"]
KVH["<b>hash:</b> [u8; 32] — the kv_hash<br/><i>H(varint(key.len) ‖ key ‖ value_hash)</i>"]
end
LEFT["<b>left:</b> Option<Link>"]
RIGHT["<b>right:</b> Option<Link>"]
end
OLD["<b>old_value:</b> Option<Vec<u8>> — previous value for cost deltas"]
KNOWN["<b>known_storage_cost:</b> Option<KeyValueStorageCost>"]
end
LEFT -->|"smaller keys"| LC["Left Child<br/><i>Link::Reference | Modified<br/>Uncommitted | Loaded</i>"]
RIGHT -->|"larger keys"| RC["Right Child<br/><i>Link::Reference | Modified<br/>Uncommitted | Loaded</i>"]
style kv fill:#eaf2f8,stroke:#2980b9
style inner fill:#fef9e7,stroke:#f39c12
style TreeNode fill:#f9f9f9,stroke:#333
style LC fill:#d5f5e3,stroke:#27ae60
style RC fill:#d5f5e3,stroke:#27ae60
In code (merk/src/tree/mod.rs):
```rust
pub struct TreeNode {
    pub(crate) inner: Box<TreeNodeInner>,
    pub(crate) old_value: Option<Vec<u8>>, // Previous value for cost tracking
    pub(crate) known_storage_cost: Option<KeyValueStorageCost>,
}

pub struct TreeNodeInner {
    pub(crate) left: Option<Link>,  // Left child (smaller keys)
    pub(crate) right: Option<Link>, // Right child (larger keys)
    pub(crate) kv: KV,              // The key-value payload
}
```
The Box<TreeNodeInner> places the payload behind a pointer, which is required because the type is recursive — child links can contain entire TreeNode instances, so without the indirection the type would have no finite size.
The KV Structure
The KV struct holds both the raw data and its cryptographic digests
(merk/src/tree/kv.rs):
```rust
pub struct KV {
    pub(super) key: Vec<u8>,                  // The lookup key
    pub(super) value: Vec<u8>,                // The stored value
    pub(super) feature_type: TreeFeatureType, // Aggregation behavior
    pub(crate) value_defined_cost: Option<ValueDefinedCostType>,
    pub(super) hash: CryptoHash,              // kv_hash
    pub(super) value_hash: CryptoHash,        // H(value)
}
```
Two important points:
- Keys are not stored on disk as part of the encoded node. They are stored as the RocksDB key; when a node is decoded from storage, the key is injected from the outside. This avoids duplicating key bytes.
- Two hash fields are maintained. The value_hash is H(value), and the hash (the kv_hash) is H(key ‖ value_hash). Keeping both allows the proof system to choose how much information to reveal.
The Semi-Balanced Nature — How AVL "Wobbles"
A Merk tree is an AVL tree — the classic self-balancing binary search tree invented by Adelson-Velsky and Landis. The key invariant is:
For every node, the height difference between its left and right subtrees is at most 1.
This is expressed as the balance factor:
balance_factor = right_height - left_height
Valid values: {-1, 0, 1}
```rust
// merk/src/tree/mod.rs
pub const fn balance_factor(&self) -> i8 {
    let left_height = self.child_height(true) as i8;
    let right_height = self.child_height(false) as i8;
    right_height - left_height
}
```
But here's the subtle point: while each individual node can only lean by one level, these leanings can compound through the tree. This is why we call it "semi-balanced" — the tree isn't perfectly balanced like a complete binary tree.
Consider a tree of 12 nodes. A perfectly balanced tree would have height 4 (⌈log₂(12+1)⌉), but a valid AVL tree over the same keys can reach height 5 — and 12 is the smallest node count where this happens:
Perfectly balanced (height 4):
graph TD
N7["7<br/><small>bf=0</small>"]
N4["4<br/><small>bf=0</small>"]
N10["10<br/><small>bf=0</small>"]
N2["2<br/><small>bf=0</small>"]
N5["5<br/><small>bf=+1</small>"]
N9["9<br/><small>bf=-1</small>"]
N11["11<br/><small>bf=+1</small>"]
N1["1<br/><small>bf=0</small>"]
N3["3<br/><small>bf=0</small>"]
N6["6<br/><small>bf=0</small>"]
N8["8<br/><small>bf=0</small>"]
N12["12<br/><small>bf=0</small>"]
N7 --- N4
N7 --- N10
N4 --- N2
N4 --- N5
N10 --- N9
N10 --- N11
N2 --- N1
N2 --- N3
N5 --- N6
N9 --- N8
N11 --- N12
style N7 fill:#d4e6f1,stroke:#2980b9
AVL-valid "wobble" (height 5) — each node leans at most 1, but it compounds:
graph TD
W8["8<br/><small>bf=-1</small>"]
W5["5<br/><small>bf=-1</small>"]
W11["11<br/><small>bf=-1</small>"]
W3["3<br/><small>bf=-1</small>"]
W6["6<br/><small>bf=+1</small>"]
W10["10<br/><small>bf=-1</small>"]
W12["12<br/><small>bf=0</small>"]
W2["2<br/><small>bf=-1</small>"]
W4["4<br/><small>bf=0</small>"]
W7["7<br/><small>bf=0</small>"]
W9["9<br/><small>bf=0</small>"]
W1["1<br/><small>bf=0</small>"]
W8 --- W5
W8 --- W11
W5 --- W3
W5 --- W6
W11 --- W10
W11 --- W12
W3 --- W2
W3 --- W4
W6 --- W7
W10 --- W9
W2 --- W1
style W8 fill:#fadbd8,stroke:#e74c3c
Height 5 vs the perfect 4 — that's the "wobble". Worst case: h ≤ 1.44 × log₂(n+2).
Both trees are valid AVL trees! The worst-case height of an AVL tree is:
h ≤ 1.4404 × log₂(n + 2) − 0.3277
So for n = 1,000,000 nodes:
- Perfect balance: height 20
- AVL worst case: height 28 (the bound gives ≈ 28.4)
This ~44% overhead is the price of AVL's simple rotation rules. In practice, random insertions produce trees much closer to perfectly balanced.
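Both numbers can be checked exactly. The sparsest AVL tree of a given height is the classic Fibonacci tree — each node's subtrees differ in height by exactly 1 — giving the recurrence min_nodes(h) = 1 + min_nodes(h−1) + min_nodes(h−2). A quick computation (illustrative code, not from the codebase; heights counted in levels):

```rust
// Worst-case AVL height for n nodes: the tallest height whose minimal
// node count (Fibonacci-tree recurrence) still fits in n.
fn worst_height(n: u64) -> u32 {
    if n == 0 {
        return 0;
    }
    let (mut prev, mut cur) = (0u64, 1u64); // min_nodes(0), min_nodes(1)
    let mut h = 1;
    loop {
        let next = 1 + cur + prev; // min_nodes(h + 1)
        if next > n {
            return h;
        }
        prev = cur;
        cur = next;
        h += 1;
    }
}

// Perfect-balance height in levels: ⌈log₂(n + 1)⌉.
fn perfect_height(n: u64) -> u32 {
    64 - n.leading_zeros()
}

fn main() {
    assert_eq!(perfect_height(12), 4);
    assert_eq!(worst_height(12), 5); // the 12-node wobble above
    assert_eq!(perfect_height(1_000_000), 20);
    assert_eq!(worst_height(1_000_000), 28); // bound: 1.4404·log₂(n+2) ≈ 28.4
}
```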
Here's what valid and invalid trees look like:
VALID — all balance factors in {-1, 0, +1}:
graph TD
subgraph balanced["Balanced (bf=0)"]
D1["D<br/>bf=0"] --- B1["B<br/>bf=0"]
D1 --- F1["F<br/>bf=0"]
B1 --- A1["A"] & C1["C"]
F1 --- E1["E"] & G1["G"]
end
subgraph rightlean["Right-leaning (bf=+1)"]
D2["D<br/>bf=+1"] --- B2["B<br/>bf=0"]
D2 --- F2["F<br/>bf=0"]
B2 --- A2["A"] & C2["C"]
F2 --- E2["E"] & G2["G"]
end
subgraph leftlean["Left-leaning (bf=-1)"]
D3["D<br/>bf=-1"] --- B3["B<br/>bf=-1"]
D3 --- E3["E"]
B3 --- A3["A"]
end
style balanced fill:#d5f5e3,stroke:#27ae60
style rightlean fill:#d5f5e3,stroke:#27ae60
style leftlean fill:#d5f5e3,stroke:#27ae60
INVALID — balance factor = +2 (needs rotation!):
graph TD
B["B<br/><b>bf=+2 ✗</b>"]
D["D<br/>bf=+1"]
F["F<br/>bf=0"]
B --- D
D --- F
style B fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
The right subtree is 2 levels taller than the left (which is empty). This triggers a left rotation to restore the AVL invariant.
Rotations — Restoring Balance
When an insertion or deletion causes a balance factor to reach ±2, the tree must be rotated to restore the AVL invariant. There are four cases, reducible to two fundamental operations.
Single Left Rotation
Used when a node is right-heavy (bf = +2) and its right child is right-heavy or balanced (bf ≥ 0):
Before (bf=+2):
graph TD
A["A<br/><small>bf=+2</small>"]
t1["t₁"]
B["B<br/><small>bf≥0</small>"]
X["X"]
C["C"]
A --- t1
A --- B
B --- X
B --- C
style A fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
After left rotation — B promoted to root:
graph TD
B2["B<br/><small>bf=0</small>"]
A2["A"]
C2["C"]
t12["t₁"]
X2["X"]
B2 --- A2
B2 --- C2
A2 --- t12
A2 --- X2
style B2 fill:#d5f5e3,stroke:#27ae60,stroke-width:3px
Steps: (1) Detach B from A. (2) Detach X (B's left child). (3) Attach X as A's right child. (4) Attach A as B's left child. The subtree rooted at B is now balanced.
In code (merk/src/tree/ops.rs):
```rust
fn rotate<V>(self, left: bool, ...) -> CostResult<Self, Error> {
    // Detach child on the heavy side
    let (tree, child) = self.detach_expect(left, ...);
    // Detach grandchild from opposite side of child
    let (child, maybe_grandchild) = child.detach(!left, ...);

    // Attach grandchild to original root
    tree.attach(left, maybe_grandchild)
        .maybe_balance(...)
        .flat_map_ok(|tree| {
            // Attach original root as child of promoted node
            child.attach(!left, Some(tree)).maybe_balance(...)
        })
}
```
Note how maybe_balance is called recursively — the rotation itself might create
new imbalances that need further correction.
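The four steps above can be replayed on a toy tree. This sketch uses plain owned nodes — none of Merk's Link, cost, or lazy-loading machinery — but performs the same detach/attach sequence as rotate():

```rust
// Toy owned binary tree, just enough to demonstrate a left rotation.
#[derive(Debug)]
struct Node {
    key: u8,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

fn leaf(key: u8) -> Option<Box<Node>> {
    Some(Box::new(Node { key, left: None, right: None }))
}

// Left rotation: the right child B is promoted to root, and B's former
// left subtree X becomes the old root A's right child.
fn rotate_left(mut a: Box<Node>) -> Box<Node> {
    let mut b = a.right.take().expect("left rotation needs a right child");
    a.right = b.left.take(); // detach X, reattach under A
    b.left = Some(a);        // A becomes B's left child
    b
}

fn main() {
    // The A(bf=+2) → B → C chain from the "INVALID" example (keys 1, 2, 3).
    let a = Box::new(Node {
        key: 1,
        left: None,
        right: Some(Box::new(Node { key: 2, left: None, right: leaf(3) })),
    });
    let root = rotate_left(a);
    // B is now the root with A on the left and C on the right.
    assert_eq!(root.key, 2);
    assert_eq!(root.left.as_ref().unwrap().key, 1);
    assert_eq!(root.right.as_ref().unwrap().key, 3);
}
```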
Double Rotation (Left-Right)
Used when a node is left-heavy (bf = -2) but its left child is right-heavy (bf > 0). A single rotation would not fix this:
Step 0: Before — C is left-heavy (bf=-2) but its left child A leans right (bf=+1). A single rotation won't fix this:
graph TD
C0["C<br/><small>bf=-2</small>"]
A0["A<br/><small>bf=+1</small>"]
t4["t₄"]
t1["t₁"]
B0["B"]
t2["t₂"]
t3["t₃"]
C0 --- A0
C0 --- t4
A0 --- t1
A0 --- B0
B0 --- t2
B0 --- t3
style C0 fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
Step 1: Left-rotate child A — now both C and B lean left, fixable by a single rotation:
graph TD
C1["C<br/><small>bf=-2</small>"]
B1["B"]
t41["t₄"]
A1["A"]
t31["t₃"]
t11["t₁"]
t21["t₂"]
C1 --- B1
C1 --- t41
B1 --- A1
B1 --- t31
A1 --- t11
A1 --- t21
style C1 fill:#fdebd0,stroke:#e67e22,stroke-width:2px
Step 2: Right-rotate root C — balanced!
graph TD
B2["B<br/><small>bf=0</small>"]
A2["A"]
C2["C"]
t12["t₁"]
t22["t₂"]
t32["t₃"]
t42["t₄"]
B2 --- A2
B2 --- C2
A2 --- t12
A2 --- t22
C2 --- t32
C2 --- t42
style B2 fill:#d5f5e3,stroke:#27ae60,stroke-width:3px
The algorithm detects this case by comparing the parent's lean direction with the child's balance factor:
```rust
fn maybe_balance<V>(self, ...) -> CostResult<Self, Error> {
    let balance_factor = self.balance_factor();
    if balance_factor.abs() <= 1 {
        return Ok(self); // Already balanced
    }
    let left = balance_factor < 0; // true if left-heavy

    // Double rotation needed when child leans opposite to parent
    let tree = if left == (self.tree().link(left).unwrap().balance_factor() > 0) {
        // First rotation: rotate child in opposite direction
        self.walk_expect(left, |child| child.rotate(!left, ...).map_ok(Some), ...)
    } else {
        self
    };

    // Second (or only) rotation
    tree.rotate(left, ...)
}
```
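The parity check is compact enough to test in isolation. This sketch reproduces the left == (child_bf > 0) comparison for the four canonical ±1 cases (Rebalance and needed are illustrative names, not Merk code):

```rust
// Case analysis for AVL rebalancing: a double rotation is needed exactly
// when the heavy child leans in the opposite direction to its parent.
#[derive(Debug, PartialEq)]
enum Rebalance {
    Single, // one rotation at the parent
    Double, // rotate the child first, then the parent
}

fn needed(parent_bf: i8, heavy_child_bf: i8) -> Rebalance {
    debug_assert!(parent_bf.abs() == 2);
    let left_heavy = parent_bf < 0;
    // Same comparison as `left == (child.balance_factor() > 0)` above.
    if left_heavy == (heavy_child_bf > 0) {
        Rebalance::Double
    } else {
        Rebalance::Single
    }
}

fn main() {
    assert_eq!(needed(-2, -1), Rebalance::Single); // left-left
    assert_eq!(needed(-2, 1), Rebalance::Double);  // left-right
    assert_eq!(needed(2, 1), Rebalance::Single);   // right-right
    assert_eq!(needed(2, -1), Rebalance::Double);  // right-left
}
```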
Batch Operations — Building and Applying
Rather than inserting elements one at a time, Merk supports batch operations that apply multiple changes in a single pass. This is critical for efficiency: a batch of N operations on a tree of M elements takes O((M + N) log(M + N)) time, versus O(N log M) for sequential inserts.
The MerkBatch Type
```rust
type MerkBatch<K> = [(K, Op)];

enum Op {
    Put(Vec<u8>, TreeFeatureType), // Insert or update with value and feature type
    PutWithSpecializedCost(...),   // Insert with predefined cost
    PutCombinedReference(...),     // Insert reference with combined hash
    Replace(Vec<u8>, TreeFeatureType),
    Patch { .. },                  // Partial value update
    Delete,                        // Remove key
    DeleteLayered,                 // Remove with layered cost
    DeleteMaybeSpecialized,        // Remove with optional specialized cost
}
```
Strategy 1: build() — Building from Scratch
When the tree is empty, build() constructs a balanced tree directly from the
sorted batch using a median-split algorithm:
Input batch (sorted): [A, B, C, D, E, F, G] — pick middle (D) as root, recurse on each half:
graph TD
D["<b>D</b><br/><small>root = mid(0..6)</small>"]
B["<b>B</b><br/><small>mid(A,B,C)</small>"]
F["<b>F</b><br/><small>mid(E,F,G)</small>"]
A["A"]
C["C"]
E["E"]
G["G"]
D --- B
D --- F
B --- A
B --- C
F --- E
F --- G
style D fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style B fill:#d5f5e3,stroke:#27ae60
style F fill:#d5f5e3,stroke:#27ae60
Result: perfectly balanced tree with height = 3 = ⌈log₂(7)⌉.
```rust
fn build(batch: &MerkBatch<K>, ...) -> CostResult<Option<TreeNode>, Error> {
    let mid_index = batch.len() / 2;
    let (mid_key, mid_op) = &batch[mid_index];

    // Create root node from middle element
    let mid_tree = TreeNode::new(mid_key.clone(), value.clone(), None, feature_type)?;

    // Recursively build left and right subtrees
    let left = Self::build(&batch[..mid_index], ...);
    let right = Self::build(&batch[mid_index + 1..], ...);

    // Attach children
    mid_tree.attach(true, left).attach(false, right)
}
```
This produces a tree with height ⌈log₂(n + 1)⌉ — perfectly balanced.
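The median split is easy to replay on a plain sorted slice. This toy build (illustrative only — no values, feature types, or costs) picks the middle element as root and recurses, and the resulting height matches ⌈log₂(n + 1)⌉:

```rust
// Median-split construction over a sorted batch, shaped like Merk's build().
struct Node {
    key: char,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

fn build(batch: &[char]) -> Option<Box<Node>> {
    if batch.is_empty() {
        return None;
    }
    let mid = batch.len() / 2;
    Some(Box::new(Node {
        key: batch[mid],
        left: build(&batch[..mid]),       // smaller keys
        right: build(&batch[mid + 1..]),  // larger keys
    }))
}

// Height in levels (empty tree = 0).
fn height(node: &Option<Box<Node>>) -> u32 {
    node.as_ref()
        .map(|n| 1 + height(&n.left).max(height(&n.right)))
        .unwrap_or(0)
}

fn main() {
    let tree = build(&['A', 'B', 'C', 'D', 'E', 'F', 'G']).unwrap();
    assert_eq!(tree.key, 'D'); // middle element becomes the root
    assert_eq!(tree.left.as_ref().unwrap().key, 'B');
    assert_eq!(tree.right.as_ref().unwrap().key, 'F');
    assert_eq!(height(&Some(tree)), 3); // ⌈log₂(7 + 1)⌉ = 3
}
```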
Strategy 2: apply_sorted() — Merging into Existing Tree
When the tree already has data, apply_sorted() uses binary search to find
where each batch operation belongs, then recursively applies operations to the left
and right subtrees:
Existing tree with batch [(B, Put), (F, Delete)]:
Binary search: B < D (go left), F > D (go right).
Before:
graph TD
D0["D"] --- C0["C"]
D0 --- E0["E"]
E0 --- F0["F"]
style D0 fill:#d4e6f1,stroke:#2980b9
After applying batch and rebalancing:
graph TD
D1["D"] --- B1["B"]
D1 --- E1["E"]
B1 --- C1["C"]
style D1 fill:#d5f5e3,stroke:#27ae60
B inserted into the left subtree, F deleted from the right subtree; maybe_balance() confirms bf(D) = 0.
```rust
fn apply_sorted(self, batch: &MerkBatch<K>, ...) -> CostResult<...> {
    let search = batch.binary_search_by(|(key, _)| key.cmp(self.tree().key()));

    match search {
        Ok(index) => {
            // Key matches this node — apply operation directly
            // (Put replaces value, Delete removes node)
        }
        Err(mid) => {
            // Key not found — mid is the split point
            // Recurse on left_batch[..mid] and right_batch[mid..]
        }
    }

    self.recurse(batch, mid, exclusive, ...)
}
```
The recurse method splits the batch and walks left and right:
```rust
fn recurse(self, batch: &MerkBatch<K>, mid: usize, ...) {
    let left_batch = &batch[..mid];
    let right_batch = &batch[mid..]; // or mid+1 if exclusive

    // Apply left batch to left subtree
    let tree = self.walk(true, |maybe_left| {
        Self::apply_to(maybe_left, left_batch, ...)
    });

    // Apply right batch to right subtree
    let tree = tree.walk(false, |maybe_right| {
        Self::apply_to(maybe_right, right_batch, ...)
    });

    // Re-balance after modifications
    tree.maybe_balance(...)
}
```
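The split itself can be demonstrated directly with the standard-library primitive that apply_sorted() relies on (split_at_node is an illustrative helper, not Merk code):

```rust
// Split a sorted batch around the current node's key using binary search:
// ops left of the split go to the left subtree, ops right of it to the
// right subtree, and an exact hit applies to this node itself.
fn split_at_node<'a>(
    batch: &'a [(&'a str, &'a str)],
    node_key: &str,
) -> (&'a [(&'a str, &'a str)], Option<&'a (&'a str, &'a str)>, &'a [(&'a str, &'a str)]) {
    match batch.binary_search_by(|probe| probe.0.cmp(node_key)) {
        // Exact hit: the op at `index` targets this node.
        Ok(index) => (&batch[..index], Some(&batch[index]), &batch[index + 1..]),
        // Miss: `mid` is the split point between left- and right-bound ops.
        Err(mid) => (&batch[..mid], None, &batch[mid..]),
    }
}

fn main() {
    // Batch [(B, Put), (F, Delete)] against a node whose key is "D":
    let batch = [("B", "Put"), ("F", "Delete")];
    let (left, here, right) = split_at_node(&batch, "D");
    assert_eq!(left, &[("B", "Put")][..]);     // bound for the left subtree
    assert!(here.is_none());                   // "D" itself is untouched
    assert_eq!(right, &[("F", "Delete")][..]); // bound for the right subtree

    // An exact match peels off the op for the current node:
    let (_, here, _) = split_at_node(&batch, "F");
    assert_eq!(here, Some(&("F", "Delete")));
}
```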
Node Removal
When deleting a node with two children, Merk promotes the edge node from the taller subtree. This minimizes the chance of needing additional rotations:
Before — deleting D (has two children, right subtree height ≥ left):
graph TD
D["D ✗ delete"]
B0["B"]
F0["F"]
A0["A"]
C0["C"]
E0["E ← successor"]
G0["G"]
D --- B0
D --- F0
B0 --- A0
B0 --- C0
F0 --- E0
F0 --- G0
style D fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style E0 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
After — E (leftmost in right subtree = in-order successor) promoted to D's position:
graph TD
E1["E"]
B1["B"]
F1["F"]
A1["A"]
C1["C"]
G1["G"]
E1 --- B1
E1 --- F1
B1 --- A1
B1 --- C1
F1 --- G1
style E1 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
Rule: If left height > right → promote right-edge of left subtree. If right height ≥ left → promote left-edge of right subtree. This minimizes post-deletion rebalancing.
```rust
pub fn remove(self, ...) -> CostResult<Option<Self>, Error> {
    let has_left = tree.link(true).is_some();
    let has_right = tree.link(false).is_some();
    let left = tree.child_height(true) > tree.child_height(false);

    if has_left && has_right {
        // Two children: promote edge of taller child
        let (tree, tall_child) = self.detach_expect(left, ...);
        let (_, short_child) = tree.detach_expect(!left, ...);
        tall_child.promote_edge(!left, short_child, ...)
    } else if has_left || has_right {
        // One child: promote it directly
        self.detach_expect(left, ...).1
    } else {
        // Leaf node: just remove
        None
    }
}
```
The Link System — Lazy Loading Architecture
Loading an entire Merk tree into memory would be prohibitively expensive for large trees. The Link system solves this by representing child connections in four possible states, enabling lazy loading — children are fetched from storage only when actually needed.
Four Link States
```rust
// merk/src/tree/link.rs
pub enum Link {
    Reference {
        // Pruned: only metadata, no tree in memory
        hash: CryptoHash,
        child_heights: (u8, u8),
        key: Vec<u8>,
        aggregate_data: AggregateData,
    },
    Modified {
        // Recently changed, hash not yet computed
        pending_writes: usize,
        child_heights: (u8, u8),
        tree: TreeNode,
    },
    Uncommitted {
        // Hashed but not yet persisted to storage
        hash: CryptoHash,
        child_heights: (u8, u8),
        tree: TreeNode,
        aggregate_data: AggregateData,
    },
    Loaded {
        // Fully loaded from storage
        hash: CryptoHash,
        child_heights: (u8, u8),
        tree: TreeNode,
        aggregate_data: AggregateData,
    },
}
```
State Transition Diagram
stateDiagram-v2
[*] --> Reference : decode from storage<br/>(hash + key + child_heights)
Reference --> Loaded : fetch()<br/>load from RocksDB
Loaded --> Modified : put / delete<br/>any mutation invalidates hash
Modified --> Uncommitted : commit()<br/>recompute hashes bottom-up
Uncommitted --> Loaded : write to disk<br/>persist to RocksDB
Loaded --> Reference : into_reference()<br/>prune to free memory
state Reference {
[*] : hash ✓ · tree ✗ · key ✓
}
state Loaded {
[*] : hash ✓ · tree ✓
}
state Modified {
[*] : hash ✗ INVALID · tree ✓ dirty<br/>pending_writes: n
}
state Uncommitted {
[*] : hash ✓ fresh · tree ✓ clean
}
What Each State Stores
| State | Hash? | Tree in Memory? | Purpose |
|---|---|---|---|
| Reference | Yes | No | Compact on-disk representation. Only stores key, hash, child heights, and aggregate data. |
| Modified | No | Yes | After any mutation. Tracks pending_writes count for batch optimization. |
| Uncommitted | Yes | Yes | After hash computation but before storage write. Intermediate state during commit. |
| Loaded | Yes | Yes | Fully materialized. Ready for reads or further modifications. |
The pending_writes field in Modified is noteworthy:
```rust
// Computed as: 1 + left_pending_writes + right_pending_writes
pending_writes: 1
    + tree.child_pending_writes(true)
    + tree.child_pending_writes(false),
```
This count helps the commit phase decide how to order writes for optimal performance.
The Fetch Callback Pattern
The Link system uses a Fetch trait to abstract over how child nodes are loaded:
```rust
pub trait Fetch {
    fn fetch(
        &self,
        link: &Link,
        value_defined_cost_fn: Option<
            &impl Fn(&[u8], &GroveVersion) -> Option<ValueDefinedCostType>,
        >,
        grove_version: &GroveVersion,
    ) -> CostResult<TreeNode, Error>;
}
```
Different fetch implementations serve different purposes:
- StorageFetch: Loads from RocksDB (the normal path)
- PanicSource: Used in tests where fetching should never happen
- MockSource: Returns controlled test data
This pattern allows the tree operations to be storage-agnostic — the same balancing and mutation logic works regardless of where the data comes from.
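A stripped-down version of the pattern (illustrative trait and types, far simpler than the real Fetch signature) shows why this works — the same caller runs against a map-backed source or a panicking test source:

```rust
use std::collections::HashMap;

// Storage-agnostic fetching: tree-side code asks the source for bytes by
// key and does not care where they come from.
trait Fetch {
    fn fetch(&self, key: &[u8]) -> Result<Vec<u8>, String>;
}

// Normal path: look up in a (mock) storage map.
struct MapSource(HashMap<Vec<u8>, Vec<u8>>);
impl Fetch for MapSource {
    fn fetch(&self, key: &[u8]) -> Result<Vec<u8>, String> {
        self.0.get(key).cloned().ok_or_else(|| "missing".into())
    }
}

// Test-only path: panics if anything is fetched, like Merk's PanicSource.
struct PanicSource;
impl Fetch for PanicSource {
    fn fetch(&self, _key: &[u8]) -> Result<Vec<u8>, String> {
        panic!("fetch should never be called in this test");
    }
}

// Generic tree-side code works against any source.
fn load_child(source: &impl Fetch, key: &[u8]) -> Result<Vec<u8>, String> {
    source.fetch(key)
}

fn main() {
    let mut map = HashMap::new();
    map.insert(b"alice".to_vec(), b"serialized node".to_vec());
    let source = MapSource(map);
    assert_eq!(load_child(&source, b"alice").unwrap(), b"serialized node".to_vec());
    // PanicSource is only safe when the code under test never fetches.
    let _never_used = PanicSource;
}
```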
The Walker Pattern
The Walker wraps a TreeNode with a Fetch source, providing safe tree
traversal with automatic lazy loading (merk/src/tree/walk/mod.rs):
```rust
pub struct Walker<S: Fetch + Sized + Clone> {
    tree: Owner<TreeNode>,
    source: S,
}
```
The Walker provides three key operations:
walk() — Detach a child, transform it, and reattach:
```rust
pub fn walk<F, T>(self, left: bool, f: F, ...) -> CostResult<Self, Error>
where
    F: FnOnce(Option<Self>) -> CostResult<Option<T>, Error>,
    T: Into<TreeNode>,
```
detach() — Remove a child, loading it from storage if necessary:
```rust
pub fn detach(self, left: bool, ...) -> CostResult<(Self, Option<Self>), Error>
```
If the child is a Link::Reference (pruned), detach will call fetch() to load
it first. If the child is already in memory (Modified, Uncommitted, Loaded),
it simply takes ownership.
attach() — Connect a child to a parent:
```rust
pub fn attach(self, left: bool, maybe_child: Option<Self>) -> Self
```
Attaching always creates a Link::Modified since the parent-child relationship
has changed.
Memory Efficiency Through Pruning
After committing changes, the tree can prune loaded subtrees back to
Link::Reference, freeing memory while retaining the hash needed for proof
generation:
Before pruning — all 7 nodes in memory:
graph TD
D["D<br/><small>Loaded</small>"]
B["B<br/><small>Loaded</small>"]
F["F<br/><small>Loaded</small>"]
A["A<br/><small>Loaded</small>"]
C["C<br/><small>Loaded</small>"]
E["E<br/><small>Loaded</small>"]
G["G<br/><small>Loaded</small>"]
D --- B & F
B --- A & C
F --- E & G
style D fill:#d4e6f1,stroke:#2980b9
style B fill:#d4e6f1,stroke:#2980b9
style F fill:#d4e6f1,stroke:#2980b9
style A fill:#d4e6f1,stroke:#2980b9
style C fill:#d4e6f1,stroke:#2980b9
style E fill:#d4e6f1,stroke:#2980b9
style G fill:#d4e6f1,stroke:#2980b9
After pruning — only root in memory, children are Link::Reference (just hash + key):
graph TD
D2["D<br/><small>Loaded (in memory)</small>"]
B2["B<br/><small>Reference<br/>hash + key only</small>"]
F2["F<br/><small>Reference<br/>hash + key only</small>"]
D2 --- B2 & F2
style D2 fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style B2 fill:#f5f5f5,stroke:#999,stroke-dasharray: 5 5
style F2 fill:#f5f5f5,stroke:#999,stroke-dasharray: 5 5
Link::Loaded holds hash + child_heights + tree (a full TreeNode). Link::Reference holds only hash + child_heights + key — the TreeNode itself is released from memory.
The transformation is simple:
```rust
pub fn into_reference(self) -> Link {
    Link::Reference {
        hash: self.hash(),
        child_heights: self.child_heights(),
        key: self.key().to_vec(),
        aggregate_data: self.aggregate_data(),
    }
}
```
This is crucial for keeping memory usage bounded in large trees — only the nodes actively being accessed need to be in memory.
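A reduced sketch (only two of the four link states, with simplified fields) captures the essence of pruning — the conversion keeps the hash that proofs need and drops the heavy TreeNode:

```rust
// Simplified from merk/src/tree/link.rs: child_heights and aggregate_data
// omitted, TreeNode reduced to its key.
type Hash = [u8; 32];

struct TreeNode {
    key: Vec<u8>,
    // ... children, value, etc.
}

enum Link {
    Reference { hash: Hash, key: Vec<u8> }, // pruned: metadata only
    Loaded { hash: Hash, tree: TreeNode },  // fully materialized
}

impl Link {
    // Prune: keep the hash and key, release the in-memory node.
    fn into_reference(self) -> Link {
        match self {
            Link::Loaded { hash, tree } => Link::Reference { hash, key: tree.key },
            other => other, // already pruned
        }
    }

    fn hash(&self) -> &Hash {
        match self {
            Link::Reference { hash, .. } | Link::Loaded { hash, .. } => hash,
        }
    }
}

fn main() {
    let loaded = Link::Loaded {
        hash: [7u8; 32],
        tree: TreeNode { key: b"alice".to_vec() },
    };
    let pruned = loaded.into_reference();
    // The hash needed for proof generation survives; the TreeNode is dropped.
    assert_eq!(pruned.hash(), &[7u8; 32]);
    assert!(matches!(pruned, Link::Reference { .. }));
}
```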
Hashing — Cryptographic Integrity
Every node in a Merk tree is hashed to produce a root hash — a single 32-byte value that authenticates the entire tree. Any change to any key, value, or structural relationship will produce a different root hash.
Three-Level Hash Hierarchy
Merk uses a three-level hashing scheme, from innermost to outermost:
Example: key = "bob" (3 bytes), value = "hello" (5 bytes):
graph LR
subgraph level1["Level 1: value_hash"]
V_IN["varint(5) ‖ "hello""]
V_BLAKE["Blake3"]
V_OUT(["value_hash<br/><small>32 bytes</small>"])
V_IN --> V_BLAKE --> V_OUT
end
subgraph level2["Level 2: kv_hash"]
K_IN["varint(3) ‖ "bob" ‖ value_hash"]
K_BLAKE["Blake3"]
K_OUT(["kv_hash<br/><small>32 bytes</small>"])
K_IN --> K_BLAKE --> K_OUT
end
subgraph level3["Level 3: node_hash"]
N_LEFT(["left_child_hash<br/><small>32B (or NULL_HASH)</small>"])
N_KV(["kv_hash<br/><small>32B</small>"])
N_RIGHT(["right_child_hash<br/><small>32B (or NULL_HASH)</small>"])
N_BLAKE["Blake3<br/><small>96B input = 2 blocks</small>"]
N_OUT(["node_hash<br/><small>32 bytes</small>"])
N_LEFT --> N_BLAKE
N_KV --> N_BLAKE
N_RIGHT --> N_BLAKE
N_BLAKE --> N_OUT
end
V_OUT -.-> K_IN
K_OUT -.-> N_KV
style level1 fill:#eaf2f8,stroke:#2980b9
style level2 fill:#fef9e7,stroke:#f39c12
style level3 fill:#fdedec,stroke:#e74c3c
The ROOT of the tree = node_hash of the root node — it authenticates every key, value, and structural relationship. Missing children use NULL_HASH = [0x00; 32].
Level 1: value_hash
```rust
// merk/src/tree/hash.rs
pub fn value_hash(value: &[u8]) -> CostContext<CryptoHash> {
    let mut hasher = blake3::Hasher::new();
    let val_length = value.len().encode_var_vec(); // Varint encoding
    hasher.update(val_length.as_slice());
    hasher.update(value);
    // ...
}
```
The value's length is varint-encoded and prepended. This is critical for
collision resistance — without it, H("AB" ‖ "C") would equal H("A" ‖ "BC").
Level 2: kv_hash
```rust
pub fn kv_hash(key: &[u8], value: &[u8]) -> CostContext<CryptoHash> {
    let mut hasher = blake3::Hasher::new();
    let key_length = key.len().encode_var_vec();
    hasher.update(key_length.as_slice());
    hasher.update(key);
    let vh = value_hash(value);
    hasher.update(vh.as_slice()); // Nested hash
    // ...
}
```
This binds the key to the value. For proof verification, there's also a variant that takes a pre-computed value_hash:
```rust
pub fn kv_digest_to_kv_hash(key: &[u8], value_hash: &CryptoHash) -> CostContext<CryptoHash>
```
This is used when the verifier already has the value_hash (e.g., for subtrees where value_hash is a combined hash).
Level 3: node_hash
```rust
pub fn node_hash(
    kv: &CryptoHash,
    left: &CryptoHash,
    right: &CryptoHash,
) -> CostContext<CryptoHash> {
    let mut hasher = blake3::Hasher::new();
    hasher.update(kv);    // 32 bytes
    hasher.update(left);  // 32 bytes
    hasher.update(right); // 32 bytes — total 96 bytes
    // Always exactly 2 hash operations (96 bytes / 64-byte block = 2)
}
```
If a child is absent, its hash is the NULL_HASH — 32 zero bytes:
```rust
pub const NULL_HASH: CryptoHash = [0; HASH_LENGTH]; // [0u8; 32]
```
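The three levels compose mechanically. This sketch swaps Blake3 for std's SipHash (8-byte digests) purely to keep the example dependency-free — the structure of the computation, including the length prefixes and NULL_HASH for absent children, is the same:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in digest: 8 bytes from SipHash instead of Blake3's 32 bytes.
type Digest = [u8; 8];
const NULL_HASH: Digest = [0; 8];

fn h(input: &[u8]) -> Digest {
    let mut hasher = DefaultHasher::new();
    hasher.write(input);
    hasher.finish().to_be_bytes()
}

// Level 1: H(varint(value.len) ‖ value). A single byte stands in for the
// varint, which is enough for short inputs.
fn value_hash(value: &[u8]) -> Digest {
    let mut input = vec![value.len() as u8];
    input.extend_from_slice(value);
    h(&input)
}

// Level 2: H(varint(key.len) ‖ key ‖ value_hash)
fn kv_hash(key: &[u8], value: &[u8]) -> Digest {
    let mut input = vec![key.len() as u8];
    input.extend_from_slice(key);
    input.extend_from_slice(&value_hash(value));
    h(&input)
}

// Level 3: H(kv_hash ‖ left ‖ right); absent children use NULL_HASH.
fn node_hash(kv: &Digest, left: &Digest, right: &Digest) -> Digest {
    let mut input = Vec::with_capacity(24);
    input.extend_from_slice(kv);
    input.extend_from_slice(left);
    input.extend_from_slice(right);
    h(&input)
}

fn main() {
    let root = node_hash(&kv_hash(b"bob", b"hello"), &NULL_HASH, &NULL_HASH);
    // Any change to key or value changes the node hash...
    let tampered = node_hash(&kv_hash(b"bob", b"hellp"), &NULL_HASH, &NULL_HASH);
    assert_ne!(root, tampered);
    // ...while recomputing from the same inputs is deterministic.
    assert_eq!(root, node_hash(&kv_hash(b"bob", b"hello"), &NULL_HASH, &NULL_HASH));
}
```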
Blake3 as the Hash Function
GroveDB uses Blake3 for all hashing. Key properties:
- 256-bit output (32 bytes)
- Block size: 64 bytes
- Speed: ~3x faster than SHA-256 on modern hardware
- Streaming: Can incrementally feed data
The hash operation cost is calculated based on how many 64-byte blocks are processed:
```rust
let hashes = 1 + (hasher.count() - 1) / 64; // Number of hash operations
```
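The block-count formula is worth pinning down with a few inputs (hash_block_ops is an illustrative stand-in for the inline expression above):

```rust
// Cost accounting for Blake3: one "hash operation" per 64-byte block.
fn hash_block_ops(input_len: u64) -> u64 {
    1 + input_len.saturating_sub(1) / 64
}

fn main() {
    assert_eq!(hash_block_ops(64), 1);  // one full block
    assert_eq!(hash_block_ops(96), 2);  // node_hash input: 3 × 32 bytes
    assert_eq!(hash_block_ops(104), 2); // node_hash_with_count: 96 + 8 bytes
    assert_eq!(hash_block_ops(128), 2); // exactly two blocks
    assert_eq!(hash_block_ops(129), 3);
}
```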
Length-Prefix Encoding for Collision Resistance
Every variable-length input is prefixed with its length using varint encoding:
graph LR
subgraph bad["Without length prefix — VULNERABLE"]
BAD1["H("AB" + "C") = H(0x41 0x42 0x43)"]
BAD2["H("A" + "BC") = H(0x41 0x42 0x43)"]
BAD1 --- SAME["SAME HASH!"]
BAD2 --- SAME
end
subgraph good["With length prefix — collision resistant"]
GOOD1["H([02] 0x41 0x42)<br/><small>varint(2) ‖ "AB"</small>"]
GOOD2["H([03] 0x41 0x42 0x43)<br/><small>varint(3) ‖ "ABC"</small>"]
GOOD1 --- DIFF["DIFFERENT"]
GOOD2 --- DIFF
end
style bad fill:#fadbd8,stroke:#e74c3c
style good fill:#d5f5e3,stroke:#27ae60
style SAME fill:#e74c3c,color:#fff,stroke:#c0392b
style DIFF fill:#27ae60,color:#fff,stroke:#229954
value_hash input: [varint(value.len)] [value bytes]
kv_hash input: [varint(key.len)] [key bytes] [value_hash: 32 bytes]
Without length prefixes, an attacker could craft different key-value pairs that hash to the same digest. The length prefix makes this cryptographically infeasible.
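The attack and the defense can both be shown without any cryptography — the point is purely about the byte stream fed to the hasher (unprefixed and length_prefixed are illustrative helpers):

```rust
// Without prefixes, different splits concatenate to identical bytes.
fn unprefixed(parts: &[&[u8]]) -> Vec<u8> {
    parts.concat()
}

// With prefixes, each part carries its length, so the streams differ.
fn length_prefixed(parts: &[&[u8]]) -> Vec<u8> {
    let mut out = Vec::new();
    for part in parts {
        out.push(part.len() as u8); // single-byte varint, enough for short inputs
        out.extend_from_slice(part);
    }
    out
}

fn main() {
    let ab_c: [&[u8]; 2] = [b"AB", b"C"];
    let a_bc: [&[u8]; 2] = [b"A", b"BC"];
    // "AB" ‖ "C" and "A" ‖ "BC" feed the hasher identical bytes...
    assert_eq!(unprefixed(&ab_c), unprefixed(&a_bc));
    // ...unless each part is length-prefixed first.
    assert_ne!(length_prefixed(&ab_c), length_prefixed(&a_bc));
}
```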
Combined Hashing for Special Elements
For subtrees and references, the value_hash is not simply H(value).
Instead, it's a combined hash that binds the element to its target:
graph LR
subgraph item["Regular Item"]
I_val["value bytes"] --> I_hash["H(len ‖ bytes)"] --> I_vh(["value_hash"])
end
subgraph subtree["Subtree Element"]
S_elem["tree element bytes"] --> S_hash1["H(len ‖ bytes)"]
S_root(["child Merk<br/>root hash"])
S_hash1 --> S_combine["combine_hash()<br/><small>Blake3(a ‖ b)</small>"]
S_root --> S_combine
S_combine --> S_vh(["value_hash"])
end
subgraph reference["Reference Element"]
R_elem["ref element bytes"] --> R_hash1["H(len ‖ bytes)"]
R_target["target value"] --> R_hash2["H(len ‖ bytes)"]
R_hash1 --> R_combine["combine_hash()"]
R_hash2 --> R_combine
R_combine --> R_vh(["value_hash"])
end
style item fill:#eaf2f8,stroke:#2980b9
style subtree fill:#fef9e7,stroke:#f39c12
style reference fill:#fdedec,stroke:#e74c3c
Subtree: binds the child Merk's root hash into the parent. Reference: binds both the reference path AND the target value. Changing either changes the root hash.
The combine_hash function:
```rust
pub fn combine_hash(hash_one: &CryptoHash, hash_two: &CryptoHash) -> CostContext<CryptoHash> {
    let mut hasher = blake3::Hasher::new();
    hasher.update(hash_one); // 32 bytes
    hasher.update(hash_two); // 32 bytes — total 64 bytes, exactly 1 hash op
    // ...
}
```
This is what allows GroveDB to authenticate the entire hierarchy through a single root hash — each parent tree's value_hash for a subtree element includes the child tree's root hash.
Aggregate Hashing for ProvableCountTree
ProvableCountTree nodes include the aggregate count in the node hash:
pub fn node_hash_with_count(
    kv: &CryptoHash,
    left: &CryptoHash,
    right: &CryptoHash,
    count: u64,
) -> CostContext<CryptoHash> {
    let mut hasher = blake3::Hasher::new();
    hasher.update(kv);                   // 32 bytes
    hasher.update(left);                 // 32 bytes
    hasher.update(right);                // 32 bytes
    hasher.update(&count.to_be_bytes()); // 8 bytes — total 104 bytes
    // Still exactly 2 hash ops (104 < 128 = 2 × 64)
}
This means a proof of count doesn't require revealing the actual data — the count is baked into the cryptographic commitment.
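A toy version of the layout (with `DefaultHasher` standing in for Blake3, illustration only) shows why a tampered count cannot verify:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in for node_hash_with_count: only the input layout is mirrored.
fn toy_node_hash(kv: &[u8; 32], left: &[u8; 32], right: &[u8; 32], count: u64) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(kv);
    h.write(left);
    h.write(right);
    h.write(&count.to_be_bytes()); // the count is part of the commitment
    h.finish()
}

fn main() {
    let (kv, l, r) = ([1u8; 32], [2u8; 32], [3u8; 32]);
    // Same children, different claimed count → different node hash, so a
    // verifier comparing against the committed root rejects a forged count.
    assert_ne!(toy_node_hash(&kv, &l, &r, 5), toy_node_hash(&kv, &l, &r, 6));
}
```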
The Element System
While Merk deals with raw key-value pairs, GroveDB operates at a higher level using Elements — typed values that carry semantic meaning. Every value stored in GroveDB is an Element.
The Element Enum
// grovedb-element/src/element/mod.rs
pub enum Element {
    Item(Vec<u8>, Option<ElementFlags>),                                               // [0]
    Reference(ReferencePathType, MaxReferenceHop, Option<ElementFlags>),               // [1]
    Tree(Option<Vec<u8>>, Option<ElementFlags>),                                       // [2]
    SumItem(SumValue, Option<ElementFlags>),                                           // [3]
    SumTree(Option<Vec<u8>>, SumValue, Option<ElementFlags>),                          // [4]
    BigSumTree(Option<Vec<u8>>, BigSumValue, Option<ElementFlags>),                    // [5]
    CountTree(Option<Vec<u8>>, CountValue, Option<ElementFlags>),                      // [6]
    CountSumTree(Option<Vec<u8>>, CountValue, SumValue, Option<ElementFlags>),         // [7]
    ProvableCountTree(Option<Vec<u8>>, CountValue, Option<ElementFlags>),              // [8]
    ItemWithSumItem(Vec<u8>, SumValue, Option<ElementFlags>),                          // [9]
    ProvableCountSumTree(Option<Vec<u8>>, CountValue, SumValue, Option<ElementFlags>), // [10]
    CommitmentTree(u64, u8, Option<ElementFlags>),                                     // [11]
    MmrTree(u64, Option<ElementFlags>),                                                // [12]
    BulkAppendTree(u64, u8, Option<ElementFlags>),                                     // [13]
    DenseAppendOnlyFixedSizeTree(u16, u8, Option<ElementFlags>),                       // [14]
}
The discriminant numbers (shown in brackets) are used during serialization.
Type aliases used throughout:
pub type ElementFlags = Vec<u8>;       // Arbitrary metadata per element
pub type MaxReferenceHop = Option<u8>; // Optional hop limit for references
pub type SumValue = i64;               // 64-bit signed sum
pub type BigSumValue = i128;           // 128-bit signed sum
pub type CountValue = u64;             // 64-bit unsigned count
Item — Basic Key-Value Storage
The simplest element. Stores arbitrary bytes:
Element::Item(value: Vec<u8>, flags: Option<ElementFlags>)
graph TD
subgraph item["Element::Item — discriminant 0x00"]
VALUE["<b>value:</b> Vec<u8><br/><small>arbitrary bytes: serialized proto, JSON, raw u64, etc.</small>"]
FLAGS["<b>flags:</b> Option<ElementFlags><br/><small>epoch info, storage credits, metadata</small>"]
FEAT["Merk feature: <b>BasicMerkNode</b> (no aggregation)"]
end
style item fill:#eaf2f8,stroke:#2980b9
Constructors:
Element::new_item(b"hello world".to_vec())
Element::new_item_with_flags(b"data".to_vec(), Some(vec![0x01, 0x02]))
Items participate in sum aggregation: within a SumTree, an Item contributes a default sum of 0. A SumItem contributes its explicit value.
Tree — Containers for Subtrees
A Tree element is a portal to another Merk tree. It stores the root key of the child tree (if any):
Element::Tree(root_key: Option<Vec<u8>>, flags: Option<ElementFlags>)
graph TD
subgraph parent["Parent Merk Tree"]
settings[""settings"<br/>Element::Item"]
logs[""logs"<br/>Element::Item"]
users[""users"<br/>Element::Tree<br/>root_key = "bob""]
settings --> logs
settings --> users
end
subgraph child[""users" Child Merk Tree"]
bob[""bob"<br/>(root_key)"]
alice[""alice"<br/>Item"]
carol[""carol"<br/>Item"]
bob --> alice
bob --> carol
end
users -.->|"portal: root_key<br/>links to child tree"| bob
style parent fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style child fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style users fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style bob fill:#fef9e7,stroke:#f39c12,stroke-width:2px
The Tree element in the parent Merk stores the root_key of the child Merk tree. This creates a portal — a link from one Merk tree into another.
When a tree is empty, root_key is None. The constructor Element::empty_tree() creates Element::Tree(None, None).
SumItem / SumTree — Aggregate Sums
A SumTree automatically maintains the sum of all its direct children's sum-contributions:
Element::SumTree(root_key: Option<Vec<u8>>, sum: SumValue, flags: Option<ElementFlags>)
Element::SumItem(value: SumValue, flags: Option<ElementFlags>)
graph TD
subgraph sumtree["SumTree "balances" — sum=350"]
bob[""bob"<br/>SumItem(150)<br/><i>subtree_sum: 150+100+100 = 350</i>"]
alice[""alice"<br/>SumItem(100)<br/><i>subtree_sum: 100</i>"]
carol[""carol"<br/>SumItem(100)<br/><i>subtree_sum: 100</i>"]
bob --> alice
bob --> carol
end
style sumtree fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style bob fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style alice fill:#d4e6f1,stroke:#2980b9
style carol fill:#d4e6f1,stroke:#2980b9
Aggregation formula:
node_sum = own_value + left_child_sum + right_child_sum

Bob: 150 + 100 (alice) + 100 (carol) = 350. The root sum (350) is stored in the parent's SumTree element.
The sum is maintained at the Merk level through the TreeFeatureType::SummedMerkNode(i64)
feature type. During tree propagation, each node's aggregate data is recomputed:
aggregate_sum = own_sum + left_child_sum + right_child_sum
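The propagation can be sketched with a toy node type (a hypothetical struct — the real Merk TreeNode carries far more state):

```rust
// Toy node, just to illustrate how SummedMerkNode aggregates upward.
struct Node {
    own_sum: i64,
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

// node_sum = own_value + left_child_sum + right_child_sum
fn subtree_sum(node: &Node) -> i64 {
    let left = node.left.as_deref().map_or(0, subtree_sum);
    let right = node.right.as_deref().map_or(0, subtree_sum);
    node.own_sum + left + right
}

fn main() {
    // The "balances" example: bob(150) with children alice(100) and carol(100).
    let bob = Node {
        own_sum: 150,
        left: Some(Box::new(Node { own_sum: 100, left: None, right: None })),
        right: Some(Box::new(Node { own_sum: 100, left: None, right: None })),
    };
    assert_eq!(subtree_sum(&bob), 350); // the root sum stored in the parent element
}
```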
CountTree, CountSumTree, BigSumTree
Additional aggregate tree types:
| Element Type | Merk Feature Type | Aggregates |
|---|---|---|
| CountTree | CountedMerkNode(u64) | Number of elements |
| CountSumTree | CountedSummedMerkNode(u64, i64) | Both count and sum |
| BigSumTree | BigSummedMerkNode(i128) | 128-bit sum for large values |
| ProvableCountTree | ProvableCountedMerkNode(u64) | Count baked into hash |
| ProvableCountSumTree | ProvableCountedSummedMerkNode(u64, i64) | Count in hash + sum |
ProvableCountTree is special: its count is included in the node_hash
computation (via node_hash_with_count), so a proof can verify the count without
revealing any values.
Element Serialization
Elements are serialized using bincode with big-endian byte order:
pub fn serialize(&self, grove_version: &GroveVersion) -> Result<Vec<u8>, ElementError> {
    let config = config::standard().with_big_endian().with_no_limit();
    bincode::encode_to_vec(self, config)
}
The first byte is the discriminant, allowing O(1) type detection:
pub fn from_serialized_value(value: &[u8]) -> Option<ElementType> {
    match value.first()? {
        0 => Some(ElementType::Item),
        1 => Some(ElementType::Reference),
        2 => Some(ElementType::Tree),
        3 => Some(ElementType::SumItem),
        // ... etc
    }
}
TreeFeatureType and Aggregate Data Flow
The TreeFeatureType enum bridges the gap between GroveDB Elements and Merk nodes:
pub enum TreeFeatureType {
    BasicMerkNode,                           // No aggregation
    SummedMerkNode(i64),                     // Sum aggregation
    BigSummedMerkNode(i128),                 // Large sum
    CountedMerkNode(u64),                    // Count
    CountedSummedMerkNode(u64, i64),         // Count + sum
    ProvableCountedMerkNode(u64),            // Count in hash
    ProvableCountedSummedMerkNode(u64, i64), // Count in hash + sum
}
Aggregate data flows upward through the tree:
graph TD
subgraph counttree["CountTree "users" — count = 5"]
C["C<br/>own=1, total=<b>5</b>"]
B["B<br/>own=1, total=2"]
D["D<br/>own=1, total=2"]
A["A<br/>own=1, total=1"]
E["E<br/>own=1, total=1"]
C --> B
C --> D
B --> A
D --> E
end
style counttree fill:#e8daef,stroke:#8e44ad,stroke-width:2px
style C fill:#fef9e7,stroke:#f39c12,stroke-width:2px
Aggregation table: Each node's aggregate = own(1) + left_aggregate + right_aggregate
| Node | own | left_agg | right_agg | total |
|---|---|---|---|---|
| A | 1 | 0 | 0 | 1 |
| B | 1 | 1 (A) | 0 | 2 |
| E | 1 | 0 | 0 | 1 |
| D | 1 | 0 | 1 (E) | 2 |
| C | 1 | 2 (B) | 2 (D) | 5 (root) |
The count stored at each node represents the total count in the subtree rooted at that node, including itself. The root node's count is the total for the entire tree.
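The table can be checked with a small recursive sketch (a toy node type, not the real Merk node):

```rust
// Toy node for illustrating count propagation.
struct Node {
    left: Option<Box<Node>>,
    right: Option<Box<Node>>,
}

// Each node's aggregate count = own (1) + left aggregate + right aggregate.
fn count(node: &Node) -> u64 {
    1 + node.left.as_deref().map_or(0, count) + node.right.as_deref().map_or(0, count)
}

fn leaf() -> Option<Box<Node>> {
    Some(Box::new(Node { left: None, right: None }))
}

fn main() {
    // The CountTree example: C at the root, B → A on the left, D → E on the right.
    let root = Node {
        left: Some(Box::new(Node { left: leaf(), right: None })),  // B, with A below
        right: Some(Box::new(Node { left: None, right: leaf() })), // D, with E below
    };
    assert_eq!(count(&root), 5); // matches the root row of the table
}
```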
The AggregateData enum carries this through the Link system:
pub enum AggregateData {
    NoAggregateData,
    Sum(i64),
    BigSum(i128),
    Count(u64),
    CountAndSum(u64, i64),
    ProvableCount(u64),
    ProvableCountAndSum(u64, i64),
}
CommitmentTree — Sinsemilla Commitment Tree
A CommitmentTree provides a depth-32 Sinsemilla Merkle tree for tracking
note commitment anchors, as used in Zcash's Orchard shielded protocol. It wraps
incrementalmerkletree::Frontier<MerkleHashOrchard, 32> for O(1) append and
root computation:
Element::CommitmentTree(
    total_count: u64, // Number of commitments appended
    chunk_power: u8,  // BulkAppendTree compaction size (chunk_size = 2^chunk_power)
    flags: Option<ElementFlags>,
) // discriminant [11]
Note: The Sinsemilla frontier root hash is NOT stored in the Element. It is persisted in data storage and flows through the Merk child hash mechanism (insert_subtree's subtree_root_hash parameter). Any change to the frontier automatically propagates up through the GroveDB Merk hierarchy.
graph TD
subgraph ct["CommitmentTree — BulkAppendTree + Sinsemilla Frontier"]
ROOT["sinsemilla_root<br/><small>depth-32 Merkle root using<br/>MerkleHashOrchard hashing</small>"]
FRONTIER["<b>Frontier</b><br/><small>~1KB: rightmost path only<br/>O(1) append, O(1) root</small>"]
ITEMS["Data namespace:<br/>BulkAppendTree entries + frontier<br/>(buffer + chunks + MMR + __ct_data__)"]
ROOT --- FRONTIER
FRONTIER --- ITEMS
end
style ct fill:#e8daef,stroke:#8e44ad,stroke-width:2px
Architecture:
- The frontier (rightmost path of the Merkle tree, ~1KB constant size) is stored in the data namespace, keyed by COMMITMENT_TREE_DATA_KEY
- The actual note data (cmx || ciphertext) is stored via a BulkAppendTree in the data namespace — chunk-compacted, retrievable by position
- Historical anchors are tracked by Platform in a separate provable tree
- The Sinsemilla root is NOT stored in the Element — it flows as the Merk child hash through the GroveDB hash hierarchy
Operations:
- commitment_tree_insert(path, key, cmx, ciphertext, tx) — Typed append accepting TransmittedNoteCiphertext<M>; returns (new_root, position)
- commitment_tree_anchor(path, key, tx) — Get current Orchard Anchor
- commitment_tree_get_value(path, key, position, tx) — Retrieve value by position
- commitment_tree_count(path, key, tx) — Get total item count
MemoSize generic: CommitmentTree<S, M: MemoSize = DashMemo> validates that
ciphertext payloads match the expected size for M. For Dash (36-byte memos):
epk_bytes (32) + enc_ciphertext (104) + out_ciphertext (80) = 216 bytes.
Cost tracking: Sinsemilla hash operations are tracked via
cost.sinsemilla_hash_calls. Root computation always traverses 32 levels.
Ommer merges cascade through trailing_ones() of the previous position.
BulkAppendTree operations add Blake3 hash costs.
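The ommer-merge rule can be illustrated directly (a sketch of the trailing_ones rule as stated above, not of the incrementalmerkletree internals):

```rust
// Appending after position p merges one ommer per trailing 1-bit of p:
// each trailing 1 marks a perfect subtree that the new leaf completes.
fn ommer_merges(prev_position: u64) -> u32 {
    prev_position.trailing_ones()
}

fn main() {
    assert_eq!(ommer_merges(0b0), 0);   // after position 0: nothing to merge
    assert_eq!(ommer_merges(0b1), 1);   // after position 1: one pair completes
    assert_eq!(ommer_merges(0b111), 3); // after position 7: a cascade of 3 merges
}
```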
MmrTree — Merkle Mountain Range
An MmrTree stores data in an append-only Merkle Mountain Range (MMR) using Blake3 hashing. MMR nodes are stored in the data column (same as Merk nodes), not in a child Merk subtree. See Chapter 13 for a comprehensive deep-dive into how MMRs work, how they fill up, how proofs are generated and verified, and how MmrTree integrates with GroveDB.
Element::MmrTree(
    mmr_size: u64, // Internal MMR size (nodes, not leaves)
    flags: Option<ElementFlags>,
) // discriminant [12]
Note: The MMR root hash is NOT stored in the Element. It flows as the Merk child hash via insert_subtree's subtree_root_hash parameter.
Operations: mmr_tree_append, mmr_tree_root_hash, mmr_tree_get_value,
mmr_tree_leaf_count. Proofs: V1 proofs (see §9.6 and §13.9).
BulkAppendTree — Two-Level Append-Only Structure
A BulkAppendTree combines a dense Merkle tree buffer with a chunk-level MMR for efficient high-throughput appends with provable range queries. It is a non-Merk tree — data lives in the data namespace, not in a child Merk subtree. See Chapter 14 for a comprehensive deep-dive into the two-level architecture, chunk compaction, proof generation, verification, and GroveDB integration.
Element::BulkAppendTree(
    total_count: u64, // Total values appended
    chunk_power: u8,  // Dense tree height (buffer capacity = 2^chunk_power - 1)
    flags: Option<ElementFlags>,
) // discriminant [13]
Note: The state root (blake3("bulk_state" || mmr_root || dense_tree_root)) is NOT stored in the Element. It flows as the Merk child hash via insert_subtree's subtree_root_hash parameter.
Operations: bulk_append, bulk_get_value, bulk_get_chunk,
bulk_get_buffer, bulk_count, bulk_chunk_count.
Proofs: V1 range proofs (see §9.6 and §14.10).
DenseAppendOnlyFixedSizeTree — Dense Fixed-Capacity Storage
A DenseAppendOnlyFixedSizeTree is a complete binary tree of fixed height h where every node (internal and leaf) stores a data value. Positions are filled in level-order (BFS). The root hash is recomputed on the fly — no intermediate hashes are persisted. See Chapter 16 for the full deep-dive.
Element::DenseAppendOnlyFixedSizeTree(
    count: u16, // Number of values stored (max 65,535)
    height: u8, // Tree height (1..=16, immutable), capacity = 2^h - 1
    flags: Option<ElementFlags>,
) // discriminant [14]
Note: The root hash is NOT stored in the Element — it is recomputed on the fly and flows as the Merk child hash. The count field is u16 (not u64), limiting trees to 65,535 positions. Heights are restricted to 1..=16.
Operations: dense_tree_insert, dense_tree_get, dense_tree_root_hash,
dense_tree_count.
Proofs: Element-level only (no subquery proofs yet).
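The capacity arithmetic is easy to check (a sketch of the formula, not a GroveDB API):

```rust
// Capacity of a complete binary tree of height h where every node
// (internal and leaf) stores a value: 2^h - 1 positions.
// Heights 1..=16 keep the maximum (65,535) within the u16 count field.
fn capacity(height: u8) -> u16 {
    assert!((1..=16).contains(&height));
    ((1u32 << height) - 1) as u16
}

fn main() {
    assert_eq!(capacity(1), 1);       // a single root node
    assert_eq!(capacity(4), 15);
    assert_eq!(capacity(16), 65_535); // the u16 ceiling mentioned above
}
```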
Non-Merk Trees — Common Patterns
CommitmentTree, MmrTree, BulkAppendTree, and DenseAppendOnlyFixedSizeTree share a common architectural pattern that distinguishes them from the Merk-based tree types (Tree, SumTree, CountTree, etc.):
| Property | Merk-based trees | Non-Merk trees |
|---|---|---|
| Child Merk subtree | Yes (root_key = Some(...)) | No (no root_key field) |
| Data storage | Merk key-value pairs | Data column blobs (non-Merk keys) |
| Root hash binding | combine_hash(elem_hash, child_root_hash) | combine_hash(elem_hash, type_specific_root) |
| Type-specific root | Maintained by Merk AVL | Flows as Merk child hash (NOT in element bytes) |
| Proof format | V0 (layer-by-layer Merk) | V1 (type-specific proof) |
| TreeFeatureType | BasicMerkNode (no aggregation) | BasicMerkNode |
Storage column note: All four non-Merk tree types (MmrTree, CommitmentTree, BulkAppendTree, DenseAppendOnlyFixedSizeTree) store their data in the data column using non-Merk keys. CommitmentTree stores its Sinsemilla frontier alongside BulkAppendTree entries in the same data column (key b"__ct_data__").
The type-specific root (sinsemilla root, MMR root, state root, or dense tree root hash) is
NOT stored in the Element. Instead, it flows as the Merk child hash via
insert_subtree's subtree_root_hash parameter. The Merk combined_value_hash
becomes combine_hash(value_hash(element_bytes), type_specific_root).
Any change to the type-specific root changes the child hash, which changes
the combined_value_hash, which propagates up through the GroveDB hash
hierarchy — maintaining cryptographic integrity.
The Hierarchical Grove — Tree of Trees
How Subtrees Nest Inside Parent Trees
The defining feature of GroveDB is that a Merk tree can contain elements that are themselves Merk trees. This creates a hierarchical namespace:
graph TD
subgraph root["ROOT MERK TREE — path: []"]
contracts[""contracts"<br/>Tree"]
identities[""identities"<br/>Tree"]
balances[""balances"<br/>SumTree, sum=0"]
contracts --> identities
contracts --> balances
end
subgraph ident["IDENTITIES MERK — path: ["identities"]"]
bob456[""bob456"<br/>Tree"]
alice123[""alice123"<br/>Tree"]
eve[""eve"<br/>Item"]
bob456 --> alice123
bob456 --> eve
end
subgraph bal["BALANCES MERK (SumTree) — path: ["balances"]"]
bob_bal[""bob456"<br/>SumItem(800)"]
end
subgraph alice_tree["ALICE123 MERK — path: ["identities", "alice123"]"]
name[""name"<br/>Item("Al")"]
balance_item[""balance"<br/>SumItem(1000)"]
docs[""docs"<br/>Tree"]
name --> balance_item
name --> docs
end
identities -.-> bob456
balances -.-> bob_bal
alice123 -.-> name
docs -.->|"... more subtrees"| more["..."]
style root fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style ident fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style bal fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style alice_tree fill:#e8daef,stroke:#8e44ad,stroke-width:2px
style more fill:none,stroke:none
Each colored box is a separate Merk tree. Dashed arrows represent the portal links from Tree elements to their child Merk trees. The path to each Merk is shown in its label.
Path Addressing System
Every element in GroveDB is addressed by a path — a sequence of byte strings that navigate from the root through subtrees to the target key:
Path: ["identities", "alice123", "name"]
Step 1: In root tree, look up "identities" → Tree element
Step 2: Open identities subtree, look up "alice123" → Tree element
Step 3: Open alice123 subtree, look up "name" → Item("Alice")
Paths are represented as Vec<Vec<u8>> or using the SubtreePath type for
efficient manipulation without allocation:
// The path to the element (all segments except the last)
let path: &[&[u8]] = &[b"identities", b"alice123"];
// The key within the final subtree
let key: &[u8] = b"name";
Blake3 Prefix Generation for Storage Isolation
Each subtree in GroveDB gets its own isolated storage namespace in RocksDB. The namespace is determined by hashing the path with Blake3:
// storage/src/rocksdb_storage/storage.rs
pub type SubtreePrefix = [u8; 32];
// The prefix is computed by hashing the path segments
For example:
Path: ["identities", "alice123"]
Prefix: Blake3(["identities", "alice123"]) = [0xab, 0x3f, ...] (32 bytes)
In RocksDB, keys for this subtree are stored as:
[prefix: 32 bytes][original_key]
So "name" in this subtree becomes:
[0xab, 0x3f, ...][0x6e, 0x61, 0x6d, 0x65] ("name")
This ensures:
- No key collisions between subtrees (32-byte prefix = 256-bit isolation)
- Efficient prefix computation (single Blake3 hash over the path bytes)
- Subtree data is colocated in RocksDB for cache efficiency
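The key layout can be sketched as follows (the exact path-to-bytes serialization fed to Blake3 is implementation-specific; the fixed 32-byte prefix is the point):

```rust
// Build a RocksDB key for a subtree: [prefix: 32 bytes][original_key].
fn prefixed_key(prefix: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(32 + key.len());
    out.extend_from_slice(prefix);
    out.extend_from_slice(key);
    out
}

fn main() {
    let prefix = [0xab; 32]; // stand-in for Blake3(path)
    let k = prefixed_key(&prefix, b"name");
    assert_eq!(k.len(), 36);       // 32-byte prefix + 4-byte key
    assert_eq!(&k[..32], &prefix); // prefix isolates the subtree namespace
    assert_eq!(&k[32..], b"name"); // original key preserved verbatim
}
```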
Root Hash Propagation Through the Hierarchy
When a value changes deep in the grove, the change must propagate upward to update the root hash:
Change: Update "name" to "ALICE" in identities/alice123/
Step 1: Update value in alice123's Merk tree
→ alice123 tree gets new root hash: H_alice_new
Step 2: Update "alice123" element in identities tree
→ identities tree's value_hash for "alice123" =
combine_hash(H(tree_element_bytes), H_alice_new)
→ identities tree gets new root hash: H_ident_new
Step 3: Update "identities" element in root tree
→ root tree's value_hash for "identities" =
combine_hash(H(tree_element_bytes), H_ident_new)
→ ROOT HASH changes
graph TD
subgraph step3["STEP 3: Update root tree"]
R3["Root tree recalculates:<br/>value_hash for "identities" =<br/>combine_hash(H(tree_bytes), H_ident_NEW)<br/>→ new ROOT HASH"]
end
subgraph step2["STEP 2: Update identities tree"]
R2["identities tree recalculates:<br/>value_hash for "alice123" =<br/>combine_hash(H(tree_bytes), H_alice_NEW)<br/>→ new root hash: H_ident_NEW"]
end
subgraph step1["STEP 1: Update alice123 Merk"]
R1["alice123 tree recalculates:<br/>value_hash("ALICE") → new kv_hash<br/>→ new root hash: H_alice_NEW"]
end
R1 -->|"H_alice_NEW flows up"| R2
R2 -->|"H_ident_NEW flows up"| R3
style step1 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style step2 fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style step3 fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
Before vs After — changed nodes marked with red:
graph TD
subgraph before["BEFORE"]
B_root["Root: aabb1122"]
B_ident[""identities": cc44.."]
B_contracts[""contracts": 1234.."]
B_balances[""balances": 5678.."]
B_alice[""alice123": ee55.."]
B_bob[""bob456": bb22.."]
B_name[""name": 7f.."]
B_docs[""docs": a1.."]
B_root --- B_ident
B_root --- B_contracts
B_root --- B_balances
B_ident --- B_alice
B_ident --- B_bob
B_alice --- B_name
B_alice --- B_docs
end
subgraph after["AFTER"]
A_root["Root: ff990033"]
A_ident[""identities": dd88.."]
A_contracts[""contracts": 1234.."]
A_balances[""balances": 5678.."]
A_alice[""alice123": 1a2b.."]
A_bob[""bob456": bb22.."]
A_name[""name": 3c.."]
A_docs[""docs": a1.."]
A_root --- A_ident
A_root --- A_contracts
A_root --- A_balances
A_ident --- A_alice
A_ident --- A_bob
A_alice --- A_name
A_alice --- A_docs
end
style A_root fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style A_ident fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style A_alice fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style A_name fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
Only the nodes on the path from the changed value up to the root are recomputed. Siblings and other branches remain unchanged.
The propagation is implemented by propagate_changes_with_transaction, which walks
up the path from the modified subtree to the root, updating each parent's element
hash along the way.
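The chain of combine_hash calls can be sketched with a stand-in hash (DefaultHasher instead of Blake3, illustration only):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

// Stand-in for Blake3 over raw bytes.
fn toy_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(data);
    h.finish()
}

// Stand-in for combine_hash: bind an element hash to a child root hash.
fn combine(a: u64, b: u64) -> u64 {
    let mut h = DefaultHasher::new();
    h.write(&a.to_be_bytes());
    h.write(&b.to_be_bytes());
    h.finish()
}

fn main() {
    let elem = toy_hash(b"tree element bytes");
    // Step 1: the child Merk's root changes when a value changes.
    let child_old = toy_hash(b"name=Alice");
    let child_new = toy_hash(b"name=ALICE");
    // Steps 2-3: each parent's value_hash binds the child root, so the
    // change propagates upward into a new root hash.
    assert_ne!(combine(elem, child_old), combine(elem, child_new));
}
```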
Multi-Level Grove Structure Example
Here's a complete example showing how Dash Platform structures its state:
graph TD
ROOT["GroveDB Root"]
ROOT --> contracts["[01] "data_contracts"<br/>Tree"]
ROOT --> identities["[02] "identities"<br/>Tree"]
ROOT --> balances["[03] "balances"<br/>SumTree"]
ROOT --> pools["[04] "pools"<br/>Tree"]
contracts --> c1["contract_id_1<br/>Tree"]
contracts --> c2["contract_id_2<br/>Tree"]
c1 --> docs[""documents"<br/>Tree"]
docs --> profile[""profile"<br/>Tree"]
docs --> note[""note"<br/>Tree"]
profile --> d1["doc_id_1<br/>Item"]
profile --> d2["doc_id_2<br/>Item"]
note --> d3["doc_id_3<br/>Item"]
identities --> id1["identity_id_1<br/>Tree"]
identities --> id2["identity_id_2<br/>Tree"]
id1 --> keys[""keys"<br/>Tree"]
id1 --> rev[""revision"<br/>Item(u64)"]
keys --> k1["key_id_1<br/>Item(pubkey)"]
keys --> k2["key_id_2<br/>Item(pubkey)"]
balances --> b1["identity_id_1<br/>SumItem(balance)"]
balances --> b2["identity_id_2<br/>SumItem(balance)"]
style ROOT fill:#2c3e50,stroke:#2c3e50,color:#fff
style contracts fill:#d4e6f1,stroke:#2980b9
style identities fill:#d5f5e3,stroke:#27ae60
style balances fill:#fef9e7,stroke:#f39c12
style pools fill:#e8daef,stroke:#8e44ad
Each box is a separate Merk tree, authenticated all the way up to a single root hash that validators agree on.
The Reference System
Why References Exist
In a hierarchical database, you often need the same data accessible from multiple paths. For example, documents might be stored under their contract but also queryable by owner identity. References are GroveDB's answer — they are pointers from one location to another, similar to symbolic links in a filesystem.
graph LR
subgraph primary["Primary Storage"]
item["contracts/C1/docs/D1<br/><b>Item</b>(data)"]
end
subgraph secondary["Secondary Index"]
ref["identities/alice/docs/D1<br/><b>Reference</b>"]
end
ref -->|"points to"| item
style primary fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style secondary fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style ref fill:#fef9e7,stroke:#f39c12,stroke-width:2px
Key properties:
- References are authenticated — the reference's value_hash includes both the reference itself and the referenced element
- References can be chained — a reference can point to another reference
- Cycle detection prevents infinite loops
- A configurable hop limit prevents resource exhaustion
The Seven Reference Types
// grovedb-element/src/reference_path/mod.rs
pub enum ReferencePathType {
    AbsolutePathReference(Vec<Vec<u8>>),
    UpstreamRootHeightReference(u8, Vec<Vec<u8>>),
    UpstreamRootHeightWithParentPathAdditionReference(u8, Vec<Vec<u8>>),
    UpstreamFromElementHeightReference(u8, Vec<Vec<u8>>),
    CousinReference(Vec<u8>),
    RemovedCousinReference(Vec<Vec<u8>>),
    SiblingReference(Vec<u8>),
}
Let's walk through each with diagrams.
AbsolutePathReference
The simplest type. Stores the full path to the target:
graph TD
subgraph root["Root Merk — path: []"]
A["A<br/>Tree"]
P["P<br/>Tree"]
end
subgraph merkA["Merk [A]"]
B["B<br/>Tree"]
end
subgraph merkP["Merk [P]"]
Q["Q<br/>Tree"]
end
subgraph merkAB["Merk [A, B]"]
X["X = Reference<br/>AbsolutePathRef([P, Q, R])"]
end
subgraph merkPQ["Merk [P, Q]"]
R["R = Item<br/>"target""]
end
A -.-> B
P -.-> Q
B -.-> X
Q -.-> R
X ==>|"resolves to [P, Q, R]"| R
style merkAB fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style merkPQ fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style X fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style R fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
X stores the full absolute path
[P, Q, R]. No matter where X is located, it always resolves to the same target.
UpstreamRootHeightReference
Keeps the first N segments of the current path, then appends a new path:
graph TD
subgraph resolve["Resolution: keep first 2 segments + append [P, Q]"]
direction LR
curr["current: [A, B, C, D]"] --> keep["keep first 2: [A, B]"] --> append["append: [A, B, <b>P, Q</b>]"]
end
subgraph grove["Grove Hierarchy"]
gA["A (height 0)"]
gB["B (height 1)"]
gC["C (height 2)"]
gD["D (height 3)"]
gX["X = Reference<br/>UpstreamRootHeight(2, [P,Q])"]
gP["P (height 2)"]
gQ["Q (height 3) — target"]
gA --> gB
gB --> gC
gB -->|"keep first 2 → [A,B]<br/>then descend [P,Q]"| gP
gC --> gD
gD -.-> gX
gP --> gQ
end
gX ==>|"resolves to"| gQ
style resolve fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style gX fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style gQ fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
UpstreamRootHeightWithParentPathAdditionReference
Like UpstreamRootHeight, but re-appends the last segment of the current path:
Reference at path [A, B, C, D, E] key=X
UpstreamRootHeightWithParentPathAdditionReference(2, [P, Q])
Current path: [A, B, C, D, E]
Keep first 2: [A, B]
Append [P, Q]: [A, B, P, Q]
Re-append last: [A, B, P, Q, E] ← "E" from original path added back
Useful for: indexes where the parent key should be preserved
UpstreamFromElementHeightReference
Discards the last N segments, then appends:
Reference at path [A, B, C, D] key=X
UpstreamFromElementHeightReference(1, [P, Q])
Current path: [A, B, C, D]
Discard last 1: [A, B, C]
Append [P, Q]: [A, B, C, P, Q]
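The two upstream rules can be sketched as path transformations (a toy resolver over string segments, not the GroveDB API):

```rust
// UpstreamRootHeightReference(n, append):
// keep the first n segments of the current path, then append.
fn upstream_root_height(current: &[&str], n: usize, append: &[&str]) -> Vec<String> {
    current[..n].iter().chain(append.iter()).map(|s| s.to_string()).collect()
}

// UpstreamFromElementHeightReference(n, append):
// discard the last n segments of the current path, then append.
fn upstream_from_element_height(current: &[&str], n: usize, append: &[&str]) -> Vec<String> {
    current[..current.len() - n].iter().chain(append.iter()).map(|s| s.to_string()).collect()
}

fn main() {
    let current = ["A", "B", "C", "D"];
    // Keep first 2, append [P, Q]:
    assert_eq!(upstream_root_height(&current, 2, &["P", "Q"]), ["A", "B", "P", "Q"]);
    // Discard last 1, append [P, Q]:
    assert_eq!(
        upstream_from_element_height(&current, 1, &["P", "Q"]),
        ["A", "B", "C", "P", "Q"]
    );
}
```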
CousinReference
Replaces only the immediate parent with a new key:
graph TD
subgraph resolve["Resolution: pop last 2, push cousin C, push key X"]
direction LR
r1["path: [A, B, M, D]"] --> r2["pop last 2: [A, B]"] --> r3["push C: [A, B, C]"] --> r4["push key X: [A, B, C, X]"]
end
subgraph merkAB["Merk [A, B]"]
M["M<br/>Tree"]
C["C<br/>Tree<br/>(cousin of M)"]
end
subgraph merkABM["Merk [A, B, M]"]
D["D<br/>Tree"]
end
subgraph merkABMD["Merk [A, B, M, D]"]
Xref["X = Reference<br/>CousinReference(C)"]
end
subgraph merkABC["Merk [A, B, C]"]
Xtarget["X = Item<br/>(target)"]
end
M -.-> D
D -.-> Xref
C -.-> Xtarget
Xref ==>|"resolves to [A, B, C, X]"| Xtarget
style resolve fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style Xref fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style Xtarget fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style M fill:#fadbd8,stroke:#e74c3c
style C fill:#d5f5e3,stroke:#27ae60
The "cousin" is a sibling subtree of the reference's grandparent. The reference navigates up two levels, then descends into the cousin subtree.
RemovedCousinReference
Like CousinReference but replaces the parent with a multi-segment path:
Reference at path [A, B, C, D] key=X
RemovedCousinReference([M, N])
Current path: [A, B, C, D]
Pop parent C: [A, B]
Append [M, N]: [A, B, M, N]
Push key X: [A, B, M, N, X]
SiblingReference
The simplest relative reference — just changes the key within the same parent:
graph TD
subgraph merk["Merk [A, B, C] — same tree, same path"]
M_sib["M"]
X_sib["X = Reference<br/>SiblingRef(Y)"]
Y_sib["Y = Item<br/>(target)"]
Z_sib["Z = Item"]
M_sib --> X_sib
M_sib --> Y_sib
end
X_sib ==>|"resolves to [A, B, C, Y]"| Y_sib
style merk fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style X_sib fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style Y_sib fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
The simplest reference type. X and Y are siblings in the same Merk tree — the resolution just changes the key while keeping the same path.
Reference Following and the Hop Limit
When GroveDB encounters a Reference element, it must follow it to find the actual value. Since references can point to other references, this involves a loop:
// grovedb/src/reference_path.rs
pub const MAX_REFERENCE_HOPS: usize = 10;

pub fn follow_reference(...) -> CostResult<ResolvedReference, Error> {
    let mut hops_left = MAX_REFERENCE_HOPS;
    let mut visited = HashSet::new();
    while hops_left > 0 {
        // Resolve reference path to absolute path
        let target_path = current_ref.absolute_qualified_path(...);
        // Check for cycles
        if !visited.insert(target_path.clone()) {
            return Err(Error::CyclicReference);
        }
        // Fetch element at target
        let element = Element::get(target_path);
        match element {
            Element::Reference(next_ref, ..) => {
                // Still a reference — keep following
                current_ref = next_ref;
                hops_left -= 1;
            }
            other => {
                // Found the actual element!
                return Ok(ResolvedReference { element: other, ... });
            }
        }
    }
    Err(Error::ReferenceLimit) // Exceeded 10 hops
}
Cycle Detection
The visited HashSet tracks all paths we've seen. If we encounter a path we've
already visited, we have a cycle:
graph LR
A["A<br/>Reference"] -->|"step 1"| B["B<br/>Reference"]
B -->|"step 2"| C["C<br/>Reference"]
C -->|"step 3"| A
style A fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style B fill:#fef9e7,stroke:#f39c12
style C fill:#fef9e7,stroke:#f39c12
Cycle detection trace:

| Step | Follow | visited set | Result |
|---|---|---|---|
| 1 | Start at A | { A } | A is Ref → follow |
| 2 | A → B | { A, B } | B is Ref → follow |
| 3 | B → C | { A, B, C } | C is Ref → follow |
| 4 | C → A | — | A already in visited! Error::CyclicReference |

Without cycle detection, this would loop forever. MAX_REFERENCE_HOPS = 10 also caps traversal depth for long chains.
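The visited-set check can be sketched with a toy follow loop, where a map of path → next path stands in for reference resolution:

```rust
use std::collections::{HashMap, HashSet};

// Follow "references" through a next-hop map; a revisited node is a cycle.
fn has_cycle(edges: &[(&str, &str)], start: &str) -> bool {
    let next: HashMap<&str, &str> = edges.iter().copied().collect();
    let mut visited = HashSet::new();
    let mut current = start;
    while let Some(&target) = next.get(current) {
        if !visited.insert(current) {
            return true; // same path seen twice → the CyclicReference case
        }
        current = target;
    }
    false // reached a node with no outgoing reference (an actual element)
}

fn main() {
    // A → B → C → A loops; A → B → C terminates.
    assert!(has_cycle(&[("A", "B"), ("B", "C"), ("C", "A")], "A"));
    assert!(!has_cycle(&[("A", "B"), ("B", "C")], "A"));
}
```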
References in Merk — Combined Value Hashes
When a Reference is stored in a Merk tree, its value_hash must authenticate
both the reference structure and the referenced data:
// merk/src/tree/kv.rs
pub fn update_hashes_using_reference_value_hash(
    mut self,
    reference_value_hash: CryptoHash,
) -> CostContext<Self> {
    // Hash the reference element's own bytes
    let actual_value_hash = value_hash(self.value_as_slice());
    // Combine: H(reference_bytes) ⊕ H(referenced_data)
    let combined = combine_hash(&actual_value_hash, &reference_value_hash);
    self.value_hash = combined;
    self.hash = kv_digest_to_kv_hash(self.key(), self.value_hash());
    // ...
}
This means changing either the reference itself OR the data it points to will change the root hash — both are cryptographically bound.
The Storage Layer
RocksDB with OptimisticTransactionDB
GroveDB uses RocksDB as its storage backend, specifically the
OptimisticTransactionDB variant that supports transactions:
// storage/src/rocksdb_storage/storage.rs
pub(crate) type Db = OptimisticTransactionDB;
pub(crate) type Tx<'db> = Transaction<'db, Db>;

pub struct RocksDbStorage {
    db: OptimisticTransactionDB,
}
Optimistic transactions work by assuming there won't be conflicts. If two transactions modify the same data, the second one to commit will fail and can be retried. This is more efficient than pessimistic locking for workloads where conflicts are rare.
RocksDB options are tuned for GroveDB's workload:
lazy_static! {
    static ref DEFAULT_OPTS: rocksdb::Options = {
        let mut opts = rocksdb::Options::default();
        opts.create_if_missing(true);
        opts.increase_parallelism(num_cpus::get() as i32);
        opts.set_allow_mmap_writes(true);
        opts.set_allow_mmap_reads(true);
        opts.create_missing_column_families(true);
        opts.set_atomic_flush(true);
        opts
    };
}
Four Column Families
RocksDB column families act like separate key-value namespaces within a single database. GroveDB uses four:
graph TD
subgraph rocksdb["RocksDB (OptimisticTransactionDB)"]
subgraph cf_main["Column Family: "default" (main)"]
main_desc["All Merk tree nodes<br/>Key: [prefix: 32B][node_key]<br/>Value: encoded TreeNode"]
end
subgraph cf_aux["Column Family: "aux""]
aux_desc["Per-subtree auxiliary data<br/>Key: [prefix: 32B][aux_key]<br/>Used for application-defined metadata"]
end
subgraph cf_roots["Column Family: "roots""]
roots_desc["Subtree existence + root keys<br/>Key: [prefix: 32B]<br/>Value: root_key bytes"]
end
subgraph cf_meta["Column Family: "meta""]
meta_desc["Global metadata<br/>NOT prefixed — shared across entire DB<br/>Grove version, feature flags, etc."]
end
end
style rocksdb fill:#f8f9fa,stroke:#2c3e50,stroke-width:3px
style cf_main fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style cf_aux fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style cf_roots fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style cf_meta fill:#e8daef,stroke:#8e44ad,stroke-width:2px
Example: Key `[ab3fc2...][6e616d65]` in the "default" CF maps to `TreeNode{key: "name", val: "Al"}`, where `ab3fc2...` is `Blake3(path)` and `6e616d65` is `"name"` in bytes.
```rust
pub(crate) const AUX_CF_NAME: &str = "aux";
pub(crate) const ROOTS_CF_NAME: &str = "roots";
pub(crate) const META_CF_NAME: &str = "meta";
// Main data uses the default column family
```
Prefixed Storage Contexts
Each subtree gets its own prefixed storage context — a wrapper that automatically prepends the 32-byte Blake3 prefix to all keys:
Subtree path: ["identities", "alice"]
Prefix: Blake3(path) = [0xab, 0x3f, 0xc2, ...] (32 bytes)
When subtree stores key "name" with value "Alice":
RocksDB key: [0xab 0x3f 0xc2 ... (32 bytes) | 0x6e 0x61 0x6d 0x65]
\_________prefix________/ \_____"name"_____/
RocksDB value: [encoded TreeNode with value "Alice"]
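The key layout above can be modeled in a few lines — here with a stand-in prefix derivation in place of `Blake3(path)`; the `toy_prefix` and `prefixed_key` helpers are illustrative, not GroveDB's:

```rust
// Sketch of prefixed key construction: a 32-byte prefix derived from the
// subtree path is prepended to every key the subtree stores, giving each
// subtree its own namespace in the shared column family.
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Derive a 32-byte prefix from a subtree path (stand-in for Blake3(path)).
fn toy_prefix(path: &[&[u8]]) -> [u8; 32] {
    let mut out = [0u8; 32];
    for (i, chunk) in out.chunks_mut(8).enumerate() {
        let mut h = DefaultHasher::new();
        (i as u64).hash(&mut h);
        for segment in path {
            segment.hash(&mut h);
        }
        chunk.copy_from_slice(&h.finish().to_be_bytes());
    }
    out
}

// RocksDB key = [32-byte prefix | subtree-local key]
fn prefixed_key(prefix: &[u8; 32], key: &[u8]) -> Vec<u8> {
    let mut out = Vec::with_capacity(32 + key.len());
    out.extend_from_slice(prefix);
    out.extend_from_slice(key);
    out
}
```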
The context types:
Without transaction:
PrefixedRocksDbImmediateStorageContext
└── Reads/writes directly to DB with prefix
With transaction:
PrefixedRocksDbTransactionContext
└── Reads/writes through a Transaction with prefix
Both implement the StorageContext trait:
```rust
pub trait StorageContext<'db> {
    fn get(&self, key: &[u8]) -> CostResult<Option<Vec<u8>>, Error>;
    fn get_aux(&self, key: &[u8]) -> CostResult<Option<Vec<u8>>, Error>;
    fn get_root(&self, key: &[u8]) -> CostResult<Option<Vec<u8>>, Error>;
    fn get_meta(&self, key: &[u8]) -> CostResult<Option<Vec<u8>>, Error>;
    fn put(&self, key: &[u8], value: &[u8], ...) -> CostResult<(), Error>;
    fn put_aux(&self, key: &[u8], value: &[u8], ...) -> CostResult<(), Error>;
    fn put_root(&self, key: &[u8], value: &[u8], ...) -> CostResult<(), Error>;
    fn put_meta(&self, key: &[u8], value: &[u8], ...) -> CostResult<(), Error>;
    fn delete(&self, key: &[u8], ...) -> CostResult<(), Error>;
    // ...
}
```
Write Batches and Transaction Model
For performance, GroveDB accumulates writes into batches:
graph LR
subgraph individual["Individual Writes (3 I/O ops)"]
i1["put("a","1")"] --> d1["DISK fsync"]
i2["put("b","2")"] --> d2["DISK fsync"]
i3["put("c","3")"] --> d3["DISK fsync"]
end
subgraph batched["Batched Writes (1 atomic I/O op)"]
b1["put("a","1")"] --> buf["Memory<br/>Buffer"]
b2["put("b","2")"] --> buf
b3["put("c","3")"] --> buf
buf --> d4["DISK<br/>fsync once"]
end
style individual fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
style batched fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style buf fill:#fef9e7,stroke:#f39c12
3 disk syncs vs 1 disk sync = ~3x faster. Batched writes are also atomic (all-or-nothing).
The StorageBatch accumulates operations that are flushed together:
```rust
pub struct StorageBatch {
    operations: RefCell<Vec<AbstractBatchOperation>>,
}
```
The Critical commit_local() Pattern
When using transactions, there's a critical pattern that must be followed. Writes within a transaction are buffered — they're not visible until committed:
```rust
// CORRECT pattern:
{
    let tx = db.start_transaction();
    let storage_ctx = db.get_transactional_storage_context(path, &tx);
    storage_ctx.put(key, value); // Writes to transaction buffer
    drop(storage_ctx);           // Release borrow on tx
    tx.commit_local();           // Flush transaction to DB
}

// INCORRECT — data is lost:
{
    let tx = db.start_transaction();
    let storage_ctx = db.get_transactional_storage_context(path, &tx);
    storage_ctx.put(key, value); // Writes to transaction buffer
    // tx drops here without commit_local()!
    // All writes are ROLLED BACK!
}
```
This is especially important because the storage_ctx borrows the transaction.
You must drop(storage_ctx) before you can call tx.commit_local().
The Proof System
GroveDB's proof system lets any party verify the correctness of query results without having the full database. A proof is a compact representation of the relevant tree structure that allows reconstruction of the root hash.
Stack-Based Proof Operations
Proofs are encoded as a sequence of operations that reconstruct a partial tree using a stack machine:
```rust
// merk/src/proofs/mod.rs
pub enum Op {
    Push(Node),         // Push a node onto the stack (ascending key order)
    PushInverted(Node), // Push a node (descending key order)
    Parent,             // Pop parent, pop child → attach child as LEFT of parent
    Child,              // Pop child, pop parent → attach child as RIGHT of parent
    ParentInverted,     // Pop parent, pop child → attach child as RIGHT of parent
    ChildInverted,      // Pop child, pop parent → attach child as LEFT of parent
}
```
Execution uses a stack:
Proof ops: Push(B), Push(A), Parent, Push(C), Child
| Step | Operation | Stack (top→right) | Action |
|---|---|---|---|
| 1 | Push(B) | [ B ] | Push B onto stack |
| 2 | Push(A) | [ B , A ] | Push A onto stack |
| 3 | Parent | [ A{left:B} ] | Pop A (parent), pop B (child), B → LEFT of A |
| 4 | Push(C) | [ A{left:B} , C ] | Push C onto stack |
| 5 | Child | [ A{left:B, right:C} ] | Pop C (child), pop A (parent), C → RIGHT of A |
Final result — one tree on the stack:
graph TD
A_proof["A<br/>(root)"]
B_proof["B<br/>(left)"]
C_proof["C<br/>(right)"]
A_proof --> B_proof
A_proof --> C_proof
style A_proof fill:#fef9e7,stroke:#f39c12,stroke-width:2px
The verifier computes `node_hash(A) = Blake3(kv_hash_A || node_hash_B || node_hash_C)` and checks that it matches the expected root hash.
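The walkthrough above can be condensed into a toy stack machine — a sketch of the idea only, since Merk's real `execute` (shown next) also tracks hashes and costs:

```rust
// Minimal stack machine for proof ops. Parent pops the parent first
// (it is on top), then the child; Child pops the child first.
#[derive(Debug, PartialEq)]
struct ProofTree {
    key: &'static str,
    left: Option<Box<ProofTree>>,
    right: Option<Box<ProofTree>>,
}

enum Op {
    Push(&'static str),
    Parent, // pop parent, pop child → child becomes LEFT of parent
    Child,  // pop child, pop parent → child becomes RIGHT of parent
}

fn execute_ops(ops: &[Op]) -> ProofTree {
    let mut stack: Vec<ProofTree> = Vec::new();
    for op in ops {
        match op {
            Op::Push(key) => stack.push(ProofTree { key: *key, left: None, right: None }),
            Op::Parent => {
                let mut parent = stack.pop().expect("parent");
                let child = stack.pop().expect("child");
                parent.left = Some(Box::new(child));
                stack.push(parent);
            }
            Op::Child => {
                let child = stack.pop().expect("child");
                let mut parent = stack.pop().expect("parent");
                parent.right = Some(Box::new(child));
                stack.push(parent);
            }
        }
    }
    assert_eq!(stack.len(), 1, "a valid proof leaves exactly one tree");
    stack.pop().unwrap()
}
```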
This is the execute function (merk/src/proofs/tree.rs):
```rust
pub fn execute<I, F>(ops: I, collapse: bool, mut visit_node: F) -> CostResult<Tree, Error>
where
    I: IntoIterator<Item = Result<Op, Error>>,
    F: FnMut(&Node) -> Result<(), Error>,
{
    let mut stack: Vec<Tree> = Vec::with_capacity(32);

    for op in ops {
        match op? {
            Op::Parent => {
                let (mut parent, child) = (try_pop(&mut stack), try_pop(&mut stack));
                parent.left = Some(Child { tree: Box::new(child), hash: child.hash() });
                stack.push(parent);
            }
            Op::Child => {
                let (child, mut parent) = (try_pop(&mut stack), try_pop(&mut stack));
                parent.right = Some(Child { tree: Box::new(child), hash: child.hash() });
                stack.push(parent);
            }
            Op::Push(node) => {
                visit_node(&node)?;
                stack.push(Tree::from(node));
            }
            // ... Inverted variants swap left/right
        }
    }
    // Final item on stack is the root
}
```
Node Types in Proofs
Each Push carries a Node that contains just enough information for
verification:
```rust
pub enum Node {
    // Minimum info — just the hash. Used for distant siblings.
    Hash(CryptoHash),
    // KV hash for nodes on the path but not queried.
    KVHash(CryptoHash),
    // Full key-value for queried items.
    KV(Vec<u8>, Vec<u8>),
    // Key, value, and pre-computed value_hash.
    // Used for subtrees where value_hash = combine_hash(...)
    KVValueHash(Vec<u8>, Vec<u8>, CryptoHash),
    // KV with feature type — for ProvableCountTree or chunk restoration.
    KVValueHashFeatureType(Vec<u8>, Vec<u8>, CryptoHash, TreeFeatureType),
    // Reference: key, dereferenced value, hash of reference element.
    KVRefValueHash(Vec<u8>, Vec<u8>, CryptoHash),
    // For items in ProvableCountTree.
    KVCount(Vec<u8>, Vec<u8>, u64),
    // KV hash + count for non-queried ProvableCountTree nodes.
    KVHashCount(CryptoHash, u64),
    // Reference in ProvableCountTree.
    KVRefValueHashCount(Vec<u8>, Vec<u8>, CryptoHash, u64),
    // For boundary/absence proofs in ProvableCountTree.
    KVDigestCount(Vec<u8>, CryptoHash, u64),
    // Key + value_hash for absence proofs (regular trees).
    KVDigest(Vec<u8>, CryptoHash),
}
```
The choice of Node type determines what information the verifier needs:
Query: "Get value for key 'bob'"
graph TD
dave["dave<br/><b>KVHash</b><br/>(on path, not queried)"]
bob["bob<br/><b>KVValueHash</b><br/>key + value + value_hash<br/><i>THE QUERIED NODE</i>"]
frank["frank<br/><b>Hash</b><br/>(distant sibling,<br/>32-byte hash only)"]
alice["alice<br/><b>Hash</b><br/>(32-byte hash only)"]
carol["carol<br/><b>Hash</b><br/>(32-byte hash only)"]
dave --> bob
dave --> frank
bob --> alice
bob --> carol
style bob fill:#d5f5e3,stroke:#27ae60,stroke-width:3px
style dave fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style frank fill:#e8e8e8,stroke:#999
style alice fill:#e8e8e8,stroke:#999
style carol fill:#e8e8e8,stroke:#999
Green = queried node (full data revealed). Yellow = on the path (kv_hash only). Gray = siblings (just 32-byte node hashes).
Encoded as proof ops:
| # | Op | Effect |
|---|---|---|
| 1 | Push(Hash(alice_node_hash)) | Push alice hash |
| 2 | Push(KVValueHash("bob", value, value_hash)) | Push bob with full data |
| 3 | Parent | alice becomes left child of bob |
| 4 | Push(Hash(carol_node_hash)) | Push carol hash |
| 5 | Child | carol becomes right child of bob |
| 6 | Push(KVHash(dave_kv_hash)) | Push dave kv_hash |
| 7 | Parent | bob subtree becomes left of dave |
| 8 | Push(Hash(frank_node_hash)) | Push frank hash |
| 9 | Child | frank becomes right child of dave |
Multi-Layer Proof Generation
Since GroveDB is a tree of trees, proofs span multiple layers. Each layer proves the relevant portion of one Merk tree, and the layers are connected by the combined value_hash mechanism:
Query: Get ["identities", "alice", "name"]
graph TD
subgraph layer0["LAYER 0: Root tree proof"]
L0["Proves "identities" exists<br/>Node: KVValueHash<br/>value_hash = combine_hash(<br/> H(tree_element_bytes),<br/> identities_root_hash<br/>)"]
end
subgraph layer1["LAYER 1: Identities tree proof"]
L1["Proves "alice" exists<br/>Node: KVValueHash<br/>value_hash = combine_hash(<br/> H(tree_element_bytes),<br/> alice_root_hash<br/>)"]
end
subgraph layer2["LAYER 2: Alice subtree proof"]
L2["Proves "name" = "Alice"<br/>Node: KV (full key + value)<br/>Result: <b>"Alice"</b>"]
end
state_root["Known State Root"] -->|"verify"| L0
L0 -->|"identities_root_hash<br/>must match"| L1
L1 -->|"alice_root_hash<br/>must match"| L2
style layer0 fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style layer1 fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style layer2 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style state_root fill:#2c3e50,stroke:#2c3e50,color:#fff
Chain of trust:
known_state_root → verify Layer 0 → verify Layer 1 → verify Layer 2 → "Alice". Each layer's reconstructed root hash must match the value_hash from the layer above.
The verifier checks each layer, confirming that:
- The layer proof reconstructs to the expected root hash
- The root hash matches the value_hash from the parent layer
- The top-level root hash matches the known state root
Proof Verification
Verification follows the proof layers bottom-up or top-down, using the execute
function to reconstruct each layer's tree. The Tree::hash() method in the proof
tree computes the hash based on the node type:
```rust
impl Tree {
    pub fn hash(&self) -> CostContext<CryptoHash> {
        match &self.node {
            Node::Hash(hash) => *hash, // Already a hash, return directly
            Node::KVHash(kv_hash) =>
                node_hash(kv_hash, &self.child_hash(true), &self.child_hash(false)),
            Node::KV(key, value) => kv_hash(key, value)
                .flat_map(|kv_hash| node_hash(&kv_hash, &left, &right)),
            Node::KVValueHash(key, _, value_hash) => kv_digest_to_kv_hash(key, value_hash)
                .flat_map(|kv_hash| node_hash(&kv_hash, &left, &right)),
            Node::KVValueHashFeatureType(key, _, value_hash, feature_type) => {
                let kv = kv_digest_to_kv_hash(key, value_hash);
                match feature_type {
                    ProvableCountedMerkNode(count) =>
                        node_hash_with_count(&kv, &left, &right, *count),
                    _ => node_hash(&kv, &left, &right),
                }
            }
            Node::KVRefValueHash(key, referenced_value, ref_element_hash) => {
                let ref_value_hash = value_hash(referenced_value);
                let combined = combine_hash(ref_element_hash, &ref_value_hash);
                let kv = kv_digest_to_kv_hash(key, &combined);
                node_hash(&kv, &left, &right)
            }
            // ... other variants
        }
    }
}
```
Absence Proofs
GroveDB can prove that a key does not exist. This uses boundary nodes — the nodes that would be adjacent to the missing key if it existed:
Prove: "charlie" does NOT exist
graph TD
dave_abs["dave<br/><b>KVDigest</b><br/>(right boundary)"]
bob_abs["bob"]
frank_abs["frank<br/>Hash"]
alice_abs["alice<br/>Hash"]
carol_abs["carol<br/><b>KVDigest</b><br/>(left boundary)"]
missing["(no right child!)<br/>"charlie" would be here"]
dave_abs --> bob_abs
dave_abs --> frank_abs
bob_abs --> alice_abs
bob_abs --> carol_abs
carol_abs -.->|"right = None"| missing
style carol_abs fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style dave_abs fill:#fadbd8,stroke:#e74c3c,stroke-width:3px
style missing fill:none,stroke:#e74c3c,stroke-dasharray:5 5
style alice_abs fill:#e8e8e8,stroke:#999
style frank_abs fill:#e8e8e8,stroke:#999
Binary search: alice < bob < carol < "charlie" < dave < frank. "charlie" would be between carol and dave. Carol's right child is `None`, proving nothing exists between carol and dave. Therefore "charlie" cannot exist in this tree.
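The boundary argument can be modeled as a pure function over the keys a proof reveals — an illustrative simplification (`absence_boundaries` is not GroveDB's API; real absence proofs reason about tree structure, not a flat sorted list):

```rust
// Given the sorted keys revealed by a proof, find the boundary pair that
// brackets `missing`. Absence holds if the boundaries are adjacent in the
// revealed ordering and the missing key is not among them.
fn absence_boundaries<'a>(revealed: &[&'a str], missing: &str) -> Option<(&'a str, &'a str)> {
    // Index of the first revealed key >= missing.
    let idx = revealed.partition_point(|k| *k < missing);
    if idx == 0 || idx == revealed.len() {
        return None; // missing key falls outside the revealed range
    }
    if revealed[idx] == missing {
        return None; // key actually exists — no absence to prove
    }
    Some((revealed[idx - 1], revealed[idx]))
}
```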
For range queries, absence proofs show that there are no keys within the queried range that were not included in the result set.
V1 Proofs — Non-Merk Trees
The V0 proof system works exclusively with Merk subtrees, descending layer by layer through the grove hierarchy. However, CommitmentTree, MmrTree, BulkAppendTree, and DenseAppendOnlyFixedSizeTree elements store their data outside a child Merk tree. They have no child Merk to descend into — their type-specific root hash flows as the Merk child hash instead.
The V1 proof format extends V0 to handle these non-Merk trees with type-specific proof structures:
```rust
/// Which proof format a layer uses.
pub enum ProofBytes {
    Merk(Vec<u8>),           // Standard Merk proof ops
    MMR(Vec<u8>),            // MMR membership proof
    BulkAppendTree(Vec<u8>), // BulkAppendTree range proof
    DenseTree(Vec<u8>),      // Dense tree inclusion proof
    CommitmentTree(Vec<u8>), // Sinsemilla root (32 bytes) + BulkAppendTree proof
}

/// One layer of a V1 proof.
pub struct LayerProof {
    pub merk_proof: ProofBytes,
    pub lower_layers: BTreeMap<Vec<u8>, LayerProof>,
}
```
V0/V1 selection rule: If every layer in the proof is a standard Merk tree, `prove_query` produces a `GroveDBProof::V0` (backward compatible). If any layer involves a CommitmentTree, MmrTree, BulkAppendTree, or DenseAppendOnlyFixedSizeTree, it produces a `GroveDBProof::V1`.
How Non-Merk Tree Proofs Bind to the Root Hash
The parent Merk tree proves the element's serialized bytes via a standard Merk
proof node (KVValueHash). The type-specific root (e.g., mmr_root or
state_root) flows as the Merk child hash — it is NOT embedded in the
element bytes:
combined_value_hash = combine_hash(
Blake3(varint(len) || element_bytes), ← contains count, height, etc.
type_specific_root ← mmr_root / state_root / dense_root
)
The type-specific proof then proves that the queried data is consistent with the type-specific root that was used as the child hash.
MMR Tree Proofs
An MMR proof demonstrates that specific leaves exist at known positions within the MMR, and that the MMR's root hash matches the child hash stored in the parent Merk node:
```rust
pub struct MmrProof {
    pub mmr_size: u64,
    pub proof: MerkleProof, // ckb_merkle_mountain_range::MerkleProof
    pub leaves: Vec<MmrProofLeaf>,
}

pub struct MmrProofLeaf {
    pub position: u64,   // MMR position
    pub leaf_index: u64, // Logical leaf index
    pub hash: [u8; 32],  // Leaf hash
    pub value: Vec<u8>,  // Leaf value bytes
}
```
graph TD
subgraph parent_merk["Parent Merk (V0 layer)"]
elem[""my_mmr"<br/><b>KVValueHash</b><br/>child hash = mmr_root"]
end
subgraph mmr_proof["MMR Proof (V1 layer)"]
peak1["Peak 1<br/>hash"]
peak2["Peak 2<br/>hash"]
leaf_a["Leaf 5<br/><b>proved</b><br/>value = 0xABCD"]
sibling["Sibling<br/>hash"]
peak2 --> leaf_a
peak2 --> sibling
end
elem -->|"mmr_root must match<br/>MMR root from peaks"| mmr_proof
style parent_merk fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style mmr_proof fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style leaf_a fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
Query keys are positions: Query items encode positions as big-endian u64
bytes (which preserves sort order). QueryItem::RangeInclusive with BE-encoded
start/end positions selects a contiguous range of MMR leaves.
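The big-endian encoding trick is small enough to verify directly — byte-wise ordering of `to_be_bytes` output matches numeric ordering, which little-endian encoding would break:

```rust
// Encode an MMR position as a query key. Big-endian bytes sort the same
// way as the underlying u64 values, so byte-range query items select a
// contiguous numeric range of positions.
fn position_key(pos: u64) -> [u8; 8] {
    pos.to_be_bytes()
}
```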
Verification:
- Reconstruct `MmrNode` leaves from the proof
- Verify the ckb `MerkleProof` against the expected MMR root taken from the parent Merk child hash
- Cross-validate that `proof.mmr_size` matches the element's stored size
- Return the proved leaf values
BulkAppendTree Proofs
BulkAppendTree proofs are more complex because data lives in two places: sealed chunk blobs and the in-progress buffer. A range proof must return:
- Full chunk blobs for any completed chunk overlapping the query range
- Individual buffer entries for positions still in the buffer
```rust
pub struct BulkAppendTreeProof {
    pub chunk_power: u8,
    pub total_count: u64,
    pub chunk_blobs: Vec<(u64, Vec<u8>)>,       // (chunk_index, blob_bytes)
    pub chunk_mmr_size: u64,
    pub chunk_mmr_proof_items: Vec<[u8; 32]>,   // MMR sibling hashes
    pub chunk_mmr_leaves: Vec<(u64, [u8; 32])>, // (mmr_pos, dense_merkle_root)
    pub buffer_entries: Vec<Vec<u8>>,           // ALL current buffer (dense tree) entries
    pub chunk_mmr_root: [u8; 32],
}
```
graph TD
subgraph verify["Verification Steps"]
step1["1. For each chunk blob:<br/>compute dense_merkle_root<br/>verify matches chunk_mmr_leaves"]
step2["2. Verify chunk MMR proof<br/>against chunk_mmr_root"]
step3["3. Recompute dense_tree_root<br/>from ALL buffer entries<br/>using dense Merkle tree"]
step4["4. Verify state_root =<br/>blake3("bulk_state" ||<br/>chunk_mmr_root ||<br/>dense_tree_root)"]
step5["5. Extract entries in<br/>queried position range"]
step1 --> step2 --> step3 --> step4 --> step5
end
style verify fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style step4 fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
Why include ALL buffer entries? The buffer is a dense Merkle tree whose root hash commits to every entry. To verify the `dense_tree_root`, the verifier must rebuild the tree from all entries. Since the buffer is bounded by `capacity` entries (at most 65,535), this is acceptable.
Limit accounting: Each individual value (within a chunk or the buffer) counts toward the query limit — not each chunk blob as a whole. If a query has `limit: 100` and a chunk contains 1024 entries of which 500 overlap the queried range, all 500 overlapping entries are charged against the limit, not one unit for the blob.
DenseAppendOnlyFixedSizeTree Proofs
A dense tree proof demonstrates that specific positions hold specific values,
authenticated against the tree's root hash (which flows as the Merk child hash).
All nodes use blake3(H(value) || H(left) || H(right)), so ancestor nodes on the
auth path only need their 32-byte value hash — not the full value.
```rust
pub struct DenseTreeProof {
    pub entries: Vec<(u16, Vec<u8>)>,            // proved (position, value)
    pub node_value_hashes: Vec<(u16, [u8; 32])>, // ancestor value hashes on auth path
    pub node_hashes: Vec<(u16, [u8; 32])>,       // precomputed sibling subtree hashes
}
```
`height` and `count` come from the parent Element (authenticated by the Merk hierarchy), not from the proof itself.
graph TD
subgraph parent_merk["Parent Merk (V0 layer)"]
elem[""my_dense"<br/><b>KVValueHash</b><br/>child hash = root_hash"]
end
subgraph dense_proof["Dense Tree Proof (V1 layer)"]
root["Position 0<br/>node_value_hashes<br/>H(value[0])"]
node1["Position 1<br/>node_value_hashes<br/>H(value[1])"]
hash2["Position 2<br/>node_hashes<br/>H(subtree)"]
hash3["Position 3<br/>node_hashes<br/>H(node)"]
leaf4["Position 4<br/><b>entries</b><br/>value[4] (proved)"]
root --> node1
root --> hash2
node1 --> hash3
node1 --> leaf4
end
elem -->|"root_hash must match<br/>recomputed H(0)"| dense_proof
style parent_merk fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style dense_proof fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style leaf4 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
Verification is a pure function requiring no storage:
- Build lookup maps from `entries`, `node_value_hashes`, and `node_hashes`
- Recursively recompute the root hash from position 0:
  - Position has a precomputed hash in `node_hashes` → use it directly
  - Position has a value in `entries` → `blake3(blake3(value) || H(left) || H(right))`
  - Position has a hash in `node_value_hashes` → `blake3(hash || H(left) || H(right))`
  - Position `>= count` or `>= capacity` → `[0u8; 32]`
- Compare the computed root with the expected root hash from the parent element
- Return the proved entries on success
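The recursive recomputation can be sketched over a heap-layout value array (children of position `i` at `2i+1` and `2i+2`), with `DefaultHasher` standing in for Blake3 and `u64` digests for 32-byte hashes — `dense_root` is an illustrative name, not GroveDB's verifier:

```rust
// Toy dense-tree root: every node hashes H(value) together with the
// hashes of its left and right subtrees; positions past the end of the
// value array contribute a zero hash (stand-in for [0u8; 32]).
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

fn toy_hash(parts: &[&[u8]]) -> u64 {
    let mut h = DefaultHasher::new();
    for p in parts {
        p.hash(&mut h);
    }
    h.finish()
}

fn dense_root(values: &[&[u8]], pos: u64) -> u64 {
    if pos >= values.len() as u64 {
        return 0; // position >= count → empty hash
    }
    let value_hash = toy_hash(&[values[pos as usize]]);
    let left = dense_root(values, 2 * pos + 1);
    let right = dense_root(values, 2 * pos + 2);
    // node = H(H(value) || H(left) || H(right))
    toy_hash(&[&value_hash.to_be_bytes(), &left.to_be_bytes(), &right.to_be_bytes()])
}
```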
Multi-position proofs merge overlapping auth paths: shared ancestors and their values appear only once, making them more compact than independent proofs.
The Query System
PathQuery Structure
GroveDB queries use the PathQuery type, which combines a path (where to look)
with a query (what to select):
```rust
pub struct PathQuery {
    pub path: Vec<Vec<u8>>, // Starting path in the grove
    pub query: SizedQuery,  // What to select
}

pub struct SizedQuery {
    pub query: Query,        // The selection criteria
    pub limit: Option<u16>,  // Maximum number of results
    pub offset: Option<u16>, // Skip first N results
}
```
The Query Type
```rust
pub struct Query {
    pub items: Vec<QueryItem>, // What to match
    pub default_subquery_branch: SubqueryBranch,
    pub conditional_subquery_branches: Option<IndexMap<QueryItem, SubqueryBranch>>,
    pub left_to_right: bool,               // Iteration direction
    pub add_parent_tree_on_subquery: bool, // Include parent tree element in results (v2)
}
```
`add_parent_tree_on_subquery` (v2): When `true`, the parent tree element (e.g., a CountTree or SumTree) is included in query results alongside its children's values. This lets you retrieve both aggregate values and individual elements in one query.
QueryItems — What to Select
Each QueryItem specifies a key or range to match:
```rust
pub enum QueryItem {
    Key(Vec<u8>),                                   // Exact key match
    Range(Range<Vec<u8>>),                          // Exclusive range [start..end)
    RangeInclusive(RangeInclusive<Vec<u8>>),        // Inclusive range [start..=end]
    RangeFull(RangeFull),                           // All keys
    RangeFrom(RangeFrom<Vec<u8>>),                  // [start..)
    RangeTo(RangeTo<Vec<u8>>),                      // [..end)
    RangeToInclusive(RangeToInclusive<Vec<u8>>),    // [..=end]
    RangeAfter(RangeFrom<Vec<u8>>),                 // (start..) exclusive start
    RangeAfterTo(Range<Vec<u8>>),                   // (start..end) exclusive both
    RangeAfterToInclusive(RangeInclusive<Vec<u8>>), // (start..=end]
}
```
Example queries:
Merk tree (sorted): alice bob carol dave eve frank
| Query | Selection | Result |
|---|---|---|
| `Key("bob")` | alice [bob] carol dave eve frank | bob |
| `RangeInclusive("bob"..="dave")` | alice [bob carol dave] eve frank | bob, carol, dave |
| `RangeAfter("carol"..)` | alice bob carol [dave eve frank] | dave, eve, frank |
| `RangeFull`, limit=2 | [alice bob] carol dave eve frank (stopped by limit) | alice, bob |
| `RangeFull`, limit=2, right-to-left | alice bob carol dave [eve frank] (stopped by limit) | frank, eve |
Subqueries and Conditional Branches
The real power of GroveDB queries is subqueries — when a query matches a Tree element, the query can automatically descend into that subtree:
graph TD
subgraph contracts["Merk ["contracts"]"]
cA[""contract_A"<br/>Tree"]
cB[""contract_B"<br/>Tree"]
end
subgraph merkCA["Merk [contracts, contract_A]"]
f1a["field1 → "value1""]
f2a["field2 → "value2""]
end
subgraph merkCB["Merk [contracts, contract_B]"]
f1b["field1 → "value3""]
f2b["field2 → "value4""]
end
cA -.-> f1a
cB -.-> f1b
style contracts fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style merkCA fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style merkCB fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style f1a fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style f1b fill:#fef9e7,stroke:#f39c12,stroke-width:2px
PathQuery: `path: ["contracts"]`, `query: RangeFull` with `default_subquery: Key("field1")`

Execution:
1. `RangeFull` on ["contracts"] → matches contract_A, contract_B
2. Both are Tree elements → descend with subquery `Key("field1")`
3. contract_A → "value1", contract_B → "value3"

Result: `["value1", "value3"]`
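The descent can be modeled with nested maps — a toy `Elem`/`range_full_with_subquery` model of the default-subquery behavior, not GroveDB's actual types:

```rust
// A grove level maps keys to either a value (Item) or a nested level
// (Tree). A RangeFull match that hits a Tree descends with the subquery.
use std::collections::BTreeMap;

enum Elem {
    Item(&'static str),
    Tree(BTreeMap<&'static str, Elem>),
}

fn range_full_with_subquery(
    level: &BTreeMap<&'static str, Elem>,
    subquery_key: &str,
) -> Vec<&'static str> {
    let mut results = Vec::new();
    for elem in level.values() {
        match elem {
            // Descend: apply the subquery inside the child tree.
            Elem::Tree(child) => {
                if let Some(Elem::Item(v)) = child.get(subquery_key) {
                    results.push(*v);
                }
            }
            // A plain Item matches directly; no descent needed.
            Elem::Item(v) => results.push(*v),
        }
    }
    results
}
```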
Conditional subqueries let you apply different subqueries based on which key was matched:
```rust
conditional_subquery_branches: Some(indexmap! {
    QueryItem::Key(b"contract_A".to_vec()) => SubqueryBranch {
        subquery: Some(Query { items: vec![Key(b"field1".to_vec())] }),
        ..
    },
    QueryItem::Key(b"contract_B".to_vec()) => SubqueryBranch {
        subquery: Some(Query { items: vec![Key(b"field2".to_vec())] }),
        ..
    },
})
```
This would get field1 from contract_A but field2 from contract_B.
Sized Queries — Limit and Offset
The SizedQuery wrapper adds pagination:
graph LR
subgraph offset_zone["Offset (skip 2)"]
oA["A"]
oB["B"]
end
subgraph result_zone["Result (limit = 3)"]
rC["C"]
rD["D"]
rE["E"]
end
subgraph remaining["Remaining (not returned)"]
xF["F"]
xG["G"]
xH["H"]
end
oA --> oB --> rC --> rD --> rE --> xF --> xG --> xH
style offset_zone fill:#e8e8e8,stroke:#999,stroke-dasharray:5 5
style result_zone fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style remaining fill:#fadbd8,stroke:#e74c3c,stroke-dasharray:5 5
`SizedQuery { query: RangeFull, limit: Some(3), offset: Some(2) }` → Result: [C, D, E]
When combined with left_to_right: false, the iteration is reversed:
SizedQuery {
query: Query { items: [RangeFull], left_to_right: false, .. },
limit: Some(3),
offset: None
}
Result: [H, G, F]
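The pagination semantics reduce to reverse-then-skip-then-take, which can be checked directly — an illustrative `paginate` helper, not the GroveDB API:

```rust
// Apply SizedQuery-style pagination to an already-ordered result stream:
// reverse when iterating right-to-left, then skip `offset`, then take `limit`.
fn paginate<T: Clone>(
    ordered: &[T],
    left_to_right: bool,
    offset: usize,
    limit: Option<usize>,
) -> Vec<T> {
    let iter: Box<dyn Iterator<Item = &T>> = if left_to_right {
        Box::new(ordered.iter())
    } else {
        Box::new(ordered.iter().rev())
    };
    iter.skip(offset)
        .take(limit.unwrap_or(usize::MAX))
        .cloned()
        .collect()
}
```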
Query Merging
Multiple PathQueries can be merged into a single query for efficiency. The merge algorithm finds common path prefixes and combines query items:
Query A: path=["users"], query=Key("alice")
Query B: path=["users"], query=Key("bob")
Merged: path=["users"], query=items=[Key("alice"), Key("bob")]
Aggregate Sum Queries
Overview
Aggregate Sum Queries are a specialized query type designed for SumTrees in GroveDB. While regular queries retrieve elements by key or range, aggregate sum queries iterate through elements and accumulate their sum values until a sum limit is reached.
This is useful for questions like:
- "Give me transactions until the running total exceeds 1000"
- "Which items contribute to the first 500 units of value in this tree?"
- "Collect sum items up to a budget of N"
Core Concepts
How It Differs from Regular Queries
| Feature | PathQuery | AggregateSumPathQuery |
|---|---|---|
| Target | Any element type | SumItem / ItemWithSumItem elements |
| Stop condition | Limit (count) or end of range | Sum limit (running total) and/or item limit |
| Returns | Elements or keys | Key-sum value pairs |
| Subqueries | Yes (descend into subtrees) | No (single tree level) |
| References | Resolved by GroveDB layer | Optionally followed or ignored |
The AggregateSumQuery Structure
```rust
pub struct AggregateSumQuery {
    pub items: Vec<QueryItem>, // Keys or ranges to scan
    pub left_to_right: bool,   // Iteration direction
    pub sum_limit: u64,        // Stop when running total reaches this
    pub limit_of_items_to_check: Option<u16>, // Max number of matching items to return
}
```
The query is wrapped in an AggregateSumPathQuery to specify where in the grove to look:
```rust
pub struct AggregateSumPathQuery {
    pub path: Vec<Vec<u8>>, // Path to the SumTree
    pub aggregate_sum_query: AggregateSumQuery,
}
```
Sum Limit — The Running Total
The sum_limit is the central concept. As elements are scanned, their sum values are
accumulated. Once the running total meets or exceeds the sum limit, iteration stops:
graph LR
subgraph scanned["Scanned (sum_limit = 15)"]
A["a: 7"]
B["b: 5"]
C["c: 3"]
end
subgraph remaining["Not reached"]
D["d: 11"]
end
A -->|"total: 7"| B -->|"total: 12"| C -->|"total: 15 ≥ 15 → stop"| D
style scanned fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style remaining fill:#fadbd8,stroke:#e74c3c,stroke-dasharray:5 5
Result: `[(a, 7), (b, 5), (c, 3)]` — iteration stops because 7 + 5 + 3 = 15 ≥ sum_limit.
Negative sum values are supported. A negative value increases the remaining budget:
sum_limit = 12, elements: a(10), b(-3), c(5)
a: total = 10, remaining = 2
b: total = 7, remaining = 5 ← negative value gave us more room
c: total = 12, remaining = 0 ← stop
Result: [(a, 10), (b, -3), (c, 5)]
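Both examples above can be reproduced with a minimal scan loop — `scan_until_sum_limit` is an illustrative sketch (using `i64` throughout for simplicity), not GroveDB's implementation:

```rust
// Accumulate signed sum values in key order and stop once the running
// total reaches the sum limit. Negative values lower the total,
// extending the scan.
fn scan_until_sum_limit(
    items: &[(&'static str, i64)],
    sum_limit: i64,
) -> Vec<(&'static str, i64)> {
    let mut total: i64 = 0;
    let mut results = Vec::new();
    for &(key, value) in items {
        results.push((key, value));
        total += value;
        if total >= sum_limit {
            break; // running total reached the sum limit
        }
    }
    results
}
```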
Query Options
The AggregateSumQueryOptions struct controls query behavior:
```rust
pub struct AggregateSumQueryOptions {
    pub allow_cache: bool, // Use cached reads (default: true)
    pub error_if_intermediate_path_tree_not_present: bool, // Error on missing path (default: true)
    pub error_if_non_sum_item_found: bool, // Error on non-sum elements (default: true)
    pub ignore_references: bool, // Skip references (default: false)
}
```
Handling Non-Sum Elements
SumTrees may contain a mix of element types: SumItem, Item, Reference, ItemWithSumItem,
and others. By default, encountering a non-sum, non-reference element produces an error.
When error_if_non_sum_item_found is set to false, non-sum elements are silently skipped
without consuming a user limit slot:
Tree contents: a(SumItem=7), b(Item), c(SumItem=3)
Query: sum_limit=100, limit_of_items_to_check=2, error_if_non_sum_item_found=false
Scan: a(7) → returned, limit=1
b(Item) → skipped, limit still 1
c(3) → returned, limit=0 → stop
Result: [(a, 7), (c, 3)]
Note: ItemWithSumItem elements are always processed (never skipped), because they carry
a sum value.
Reference Handling
By default, Reference elements are followed — the query resolves the reference chain
(up to 3 intermediate hops) to find the target element's sum value:
Tree contents: a(SumItem=7), ref_b(Reference → a)
Query: sum_limit=100
ref_b is followed → resolves to a(SumItem=7)
Result: [(a, 7), (ref_b, 7)]
When ignore_references is true, references are silently skipped without consuming a limit
slot, similar to how non-sum elements are skipped.
Reference chains deeper than 3 intermediate hops produce a ReferenceLimit error.
The Result Type
Queries return an AggregateSumQueryResult:
```rust
pub struct AggregateSumQueryResult {
    pub results: Vec<(Vec<u8>, i64)>, // Key-sum value pairs
    pub hard_limit_reached: bool,     // True if system limit truncated results
}
```
The hard_limit_reached flag indicates whether the system's hard scan limit (default: 1024
elements) was reached before the query completed naturally. When true, more results may
exist beyond what was returned.
Three Limit Systems
Aggregate sum queries have three stopping conditions:
| Limit | Source | What it counts | Effect when reached |
|---|---|---|---|
| sum_limit | User (query) | Running total of sum values | Stops iteration |
| limit_of_items_to_check | User (query) | Matching items returned | Stops iteration |
| Hard scan limit | System (GroveVersion, default 1024) | All elements scanned (including skipped) | Stops iteration, sets hard_limit_reached |
The hard scan limit prevents unbounded iteration when no user limit is set. Skipped elements
(non-sum items with error_if_non_sum_item_found=false, or references with
ignore_references=true) count against the hard scan limit but not against the user's
limit_of_items_to_check.
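The interaction of the three limits can be sketched in one loop — `aggregate_scan` is an illustrative model where `None` marks a skippable non-sum element, not GroveDB's implementation:

```rust
// Three stopping conditions: sum limit, user item limit, and the system
// hard scan limit. Skipped elements consume only the hard scan budget.
struct ScanOutcome {
    results: Vec<(&'static str, i64)>,
    hard_limit_reached: bool,
}

fn aggregate_scan(
    items: &[(&'static str, Option<i64>)], // None = skippable non-sum element
    sum_limit: i64,
    item_limit: Option<u16>,
    hard_scan_limit: u32,
) -> ScanOutcome {
    let mut total = 0i64;
    let mut scanned = 0u32;
    let mut returned = 0u16;
    let mut results = Vec::new();
    let mut hard_limit_reached = false;
    for &(key, value) in items {
        if scanned >= hard_scan_limit {
            hard_limit_reached = true; // system limit truncated the scan
            break;
        }
        scanned += 1;
        let value = match value {
            Some(v) => v,
            None => continue, // skipped: counts against the hard budget only
        };
        results.push((key, value));
        returned += 1;
        total += value;
        if total >= sum_limit {
            break; // sum limit reached
        }
        if item_limit.map_or(false, |l| returned >= l) {
            break; // user item limit reached
        }
    }
    ScanOutcome { results, hard_limit_reached }
}
```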
API Usage
Simple Query
```rust
use grovedb::AggregateSumPathQuery;
use grovedb_merk::proofs::query::AggregateSumQuery;

// "Give me items from this SumTree until the total reaches 1000"
let query = AggregateSumQuery::new(1000, None);
let path_query = AggregateSumPathQuery {
    path: vec![b"my_tree".to_vec()],
    aggregate_sum_query: query,
};

let result = db.query_aggregate_sums(
    &path_query,
    true,  // allow_cache
    true,  // error_if_intermediate_path_tree_not_present
    None,  // transaction
    grove_version,
).unwrap().expect("query failed");

for (key, sum_value) in &result.results {
    println!("{}: {}", String::from_utf8_lossy(key), sum_value);
}
```
Query with Options
```rust
use grovedb::{AggregateSumPathQuery, AggregateSumQueryOptions};
use grovedb_merk::proofs::query::AggregateSumQuery;

// Skip non-sum items and ignore references
let query = AggregateSumQuery::new(1000, Some(50));
let path_query = AggregateSumPathQuery {
    path: vec![b"mixed_tree".to_vec()],
    aggregate_sum_query: query,
};

let result = db.query_aggregate_sums_with_options(
    &path_query,
    AggregateSumQueryOptions {
        error_if_non_sum_item_found: false, // skip Items, Trees, etc.
        ignore_references: true,            // skip References
        ..AggregateSumQueryOptions::default()
    },
    None,
    grove_version,
).unwrap().expect("query failed");

if result.hard_limit_reached {
    println!("Warning: results may be incomplete (hard limit reached)");
}
```
Key-Based Queries
Instead of scanning a range, you can query specific keys:
```rust
// Check the sum value of specific keys
let query = AggregateSumQuery::new_with_keys(
    vec![b"alice".to_vec(), b"bob".to_vec(), b"carol".to_vec()],
    u64::MAX, // no sum limit
    None,     // no item limit
);
```
Descending Queries
Iterate from the highest key to the lowest:
```rust
let query = AggregateSumQuery::new_descending(500, Some(10));
// Or: query.left_to_right = false;
```
Constructor Reference
| Constructor | Description |
|---|---|
| new(sum_limit, limit) | Full range, ascending |
| new_descending(sum_limit, limit) | Full range, descending |
| new_single_key(key, sum_limit) | Single key lookup |
| new_with_keys(keys, sum_limit, limit) | Multiple specific keys |
| new_with_keys_reversed(keys, sum_limit, limit) | Multiple keys, descending |
| new_single_query_item(item, sum_limit, limit) | Single QueryItem (key or range) |
| new_with_query_items(items, sum_limit, limit) | Multiple QueryItems |
Batch Operations at the Grove Level
GroveOp Variants
At the GroveDB level, operations are represented as GroveOp:
```rust
pub enum GroveOp {
    // User-facing operations:
    InsertOnly { element: Element },
    InsertOrReplace { element: Element },
    Replace { element: Element },
    Patch { element: Element, change_in_bytes: i32 },
    RefreshReference {
        reference_path_type,
        max_reference_hop,
        flags,
        trust_refresh_reference,
    },
    Delete,
    DeleteTree(TreeType), // Parameterized by tree type

    // Non-Merk tree append operations (user-facing):
    CommitmentTreeInsert { cmx: [u8; 32], payload: Vec<u8> },
    MmrTreeAppend { value: Vec<u8> },
    BulkAppend { value: Vec<u8> },
    DenseTreeInsert { value: Vec<u8> },

    // Internal operations (created by preprocessing/propagation, rejected by from_ops):
    ReplaceTreeRootKey { hash, root_key, aggregate_data },
    InsertTreeWithRootHash { hash, root_key, flags, aggregate_data },
    ReplaceNonMerkTreeRoot { hash: [u8; 32], meta: NonMerkTreeMeta },
    InsertNonMerkTree { hash, root_key, flags, aggregate_data, meta: NonMerkTreeMeta },
}
```
NonMerkTreeMeta carries tree-type-specific state through batch processing:
```rust
pub enum NonMerkTreeMeta {
    CommitmentTree { total_count: u64, chunk_power: u8 },
    MmrTree { mmr_size: u64 },
    BulkAppendTree { total_count: u64, chunk_power: u8 },
    DenseTree { count: u16, height: u8 },
}
```
Each operation is wrapped in a QualifiedGroveDbOp that includes the path:
```rust
pub struct QualifiedGroveDbOp {
    pub path: KeyInfoPath,    // Where in the grove
    pub key: Option<KeyInfo>, // Which key (None for append-only tree ops)
    pub op: GroveOp,          // What to do
}
```
Note: The key field is Option<KeyInfo> — it is None for append-only tree operations (CommitmentTreeInsert, MmrTreeAppend, BulkAppend, DenseTreeInsert), where the tree key is the last segment of path instead.
Two-Phase Processing
Batch operations are processed in two phases:
graph TD
input["Input: Vec<QualifiedGroveDbOp>"]
subgraph phase1["PHASE 1: VALIDATION"]
v1["1. Sort by path + key<br/>(stable sort)"]
v2["2. Build batch structure<br/>(group ops by subtree)"]
v3["3. Validate element types<br/>match targets"]
v4["4. Resolve & validate<br/>references"]
v1 --> v2 --> v3 --> v4
end
v4 -->|"validation OK"| phase2_start
v4 -->|"validation failed"| abort["Err(Error)<br/>abort, no changes"]
subgraph phase2["PHASE 2: APPLICATION"]
phase2_start["Start application"]
a1["1. Open all affected<br/>subtrees (TreeCache)"]
a2["2. Apply MerkBatch ops<br/>(deferred propagation)"]
a3["3. Propagate root hashes<br/>upward (leaf → root)"]
a4["4. Commit transaction<br/>atomically"]
phase2_start --> a1 --> a2 --> a3 --> a4
end
input --> v1
style phase1 fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style phase2 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style abort fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
style a4 fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
TreeCache and Deferred Propagation
During batch application, GroveDB uses a TreeCache to defer root hash propagation until all operations in a subtree are complete:
graph TD
subgraph without["WITHOUT TreeCache (naive)"]
w1["Op 1: Insert A in X"]
w1p["Propagate X → parent → root"]
w2["Op 2: Insert B in X"]
w2p["Propagate X → parent → root"]
w3["Op 3: Insert C in X"]
w3p["Propagate X → parent → root"]
w1 --> w1p --> w2 --> w2p --> w3 --> w3p
end
subgraph with_tc["WITH TreeCache (deferred)"]
t1["Op 1: Insert A in X<br/>→ buffered"]
t2["Op 2: Insert B in X<br/>→ buffered"]
t3["Op 3: Insert C in X<br/>→ buffered"]
tp["Propagate X → parent → root<br/>(walk up ONCE)"]
t1 --> t2 --> t3 --> tp
end
style without fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
style with_tc fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style w1p fill:#fadbd8,stroke:#e74c3c
style w2p fill:#fadbd8,stroke:#e74c3c
style w3p fill:#fadbd8,stroke:#e74c3c
style tp fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
3 propagations × O(depth) vs 1 propagation × O(depth) = 3x faster for this subtree.
This is a significant optimization when many operations target the same subtree.
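The saving can be stated as a one-line counting exercise (toy helpers, not GroveDB code — each propagation is one walk from subtree to root):

```rust
// Without TreeCache: one root-hash walk per operation.
fn naive_propagations(ops_per_subtree: &[u32]) -> u32 {
    ops_per_subtree.iter().sum()
}

// With TreeCache: one root-hash walk per touched subtree,
// regardless of how many ops were buffered into it.
fn deferred_propagations(ops_per_subtree: &[u32]) -> u32 {
    ops_per_subtree.iter().filter(|&&n| n > 0).count() as u32
}
```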
Atomic Cross-Subtree Operations
A key property of GroveDB batches is atomicity across subtrees. A single batch can modify elements in multiple subtrees, and either all changes commit or none do:
Batch:
1. Delete ["balances", "alice"] (remove balance)
2. Insert ["balances", "bob"] = 100 (add balance)
3. Update ["identities", "bob", "rev"] = 2 (update revision)
Three subtrees affected: balances, identities, identities/bob
If ANY operation fails → ALL operations are rolled back
If ALL succeed → ALL are committed atomically
The batch processor handles this by:
- Collecting all affected paths
- Opening all needed subtrees
- Applying all operations
- Propagating all root hashes in dependency order
- Committing the entire transaction
Batch Preprocessing for Non-Merk Trees
CommitmentTree, MmrTree, BulkAppendTree, and DenseAppendOnlyFixedSizeTree operations
require access to storage contexts outside the Merk, which is not available inside the
standard execute_ops_on_path method (it only has access to the Merk). These operations
use a preprocessing pattern: before the main apply_body phase, the entry
points scan for non-Merk tree ops and convert them to standard internal ops.
```rust
pub enum GroveOp {
    // ... standard ops ...

    // Non-Merk tree operations (user-facing):
    CommitmentTreeInsert { cmx: [u8; 32], payload: Vec<u8> },
    MmrTreeAppend { value: Vec<u8> },
    BulkAppend { value: Vec<u8> },
    DenseTreeInsert { value: Vec<u8> },

    // Internal ops (produced by preprocessing):
    ReplaceNonMerkTreeRoot { hash: [u8; 32], meta: NonMerkTreeMeta },
}
```
graph TD
subgraph preprocess["PREPROCESSING PHASE"]
scan["Scan ops for<br/>CommitmentTreeInsert<br/>MmrTreeAppend<br/>BulkAppend<br/>DenseTreeInsert"]
load["Load current state<br/>from storage"]
mutate["Apply append to<br/>in-memory structure"]
save["Write updated state<br/>back to storage"]
convert["Convert to<br/>ReplaceNonMerkTreeRoot<br/>with new root hash + meta"]
scan --> load --> mutate --> save --> convert
end
subgraph apply["STANDARD APPLY_BODY"]
body["execute_ops_on_path<br/>sees ReplaceNonMerkTreeRoot<br/>(non-Merk tree update)"]
prop["Propagate root hash<br/>upward through grove"]
body --> prop
end
convert --> body
style preprocess fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style apply fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
Why preprocessing? The execute_ops_on_path function operates on a single
Merk subtree and has no access to self.db or broader storage contexts.
Preprocessing in the entry points (apply_batch_with_element_flags_update,
apply_partial_batch_with_element_flags_update) has full access to the database,
so it can load/save data and then hand off a simple ReplaceNonMerkTreeRoot
to the standard batch machinery.
Each preprocessing method follows the same pattern:
- preprocess_commitment_tree_ops — Loads the frontier and BulkAppendTree from data storage, appends to both, saves back, converts to ReplaceNonMerkTreeRoot with the updated combined root and CommitmentTree { total_count, chunk_power } meta
- preprocess_mmr_tree_ops — Loads the MMR from data storage, appends values, saves back, converts to ReplaceNonMerkTreeRoot with the updated MMR root and MmrTree { mmr_size } meta
- preprocess_bulk_append_ops — Loads the BulkAppendTree from data storage, appends values (may trigger chunk compaction), saves back, converts to ReplaceNonMerkTreeRoot with the updated state root and BulkAppendTree { total_count, chunk_power } meta
- preprocess_dense_tree_ops — Loads the DenseFixedSizedMerkleTree from data storage, inserts values sequentially, recomputes the root hash, saves back, converts to ReplaceNonMerkTreeRoot with the updated root hash and DenseTree { count, height } meta
The ReplaceNonMerkTreeRoot op carries the new root hash and a NonMerkTreeMeta enum
so the element can be fully reconstructed after processing.
Cost Tracking
The OperationCost Structure
Every operation in GroveDB accumulates costs, measured in computational resources:
```rust
// costs/src/lib.rs
pub struct OperationCost {
    pub seek_count: u32,            // Number of storage seeks
    pub storage_cost: StorageCost,  // Bytes added/replaced/removed
    pub storage_loaded_bytes: u64,  // Bytes read from disk
    pub hash_node_calls: u32,       // Number of Blake3 hash operations
    pub sinsemilla_hash_calls: u32, // Number of Sinsemilla hash operations (EC ops)
}
```
Sinsemilla hash calls track elliptic curve hash operations for CommitmentTree anchors. These are significantly more expensive than Blake3 node hashes.
Storage costs break down further:
```rust
// costs/src/storage_cost/mod.rs
pub struct StorageCost {
    pub added_bytes: u32,                   // New data written
    pub replaced_bytes: u32,                // Existing data overwritten
    pub removed_bytes: StorageRemovedBytes, // Data freed
}
```
The CostContext Pattern
All operations return their result wrapped in a CostContext:
```rust
pub struct CostContext<T> {
    pub value: T,            // The operation result
    pub cost: OperationCost, // Resources consumed
}

pub type CostResult<T, E> = CostContext<Result<T, E>>;
```
This creates a monadic cost-tracking pattern — costs flow through chains of operations automatically:
```rust
// Unwrap a result, adding its cost to an accumulator
let result = expensive_operation().unwrap_add_cost(&mut total_cost);

// Chain operations, accumulating costs
let final_result = op1()
    .flat_map(|x| op2(x))  // Costs from op1 + op2
    .flat_map(|y| op3(y)); // + costs from op3
```
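A minimal standalone sketch of the pattern (field names taken from the structs above; the flat_map and unwrap_add_cost behavior is assumed from the text, not copied from the crate):

```rust
// Two representative cost counters, enough to show accumulation.
#[derive(Clone, Copy, Default, Debug, PartialEq)]
struct OperationCost {
    seek_count: u32,
    hash_node_calls: u32,
}

impl OperationCost {
    fn add(self, other: OperationCost) -> OperationCost {
        OperationCost {
            seek_count: self.seek_count + other.seek_count,
            hash_node_calls: self.hash_node_calls + other.hash_node_calls,
        }
    }
}

struct CostContext<T> {
    value: T,
    cost: OperationCost,
}

impl<T> CostContext<T> {
    // Chain another costed operation, summing both costs.
    fn flat_map<U>(self, f: impl FnOnce(T) -> CostContext<U>) -> CostContext<U> {
        let next = f(self.value);
        CostContext { value: next.value, cost: self.cost.add(next.cost) }
    }

    // Extract the value, folding this context's cost into an accumulator.
    fn unwrap_add_cost(self, acc: &mut OperationCost) -> T {
        *acc = acc.add(self.cost);
        self.value
    }
}

// Hypothetical costed operation for demonstration.
fn op(v: u32, seeks: u32) -> CostContext<u32> {
    CostContext { value: v, cost: OperationCost { seek_count: seeks, hash_node_calls: 1 } }
}
```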
The cost_return_on_error! Macro
The most common pattern in GroveDB code is the cost_return_on_error! macro,
which acts like ? but preserves costs on early return:
```rust
macro_rules! cost_return_on_error {
    ( &mut $cost:ident, $($body:tt)+ ) => {{
        let result_with_cost = { $($body)+ };
        let result = result_with_cost.unwrap_add_cost(&mut $cost);
        match result {
            Ok(x) => x,
            Err(e) => return Err(e).wrap_with_cost($cost),
        }
    }};
}
```
In practice:
```rust
fn insert_element(&self, path: &[&[u8]], key: &[u8], element: Element) -> CostResult<(), Error> {
    let mut cost = OperationCost::default();

    // Each macro call adds the operation's cost to `cost`
    // and returns the Ok value (or early-returns with accumulated cost on Err)
    let merk = cost_return_on_error!(&mut cost, self.open_merk(path));
    cost_return_on_error!(&mut cost, merk.insert(key, element));
    cost_return_on_error!(&mut cost, self.propagate_changes(path));

    Ok(()).wrap_with_cost(cost)
    // `cost` now contains the sum of all three operations' costs
}
```
Storage Cost Breakdown
When a value is updated, the cost depends on whether the new value is larger, smaller, or the same size:
graph LR
subgraph case1["CASE 1: Same Size (old=100, new=100)"]
c1_old["old: 100B"]
c1_new["new: 100B"]
c1_cost["replaced_bytes += 100"]
end
subgraph case2["CASE 2: Growing (old=100, new=120)"]
c2_old["old: 100B"]
c2_new["new: 120B"]
c2_replaced["replaced: 100B"]
c2_added["added: +20B"]
c2_cost["replaced_bytes += 100<br/>added_bytes += 20"]
end
subgraph case3["CASE 3: Shrinking (old=100, new=70)"]
c3_old["old: 100B"]
c3_new["new: 70B"]
c3_replaced["replaced: 70B"]
c3_removed["removed: 30B"]
c3_cost["replaced_bytes += 70<br/>removed_bytes += 30"]
end
style case1 fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
style case2 fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style c2_added fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style case3 fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
style c3_removed fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
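The three cases reduce to a simple split of the update into replaced, added, and removed bytes. A sketch with the semantics assumed from the diagram (helper name hypothetical):

```rust
// Split an update of old_size -> new_size bytes into
// (replaced_bytes, added_bytes, removed_bytes), per the three cases.
fn update_cost(old_size: u32, new_size: u32) -> (u32, u32, u32) {
    if new_size >= old_size {
        // Same size or growing: every old byte is overwritten,
        // the excess counts as newly added.
        (old_size, new_size - old_size, 0)
    } else {
        // Shrinking: only new_size bytes are overwritten,
        // the remainder of the old value is freed.
        (new_size, 0, old_size - new_size)
    }
}
```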
Hash Operation Costs
Hash costs are measured in "hash node calls" — the number of Blake3 block compressions:
| Operation | Input Size | Hash Calls |
|---|---|---|
| value_hash(small) | < 64 bytes | 1 |
| value_hash(medium) | 64-127 bytes | 2 |
| kv_hash | key + value_hash | varies |
| node_hash | 96 bytes (3 × 32) | 2 (always) |
| combine_hash | 64 bytes (2 × 32) | 1 (always) |
| node_hash_with_count | 104 bytes (3 × 32 + 8) | 2 (always) |
| Sinsemilla (CommitmentTree) | Pallas curve EC op | tracked separately via sinsemilla_hash_calls |
The general formula for Blake3, using integer division:
hash_calls = 1 + (input_bytes - 1) / 64
Worst-Case and Average-Case Estimation
GroveDB provides functions to estimate operation costs before executing them. This is crucial for blockchain fee calculation — you need to know the cost before committing to pay for it.
```rust
// Worst-case cost for reading a node
pub fn add_worst_case_get_merk_node(
    cost: &mut OperationCost,
    not_prefixed_key_len: u32,
    max_element_size: u32,
    node_type: NodeType,
) {
    cost.seek_count += 1; // One disk seek
    cost.storage_loaded_bytes += TreeNode::worst_case_encoded_tree_size(
        not_prefixed_key_len,
        max_element_size,
        node_type,
    ) as u64;
}

// Worst-case propagation cost
pub fn add_worst_case_merk_propagate(
    cost: &mut OperationCost,
    input: &WorstCaseLayerInformation,
) {
    let levels = match input {
        MaxElementsNumber(n) => ((*n + 1) as f32).log2().ceil() as u32,
        NumberOfLevels(n) => *n,
    };
    let mut nodes_updated = levels;
    // AVL rotations may update additional nodes
    if levels > 2 {
        nodes_updated += 2; // At most 2 extra nodes for rotations
    }
    cost.storage_cost.replaced_bytes += nodes_updated * MERK_BIGGEST_VALUE_SIZE;
    cost.storage_loaded_bytes +=
        nodes_updated as u64 * (MERK_BIGGEST_VALUE_SIZE + MERK_BIGGEST_KEY_SIZE) as u64;
    cost.seek_count += nodes_updated;
    cost.hash_node_calls += nodes_updated * 2;
}
```
Constants used:
```rust
pub const MERK_BIGGEST_VALUE_SIZE: u32 = u16::MAX as u32; // 65535
pub const MERK_BIGGEST_KEY_SIZE: u32 = 256;
```
The MMR Tree — Append-Only Authenticated Logs
The MmrTree is GroveDB's append-only authenticated data structure, built on a Merkle Mountain Range (MMR) with Blake3 hashing. While Merk AVL trees (Chapter 2) excel at random key-value operations with O(log N) updates, MMRs are purpose-built for the append-only case: they offer zero rotations, O(1) amortized hash cost per append, and sequential I/O patterns.
This chapter covers the MMR data structure in depth — how it grows, how nodes are stored, how appends cascade, and how the proof system lets any third party verify that a specific value was appended at a specific position.
Why a Separate Tree Type?
GroveDB's standard Merk trees handle ordered key-value data well, but append-only logs have different requirements:
| Property | Merk AVL Tree | MMR |
|---|---|---|
| Operations | Insert, update, delete | Append only |
| Rebalancing | O(log N) rotations per write | None |
| I/O pattern | Random (rebalancing touches many nodes) | Sequential (new nodes always at end) |
| Total hashes for N inserts | O(N log N) | O(N) |
| Structure | Determined by insertion order | Determined only by leaf count |
| Proofs | Path from root to leaf | Sibling + peak hashes |
For use cases like transaction logs, event streams, or any monotonically growing data, the MMR is strictly better: simpler, faster, and more predictable.
The MMR Data Structure
An MMR is a forest of perfect binary trees (called "peaks") that grows left to right. Each peak is a complete binary tree of some height h, containing exactly 2^h leaves.
The key insight: the binary representation of the leaf count determines the peak structure. Each 1-bit in the binary form corresponds to one peak:
Leaf count Binary Peaks
───────── ────── ─────
1 1 one peak h=0
2 10 one peak h=1
3 11 peaks h=1, h=0
4 100 one peak h=2
5 101 peaks h=2, h=0
6 110 peaks h=2, h=1
7 111 peaks h=2, h=1, h=0
8 1000 one peak h=3
This means the MMR structure is fully determined by a single number — the leaf count. Two MMRs with the same number of leaves always have the same shape, regardless of what values were appended.
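The peak decomposition is just the set bits of the leaf count, read from high to low. A small illustration of the table above (helper name hypothetical, not library code):

```rust
// Peak heights for a given leaf count: one peak per 1-bit,
// highest bit (tallest peak, leftmost) first.
fn peak_heights(leaf_count: u64) -> Vec<u32> {
    (0..64u32)
        .rev()
        .filter(|&h| leaf_count & (1u64 << h) != 0)
        .collect()
}
```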
How the MMR Fills Up
Each node in the MMR has a position (0-indexed). Leaves and internal nodes are interleaved in a specific pattern. Here is step-by-step growth:
After 1 leaf (mmr_size = 1):
pos: 0
leaf₀ ← one peak at height 0
After 2 leaves (mmr_size = 3):
pos: 2 ← internal: blake3(leaf₀.hash || leaf₁.hash)
/ \
0 1 ← leaves
One peak at height 1. Positions 0 and 1 are leaves, position 2 is internal.
When leaf₁ was appended, it created a height-0 peak. But there was already a height-0 peak (leaf₀), so they merged into a height-1 peak.
After 3 leaves (mmr_size = 4):
pos: 2 3 ← peak h=1, peak h=0
/ \
0 1
Two peaks. No merge — heights 1 and 0 are different.
(Binary: 3 = 11₂ → one peak per 1-bit)
After 4 leaves (mmr_size = 7):
pos: 6 ← internal: merge of nodes 2 and 5
/ \
2 5 ← internal nodes
/ \ / \
0 1 3 4 ← leaves
One peak at height 2.
This is where it gets interesting. Appending leaf₃ (position 4) creates node₅ (merging positions 3 and 4). But now node₅ (height 1) and node₂ (height 1) are equal-height adjacent peaks, so they merge into node₆. A cascade of two merges from a single append.
After 5 leaves (mmr_size = 8):
pos: 6 7 ← peak h=2, peak h=0
/ \
2 5
/ \ / \
0 1 3 4
Two peaks. (Binary: 5 = 101₂)
After 7 leaves (mmr_size = 11):
pos:       6          9    10   ← peak h=2, peak h=1, peak h=0
         /   \       / \
        2     5     7   8
       / \   / \
      0   1 3   4
Three peaks. (Binary: 7 = 111₂)
After 8 leaves (mmr_size = 15):
pos: 14 ← single peak h=3
/ \
6 13
/ \ / \
2 5 9 12
/ \ / \ / \ / \
0 1 3 4 7 8 10 11
One peak at height 3. Three cascading merges from appending leaf₇.
graph TD
subgraph mmr5["MMR with 5 leaves (mmr_size = 8)"]
pos6["pos 6<br/>H(2,5)<br/><b>peak h=2</b>"]
pos2["pos 2<br/>H(0,1)"]
pos5["pos 5<br/>H(3,4)"]
pos0["pos 0<br/>leaf₀"]
pos1["pos 1<br/>leaf₁"]
pos3["pos 3<br/>leaf₂"]
pos4["pos 4<br/>leaf₃"]
pos7["pos 7<br/>leaf₄<br/><b>peak h=0</b>"]
pos6 --> pos2
pos6 --> pos5
pos2 --> pos0
pos2 --> pos1
pos5 --> pos3
pos5 --> pos4
end
style pos6 fill:#d4e6f1,stroke:#2980b9,stroke-width:3px
style pos7 fill:#d4e6f1,stroke:#2980b9,stroke-width:3px
style pos0 fill:#d5f5e3,stroke:#27ae60
style pos1 fill:#d5f5e3,stroke:#27ae60
style pos3 fill:#d5f5e3,stroke:#27ae60
style pos4 fill:#d5f5e3,stroke:#27ae60
style pos7 fill:#d5f5e3,stroke:#27ae60
Blue = peaks (roots of perfect binary subtrees). Green = leaf nodes.
The Merge Cascade
When a new leaf is appended, it may trigger a chain of merges. The number of merges equals the number of trailing 1-bits in the binary representation of the current leaf count:
| Leaf count (before push) | Binary | trailing 1s | Merges | Total hashes |
|---|---|---|---|---|
| 0 | 0 | 0 | 0 | 1 (leaf only) |
| 1 | 1 | 1 | 1 | 2 |
| 2 | 10 | 0 | 0 | 1 |
| 3 | 11 | 2 | 2 | 3 |
| 4 | 100 | 0 | 0 | 1 |
| 5 | 101 | 1 | 1 | 2 |
| 6 | 110 | 0 | 0 | 1 |
| 7 | 111 | 3 | 3 | 4 |
Total hashes per push = 1 + trailing_ones(leaf_count):
- 1 hash for the leaf itself: blake3(value)
- N hashes for the merge cascade: blake3(left.hash || right.hash) for each merge
This is how GroveDB tracks hash costs for each append. The implementation:
```rust
pub fn hash_count_for_push(leaf_count: u64) -> u32 {
    1 + leaf_count.trailing_ones()
}
```
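A usage sketch tying this back to the cascade table: each push hashes every node it creates exactly once, so summing the per-push counts over n pushes should reproduce the total node count, mmr_size = 2n - popcount(n) (the summing helper here is illustrative, not part of the crate):

```rust
fn hash_count_for_push(leaf_count: u64) -> u32 {
    1 + leaf_count.trailing_ones()
}

// Total hashes across pushes 0..n — equals the number of nodes created,
// i.e. the final mmr_size.
fn total_hashes_for_n_pushes(n: u64) -> u64 {
    (0..n).map(|k| hash_count_for_push(k) as u64).sum()
}
```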
MMR Size vs Leaf Count
The MMR stores both leaves and internal nodes in a flat position space, so
mmr_size is always larger than the leaf count. The exact relationship is:
mmr_size = 2 * leaf_count - popcount(leaf_count)
where popcount is the number of 1-bits (i.e., the number of peaks). Each
internal node merges two subtrees, reducing the node count by one per merge.
The reverse computation — leaf count from mmr_size — uses the peak positions:
```rust
fn mmr_size_to_leaf_count(mmr_size: u64) -> u64 {
    // Each peak at height h contains 2^h leaves
    get_peaks(mmr_size)
        .iter()
        .map(|&peak_pos| 1u64 << pos_height_in_tree(peak_pos))
        .sum()
}
```
| mmr_size | leaf_count | peaks |
|---|---|---|
| 0 | 0 | (empty) |
| 1 | 1 | h=0 |
| 3 | 2 | h=1 |
| 4 | 3 | h=1, h=0 |
| 7 | 4 | h=2 |
| 8 | 5 | h=2, h=0 |
| 10 | 6 | h=2, h=1 |
| 11 | 7 | h=2, h=1, h=0 |
| 15 | 8 | h=3 |
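The forward direction of the relationship can be checked directly against this table (an illustrative helper; the real reverse computation goes through the ckb peak functions shown above):

```rust
// mmr_size = 2 * leaf_count - popcount(leaf_count), from the text.
// Each of the leaf_count-1 merges removes one "node slot" relative to
// a perfect doubling, leaving one slot per peak (per 1-bit).
fn leaf_count_to_mmr_size(leaf_count: u64) -> u64 {
    2 * leaf_count - leaf_count.count_ones() as u64
}
```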
GroveDB stores mmr_size in the Element (not leaf count) because the ckb MMR
library uses positions internally. The mmr_tree_leaf_count operation derives
the leaf count on the fly.
MMR Root Hash — Bagging the Peaks
An MMR has multiple peaks (one per 1-bit in the leaf count). To produce a single 32-byte root hash, the peaks are "bagged" right-to-left:
root = bag_rhs_peaks(peaks):
start with rightmost peak
fold leftward: blake3(left_peak || accumulated_right)
With 1 peak, the root is just that peak's hash. With 3 peaks:
graph LR
subgraph bag["Bagging 3 peaks (7 leaves = 111₂)"]
P0["peak h=2<br/>(4 leaves)"]
P1["peak h=1<br/>(2 leaves)"]
P2["peak h=0<br/>(1 leaf)"]
B1["blake3(P1 || P2)"]
P1 --> B1
P2 --> B1
B0["blake3(P0 || B1)<br/><b>= MMR root</b>"]
P0 --> B0
B1 --> B0
end
style B0 fill:#d4e6f1,stroke:#2980b9,stroke-width:3px
style bag fill:#fef9e7,stroke:#f39c12,stroke-width:2px
The root hash changes with every append, even when no merges occur, because the rightmost peak changes and bagging must be recomputed.
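The fold itself can be sketched with a pluggable merge function (GroveDB's merge is blake3(left || right); a generic closure is used here so the example needs no hash crate — string concatenation stands in for hashing):

```rust
// Bag peaks right-to-left: start from the rightmost peak, then fold
// leftward with merge(left_peak, accumulated_right). Returns None for
// an empty MMR.
fn bag_rhs_peaks<T: Clone>(peaks: &[T], merge: impl Fn(&T, &T) -> T) -> Option<T> {
    let mut iter = peaks.iter().rev();
    let first = iter.next()?.clone();
    Some(iter.fold(first, |acc, left| merge(left, &acc)))
}
```

With a single peak the fold never runs, so the root is that peak's hash unchanged, matching the text.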
Node Structure and Serialization
Each MMR node is an MmrNode:
```rust
struct MmrNode {
    hash: [u8; 32],         // Blake3 hash
    value: Option<Vec<u8>>, // Some for leaves, None for internal nodes
}
```
Leaf node: hash = blake3(value_bytes), value = Some(value_bytes)
Internal node: hash = blake3(left.hash || right.hash), value = None
The merge function is straightforward — concatenate two 32-byte hashes and Blake3 the result:
```rust
fn blake3_merge(left: &[u8; 32], right: &[u8; 32]) -> [u8; 32] {
    let mut input = [0u8; 64];
    input[..32].copy_from_slice(left);
    input[32..].copy_from_slice(right);
    *blake3::hash(&input).as_bytes()
}
```
Note on PartialEq: MmrNode implements PartialEq by comparing only the hash field, not the value. This is critical for proof verification: the ckb verifier compares a reconstructed root (value = None) against the expected root. If PartialEq compared the value field, single-leaf MMR proofs would always fail because the leaf has value: Some(...) but the root reconstruction produces value: None.
Serialization format:
Internal: [0x00] [hash: 32 bytes] = 33 bytes
Leaf: [0x01] [hash: 32 bytes] [value_len: 4 BE] [value...] = 37 + len bytes
The flag byte distinguishes internal nodes from leaves. Deserialization validates exact length — no trailing bytes allowed.
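An encoder sketch for this layout (hypothetical helper name; the byte counts match the format above — 33 bytes for internal nodes, 37 + value length for leaves):

```rust
// Serialize an MMR node per the stated layout:
//   Internal: [0x00] [hash: 32]
//   Leaf:     [0x01] [hash: 32] [value_len: 4 BE] [value...]
fn serialize_node(hash: &[u8; 32], value: Option<&[u8]>) -> Vec<u8> {
    let mut out = Vec::new();
    match value {
        None => {
            out.push(0x00); // internal node flag
            out.extend_from_slice(hash);
        }
        Some(v) => {
            out.push(0x01); // leaf flag
            out.extend_from_slice(hash);
            out.extend_from_slice(&(v.len() as u32).to_be_bytes());
            out.extend_from_slice(v);
        }
    }
    out
}
```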
Storage Architecture
MmrTree stores its nodes in the data column (the same column family used
by Merk nodes), not in a child Merk subtree. The Element has no root_key
field — the MMR root hash flows as the Merk child hash via
insert_subtree(subtree_root_hash), authenticating the MMR state.
Storage keys are position-based:
key = 'm' || position_as_be_u64 (9 bytes: prefix + u64 BE)
So position 42 is stored at key [0x6D, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x2A].
Looking up leaf i requires computing the MMR position first:
pos = leaf_index_to_pos(i), then reading data key m{pos}.
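The key layout, together with the leaf-index-to-position identity pos = 2i - popcount(i) (which matches the position diagrams earlier in this chapter), can be sketched as:

```rust
// Data-column key for the MMR node at `position`: 'm' prefix + BE u64.
fn mmr_node_key(position: u64) -> [u8; 9] {
    let mut key = [0u8; 9];
    key[0] = b'm'; // 0x6D
    key[1..].copy_from_slice(&position.to_be_bytes());
    key
}

// Flat position of leaf i: a standard MMR identity, pos = 2i - popcount(i).
fn leaf_index_to_pos(i: u64) -> u64 {
    2 * i - i.count_ones() as u64
}
```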
Write-through cache: During appends, newly written nodes must be
immediately readable for subsequent merges in the same push. Because GroveDB's
transactional storage defers writes to a batch (they aren't visible to reads
until commit), an MmrStore adapter wraps the storage context with an
in-memory HashMap cache:
graph LR
subgraph store["MmrStore"]
read["get_elem(pos)"] --> cache_check{"In cache?"}
cache_check -->|"Yes"| cache_hit["Return cached node"]
cache_check -->|"No"| data_read["Read from data storage"]
write["append(pos, nodes)"] --> data_write["Write to data storage"]
data_write --> cache_write["Also insert into cache"]
end
style store fill:#fef9e7,stroke:#f39c12,stroke-width:2px
This ensures that when appending leaf₃ triggers a merge cascade (creating internal nodes at positions 5 and 6), node₅ is immediately available when computing node₆, even though node₅ hasn't been committed to RocksDB yet.
Root hash propagation to the GroveDB state root:
combined_value_hash = blake3(
blake3(varint(len) || element_bytes), ← value_hash from serialized Element
mmr_root_hash ← child_hash = type-specific root
)
The MMR root hash flows as the Merk child hash via insert_subtree. Any change
to the MMR state produces a different combined_value_hash, which propagates up
through the parent Merk hierarchy all the way to the GroveDB state root.
GroveDB Operations
MmrTree provides four operations:
```rust
// Append a value — returns (new_mmr_root, leaf_index)
db.mmr_tree_append(path, key, value, tx, version)

// Read the current root hash (from Element, no storage access)
db.mmr_tree_root_hash(path, key, tx, version)

// Get a leaf value by 0-based index
db.mmr_tree_get_value(path, key, leaf_index, tx, version)

// Get the number of leaves appended
db.mmr_tree_leaf_count(path, key, tx, version)
```
Append Flow
The append operation is the most complex, performing 8 steps:
graph TD
subgraph append["mmr_tree_append(path, key, value, tx)"]
A1["1. Read Element at path/key<br/>→ get mmr_size, flags"]
A2["2. Open data storage at<br/>path/key subtree"]
A3["3. Create MmrNode::leaf(value)<br/>hash = blake3(value)"]
A4["4. Push leaf into ckb MMR<br/>→ cascading merges write<br/>new internal nodes"]
A5["5. Commit → flush new<br/>nodes to storage"]
A6["6. Bag peaks → new mmr_root"]
A7["7. Update Element with<br/>new mmr_root and mmr_size"]
A8["8. Propagate changes up<br/>through GroveDB Merk<br/>hierarchy + commit tx"]
A1 --> A2 --> A3 --> A4 --> A5 --> A6 --> A7 --> A8
end
style append fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style A6 fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
Step 4 may write 1 node (leaf only) or 1 + N nodes (leaf + N internal merge
nodes). Step 5 calls mmr.commit() which flushes the ckb MemStore to the
MmrStore. Step 7 calls insert_subtree with the new MMR root as the child hash
(via subtree_root_hash), since MmrTree has no child Merk.
Read Operations
mmr_tree_root_hash computes the root from MMR data in storage.
mmr_tree_leaf_count derives the leaf count from mmr_size in the Element.
No data storage access needed.
mmr_tree_get_value computes pos = leaf_index_to_pos(leaf_index), reads
the single data storage entry at m{pos}, deserializes the MmrNode, and
returns node.value.
Batch Operations
Multiple MMR appends can be batched using GroveOp::MmrTreeAppend { value }.
Because the standard batch execute_ops_on_path function only has access to
the Merk (not the MMR's storage context), MMR appends use a preprocessing phase:
graph TD
subgraph batch["Batch Processing"]
subgraph pre["1. PREPROCESSING (preprocess_mmr_tree_ops)"]
P1["Group MmrTreeAppend ops<br/>by (path, key)"]
P2["For each group:<br/>load MMR from data storage"]
P3["Push ALL values<br/>into the MMR"]
P4["Save updated nodes<br/>back to data storage"]
P5["Replace group with single<br/>ReplaceNonMerkTreeRoot op<br/>carrying new mmr_root<br/>and mmr_size"]
P1 --> P2 --> P3 --> P4 --> P5
end
subgraph body["2. STANDARD apply_body"]
B1["execute_ops_on_path sees<br/>ReplaceNonMerkTreeRoot<br/>(standard Merk update)"]
B2["Propagate root hash<br/>upward through grove"]
B1 --> B2
end
P5 --> B1
end
style pre fill:#fef9e7,stroke:#f39c12,stroke-width:2px
style body fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
Example: A batch with 3 appends to the same MMR:
```rust
vec![
    QualifiedGroveDbOp { path: p, key: k, op: MmrTreeAppend { value: v1 } },
    QualifiedGroveDbOp { path: p, key: k, op: MmrTreeAppend { value: v2 } },
    QualifiedGroveDbOp { path: p, key: k, op: MmrTreeAppend { value: v3 } },
]
```
Preprocessing loads the MMR once, pushes v1, v2, v3 (creating all intermediate
nodes), saves everything to data storage, then emits a single ReplaceNonMerkTreeRoot
with the final mmr_root and mmr_size. The standard batch machinery handles
the rest.
Proof Generation
MMR proofs are V1 proofs — they use the ProofBytes::MMR variant in the
layered proof structure (see §9.6). The proof demonstrates that specific leaf
values exist at specific positions within the MMR, and that their hashes are
consistent with the mmr_root stored in the parent element.
Query Encoding
Query keys encode positions as big-endian u64 bytes. This preserves
lexicographic sort order (since BE encoding is monotonic), allowing all standard
QueryItem variants to work:
QueryItem::Key([0,0,0,0,0,0,0,5]) → leaf index 5
QueryItem::RangeInclusive([..2]..=[..7]) → leaf indices [2, 3, 4, 5, 6, 7]
QueryItem::RangeFrom([..10]..) → leaf indices [10, 11, ..., N-1]
QueryItem::RangeFull → all leaves [0..leaf_count)
A safety cap of 10,000,000 indices prevents memory exhaustion from unbounded range queries. An empty MMR (zero leaves) returns an empty proof.
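The reason BE encoding "just works" with range QueryItems is that fixed-width big-endian bytes compare lexicographically in the same order as the numbers they encode. A small illustration (helper names hypothetical):

```rust
// Encode a leaf index as a fixed-width big-endian key.
fn encode_leaf_index(i: u64) -> [u8; 8] {
    i.to_be_bytes()
}

// Decode, rejecting keys that are not exactly 8 bytes.
fn decode_leaf_index(key: &[u8]) -> Option<u64> {
    Some(u64::from_be_bytes(key.try_into().ok()?))
}
```

Note that variable-width encodings would break this: as strings, "10" sorts before "2", but the 8-byte BE forms sort numerically.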
The MmrTreeProof Structure
```rust
struct MmrTreeProof {
    mmr_size: u64,               // MMR size at proof time
    leaves: Vec<(u64, Vec<u8>)>, // (leaf_index, value) for each proved leaf
    proof_items: Vec<[u8; 32]>,  // Sibling/peak hashes for verification
}
```
The proof_items contain the minimal set of hashes needed to reconstruct
paths from the proved leaves up to the MMR root. These are the sibling nodes
at each level and the uninvolved peak hashes.
Generation Flow
graph TD
subgraph gen["generate_mmr_layer_proof"]
G1["1. Get subquery items<br/>from PathQuery"]
G2["2. Decode BE u64 keys<br/>→ leaf indices"]
G3["3. Open data storage<br/>at subtree path"]
G4["4. Load all MMR nodes<br/>into read-only MemNodeStore"]
G5["5. Call ckb gen_proof(positions)<br/>→ MerkleProof"]
G6["6. Read each proved leaf's<br/>full value from storage"]
G7["7. Extract proof_items<br/>as [u8; 32] hashes"]
G8["8. Encode MmrTreeProof<br/>→ ProofBytes::MMR(bytes)"]
G1 --> G2 --> G3 --> G4 --> G5 --> G6 --> G7 --> G8
end
style gen fill:#fef9e7,stroke:#f39c12,stroke-width:2px
Step 4 uses a MemNodeStore — a read-only BTreeMap that pre-loads all MMR
nodes from data storage. The ckb proof generator needs random access, so all
nodes must be in memory.
Step 5 is where the ckb library does the heavy lifting: given the MMR size and the positions to prove, it determines which sibling and peak hashes are needed.
Worked Example
Proving leaf 2 in a 5-leaf MMR (mmr_size = 8):
MMR structure:
pos: 6 7
/ \
2 5
/ \ / \
0 1 3 4
Leaf index 2 → MMR position 3
To verify leaf at position 3:
1. Hash the claimed value: leaf_hash = blake3(value)
2. Sibling at position 4: node₅ = blake3(leaf_hash || proof[pos 4])
3. Sibling at position 2: node₆ = blake3(proof[pos 2] || node₅)
4. Peak at position 7: root = bag(node₆, proof[pos 7])
5. Compare: root == expected mmr_root ✓
proof_items = [hash(pos 4), hash(pos 2), hash(pos 7)]
leaves = [(2, original_value_bytes)]
The proof size for this example is: 3 hashes (96 bytes) + 1 leaf value + metadata. In general, proving K leaves from an N-leaf MMR requires O(K * log N) sibling hashes.
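The leaf-index-to-position mapping used in the example above has a standard MMR closed form, pos = 2i − popcount(i). The helper below is illustrative only, not the actual grovedb-mmr API:

```rust
/// Map a 0-based leaf index to its MMR position.
/// Standard closed form: pos = 2*i - popcount(i).
/// (Hypothetical helper for illustration; the real code relies on the
/// ckb merkle-mountain-range crate's position utilities.)
fn leaf_index_to_pos(i: u64) -> u64 {
    2 * i - u64::from(i.count_ones())
}

fn main() {
    // Matches the worked example: leaf index 2 sits at MMR position 3.
    assert_eq!(leaf_index_to_pos(2), 3);
    // The five leaves of the 5-leaf MMR above land at positions 0, 1, 3, 4, 7.
    let positions: Vec<u64> = (0..5).map(leaf_index_to_pos).collect();
    assert_eq!(positions, vec![0, 1, 3, 4, 7]);
    println!("{:?}", positions);
}
```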
Proof Verification
Verification is pure — it requires no database access. The verifier needs only the proof bytes and the expected MMR root hash (which it extracts from the parent element proven in the Merk layer above).
Verification Steps
graph TD
subgraph verify["verify_mmr_lower_layer"]
V1["1. Deserialize MmrTreeProof<br/>(bincode, 100MB limit)"]
V2["2. Cross-validate:<br/>proof.mmr_size == element.mmr_size"]
V3["3. For each proved leaf:<br/>reconstruct MmrNode::leaf(value)<br/>hash = blake3(value)"]
V4["4. Reconstruct proof_items<br/>as MmrNode::internal(hash)"]
V5["5. Build ckb MerkleProof<br/>(mmr_size, proof_nodes)"]
V6["6. Call proof.verify(<br/>root = MmrNode::internal(mmr_root),<br/>leaves = [(pos, leaf_node), ...]<br/>)"]
V7["7. Return verified<br/>(leaf_index, value) pairs"]
V1 --> V2 --> V3 --> V4 --> V5 --> V6 --> V7
end
V6 -->|"root mismatch"| FAIL["Err: InvalidProof<br/>'MMR proof root hash mismatch'"]
V6 -->|"root matches"| V7
style verify fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style FAIL fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
The ckb MerkleProof::verify function reconstructs the root from the leaves
and proof items, then compares it (using PartialEq, which checks only the
hash) against the expected root.
Chain of Trust
The full chain from GroveDB state root to a verified leaf value:
GroveDB state_root (known/trusted)
│
├─ V0 Merk proof layer 0: proves subtree exists at root
│ └─ root_hash matches state_root ✓
│
├─ V0 Merk proof layer 1: proves MmrTree element at path/key
│ └─ KVValueHash node: element_bytes contain mmr_root
│ └─ combined_hash = combine_hash(H(element_bytes), mmr_root)
│ └─ root_hash matches parent layer ✓
│
└─ V1 MMR proof: proves leaf values are in the MMR
└─ Reconstruct paths from leaves through siblings to peaks
└─ Bag peaks → reconstructed root
└─ reconstructed root == mmr_root from element_bytes ✓
└─ Result: leaf₂ = [verified value bytes]
Security Properties
- mmr_size cross-validation: The proof's mmr_size must match the element's mmr_size. A mismatch indicates the proof was generated against a different state and is rejected.
- Bincode size limit: Deserialization uses a 100MB limit to prevent crafted length headers from causing huge allocations.
- Limit accounting: Each proved leaf decrements the overall query limit by 1, using saturating_sub to prevent underflow.
- Child hash return: The verifier returns the computed MMR root as the child hash for the combine_hash computation in the parent layer.
- V0 rejection: Attempting a subquery into an MmrTree with V0 proofs returns Error::NotSupported. Only V1 proofs can descend into non-Merk trees.
Cost Tracking
MMR operations track costs with precision:
| Operation | Hash calls | Storage operations |
|---|---|---|
| Append 1 leaf | 1 + trailing_ones(leaf_count) | 1 leaf write + N internal writes |
| Root hash | 0 (cached in Element) | 1 Element read |
| Get value | 0 | 1 Element read + 1 data read |
| Leaf count | 0 | 1 Element read |
The hash count formula 1 + trailing_ones(N) gives the exact number of Blake3
calls: 1 for the leaf hash, plus one merge hash per cascade level.
Amortized analysis: Over N appends, the total hash count is:
Σ (1 + trailing_ones(i)) for i = 0..N-1
= N + Σ trailing_ones(i) for i = 0..N-1
= N + (N - popcount(N))
≈ 2N
So the amortized cost per append is approximately 2 Blake3 hash calls — constant and independent of the tree size. Compare this to Merk AVL trees where each insert requires O(log N) hashes for the path plus potential rotation hashes.
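The identity behind this derivation, Σ trailing_ones(i) for i = 0..N−1 equals N − popcount(N), is easy to verify numerically. A quick sketch:

```rust
/// Hashes for the i-th append (0-based): 1 leaf hash plus one merge
/// per trailing 1-bit of the previous leaf count (the cascade).
fn hashes_for_append(i: u64) -> u64 {
    1 + u64::from(i.trailing_ones())
}

fn main() {
    // Check the closed form N + (N - popcount(N)) against a direct sum.
    for n in [1u64, 7, 64, 1000] {
        let total: u64 = (0..n).map(hashes_for_append).sum();
        assert_eq!(total, n + (n - u64::from(n.count_ones())));
        // The amortized cost stays below 2 hashes per append.
        assert!(total < 2 * n);
    }
    println!("identity holds");
}
```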
Storage cost: Each append writes 1 leaf node (37 + value_len bytes) plus 0 to log₂(N) internal nodes (33 bytes each). The amortized storage write per append is approximately 33 + 37 + value_len bytes ≈ 70 + value_len bytes.
Implementation Files
| File | Purpose |
|---|---|
| grovedb-mmr/src/node.rs | MmrNode struct, Blake3 merge, serialization |
| grovedb-mmr/src/grove_mmr.rs | GroveMmr wrapper around ckb MMR |
| grovedb-mmr/src/util.rs | mmr_node_key, hash_count_for_push, mmr_size_to_leaf_count |
| grovedb-mmr/src/proof.rs | MmrTreeProof generation and verification |
| grovedb-mmr/src/dense_merkle.rs | Dense Merkle tree roots (used by BulkAppendTree) |
| grovedb/src/operations/mmr_tree.rs | GroveDB operations + MmrStore adapter + batch preprocessing |
| grovedb/src/operations/proof/generate.rs | V1 proof generation: generate_mmr_layer_proof, query_items_to_leaf_indices |
| grovedb/src/operations/proof/verify.rs | V1 proof verification: verify_mmr_lower_layer |
| grovedb/src/tests/mmr_tree_tests.rs | 28 integration tests |
Comparison with Other Authenticated Structures
| | MMR (MmrTree) | Merk AVL (Tree) | Sinsemilla (CommitmentTree) |
|---|---|---|---|
| Use case | Append-only logs | Key-value store | ZK-friendly commitments |
| Hash function | Blake3 | Blake3 | Sinsemilla (Pallas curve) |
| Operations | Append, read by index | Insert, update, delete, query | Append, witness |
| Amortized hash/write | ~2 | O(log N) | ~33 (32 levels + ommers) |
| Proof type | V1 (MMR sibling hashes) | V0 (Merk path proof) | Witness (Merkle auth path) |
| ZK-friendly | No | No | Yes (Halo 2 circuits) |
| Rebalancing | None | AVL rotations | None |
| Delete support | No | Yes | No |
The BulkAppendTree — High-Throughput Append-Only Storage
The BulkAppendTree is GroveDB's answer to a specific engineering challenge: how do you build a high-throughput append-only log that supports efficient range proofs, minimises per-write hashing, and produces immutable chunk snapshots suitable for CDN distribution?
While an MmrTree (Chapter 13) is ideal for individual leaf proofs, the BulkAppendTree is designed for workloads where thousands of values arrive per block and clients need to sync by fetching ranges of data. It achieves this with a two-level architecture: a dense Merkle tree buffer that absorbs incoming appends, and a chunk-level MMR that records finalized chunk roots.
The Two-Level Architecture
┌────────────────────────────────────────────────────────────────┐
│ BulkAppendTree │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Chunk MMR │ │
│ │ ┌────┐ ┌────┐ ┌────┐ ┌────┐ │ │
│ │ │ R0 │ │ R1 │ │ R2 │ │ H │ ← Dense Merkle roots │ │
│ │ └────┘ └────┘ └────┘ └────┘ of each chunk blob │ │
│ │ peak hashes bagged together = MMR root│ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ ┌──────────────────────────────────────────────────────────┐ │
│ │ Buffer (DenseFixedSizedMerkleTree, capacity = 2^h - 1) │ │
│ │ ┌───┐ ┌───┐ ┌───┐ │ │
│ │ │v_0│ │v_1│ │v_2│ ... (fills in level-order) │ │
│ │ └───┘ └───┘ └───┘ │ │
│ │ dense_tree_root = recomputed root hash of dense tree │ │
│ └──────────────────────────────────────────────────────────┘ │
│ │
│ state_root = blake3("bulk_state" || mmr_root || dense_tree_root) │
│ │
└────────────────────────────────────────────────────────────────┘
Level 1 — The Buffer. Incoming values are written to a DenseFixedSizedMerkleTree
(see Chapter 16). The buffer capacity is 2^height - 1 positions. The dense tree's
root hash (dense_tree_root) updates after every insert.
Level 2 — The Chunk MMR. When the buffer fills (reaches chunk_size entries),
all entries are serialized into an immutable chunk blob, a dense Merkle root is
computed over those entries, and that root is appended as a leaf to the chunk MMR.
The buffer is then cleared.
The state root combines both levels into a single 32-byte commitment that changes on every append, ensuring the parent Merk tree always reflects the latest state.
How Values Fill the Buffer
Each call to append() follows this sequence:
Step 1: Write value to dense tree buffer at next position
dense_tree.insert(value, store)
Step 2: Increment total_count
total_count += 1
Step 3: Check if buffer is full (dense tree at capacity)
if dense_tree.count() == capacity:
→ trigger compaction (§14.3)
Step 4: Compute new state root (+1 blake3 call)
dense_tree_root = dense_tree.root_hash(store)
state_root = blake3("bulk_state" || mmr_root || dense_tree_root)
The buffer IS a DenseFixedSizedMerkleTree (see Chapter 16). Its root hash changes after every insert, providing a commitment to all current buffer entries. This root hash is what flows into the state root computation.
Chunk Compaction
When the buffer fills (reaches chunk_size entries), compaction fires automatically:
Compaction Steps:
─────────────────
1. Read all chunk_size buffer entries
2. Compute dense Merkle root
- Hash each entry: leaf[i] = blake3(entry[i])
- Build complete binary tree bottom-up
- Extract root hash
Hash cost: chunk_size + (chunk_size - 1) = 2 * chunk_size - 1
3. Serialize entries into chunk blob
- Auto-selects fixed-size or variable-size format (§14.6)
- Store as: store.put(chunk_key(chunk_index), blob)
4. Append dense Merkle root to chunk MMR
- MMR push with merge cascade (see Chapter 13)
Hash cost: ~2 amortized (trailing_ones pattern)
5. Reset the dense tree (clear all buffer entries from storage)
- Dense tree count reset to 0
After compaction, the chunk blob is permanently immutable — it never changes again. This makes chunk blobs ideal for CDN caching, client sync, and archival storage.
Example: 4 appends with chunk_power=2 (chunk_size=4)
Append v_0: dense_tree=[v_0], dense_root=H(v_0), total=1
Append v_1: dense_tree=[v_0,v_1], dense_root=H(v_0,v_1), total=2
Append v_2: dense_tree=[v_0..v_2], dense_root=H(v_0..v_2), total=3
Append v_3: dense_tree=[v_0..v_3], dense_root=H(v_0..v_3), total=4
→ COMPACTION:
chunk_blob_0 = serialize([v_0, v_1, v_2, v_3])
dense_root_0 = dense_merkle_root(v_0..v_3)
mmr.push(dense_root_0)
dense tree cleared (count=0)
Append v_4: dense_tree=[v_4], dense_root=H(v_4), total=5
→ state_root = blake3("bulk_state" || mmr_root || dense_root)
The State Root
The state root binds both levels into one hash:
fn compute_state_root(
    mmr_root: &[u8; 32],        // Chunk MMR root (or [0; 32] if empty)
    dense_tree_root: &[u8; 32], // Root hash of current buffer (dense tree)
) -> [u8; 32] {
    blake3("bulk_state" || mmr_root || dense_tree_root)
}
The total_count and chunk_power are not included in the state root because
they are already authenticated by the Merk value hash — they are fields of the
serialized Element stored in the parent Merk node. The state root captures only the
data-level commitments (mmr_root and dense_tree_root). This is the hash that
flows as the Merk child hash and propagates up to the GroveDB root hash.
The Dense Merkle Root
When a chunk compacts, the entries need a single 32-byte commitment. The BulkAppendTree uses a dense (complete) binary Merkle tree:
Given entries [e_0, e_1, e_2, e_3]:
Level 0 (leaves): blake3(e_0) blake3(e_1) blake3(e_2) blake3(e_3)
\__________/ \__________/
Level 1: blake3(h_0 || h_1) blake3(h_2 || h_3)
\____________________/
Level 2 (root): blake3(h_01 || h_23) ← this is the dense Merkle root
Because chunk_size is always a power of 2 (by construction: 1u32 << chunk_power),
the tree is always complete (no padding or dummy leaves needed). The hash count is
exactly 2 * chunk_size - 1:
- chunk_size leaf hashes (one per entry)
- chunk_size - 1 internal node hashes
The dense Merkle root implementation lives in grovedb-mmr/src/dense_merkle.rs and
provides two functions:
- compute_dense_merkle_root(hashes) — from pre-hashed leaves
- compute_dense_merkle_root_from_values(values) — hashes values first, then builds the tree
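The 2 * chunk_size - 1 hash count can be checked with a short calculation that counts invocations level by level (a sketch; it counts hash calls rather than performing any Blake3 hashing):

```rust
/// Closed-form hash count for a complete binary Merkle tree over a
/// power-of-two number of entries: one hash per leaf, one per internal node.
fn dense_merkle_hash_count(chunk_size: u64) -> u64 {
    2 * chunk_size - 1
}

fn main() {
    for power in 1..=10u32 {
        let chunk_size = 1u64 << power;
        // Count invocations level by level: leaves first, then each merge level.
        let mut count = chunk_size; // one leaf hash per entry
        let mut width = chunk_size;
        while width > 1 {
            width /= 2;
            count += width; // one internal hash per pair at this level
        }
        assert_eq!(count, dense_merkle_hash_count(chunk_size));
    }
    println!("2 * chunk_size - 1 confirmed");
}
```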
Chunk Blob Serialization
Chunk blobs are the immutable archives produced by compaction. The serializer auto-selects the most compact wire format based on entry sizes:
Fixed-size format (flag 0x01) — when all entries have the same length:
┌──────┬──────────┬─────────────┬─────────┬─────────┬─────────┐
│ 0x01 │ count │ entry_size │ entry_0 │ entry_1 │ ... │
│ 1B │ 4B (BE) │ 4B (BE) │ N bytes │ N bytes │ │
└──────┴──────────┴─────────────┴─────────┴─────────┴─────────┘
Total: 1 + 4 + 4 + (count × entry_size) bytes
Variable-size format (flag 0x00) — when entries have different lengths:
┌──────┬──────────┬─────────┬──────────┬─────────┬─────────────┐
│ 0x00 │ len_0 │ entry_0 │ len_1 │ entry_1 │ ... │
│ 1B │ 4B (BE) │ N bytes │ 4B (BE) │ M bytes │ │
└──────┴──────────┴─────────┴──────────┴─────────┴─────────────┘
Total: 1 + Σ(4 + len_i) bytes
The fixed-size format saves 4 bytes per entry compared to variable-size, which adds up significantly for large chunks of uniform-size data (like 32-byte hash commitments). For 1024 entries of 32 bytes each:
- Fixed: 1 + 4 + 4 + 32,768 = 32,777 bytes
- Variable: 1 + 1024 × (4 + 32) = 36,865 bytes
- Savings: ~11%
Storage Key Layout
All BulkAppendTree data lives in the data namespace, keyed with single-character prefixes:
| Key pattern | Format | Size | Purpose |
|---|---|---|---|
| M | 1 byte | 1B | Metadata key |
| b + {index} | b + u32 BE | 5B | Buffer entry at index |
| e + {index} | e + u64 BE | 9B | Chunk blob at index |
| m + {pos} | m + u64 BE | 9B | MMR node at position |
Metadata stores mmr_size (8 bytes BE). The total_count and chunk_power are
stored in the Element itself (in the parent Merk), not in data namespace metadata.
This split means reading the count is a simple element lookup without opening the
data storage context.
Buffer keys use u32 indices (0 to chunk_size - 1) because the buffer capacity is
limited by chunk_size (a u32, computed as 1u32 << chunk_power). Chunk keys use u64
indices because the number of completed chunks can grow indefinitely.
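The key layout can be sketched as small constructors (names mirror the table above; the real implementations live in tree/keys.rs and may differ in detail):

```rust
/// Data-namespace keys per the table above (illustrative sketch).
fn buffer_key(index: u32) -> Vec<u8> {
    let mut k = vec![b'b'];
    k.extend_from_slice(&index.to_be_bytes()); // u32 BE: buffer is chunk-bounded
    k
}
fn chunk_key(index: u64) -> Vec<u8> {
    let mut k = vec![b'e'];
    k.extend_from_slice(&index.to_be_bytes()); // u64 BE: chunks grow without bound
    k
}
fn mmr_node_key(pos: u64) -> Vec<u8> {
    let mut k = vec![b'm'];
    k.extend_from_slice(&pos.to_be_bytes());
    k
}

fn main() {
    assert_eq!(buffer_key(0).len(), 5);  // 1-byte prefix + u32 BE
    assert_eq!(chunk_key(0).len(), 9);   // 1-byte prefix + u64 BE
    assert_eq!(mmr_node_key(0).len(), 9);
    // Big-endian encoding keeps keys sorted by index under byte-wise ordering.
    assert!(buffer_key(1) < buffer_key(256));
    println!("ok");
}
```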
The BulkAppendTree Struct
pub struct BulkAppendTree<S> {
    pub total_count: u64,                         // Total values ever appended
    pub dense_tree: DenseFixedSizedMerkleTree<S>, // The buffer (dense tree)
}
The buffer IS a DenseFixedSizedMerkleTree — its root hash is dense_tree_root.
Accessors:
- capacity() -> u16: dense_tree.capacity() (= 2^height - 1)
- epoch_size() -> u64: capacity + 1 (= 2^height, the number of entries per chunk)
- height() -> u8: dense_tree.height()
Derived values (not stored):
| Value | Formula |
|---|---|
| chunk_count | total_count / epoch_size |
| buffer_count | dense_tree.count() |
| mmr_size | leaf_count_to_mmr_size(chunk_count) |
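The derived-value formulas are simple enough to illustrate directly. The closed form used for leaf_count_to_mmr_size below, 2n − popcount(n), is an assumption here, but it is consistent with the 5-leaf / mmr_size = 8 example in the MMR chapter:

```rust
/// Completed chunks, derived from the stored total count.
fn chunk_count(total_count: u64, epoch_size: u64) -> u64 {
    total_count / epoch_size
}

/// MMR size for n leaves: each leaf adds one node plus one parent per
/// carry, giving the closed form 2n - popcount(n) (assumed formula).
fn leaf_count_to_mmr_size(n: u64) -> u64 {
    2 * n - u64::from(n.count_ones())
}

fn main() {
    // With epoch_size = 4 and 10 appends: 2 completed chunks, 2 values buffered.
    assert_eq!(chunk_count(10, 4), 2);
    assert_eq!(10 % 4, 2);
    // 5 chunk roots in the MMR occupy 8 positions, matching the earlier example.
    assert_eq!(leaf_count_to_mmr_size(5), 8);
    println!("ok");
}
```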
GroveDB Operations
The BulkAppendTree integrates with GroveDB through six operations defined in
grovedb/src/operations/bulk_append_tree.rs:
bulk_append
The primary mutating operation. Follows the standard GroveDB non-Merk storage pattern:
1. Validate element is BulkAppendTree
2. Open data storage context
3. Load tree from store
4. Append value (may trigger compaction)
5. Update element in parent Merk with new state_root + total_count
6. Propagate changes up through Merk hierarchy
7. Commit transaction
The AuxBulkStore adapter wraps GroveDB's get_aux/put_aux/delete_aux calls and
accumulates OperationCost in a RefCell for cost tracking. Hash costs from the
append operation are added to cost.hash_node_calls.
Read operations
| Operation | What it returns | Aux storage? |
|---|---|---|
| bulk_get_value(path, key, position) | Value at global position | Yes — reads from chunk blob or buffer |
| bulk_get_chunk(path, key, chunk_index) | Raw chunk blob | Yes — reads chunk key |
| bulk_get_buffer(path, key) | All current buffer entries | Yes — reads buffer keys |
| bulk_count(path, key) | Total count (u64) | No — reads from element |
| bulk_chunk_count(path, key) | Completed chunks (u64) | No — computed from element |
The get_value operation transparently routes by position:
if position < completed_chunks × chunk_size:
chunk_idx = position / chunk_size
intra_idx = position % chunk_size
→ read chunk blob, deserialize, return entry[intra_idx]
else:
buffer_idx = position % chunk_size
→ read buffer_key(buffer_idx)
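The routing logic above can be sketched as follows (the Location enum is hypothetical, for illustration; the real operation returns the value bytes directly):

```rust
/// Where a global position's value lives: a compacted chunk or the live buffer.
enum Location {
    Chunk { chunk_idx: u64, intra_idx: u64 },
    Buffer { buffer_idx: u64 },
}

/// Route a global position, mirroring the pseudocode above.
fn route(position: u64, completed_chunks: u64, chunk_size: u64) -> Location {
    if position < completed_chunks * chunk_size {
        Location::Chunk {
            chunk_idx: position / chunk_size,
            intra_idx: position % chunk_size,
        }
    } else {
        Location::Buffer { buffer_idx: position % chunk_size }
    }
}

fn main() {
    // 2 completed chunks of 4 entries: positions 0..8 are in chunks, 8.. in buffer.
    match route(6, 2, 4) {
        Location::Chunk { chunk_idx, intra_idx } => {
            assert_eq!((chunk_idx, intra_idx), (1, 2));
        }
        _ => panic!("expected chunk"),
    }
    match route(9, 2, 4) {
        Location::Buffer { buffer_idx } => assert_eq!(buffer_idx, 1),
        _ => panic!("expected buffer"),
    }
    println!("ok");
}
```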
Batch Operations and Preprocessing
BulkAppendTree supports batch operations through the GroveOp::BulkAppend variant.
Since execute_ops_on_path doesn't have access to the data storage context, all BulkAppend ops
must be preprocessed before apply_body.
The preprocessing pipeline:
Input: [BulkAppend{v1}, Insert{...}, BulkAppend{v2}, BulkAppend{v3}]
↑ same (path,key) as v1
Step 1: Group BulkAppend ops by (path, key)
group_1: [v1, v2, v3]
Step 2: For each group:
a. Read existing element → get (total_count, chunk_power)
b. Open transactional storage context
c. Load BulkAppendTree from store
d. Load existing buffer into memory (Vec<Vec<u8>>)
e. For each value:
tree.append_with_mem_buffer(store, value, &mut mem_buffer)
f. Save metadata
g. Compute final state_root
Step 3: Replace all BulkAppend ops with one ReplaceNonMerkTreeRoot per group
carrying: hash=state_root, meta=BulkAppendTree{total_count, chunk_power}
Output: [ReplaceNonMerkTreeRoot{...}, Insert{...}]
The append_with_mem_buffer variant avoids read-after-write issues: buffer entries
are tracked in a Vec<Vec<u8>> in memory, so compaction can read them even though
the transactional storage hasn't committed yet.
The BulkStore Trait
pub trait BulkStore {
    fn get(&self, key: &[u8]) -> Result<Option<Vec<u8>>, String>;
    fn put(&self, key: &[u8], value: &[u8]) -> Result<(), String>;
    fn delete(&self, key: &[u8]) -> Result<(), String>;
}
Methods take &self (not &mut self) to match GroveDB's interior mutability pattern
where writes go through a batch. The GroveDB integration implements this via
AuxBulkStore which wraps a StorageContext and accumulates OperationCost.
The MmrAdapter bridges BulkStore to the ckb MMR's MMRStoreReadOps/
MMRStoreWriteOps traits, adding a write-through cache for read-after-write
correctness.
Proof Generation
BulkAppendTree proofs support range queries over positions. The proof structure captures everything needed for a stateless verifier to confirm that specific data exists in the tree:
pub struct BulkAppendTreeProof {
    pub chunk_power: u8,
    pub total_count: u64,
    pub chunk_blobs: Vec<(u64, Vec<u8>)>,       // Full chunk blobs
    pub chunk_mmr_size: u64,
    pub chunk_mmr_proof_items: Vec<[u8; 32]>,   // MMR sibling hashes
    pub chunk_mmr_leaves: Vec<(u64, [u8; 32])>, // (leaf_idx, dense_root)
    pub buffer_entries: Vec<Vec<u8>>,           // ALL buffer entries
    pub chunk_mmr_root: [u8; 32],
}
Generation steps for a range [start, end) (with chunk_size = 1u32 << chunk_power):
1. Determine overlapping chunks
first_chunk = start / chunk_size
last_chunk = min((end-1) / chunk_size, completed_chunks - 1)
2. Read chunk blobs for overlapping chunks
For each chunk_idx in [first_chunk, last_chunk]:
chunk_blobs.push((chunk_idx, store.get(chunk_key(idx))))
3. Compute dense Merkle root for each chunk blob
For each blob:
deserialize → values
dense_root = compute_dense_merkle_root_from_values(values)
chunk_mmr_leaves.push((chunk_idx, dense_root))
4. Generate MMR proof for those chunk positions
positions = chunk_indices.map(|idx| leaf_to_pos(idx))
proof = mmr.gen_proof(positions)
chunk_mmr_proof_items = proof.proof_items().map(|n| n.hash)
5. Get chunk MMR root
6. Read ALL buffer entries (bounded by chunk_size)
for i in 0..buffer_count:
buffer_entries.push(store.get(buffer_key(i)))
Why include ALL buffer entries? The buffer is a dense Merkle tree whose root hash
commits to every entry. The verifier must rebuild the tree from all entries to verify
the dense_tree_root. Since the buffer is bounded by capacity (at most 65,535
entries), this is a reasonable cost.
Proof Verification
Verification is a pure function — no database access needed. It performs five checks:
Step 0: Metadata consistency
- chunk_power <= 31
- buffer_entries.len() == total_count % chunk_size
- MMR leaf count == completed_chunks
Step 1: Chunk blob integrity
For each (chunk_idx, blob):
values = deserialize(blob)
computed_root = dense_merkle_root(values)
assert computed_root == chunk_mmr_leaves[chunk_idx]
Step 2: Chunk MMR proof
Reconstruct MmrNode leaves and proof items
proof.verify(chunk_mmr_root, leaves) == true
Step 3: Buffer (dense tree) integrity
Rebuild DenseFixedSizedMerkleTree from buffer_entries
dense_tree_root = compute root hash of rebuilt tree
Step 4: State root
computed = blake3("bulk_state" || chunk_mmr_root || dense_tree_root)
assert computed == expected_state_root
After verification succeeds, the BulkAppendTreeProofResult provides a
values_in_range(start, end) method that extracts specific values from the verified
chunk blobs and buffer entries.
How It Ties to the GroveDB Root Hash
The BulkAppendTree is a non-Merk tree — it stores data in the data namespace, not in a child Merk subtree. In the parent Merk, the element is stored as:
Element::BulkAppendTree(total_count, chunk_power, flags)
The state_root flows as the Merk child hash (via insert_subtree's
subtree_root_hash parameter). The parent Merk node hash is:
combine_hash(value_hash(element_bytes), state_root)
Any change to the state root therefore propagates up through the GroveDB Merk
hierarchy to the root hash.
In V1 proofs (§9.6), the parent Merk proof proves the element bytes and the child
hash binding, and the BulkAppendTreeProof proves that the queried data is consistent
with the state_root used as the child hash.
Cost Tracking
Each operation's hash cost is tracked explicitly:
| Operation | Blake3 calls | Notes |
|---|---|---|
| Single append (no compaction) | 3 | 2 for buffer hash chain + 1 for state root |
| Single append (with compaction) | 3 + 2C - 1 + ~2 | Chain + dense Merkle (C=chunk_size) + MMR push + state root |
| get_value from chunk | 0 | Pure deserialization, no hashing |
| get_value from buffer | 0 | Direct key lookup |
| Proof generation | Depends on chunk count | Dense Merkle root per chunk + MMR proof |
| Proof verification | 2C·K - K + B·2 + 1 | K chunks, B buffer entries, C chunk_size |
Amortized cost per append: For chunk_size=1024 (chunk_power=10), the compaction overhead of ~2047 hashes (dense Merkle root) is amortized over 1024 appends, adding ~2 hashes per append. Combined with the 3 per-append hashes, the amortized total is ~5 blake3 calls per append — very efficient for a cryptographically authenticated structure.
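As a sanity check on the amortization claim, under the stated assumptions (3 hashes per append, plus 2C − 1 dense-Merkle hashes and ~2 MMR-push hashes per compaction, spread over C appends):

```rust
/// Amortized Blake3 calls per append: 3 per-append hashes plus the
/// compaction cost ((2C - 1) dense-Merkle hashes + ~2 MMR-push hashes)
/// spread over the C appends that fill a chunk.
fn amortized_hashes_per_append(chunk_size: f64) -> f64 {
    3.0 + ((2.0 * chunk_size - 1.0) + 2.0) / chunk_size
}

fn main() {
    let cost = amortized_hashes_per_append(1024.0);
    // ~5 hashes per append at chunk_size = 1024, as the text states.
    assert!((cost - 5.0).abs() < 0.01);
    println!("{cost:.3}");
}
```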
Comparison with MmrTree
| | BulkAppendTree | MmrTree |
|---|---|---|
| Architecture | Two-level (buffer + chunk MMR) | Single MMR |
| Per-append hash cost | 3 (+ amortized ~2 for compaction) | ~2 |
| Proof granularity | Range queries over positions | Individual leaf proofs |
| Immutable snapshots | Yes (chunk blobs) | No |
| CDN-friendly | Yes (chunk blobs cacheable) | No |
| Buffer entries | Yes (need all for proof) | N/A |
| Best for | High-throughput logs, bulk sync | Event logs, individual lookups |
| Element discriminant | 13 | 12 |
| TreeType | 9 | 8 |
Choose MmrTree when you need individual leaf proofs with minimal overhead. Choose BulkAppendTree when you need range queries, bulk synchronization, and chunk-based snapshots.
Implementation Files
| File | Purpose |
|---|---|
| grovedb-bulk-append-tree/src/lib.rs | Crate root, re-exports |
| grovedb-bulk-append-tree/src/tree/mod.rs | BulkAppendTree struct, state accessors, metadata persistence |
| grovedb-bulk-append-tree/src/tree/append.rs | append(), append_with_mem_buffer(), compact_entries() |
| grovedb-bulk-append-tree/src/tree/hash.rs | compute_state_root |
| grovedb-bulk-append-tree/src/tree/keys.rs | META_KEY, buffer_key, chunk_key, mmr_node_key |
| grovedb-bulk-append-tree/src/tree/query.rs | get_value, get_chunk, get_buffer |
| grovedb-bulk-append-tree/src/tree/mmr_adapter.rs | MmrAdapter with write-through cache |
| grovedb-bulk-append-tree/src/chunk.rs | Chunk blob serialization (fixed + variable formats) |
| grovedb-bulk-append-tree/src/proof.rs | BulkAppendTreeProof generation and verification |
| grovedb-bulk-append-tree/src/store.rs | BulkStore trait |
| grovedb-bulk-append-tree/src/error.rs | BulkAppendError enum |
| grovedb/src/operations/bulk_append_tree.rs | GroveDB operations, AuxBulkStore, batch preprocessing |
| grovedb/src/tests/bulk_append_tree_tests.rs | 27 integration tests |
The CommitmentTree — Sinsemilla Commitment Anchors
The CommitmentTree is GroveDB's bridge between authenticated storage and zero-knowledge proof systems. It combines a BulkAppendTree (Chapter 14) for efficient chunk-compacted data storage with a Sinsemilla frontier in the data namespace for ZK-compatible anchors. Like MmrTree and BulkAppendTree, it has no child Merk — the combined root hash flows as the Merk child hash. Both the BulkAppendTree entries and the Sinsemilla frontier live in the data namespace.
This chapter covers the Sinsemilla hash function and why it matters for zero-knowledge circuits, the frontier data structure and its compact serialization, the dual-namespace storage architecture, the GroveDB operations, batch preprocessing, client-side witness generation, and how proofs work.
Why a ZK-Friendly Tree?
GroveDB's standard trees use Blake3 hashing. Blake3 is fast in software, but expensive inside zero-knowledge circuits. When a spender needs to prove "I know a note at position P in the commitment tree" without revealing P, they must evaluate the Merkle hash function 32 times (once per tree level) inside a ZK circuit.
Sinsemilla (specified in ZIP-244 for the Zcash Orchard protocol) is designed for exactly this use case — it provides efficient in-circuit hashing over the Pallas elliptic curve, one half of the Pasta curve cycle used by the Halo 2 proof system.
| Property | Blake3 | Sinsemilla |
|---|---|---|
| Circuit cost | ~25,000 constraints per hash | ~800 constraints per hash |
| Software speed | Very fast (~2 GB/s) | Slow (~10,000 hashes/s) |
| Algebraic structure | None (bitwise) | Pallas curve point operations |
| Primary purpose | General hashing, Merkle trees | In-circuit Merkle proofs |
| Used by | GroveDB Merk trees, MMR, Bulk | Orchard shielded protocol |
| Output size | 32 bytes | 32 bytes (Pallas field element) |
The CommitmentTree uses Sinsemilla for the Merkle tree that ZK circuits reason about, while still using Blake3 for the GroveDB Merk hierarchy above it. Items inserted into the tree are stored via a BulkAppendTree in the data namespace (chunk-compacted, retrievable by position) and simultaneously appended to the Sinsemilla frontier (producing a ZK-provable anchor).
The Data-Namespace Architecture
The CommitmentTree stores all data in the data namespace at the same subtree
path. Like MmrTree and BulkAppendTree, it has no child Merk (no root_key
field — the type-specific root flows as the Merk child hash). The BulkAppendTree entries and the Sinsemilla frontier coexist
in the data namespace using distinct key prefixes:
┌──────────────────────────────────────────────────────────────┐
│ CommitmentTree │
│ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ Data Namespace │ │
│ │ │ │
│ │ BulkAppendTree storage (Chapter 14): │ │
│ │ Buffer entries → chunk blobs → chunk MMR │ │
│ │ value = cmx (32) || rho (32) || ciphertext (216) │ │
│ │ │ │
│ │ Sinsemilla Frontier (~1KB): │ │
│ │ key: b"__ct_data__" (COMMITMENT_TREE_DATA_KEY) │ │
│ │ Depth-32 incremental Merkle tree │ │
│ │ Stores only the rightmost path (leaf + ommers) │ │
│ │ O(1) append, O(1) root computation │ │
│ │ Produces Orchard-compatible Anchor for ZK proofs │ │
│ └─────────────────────────────────────────────────────────┘ │
│ │
│ sinsemilla_root embedded in Element bytes │
│ → flows through Merk value_hash → GroveDB state root │
└──────────────────────────────────────────────────────────────┘
Why two structures? The BulkAppendTree provides efficient, chunk-compacted storage and retrieval for potentially millions of encrypted notes. The Sinsemilla frontier provides ZK-compatible anchors that can be proved inside a Halo 2 circuit. Both are updated in lockstep on every append.
Compare with the other non-standard tree types:
| | CommitmentTree | MmrTree | BulkAppendTree |
|---|---|---|---|
| Child Merk | No | No | No |
| Data namespace | BulkAppendTree entries + frontier | MMR nodes | Buffer + chunks + MMR |
| Aux namespace | — | — | — |
| Items queryable | Via V1 proofs | Via V1 proofs | Via V1 proofs |
| Hash function | Sinsemilla + Blake3 | Blake3 | Blake3 |
The Sinsemilla Frontier
The frontier is a depth-32 incremental Merkle tree implemented by the
incrementalmerkletree crate's Frontier<MerkleHashOrchard, 32> type. Instead
of storing all 2^32 possible leaves, it stores only the information needed to
append the next leaf and compute the current root: the rightmost leaf and
its ommers (sibling hashes needed for root computation).
root (level 32)
/ \
... ...
/ \
(level 2) (level 2)
/ \ / \
(level 1) (level 1) (level 1) ?
/ \ / \ / \
L0 L1 L2 L3 L4 ? ← frontier stores L4
+ ommers at levels
where left sibling exists
The frontier stores:
- leaf: the most recently appended value (a Pallas field element)
- ommers: the left-sibling hashes at each level where the frontier path goes right (at most 32 ommers for a depth-32 tree)
- position: the 0-indexed position of the leaf
Key properties:
- O(1) append: insert a new leaf, update ommers, recompute root
- O(1) root: traverse the stored ommers from leaf to root
- ~1KB constant size: regardless of how many leaves have been appended
- Deterministic: two frontiers with the same sequence of appends produce the same root
The EMPTY_SINSEMILLA_ROOT constant is the root of an empty depth-32 tree,
precomputed as MerkleHashOrchard::empty_root(Level::from(32)).to_bytes():
0xae2935f1dfd8a24aed7c70df7de3a668eb7a49b1319880dde2bbd9031ae5d82f
How Appending Works — The Ommer Cascade
When a new commitment is appended, the previous frontier leaf is folded into
the ommers; the number of merge hashes this takes equals trailing_ones(P) — the
number of trailing 1-bits in the binary representation of the previous
position P. This is the same pattern as the MMR merge cascade (§13.4), but
operating on ommers rather than peaks.
Worked example — appending 4 leaves:
Position 0 (no previous position, cascade = 0):
    frontier = { leaf: L0, ommers: [], position: 0 }
    Sinsemilla hashes: 32 (root computation) + 0 (no ommer merges) = 32
Position 1 (previous position 0, binary 0, trailing_ones = 0):
    frontier = { leaf: L1, ommers: [H(L0,L1) at level 1], position: 1 }
    Sinsemilla hashes: 32 + 0 = 32
Position 2 (previous position 1, binary 1, trailing_ones = 1):
    frontier = { leaf: L2, ommers: [level1_hash], position: 2 }
    Sinsemilla hashes: 32 + 1 = 33
Position 3 (previous position 2, binary 10, trailing_ones = 0):
    frontier = { leaf: L3, ommers: [level1_hash, level2_hash], position: 3 }
    Sinsemilla hashes: 32 + 0 = 32
The total Sinsemilla hashes per append is:
32 (root computation always traverses all 32 levels)
+ trailing_ones(previous position) (ommer cascade)
On average, trailing_ones is ~1 (it follows a geometric distribution), so the
average cost is ~33 Sinsemilla hashes per append. The worst case — a cascade
that spans all 32 levels — is 32 + 32 = 64 hashes.
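The per-append cost can be checked numerically. This sketch counts 32 root hashes plus an ommer cascade equal to the trailing ones of the previous position, which reproduces the per-position counts in the worked example above:

```rust
/// Sinsemilla hashes to append the leaf at `position` (0-based):
/// 32 for the root recomputation plus one ommer merge per trailing
/// 1-bit of the previous position (assumed model, matching the example).
fn sinsemilla_hashes_for_append(position: u64) -> u64 {
    let cascade = if position == 0 {
        0
    } else {
        u64::from((position - 1).trailing_ones())
    };
    32 + cascade
}

fn main() {
    // Matches the worked example: 32, 32, 33, 32.
    assert_eq!(sinsemilla_hashes_for_append(0), 32);
    assert_eq!(sinsemilla_hashes_for_append(1), 32);
    assert_eq!(sinsemilla_hashes_for_append(2), 33);
    assert_eq!(sinsemilla_hashes_for_append(3), 32);
    // The average over many appends approaches ~33.
    let n = 1u64 << 16;
    let total: u64 = (0..n).map(sinsemilla_hashes_for_append).sum();
    let avg = total as f64 / n as f64;
    assert!(avg > 32.9 && avg < 33.1);
    println!("avg = {avg:.3}");
}
```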
The Frontier Serialization Format
The frontier is stored in data storage at key b"__ct_data__". The wire
format is:
┌──────────────────────────────────────────────────────────────────┐
│ has_frontier: u8 │
│ 0x00 → empty tree (no more fields) │
│ 0x01 → non-empty (fields follow) │
├──────────────────────────────────────────────────────────────────┤
│ position: u64 BE (8 bytes) — 0-indexed leaf position │
├──────────────────────────────────────────────────────────────────┤
│ leaf: [u8; 32] — Pallas field element bytes │
├──────────────────────────────────────────────────────────────────┤
│ ommer_count: u8 — number of ommers (0..=32) │
├──────────────────────────────────────────────────────────────────┤
│ ommers: [ommer_count × 32 bytes] — Pallas field elements │
└──────────────────────────────────────────────────────────────────┘
Size analysis:
| State | Size | Breakdown |
|---|---|---|
| Empty | 1 byte | 0x00 flag only |
| 1 leaf, 0 ommers | 42 bytes | 1 + 8 + 32 + 1 |
| ~16 ommers (average) | 554 bytes | 1 + 8 + 32 + 1 + 16×32 |
| 32 ommers (maximum) | 1,066 bytes | 1 + 8 + 32 + 1 + 32×32 |
The frontier size is bounded by ~1.1KB regardless of how many millions of commitments have been appended. This makes the load→modify→save cycle very cheap (1 seek to read, 1 seek to write).
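The size table follows directly from the wire format. A small sketch, with field widths taken from the layout above:

```rust
/// Serialized frontier size per the wire format above:
/// flag (1) [+ position (8) + leaf (32) + ommer_count (1) + ommers (32 each)].
fn frontier_size(empty: bool, ommer_count: u64) -> u64 {
    if empty {
        1 // just the 0x00 flag
    } else {
        1 + 8 + 32 + 1 + ommer_count * 32
    }
}

fn main() {
    assert_eq!(frontier_size(true, 0), 1);      // empty tree
    assert_eq!(frontier_size(false, 0), 42);    // 1 leaf, 0 ommers
    assert_eq!(frontier_size(false, 16), 554);  // typical mid-life frontier
    assert_eq!(frontier_size(false, 32), 1_066); // maximum for depth 32
    println!("ok");
}
```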
Element Representation
CommitmentTree(
    u64,                   // total_count: number of appended items
    u8,                    // chunk_power: dense tree height for BulkAppendTree buffer
    Option<ElementFlags>,  // flags: optional metadata
)
The chunk_power parameter controls the BulkAppendTree buffer's dense tree height;
chunk_power must be in the range 1..=16 (see §14.1 and §16).
Type identifiers:
| Identifier | Value |
|---|---|
| Element discriminant | 11 |
| TreeType | CommitmentTree = 7 |
| ElementType | 11 |
| COMMITMENT_TREE_COST_SIZE | 12 bytes (8 total_count + 1 chunk_power + 1 discriminant + 2 overhead) |
The Sinsemilla root is NOT stored in the Element. It flows as the Merk child hash
through the insert_subtree mechanism. When the parent Merk computes its
combined_value_hash, the Sinsemilla-derived root is included as the child hash:
combined_value_hash = blake3(value_hash || child_hash)
↑ sinsemilla/BulkAppendTree combined root
This means any change to the Sinsemilla frontier automatically propagates through the GroveDB Merk hierarchy to the state root.
Constructor methods:
| Method | Creates |
|---|---|
| Element::empty_commitment_tree(chunk_power) | Empty tree, count=0, no flags |
| Element::empty_commitment_tree_with_flags(chunk_power, flags) | Empty tree with flags |
| Element::new_commitment_tree(total_count, chunk_power, flags) | All fields explicit |
Storage Architecture
The CommitmentTree stores all its data in a single data namespace at the subtree path. BulkAppendTree entries and the Sinsemilla frontier coexist in the same column using distinct key prefixes. No aux namespace is used.
┌──────────────────────────────────────────────────────────────────┐
│ Data Namespace (all CommitmentTree storage) │
│ │
│ BulkAppendTree storage keys (see §14.7): │
│ b"m" || pos (u64 BE) → MMR node blobs │
│ b"b" || index (u64 BE)→ buffer entries (cmx || rho || ciphertext) │
│ b"e" || chunk (u64 BE)→ chunk blobs (compacted buffer) │
│ b"M" → BulkAppendTree metadata │
│ │
│ Sinsemilla frontier: │
│ b"__ct_data__" → serialized CommitmentFrontier (~1KB) │
│ │
│ No Merk nodes — this is a non-Merk tree. │
│ Data authenticated via BulkAppendTree state_root (Blake3). │
│ Sinsemilla root authenticates all cmx values via Pallas curve. │
└──────────────────────────────────────────────────────────────────┘
The load→modify→save pattern: Every mutating operation loads the frontier from data storage, modifies it in memory, and writes it back. Since the frontier is at most ~1KB, this is an inexpensive pair of I/O operations (1 seek to read, 1 seek to write). Simultaneously, the BulkAppendTree is loaded, appended to, and saved.
Root hash propagation: When an item is inserted, two things change:
- The BulkAppendTree state changes (new entry in buffer or chunk compaction)
- The Sinsemilla root changes (new commitment in the frontier)
Both are captured in the updated CommitmentTree element. The parent Merk node
hash becomes:
combined_hash = combine_hash(
value_hash(element_bytes), ← includes total_count + chunk_power
child_hash(combined_root) ← sinsemilla/BulkAppendTree combined root
)
Like MmrTree and BulkAppendTree, the type-specific root flows as the Merk child hash. All data authentication flows through this child hash binding.
Non-Merk data storage implications: Because the data namespace contains
BulkAppendTree keys (not Merk nodes), operations that iterate storage as
Merk elements — such as find_subtrees, is_empty_tree, and
verify_merk_and_submerks — must special-case CommitmentTree (and other
non-Merk tree types). The uses_non_merk_data_storage() helper on both
Element and TreeType identifies these tree types. Delete operations clear
the data namespace directly instead of iterating it, and verify_grovedb skips
sub-merk recursion for these types.
GroveDB Operations
CommitmentTree provides four operations. The insert operation is generic over
M: MemoSize (from the orchard crate), which controls ciphertext payload
size validation. The default M = DashMemo gives a 216-byte payload
(32 epk + 104 enc + 80 out).
// Insert a commitment (typed) — returns (sinsemilla_root, position)
// M controls ciphertext size validation
db.commitment_tree_insert::<_, _, M>(path, key, cmx, rho, ciphertext, tx, version)

// Insert a commitment (raw bytes) — validates
// payload.len() == ciphertext_payload_size::<DashMemo>()
db.commitment_tree_insert_raw(path, key, cmx, rho, payload_vec, tx, version)

// Get the current Orchard Anchor
db.commitment_tree_anchor(path, key, tx, version)

// Retrieve a value by global position
db.commitment_tree_get_value(path, key, position, tx, version)

// Get the current item count
db.commitment_tree_count(path, key, tx, version)
The typed commitment_tree_insert accepts a TransmittedNoteCiphertext<M> and
serializes it internally. The raw commitment_tree_insert_raw (pub(crate))
accepts Vec<u8> and is used by batch preprocessing where payloads are already
serialized.
commitment_tree_insert
The insert operation updates both the BulkAppendTree and the Sinsemilla frontier in a single atomic operation:
Step 1: Validate element at path/key is a CommitmentTree
→ extract total_count, chunk_power, flags
Step 2: Build ct_path = path ++ [key]
Step 3: Open data storage context at ct_path
Load CommitmentTree (frontier + BulkAppendTree)
Serialize ciphertext → validate payload size matches M
Append cmx||rho||ciphertext to BulkAppendTree
Append cmx to Sinsemilla frontier → get new sinsemilla_root
Track Blake3 + Sinsemilla hash costs
Step 4: Save updated frontier to data storage
Step 5: Open parent Merk at path
Write updated CommitmentTree element:
new total_count, same chunk_power, same flags
Child hash = combined_root (sinsemilla + bulk state)
Step 6: Propagate changes from parent upward through Merk hierarchy
Step 7: Commit storage batch and local transaction
Return (sinsemilla_root, position)
graph TD
A["commitment_tree_insert(path, key, cmx, rho, ciphertext)"] --> B["Validate: is CommitmentTree?"]
B --> C["Open data storage, load CommitmentTree"]
C --> D["Serialize & validate ciphertext size"]
D --> E["BulkAppendTree.append(cmx||rho||payload)"]
E --> F["frontier.append(cmx)"]
F --> G["Save frontier to data storage"]
G --> H["Update parent CommitmentTree element<br/>new sinsemilla_root + total_count"]
H --> I["Propagate changes upward"]
I --> J["Commit batch + transaction"]
J --> K["Return (sinsemilla_root, position)"]
style F fill:#fadbd8,stroke:#e74c3c,stroke-width:2px
style E fill:#d5f5e3,stroke:#27ae60,stroke-width:2px
style H fill:#d4e6f1,stroke:#2980b9,stroke-width:2px
Red = Sinsemilla operations. Green = BulkAppendTree operations. Blue = element update bridging both.
commitment_tree_anchor
The anchor operation is a read-only query:
Step 1: Validate element at path/key is a CommitmentTree
Step 2: Build ct_path = path ++ [key]
Step 3: Load frontier from data storage
Step 4: Return frontier.anchor() as orchard::tree::Anchor
The Anchor type is the Orchard-native representation of the Sinsemilla root,
suitable for passing directly to orchard::builder::Builder when constructing
spend authorization proofs.
commitment_tree_get_value
Retrieves a stored value (cmx || rho || payload) by its global position:
Step 1: Validate element at path/key is a CommitmentTree
→ extract total_count, chunk_power
Step 2: Build ct_path = path ++ [key]
Step 3: Open data storage context, wrap in CachedBulkStore
Step 4: Load BulkAppendTree, call get_value(position)
Step 5: Return Option<Vec<u8>>
This follows the same pattern as bulk_get_value (§14.9) — the BulkAppendTree
transparently retrieves from the buffer or a compacted chunk blob depending on
where the position falls.
commitment_tree_count
Returns the total number of items appended to the tree:
Step 1: Read element at path/key
Step 2: Verify it is a CommitmentTree
Step 3: Return total_count from element fields
This is a simple element field read — no storage access beyond the parent Merk.
Batch Operations
CommitmentTree supports batch inserts through the GroveOp::CommitmentTreeInsert
variant:
GroveOp::CommitmentTreeInsert {
    cmx: [u8; 32],     // extracted note commitment
    rho: [u8; 32],     // nullifier of the spent note
    payload: Vec<u8>,  // serialized ciphertext (216 bytes for DashMemo)
}
Two constructors create this op:
// Raw constructor — caller serializes payload manually
QualifiedGroveDbOp::commitment_tree_insert_op(path, cmx, rho, payload_vec)

// Typed constructor — serializes TransmittedNoteCiphertext<M> internally
QualifiedGroveDbOp::commitment_tree_insert_op_typed::<M>(path, cmx, rho, &ciphertext)
Multiple inserts targeting the same tree are allowed in a single batch. Since
execute_ops_on_path doesn't have access to data storage, all CommitmentTree
ops must be preprocessed before apply_body.
The preprocessing pipeline (preprocess_commitment_tree_ops):
Input: [CTInsert{cmx1}, Insert{...}, CTInsert{cmx2}, CTInsert{cmx3}]
↑ same (path,key) as cmx1
Step 1: Group CommitmentTreeInsert ops by (path, key)
group_1: [cmx1, cmx2, cmx3]
Step 2: For each group:
a. Read existing element → verify CommitmentTree, extract chunk_power
b. Open transactional storage context at ct_path
c. Load CommitmentTree from data storage (frontier + BulkAppendTree)
d. For each (cmx, rho, payload):
- ct.append_raw(cmx, rho, payload) — validates size, appends to both
e. Save updated frontier to data storage
Step 3: Replace all CTInsert ops with one ReplaceNonMerkTreeRoot per group
carrying: hash=bulk_state_root (combined root),
meta=NonMerkTreeMeta::CommitmentTree {
total_count: new_count,
chunk_power,
}
Output: [ReplaceNonMerkTreeRoot{...}, Insert{...}]
The first CommitmentTreeInsert op in each group is replaced by the
ReplaceNonMerkTreeRoot; subsequent ops for the same (path, key) are dropped.
The standard batch machinery then handles element update and root hash
propagation.
MemoSize Generic and Ciphertext Handling
The CommitmentTree<S, M> struct is generic over M: MemoSize (from the
orchard crate). This controls the size of encrypted note ciphertexts stored
alongside each commitment.
pub struct CommitmentTree<S, M: MemoSize = DashMemo> {
    frontier: CommitmentFrontier,
    pub bulk_tree: BulkAppendTree<S>,
    _memo: PhantomData<M>,
}
The default M = DashMemo means existing code that doesn't care about memo
size (like verify_grovedb, commitment_tree_anchor, commitment_tree_count)
works without specifying M.
Serialization helpers (public free functions):
| Function | Description |
|---|---|
| ciphertext_payload_size::<M>() | Expected payload size for a given MemoSize |
| serialize_ciphertext::<M>(ct) | Serialize TransmittedNoteCiphertext<M> to bytes |
| deserialize_ciphertext::<M>(data) | Deserialize bytes back to TransmittedNoteCiphertext<M> |
Payload validation: The append_raw() method validates that
payload.len() == ciphertext_payload_size::<M>() and returns
CommitmentTreeError::InvalidPayloadSize on mismatch. The typed append()
method serializes internally, so size is always correct by construction.
Stored Record Layout (280 bytes for DashMemo)
Each entry in the BulkAppendTree stores the complete encrypted note record. The full layout, accounting for every byte:
┌─────────────────────────────────────────────────────────────────────┐
│ Offset Size Field │
├─────────────────────────────────────────────────────────────────────┤
│ 0 32 cmx — extracted note commitment (Pallas base field)│
│ 32 32 rho — nullifier of the spent note │
│ 64 32 epk_bytes — ephemeral public key (Pallas point) │
│ 96 104 enc_ciphertext — encrypted note plaintext + MAC │
│ 200 80 out_ciphertext — encrypted outgoing data + MAC │
├─────────────────────────────────────────────────────────────────────┤
│ Total: 280 bytes │
└─────────────────────────────────────────────────────────────────────┘
The first two fields (cmx and rho) are unencrypted protocol fields —
they are public by design. The remaining three fields (epk_bytes,
enc_ciphertext, out_ciphertext) form the TransmittedNoteCiphertext and
are the encrypted payload.
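The layout above can be expressed as a slicing sketch; the `Record` struct and `split_record` function are illustrative names, not the crate's accessors:

```rust
// Split a stored DashMemo record into its five fields, using the byte
// offsets from the layout above.
struct Record<'a> {
    cmx: &'a [u8],
    rho: &'a [u8],
    epk_bytes: &'a [u8],
    enc_ciphertext: &'a [u8],
    out_ciphertext: &'a [u8],
}

fn split_record(record: &[u8; 280]) -> Record<'_> {
    Record {
        cmx: &record[0..32],             // public protocol field
        rho: &record[32..64],            // public protocol field
        epk_bytes: &record[64..96],      // encrypted payload starts here
        enc_ciphertext: &record[96..200],
        out_ciphertext: &record[200..280],
    }
}

fn main() {
    let raw = [0u8; 280];
    let r = split_record(&raw);
    assert_eq!(r.cmx.len(), 32);
    assert_eq!(r.rho.len(), 32);
    assert_eq!(r.epk_bytes.len(), 32);
    assert_eq!(r.enc_ciphertext.len(), 104);
    assert_eq!(r.out_ciphertext.len(), 80);
}
```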
Field-by-Field Breakdown
cmx (32 bytes) — The extracted note commitment, a Pallas base field element. This is the leaf value appended to the Sinsemilla frontier. It commits to all note fields (recipient, value, randomness) without revealing them. The cmx is what makes the note "findable" in the commitment tree.
rho (32 bytes) — The nullifier of the note being spent in this action.
Nullifiers are already public on the blockchain (they must be to prevent
double-spending). Storing rho alongside the commitment allows light clients
performing trial decryption to verify esk = PRF(rseed, rho) and confirm
epk' == epk without a separate nullifier lookup. This field sits between
cmx and the ciphertext as an unencrypted protocol-level association.
epk_bytes (32 bytes) — The ephemeral public key, a serialized Pallas curve
point. Derived deterministically from the note's rseed via:
rseed → esk = ToScalar(PRF^expand(rseed, [4] || rho))
esk → epk = [esk] * g_d (scalar multiplication on Pallas)
epk → epk_bytes = Serialize(epk)
where g_d = DiversifyHash(d) is the diversified base point for the
recipient's diversifier. The epk enables the recipient to compute the shared
secret for decryption: shared_secret = [ivk] * epk. It is transmitted in the
clear because it reveals nothing about sender or recipient without knowing
either esk or ivk.
enc_ciphertext (104 bytes for DashMemo) — The encrypted note plaintext, produced by ChaCha20-Poly1305 AEAD encryption:
enc_ciphertext = ChaCha20-Poly1305.Encrypt(key, nonce=[0;12], aad=[], plaintext)
= ciphertext (88 bytes) || MAC tag (16 bytes) = 104 bytes
The symmetric key is derived via ECDH:
key = BLAKE2b-256("Zcash_OrchardKDF", shared_secret || epk_bytes).
When decrypted by the recipient (using ivk), the note plaintext
(88 bytes for DashMemo) contains:
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 1 | version | Always 0x02 (Orchard, post-ZIP-212) |
| 1 | 11 | diversifier (d) | Recipient's diversifier, derives the base point g_d |
| 12 | 8 | value (v) | 64-bit little-endian note value in duffs |
| 20 | 32 | rseed | Random seed for deterministic derivation of rcm, psi, esk |
| 52 | 36 | memo | Application-layer memo data (DashMemo: 36 bytes) |
| Total | 88 | | |
The first 52 bytes (version + diversifier + value + rseed) are the compact note — light clients can trial-decrypt just this portion using the ChaCha20 stream cipher (without verifying the MAC) to check whether the note belongs to them. If it does, they decrypt the full 88 bytes and verify the MAC.
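A minimal parse of the decrypted 88-byte plaintext, using the offsets from the table above (the struct and function names are hypothetical):

```rust
// Fields of the decrypted DashMemo note plaintext, per the table above.
struct NotePlaintext<'a> {
    version: u8,
    diversifier: &'a [u8], // 11 bytes
    value: u64,            // little-endian duffs
    rseed: &'a [u8],       // 32 bytes
    memo: &'a [u8],        // 36 bytes (DashMemo)
}

fn parse_note_plaintext(p: &[u8; 88]) -> NotePlaintext<'_> {
    NotePlaintext {
        version: p[0],
        diversifier: &p[1..12],
        value: u64::from_le_bytes(p[12..20].try_into().unwrap()),
        rseed: &p[20..52],
        memo: &p[52..88], // everything past the 52-byte compact note
    }
}

fn main() {
    let mut buf = [0u8; 88];
    buf[0] = 0x02; // Orchard version byte
    buf[12..20].copy_from_slice(&1000u64.to_le_bytes());
    let note = parse_note_plaintext(&buf);
    assert_eq!(note.version, 0x02);
    assert_eq!(note.value, 1000);
    assert_eq!(note.rseed.len(), 32);
    assert_eq!(note.memo.len(), 36);
}
```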
out_ciphertext (80 bytes) — The encrypted outgoing data, allowing the sender to recover the note after the fact. Encrypted with the Outgoing Cipher Key:
ock = BLAKE2b-256("Zcash_Orchardock", ovk || cv_net || cmx || epk)
out_ciphertext = ChaCha20-Poly1305.Encrypt(ock, nonce=[0;12], aad=[], plaintext)
= ciphertext (64 bytes) || MAC tag (16 bytes) = 80 bytes
When decrypted by the sender (using ovk), the outgoing plaintext
(64 bytes) contains:
| Offset | Size | Field | Description |
|---|---|---|---|
| 0 | 32 | pk_d | Diversified transmission key (recipient's public key) |
| 32 | 32 | esk | Ephemeral secret key (Pallas scalar) |
| Total | 64 | | |
With pk_d and esk, the sender can reconstruct the shared secret, decrypt
enc_ciphertext, and recover the full note. If the sender sets ovk = null,
the outgoing plaintext is filled with random bytes before encryption, making
recovery impossible even for the sender (non-recoverable output).
Encryption Scheme: ChaCha20-Poly1305
Both enc_ciphertext and out_ciphertext use ChaCha20-Poly1305 AEAD (RFC 8439):
| Parameter | Value |
|---|---|
| Key size | 256 bits (32 bytes) |
| Nonce | [0u8; 12] (safe because each key is used exactly once) |
| AAD | Empty |
| MAC tag | 16 bytes (Poly1305) |
The zero nonce is safe because the symmetric key is derived from a fresh Diffie-Hellman exchange per note — each key encrypts exactly one message.
DashMemo vs ZcashMemo Size Comparison
| Component | DashMemo | ZcashMemo | Notes |
|---|---|---|---|
| Memo field | 36 bytes | 512 bytes | Application data |
| Note plaintext | 88 bytes | 564 bytes | 52 fixed + memo |
| enc_ciphertext | 104 bytes | 580 bytes | plaintext + 16 MAC |
| Ciphertext payload (epk+enc+out) | 216 bytes | 692 bytes | Transmitted per note |
| Full stored record (cmx+rho+payload) | 280 bytes | 756 bytes | BulkAppendTree entry |
DashMemo's smaller memo (36 vs 512 bytes) reduces each stored record by 476 bytes — significant when storing millions of notes.
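The arithmetic behind these rows can be verified with a short sketch; the constants are taken from this chapter's figures, not imported from the crate:

```rust
// Fixed component sizes, per this chapter.
const CMX: usize = 32;
const RHO: usize = 32;
const EPK: usize = 32;
const COMPACT_NOTE: usize = 52;  // version + diversifier + value + rseed
const MAC: usize = 16;           // Poly1305 tag
const OUT_PLAINTEXT: usize = 64; // pk_d + esk

fn enc_ciphertext(memo: usize) -> usize { COMPACT_NOTE + memo + MAC }
fn payload(memo: usize) -> usize { EPK + enc_ciphertext(memo) + OUT_PLAINTEXT + MAC }
fn stored_record(memo: usize) -> usize { CMX + RHO + payload(memo) }

fn main() {
    // DashMemo (36-byte memo)
    assert_eq!(enc_ciphertext(36), 104);
    assert_eq!(payload(36), 216);
    assert_eq!(stored_record(36), 280);
    // ZcashMemo (512-byte memo)
    assert_eq!(enc_ciphertext(512), 580);
    assert_eq!(payload(512), 692);
    assert_eq!(stored_record(512), 756);
}
```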
Trial Decryption Flow (Light Client)
A light client scanning for its own notes performs this sequence for each stored record:
1. Read record: cmx (32) || rho (32) || epk (32) || enc_ciphertext (104) || out_ciphertext (80)
2. Compute shared_secret = [ivk] * epk (ECDH with incoming viewing key)
3. Derive key = BLAKE2b-256("Zcash_OrchardKDF", shared_secret || epk)
4. Trial-decrypt compact note (first 52 bytes of enc_ciphertext):
→ version (1) || diversifier (11) || value (8) || rseed (32)
5. Reconstruct esk = PRF(rseed, rho) ← rho is needed here!
Verify: [esk] * g_d == epk ← confirms this is our note
6. If match: decrypt full enc_ciphertext (88 bytes + 16 MAC):
→ compact_note (52) || memo (36)
Verify MAC tag for authenticity
7. Reconstruct full Note from (diversifier, value, rseed, rho)
This note can later be spent by proving knowledge of it in ZK
Step 5 is why rho must be stored alongside the ciphertext — without it,
the light client cannot verify the ephemeral key binding during trial
decryption.
Client-Side Witness Generation
The grovedb-commitment-tree crate provides a client-side tree for wallets
and test harnesses that need to generate Merkle witness paths for spending notes.
Enable the client feature to use it:
grovedb-commitment-tree = { version = "4", features = ["client"] }
pub struct ClientMemoryCommitmentTree {
    inner: ShardTree<MemoryShardStore<MerkleHashOrchard, u32>, 32, 4>,
}
The ClientMemoryCommitmentTree wraps ShardTree — a full commitment tree (not just
a frontier) that keeps complete history in memory. This allows generating
authentication paths for any marked leaf, which the frontier alone cannot do.
API:
| Method | Description |
|---|---|
| new(max_checkpoints) | Create empty tree with checkpoint retention limit |
| append(cmx, retention) | Append a commitment with retention policy |
| checkpoint(id) | Create a checkpoint at current state |
| max_leaf_position() | Position of most recently appended leaf |
| witness(position, depth) | Generate MerklePath for spending a note |
| anchor() | Current root as orchard::tree::Anchor |
Retention policies control which leaves can be witnessed later:
| Retention | Meaning |
|---|---|
| Retention::Ephemeral | Leaf cannot be witnessed (other people's notes) |
| Retention::Marked | Leaf can be witnessed (your own notes) |
| Retention::Checkpoint { id, marking } | Create a checkpoint, optionally mark |
Server vs Client comparison:
| | CommitmentFrontier (server) | ClientMemoryCommitmentTree (client) | ClientPersistentCommitmentTree (sqlite) |
|---|---|---|---|
| Storage | ~1KB frontier in data storage | Full tree in memory | Full tree in SQLite |
| Can witness | No | Yes (marked leaves only) | Yes (marked leaves only) |
| Can compute anchor | Yes | Yes | Yes |
| Anchor matches | Same sequence → same anchor | Same sequence → same anchor | Same sequence → same anchor |
| Persists across restarts | Yes (GroveDB data storage) | No (lost on drop) | Yes (SQLite database) |
| Use case | GroveDB server-side anchor tracking | Testing, ephemeral wallets | Production wallets |
| Feature flag | server | client | sqlite |
All three produce identical anchors for the same sequence of appends. This is
verified by the test_frontier_and_client_same_root test.
Persistent Client — SQLite-Backed Witness Generation
The in-memory ClientMemoryCommitmentTree loses all state on drop. For
production wallets that must survive restarts without re-scanning the entire
blockchain, the crate provides ClientPersistentCommitmentTree backed by
SQLite. Enable the sqlite feature:
grovedb-commitment-tree = { version = "4", features = ["sqlite"] }
pub struct ClientPersistentCommitmentTree {
    inner: ShardTree<SqliteShardStore, 32, 4>,
}
Three constructor modes:
| Constructor | Description |
|---|---|
| open(conn, max_checkpoints) | Takes ownership of an existing rusqlite::Connection |
| open_on_shared_connection(arc, max_checkpoints) | Shares an Arc<Mutex<Connection>> with other components |
| open_path(path, max_checkpoints) | Convenience — opens/creates a SQLite DB at the given file path |
The bring-your-own-connection constructors (open, open_on_shared_connection)
allow the wallet to use its existing database for commitment tree storage.
The SqliteShardStore creates its tables with a commitment_tree_ prefix, so
it coexists safely alongside other application tables.
API is identical to ClientMemoryCommitmentTree:
| Method | Description |
|---|---|
| append(cmx, retention) | Append a commitment with retention policy |
| checkpoint(id) | Create a checkpoint at current state |
| max_leaf_position() | Position of most recently appended leaf |
| witness(position, depth) | Generate MerklePath for spending a note |
| anchor() | Current root as orchard::tree::Anchor |
SQLite schema (4 tables, created automatically):
commitment_tree_shards -- Shard data (serialized prunable trees)
commitment_tree_cap -- Tree cap (single-row, top of shard tree)
commitment_tree_checkpoints -- Checkpoint metadata (position or empty)
commitment_tree_checkpoint_marks_removed -- Marks removed per checkpoint
Persistence example:
use grovedb_commitment_tree::{ClientPersistentCommitmentTree, Retention, Position};

// First session: append notes and close
let mut tree = ClientPersistentCommitmentTree::open_path("wallet.db", 100)?;
tree.append(cmx_0, Retention::Marked)?;
tree.append(cmx_1, Retention::Ephemeral)?;
let anchor_before = tree.anchor()?;
drop(tree);

// Second session: reopen, state is preserved
let tree = ClientPersistentCommitmentTree::open_path("wallet.db", 100)?;
let anchor_after = tree.anchor()?;
assert_eq!(anchor_before, anchor_after); // same anchor, no re-scan needed
Shared connection example (for wallets with an existing SQLite database):
use std::sync::{Arc, Mutex};
use grovedb_commitment_tree::rusqlite::Connection;

let conn = Arc::new(Mutex::new(Connection::open("wallet.db")?));
// conn is also used by other wallet components...
let mut tree = ClientPersistentCommitmentTree::open_on_shared_connection(
    conn.clone(),
    100,
)?;
The grovedb-commitment-tree crate re-exports rusqlite under the sqlite
feature flag, so downstream consumers do not need to add rusqlite as a
separate dependency.
SqliteShardStore internals:
The SqliteShardStore implements all 18 methods of the ShardStore trait.
Shard trees are serialized using a compact binary format:
Nil: [0x00] — 1 byte
Leaf: [0x01][hash: 32][flags: 1] — 34 bytes
Parent: [0x02][has_ann: 1][ann?: 32][left][right] — recursive
LocatedPrunableTree adds an address prefix: [level: 1][index: 8][tree_bytes].
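The tag-prefixed format can be sketched as a recursive serializer. This is a simplified model under the assumption that annotations and flags are opaque bytes; the real `PrunableTree` carries richer retention semantics:

```rust
// Simplified model of the shard-tree node format described above.
enum Tree {
    Nil,
    Leaf { hash: [u8; 32], flags: u8 },
    Parent { ann: Option<[u8; 32]>, left: Box<Tree>, right: Box<Tree> },
}

fn serialize(t: &Tree, out: &mut Vec<u8>) {
    match t {
        Tree::Nil => out.push(0x00), // 1 byte
        Tree::Leaf { hash, flags } => {
            out.push(0x01);          // 34 bytes total
            out.extend_from_slice(hash);
            out.push(*flags);
        }
        Tree::Parent { ann, left, right } => {
            out.push(0x02);          // recursive
            match ann {
                Some(a) => { out.push(1); out.extend_from_slice(a); }
                None => out.push(0),
            }
            serialize(left, out);
            serialize(right, out);
        }
    }
}

fn main() {
    let mut buf = Vec::new();
    serialize(&Tree::Leaf { hash: [0u8; 32], flags: 0 }, &mut buf);
    assert_eq!(buf.len(), 34); // 1 tag + 32 hash + 1 flags

    let parent = Tree::Parent {
        ann: None,
        left: Box::new(Tree::Leaf { hash: [0u8; 32], flags: 0 }),
        right: Box::new(Tree::Nil),
    };
    buf.clear();
    serialize(&parent, &mut buf);
    assert_eq!(buf.len(), 37); // tag + has_ann + leaf(34) + nil(1)
}
```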
The ConnectionHolder enum abstracts over owned vs shared connections:
enum ConnectionHolder {
    Owned(Connection),              // exclusive access
    Shared(Arc<Mutex<Connection>>), // shared with other components
}
All database operations acquire the connection through a with_conn helper that
transparently handles both modes, locking the mutex only when shared.
Proof Integration
CommitmentTree supports two proof paths:
1. Sinsemilla anchor proof (ZK path):
GroveDB root hash
↓ Merk proof (V0, standard)
Parent Merk node
↓ combined_value_hash = blake3(value_hash || child_hash)
CommitmentTree element bytes + child hash
↓ child hash is the Sinsemilla/BulkAppendTree combined root
Sinsemilla root (Orchard Anchor)
↓ ZK proof (Halo 2 circuit, off-chain)
Note commitment at position P
- The parent Merk proof demonstrates that the CommitmentTree element exists at the claimed path/key, with specific bytes.
- The proved node's combined hash binds the Sinsemilla-derived root as the Merk child hash (the root is not stored in the element bytes themselves).
- The client (wallet) independently constructs a Merkle witness in the Sinsemilla tree using ClientMemoryCommitmentTree::witness() (testing) or ClientPersistentCommitmentTree::witness() (production, SQLite-backed).
- The ZK circuit verifies the witness against the anchor (sinsemilla_root).
2. Item retrieval proof (V1 path):
Individual items (cmx || rho || payload) can be queried by position and proved using V1 proofs (§9.6), the same mechanism used by standalone BulkAppendTree. The V1 proof includes the BulkAppendTree authentication path for the requested position, chained to the parent Merk proof for the CommitmentTree element.
Cost Tracking
CommitmentTree introduces a dedicated cost field for Sinsemilla operations:
pub struct OperationCost {
    pub seek_count: u32,
    pub storage_cost: StorageCost,
    pub storage_loaded_bytes: u64,
    pub hash_node_calls: u32,
    pub sinsemilla_hash_calls: u32, // ← new field for CommitmentTree
}
The sinsemilla_hash_calls field is separate from hash_node_calls because
Sinsemilla hashes are dramatically more expensive than Blake3 in both CPU time
and ZK circuit cost.
Per-append cost breakdown:
| Component | Average case | Worst case |
|---|---|---|
| Sinsemilla hashes | 33 (32 root + 1 ommer avg) | 64 (32 root + 32 ommers) |
| Frontier I/O seeks | 2 (get + put) | 2 |
| Frontier bytes loaded | 554 (~16 ommers) | 1,066 (32 ommers) |
| Frontier bytes written | 554 | 1,066 |
| BulkAppendTree hashes | ~5 Blake3 (amortized, see §14.15) | O(chunk_size) on compaction |
| BulkAppendTree I/O | 2-3 seeks (metadata + buffer) | +2 on chunk compaction |
Cost estimation constants (from average_case_costs.rs and
worst_case_costs.rs):
// Average case
const AVG_FRONTIER_SIZE: u32 = 554;     // ~16 ommers
const AVG_SINSEMILLA_HASHES: u32 = 33;  // 32 root levels + 1 avg ommer

// Worst case
const MAX_FRONTIER_SIZE: u32 = 1066;    // 32 ommers (max depth)
const MAX_SINSEMILLA_HASHES: u32 = 64;  // 32 root levels + 32 ommers
The BulkAppendTree component cost is tracked alongside the Sinsemilla cost,
combining both Blake3 hashes (from BulkAppendTree buffer/chunk operations) and
Sinsemilla hashes (from frontier append) into a single OperationCost.
The Orchard Key Hierarchy and Re-exports
The grovedb-commitment-tree crate re-exports the full Orchard API needed to
construct and verify shielded transactions. This allows Platform code to import
everything from a single crate.
Key management types:
SpendingKey
├── SpendAuthorizingKey → SpendValidatingKey
└── FullViewingKey
├── IncomingViewingKey (decrypt received notes)
├── OutgoingViewingKey (decrypt sent notes)
└── Address (= PaymentAddress, derive recipient addresses)
Note types:
| Type | Purpose |
|---|---|
| Note | Full note with value, recipient, randomness |
| ExtractedNoteCommitment | The cmx extracted from a note (32 bytes) |
| Nullifier | Unique tag that marks a note as spent |
| Rho | Nullifier derivation input (links spend to prior note) |
| NoteValue | 64-bit note value |
| ValueCommitment | Pedersen commitment to a note value |
Proof and bundle types:
| Type | Purpose |
|---|---|
| ProvingKey | Halo 2 proving key for Orchard circuits |
| VerifyingKey | Halo 2 verifying key for Orchard circuits |
| BatchValidator | Batch verification of multiple Orchard bundles |
| Bundle<T, V> | Collection of Actions forming a shielded transfer |
| Action | Single spend/output pair within a bundle |
| Authorized | Bundle authorization state (signatures + ZK proof) |
| Flags | Bundle flags (spends enabled, outputs enabled) |
| Proof | The Halo 2 proof within an authorized bundle |
Builder types:
| Type | Purpose |
|---|---|
| Builder | Constructs an Orchard bundle from spends and outputs |
| BundleType | Configures padding strategy for the bundle |
Tree types:
| Type | Purpose |
|---|---|
| Anchor | Sinsemilla root as an Orchard-native type |
| MerkleHashOrchard | Sinsemilla hash node in the commitment tree |
| MerklePath | 32-level authentication path for witness generation |
Implementation Files
| File | Purpose |
|---|---|
| grovedb-commitment-tree/src/lib.rs | CommitmentFrontier struct, serialization, EMPTY_SINSEMILLA_ROOT, re-exports |
| grovedb-commitment-tree/src/commitment_tree/mod.rs | CommitmentTree<S, M> struct, typed/raw append, ciphertext ser/de helpers |
| grovedb-commitment-tree/src/commitment_frontier/mod.rs | CommitmentFrontier (Sinsemilla frontier wrapping Frontier) |
| grovedb-commitment-tree/src/error.rs | CommitmentTreeError (including InvalidPayloadSize) |
| grovedb-commitment-tree/src/client/mod.rs | ClientMemoryCommitmentTree, in-memory witness generation |
| grovedb-commitment-tree/src/client/sqlite_store.rs | SqliteShardStore, ShardStore impl over SQLite, tree serialization |
| grovedb-commitment-tree/src/client/client_persistent_commitment_tree.rs | ClientPersistentCommitmentTree, SQLite-backed witness generation |
| grovedb-commitment-tree/Cargo.toml | Feature flags: server, client, sqlite |
| grovedb-element/src/element/mod.rs | Element::CommitmentTree variant (3 fields: u64, u8, Option<ElementFlags>) |
| grovedb-element/src/element/constructor.rs | empty_commitment_tree(chunk_power), new_commitment_tree_with_all() |
| grovedb-element/src/element/helpers.rs | uses_non_merk_data_storage() helper |
| merk/src/tree_type/costs.rs | COMMITMENT_TREE_COST_SIZE = 12 |
| merk/src/tree_type/mod.rs | TreeType::CommitmentTree = 7, uses_non_merk_data_storage() |
| grovedb/src/operations/commitment_tree.rs | GroveDB operations: typed insert, raw insert, anchor, get_value, count, batch preprocessing |
| grovedb/src/operations/delete/mod.rs | Non-Merk tree type delete handling |
| grovedb/src/batch/mod.rs | GroveOp::CommitmentTreeInsert, commitment_tree_insert_op_typed constructor |
| grovedb/src/batch/estimated_costs/average_case_costs.rs | Average case cost model |
| grovedb/src/batch/estimated_costs/worst_case_costs.rs | Worst case cost model |
| grovedb/src/tests/commitment_tree_tests.rs | 32 integration tests |
Comparison with Other Tree Types
| | CommitmentTree | MmrTree | BulkAppendTree | DenseTree |
|---|---|---|---|---|
| Element discriminant | 11 | 12 | 13 | 14 |
| TreeType | 7 | 8 | 9 | 10 |
| Has child Merk | No | No | No | No |
| Data namespace | BulkAppendTree entries + Sinsemilla frontier | MMR nodes | Buffer + chunks + MMR | Values by position |
| Hash function | Sinsemilla + Blake3 | Blake3 | Blake3 | Blake3 |
| Proof type | V1 (Bulk) + ZK (Sinsemilla) | V1 (MMR proof) | V1 (Bulk proof) | V1 (DenseTree proof) |
| Hashes per append | ~33 Sinsemilla + ~5 Blake3 | ~2 Blake3 | ~5 Blake3 (amortized) | O(n) Blake3 |
| Cost size | 12 bytes | 11 bytes | 12 bytes | 6 bytes |
| Capacity | Unlimited | Unlimited | Unlimited | Fixed (2^h - 1) |
| ZK-friendly | Yes (Halo 2) | No | No | No |
| Chunk compaction | Yes (configurable chunk_power) | No | Yes | No |
| Use case | Shielded note commitments | Event/transaction logs | High-throughput bulk logs | Small bounded structures |
Choose CommitmentTree when you need ZK-provable anchors for shielded protocols with efficient chunk-compacted storage. Choose MmrTree when you need a simple append-only log with individual leaf proofs. Choose BulkAppendTree when you need high-throughput range queries with chunk-based snapshots. Choose DenseAppendOnlyFixedSizeTree when you need a compact, fixed-capacity structure where every position stores a value and the root hash is always recomputed on the fly.
The DenseAppendOnlyFixedSizeTree — Dense Fixed-Capacity Merkle Storage
The DenseAppendOnlyFixedSizeTree is a complete binary tree of a fixed height where every node — both internal and leaf — stores a data value. Positions are filled sequentially in level-order (BFS): root first (position 0), then left-to-right at each level. No intermediate hashes are persisted; the root hash is recomputed on the fly by recursively hashing from leaves to root.
This design is ideal for small, bounded data structures where the maximum capacity is known in advance and you need O(1) append, O(1) retrieval by position, and a compact 32-byte root hash commitment that changes after every insert.
Tree Structure
A tree of height h has capacity 2^h - 1 positions. Positions use 0-based level-order
indexing:
Height 3 tree (capacity = 7):
pos 0 ← root (level 0)
/ \
pos 1 pos 2 ← level 1
/ \ / \
pos 3 pos 4 pos 5 pos 6 ← level 2 (leaves)
Navigation:
left_child(i) = 2i + 1
right_child(i) = 2i + 2
parent(i) = (i - 1) / 2
is_leaf(i) = 2i + 1 >= capacity
Values are appended sequentially: the first value goes to position 0 (root), then position 1, 2, 3, and so on. This means the root always has data, and the tree fills in level-order — the most natural traversal order for a complete binary tree.
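The navigation rules translate directly into index arithmetic. A minimal sketch (function names are illustrative, not GroveDB's API):

```rust
/// Capacity of a height-`h` complete binary tree: 2^h - 1 positions.
fn capacity(height: u8) -> u64 {
    (1u64 << height) - 1
}

// 0-based level-order navigation, as in the rules above.
fn left_child(i: u64) -> u64 { 2 * i + 1 }
fn right_child(i: u64) -> u64 { 2 * i + 2 }
fn parent(i: u64) -> u64 { (i - 1) / 2 }
fn is_leaf(i: u64, cap: u64) -> bool { 2 * i + 1 >= cap }

fn main() {
    let cap = capacity(3); // the height-3 tree from the diagram
    assert_eq!(cap, 7);
    // Position 1's children are 3 and 4; its parent is the root.
    assert_eq!(left_child(1), 3);
    assert_eq!(right_child(1), 4);
    assert_eq!(parent(4), 1);
    // Level-2 positions are leaves; level-1 positions are not.
    assert!(is_leaf(3, cap));
    assert!(!is_leaf(2, cap));
}
```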
Hash Computation
The root hash is not stored separately — it is recomputed from scratch whenever needed. The recursive algorithm visits only filled positions:
hash(position, store):
    if position is unfilled (>= count):
        return [0; 32]                    ← empty sentinel
    value = store.get(position)
    value_hash = blake3(value)
    left_hash  = hash(2 * position + 1, store)
    right_hash = hash(2 * position + 2, store)
    return blake3(value_hash || left_hash || right_hash)
Key properties:
- All nodes (leaf and internal): blake3(blake3(value) || H(left) || H(right))
- Leaf nodes: left_hash and right_hash are both [0; 32] (unfilled children)
- Unfilled positions: [0u8; 32] (zero hash)
- Empty tree (count = 0): [0u8; 32]
No leaf/internal domain separation tags are used. The tree structure (height
and count) is externally authenticated in the parent Element::DenseAppendOnlyFixedSizeTree,
which flows through the Merk hierarchy. The verifier always knows exactly which
positions are leaves vs internal nodes from the height and count, so an attacker
cannot substitute one for the other without breaking the parent authentication chain.
This means the root hash encodes a commitment to every stored value and its exact position in the tree. Changing any value (if it were mutable) would cascade through all ancestor hashes up to the root.
Hash cost: Computing the root hash visits all filled positions plus any unfilled children. For a tree with n values, worst case is O(n) blake3 calls. This is acceptable because the tree is designed for small, bounded capacities (max height 16, max 65,535 positions).
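For illustration, the recursion can be written out with the hash function swapped for a non-cryptographic stand-in so the sketch is dependency-free (the real implementation uses blake3; `toy_hash` and the slice-backed store are assumptions of this sketch):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for blake3 — NOT cryptographic, for illustration only.
fn toy_hash(data: &[u8]) -> [u8; 32] {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    let x = h.finish().to_be_bytes();
    let mut out = [0u8; 32];
    for (i, b) in out.iter_mut().enumerate() {
        *b = x[i % 8];
    }
    out
}

/// Recursive root hash mirroring the algorithm above. Positions fill
/// sequentially, so `values.len()` is the count; unfilled positions
/// hash to the [0; 32] sentinel.
fn node_hash(pos: u64, values: &[Vec<u8>]) -> [u8; 32] {
    if pos >= values.len() as u64 {
        return [0u8; 32]; // unfilled position
    }
    let value_hash = toy_hash(&values[pos as usize]);
    let left = node_hash(2 * pos + 1, values);
    let right = node_hash(2 * pos + 2, values);
    let mut input = Vec::with_capacity(96);
    input.extend_from_slice(&value_hash);
    input.extend_from_slice(&left);
    input.extend_from_slice(&right);
    toy_hash(&input)
}

fn main() {
    // Empty tree (count = 0) yields the sentinel itself.
    assert_eq!(node_hash(0, &[]), [0u8; 32]);
    // Every append changes the root commitment.
    let one = vec![b"a".to_vec()];
    let two = vec![b"a".to_vec(), b"b".to_vec()];
    assert_ne!(node_hash(0, &one), node_hash(0, &two));
}
```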
The Element Variant
```rust
Element::DenseAppendOnlyFixedSizeTree(
    u16,                  // count — number of values stored (max 65,535)
    u8,                   // height — immutable after creation (1..=16)
    Option<ElementFlags>, // flags — storage flags
)
```
| Field | Type | Description |
|---|---|---|
| `count` | `u16` | Number of values inserted so far (max 65,535) |
| `height` | `u8` | Tree height (1..=16), immutable after creation |
| `flags` | `Option<ElementFlags>` | Optional storage flags |
The root hash is NOT stored in the Element — it flows as the Merk child hash
via insert_subtree's subtree_root_hash parameter.
Discriminant: 14 (ElementType), TreeType = 10
Cost size: DENSE_TREE_COST_SIZE = 6 bytes (2 count + 1 height + 1 discriminant
+ 2 overhead)
Storage Layout
Like MmrTree and BulkAppendTree, the DenseAppendOnlyFixedSizeTree stores data in the
data namespace (not a child Merk). Values are keyed by their position as a big-endian u64:
Subtree path: blake3(parent_path || key)
Storage keys:
[0, 0, 0, 0, 0, 0, 0, 0] → value at position 0 (root)
[0, 0, 0, 0, 0, 0, 0, 1] → value at position 1
[0, 0, 0, 0, 0, 0, 0, 2] → value at position 2
...
The Element itself (stored in the parent Merk) carries the count and height.
The root hash flows as the Merk child hash. This means:
- Reading the root hash requires recomputation from storage (O(n) hashing)
- Reading a value by position is O(1) — single storage lookup
- Inserting is O(n) hashing — one storage write + full root hash recomputation
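The position-to-key mapping is a plain big-endian encoding; a sketch (`storage_key` is an illustrative name, not GroveDB's API):

```rust
/// Values are keyed by their position as a big-endian u64
/// in the data namespace.
fn storage_key(position: u64) -> [u8; 8] {
    position.to_be_bytes()
}

fn main() {
    assert_eq!(storage_key(0), [0, 0, 0, 0, 0, 0, 0, 0]);
    assert_eq!(storage_key(2), [0, 0, 0, 0, 0, 0, 0, 2]);
    // Big-endian keys sort in position order under a byte-wise
    // key comparator, so iteration walks positions 0, 1, 2, ...
    assert!(storage_key(9) < storage_key(10));
}
```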
Operations
dense_tree_insert(path, key, value, tx, grove_version)
Appends a value to the next available position. Returns (root_hash, position).
Step 1: Read element, extract (count, height)
Step 2: Check capacity: if count >= 2^height - 1 → error
Step 3: Build subtree path, open storage context
Step 4: Write value to position = count
Step 5: Reconstruct DenseFixedSizedMerkleTree from state
Step 6: Call tree.insert(value, store) → (root_hash, position, hash_calls)
Step 7: Update element with count + 1 (the new root_hash flows onward as the Merk child hash)
Step 8: Propagate changes up through Merk hierarchy
Step 9: Commit transaction
dense_tree_get(path, key, position, tx, grove_version)
Retrieves the value at a given position. Returns None if position >= count.
dense_tree_root_hash(path, key, tx, grove_version)
Returns the tree's root hash — the hash computed during the most recent insert, which flows through the parent Merk as the child hash, so no recomputation from the stored values is needed.
dense_tree_count(path, key, tx, grove_version)
Returns the number of values stored (the count field from the element).
Batch Operations
The GroveOp::DenseTreeInsert variant supports batch insertion through the standard
GroveDB batch pipeline:
```rust
let ops = vec![
    QualifiedGroveDbOp::dense_tree_insert_op(
        vec![b"parent".to_vec()],
        b"my_dense_tree".to_vec(),
        b"value_data".to_vec(),
    ),
];
db.apply_batch(ops, None, None, grove_version)?;
```
Preprocessing: Like all non-Merk tree types, DenseTreeInsert ops are preprocessed
before the main batch body executes. The preprocess_dense_tree_ops method:
- Groups all DenseTreeInsert ops by (path, key)
- For each group, executes the inserts sequentially (reading the element, inserting each value, updating the root hash)
- Converts each group into a ReplaceNonMerkTreeRoot op that carries the final root_hash and count through the standard propagation machinery
Multiple inserts to the same dense tree within a single batch are supported — they are processed in order and the consistency check allows duplicate keys for this op type.
Propagation: The root hash and count flow through the NonMerkTreeMeta::DenseTree
variant in ReplaceNonMerkTreeRoot, following the same pattern as MmrTree and
BulkAppendTree.
Proofs
DenseAppendOnlyFixedSizeTree supports V1 subquery proofs via the ProofBytes::DenseTree
variant. Individual positions can be proved against the tree's root hash using inclusion
proofs that carry ancestor values and sibling subtree hashes.
Auth Path Structure
Because internal nodes hash their own value (not just child hashes), the
authentication path differs from a standard Merkle tree. To verify a leaf at position
p, the verifier needs:
- The leaf value (the proved entry)
- Ancestor value hashes for every internal node on the path from p to the root (only the 32-byte hash, not the full value)
- Sibling subtree hashes for every child that is NOT on the path
Because all nodes use blake3(H(value) || H(left) || H(right)) (no domain tags),
the proof only carries 32-byte value hashes for ancestors — not full values. This
keeps proofs compact regardless of how large individual values are.
```rust
pub struct DenseTreeProof {
    pub entries: Vec<(u16, Vec<u8>)>,            // proved (position, value) pairs
    pub node_value_hashes: Vec<(u16, [u8; 32])>, // ancestor value hashes on the auth path
    pub node_hashes: Vec<(u16, [u8; 32])>,       // precomputed sibling subtree hashes
}
```
Note: height and count are not in the proof struct — the verifier gets them from the parent Element, which is authenticated by the Merk hierarchy.
Walkthrough Example
Tree with height=3, capacity=7, count=5, proving position 4:
0
/ \
1 2
/ \ / \
3 4 5 6
Path from 4 to root: 4 → 1 → 0. Expanded set: {0, 1, 4}.
The proof contains:
- entries: [(4, value[4])] — the proved position
- node_value_hashes: [(0, H(value[0])), (1, H(value[1]))] — ancestor value hashes (32 bytes each, not full values)
- node_hashes: [(2, H(subtree_2)), (3, H(node_3))] — siblings not on the path

Verification recomputes the root hash bottom-up:
1. H(4) = blake3(blake3(value[4]) || [0;32] || [0;32]) — leaf (children are unfilled)
2. H(3) — from node_hashes
3. H(1) = blake3(H(value[1]) || H(3) || H(4)) — internal node uses the value hash from node_value_hashes
4. H(2) — from node_hashes
5. H(0) = blake3(H(value[0]) || H(1) || H(2)) — root uses the value hash from node_value_hashes
6. Compare H(0) against the expected root hash
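The expanded-set computation above is plain index arithmetic; a sketch (function names are illustrative):

```rust
/// Positions on the path from `pos` up to the root, inclusive.
fn path_to_root(mut pos: u16) -> Vec<u16> {
    let mut path = vec![pos];
    while pos > 0 {
        pos = (pos - 1) / 2; // parent
        path.push(pos);
    }
    path
}

/// Siblings of path nodes that are NOT themselves on the path —
/// exactly the subtree hashes the proof must carry in node_hashes.
fn sibling_positions(path: &[u16]) -> Vec<u16> {
    let mut sibs = Vec::new();
    for &p in path {
        if p == 0 {
            continue; // the root has no sibling
        }
        // Left children (2i+1) are odd, right children (2i+2) are even.
        let sib = if p % 2 == 1 { p + 1 } else { p - 1 };
        if !path.contains(&sib) {
            sibs.push(sib);
        }
    }
    sibs.sort();
    sibs
}

fn main() {
    // The walkthrough: proving position 4 in a height-3 tree.
    let path = path_to_root(4);
    assert_eq!(path, vec![4, 1, 0]); // expanded set {0, 1, 4}
    assert_eq!(sibling_positions(&path), vec![2, 3]); // node_hashes positions
}
```

For multi-position proofs, taking the union of each position's path yields the merged expanded set, so shared ancestors appear once.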
Multi-Position Proofs
When proving multiple positions, the expanded set merges overlapping auth paths. Shared ancestors are included only once, making multi-position proofs more compact than independent single-position proofs.
V0 Limitation
V0 proofs cannot descend into dense trees. If a V0 query matches a
DenseAppendOnlyFixedSizeTree with a subquery, the system returns
Error::NotSupported directing the caller to use prove_query_v1.
Query Key Encoding
Dense tree positions are encoded as big-endian u16 (2-byte) query keys, unlike
MmrTree and BulkAppendTree which use u64. All standard QueryItem range types
are supported.
Comparison with Other Non-Merk Trees
| | DenseTree | MmrTree | BulkAppendTree | CommitmentTree |
|---|---|---|---|---|
| Element discriminant | 14 | 12 | 13 | 11 |
| TreeType | 10 | 8 | 9 | 7 |
| Capacity | Fixed (2^h - 1, max 65,535) | Unlimited | Unlimited | Unlimited |
| Data model | Every position stores a value | Leaf-only | Dense tree buffer + chunks | Leaf-only |
| Hash in Element? | No (flows as child hash) | No (flows as child hash) | No (flows as child hash) | No (flows as child hash) |
| Insert cost (hashing) | O(n) blake3 | O(1) amortized | O(1) amortized | ~33 Sinsemilla |
| Cost size | 6 bytes | 11 bytes | 12 bytes | 12 bytes |
| Proof support | V1 (Dense) | V1 (MMR) | V1 (Bulk) | V1 (CommitmentTree) |
| Best for | Small bounded structures | Event logs | High-throughput logs | ZK commitments |
When to choose DenseAppendOnlyFixedSizeTree:
- The maximum number of entries is known at creation time
- You need every position (including internal nodes) to store data
- You want the simplest possible data model with no unbounded growth
- O(n) root hash recomputation is acceptable (small tree heights)
When NOT to choose it:
- You need unlimited capacity → use MmrTree or BulkAppendTree
- You need ZK compatibility → use CommitmentTree
Usage Example
```rust
use grovedb::Element;
use grovedb_version::version::GroveVersion;

let grove_version = GroveVersion::latest();

// Create a dense tree of height 4 (capacity = 15 values)
db.insert(
    &[b"state"],
    b"validator_slots",
    Element::empty_dense_tree(4),
    None,
    None,
    grove_version,
)?;

// Append values — positions filled 0, 1, 2, ...
let (root_hash, pos) = db.dense_tree_insert(
    &[b"state"],
    b"validator_slots",
    validator_pubkey.to_vec(),
    None,
    grove_version,
)?;
// pos == 0; root_hash = blake3(blake3(validator_pubkey) || [0; 32] || [0; 32])

// Read back by position
let value = db.dense_tree_get(
    &[b"state"],
    b"validator_slots",
    0, // position
    None,
    grove_version,
)?;
assert_eq!(value, Some(validator_pubkey.to_vec()));

// Query metadata
let count = db.dense_tree_count(&[b"state"], b"validator_slots", None, grove_version)?;
let hash = db.dense_tree_root_hash(&[b"state"], b"validator_slots", None, grove_version)?;
```
Implementation Files
| File | Contents |
|---|---|
| `grovedb-dense-fixed-sized-merkle-tree/src/lib.rs` | DenseTreeStore trait, DenseFixedSizedMerkleTree struct, recursive hash |
| `grovedb-dense-fixed-sized-merkle-tree/src/proof.rs` | DenseTreeProof struct, generate(), encode_to_vec(), decode_from_slice() |
| `grovedb-dense-fixed-sized-merkle-tree/src/verify.rs` | DenseTreeProof::verify() — pure function, no storage needed |
| `grovedb-element/src/element/mod.rs` | Element::DenseAppendOnlyFixedSizeTree (discriminant 14) |
| `grovedb-element/src/element/constructor.rs` | empty_dense_tree(), new_dense_tree() |
| `merk/src/tree_type/mod.rs` | TreeType::DenseAppendOnlyFixedSizeTree = 10 |
| `merk/src/tree_type/costs.rs` | DENSE_TREE_COST_SIZE = 6 |
| `grovedb/src/operations/dense_tree.rs` | GroveDB operations, AuxDenseTreeStore, batch preprocessing |
| `grovedb/src/operations/proof/generate.rs` | generate_dense_tree_layer_proof(), query_items_to_positions() |
| `grovedb/src/operations/proof/verify.rs` | verify_dense_tree_lower_layer() |
| `grovedb/src/operations/proof/mod.rs` | ProofBytes::DenseTree variant |
| `grovedb/src/batch/estimated_costs/average_case_costs.rs` | Average case cost model |
| `grovedb/src/batch/estimated_costs/worst_case_costs.rs` | Worst case cost model |
| `grovedb/src/tests/dense_tree_tests.rs` | 22 integration tests |
Quantum Cryptography — Post-Quantum Threat Analysis
This chapter analyzes how quantum computers would affect the cryptographic primitives used in GroveDB and the shielded transaction protocols built on top of it (Orchard, Dash Platform). It covers which components are vulnerable, which are safe, what "harvest now, decrypt later" means for stored data, and what mitigation strategies exist — including hybrid KEM designs.
Two Quantum Algorithms That Matter
Only two quantum algorithms are relevant to cryptography in practice:
Shor's algorithm solves the discrete logarithm problem (and integer factoring) in polynomial time. For a 255-bit elliptic curve like Pallas, this requires roughly 510 logical qubits — but with error correction overhead, the real requirement is approximately 4 million physical qubits. Shor's algorithm completely breaks all elliptic curve cryptography regardless of key size.
Grover's algorithm provides a quadratic speedup for brute-force search. A 256-bit symmetric key effectively becomes 128-bit. However, the circuit depth for Grover's on a 128-bit key space is still 2^64 quantum operations — many cryptographers believe this will never be practical on real hardware due to decoherence limits. Grover's reduces security margins but does not break well-parameterized symmetric cryptography.
| Algorithm | Targets | Speedup | Practical impact |
|---|---|---|---|
| Shor's | ECC discrete log, RSA factoring | Polynomial time (exponential speedup over classical) | Total break of ECC |
| Grover's | Symmetric key search, hash preimage | Quadratic (halves key bits) | 256-bit → 128-bit (still safe) |
GroveDB's Cryptographic Primitives
GroveDB and the Orchard-based shielded protocol use a mix of elliptic curve and symmetric/hash-based primitives. The table below classifies every primitive by its quantum vulnerability:
Quantum-Vulnerable (Shor's algorithm — 0 bits post-quantum)
| Primitive | Where used | What breaks |
|---|---|---|
| Pallas ECDLP | Note commitments (cmx), ephemeral keys (epk/esk), viewing keys (ivk), payment keys (pk_d), nullifier derivation | Recover any private key from its public counterpart |
| ECDH key agreement (Pallas) | Deriving symmetric encryption keys for note ciphertexts | Recover shared secret → decrypt all notes |
| Sinsemilla hash | Commitment tree Merkle paths, in-circuit hashing | Collision resistance depends on ECDLP; degrades when Pallas breaks |
| Halo 2 IPA | ZK proof system (polynomial commitment over Pasta curves) | Forge proofs for false statements (counterfeit, unauthorized spends) |
| Pedersen commitments | Value commitments (cv_net) hiding transaction amounts | Recover hidden amounts; forge balance proofs |
Quantum-Safe (Grover's algorithm — 128+ bits post-quantum)
| Primitive | Where used | Post-quantum security |
|---|---|---|
| Blake3 | Merk tree node hashes, MMR nodes, BulkAppendTree state roots, subtree path prefixes | 128-bit preimage, 128-bit second-preimage |
| BLAKE2b-256 | KDF for symmetric key derivation, outgoing cipher key, PRF^expand | 128-bit preimage |
| ChaCha20-Poly1305 | Encrypts enc_ciphertext and out_ciphertext (256-bit keys) | 128-bit key search (safe, but key derivation path through ECDH is not) |
| PRF^expand (BLAKE2b-512) | Derives esk, rcm, psi from rseed | 128-bit PRF security |
GroveDB Infrastructure: Believed Quantum-Safe Under Current Assumptions
All of GroveDB's own data structures rely exclusively on Blake3 hashing, which is believed to be quantum-resistant under current cryptographic assumptions:
- Merk AVL trees — node hashes, combined_value_hash, child hash propagation
- MMR trees — internal node hashes, peak computation, root derivation
- BulkAppendTree — buffer hash chains, dense Merkle roots, epoch MMR
- CommitmentTree state root — blake3("ct_state" || sinsemilla_root || bulk_state_root)
- Subtree path prefixes — Blake3 hashing of path segments
- V1 proofs — authentication chains through Merk hierarchy
No changes needed based on known attacks. GroveDB's Merk tree proofs, MMR consistency checks, BulkAppendTree epoch roots, and all V1 proof authentication chains are believed to remain secure against quantum computers. Hash-based infrastructure is the strongest part of the system post-quantum, though assessments may evolve with new cryptanalytic techniques.
Retroactive vs Real-Time Threats
This distinction is critical for prioritizing what to fix and when.
Retroactive threats compromise data that is already stored. An adversary records data today and decrypts it when quantum computers become available. These threats cannot be mitigated after the fact — once the data is on-chain, it cannot be re-encrypted or recalled.
Real-time threats only affect transactions created in the future. An adversary with a quantum computer could forge signatures or proofs, but only for new transactions. Old transactions were already validated and confirmed by the network.
| Threat | Type | What's exposed | Urgency |
|---|---|---|---|
| Note decryption (enc_ciphertext) | Retroactive | Note contents: recipient, amount, memo, rseed | High — stored forever |
| Value commitment opening (cv_net) | Retroactive | Transaction amounts (but not sender/receiver) | Medium — amounts only |
| Sender recovery data (out_ciphertext) | Retroactive | Sender's recovery keys for sent notes | High — stored forever |
| Spend authorization forgery | Real-time | Could forge new spend signatures | Low — upgrade before QC arrives |
| Halo 2 proof forgery | Real-time | Could forge new proofs (counterfeit) | Low — upgrade before QC arrives |
| Sinsemilla collision | Real-time | Could forge new Merkle paths | Low — subsumed by proof forgery |
| Binding signature forgery | Real-time | Could forge new balance proofs | Low — upgrade before QC arrives |
What Exactly Gets Revealed?
If note encryption is broken (the primary HNDL threat):
A quantum adversary recovers esk from the stored epk via Shor's algorithm,
computes the ECDH shared secret, derives the symmetric key, and decrypts
enc_ciphertext. This reveals the full note plaintext:
| Field | Size | What it reveals |
|---|---|---|
| version | 1 byte | Protocol version (not sensitive) |
| diversifier | 11 bytes | Recipient's address component |
| value | 8 bytes | Exact transaction amount |
| rseed | 32 bytes | Enables nullifier linkage (deanonymizes transaction graph) |
| memo | 36 bytes (DashMemo) | Application data, potentially identifying |
With rseed and rho (stored alongside the ciphertext), the adversary can
compute esk = PRF(rseed, rho) and verify the ephemeral key binding. Combined
with the diversifier, this links inputs to outputs across the entire transaction
history — full deanonymization of the shielded pool.
If only value commitments are broken (secondary HNDL threat):
The adversary recovers v from cv_net = [v]*V + [rcv]*R by solving ECDLP.
This reveals transaction amounts but not sender or receiver identities.
The adversary sees "someone sent 5.0 Dash to someone" but cannot link the
amount to any address or identity without also breaking note encryption.
On its own, amounts without linkage are limited in usefulness. But combined with external data (timing, known invoices, amounts matching public requests), correlation attacks become possible.
The "Harvest Now, Decrypt Later" Attack
This is the most urgent and practical quantum threat.
Attack model: A state-level adversary (or any party with sufficient storage) records all on-chain shielded transaction data today. This data is publicly available on the blockchain and immutable. The adversary waits for a cryptographically relevant quantum computer (CRQC), then:
Step 1: Read stored record from CommitmentTree BulkAppendTree:
cmx (32) || rho (32) || epk (32) || enc_ciphertext (104) || out_ciphertext (80)
Step 2: Solve ECDLP on Pallas via Shor's algorithm:
epk = [esk] * g_d → recover esk
Step 3: Compute shared secret:
shared_secret = [esk] * pk_d
Step 4: Derive symmetric key (BLAKE2b is quantum-safe, but input is compromised):
K_enc = BLAKE2b-256("Zcash_OrchardKDF", shared_secret || epk)
Step 5: Decrypt enc_ciphertext using ChaCha20-Poly1305:
→ version || diversifier || value || rseed || memo
Step 6: With rseed + rho, link nullifiers to note commitments:
esk = PRF(rseed, rho)
→ full transaction graph reconstruction
Key insight: The symmetric encryption (ChaCha20-Poly1305) is perfectly quantum-safe. The vulnerability is entirely in the key derivation path — the symmetric key is derived from an ECDH shared secret, and ECDH is broken by Shor's algorithm. The attacker doesn't break the encryption; they recover the key.
Retroactivity: This attack is fully retroactive. Every encrypted note ever stored on-chain can be decrypted once a CRQC exists. The data cannot be re-encrypted or protected after the fact. This is why it must be addressed before data is stored, not after.
Mitigation: Hybrid KEM (ML-KEM + ECDH)
The defense against HNDL is to derive the symmetric encryption key from two independent key agreement mechanisms, such that breaking only one is insufficient. This is called a hybrid KEM.
ML-KEM-768 (CRYSTALS-Kyber)
ML-KEM is the NIST-standardized (FIPS 203, August 2024) post-quantum key encapsulation mechanism based on the Module Learning With Errors (MLWE) problem.
| Parameter | ML-KEM-512 | ML-KEM-768 | ML-KEM-1024 |
|---|---|---|---|
| Public key (ek) | 800 bytes | 1,184 bytes | 1,568 bytes |
| Ciphertext (ct) | 768 bytes | 1,088 bytes | 1,568 bytes |
| Shared secret | 32 bytes | 32 bytes | 32 bytes |
| NIST Category | 1 (128-bit) | 3 (192-bit) | 5 (256-bit) |
ML-KEM-768 is the recommended choice — it is the parameter set used by X-Wing, Signal's PQXDH, and Chrome/Firefox TLS hybrid key exchange. Category 3 provides a comfortable margin against future lattice cryptanalysis advances.
How the Hybrid Scheme Works
Current flow (vulnerable):
Sender:
esk = PRF(rseed, rho) // deterministic from note
epk = [esk] * g_d // Pallas curve point
shared_secret = [esk] * pk_d // ECDH (broken by Shor's)
K_enc = BLAKE2b("Zcash_OrchardKDF", shared_secret || epk)
enc_ciphertext = ChaCha20(K_enc, note_plaintext)
Hybrid flow (quantum-resistant):
Sender:
esk = PRF(rseed, rho) // unchanged
epk = [esk] * g_d // unchanged
ss_ecdh = [esk] * pk_d // same ECDH
(ct_pq, ss_pq) = ML-KEM-768.Encaps(ek_pq) // NEW: lattice-based KEM
// ek_pq from recipient's address
K_enc = BLAKE2b( // MODIFIED: combines both secrets
"DashPlatform_HybridKDF",
ss_ecdh || ss_pq || ct_pq || epk
)
enc_ciphertext = ChaCha20(K_enc, note_plaintext) // unchanged
Recipient decryption:
Recipient:
ss_ecdh = [ivk] * epk // same ECDH (using incoming viewing key)
ss_pq = ML-KEM-768.Decaps(dk_pq, ct_pq) // NEW: KEM decapsulation
K_enc = BLAKE2b("DashPlatform_HybridKDF", ss_ecdh || ss_pq || ct_pq || epk)
note_plaintext = ChaCha20.Decrypt(K_enc, enc_ciphertext)
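The essential step is the combiner: the symmetric key commits to both shared secrets plus the key-agreement transcript. A sketch of just that step, with a non-cryptographic stand-in hash so it runs without crypto crates (the real scheme uses BLAKE2b over fixed-width inputs; `hybrid_kdf` and `toy_hash` are illustrative names):

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Stand-in for BLAKE2b — NOT cryptographic, for illustration only.
fn toy_hash(data: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    data.hash(&mut h);
    h.finish()
}

/// Hybrid KDF sketch: the derived key depends on BOTH shared secrets
/// and the transcript (ct_pq, epk). Real inputs are fixed-width, so
/// plain concatenation is unambiguous.
fn hybrid_kdf(ss_ecdh: &[u8], ss_pq: &[u8], ct_pq: &[u8], epk: &[u8]) -> u64 {
    let mut input = Vec::new();
    input.extend_from_slice(b"DashPlatform_HybridKDF"); // domain separator
    input.extend_from_slice(ss_ecdh);
    input.extend_from_slice(ss_pq);
    input.extend_from_slice(ct_pq);
    input.extend_from_slice(epk);
    toy_hash(&input)
}

fn main() {
    let k = hybrid_kdf(b"ecdh_secret", b"pq_secret", b"ct", b"epk");
    // An attacker recovering only ONE of the two secrets derives
    // a different key — breaking one mechanism is insufficient.
    assert_ne!(k, hybrid_kdf(b"ecdh_secret", b"wrong", b"ct", b"epk"));
    assert_ne!(k, hybrid_kdf(b"wrong", b"pq_secret", b"ct", b"epk"));
}
```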
Security Guarantee
The combined KEM is IND-CCA2 secure if either component KEM is secure. This is formally proven by Giacon, Heuer, and Poettering (2018) for KEM combiners using a PRF (BLAKE2b qualifies), and independently by the X-Wing security proof.
| Scenario | ECDH | ML-KEM | Combined key | Status |
|---|---|---|---|---|
| Classical world | Secure | Secure | Secure | Both intact |
| Quantum breaks ECC | Broken | Secure | Secure | ML-KEM protects |
| Lattice advances break ML-KEM | Secure | Broken | Secure | ECDH protects (same as today) |
| Both broken | Broken | Broken | Broken | Requires two simultaneous breakthroughs |
Size Impact
The hybrid KEM adds the ML-KEM-768 ciphertext (1,088 bytes) to each stored note and expands the outgoing ciphertext to include the ML-KEM shared secret for sender recovery:
Stored record per note:
┌──────────────────────────────────────────────────────────────────┐
│ Current (280 bytes) Hybrid (1,400 bytes) │
│ │
│ cmx: 32 cmx: 32 │
│ rho: 32 rho: 32 │
│ epk: 32 epk: 32 │
│ enc_ciphertext: 104 ct_pq: 1,088 ← NEW │
│ out_ciphertext: 80 enc_ciphertext: 104 │
│ out_ciphertext: 112 ← +32 │
│ ───────────────── ────────────────────── │
│ Total: 280 Total: 1,400 (5.0x) │
└──────────────────────────────────────────────────────────────────┘
Storage at scale:
| Notes | Current (280 B) | Hybrid (1,400 B) | Delta |
|---|---|---|---|
| 100,000 | 26.7 MB | 133 MB | +106 MB |
| 1,000,000 | 267 MB | 1.33 GB | +1.07 GB |
| 10,000,000 | 2.67 GB | 13.3 GB | +10.7 GB |
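The per-note and aggregate figures follow directly from the field sizes above:

```rust
fn main() {
    // Per-note record sizes from the layout diagram.
    let current = 32 + 32 + 32 + 104 + 80; // cmx, rho, epk, enc_ct, out_ct
    let hybrid = 32 + 32 + 32 + 1_088 + 104 + 112; // + ct_pq, wider out_ct
    assert_eq!(current, 280);
    assert_eq!(hybrid, 1_400);
    // The hybrid scheme adds 1,120 bytes per stored note.
    assert_eq!(hybrid - current, 1_120);
    // At 1,000,000 notes that is 1.4 billion bytes total (≈ 1.3 GiB).
    let total: u64 = 1_000_000 * hybrid as u64;
    assert_eq!(total, 1_400_000_000);
}
```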
Address size:
Current: diversifier (11) + pk_d (32) = 43 bytes
Hybrid: diversifier (11) + pk_d (32) + ek_pq (1,184) = 1,227 bytes
The 1,184-byte ML-KEM public key must be included in the address so senders can perform encapsulation. At ~1,960 Bech32m characters, this is large but still fits in a QR code (a version 40 code holds up to 2,953 bytes, or 4,296 characters in alphanumeric mode).
Key Management
The ML-KEM keypair derives deterministically from the spending key:
SpendingKey (sk) [32 bytes]
|
+-> ... (all existing Orchard key derivation unchanged)
|
+-> ml_kem_d = PRF^expand(sk, [0x09])[0..32]
+-> ml_kem_z = PRF^expand(sk, [0x0A])[0..32]
|
+-> (ek_pq, dk_pq) = ML-KEM-768.KeyGen(d=ml_kem_d, z=ml_kem_z)
ek_pq: 1,184 bytes (public, included in address)
dk_pq: 2,400 bytes (private, part of viewing key)
No backup changes needed. The existing 24-word seed phrase covers the ML-KEM key because it derives from the spending key deterministically. Wallet recovery works as before.
Diversified addresses all share the same ek_pq because ML-KEM has no
natural diversification mechanism like Pallas scalar multiplication. This means
an observer with two of a user's addresses can link them by comparing ek_pq.
Trial Decryption Performance
| Step | Current | Hybrid | Delta |
|---|---|---|---|
| Pallas ECDH | ~100 us | ~100 us | — |
| ML-KEM-768 Decaps | — | ~40 us | +40 us |
| BLAKE2b KDF | ~0.5 us | ~1 us | — |
| ChaCha20 (52 bytes) | ~0.1 us | ~0.1 us | — |
| Total per note | ~101 us | ~141 us | +40% overhead |
Scanning 100,000 notes: ~10.1 sec → ~14.1 sec. The overhead is meaningful but not prohibitive. ML-KEM decapsulation is constant-time with no batching advantage (unlike elliptic curve operations), so it scales linearly.
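The scan-time figures are straight multiplication of the per-note costs:

```rust
fn main() {
    // Per-note trial-decryption cost in microseconds (from the table above).
    let current_us = 101u64;
    let hybrid_us = 141u64;
    let notes = 100_000u64;
    assert_eq!(current_us * notes, 10_100_000); // ≈ 10.1 s for a full scan
    assert_eq!(hybrid_us * notes, 14_100_000);  // ≈ 14.1 s with hybrid KEM
    // ML-KEM decapsulation adds a constant ~40 us per note and scales
    // linearly — there is no batching advantage to amortize it.
    assert_eq!(hybrid_us - current_us, 40);
}
```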
Impact on ZK Circuits
None. The hybrid KEM is entirely in the transport/encryption layer. The Halo 2 circuit proves note existence, nullifier correctness, and value balance — it does not prove anything about encryption. No changes to proving keys, verifying keys, or circuit constraints.
Comparison with Industry
| System | Approach | Status |
|---|---|---|
| Signal (PQXDH) | X25519 + ML-KEM-768, mandatory for all users | Deployed (2023) |
| Chrome/Firefox TLS | X25519 + ML-KEM-768 hybrid key exchange | Deployed (2024) |
| X-Wing (IETF draft) | X25519 + ML-KEM-768, purpose-built combiner | Draft standard |
| Zcash | Quantum recoverability draft ZIP (fund recovery, not encryption) | Discussion only |
| Dash Platform | Pallas ECDH + ML-KEM-768 (proposed) | Design phase |
When to Deploy
The Timeline Question
- Current state (2026): No quantum computer can break 255-bit ECC. Largest demonstrated quantum factoring: ~50 bits. Gap: orders of magnitude.
- Near-term (2030-2035): Hardware roadmaps from IBM, Google, Quantinuum target millions of qubits. ML-KEM implementations and parameter sets will have matured.
- Medium-term (2035-2050): Most estimates place CRQC arrival in this window. HNDL data collected today is at risk.
- Long-term (2050+): Consensus among cryptographers: large-scale quantum computers are a matter of "when," not "if."
Recommended Strategy
1. Design for upgradability now. Ensure the stored record format, the
TransmittedNoteCiphertext struct, and the BulkAppendTree entry layout are
versioned and extensible. This is low-cost and preserves the option to add
hybrid KEM later.
2. Deploy hybrid KEM when ready, make it mandatory. Do not offer two pools (classical and hybrid). Splitting the anonymity set defeats the purpose of shielded transactions — users hiding among a smaller group are less private, not more. When deployed, every note uses the hybrid scheme.
3. Target the 2028-2030 window. This is well before any realistic quantum threat but after ML-KEM implementations and parameter sizes have stabilized. It also allows learning from Zcash's and Signal's deployment experience.
4. Monitor trigger events:
- NIST or NSA mandating post-quantum migration deadlines
- Significant advances in quantum hardware (>100,000 physical qubits with error correction)
- Cryptanalytic advances against lattice problems (would affect ML-KEM choice)
What Does Not Need Urgent Action
| Component | Why it can wait |
|---|---|
| Spend authorization signatures | Forgery is real-time, not retroactive. Upgrade to ML-DSA/SLH-DSA before CRQC arrives. |
| Halo 2 proof system | Proof forgery is real-time. Migrate to STARK-based system when needed. |
| Sinsemilla collision resistance | Only useful for new attacks, not retroactive. Subsumed by proof system migration. |
| GroveDB Merk/MMR/Blake3 infrastructure | Already quantum-safe under current cryptographic assumptions. No action needed based on known attacks. |
Post-Quantum Alternatives Reference
For Encryption (replacing ECDH)
| Scheme | Type | Public key | Ciphertext | NIST Category | Notes |
|---|---|---|---|---|---|
| ML-KEM-768 | Lattice (MLWE) | 1,184 B | 1,088 B | 3 (192-bit) | FIPS 203, industry standard |
| ML-KEM-512 | Lattice (MLWE) | 800 B | 768 B | 1 (128-bit) | Smaller, lower margin |
| ML-KEM-1024 | Lattice (MLWE) | 1,568 B | 1,568 B | 5 (256-bit) | Overkill for hybrid |
For Signatures (replacing RedPallas/Schnorr)
| Scheme | Type | Public key | Signature | NIST Category | Notes |
|---|---|---|---|---|---|
| ML-DSA-65 (Dilithium) | Lattice | 1,952 B | 3,309 B | 3 | FIPS 204, fast |
| SLH-DSA (SPHINCS+) | Hash-based | 32-64 B | 7,856-49,856 B | 1-5 | FIPS 205, conservative |
| XMSS/LMS | Hash-based (stateful) | 60 B | 2,500 B | varies | Stateful — reuse = break |
For ZK Proofs (replacing Halo 2)
| System | Assumption | Proof size | Post-quantum | Notes |
|---|---|---|---|---|
| STARKs | Hash functions (collision resistance) | ~100-400 KB | Yes | Used by StarkNet |
| Plonky3 | FRI (hash-based polynomial commitment) | ~50-200 KB | Yes | Active development |
| Halo 2 (current) | ECDLP on Pasta curves | ~5 KB | No | Current Orchard system |
| Lattice SNARKs | MLWE | Research | Yes | Not production-ready |
Rust Crate Ecosystem
| Crate | Source | FIPS 203 | Verified | Notes |
|---|---|---|---|---|
| `libcrux-ml-kem` | Cryspen | Yes | Formally verified (hax/F*) | Highest assurance |
| `ml-kem` | RustCrypto | Yes | Constant-time, not audited | Ecosystem compatibility |
| `fips203` | integritychain | Yes | Constant-time | Pure Rust, no_std |
Summary
┌─────────────────────────────────────────────────────────────────────┐
│ QUANTUM THREAT SUMMARY FOR GROVEDB + ORCHARD │
│ │
│ SAFE UNDER CURRENT ASSUMPTIONS (hash-based): │
│ ✓ Blake3 Merk trees, MMR, BulkAppendTree │
│ ✓ BLAKE2b KDF, PRF^expand │
│ ✓ ChaCha20-Poly1305 symmetric encryption │
│ ✓ All GroveDB proof authentication chains │
│ │
│ FIX BEFORE DATA IS STORED (retroactive HNDL): │
│ ✗ Note encryption (ECDH key agreement) → Hybrid KEM │
│ ✗ Value commitments (Pedersen) → amounts revealed │
│ │
│ FIX BEFORE QUANTUM COMPUTERS ARRIVE (real-time only): │
│ ~ Spend authorization → ML-DSA / SLH-DSA │
│ ~ ZK proofs → STARKs / Plonky3 │
│ ~ Sinsemilla → hash-based Merkle tree │
│ │
│ RECOMMENDED TIMELINE: │
│ 2026-2028: Design for upgradability, version stored formats │
│ 2028-2030: Deploy mandatory hybrid KEM for note encryption │
│ 2035+: Migrate signatures and proof system if needed │
└─────────────────────────────────────────────────────────────────────┘
Appendix A: Complete Element Type Reference
| Discriminant | Variant | TreeType | Fields | Cost Size | Purpose |
|---|---|---|---|---|---|
| 0 | Item | N/A | (value, flags) | varies | Basic key-value storage |
| 1 | Reference | N/A | (path, max_hop, flags) | varies | Link between elements |
| 2 | Tree | 0 (NormalTree) | (root_key, flags) | TREE_COST_SIZE | Container for subtrees |
| 3 | SumItem | N/A | (value, flags) | varies | Contributes to parent sum |
| 4 | SumTree | 1 (SumTree) | (root_key, sum, flags) | SUM_TREE_COST_SIZE | Maintains sum of descendants |
| 5 | BigSumTree | 4 (BigSumTree) | (root_key, sum128, flags) | BIG_SUM_TREE_COST_SIZE | 128-bit sum tree |
| 6 | CountTree | 2 (CountTree) | (root_key, count, flags) | COUNT_TREE_COST_SIZE | Element counting tree |
| 7 | CountSumTree | 3 (CountSumTree) | (root_key, count, sum, flags) | COUNT_SUM_TREE_COST_SIZE | Combined count + sum |
| 8 | ItemWithSumItem | N/A | (value, sum, flags) | varies | Item with sum contribution |
| 9 | ProvableCountTree | 5 | (root_key, count, flags) | COUNT_TREE_COST_SIZE | Provable count tree |
| 10 | ProvableCountSumTree | 6 | (root_key, count, sum, flags) | COUNT_SUM_TREE_COST_SIZE | Provable count + sum |
| 11 | CommitmentTree | 7 | (total_count: u64, chunk_power: u8, flags) | 12 | ZK-friendly Sinsemilla + BulkAppendTree |
| 12 | MmrTree | 8 | (mmr_size: u64, flags) | 11 | Append-only MMR log |
| 13 | BulkAppendTree | 9 | (total_count: u64, chunk_power: u8, flags) | 12 | High-throughput append-only log |
| 14 | DenseAppendOnlyFixedSizeTree | 10 | (count: u16, height: u8, flags) | 6 | Dense fixed-capacity Merkle storage |
Notes:
- Discriminants 11–14 are non-Merk trees: data lives outside a child Merk subtree
- All four store non-Merk data in the data column
- `CommitmentTree` stores its Sinsemilla frontier alongside BulkAppendTree entries in the same data column (key `b"__ct_data__"`)
- Non-Merk trees do NOT have a `root_key` field — their type-specific root hash flows as the Merk child hash via `insert_subtree`
- `CommitmentTree` uses Sinsemilla hashing (Pallas curve); all others use Blake3
- Cost behavior for non-Merk trees follows `NormalTree` (BasicMerkNode, no aggregation)
- `DenseAppendOnlyFixedSizeTree` count is `u16` (max 65,535); heights restricted to 1..=16
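The discriminant column above can be captured as a small enum. The following Python sketch mirrors the table; the enum and the `is_non_merk` helper are illustrative and are not GroveDB's actual serialization code:

```python
from enum import IntEnum

class ElementType(IntEnum):
    """Discriminants from the Appendix A table (illustrative only)."""
    ITEM = 0
    REFERENCE = 1
    TREE = 2
    SUM_ITEM = 3
    SUM_TREE = 4
    BIG_SUM_TREE = 5
    COUNT_TREE = 6
    COUNT_SUM_TREE = 7
    ITEM_WITH_SUM_ITEM = 8
    PROVABLE_COUNT_TREE = 9
    PROVABLE_COUNT_SUM_TREE = 10
    COMMITMENT_TREE = 11
    MMR_TREE = 12
    BULK_APPEND_TREE = 13
    DENSE_APPEND_ONLY_FIXED_SIZE_TREE = 14

# Discriminants 11-14 are the non-Merk tree types: their data lives
# outside a child Merk subtree, in the data column.
NON_MERK_TYPES = {
    ElementType.COMMITMENT_TREE,
    ElementType.MMR_TREE,
    ElementType.BULK_APPEND_TREE,
    ElementType.DENSE_APPEND_ONLY_FIXED_SIZE_TREE,
}

def is_non_merk(discriminant: int) -> bool:
    return ElementType(discriminant) in NON_MERK_TYPES
```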
Appendix B: Proof System Quick Reference
V0 Proofs (Merk-only)
Generate:
1. Start at root Merk, execute query → collect matching elements
2. For each matching tree element with subquery:
Open child Merk → prove subquery recursively
3. Serialize as GroveDBProof::V0(root_layer, Vec<(key, GroveDBProof)>)
Verify:
1. Verify root Merk proof → root_hash, elements
2. For each sub-proof:
Verify child Merk proof → child_hash
Check: combine_hash(value_hash(parent_element), child_hash) == expected
3. Final root_hash must match known GroveDB root
V1 Proofs (Mixed Merk + Non-Merk)
When any layer involves a CommitmentTree, MmrTree, BulkAppendTree, or DenseAppendOnlyFixedSizeTree, a V1 proof is generated:
Generate:
1. Same as V0 for Merk layers
2. When descending into CommitmentTree:
→ generate_commitment_tree_layer_proof(query_items) → ProofBytes::CommitmentTree(bytes)
(sinsemilla_root (32 bytes) || BulkAppendTree proof bytes)
3. When descending into MmrTree:
→ generate_mmr_layer_proof(query_items) → ProofBytes::MMR(bytes)
4. When descending into BulkAppendTree:
→ generate_bulk_append_layer_proof(query_items) → ProofBytes::BulkAppendTree(bytes)
5. When descending into DenseAppendOnlyFixedSizeTree:
→ generate_dense_tree_layer_proof(query_items) → ProofBytes::DenseTree(bytes)
6. Store as LayerProof { merk_proof, lower_layers }
Verify:
1. Same as V0 for Merk layers
2. For ProofBytes::CommitmentTree: extract sinsemilla_root, verify inner
BulkAppendTree proof, recompute combined root hash
3. For ProofBytes::MMR: verify MMR proof against MMR root from parent child hash
4. For ProofBytes::BulkAppendTree: verify against state_root from parent child hash
5. For ProofBytes::DenseTree: verify against root_hash from parent child hash
(recompute root from entries + ancestor values + sibling hashes)
6. Type-specific root flows as Merk child hash (not NULL_HASH)
V0/V1 selection: If all layers are Merk, produce V0 (backward compatible). If any layer is a non-Merk tree, produce V1.
Appendix C: Hash Computation Reference
Merk Node Hash
node_hash = blake3(key_len(1) || key || value_hash(32) || left_hash(32) || right_hash(32))
GroveDB Subtree Prefix
prefix = blake3(parent_prefix || key) → 32 bytes
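Both formulas can be sketched in a few lines of Python. Since Blake3 is not in the Python standard library, `hashlib.blake2b` with a 32-byte digest stands in for it here; the example key, value, and empty parent prefix are illustrative, not values GroveDB produces:

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for Blake3 (not in the Python stdlib); GroveDB uses real Blake3.
    return hashlib.blake2b(data, digest_size=32).digest()

def node_hash(key: bytes, value_hash: bytes, left: bytes, right: bytes) -> bytes:
    # node_hash = blake3(key_len(1) || key || value_hash(32) || left_hash(32) || right_hash(32))
    assert len(key) < 256
    assert len(value_hash) == len(left) == len(right) == 32
    return h(bytes([len(key)]) + key + value_hash + left + right)

def subtree_prefix(parent_prefix: bytes, key: bytes) -> bytes:
    # prefix = blake3(parent_prefix || key) -> 32 bytes
    return h(parent_prefix + key)

EMPTY = b"\x00" * 32
leaf = node_hash(b"alice", h(b"value bytes"), EMPTY, EMPTY)
# Assuming, for illustration, an empty prefix at the root:
prefix = subtree_prefix(b"", b"identities")
assert len(leaf) == 32 and len(prefix) == 32
```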
State Root — BulkAppendTree
state_root = blake3("bulk_state" || mmr_root || dense_tree_root)
The buffer is a DenseFixedSizedMerkleTree — dense_tree_root is its root hash.
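As a minimal sketch of the domain-separated combination (again substituting `hashlib.blake2b` for Blake3, which is not in the Python stdlib; the input roots are placeholders):

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for Blake3 (not in the Python stdlib)
    return hashlib.blake2b(data, digest_size=32).digest()

def state_root(mmr_root: bytes, dense_tree_root: bytes) -> bytes:
    # state_root = blake3("bulk_state" || mmr_root || dense_tree_root)
    return h(b"bulk_state" + mmr_root + dense_tree_root)

root = state_root(h(b"chunk mmr root"), h(b"buffer tree root"))
assert len(root) == 32
```

The `"bulk_state"` tag domain-separates this combination from plain two-child node merges.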
Dense Merkle Root — BulkAppendTree Chunks
leaves[i] = blake3(entry[i])
internal[parent] = blake3(left_child || right_child)
root = internal[0] (top of complete binary tree)
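The chunk-root computation can be sketched as a bottom-up fold over levels, using `hashlib.blake2b` as a stdlib stand-in for Blake3. The assumption that a chunk holds a power-of-two number of entries (so the binary tree is complete) follows from the `chunk_power` field:

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for Blake3 (not in the Python stdlib)
    return hashlib.blake2b(data, digest_size=32).digest()

def chunk_root(entries: list) -> bytes:
    # leaves[i] = blake3(entry[i]); each internal node = blake3(left || right).
    # Assumes a complete chunk: len(entries) is a power of two.
    level = [h(e) for e in entries]
    while len(level) > 1:
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

root = chunk_root([b"e0", b"e1", b"e2", b"e3"])
assert len(root) == 32
```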
Dense Tree Root Hash — DenseAppendOnlyFixedSizeTree
Recursive from root (position 0):
all nodes with data: blake3(blake3(value) || H(left) || H(right))
(no domain separation tags — structure externally authenticated)
leaf nodes: left_hash = right_hash = [0; 32]
unfilled position: [0; 32]
empty tree: [0; 32]
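The recursion above can be sketched directly: positions are filled densely from 0, children of position p sit at 2p+1 and 2p+2, and any out-of-range or unfilled position hashes to all zeros. `hashlib.blake2b` again stands in for Blake3:

```python
import hashlib

ZERO = b"\x00" * 32  # hash for unfilled positions and the empty tree

def h(data: bytes) -> bytes:
    # Stand-in for Blake3 (not in the Python stdlib)
    return hashlib.blake2b(data, digest_size=32).digest()

def dense_root(values: list, height: int, pos: int = 0) -> bytes:
    # values[0..count-1] are the densely filled positions of a complete
    # binary tree with capacity 2^height - 1.
    capacity = (1 << height) - 1
    if pos >= capacity or pos >= len(values):
        return ZERO  # unfilled position; also yields [0; 32] for the empty tree
    left = dense_root(values, height, 2 * pos + 1)
    right = dense_root(values, height, 2 * pos + 2)
    # node hash = blake3(blake3(value) || H(left) || H(right));
    # leaves fall out naturally because both children are ZERO
    return h(h(values[pos]) + left + right)

assert dense_root([], 3) == ZERO
assert len(dense_root([b"a", b"b"], 3)) == 32
```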
Dense Tree Proof Verification — DenseAppendOnlyFixedSizeTree
Given: entries, node_value_hashes, node_hashes, expected_root
Pre-checks:
1. Validate height in [1, 16]
2. Validate count <= capacity (= 2^height - 1)
3. Reject if any field has > 100,000 elements (DoS prevention)
4. Reject duplicate positions within entries, node_value_hashes, node_hashes
5. Reject overlapping positions between the three fields
6. Reject node_hashes at ancestors of any proved entry (prevents forgery)
recompute_hash(pos):
if pos >= capacity or pos >= count → [0; 32]
if pos in node_hashes → return precomputed hash
value_hash = blake3(entries[pos]) or node_value_hashes[pos]
left_hash = recompute_hash(2*pos+1)
right_hash = recompute_hash(2*pos+2)
return blake3(value_hash || left_hash || right_hash)
Verify: recompute_hash(0) == expected_root
GroveDB cross-validation:
proof.height must match element.height
proof.count must match element.count
MMR Node Merge — MmrTree / BulkAppendTree Chunk MMR
parent.hash = blake3(left.hash || right.hash)
Sinsemilla Root — CommitmentTree
Sinsemilla hash over Pallas curve (see Zcash ZIP-244)
MerkleHashOrchard::empty_root(Level::from(32)) for empty tree
combine_hash (Parent-Child Binding)
combine_hash(value_hash, child_hash) = blake3(value_hash || child_hash)
For Merk trees: child_hash = child Merk root hash
For non-Merk trees: child_hash = type-specific root (mmr_root, state_root, dense_root, etc.)
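A short sketch shows why this binding matters (with `hashlib.blake2b` standing in for Blake3; the input byte strings are hypothetical placeholders, not real serialized elements):

```python
import hashlib

def h(data: bytes) -> bytes:
    # Stand-in for Blake3 (not in the Python stdlib)
    return hashlib.blake2b(data, digest_size=32).digest()

def combine_hash(value_hash: bytes, child_hash: bytes) -> bytes:
    # combine_hash(value_hash, child_hash) = blake3(value_hash || child_hash)
    return h(value_hash + child_hash)

parent_value_hash = h(b"serialized Tree element")  # hypothetical
child_root = h(b"child merk root hash")            # hypothetical
bound = combine_hash(parent_value_hash, child_root)

# Any change to the child's root changes the combined hash, so the parent
# element cryptographically commits to its child's entire state.
tampered = combine_hash(parent_value_hash, h(b"forged child root"))
assert bound != tampered
```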
The GroveDB Book — documenting the internals of GroveDB for developers and researchers. Based on the GroveDB source code.