Count Index Group By Examples

This chapter is the GROUP BY companion to Count Index Examples. It uses the same widget contract, the same 100 000-row fixture, and the same bench at packages/rs-drive/benches/document_count_worst_case.rs. Read chapter 29 first — most of the mechanics (CountTree variants, the merk-proof reconstruction algorithm, node_hash_with_count and friends) carry over unchanged.

What's different here:

Every query in chapter 29 returns either a single u64 aggregate or a small list of CountTrees the caller sums. The verifier-side payload shape is one count, total.
Every query in this chapter returns one count per group. The caller gets back a Vec<(group_key, count)> and can index it directly — no summation.

The most important thing to understand up front: group_by is two things at once — a result-shaping directive for the SDK and (for some queries) a proof-shaping directive for the prover. When you pass group_by = [...] in a count request, you're always telling the SDK "don't collapse the result into a single number — give me one count per group key." That result-shaping role is universal: it's what turns Aggregate(sum) into Entries([(key, count), …]).

Whether group_by also changes the proof bytes depends on the query shape. For queries where the underlying proof already commits one CountTree per matched key (single-property INs, for instance), the per-group breakdown is reconstructible from the existing bytes — the prover ships the same proof, the SDK just zips it with the group keys instead of summing. For range queries and certain compound shapes, the per-group breakdown can't be reconstructed from the aggregate-style proof (which commits opaque subtree counts rather than per-key counts), so passing group_by forces the prover to emit a structurally different, larger proof.

The interesting question this chapter answers is: which queries fall into which bucket, and why?

When `group_by` Changes the Proof (and When It Doesn't)

Filter	`group_by`	Aggregate proof (no `group_by`)	Group-By proof	Proof bytes change?
`brand IN [b0, b1]`	`[brand]`	Q5 — 1 102 B	1 102 B (2 entries)	No — byte-identical
`color IN [c0, c1]`	`[color]`	Q6 — 1 381 B	1 381 B (2 entries)	No — byte-identical
`color > floor`	`[color]`	Q7 — 2 072 B (1 `u64`)	10 992 B (100 entries)	Yes — different primitive
`brand == X AND color > floor`	`[brand, color]`	Q8 — 2 656 B (1 `u64`)	not allowed in this form	—

The key observation: IN clauses produce proofs that already commit one CountTree per resolved key, so adding group_by on the same property is purely a verifier-side relabel — the prover ships the same bytes, the verifier just returns them as Entries(...) instead of Aggregate(sum). This is why G1 and G2 below are not new proofs — they're Q5 and Q6 reinterpreted.

So why pass group_by at all if the proof bytes don't change? Because without it, the SDK has no way to know you want the per-key breakdown. The same brand IN ["brand_000", "brand_001"] proof can answer two different questions:

"How many widgets total are made by brand_000 or brand_001?" → caller passes no group_by, SDK returns Aggregate(2 000).
"How many widgets per brand?" → caller passes group_by = [brand], SDK returns Entries([("brand_000", 1 000), ("brand_001", 1 000)]).

The bytes on the wire and the cryptographic guarantees are identical; the only thing that changes is which result shape the SDK delivers. Think of group_by as the count-query equivalent of SELECT brand, COUNT(*) ... GROUP BY brand versus SELECT COUNT(*) ... in SQL — same scan plan, different projection.
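The projection difference can be sketched in a few lines of Rust. This is a minimal model, not the SDK's API: `CountResult` and `shape_result` are hypothetical names, and the per-key counts are assumed to have been verified already.

```rust
/// Hypothetical result type mirroring the chapter's `Aggregate(..)` and
/// `Entries(..)` shapes. Not the SDK's real type.
#[derive(Debug, PartialEq)]
enum CountResult {
    Aggregate(u64),
    Entries(Vec<(String, u64)>),
}

/// Shapes already-verified per-key counts. The input is identical in both
/// cases; only the projection handed back to the caller differs.
fn shape_result(resolved: &[(String, u64)], group_by: bool) -> CountResult {
    if group_by {
        // One count per group key, no summation.
        CountResult::Entries(resolved.to_vec())
    } else {
        // Collapse every resolved count into a single total.
        CountResult::Aggregate(resolved.iter().map(|(_, count)| *count).sum())
    }
}
```

Feeding the two resolved brands from the example above through `shape_result` gives `Aggregate(2000)` without grouping and a two-element `Entries` with it.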

Range queries are different. AggregateCountOnRange (chapter 29's Q7) walks the boundary of the range over a ProvableCountTree and sums per-subtree counts directly — it never resolves individual keys. GroupByRange (this chapter) has to enumerate the distinct in-range keys to label each group, so it produces a different proof shape with one CountTree (or CountTree-feature-typed element) per distinct key in the range. That's where group_by genuinely earns its bytes — the prover has to do additional work because the per-group breakdown can't be reconstructed from AggregateCountOnRange's opaque-subtree-count commitments.
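The contrast between the two range primitives can be illustrated with a toy ordered map. The sketch below assumes a fixture of 1 000 colors with 100 documents each, and the function names are hypothetical. It also iterates keys in both functions for simplicity, whereas the real aggregate primitive reads pre-committed subtree counts at the range boundary and never touches individual keys.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// Toy stand-in for the byColor index: color key -> document count.
/// Assumption: 1 000 colors, 100 documents per color.
fn toy_by_color() -> BTreeMap<String, u64> {
    (0..1000).map(|i| (format!("color_{:08}", i), 100)).collect()
}

/// Aggregate style: a single number for every key strictly above `floor`.
fn aggregate_above(index: &BTreeMap<String, u64>, floor: &str) -> u64 {
    index
        .range::<str, _>((Excluded(floor), Unbounded))
        .map(|(_, count)| *count)
        .sum()
}

/// Group-by style: one (key, count) entry per distinct key strictly above
/// `floor`, in ascending key order, truncated to `limit` entries.
fn group_by_above(
    index: &BTreeMap<String, u64>,
    floor: &str,
    limit: usize,
) -> Vec<(String, u64)> {
    index
        .range::<str, _>((Excluded(floor), Unbounded))
        .take(limit)
        .map(|(key, count)| (key.clone(), *count))
        .collect()
}
```

On this toy data the aggregate above `color_00000500` is one number covering 499 keys, while the grouped form with a cap of 100 yields 100 labelled entries, which is why the grouped result needs the keys themselves to be part of what is proven.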

Queries in this Chapter

All proof-size and behaviour numbers below come from the same bench helper (report_group_by_matrix) as chapter 29's. The dispatcher's group_by surface validation lives in validate_count_query_groupby_against_index; the per-mode path-query builders sit in packages/rs-drive/src/query/drive_document_count_query/path_query.rs's group_by_* family.

#	Query	Filter + group_by	Complexity	Avg time	Proof size	Verified shape	Notes
G1	`In` on `byBrand`	`brand IN ["brand_000", "brand_001"]` `group_by = [brand]`	O(k · log B)	38.6 µs	1 102 B	`Entries(2 groups, sum = 2 000)`	Byte-identical to Q5
G1a	`In` on `byBrand` with an absent value	`brand IN ["brand_000", "brand_100"]` `group_by = [brand]`	O(k · log B)	44.4 µs	1 357 B	`Entries(1 group, sum = 1 000)`	One In value (`brand_100`) is absent — proof grows by 255 B for the absence subproof; verifier omits the absent branch from entries
G1b	High-fanout `In` on `byBrand` (|IN| = B)	`brand IN [100 values]` `group_by = [brand]`	O(k · log B)	1 532 µs	10 038 B	`Entries(100 groups, sum = 100 000)`	Same shape as G1, scaled from `|IN| = 2` → `|IN| = 100`; reveals every byBrand entry when `|IN| = B`
G2	`In` on `byColor`	`color IN ["color_00000000", "color_00000001"]` `group_by = [color]`	O(k · log C)	62.1 µs	1 381 B	`Entries(2 groups, sum = 200)`	Byte-identical to Q6
G3	Compound `In` + Equal	`brand IN [...] AND color == Y` `group_by = [brand]`	O(k · (log B + log C'))	106.2 µs	2 842 B	`Entries(2 groups, sum = 2)`	Per-In compound resolution; two parallel Q4 descents sharing L1–L6
G4	Range on `byColor`	`color > "color_00000500"` `group_by = [color]`	O(R · log C)	762.9 µs	10 992 B	`Entries(100 groups, sum = 10 000)`	`GroupByRange`: enumerates distinct in-range keys instead of Q7's boundary aggregate
G5	Compound `In` + Range	`brand IN [...] AND color > "color_00000500"` `group_by = [brand, color]`	O(k · R' · log C')	737.5 µs	11 554 B	`Entries(100 groups, sum = 100)`	Compound In-fan-out × in-range distinct keys (G3 outer × G4 inner)
G7	Carrier `In` + Range (`byBrandColor`)	`brand IN [...] AND color > "color_00000500"` `group_by = [brand]`	O(k · (log B + log C'))	255.9 µs	4 332 B	`Entries(2 groups, sum = 998)`	Per-In aggregate via `AggregateCountOnRange` as a carrier subquery; one `u64` per branch
G8	Carrier outer Range + Range (`byBrandColor`)	`brand > "brand_050" AND color > "color_00000500"` `group_by = [brand]`	O(L · (log B + log C'))	523 µs	18 022 B	`Entries(10 groups, sum = 4 990)`	Outer-Range carrier with a platform-max `SizedQuery::limit` of 10; caller may pass smaller, can't pass larger
G8a	Bounded carrier + bounded ACOR, descending	`brand > "brand_050" AND brand < "brand_065" AND color > "color_00000200" AND color < "color_00000400"` `group_by = [brand]`, `order_by = [(brand, desc)]`	O(L · (log B + log C'))	807 µs	29 010 B	`Entries(10 groups, sum = 1 990)`	Bounded ranges on both axes + descending walk; same carrier shape as G8, different op variants on both range commitments
G8b	Same carrier `where` but `group_by = [brand, color]`	`brand > "brand_050" AND color > "color_00000500"` `group_by = [brand, color]`	n/a	n/a	rejected	`InvalidWhereClauseComponents("count query supports at most one range where-clause; …or use group_by = [outer_range_field] with prove = true…")`	Two-range carrier is opened only for `GroupByRange + single-field group_by`; the compound shape can't fan over both ranges
G8c	Same carrier `where` but `group_by = []`	`brand > "brand_050" AND color > "color_00000500"` `group_by = []`	n/a	n/a	rejected	`InvalidWhereClauseComponents("count query supports at most one range where-clause; …or use group_by = [outer_range_field] with prove = true…")`	Aggregate (no group_by) can't collapse the carrier's per-branch `u64`s into a single sum at the verifier

Complexity variables. B = distinct brands in the byBrand merk-tree (≈ 100); C = distinct colors in byColor (≈ 1 000); C' = distinct colors per brand in byBrandColor (≈ 1 000); R = distinct in-range values returned by GroupByRange (capped at 100 in this fixture by an implicit response-size limit); R' = distinct in-range values per fan-out branch (similarly capped); k = |IN| for the In-outer carrier shapes; L = the effective outer-walk limit for the Range-outer carrier shape (G8). The platform's MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT = 10 is both the default (when the caller passes no limit) and a hard ceiling; callers may pass a smaller limit to truncate further. See G8 for the rationale. As in chapter 29, the total document count N doesn't appear — count proofs read pre-committed count_values rather than enumerating docs.
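The "default and hard ceiling" rule for L can be written down as a small helper. This is a sketch under assumptions: the function name, the `u16` type, and the choice to return an error for an over-large limit (rather than clamp it) are illustrative, not taken from the platform code.

```rust
/// Ceiling value quoted in this chapter.
const MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT: u16 = 10;

/// Hypothetical helper resolving the effective outer-walk limit `L`.
/// Assumption: a limit above the ceiling is rejected, not clamped.
fn effective_outer_limit(caller_limit: Option<u16>) -> Result<u16, String> {
    match caller_limit {
        // No limit supplied: the ceiling doubles as the default.
        None => Ok(MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT),
        // A smaller (or equal) limit truncates the walk further.
        Some(limit) if limit <= MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT => Ok(limit),
        // Anything larger is refused.
        Some(limit) => Err(format!(
            "limit {} exceeds platform maximum {}",
            limit, MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT
        )),
    }
}
```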

Avg time is the criterion-reported median of cargo bench --bench document_count_worst_case -- 'document_count_worst_case/query_g' on the same 100 000-row warmed fixture used by chapter 29's query_N_* cases. Each row reflects 10 samples × ~3 k–130 k iterations per sample with 2 s warm-up and 5 s measurement; the median sits within ±2 % of the mean across reruns. G1 and G2 match their Q5 / Q6 counterparts to within ~3 µs; the residual is the SDK-side zip-vs-sum cost. G4 is ~11 × Q7 because GroupByRange enumerates 100 distinct in-range CountTrees rather than walking O(log C) boundary nodes. That slowdown is in line with the complexity difference (O(R · log C) vs O(log C)), although wall-clock ratios also absorb constant factors, so it should not be read as an exact match.

Group-By Shapes That Are Not Allowed

Several plausible-looking (where, group_by) combinations are rejected by the dispatcher before any proof generation. The rejections fall into three buckets: operator/group_by mismatch, missing range window, and no covering index. A fourth shape that used to be rejected is now supported; see the historical note at the end of this section. All rejections are surfaced as typed QuerySyntaxErrors; the precise error strings appear in the bench's [matrix] output.

1. `group_by` field constrained by `==` instead of `In` or range

where    = brand == "brand_050"
group_by = [brand]

count query supports only ... (rejected because == produces exactly one entry whose key equals the where-clause's value — grouping by a field that already has a single value contributes no extra information).

Why. GROUP BY [field] is meaningful only when field can take multiple values in the result set. An == clause pins the field to exactly one value, so the group_by is structurally redundant — the dispatcher rejects it rather than silently returning a single-entry response that would look like a bug. Use Q2 / Q3 (no group_by) for single-value == queries.

Applies symmetrically: where = color == X, group_by = [color] is rejected for the same reason.

2. `group_by` contains a range field but the `where` clause doesn't range over it

where    = brand IN[...] AND color == "color_00000500"
group_by = [brand, color]

GROUP BY on a range field requires a range where-clause; the range field must appear in where for the distinct walk to have a window to iterate over

Why. group_by = [in_field, range_field] (GroupByCompound) routes through distinct_count_path_query, which needs a range window on the second field to know what values to enumerate. With color == Y the second dimension collapses to a single value, so the compound walk degenerates to a point lookup — and that's what Q4 / G3 are for. For compound plus range, the where must carry a range on the second field (which is what G5 does).

3. `group_by` orders fields in a way no covering index can serve

where    = color IN[...] AND brand > "brand_050"
group_by = [color, brand]

where clause on non indexed property error: range count requires a range_countable: true index whose last property matches the range field

Why. The covering index for (group_by[0] = color, group_by[1] = brand) would need to be byColorBrand with rangeCountable: true on the brand terminator. The widget contract doesn't have that index — only byBrand, byColor, and byBrandColor. The dispatcher's index picker walks every declared index, finds none whose (properties, last_property_is_range_countable) shape matches the request, and rejects with the "non-indexed property" error.

The fix is contract-level: declare a byColorBrand index with rangeCountable: true if the application needs this group_by order. The dispatcher itself can't infer alternate index orders from the request alone — rangeCountable: true is an explicit opt-in on each index because it changes the on-disk tree shape (NormalTree → ProvableCountTree on the property-name subtree).

To put these three buckets in one place: every rejected (where, group_by) shape on this contract reduces to one of:

the group_by field's where operator doesn't admit multiple values (bucket 1),
the group_by has a range slot that the where doesn't fill with a range (bucket 2),
there's no covering rangeCountable index in property order (bucket 3).

All three checks happen at request validation, before any GroveDB work. The bench's report_group_by_matrix exercises one example of each and prints the exact error string, so adding a new contract or index shape is a quick way to see which checks each new query shape hits.
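The three checks can be modelled as a small classifier. This is a deliberately simplified sketch, not the dispatcher's code: all type and function names are hypothetical, the covering-index test is reduced to "some range-countable index starts with the group_by fields in order", and byBrandColor is assumed to be range-countable.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Op { Eq, In, Range }

#[derive(Debug, PartialEq)]
enum Rejection {
    EqOnGroupByField,   // bucket 1
    MissingRangeWindow, // bucket 2
    NoCoveringIndex,    // bucket 3
}

struct Index {
    properties: &'static [&'static str],
    range_countable: bool,
}

/// The widget contract's three indexes as this chapter describes them
/// (byBrandColor's range-countable flag is an assumption).
fn widget_indexes() -> Vec<Index> {
    vec![
        Index { properties: &["brand"], range_countable: false },
        Index { properties: &["color"], range_countable: true },
        Index { properties: &["brand", "color"], range_countable: true },
    ]
}

/// Looks up the operator the where-clause applies to `field`, if any.
fn op_for(where_clauses: &[(&str, Op)], field: &str) -> Option<Op> {
    where_clauses.iter().find(|(f, _)| *f == field).map(|(_, op)| *op)
}

fn validate(
    where_clauses: &[(&str, Op)],
    group_by: &[&str],
    indexes: &[Index],
) -> Result<(), Rejection> {
    // Bucket 2: a two-field group_by needs a range on its second field.
    if let [_, second] = group_by {
        if op_for(where_clauses, second) != Some(Op::Range) {
            return Err(Rejection::MissingRangeWindow);
        }
    }
    // Bucket 1: `==` pins a group_by field to a single value.
    if group_by.iter().any(|f| op_for(where_clauses, f) == Some(Op::Eq)) {
        return Err(Rejection::EqOnGroupByField);
    }
    // Bucket 3: grouping over a range needs a range-countable index whose
    // properties begin with the group_by fields, in the same order.
    let groups_a_range = group_by
        .iter()
        .any(|f| op_for(where_clauses, f) == Some(Op::Range));
    if groups_a_range
        && !indexes
            .iter()
            .any(|ix| ix.range_countable && ix.properties.starts_with(group_by))
    {
        return Err(Rejection::NoCoveringIndex);
    }
    Ok(())
}
```

Running the three rejected examples from this section through `validate` lands each in its own bucket, while the G1 and G5 shapes pass. The check order matters in this model: the bucket 2 example also has `==` on a group_by field, so the range-window check runs first.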

Historical note. A fourth bucket, group_by = [in_field] with where = in_field IN[...] AND range_field > floor, was rejected before grovedb PR #663. That PR added support for AggregateCountOnRange as a carrier subquery under outer Keys, which unblocked the natural single-field group_by shape (one aggregate count per In branch) at the merk layer. The dispatcher now routes that shape to DocumentCountMode::RangeAggregateCarrierProof; the worked-out example is G7 below.

G1 — `In` on `byBrand`, Grouped By `brand`

select   = COUNT
where    = brand IN ["brand_000", "brand_001"]
group_by = [brand]
prove    = true

Path query (identical to Q5):

path:         ["@", contract_id, 0x01, "widget", "brand"]
query items:  [Key("brand_000"), Key("brand_001")]

Verified payload (the only thing that differs from Q5):

Entries([
  ("brand_000", CountTree { count_value_or_default: 1000 }),
  ("brand_001", CountTree { count_value_or_default: 1000 }),
])

The SDK zips the In values with the two resolved CountTree elements (in lex-asc order) rather than summing them as Q5's CountMode::Aggregate does.

Proof size: 1 102 B. Proof bytes are byte-identical to Q5 — same path query, same merk ops, same hash composition. The dispatcher recognises that CountMode::GroupByIn on a single-property In clause resolves through the same point_lookup_count_path_query as CountMode::Aggregate does; only the response-shaping at the very end differs.
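Why the bytes cannot differ is easiest to see if the path-query construction is modelled as a function that never consults the count mode. The sketch below is illustrative only: `SketchPathQuery` and `point_lookup_sketch` are hypothetical stand-ins, and sorting the keys reflects the lex-asc ordering mentioned above rather than a confirmed implementation detail.

```rust
#[derive(Clone, Copy)]
enum CountMode {
    Aggregate,
    GroupByIn,
}

/// Hypothetical, simplified stand-in for a GroveDB path query.
#[derive(Debug, PartialEq)]
struct SketchPathQuery {
    path: Vec<Vec<u8>>,
    keys: Vec<Vec<u8>>,
}

/// Builds the point-lookup query for `brand IN [...]`. The mode is accepted
/// but deliberately unused: both modes ask the prover for the same keys.
fn point_lookup_sketch(
    contract_id: &[u8],
    in_values: &[&str],
    _mode: CountMode,
) -> SketchPathQuery {
    let mut keys: Vec<Vec<u8>> = in_values.iter().map(|v| v.as_bytes().to_vec()).collect();
    keys.sort(); // lexicographic ascending by bytes
    keys.dedup();
    SketchPathQuery {
        path: vec![
            b"@".to_vec(),
            contract_id.to_vec(),
            vec![0x01],
            b"widget".to_vec(),
            b"brand".to_vec(),
        ],
        keys,
    }
}
```

Since the query handed to the prover is equal for `CountMode::Aggregate` and `CountMode::GroupByIn` in this model, any difference in the response has to come from the shaping step after verification.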

For the verbatim proof display, see Q5 in chapter 29: every byte of the 1 102-byte proof is the same, and the same encoded payload can be opened interactively in the visualizer. The diagrams below show the result-shaping difference.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree"]:::path
  BR ==> B000["brand_000: CountTree count=1000"]:::target
  BR ==> B001["brand_001: CountTree count=1000"]:::target
  BR -.-> BMore["brand_002 ... brand_099"]:::faded

  SDK["Verifier returns Entries([<br/>(&quot;brand_000&quot;, 1000),<br/>(&quot;brand_001&quot;, 1000)<br/>])"]:::sdk

  B000 -.-> SDK
  B001 -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef faded fill:#21262d,color:#6e7681,stroke:#484f58;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

Identical to Q5's Layer-5+ diagram — same merk ops, same byBrand binary tree, same two KVValueHashFeatureTypeWithChildHash targets. The only difference is what the verifier returns at the end (Entries(...) instead of Aggregate(2000)); the per-layer structure is unchanged. See chapter 29 for the diagram.

G1a — `In` on `byBrand` with one absent value, Grouped By `brand`

select   = COUNT
where    = brand IN ["brand_000", "brand_100"]
group_by = [brand]
prove    = true

The bench fixture has brands brand_000 … brand_099 (BRAND_COUNT = 100); brand_100 is deliberately outside that range. G1a is G1's same-shape sibling: same path query, same point_lookup_count_path_query builder, same CountMode::GroupByIn dispatch. The only structural difference is one of the In keys doesn't exist in the byBrand merk tree.

Path query (identical shape to G1; only the second key differs):

path:         ["@", contract_id, 0x01, "widget", "brand"]
query items:  [Key("brand_000"), Key("brand_100")]

Verified payload (note: only one entry — the absent branch is silently dropped):

Entries([
  ("brand_000", CountTree { count_value_or_default: 1000 }),
])

This is the load-bearing behaviour to know about: grovedb's verify_query without absence_proofs_for_non_existing_searched_keys: true drops absent-Key branches from the elements stream. The drive-side verifier (verify_point_lookup_count_proof_v0) uses the default (off) and so emits one entry per present In value, not one per requested In value. Test coverage: test_point_lookup_proof_omits_absent_in_branches_from_entries.

Caller implication. Callers MUST NOT assume entries.len() == |In|. To check whether a specific In value matched, demux entries by serialized key (the same serialize_value_for_key(field, value) the path-query builder uses for outer Keys) — see the test for the canonical pattern. A 0-count vs absent-key distinction would require passing absence_proofs_for_non_existing_searched_keys: true end-to-end, which the platform doesn't expose today.
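A caller-side demux along these lines might look as follows. This is a sketch, not the canonical pattern from the test: `demux_by_key` is a hypothetical helper, and it matches on plain strings where real code would compare the serialized keys produced by serialize_value_for_key.

```rust
use std::collections::HashMap;

/// Pairs every requested `In` value with its verified count, if any.
/// `None` means the key did not appear in the verified entries; it does
/// not distinguish "absent" from "present with count 0".
fn demux_by_key(requested: &[&str], entries: &[(String, u64)]) -> Vec<(String, Option<u64>)> {
    // Index the verified entries by key for O(1) lookups.
    let by_key: HashMap<&str, u64> = entries
        .iter()
        .map(|(key, count)| (key.as_str(), *count))
        .collect();
    requested
        .iter()
        .map(|key| (key.to_string(), by_key.get(key).copied()))
        .collect()
}
```

For G1a's request this yields `Some(1000)` for brand_000 and `None` for brand_100, so the caller never relies on positional alignment between the `In` list and the entries.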

Proof size: 1 357 B (+255 B over G1's 1 102 B). The delta is the absence subproof: grovedb walks the byBrand merk tree to commit the rightmost present key (brand_099) and the chain of Child ops that proves there's nothing between brand_099 and end-of-tree. Even though the verifier drops the absent entry, the prover must cryptographically commit to the absence — otherwise a malicious prover could omit a present branch by claiming it's absent.

Mode: CountMode::GroupByIn routed to DocumentCountMode::PointLookupProof — same as G1.

Proof display:

The absence-subproof shape is what makes G1a interesting. The L8 (byBrand value tree) layer commits both:

The present branch (op 0): Push(KVValueHashFeatureTypeWithChildHash(brand_000, CountTree(636f6c6f72, 1000, …))) — brand_000 as a CountTree with count = 1000, exactly as in G1.
The absence commitment (op 36): Push(KVDigest(brand_099, HASH[…])) — the rightmost present brand in the byBrand merk tree, paired with a chain of Child ops (37–42) that the verifier replays to confirm there's no key strictly between brand_099 and end-of-tree. brand_100 would have to sort after brand_099 (which is true: brand_099 < brand_100 lexicographically), so the verifier's merk-root recomputation succeeds with no brand_100 element emitted.
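The ordering argument can be checked with a toy sorted key set. The sketch below assumes keys compare as raw bytes and uses a hypothetical `neighbours` helper; it models only the "nearest present keys" idea, not the merk proof ops themselves.

```rust
use std::collections::BTreeSet;
use std::ops::Bound::{Excluded, Unbounded};

/// Toy stand-in for the byBrand key set: brand_000 through brand_099.
fn toy_brand_keys() -> BTreeSet<Vec<u8>> {
    (0..100)
        .map(|i| format!("brand_{:03}", i).into_bytes())
        .collect()
}

/// Returns the nearest present keys strictly below and strictly above
/// `key` in byte order. For an absent key, these are the neighbours an
/// absence argument has to pin down.
fn neighbours(keys: &BTreeSet<Vec<u8>>, key: &[u8]) -> (Option<Vec<u8>>, Option<Vec<u8>>) {
    let below = keys
        .range::<[u8], _>((Unbounded, Excluded(key)))
        .next_back()
        .cloned();
    let above = keys
        .range::<[u8], _>((Excluded(key), Unbounded))
        .next()
        .cloned();
    (below, above)
}
```

For `brand_100` the lower neighbour is `brand_099` and there is no upper neighbour, which matches the proof's choice of brand_099 as the anchor followed by a walk to end-of-tree.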

The bench's [gproof] G1a output dumps the full 1357-byte proof:

Structured proof (L1–L8 for byBrand, with one present CountTree at L8 and one absence subproof at L8):

GroveDBProofV1 {
  LayerProof {                                          // L1: roots merk
    proof: Merk(
      0: Push(Hash(HASH[bd29…3b3]))                     // sibling: contracts subtree
      1: Push(KVValueHash(@, Tree(4ed2…289), HASH[…]))  // KVValueHash of `@` (data-contract subtree root) — descend
      2: Parent
      3: Push(Hash(HASH[19c9…b71]))                     // sibling
      4: Child)
    lower_layers: {
      @ => {
        LayerProof {                                    // L2: `@` subtree
          proof: Merk(
            0: Push(KVValueHash(0x4ed2…289, Tree(01), HASH[…])))   // descend into contract-id subtree
          lower_layers: {
            0x4ed2…289 => {
              LayerProof {                              // L3: contract-id subtree
                proof: Merk(
                  0: Push(Hash(HASH[49e7…df8]))         // sibling
                  1: Push(KVValueHash(0x01, Tree(widget), HASH[…]))  // descend into doctype `widget`
                  2: Parent)
                lower_layers: {
                  0x01 => {
                    LayerProof {                        // L4: doctype-prefix subtree
                      proof: Merk(
                        0: Push(KVValueHash(widget, Tree(brand), HASH[…])))  // descend into byBrand index
                      lower_layers: {
                        widget => {
                          LayerProof {                  // L5: widget subtree
                            proof: Merk(
                              0: Push(Hash(HASH[9862…9d9]))                // sibling
                              1: Push(KVValueHash(brand, Tree(brand_063), HASH[…]))  // descend into byBrand value tree (rooted at `brand_063`)
                              2: Parent
                              3: Push(Hash(HASH[6c36…a86]))
                              4: Child)
                            lower_layers: {
                              brand => {
                                LayerProof {            // L6+L7+L8: byBrand value tree (binary search down to `brand_000` + absence walk to `brand_099`)
                                  proof: Merk(
                                    0: Push(KVValueHashFeatureTypeWithChildHash(brand_000, CountTree(color, 1000, flags), HASH[…], BasicMerkNode, HASH[…]))  // PRESENT — `brand_000` as CountTree(count=1000)
                                    1: Push(KVHash(HASH[…]))
                                    2: Parent
                                    3: Push(Hash(HASH[…]))
                                    4: Child
                                    … (ops 5–34: intermediate `KVHash`/`Hash`/`Parent`/`Child` ops walking the binary search)
                                    35: Push(KVHash(HASH[…]))
                                    36: Push(KVDigest(brand_099, HASH[…]))    // ABSENCE COMMITMENT — rightmost present brand
                                    37: Child
                                    38: Child
                                    39: Child
                                    40: Child
                                    41: Child
                                    42: Child)
                                }}}}}}}}}}}}}}}}}

Op 36 (KVDigest(brand_099, …)) is the load-bearing piece. The verifier replays ops 37–42 (Childs) against the byBrand merk root committed at L5; any tampering — say, an honest brand_099 swapped for a malicious brand_100-shaped commitment — would change the merk root and the verification would fail.

Diagram: conceptual flow (where the absence proof sits)

flowchart TB
  RQ["IN [brand_000, brand_100]"]:::request
  RQ --> M["dispatcher → PointLookupProof<br/>(group_by = [brand])"]:::dispatch
  M --> P["point_lookup_count_path_query<br/>outer Keys = [brand_000, brand_100]"]:::path
  P --> V["grovedb walks byBrand merk tree"]:::engine
  V --> P1["brand_000 ✓ present<br/>commit CountTree(count=1000)"]:::present
  V --> P2["brand_100 ✗ absent<br/>commit rightmost present (brand_099)<br/>+ Child chain to end-of-tree"]:::absent
  P1 --> R["Proof bytes: 1357 B<br/>(G1's 1102 B baseline +<br/>255 B delta for the absence subproof)"]:::result
  P2 --> R
  R --> SDK["verify_point_lookup_count_proof<br/>(absence_proofs_for_non_existing_searched_keys = false)"]:::verify
  SDK --> OUT["Entries([(brand_000, 1000)])<br/>brand_100 silently dropped"]:::sdk

  classDef request fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef dispatch fill:#21262d,color:#c9d1d9,stroke:#1f6feb;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb;
  classDef engine fill:#21262d,color:#c9d1d9,stroke:#39c5cf;
  classDef present fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef absent fill:#d29922,color:#0d1117,stroke:#d29922,stroke-width:3px,stroke-dasharray: 6 3;
  classDef result fill:#21262d,color:#c9d1d9,stroke:#39c5cf,stroke-width:2px;
  classDef verify fill:#21262d,color:#c9d1d9,stroke:#a371f7,stroke-width:2px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

Per-layer merk-tree structure (Layer 5+)

flowchart TB
  L5["L5 — widget subtree:<br/>KVValueHash(brand, Tree(brand_063))"]:::path
  L5 --> L6["L6 — byBrand value tree root:<br/>brand_063 (binary-search root)"]:::path
  L6 --> L7L["brand_031 (left subtree boundary)"]:::sibling
  L6 --> L7R["brand_095 (right subtree boundary)"]:::sibling
  L7L --> P000["brand_000<br/>(present, CountTree count=1000)"]:::target
  L7R --> A099["brand_099<br/>(rightmost present, absence-proof anchor)"]:::boundary
  L7R -.-> A100["brand_100 (not in tree — absence proven<br/>by Child chain to end-of-tree)"]:::absent

  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef boundary fill:#d29922,color:#0d1117,stroke:#d29922,stroke-width:2px;
  classDef absent fill:#21262d,color:#d29922,stroke:#d29922,stroke-width:2px,stroke-dasharray: 6 3;

Why absence-proof matters for count queries. The drive count fast path treats absent branches as 0, but it does NOT trust the SDK to apply that rule on un-committed data — every count or non-existence the verifier reports must be cryptographically committed by the prover. If absent branches were silently summed into 0 without a proof, a malicious prover could omit a present branch (with positive count) and claim it's absent, shrinking the result without detection. The 255-B absence-subproof overhead is the price of that integrity — small in absolute terms, but it scales linearly with the number of absent In values, so callers building queries with many speculative In values pay per-absence overhead.

G1b — High-fanout `In` on `byBrand` (|IN| = B), Grouped By `brand`

select   = COUNT
where    = brand IN ["brand_000", "brand_001", ..., "brand_099"]
group_by = [brand]
prove    = true

Path query (same shape as G1, scaled to |IN| = 100):

path:         ["@", contract_id, 0x01, "widget", "brand"]
query items:  [Key("brand_000"), Key("brand_001"), ..., Key("brand_099")]

Verified payload:

Entries(100 groups, sum = 100 000)

Every document in the fixture, partitioned by brand. Each Entries[i] carries (brand_NNN, CountTree count=1000).

Proof size: 10 038 B. Mode: CountMode::GroupByIn.

Same structural shape as G1, scaled from |IN| = 2 to |IN| = 100. The byBrand merk binary tree at L6 emits all 100 brands as KVValueHashFeatureTypeWithChildHash targets, each ~100 B (key + leaf kv-hash + CountTree(00, 1000, ...) + BasicMerkNode feature + child-hash), plus minimal boundary glue at the binary-tree corners. The proof grows roughly linearly with |IN|: G1 (|IN| = 2) was 1 102 B and G1b (|IN| = 100) is 10 038 B, so the average slope between the two measurements is (10 038 − 1 102) / (100 − 2) ≈ 91 B per additional In value.

Compare against the byColor equivalent (group_by_color_in_proof_100_rangecountable_branches, 10 512 B): the ProvableCountTree overhead from byColor's KVHashCount running counts adds ~5 % to the byBrand baseline, even though those running counts aren't consumed by a point-lookup group_by. This is the same ProvableCountTree overhead G2 carried at the smaller scale (|IN|=2).

Proof display:

Expand to see the structured proof (5 layers; bottom layer enumerates 100 brands as `KVValueHashFeatureTypeWithChildHash` targets — 192 merk ops total at L6 including binary-tree glue) — or open interactively in the visualizer ↗

GroveDBProofV1 {
  LayerProof {
    proof: Merk(
      0: Push(Hash(HASH[bd291f29893fb6f6d6201087746ca1f23a178dd08e1346cb6c127e91ae3623b3]))
      1: Push(KVValueHash(@, Tree(4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289), HASH[4a5a28cb1b40226aa35b2f0d502767df13268bdf4678627dbfde26a557acdf73]))
      2: Parent
      3: Push(Hash(HASH[19c924989e473a90d0848277d0b1498ccc8db3dc870cbc130e773f3d79ea5b71]))
      4: Child)
    lower_layers: {
      // L2..L4 are byte-identical to every other query in this chapter
      // (the @ / contract_id / 0x01 descent into widget); see chapter 29's
      // Q1 verbatim for the full L1..L4 chain.
      ...
      widget => {
        LayerProof {
          proof: Merk(
            // L5 widget doctype — `brand` queried, opaque siblings 9862 / 6c36
            0: Push(Hash(HASH[9862894b16a0792688fdcf64edcb2ceade5c8b234649bfc6cfc6426869b0e9d9]))
            1: Push(KVValueHash(brand, Tree(6272616e645f303633), HASH[68b697da99d6ea70a83eb41794dca7ba3938d0ba98fbfaeb3cd0c19b3b5d0ff2]))
            2: Parent
            3: Push(Hash(HASH[6c36729e93b1a316cbf60fe282eb630c0ed6e45db088e365110302b6c9caba86]))
            4: Child)
          lower_layers: {
            brand => {
              LayerProof {
                proof: Merk(
                  // L6 byBrand merk-tree — 100 targets + binary-tree glue
                  // (192 merk ops total; structurally a fully-resolved in-order
                  // traversal of all 100 brand entries in the byBrand merk tree)
                  0: Push(KVValueHashFeatureTypeWithChildHash(brand_000, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[90ff6f6d9a3d901195982128130677243bfd27b75736206f3c8400966ef0d37b], BasicMerkNode, HASH[19b58883c492e746861db1e6ad07529a5a91cc8330af522682486db9346d6875]))
                  1: Push(KVValueHashFeatureTypeWithChildHash(brand_001, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[484ca11fb4ec8f479be1f78af903ce0c9d4fe630517579fb0172c2576d6b9652], BasicMerkNode, HASH[0bf12023f8e067c12db4cec1583909a0283878d6d909c76196736299750b5879]))
                  2: Parent
                  3: Push(KVValueHashFeatureTypeWithChildHash(brand_002, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[4c19f047068654e71813dce7839a579edfdcb446e3d70efa1b8592c73259da16], BasicMerkNode, HASH[e8d5372904b7f4ac9334aeb4ddab619d9ad7a308732a4f231416e10208a0a356]))
                  ...
                  // 97 more KVValueHashFeatureTypeWithChildHash targets following
                  // the same template — brand_003 ... brand_099 — interleaved with
                  // Parent/Child ops glueing them into the byBrand merk binary tree.
                  // Every target shares the structure:
                  //   Push(KVValueHashFeatureTypeWithChildHash(
                  //     brand_NNN,
                  //     CountTree(636f6c6f72, 1000, flags: [0, 0, 0]),   // count_value=1000
                  //     HASH[<per-brand leaf kv-hash>],
                  //     BasicMerkNode,                                  // NormalTree (no count on the merk node)
                  //     HASH[<per-brand subtree child hash>]
                  //   ))
                  ...
                  189: Push(KVValueHashFeatureTypeWithChildHash(brand_097, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[92adee932cc12927cd76ad9fd25906bbfe547df2bf21e826845bb4d3b47f5314], BasicMerkNode, HASH[34b69e1e424aa023c74f61554db2823da6c19dcbc51bdd5dece32e3f6f9fd219]))
                  190: Parent
                  191: Push(KVValueHashFeatureTypeWithChildHash(brand_098, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[68e02fcf66f86797035fbc8d53290185fe3fed7de897a8654743cae4007c47c3], BasicMerkNode, HASH[acfc3a88b852e8895449b4c7e01f4b1cc25028e6a80e4915cdde578ff6eb029b]))
                  192: Push(KVValueHashFeatureTypeWithChildHash(brand_099, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[af9667a8f2a10a9402b3d1fb0ac6e0b64d1e3dde5b8829c03b8d2c9cfc94e16d], BasicMerkNode, HASH[d049fe7e250b7dd763a4a5daa4227dcd2e41733dd95fd0758641ac06c63c3b51]))
                  // + closing Parent/Child ops binding the last few entries
                )
              }
            }
          }
        }
      }
    }
  }
}

The 254-line full verbatim sits in the bench's [gproof] G1b output — same template (one KVValueHashFeatureTypeWithChildHash per brand, all with CountTree count=1000 and BasicMerkNode feature) repeating 100 times. The schematic above shows the first 3 and last 3 targets so the structural pattern is clear without reproducing 100 near-identical lines.

Key observation: BasicMerkNode (not ProvableCountedMerkNode) is the feature type on each L6 op. byBrand is a NormalTree, so its merk binary tree's internal nodes don't carry running counts; only the per-brand CountTree count=1000 values stored inside each brand's element matter. Contrast this with G1b's byColor cousin (group_by_color_in_proof_100_rangecountable_branches, 10 512 B): there the L6 targets would carry ProvableCountedMerkNode(...) features because byColor IS a ProvableCountTree. The ~5 % size difference (474 B against the 10 038 B measured here) is accounted for by those count fields across the 100 nodes.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree (100 entries)"]:::path
  BR ==> B000["brand_000: CountTree count=1000"]:::target
  BR ==> B001["brand_001: CountTree count=1000"]:::target
  BR ==> BMore["... 96 more in-range targets<br/>(brand_002 ... brand_097)"]:::target
  BR ==> B098["brand_098: CountTree count=1000"]:::target
  BR ==> B099["brand_099: CountTree count=1000"]:::target

  SDK["Entries(100 groups, sum=100 000):<br/>(&quot;brand_000&quot;, 1000),<br/>(&quot;brand_001&quot;, 1000),<br/>...<br/>(&quot;brand_099&quot;, 1000)"]:::sdk
  B000 -.-> SDK
  B099 -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 4 stroke:#1f6feb,stroke-width:3px;
  linkStyle 5 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

Identical to G1's L5–L6 shape, just with all 100 entries in the byBrand merk tree resolved as visible targets rather than just two. The byBrand binary tree has all 100 keys exposed — no opaque sibling subtrees (Hash ops) at all, only KVValueHashFeatureTypeWithChildHash (full reveal) plus Parent / Child glue.

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree"]
    direction TB
    L5_q["<b>brand</b> (queried)<br/>kv_hash=HASH[68b6...]"]:::queried
    L5_left["HASH[9862...]"]:::sibling
    L5_right["HASH[6c36...]"]:::sibling
    L5_q --> L5_left
    L5_q --> L5_right
  end

  subgraph L6["Layer 6 — byBrand merk-tree (ALL 100 targets fully resolved)"]
    direction TB
    L6_t0["<b>brand_000</b><br/>CountTree count=1000<br/>BasicMerkNode"]:::target
    L6_t1["<b>brand_001</b><br/>CountTree count=1000"]:::target
    L6_tmid["... 97 more KVValueHashFeatureTypeWithChildHash<br/>targets, each CountTree count=1000<br/>(199 merk ops total: 100 Push + 99 Parent/Child)"]:::target
    L6_t99["<b>brand_099</b><br/>CountTree count=1000"]:::target

    L6_t0 --> L6_t1
    L6_t1 --> L6_tmid
    L6_tmid --> L6_t99
  end

  L5_q -. "Tree(merk_root[byBrand])" .-> L6_t0

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;

Because the In set covers every brand in the fixture, the proof has zero opaque-sibling subtree commitments at L6: every binary-tree node is revealed as a KVValueHashFeatureTypeWithChildHash target. That's the most efficient byte-per-key shape GroupByIn can hit: at |IN| = B (where B is the total number of entries in the property tree), the proof size is ≈ B × (kv-hash + count + child-hash + glue) ≈ B × 100 bytes. For B = 100 that predicts about 10 000 bytes, in line with the 10 038 B we observe.

By contrast, smaller In sets (G1's |IN| = 2) pay the boundary-proof tax: the byBrand merk tree has ~98 unresolved entries, which the proof must cover with KVHash (opaque-key commitment, ~33 B) and Hash (opaque-subtree commitment, ~33 B) ops along the paths to the resolved keys. The crossover at which "reveal everything" becomes cheaper than "reveal-some-and-commit-the-rest" depends on the ratio of |IN| to B; for byBrand with B = 100, the crossover is around |IN| ≈ 50.
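The full-reveal arithmetic can be sketched in a few lines. This is a minimal sketch, not bench code: both function names are illustrative, and the bytes-per-entry input is the ~100 B approximation from the text rather than a wire-format constant.

```rust
/// Merk ops needed to rebuild a tree when every node is revealed:
/// one Push per node plus one Parent or Child op per edge (n - 1 edges).
fn full_reveal_op_count(nodes: u64) -> u64 {
    if nodes == 0 {
        0
    } else {
        nodes + (nodes - 1)
    }
}

/// Rough proof-size estimate for the full-reveal case.
/// `bytes_per_entry` is an approximation (about 100 B per key in the text),
/// not a constant taken from the proof encoding.
fn full_reveal_bytes_estimate(nodes: u64, bytes_per_entry: u64) -> u64 {
    nodes * bytes_per_entry
}
```

For the 100-brand fixture, `full_reveal_bytes_estimate(100, 100)` gives 10 000 bytes, close to the measured 10 038 B.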

G2 — `In` on `byColor`, Grouped By `color`

select   = COUNT
where    = color IN ["color_00000000", "color_00000001"]
group_by = [color]
prove    = true

Path query (identical to Q6):

path:         ["@", contract_id, 0x01, "widget", "color"]
query items:  [Key("color_00000000"), Key("color_00000001")]

Verified payload:

Entries([
  ("color_00000000", CountTree { count_value_or_default: 100 }),
  ("color_00000001", CountTree { count_value_or_default: 100 }),
])

Proof size: 1 381 B. Byte-identical to Q6 — same path query, same ProvableCountTree-style boundary commitments (KVHashCount ops carry running counts even though the SDK doesn't read them for this point lookup). The single difference from G1 is the underlying property-name tree type (ProvableCountTree for byColor vs NormalTree for byBrand); that affects the merk-boundary commitments but not the dispatcher's GroupByIn-vs-Aggregate routing.
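A minimal sketch of the result-shaping step, assuming a hypothetical `CountResult` enum and `shape` helper (neither name is taken from the SDK). The same verified per-key counts are either summed or returned as entries, which is why G2 can reuse Q6's proof bytes unchanged.

```rust
/// Hypothetical result type mirroring the two payload shapes described
/// in this chapter; the real SDK type may differ.
#[derive(Debug, PartialEq)]
enum CountResult {
    Aggregate(u64),
    Entries(Vec<(String, u64)>),
}

/// Shape the same verified (key, count) pairs with or without group_by.
fn shape(verified: Vec<(String, u64)>, group_by: bool) -> CountResult {
    if group_by {
        // One count per group key, no summation.
        CountResult::Entries(verified)
    } else {
        // Collapse the per-key counts into a single number.
        CountResult::Aggregate(verified.iter().map(|(_, c)| c).sum())
    }
}
```

With the two color entries above, `group_by = false` yields `Aggregate(200)` and `group_by = true` yields the two-element `Entries` list.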

For the verbatim proof display, see Q6 in chapter 29, or open it in the interactive visualizer.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> CO["color: ProvableCountTree"]:::path
  CO ==> C000["color_00000000: CountTree count=100"]:::target
  CO ==> C001["color_00000001: CountTree count=100"]:::target
  CO -.-> CMore["color_00000002 ... color_00000999"]:::faded

  SDK["Verifier returns Entries([<br/>(&quot;color_00000000&quot;, 100),<br/>(&quot;color_00000001&quot;, 100)<br/>])"]:::sdk
  C000 -.-> SDK
  C001 -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#d29922,color:#0d1117,stroke:#1f6feb,stroke-width:2px;
  classDef faded fill:#21262d,color:#6e7681,stroke:#484f58;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

Identical to Q6's Layer-5+ diagram. The byColor ProvableCountTree at L6 carries the same KVHashCount running counts; the SDK ignores them for point-lookup group_by and reads only the two resolved targets' count_value_or_default.

G3 — Compound `In` + Equal, Grouped By `brand`

select   = COUNT
where    = brand IN ["brand_000", "brand_001"] AND color == "color_00000500"
group_by = [brand]
prove    = true

Path query (per-In compound resolution — outer Query on byBrand, inner subquery on byBrandColor's color terminator):

path:               ["@", contract_id, 0x01, "widget", "brand"]
query items:        [Key("brand_000"), Key("brand_001")]
subquery_path:      ["color"]
subquery items:     [Key("color_00000500")]

Verified payload:

Entries([
  ("brand_000", CountTree { count_value_or_default: 1 }),
  ("brand_001", CountTree { count_value_or_default: 1 }),
])

Each (brand, "color_00000500") pair has exactly 1 document in the bench's deterministic schedule.
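One way to see why every pair holds exactly one document: 100 000 rows spread over 100 brands × 1 000 colors leaves one row per pair. The sketch below assumes a hypothetical schedule (row `i` gets brand `i % 100` and color `i / 100`); the bench's actual schedule may assign rows differently while producing the same counts.

```rust
use std::collections::HashMap;

/// Count documents per (brand, color) pair under an ASSUMED schedule:
/// row `i` gets brand `i % 100` and color `i / 100`.
/// This is illustrative only and not the bench's fixture code.
fn pair_counts(rows: u32) -> HashMap<(u32, u32), u32> {
    let mut counts = HashMap::new();
    for i in 0..rows {
        *counts.entry((i % 100, i / 100)).or_insert(0) += 1;
    }
    counts
}
```

Under that assumption, `pair_counts(100_000)` has 100 000 distinct pairs, each with count 1, and summing over all colors for a single brand gives the 1 000 shown on the brand-level CountTree.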

Proof size: 2 842 B. Mode: CountMode::GroupByIn over the byBrandColor compound index.

Proof display:

Structured proof (8 layers: two parallel brand-X → color → color_00000500 descents sharing L1–L6), also available in the interactive visualizer:

GroveDBProofV1 {
  LayerProof {
    proof: Merk(
      0: Push(Hash(HASH[bd291f29893fb6f6d6201087746ca1f23a178dd08e1346cb6c127e91ae3623b3]))
      1: Push(KVValueHash(@, Tree(4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289), HASH[4a5a28cb1b40226aa35b2f0d502767df13268bdf4678627dbfde26a557acdf73]))
      2: Parent
      3: Push(Hash(HASH[19c924989e473a90d0848277d0b1498ccc8db3dc870cbc130e773f3d79ea5b71]))
      4: Child)
    lower_layers: {
      @ => {
        LayerProof {
          proof: Merk(
            0: Push(KVValueHash(0x4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289, Tree(01), HASH[5b90e1e952b7eef903cc9db2d9098e334a37f7e08cade52c6b2ea3bf4b56b645])))
          lower_layers: {
            0x4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289 => {
              LayerProof {
                proof: Merk(
                  0: Push(Hash(HASH[49e7191075272395ed72cf03e973987ede6e4945e08574fe77d725f4ce7ecdf8]))
                  1: Push(KVValueHash(0x01, Tree(776964676574), HASH[5d9a0fad8a3f32560f8e8950c1e84a7feabaab21b79bc72fec4482442844e2ef]))
                  2: Parent)
                lower_layers: {
                  0x01 => {
                    LayerProof {
                      proof: Merk(
                        0: Push(KVValueHash(widget, Tree(6272616e64), HASH[6c505f53f2ebf3de030cc2aca463d4b429aeb320a9fadb8ae68bb7903a22bb68])))
                      lower_layers: {
                        widget => {
                          LayerProof {
                            proof: Merk(
                              0: Push(Hash(HASH[9862894b16a0792688fdcf64edcb2ceade5c8b234649bfc6cfc6426869b0e9d9]))
                              1: Push(KVValueHash(brand, Tree(6272616e645f303633), HASH[68b697da99d6ea70a83eb41794dca7ba3938d0ba98fbfaeb3cd0c19b3b5d0ff2]))
                              2: Parent
                              3: Push(Hash(HASH[6c36729e93b1a316cbf60fe282eb630c0ed6e45db088e365110302b6c9caba86]))
                              4: Child)
                            lower_layers: {
                              brand => {
                                LayerProof {
                                  proof: Merk(
                                    0: Push(KVValueHash(brand_000, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[90ff6f6d9a3d901195982128130677243bfd27b75736206f3c8400966ef0d37b]))
                                    1: Push(KVValueHash(brand_001, CountTree(636f6c6f72, 1000, flags: [0, 0, 0]), HASH[484ca11fb4ec8f479be1f78af903ce0c9d4fe630517579fb0172c2576d6b9652]))
                                    2: Parent
                                    3: Push(Hash(HASH[8ca09dadc802a7efe03534ce4ad991b2f191f368878754a37b5e5c03d9498dab]))
                                    4: Child
                                    5: Push(KVHash(HASH[e5297b3ebe81c6435c29f712074da5f7c90265e12ed3d4f5af1f6d900e50c9f1]))
                                    6: Parent
                                    7: Push(Hash(HASH[50f373fd01dea89c992779764dff82cc7200b492be8f5cf3721627d5323bcbff]))
                                    8: Child
                                    9: Push(KVHash(HASH[cf78c9f1b1a1204bb2e437806f52c21e331392de3436388572bd1fa4bce1cdc7]))
                                    10: Parent
                                    11: Push(Hash(HASH[4a8dc186a95c8c4a1252fb51dbc407727f588eb5bdc8313c96f5c29889e13926]))
                                    12: Child
                                    13: Push(KVHash(HASH[d00ee7653e34e47d46004929b13ded33dff069ed9cc88342cecdf66a65fd8401]))
                                    14: Parent
                                    15: Push(Hash(HASH[7f1d17b9632f0bd440dacf5e841025482bc1d8145df3650301a95a5ee71ce8c8]))
                                    16: Child
                                    17: Push(KVHash(HASH[3ed48a5e35cb7546d329487b0e1ab8a81d7c5bec358c37449e6cbd956e3bb069]))
                                    18: Parent
                                    19: Push(Hash(HASH[eaef9fc530408393bc321409414814b290309a861f474a925a922250327affc6]))
                                    20: Child
                                    21: Push(KVHash(HASH[f776417ede76e6194706e483ac14ab7b3db6aa0461ec14ed5f8e5d20071363af]))
                                    22: Parent
                                    23: Push(Hash(HASH[b3fccba79c14fcc5e97ff6a3cd051228dc755e6de147bef690ba9681264b2b9f]))
                                    24: Child)
                                  lower_layers: {
                                    brand_000 => {
                                      LayerProof {
                                        proof: Merk(
                                          0: Push(Hash(HASH[d605b4b78e674fd77371ea6adb32ce3e58ee3b96d73c4d34df84159661634587]))
                                          1: Push(KVValueHash(color, NonCounted(ProvableCountTree(636f6c6f725f3030303030353131, 1000, flags: [0, 0, 0])), HASH[fccc0c94657f2a78084f789bb6f687c4bba295e3a062f3199bc33f14dd2b7fe2]))
                                          2: Parent)
                                        lower_layers: {
                                          color => {
                                            LayerProof {
                                              proof: Merk(
                                                ... 37 ops — same boundary shape as Q4 / Q8's L8,
                                                terminating at op 18 with
                                                Push(KVValueHashFeatureTypeWithChildHash(
                                                  color_00000500, CountTree(00, 1, ...),
                                                  HASH[6834...], ProvableCountedMerkNode(1),
                                                  HASH[840c...]))
                                                — TARGET 1
                                              )
                                            }
                                          }
                                        }
                                      }
                                    }
                                    brand_001 => {
                                      LayerProof {
                                        proof: Merk(
                                          0: Push(Hash(HASH[f54769bf6e9d24b9dba53ebd37c9ceb3485b3c6511f8de6f17860676fe4d9331]))
                                          1: Push(KVValueHash(color, NonCounted(ProvableCountTree(636f6c6f725f3030303030353131, 1000, flags: [0, 0, 0])), HASH[8f883171c33df0aba2541a5b9d6195faac7bd1ffef93e8ddcaf9d092f0fa5e19]))
                                          2: Parent)
                                        lower_layers: {
                                          color => {
                                            LayerProof {
                                              proof: Merk(
                                                ... 37 ops — same boundary shape as brand_000's
                                                color subtree, terminating at op 18 with
                                                Push(KVValueHashFeatureTypeWithChildHash(
                                                  color_00000500, CountTree(00, 1, ...),
                                                  HASH[881d...], ProvableCountedMerkNode(1),
                                                  HASH[a422...]))
                                                — TARGET 2
                                              )
                                            }
                                          }
                                        }
                                      }
                                    }
                                  }
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

The two parallel descents below brand are the structurally novel part — every other layer above brand is byte-identical to Q4. The byBrand layer (L6) inlines brand_000 and brand_001 as KVValueHash siblings (ops 0–2), then descends via the lower_layers map into each one's value-tree continuation. Each continuation (L7) carries a single color key whose value is NonCounted(ProvableCountTree(…)) — the byBrandColor terminator. The terminator (L8) walks the boundary path through its in-color binary merk tree to land at color_00000500 with CountTree count=1 and a feature-typed child hash.

The bulk of the proof bytes (≈ 2 × 1 100 B = 2 200 B) is the doubled L7+L8 descent. The L1–L6 prefix amortises across both branches (≈ 600 B shared), which puts the estimate at ≈ 2 800 B against the measured 2 842 B. That is significantly less than 2× Q4's 1 911 B because the upper layers aren't repeated.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree"]:::path
  BR ==> B000["brand_000: CountTree count=1000"]:::path
  BR ==> B001["brand_001: CountTree count=1000"]:::path
  B000 ==> B000_C["color: NonCounted(ProvableCountTree)"]:::path
  B001 ==> B001_C["color: NonCounted(ProvableCountTree)"]:::path
  B000_C ==> T1["color_00000500: CountTree count=1"]:::target
  B001_C ==> T2["color_00000500: CountTree count=1"]:::target

  SDK["Verifier returns Entries([<br/>(&quot;brand_000&quot;, 1),<br/>(&quot;brand_001&quot;, 1)<br/>])"]:::sdk
  T1 -.-> SDK
  T2 -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 4 stroke:#1f6feb,stroke-width:3px;
  linkStyle 5 stroke:#1f6feb,stroke-width:3px;
  linkStyle 6 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

Layers 5–6 are like Q4's L5 + Q5's L6 combined (one KVValueHash per In brand at byBrand's binary tree); Layers 7–8 fork — one brand_000-rooted continuation chain and one brand_001-rooted chain — each shaped exactly like Q4's L7 + L8 descent.

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree"]
    direction TB
    L5_q["<b>brand</b><br/>kv_hash=HASH[68b6...]<br/>value: Tree (descent into byBrand)"]:::queried
    L5_left["HASH[9862...]"]:::sibling
    L5_right["HASH[6c36...]"]:::sibling
    L5_q --> L5_left
    L5_q --> L5_right
  end

  subgraph L6["Layer 6 — byBrand merk-tree (TWO INTERMEDIATE TARGETS)"]
    direction TB
    L6_t1["<b>brand_001</b><br/>kv_hash=HASH[484c...]<br/>value: CountTree count=1000"]:::queried
    L6_t0["<b>brand_000</b><br/>kv_hash=HASH[90ff...]<br/>value: CountTree count=1000"]:::queried
    L6_boundary["Boundary commitments (22 merk ops):<br/>5 KVHash brands + 6 Hash subtrees<br/>+ 11 Parent/Child"]:::sibling
    L6_t1 --> L6_t0
    L6_t1 --> L6_boundary
  end

  subgraph L7a["Layer 7a — brand_000's continuation merk-tree"]
    direction TB
    L7a_q["<b>color</b><br/>kv_hash=HASH[fccc...]<br/>value: NonCounted(ProvableCountTree)"]:::queried
    L7a_left["HASH[d605...]"]:::sibling
    L7a_q --> L7a_left
  end

  subgraph L7b["Layer 7b — brand_001's continuation merk-tree"]
    direction TB
    L7b_q["<b>color</b><br/>kv_hash=HASH[8f88...]<br/>value: NonCounted(ProvableCountTree)"]:::queried
    L7b_left["HASH[f547...]"]:::sibling
    L7b_q --> L7b_left
  end

  subgraph L8a["Layer 8a — brand_000's byBrandColor color subtree (TARGET 1)"]
    direction TB
    L8a_target["<b>color_00000500</b><br/>kv_hash=HASH[6834...]<br/>value: <b>CountTree count=1</b><br/>feature: ProvableCountedMerkNode(1)"]:::target
    L8a_boundary["37 merk ops:<br/>9 KVHashCount boundary commitments<br/>(running counts 3, 7, 15, 31, 63, 127, 255, 511, 1000)<br/>+ subtree hashes"]:::sibling
    L8a_target --> L8a_boundary
  end

  subgraph L8b["Layer 8b — brand_001's byBrandColor color subtree (TARGET 2)"]
    direction TB
    L8b_target["<b>color_00000500</b><br/>kv_hash=HASH[881d...]<br/>value: <b>CountTree count=1</b><br/>feature: ProvableCountedMerkNode(1)"]:::target
    L8b_boundary["37 merk ops:<br/>same boundary shape as L8a<br/>(different hashes — different brand's subtree)"]:::sibling
    L8b_target --> L8b_boundary
  end

  L5_q -. "Tree(merk_root[byBrand])" .-> L6_t1
  L6_t0 -. "CountTree continuation" .-> L7a_q
  L6_t1 -. "CountTree continuation" .-> L7b_q
  L7a_q -. "NonCounted(ProvableCountTree)" .-> L8a_target
  L7b_q -. "NonCounted(ProvableCountTree)" .-> L8b_target

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;

The two parallel byBrandColor descents share their L1–L6 commitments (the doctype prefix + byBrand merk root) but each gets its own L7 + L8 sub-proof. Proof bytes ≈ shared upper layers + 2 × per-brand byBrandColor descent ≈ 2 842 B.
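The byte budget above follows a simple shared-plus-per-branch model. The sketch below is a rough estimate under stated assumptions: the function name is illustrative, and the inputs are the approximate figures quoted in this section, not measured constants.

```rust
/// Rough size model for a compound In proof: the upper layers are paid
/// once, the per-brand descent once per In value.
/// Inputs are approximations (about 600 B shared, about 1 100 B per branch).
fn compound_in_bytes_estimate(shared: u64, per_branch: u64, branches: u64) -> u64 {
    shared + per_branch * branches
}
```

With two branches the model gives 2 800 B, within 2 % of the measured 2 842 B. Extrapolating to larger In sets is an assumption, since the L6 boundary commitments also change as more brands are resolved.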

G4 — Range on `byColor`, Grouped By `color`

GroupByRange is the proof primitive that enumerates distinct in-range keys with a count per key, as opposed to chapter 29's AggregateCountOnRange which collapses the same range to a single u64.

select   = COUNT
where    = color > "color_00000500"
group_by = [color]
prove    = true

Path query (uses distinct_count_path_query with limit=100, left_to_right=true):

path:         ["@", contract_id, 0x01, "widget", "color"]
query items:  [RangeAfter("color_00000500"..)]
limit:        100

Verified payload:

Entries(100 groups, sum = 10 000)

The 100 groups are color_00000501 through color_00000600 (the first 100 in-range colors in lex-asc order, capped by the limit). Each carries count_value_or_default = 100 since the fixture's deterministic schedule gives each color exactly 100 documents.

Wait: Q7 said there are 499 distinct in-range colors and sum = 49 900 over the same color > "color_00000500" predicate. So why does G4 see only 100 groups summing to 10 000? Because GroupByRange's distinct_count_path_query applies the 100-entry response cap (Some(limit) in execute_distinct_count_with_proof). Without that cap the proof would scale linearly with the full in-range distinct count (~55 KB for the full 499 colors at ~110 B per resolved CountTree branch). The cap is a response-size safety control: the walk stops once 100 entries have been emitted.
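The effect of the cap on group count and proof size can be checked with a short calculation. This is a sketch with an illustrative function name; the bytes-per-group input is the ~110 B approximation from the text.

```rust
/// Groups returned and a rough byte estimate for a capped GroupByRange walk.
/// `bytes_per_group` is an approximation (about 110 B per resolved branch),
/// not a constant from the proof encoding.
fn capped_walk(in_range: u64, limit: u64, bytes_per_group: u64) -> (u64, u64) {
    let groups = in_range.min(limit);
    (groups, groups * bytes_per_group)
}
```

`capped_walk(499, 100, 110)` returns `(100, 11_000)`, close to the measured 10 992 B, while lifting the cap (`limit = u64::MAX`) returns `(499, 54_890)`.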

Proof size: 10 992 B — ~5.3 × Q7. The structural reason:

Q7 (AggregateCountOnRange) walks the boundary of the range and emits one HashWithCount or KVDigestCount per merk-binary-tree boundary node. Total boundary nodes ≈ O(log C) (≈ 36 ops on the 1 000-color tree). The verifier sums subtree counts directly without descending into individual keys.
G4 (GroupByRange) walks the distinct in-range colors themselves — emitting one KVValueHashFeatureTypeWithChildHash(color_X, CountTree count=100, ProvableCountedMerkNode(…), …) per distinct color in the range, not just per merk-tree boundary node. Total ops ≈ O(R) where R is the distinct in-range colors (capped at 100 here).

The trade-off is exactly what you'd expect: AggregateCountOnRange is O(log C) in proof bytes but loses per-key resolution (returns one u64); GroupByRange is O(R) in proof bytes but preserves per-key counts.

Proof display:

Structured proof (6 layers; the bottom layer enumerates 100 distinct in-range colors as `KVValueHashFeatureTypeWithChildHash` targets, each carrying `CountTree count=100`), also available in the interactive visualizer:

GroveDBProofV1 {
  LayerProof {
    proof: Merk(
      0: Push(Hash(HASH[bd291f29893fb6f6d6201087746ca1f23a178dd08e1346cb6c127e91ae3623b3]))
      1: Push(KVValueHash(@, Tree(4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289), HASH[4a5a28cb1b40226aa35b2f0d502767df13268bdf4678627dbfde26a557acdf73]))
      2: Parent
      3: Push(Hash(HASH[19c924989e473a90d0848277d0b1498ccc8db3dc870cbc130e773f3d79ea5b71]))
      4: Child)
    lower_layers: {
      @ => {
        LayerProof {
          proof: Merk(
            0: Push(KVValueHash(0x4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289, Tree(01), HASH[5b90e1e952b7eef903cc9db2d9098e334a37f7e08cade52c6b2ea3bf4b56b645])))
          lower_layers: {
            0x4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289 => {
              LayerProof {
                proof: Merk(
                  0: Push(Hash(HASH[49e7191075272395ed72cf03e973987ede6e4945e08574fe77d725f4ce7ecdf8]))
                  1: Push(KVValueHash(0x01, Tree(776964676574), HASH[5d9a0fad8a3f32560f8e8950c1e84a7feabaab21b79bc72fec4482442844e2ef]))
                  2: Parent)
                lower_layers: {
                  0x01 => {
                    LayerProof {
                      proof: Merk(
                        0: Push(KVValueHash(widget, Tree(6272616e64), HASH[6c505f53f2ebf3de030cc2aca463d4b429aeb320a9fadb8ae68bb7903a22bb68])))
                      lower_layers: {
                        widget => {
                          LayerProof {
                            proof: Merk(
                              0: Push(Hash(HASH[9862894b16a0792688fdcf64edcb2ceade5c8b234649bfc6cfc6426869b0e9d9]))
                              1: Push(KVHash(HASH[a29ee8f206a253362b6da4fcacf8643ee8e5925cd979fcd449e5906f0f9f8be3]))
                              2: Parent
                              3: Push(KVValueHash(color, ProvableCountTree(636f6c6f725f3030303030353131, 100000), HASH[79569d595db75bbf2e9dca93a15c90b7eecf7b299632668ec410e2076d27f71c]))
                              4: Child)
                            lower_layers: {
                              color => {
                                LayerProof {
                                  proof: Merk(
                                    ... 18 boundary-descent ops walking the binary tree from
                                    root (color_00000511) leftward to the cut point ...
                                    18: Push(KVDigestCount(color_00000500, HASH[47b0ade5...], 100))
                                       // op 18: BOUNDARY (excluded by strict `>`)
                                    19: Push(KVValueHashFeatureTypeWithChildHash(color_00000501,
                                       CountTree(00, 100, flags: [0, 0, 0]),
                                       HASH[9146433eb6d43db2f109f5f7714146624bd646b27c7310f3c2cad7155eb7c741],
                                       ProvableCountedMerkNode(300),
                                       HASH[c285efb8724a488de916ce8301b06c197fc687b5b9b83a04bf3a026f1098d17a]))
                                       // op 19: TARGET 1
                                    20: Parent
                                    21: Push(KVValueHashFeatureTypeWithChildHash(color_00000502, CountTree(00, 100, ...)))
                                       // op 21: TARGET 2
                                    ... 98 more KVValueHashFeatureTypeWithChildHash targets
                                    (color_00000503 ... color_00000600), each emitting
                                    `CountTree count=100` plus its merk feature/child-hash glue,
                                    interleaved with Parent/Child ops walking the binary tree
                                    in lex-asc order. Every target shares the same shape:
                                    Push(KVValueHashFeatureTypeWithChildHash(
                                      color_XXXXXXXX,
                                      CountTree(00, 100, flags: [0, 0, 0]),
                                      HASH[...],
                                      ProvableCountedMerkNode(running_count_at_this_node),
                                      HASH[...]
                                    )) ...
                                    220: Push(KVValueHashFeatureTypeWithChildHash(color_00000600,
                                       CountTree(00, 100, ...))) // op 220: TARGET 100 (LAST)
                                    221..244: closing boundary ops — KVHashCount running
                                    counts (300, 700, 6300, 25500, 48800) and Hash subtrees
                                    proving the still-out-of-range portion to the right of
                                    color_00000600 covers the remainder of the merk root.)
                                }
                              }
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}

That schematic gives the shape; the bench's [gproof] output (run cargo bench --bench document_count_worst_case and grep [gproof] G4) has all 245 ops verbatim. The schematic elides most of the 100 KVValueHashFeatureTypeWithChildHash targets since they share the same structural template: only the key name, the leaf kv-hash, the running count, and the child-hash differ.

Why so many targets? Because GroupByRange must enumerate every in-range key with its CountTree value — the SDK needs each individual key→count pair, which the aggregate-style HashWithCount commitment hides. So the prover walks the merk binary tree's in-order traversal across the in-range portion (here, left-to-right starting just past color_00000500) and emits one KVValueHashFeatureTypeWithChildHash per distinct color it visits, until the response-size limit is reached.
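The walk described above behaves like a bounded in-order range scan. The sketch below uses a `BTreeMap` as a stand-in for the byColor merk tree; both function names are hypothetical, and the real prover emits proof ops rather than plain pairs.

```rust
use std::collections::BTreeMap;
use std::ops::Bound::{Excluded, Unbounded};

/// In-order walk over keys strictly greater than `after`, stopping at `limit`.
/// A BTreeMap stands in for the merk tree; this is not the prover's code.
fn walk_range_after(
    tree: &BTreeMap<String, u64>,
    after: &str,
    limit: usize,
) -> Vec<(String, u64)> {
    tree.range::<str, _>((Excluded(after), Unbounded))
        .take(limit)
        .map(|(k, c)| (k.clone(), *c))
        .collect()
}

/// Fixture stand-in taken from the text: 1 000 colors, 100 documents each.
fn color_fixture() -> BTreeMap<String, u64> {
    (0..1000u32)
        .map(|i| (format!("color_{:08}", i), 100))
        .collect()
}
```

Calling `walk_range_after(&color_fixture(), "color_00000500", 100)` yields the 100 groups from color_00000501 through color_00000600, matching the verified payload.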

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> CO["color: ProvableCountTree count=100000"]:::path
  CO -.-> C500["color_00000500 (boundary, excluded)"]:::faded
  CO ==> C501["color_00000501: CountTree count=100"]:::target
  CO ==> CMore["color_00000502 ... color_00000600<br/>(98 more in-range targets,<br/>each CountTree count=100)"]:::target
  CO ==> C600["color_00000600: CountTree count=100"]:::target
  CO -.-> CRest["color_00000601 ... color_00000999<br/>(beyond limit — opaque)"]:::faded

  SDK["Verifier returns Entries(100 groups):<br/>(&quot;color_00000501&quot;, 100),<br/>(&quot;color_00000502&quot;, 100),<br/>... (&quot;color_00000600&quot;, 100)"]:::sdk
  C501 -.-> SDK
  CMore -.-> SDK
  C600 -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#d29922,color:#0d1117,stroke:#1f6feb,stroke-width:2px;
  classDef faded fill:#21262d,color:#6e7681,stroke:#484f58;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 4 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

L5 is identical to Q3's / Q6's L5 (color queried under an opaque kv root in the widget doctype tree). L6 is the structural novelty: 245 merk ops, of which 100 are full KVValueHashFeatureTypeWithChildHash targets and the remaining 145 are boundary-walk glue (KVDigestCount / KVHashCount / HashWithCount / Hash + Parent/Child).

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree (proof view for `color`)"]
    direction TB
    L5_root["KVHash[a29e...]<br/>(opaque kv root)"]:::sibling
    L5_left["HASH[9862...]"]:::sibling
    L5_q["<b>color</b><br/>kv_hash=HASH[7956...]<br/>value: ProvableCountTree count=100000"]:::queried
    L5_root --> L5_left
    L5_root --> L5_q
  end

  subgraph L6["Layer 6 — byColor ProvableCountTree merk-tree (100 in-range targets)"]
    direction TB
    L6_boundary_l["Left boundary descent (18 ops):<br/>walks from merk root color_00000511<br/>through KVHashCount running counts<br/>(51100, 25500, 12700, 6300, 3100, 700)<br/>down to color_00000500"]:::sibling
    L6_cut["op 18: KVDigestCount(color_00000500, ..., 100)<br/>(boundary — excluded by strict `>`)"]:::boundary
    L6_targets["ops 19..220: 100 in-range targets<br/>color_00000501 (count=100), color_00000502 (100),<br/>color_00000503 (100), ... color_00000600 (100)<br/>each as KVValueHashFeatureTypeWithChildHash<br/>with ProvableCountedMerkNode(subtree_count)<br/>interleaved with Parent/Child glue"]:::target
    L6_boundary_r["Right closing boundary (24 ops):<br/>KVHashCount running counts<br/>(300, 700, 6300, 25500, 48800)<br/>+ Hash subtree commitments<br/>covering color_00000601 ... color_00000999"]:::sibling

    L6_boundary_l --> L6_cut
    L6_cut --> L6_targets
    L6_targets --> L6_boundary_r
  end

  L5_q -. "ProvableCountTree(merk_root[byColor])" .-> L6_boundary_l

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef boundary fill:#d29922,color:#0d1117,stroke:#d29922,stroke-width:2px,stroke-dasharray: 6 3;

Three things this diagram makes explicit:

The cut is named. op 18: KVDigestCount(color_00000500, ..., 100) exposes the key at the boundary so the verifier knows the cut sits exactly between color_00000500 (excluded) and color_00000501 (first in-range). Without that named op, a malicious prover could shift the cut and the verifier wouldn't know.
Targets carry their own count, not a running total. Unlike Q7's boundary commitments (where ProvableCountedMerkNode(N) carried a subtree count), G4's targets are individual keys with CountTree(00, 100, ...) — the count_value_or_default = 100 IS the per-key count, not a subtree aggregate. The ProvableCountedMerkNode(N) on the merk feature still carries the subtree count (e.g. 300 for color_00000501's subtree), but G4's verifier reads count_value_or_default directly from the CountTree element.
The right closing boundary doesn't enumerate the rest. Once the limit is hit at color_00000600, the proof commits the remaining 399 in-range colors (color_00000601 through color_00000999) as opaque subtree hashes (KVHashCount + Hash ops). The SDK returns only the 100 visible groups; the remainder are provably present but not enumerated. This is the limit's whole point: it bounds response size without sacrificing soundness on the visible groups.
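The limit's effect on the visible groups can be modelled outside grovedb with an ordered map. The sketch below is a toy model, not the rs-drive code path: `by_color_fixture` and `visible_groups` are hypothetical names, and the fixture layout (1 000 colors, 100 documents each) is assumed from the counts shown in the diagrams above.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Toy stand-in for the byColor index: color key -> per-key document count.
/// Assumes 1 000 colors with 100 documents each (hypothetical fixture builder).
fn by_color_fixture() -> BTreeMap<String, u64> {
    (0..1000u32)
        .map(|i| (format!("color_{:08}", i), 100u64))
        .collect()
}

/// Walks keys strictly greater than `floor` in ascending order. Returns the
/// first `limit` (key, count) groups plus the number of in-range keys that
/// were not enumerated (the part the real proof commits as opaque hashes).
fn visible_groups(
    index: &BTreeMap<String, u64>,
    floor: &str,
    limit: usize,
) -> (Vec<(String, u64)>, usize) {
    let bounds: (Bound<&str>, Bound<&str>) = (Bound::Excluded(floor), Bound::Unbounded);
    let mut in_range = index.range::<str, _>(bounds);
    let visible: Vec<(String, u64)> = in_range
        .by_ref()
        .take(limit)
        .map(|(key, count)| (key.clone(), *count))
        .collect();
    // Whatever is left in the iterator is in range but beyond the limit.
    let hidden = in_range.count();
    (visible, hidden)
}
```

Calling `visible_groups(&by_color_fixture(), "color_00000500", 100)` yields the 100 groups from color_00000501 to color_00000600 and reports 399 unenumerated keys, matching the split between targets and the right closing boundary.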

G5 — Compound `In` + Range, Grouped By `brand, color`

select   = COUNT
where    = brand IN ["brand_000", "brand_001"] AND color > "color_00000500"
group_by = [brand, color]
prove    = true

Path query (outer In on byBrand fans out to per-brand distinct_count_path_query on byBrandColor's color terminator):

outer path:         ["@", contract_id, 0x01, "widget", "brand"]
outer query items:  [Key("brand_000"), Key("brand_001")]
subquery_path:      ["color"]
subquery items:     [RangeAfter("color_00000500"..)]
subquery limit:     100 (shared across both brands)

Verified payload:

Entries(100 groups, sum = 100)

Two brands × 50 in-range colors per brand = 100 distinct (brand, color) groups visible in the proof. Each (brand_X, color_Y) pair has exactly 1 document by the fixture's deterministic schedule.

Proof size: 11 554 B. Mode: CountMode::GroupByCompound.

This is the most general group-by shape supported on this contract: outer In fan-out × inner GroupByRange walk. Structurally it combines G3's two-branch descent with G4's in-range enumeration per branch. Proof bytes ≈ shared upper-layer descent + 2 × per-brand byBrandColor distinct-walk. The bench's group_by_compound_in_range_proof_limit_100 benchmark uses the same shape with |IN| = 100 brands instead of 2 — yielding 17 256 B at the much higher fan-out.

Proof display:

Expand to see the structured proof (8 layers — same descent skeleton as G3, but each brand's L8 enumerates 50 in-range colors instead of one point-lookup target) — or open interactively in the visualizer ↗

GroveDBProofV1 {
  LayerProof {
    proof: Merk(
      0: Push(Hash(HASH[bd291f29893fb6f6d6201087746ca1f23a178dd08e1346cb6c127e91ae3623b3]))
      1: Push(KVValueHash(@, Tree(4ed22624752972af97fb71abf4067b23e6d296a61a02f35b2098819fde39d289), HASH[4a5a28cb1b40226aa35b2f0d502767df13268bdf4678627dbfde26a557acdf73]))
      2: Parent
      3: Push(Hash(HASH[19c924989e473a90d0848277d0b1498ccc8db3dc870cbc130e773f3d79ea5b71]))
      4: Child)
    lower_layers: {
      @ => { LayerProof { ... contract_id descent ... } }
      // L2..L4 identical to G3 / Q4's first three subgroves
    }
  }
  // L5 widget doctype merk tree: same as G3 — `brand` queried, opaque siblings 9862 / 6c36
  // L6 byBrand merk tree: two KVValueHash targets (brand_000 + brand_001), 25 boundary ops
  // L7a brand_000's value tree: single key `color` with NonCounted(ProvableCountTree(...))
  //   L8a byBrandColor's color subtree (under brand_000):
  //     proof: Merk(
  //       ... 18 boundary-descent ops walking from the merk root down to color_00000500 ...
  //       18: Push(KVDigestCount(color_00000500, HASH[...], 1))     // BOUNDARY, excluded
  //       19: Push(KVValueHashFeatureTypeWithChildHash(color_00000501,
  //              CountTree(00, 1, flags: [0, 0, 0]),
  //              HASH[4192...], ProvableCountedMerkNode(3), HASH[c3b4...])) // TARGET (brand_000, color_00000501)
  //       21: Push(KVValueHashFeatureTypeWithChildHash(color_00000502, CountTree(00, 1, ...))) // TARGET 2
  //       24: Push(KVValueHashFeatureTypeWithChildHash(color_00000503, CountTree(00, 1, ...))) // TARGET 3
  //       ... 47 more KVValueHashFeatureTypeWithChildHash targets, each CountTree(00, 1, ...)
  //           — color_00000504 ... color_00000550 (50 per-brand_000 targets total) ...
  //       ... closing boundary ops covering color_00000551 ... color_00000999 for brand_000
  //     )
  //   end L8a
  // end L7a
  // L7b brand_001's value tree: identical structure to L7a, single key `color`
  //   L8b byBrandColor's color subtree (under brand_001):
  //     proof: Merk(
  //       ... 18 boundary-descent ops (different hashes — different brand's subtree) ...
  //       18: Push(KVDigestCount(color_00000500, HASH[...], 1))
  //       19..220: 50 in-range KVValueHashFeatureTypeWithChildHash(color_X, CountTree(00, 1, ...)) targets
  //                + interleaved Parent/Child glue + closing boundary ops
  //     )
  //   end L8b
  // end L7b
  // end L6
}

The 344-line verbatim is available via the bench's [gproof] G5 output. The schematic compresses the 50 per-brand KVValueHashFeatureTypeWithChildHash targets at L8a / L8b — they all share the same template (CountTree(00, 1, ...) since each (brand, color) pair has count=1), differing only in key, leaf kv-hash, running count, and child-hash. Once you've seen G3's L8 structure (single target) and G4's L6 structure (100 in-range targets at the doctype level), G5 is precisely the product: two parallel G3-shaped descents that each terminate in a G4-shaped distinct-walk.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree"]:::path
  BR ==> B000["brand_000: CountTree count=1000"]:::path
  BR ==> B001["brand_001: CountTree count=1000"]:::path

  B000 ==> B000_C["brand_000/color: NonCounted(ProvableCountTree)"]:::path
  B001 ==> B001_C["brand_001/color: NonCounted(ProvableCountTree)"]:::path

  B000_C ==> T000_501["color_00000501: CountTree count=1"]:::target
  B000_C ==> T000_more["... 48 more color targets<br/>(brand_000, color_00000502..549)"]:::target
  B000_C ==> T000_550["color_00000550: CountTree count=1"]:::target

  B001_C ==> T001_501["color_00000501: CountTree count=1"]:::target
  B001_C ==> T001_more["... 48 more color targets<br/>(brand_001, color_00000502..549)"]:::target
  B001_C ==> T001_550["color_00000550: CountTree count=1"]:::target

  SDK["Entries(100 groups, sum=100):<br/>(&quot;brand_000&quot;, &quot;color_00000501&quot;, 1),<br/>...<br/>(&quot;brand_001&quot;, &quot;color_00000550&quot;, 1)"]:::sdk

  T000_501 -.-> SDK
  T000_more -.-> SDK
  T000_550 -.-> SDK
  T001_501 -.-> SDK
  T001_more -.-> SDK
  T001_550 -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 4 stroke:#1f6feb,stroke-width:3px;
  linkStyle 5 stroke:#1f6feb,stroke-width:3px;
  linkStyle 6 stroke:#1f6feb,stroke-width:3px;
  linkStyle 7 stroke:#1f6feb,stroke-width:3px;
  linkStyle 8 stroke:#1f6feb,stroke-width:3px;
  linkStyle 9 stroke:#1f6feb,stroke-width:3px;
  linkStyle 10 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

Layers 5–7 are exactly G3's L5–L7. The difference shows up at L8 — instead of a single target per brand (G3's compound point lookup), each brand's L8 walks 50 in-range colors via the same KVValueHashFeatureTypeWithChildHash enumeration G4 uses, plus the boundary descent / closing boundary glue.

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree"]
    direction TB
    L5_q["<b>brand</b> (queried)<br/>kv_hash=HASH[68b6...]"]:::queried
  end

  subgraph L6["Layer 6 — byBrand merk-tree (two intermediate targets)"]
    direction TB
    L6_t0["<b>brand_000</b> (queried)<br/>CountTree count=1000"]:::queried
    L6_t1["<b>brand_001</b> (queried)<br/>CountTree count=1000"]:::queried
  end

  subgraph L7a["Layer 7a — brand_000's continuation"]
    direction TB
    L7a_q["<b>color</b> (queried)<br/>NonCounted(ProvableCountTree)"]:::queried
  end
  subgraph L7b["Layer 7b — brand_001's continuation"]
    direction TB
    L7b_q["<b>color</b> (queried)<br/>NonCounted(ProvableCountTree)"]:::queried
  end

  subgraph L8a["Layer 8a — brand_000's byBrandColor distinct-walk"]
    direction TB
    L8a_targets["50 KVValueHashFeatureTypeWithChildHash targets:<br/>color_00000501 ... color_00000550<br/>each CountTree(00, 1, ...)<br/>+ left/right boundary glue"]:::target
  end
  subgraph L8b["Layer 8b — brand_001's byBrandColor distinct-walk"]
    direction TB
    L8b_targets["50 KVValueHashFeatureTypeWithChildHash targets:<br/>color_00000501 ... color_00000550<br/>each CountTree(00, 1, ...)<br/>+ left/right boundary glue<br/>(different hashes — different brand subtree)"]:::target
  end

  L5_q -. "byBrand" .-> L6_t0
  L5_q -. "byBrand" .-> L6_t1
  L6_t0 -. "continuation" .-> L7a_q
  L6_t1 -. "continuation" .-> L7b_q
  L7a_q -. "byBrandColor distinct-range" .-> L8a_targets
  L7b_q -. "byBrandColor distinct-range" .-> L8b_targets

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;

The 50-targets-per-brand limit reflects the shared response-size cap. In the 2-brand case the cap kicks in at 50 colors per brand; if the In set had 1 brand it would be 100 colors; if it had 4 brands it would be 25 each. The dispatcher slices the cap evenly across the In fan-out so the total number of returned entries equals the limit, regardless of how many In branches share it. That's why the bench's [matrix] row for this case shows Entries(len=100, sum=100) rather than len=200, sum=200.
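The even split can be written down as a small sketch. `per_branch_cap` and `compound_entries` are hypothetical helper names, not dispatcher functions, and the fixture (one document per (brand, color) pair, colors up to color_00000999) is assumed from the text. How a remainder is handled when the limit is not divisible by the fan-out is not stated above, so the sketch simply truncates.

```rust
/// Per-branch share of a shared entry limit, assuming an even split.
/// Returns None for an empty In set. Remainder handling is an assumption
/// of this sketch (integer division truncates).
fn per_branch_cap(limit: usize, branches: usize) -> Option<usize> {
    if branches == 0 {
        None
    } else {
        Some(limit / branches)
    }
}

/// Builds the (brand, color, count) entries visible for
/// `color > color_<floor>` under the toy fixture: 1 000 colors per brand,
/// exactly one document per (brand, color) pair.
fn compound_entries(brands: &[&str], floor: u32, limit: usize) -> Vec<(String, String, u64)> {
    let cap = per_branch_cap(limit, brands.len()).unwrap_or(0);
    brands
        .iter()
        .flat_map(|brand| {
            // Strict `>`: the walk starts one past the floor.
            ((floor + 1)..1000)
                .take(cap)
                .map(move |c| (brand.to_string(), format!("color_{:08}", c), 1u64))
        })
        .collect()
}
```

With two brands and a limit of 100, `compound_entries(&["brand_000", "brand_001"], 500, 100)` produces 100 entries summing to 100, with each brand stopping at color_00000550.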

G7 — Carrier `In` + Range, Grouped By `brand`

select   = COUNT
where    = brand IN ["brand_000", "brand_001"] AND color > "color_00000500"
group_by = [brand]
prove    = true

Path query (carrier AggregateCountOnRange — outer Keys per In value, ACOR subquery over each brand's color subtree):

path:                  ["@", contract_id, 0x01, "widget", "brand"]
outer query items:     [Key("brand_000"), Key("brand_001")]
subquery_path:         ["color"]
subquery items:        [AggregateCountOnRange([RangeAfter("color_00000500"..)])]

Verified payload (verifier returns one (in_key, u64) per resolved In branch via GroveDb::verify_aggregate_count_query_per_key):

[("brand_000", 499), ("brand_001", 499)]

Each brand has all 1 000 colors in its byBrandColor terminator; the strict > cut at color_00000500 leaves color_00000501..color_00000999 = 499 in-range colors per brand. Total sum = 998 documents.
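The payload arithmetic can be checked with a toy model of the byBrandColor index. This is a sketch under stated assumptions, not the verifier: `ByBrandColor`, `fixture`, and `per_brand_counts` are hypothetical names, and nested `BTreeMap`s stand in for the merk trees.

```rust
use std::collections::BTreeMap;
use std::ops::Bound;

/// Toy stand-in for byBrandColor: brand -> (color -> document count).
type ByBrandColor = BTreeMap<String, BTreeMap<String, u64>>;

/// Assumed fixture: every brand holds all 1 000 colors with exactly one
/// document per (brand, color) pair.
fn fixture(brands: u32) -> ByBrandColor {
    (0..brands)
        .map(|b| {
            let colors = (0..1000u32)
                .map(|c| (format!("color_{:08}", c), 1u64))
                .collect();
            (format!("brand_{:03}", b), colors)
        })
        .collect()
}

/// One (brand, count) pair per In value that resolves, where the count sums
/// the per-color counts strictly above `color_floor`.
fn per_brand_counts(
    index: &ByBrandColor,
    in_values: &[&str],
    color_floor: &str,
) -> Vec<(String, u64)> {
    let bounds: (Bound<&str>, Bound<&str>) = (Bound::Excluded(color_floor), Bound::Unbounded);
    in_values
        .iter()
        .filter_map(|brand| {
            let colors = index.get(*brand)?; // unresolved In values are skipped
            let count: u64 = colors.range::<str, _>(bounds).map(|(_, n)| *n).sum();
            Some((brand.to_string(), count))
        })
        .collect()
}
```

For the two In values above and the floor color_00000500, the model returns 499 for each brand and 998 in total.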

Proof size: 4 332 B. Mode: CountMode::GroupByIn routed to DocumentCountMode::RangeAggregateCarrierProof (the new dispatcher arm wired up against grovedb PR #663).

This is the natural answer to "give me a per-brand aggregate count over a colour range": it has the same per-In-aggregate semantics as the no-proof per-In fan-out, but is verifiable in a single proof. It is strictly smaller and asymptotically better than the alternative two-field shape G5:

G5 (compound distinct walk, group_by = [brand, color]): O(k · R' · log C') bytes; emits one KVValueHashFeatureTypeWithChildHash per resolved (brand, color) pair → 11 554 B for k=2, R'≈50. Carries per-pair granularity the caller may not want.
G7 (carrier aggregate, group_by = [brand]): O(k · (log B + log C')) bytes; emits one HashWithCount/KVDigestCount ACOR boundary walk per brand → 4 332 B for k=2, log C'≈10. ~2.7× smaller than G5 for the same input data, at the cost of losing per-color resolution (which the group_by = [brand] caller didn't ask for anyway).

Compared with Q8 (brand == X AND color > floor, the same shape with k=1 and group_by = []), the cost grows linearly in the number of In branches: Q8 is 2 656 B, G7 is 4 332 B for k=2. The slope (G7 − Q8) / 1 = +1 676 B per additional In branch matches what you'd expect: each brand adds its own L6 commit plus its own L7 + L8 ACOR boundary walk (≈ Q8's L7 + L8 ≈ 1 700 B), with the L1–L5 prefix amortised once across all branches.
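The slope arithmetic is small enough to spell out. The helper names below are hypothetical, and the extrapolation is a back-of-envelope linear model built only from the two measured sizes quoted above, not a measurement.

```rust
/// Bytes added per extra outer branch, from two measured proof sizes that
/// differ by `extra_branches` branches.
fn slope_per_branch(small: u64, large: u64, extra_branches: u64) -> u64 {
    (large - small) / extra_branches
}

/// Linear extrapolation: a base proof size plus `extra` more branches,
/// each costing `slope` bytes.
fn extrapolate(base: u64, slope: u64, extra: u64) -> u64 {
    base + slope * extra
}
```

With Q8 = 2 656 B (k=1) and G7 = 4 332 B (k=2), `slope_per_branch(2656, 4332, 1)` gives 1 676 B. Extending G7 by eight more branches, `extrapolate(4332, 1676, 8)`, predicts 17 740 B for ten branches, which is within about 2% of the 18 022 B that the chapter reports for G8's ten outer matches.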

Proof display:

Expand to see the structured proof (8 layers — same skeleton as G5, but each brand's L8 is an ACOR boundary walk instead of a 50-target distinct-walk) — or open interactively in the visualizer ↗

GroveDBProofV1 {
  LayerProof {
    proof: Merk(... root-level descent, identical to every other chapter query ...)
    lower_layers: {
      @ => { ... contract_id descent ... }
      // L2..L4 byte-identical to G3 / G5 (the @/contract_id/0x01/widget chain)
    }
  }
  // L5 widget doctype: brand queried (same as G3 / G5 — opaque siblings 9862 / 6c36)
  // L6 byBrand merk-tree: two KVValueHash targets (brand_000 + brand_001), 25 ops
  //                       — same shape as G5's L6
  // L7a brand_000's value tree: single key `color` with NonCounted(ProvableCountTree)
  //   L8a byBrandColor color subtree under brand_000:
  //     proof: Merk(
  //       ... 36-37 ACOR boundary ops over color > color_00000500 ...
  //       18: Push(KVDigestCount(color_00000500, ..., 1))          // BOUNDARY (excluded)
  //       19..35: HashWithCount / KVDigestCount boundary walk
  //                 — same shape as Q8's L8, summing to count=499 for brand_000)
  //   end L8a
  // end L7a
  // L7b brand_001's value tree: same single-key shape, different hashes
  //   L8b byBrandColor color subtree under brand_001:
  //     proof: Merk(
  //       ... 36-37 ACOR boundary ops over color > color_00000500 ...
  //                 — same shape, different hashes, summing to count=499 for brand_001)
  //   end L8b
  // end L7b
}

The 186-line full verbatim is available via the bench's [gproof] G7 output. The schematic compresses the L1–L4 doctype prefix (byte-identical to every other 8-layer chapter query) and the two parallel L7+L8 descents (structurally identical to Q8's, with different hashes for each brand). Each brand's L8 contributes ~1 700 B of ACOR boundary commitments — exactly the predicted Q8 - L1..L5 overhead per branch.

Cryptographic guarantee (via grovedb PR #663): every per-brand count is independently committed to the merk root via node_hash_with_count. Each carrier ACOR subquery has its own hash chain back to the merk root, so a malicious prover can't misreport brand_000's count (or brand_001's) without changing the reconstructed root hash, which fails verification of the whole proof.
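The binding argument can be illustrated with a deliberately simplified model. This is not grovedb's node_hash_with_count: the sketch uses the standard library's non-cryptographic `DefaultHasher` purely to show that a count which feeds the node hash cannot change without changing the root. Both function names are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Toy stand-in for a count-binding node hash. NOT grovedb's
/// node_hash_with_count and NOT cryptographically secure: it only shows
/// that the count is an input to the hash.
fn toy_hash_with_count(key: &str, count: u64) -> u64 {
    let mut hasher = DefaultHasher::new();
    key.hash(&mut hasher);
    count.hash(&mut hasher);
    hasher.finish()
}

/// Toy root: folds the per-brand branch hashes together in the order given.
fn toy_root(branches: &[(&str, u64)]) -> u64 {
    let mut hasher = DefaultHasher::new();
    for (key, count) in branches {
        toy_hash_with_count(key, *count).hash(&mut hasher);
    }
    hasher.finish()
}
```

Recomputing `toy_root` with brand_000's count changed from 499 to 500 gives a different root than the honest inputs, which is the property a verifier relies on when it compares the reconstructed root with the trusted one.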

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree"]:::path
  BR ==> B000["brand_000: CountTree count=1000"]:::path
  BR ==> B001["brand_001: CountTree count=1000"]:::path
  B000 ==> B000_C["brand_000/color: NonCounted(ProvableCountTree)<br/>ACOR boundary walk (color > color_00000500)"]:::target
  B001 ==> B001_C["brand_001/color: NonCounted(ProvableCountTree)<br/>ACOR boundary walk (color > color_00000500)"]:::target

  SDK["Entries(2 groups, sum=998):<br/>(&quot;brand_000&quot;, 499)<br/>(&quot;brand_001&quot;, 499)"]:::sdk
  B000_C -.-> SDK
  B001_C -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 4 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

L5–L7 are exactly G5's L5–L7 (widget → byBrand → brand_X's continuation). The difference is at L8: G5 enumerates 50 distinct (brand_X, color_Y) pairs as KVValueHashFeatureTypeWithChildHash targets per brand; G7 walks the same color subtree as an ACOR boundary cut (like Q8's L8), emitting HashWithCount / KVDigestCount ops that commit a single aggregate u64 per brand.

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree"]
    direction TB
    L5_q["<b>brand</b> (queried)<br/>kv_hash=HASH[68b6...]"]:::queried
  end

  subgraph L6["Layer 6 — byBrand merk-tree (two intermediate targets)"]
    direction TB
    L6_t0["<b>brand_000</b> (queried)<br/>CountTree count=1000"]:::queried
    L6_t1["<b>brand_001</b> (queried)<br/>CountTree count=1000"]:::queried
  end

  subgraph L7a["Layer 7a — brand_000's continuation"]
    direction TB
    L7a_q["<b>color</b> (queried)<br/>NonCounted(ProvableCountTree)"]:::queried
  end
  subgraph L7b["Layer 7b — brand_001's continuation"]
    direction TB
    L7b_q["<b>color</b> (queried)<br/>NonCounted(ProvableCountTree)"]:::queried
  end

  subgraph L8a["Layer 8a — brand_000's byBrandColor: ACOR cut"]
    direction TB
    L8a_target["<b>Aggregate count = 499</b><br/>(committed via node_hash_with_count)"]:::target
    L8a_ops["~37 merk ops:<br/>KVDigestCount(color_00000500, …) — boundary excluded<br/>+ HashWithCount/KVDigestCount boundary walk<br/>over the in-range portion"]:::sibling
    L8a_target --> L8a_ops
  end
  subgraph L8b["Layer 8b — brand_001's byBrandColor: ACOR cut"]
    direction TB
    L8b_target["<b>Aggregate count = 499</b><br/>(committed via node_hash_with_count)"]:::target
    L8b_ops["~37 merk ops:<br/>same boundary shape as L8a<br/>(different hashes — different brand subtree)"]:::sibling
    L8b_target --> L8b_ops
  end

  L5_q -. "byBrand" .-> L6_t0
  L5_q -. "byBrand" .-> L6_t1
  L6_t0 -. "continuation" .-> L7a_q
  L6_t1 -. "continuation" .-> L7b_q
  L7a_q -. "carrier ACOR subquery" .-> L8a_target
  L7b_q -. "carrier ACOR subquery" .-> L8b_target

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;

The "carrier" name comes from grovedb's PR #663 terminology: a carrier query is the outer multi-key query that carries an ACOR subquery into each branch. The ACOR primitive itself is unchanged — it still walks one range over one subtree per invocation — but it can now appear as a subquery item under outer Keys, which is what enables the per-brand aggregate proof shape G7 needs.

G8 — Carrier outer Range + Range, Grouped By `brand`

select   = COUNT
where    = brand > "brand_050" AND color > "color_00000500"
group_by = [brand]
limit    = (optional; ≤ 10)
prove    = true

The platform's MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT = 10 is both the default (when the caller passes no limit) and a hard ceiling. Callers may pass a smaller limit (1 through 9) to truncate the outer walk further; passing 0 or any value > 10 is rejected with InvalidLimit. See the rationale below.

Path query (the same carrier-ACOR shape as G7, but with a range outer dimension and SizedQuery::limit bounded by the platform max):

path:                  ["@", contract_id, 0x01, "widget", "brand"]
outer query item:      RangeAfter("brand_050"..)
subquery_path:         ["color"]
subquery items:        [AggregateCountOnRange([RangeAfter("color_00000500"..)])]
SizedQuery::limit:     10  (platform default; caller may request smaller)

Verified payload (verifier returns one (in_key, u64) per in-range outer key, capped at limit, via GroveDb::verify_aggregate_count_query_per_key):

[("brand_051", 499), ("brand_052", 499), …, ("brand_060", 499)]

The bench's 100-brand fixture has 49 brands > "brand_050". The platform's default SizedQuery::limit = 10 caps the carrier at the first 10 (brand_051 … brand_060); each carries the per-brand ACOR count of 499 in-range colors (color_00000501 … color_00000999). Total sum = 10 × 499 = 4 990 documents.

Proof size: 18 022 B. Mode: CountMode::GroupByRange routed to DocumentCountMode::RangeAggregateCarrierProof (the dispatcher distinguishes G7's In-outer shape from G8's Range-outer shape by the carrier clause's operator).

G8 is G7's natural extension from "k specific outer keys" to "L outer keys from an in-range walk." Same carrier proof primitive, same node_hash_with_count commitments per branch, same one-u64-per-branch return shape. The structural differences are exactly two:

Outer dimension: G7 emits k Key(serialized_in_value) items in the carrier query; G8 emits a single RangeAfter(serialized_floor..) (or any Range* variant) and lets grovedb walk it.
Limit: G8 sets SizedQuery::limit = Some(L) where L is the smaller of the caller's request and the platform max. Per grovedb PR #664, this is the load-bearing relaxation — the predecessor PR #663 allowed Range outer items at the validator level but kept the leaf-ACOR rule rejecting SizedQuery::limit, which made unbounded range-outer carriers impractical at any reasonable dataset size (49 brands × ~1 700 B each ≈ 83 KB; with the platform default of 10 we land at 18 KB).

Why the cap exists and where the ceiling lives

The cap bounds the prove-path proof size; the ceiling is a hardcoded compile-time constant for prover/verifier-agreement reasons.

Proof-size bounding. Proof bytes scale linearly with the limit (~1 700 B per outer match, exactly as for G7). A limit of 10 keeps a G8-shaped proof under 20 KB (G8 itself measures 18 022 B), which is Tier-1 in the GroveDB Proof Visualizer's shareable-link guidance: Tier-1 (≤ 20 KB) works in every browser and link-preview surface; Tier-2 (20–50 KB) works in browsers but may be truncated in Slack/Discord previews; Tier-3 (above 50 KB) risks Safari's URL ceiling. That is enough for typical "top-N brands by an outer range" queries while avoiding pathological proof sizes. Callers that want a window above 10 entries call repeatedly with disjoint outer-range bounds; callers that want fewer pass a smaller limit (1 through 9). Limit 0 is rejected to keep the response shape non-trivial.
Prover/verifier byte-for-byte agreement. SizedQuery::limit is part of the serialized PathQuery and feeds the merk-root reconstruction; both prover and verifier must agree on its value. The caller's request carries limit over the wire, so its specific value (1..=10) is fine to vary. What can't vary is the platform's default when the caller passes nothing — that's why the ceiling is a hardcoded compile-time constant (MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT) rather than an operator-tunable runtime value. Same rationale as RangeDistinctProof's use of crate::config::DEFAULT_QUERY_LIMIT rather than drive_config.default_query_limit.
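The tier thresholds and the per-match estimate combine into a quick sizing check. This is a sketch only: `LinkTier`, `link_tier`, and `approx_carrier_bytes` are hypothetical names, the estimate ignores the shared upper layers, and it assumes 1 KB = 1 000 bytes because the guidance quoted above does not say which convention it uses.

```rust
/// Shareable-link tiers as described in the text.
#[derive(Debug, PartialEq)]
enum LinkTier {
    Tier1, // up to 20 KB
    Tier2, // 20 to 50 KB
    Tier3, // above 50 KB
}

/// Classifies a proof size, assuming 1 KB = 1 000 bytes.
fn link_tier(proof_bytes: usize) -> LinkTier {
    match proof_bytes {
        0..=20_000 => LinkTier::Tier1,
        20_001..=50_000 => LinkTier::Tier2,
        _ => LinkTier::Tier3,
    }
}

/// Rough carrier proof size for `matches` outer matches at ~1 700 B each.
/// Ignores the shared upper-layer prefix, so it slightly underestimates.
fn approx_carrier_bytes(matches: usize) -> usize {
    matches * 1_700
}
```

Ten outer matches estimate to 17 000 B (Tier-1), while walking all 49 in-range brands without a cap estimates to 83 300 B (Tier-3), which is the gap the limit exists to close.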

Caller semantics summary:

Caller `request.limit`	Server uses	Reason
`None`	10 (the platform default)	Default = ceiling
`Some(1..=10)`	the caller's value	Truncates the walk further
`Some(0)`	rejected	Non-trivial response required
`Some(11+)`	rejected	Above the ceiling
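The table translates directly into a small resolution function. This is a minimal sketch, not the rs-drive implementation: the constant's name and value come from the text, while `resolve_outer_limit`, the `LimitError` type, and the `u16` width are assumptions made for illustration.

```rust
/// Platform default and hard ceiling named in the text.
const MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT: u16 = 10;

/// Hypothetical error type for this sketch; the text only names the
/// rejection as `InvalidLimit`.
#[derive(Debug, PartialEq)]
enum LimitError {
    InvalidLimit(u16),
}

/// Maps the caller's optional limit to the limit the server would use.
fn resolve_outer_limit(requested: Option<u16>) -> Result<u16, LimitError> {
    match requested {
        // No caller limit: the default equals the ceiling.
        None => Ok(MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT),
        // 1 through 10: the caller's value truncates the walk further.
        Some(n) if (1..=MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT).contains(&n) => Ok(n),
        // 0 or anything above the ceiling is rejected.
        Some(n) => Err(LimitError::InvalidLimit(n)),
    }
}
```

Because the default is a compile-time constant, a verifier that rebuilds the path query from the same request resolves `None` to the same value the prover used.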

Complexity: O(L · (log B + log C')) where L = min(caller_limit, MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT) — L outer-key descents in the byBrand layer + L leaf-ACOR boundary walks in each brand's color subtree. Independent of how many keys the outer range could have walked without the cap.

Proof display:

Expand to see the structured proof (8 layers — same skeleton as G7, but L8 contains 10 per-brand ACOR boundary walks instead of 2) — or open interactively in the visualizer ↗

GroveDBProofV1 {
  LayerProof {
    proof: Merk(... root-level descent, identical to every other chapter query ...)
    lower_layers: {
      @ => { ... contract_id descent ... }
      // L2..L4 byte-identical to G3 / G5 / G7 (the @/contract_id/0x01/widget chain)
    }
  }
  // L5 widget doctype: brand queried (same as G3 / G5 / G7)
  // L6 byBrand merk-tree: 10 outer-key matches inlined as KVValueHash items
  //                       (brand_051 ... brand_060), each descending into its
  //                       continuation. Boundary commitments cover the
  //                       brands outside the limited window.
  // L7 brand_NNN's value tree: single key `color` with NonCounted(ProvableCountTree)
  //    — repeated 10 times, once per resolved outer brand
  // L8 brand_NNN's byBrandColor color subtree:
  //    proof: Merk(
  //      ... 36-37 ACOR boundary ops over color > color_00000500,
  //          summing to count = 499 per brand ...
  //    )
  //    — repeated 10 times in parallel, each with its own per-brand boundary hashes
}

The 618-line full verbatim is available via the bench's [gproof] G8 output. The schematic compresses the 10 parallel L7+L8 descents — they share the same template (single-key continuation + 37-op ACOR boundary walk), differing only in per-brand kv-hashes and the resulting subtree commits. Each per-brand L8 contributes ~1 700 B of ACOR boundary commitments — exactly the predicted Q8 - L1..L5 overhead per outer match, scaling linearly: 18 022 B ≈ shared upper layers + 10 × ~1 700 B ≈ 18 KB (matches the per-In slope from G7 vs Q8).

Cryptographic guarantee (via grovedb PR #663 + PR #664): every per-brand count is independently committed to the merk root via node_hash_with_count. The SizedQuery::limit is part of the serialized PathQuery and is part of the merk-root reconstruction the verifier performs — a malicious prover can't truncate the outer walk at a different point without breaking the hash chain.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree"]:::path
  BR ==> B051["brand_051: CountTree count=1000"]:::path
  BR ==> BMore["… 8 more in-range brands (brand_052 … brand_059) …"]:::path
  BR ==> B060["brand_060: CountTree count=1000"]:::path
  BR -.-> BCapped["brand_061 … brand_099<br/>(beyond platform cap — opaque subtree commitments)"]:::faded
  BR -.-> BBelow["brand_000 … brand_050<br/>(below range floor — boundary commitments)"]:::faded

  B051 ==> B051_C["brand_051/color: NonCounted(ProvableCountTree)<br/>ACOR boundary walk (color > color_00000500)"]:::target
  BMore ==> BMore_C["8 parallel ACOR walks"]:::target
  B060 ==> B060_C["brand_060/color: NonCounted(ProvableCountTree)<br/>ACOR boundary walk (color > color_00000500)"]:::target

  SDK["Entries(10 groups, sum=4 990):<br/>(&quot;brand_051&quot;, 499)<br/>(&quot;brand_052&quot;, 499)<br/>…<br/>(&quot;brand_060&quot;, 499)"]:::sdk
  B051_C -.-> SDK
  BMore_C -.-> SDK
  B060_C -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef faded fill:#21262d,color:#6e7681,stroke:#484f58;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 6 stroke:#1f6feb,stroke-width:3px;
  linkStyle 7 stroke:#1f6feb,stroke-width:3px;
  linkStyle 8 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

L5 is identical to G7's L5 (widget doctype with brand queried). L6 differs: G7 inlined 2 KVValueHash targets for the In-bearing brands; G8 inlines 10 KVValueHash targets for the in-range brands the carrier walks (brand_051 through brand_060), with boundary commitments covering both the below-floor and beyond-cap portions of the byBrand merk tree. L7 + L8 fork into 10 parallel descents, each shaped exactly like G7's L7 + L8 — same NonCounted(ProvableCountTree) continuation, same 37-op ACOR boundary walk over color > color_00000500.

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree"]
    direction TB
    L5_q["<b>brand</b> (queried)<br/>kv_hash=HASH[68b6...]"]:::queried
  end

  subgraph L6["Layer 6 — byBrand merk-tree (10 outer-range targets)"]
    direction TB
    L6_t051["<b>brand_051</b><br/>CountTree count=1000"]:::queried
    L6_tmid["… 8 more in-range targets …<br/>(brand_052 … brand_059)"]:::queried
    L6_t060["<b>brand_060</b><br/>CountTree count=1000"]:::queried
    L6_capped["Beyond-cap commitments:<br/>brand_061 … brand_099<br/>(opaque KVHash / Hash ops)"]:::sibling
    L6_floor["Below-floor commitments:<br/>brand_000 … brand_050<br/>(opaque)"]:::sibling

    L6_t051 --> L6_tmid
    L6_tmid --> L6_t060
    L6_t060 --> L6_capped
    L6_t051 --> L6_floor
  end

  subgraph L7L8["Layers 7+8 — per-brand continuation + ACOR walk (×10)"]
    direction TB
    L7L8_each["For each of brand_051 … brand_060:<br/>L7: single-key `color` continuation (NonCounted(ProvableCountTree))<br/>L8: 37 merk ops — ACOR boundary walk for color > color_00000500<br/>committing one `u64 = 499` per brand"]:::target
  end

  L5_q -. "byBrand" .-> L6_t051
  L6_t051 -. "continuation × 10" .-> L7L8_each

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;

The slope vs G7 is the proof's whole story: G7's k = 2 outer matches → ~4 KB; G8's L = 10 outer matches → ~18 KB. The per-outer-match cost (~1 700 B) is the same; only the outer-walk count changes. The platform max of 10 keeps this shape under 20 KB (Tier-1 of the visualizer's shareable-link guidance). Larger windows are unreachable without changing the constant, so callers that want more results call repeatedly with disjoint outer-range windows.

G8a — Bounded carrier + bounded ACOR, grouped by `brand`, descending

select   = COUNT
where    = brand > "brand_050" AND brand < "brand_065"
       AND color > "color_00000200" AND color < "color_00000400"
group_by = [brand]
order_by = [(brand, desc)]
prove    = true

G8a stresses three carrier-ACOR dimensions G8 didn't: a bounded outer range (instead of half-open), a bounded inner ACOR (instead of > floor), and a descending walk (instead of left-to-right ascending). All three are orthogonal. Same RangeAggregateCarrierProof mode, same path-query builder; the differences live entirely in the per-clause QueryItem variants and the carrier's left_to_right flag.

Path query (the carrier query items differ from G8 in three ways: outer item is RangeAfterTo instead of RangeAfter, inner ACOR item is RangeAfterTo instead of RangeAfter, and outer_query.left_to_right = false):

path:                  ["@", contract_id, 0x01, "widget", "brand"]
outer query item:      RangeAfterTo("brand_050".."brand_065")  // exclusive bounds
subquery_path:         ["color"]
subquery items:        [AggregateCountOnRange([RangeAfterTo("color_00000200".."color_00000400")])]
SizedQuery::limit:     10                                       // platform default
outer Query.left_to_right: false                                // from order_by [(brand, desc)]

Same-field range merging. The caller's wire shape carries four range clauses (brand >, brand <, color >, color <). The dispatcher merges each same-field pair into a single BetweenExcludeBounds clause via merge_same_field_range_pairs before mode detection runs. After merging, the structure is identical to G8's two-range shape; mode detection routes to RangeAggregateCarrierProof for the same reasons.
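The merge step can be pictured with a minimal sketch. Everything below (the Op and Clause types, and the function name merge_same_field_ranges) is a hypothetical, simplified stand-in; the real merge_same_field_range_pairs works on rs-drive's own where-clause types and may handle more operator combinations.

```rust
// Hypothetical, simplified model of same-field range merging.
// These types are illustrative only and are not rs-drive's actual types.
#[derive(Debug, Clone, PartialEq)]
enum Op {
    GreaterThan(String),
    LessThan(String),
    BetweenExcludeBounds(String, String),
}

#[derive(Debug, Clone, PartialEq)]
struct Clause {
    field: String,
    op: Op,
}

/// Merge each `field > lo` / `field < hi` pair into a single
/// `BetweenExcludeBounds(lo, hi)` clause, keeping first-seen field order.
fn merge_same_field_ranges(clauses: &[Clause]) -> Vec<Clause> {
    let mut merged: Vec<Clause> = Vec::new();
    for clause in clauses {
        // Find an already-emitted clause on the same field, if any.
        let pos = merged.iter().position(|c| c.field == clause.field);
        // Try to combine it with the incoming clause into a bounded range.
        let combined = pos.and_then(|i| match (&merged[i].op, &clause.op) {
            (Op::GreaterThan(lo), Op::LessThan(hi))
            | (Op::LessThan(hi), Op::GreaterThan(lo)) => {
                Some((i, Op::BetweenExcludeBounds(lo.clone(), hi.clone())))
            }
            _ => None,
        });
        match combined {
            Some((i, op)) => merged[i].op = op,
            None => merged.push(clause.clone()),
        }
    }
    merged
}
```

Fed G8a's four clauses (brand >, brand <, color >, color <), this sketch yields two clauses, one bounded range per field, which is the two-range shape mode detection then inspects.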

Verified payload (descending walk — outer keys come out from highest to lowest, capped at L = 10):

[("brand_064", 199), ("brand_063", 199), …, ("brand_055", 199)]

The bench's 100-brand fixture has 14 brands strictly between "brand_050" and "brand_065" (i.e. brand_051 through brand_064). The descending walk starts at brand_064 and traverses the byBrand merk tree with left_to_right = false; SizedQuery::limit = 10 halts the walk after 10 outer matches (brand_064 down to brand_055), leaving brand_051 through brand_054 unvisited. Each brand's inner ACOR over color > "color_00000200" AND color < "color_00000400" sums to 199 documents (199 colors color_00000201 … color_00000399, one document per (brand, color) pair in the fixture). Total sum = 10 × 199 = 1 990.
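The payload arithmetic above can be reproduced with a small model of the walk. This is a sketch under stated assumptions, not the bench code: it assumes brands are numbered 0..100 and colors 0..1000 (consistent with the CountTree count=1000 per brand shown in the diagrams), and the function name expected_payload is hypothetical.

```rust
/// Hypothetical model of the G8a outer walk over the bench fixture.
/// Assumes brands are numbered 0..brand_count and colors 0..color_count,
/// with exactly one document per (brand, color) pair.
fn expected_payload(
    brand_count: u32,
    color_count: u32,
    brand_bounds: (u32, u32), // exclusive (lower, upper)
    color_bounds: (u32, u32), // exclusive (lower, upper)
    limit: usize,
) -> Vec<(String, u64)> {
    // Inner ACOR: number of colors strictly inside the color window.
    let per_brand = (0..color_count)
        .filter(|c| *c > color_bounds.0 && *c < color_bounds.1)
        .count() as u64;

    (0..brand_count)
        .rev() // descending walk (left_to_right = false)
        .filter(|b| *b > brand_bounds.0 && *b < brand_bounds.1)
        .take(limit) // the limit halts the walk after `limit` outer matches
        .map(|b| (format!("brand_{:03}", b), per_brand))
        .collect()
}
```

Calling expected_payload(100, 1000, (50, 65), (200, 400), 10) gives ten entries from brand_064 down to brand_055, each with count 199.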

Proof size: 29 010 B. Mode: CountMode::GroupByRange routed to DocumentCountMode::RangeAggregateCarrierProof.

G8a is structurally G8 with three independent variant changes, each adding a small amount of merk-proof overhead but no asymptotic complexity change:

Bounded outer range → the byBrand merk tree commits both bounds (brand_050 lower-exclusive + brand_065 upper-exclusive) as boundary KVDigest ops. G8's >-only outer commits one boundary; G8a's > AND < commits two. Modest size delta (~1 extra KVDigest per bound × the carrier's tree depth).
Bounded inner ACOR → each per-brand color subtree commits both bounds as KVDigestCount ops. G8's >-only ACOR walks O(log C') boundary nodes for the lower bound; G8a's two-sided ACOR walks O(log C') for both bounds. The asymptotic stays O(L · (log B + log C')); the constant roughly doubles for the per-brand boundary walk.
Descending walk → grovedb emits PushInverted(...) op variants instead of Push(...) and walks the binary merk tree right-to-left. Same op count as ascending, slightly different serialized encoding (~1–2 bytes per op for the PushInverted opcode discriminant). The verifier's reconstruction is byte-identical given the same left_to_right flag in the PathQuery.

Total proof bytes: 29 010 B vs G8's 18 022 B, a difference of 10 988 B, or ~1 100 B per outer match. Per-outer-match overhead: ~2 800 B (G8a) vs ~1 700 B (G8). The extra ~1 100 B per branch is mostly the bounded-inner-ACOR cost: every per-brand subtree commits twice as many boundary KVDigestCount ops.
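The per-match difference can be rechecked from the two measured totals. The sketch below uses only the proof sizes and the outer-match count quoted in this section; the constant and function names are illustrative.

```rust
// Proof sizes quoted in this section (bytes) and the outer-match cap.
const G8_PROOF_BYTES: u64 = 18_022;
const G8A_PROOF_BYTES: u64 = 29_010;
const OUTER_MATCHES: u64 = 10;

/// Total size difference between the two proofs, in bytes.
fn total_delta() -> u64 {
    G8A_PROOF_BYTES - G8_PROOF_BYTES
}

/// Size difference attributed to each outer match (integer division).
fn per_match_delta() -> u64 {
    total_delta() / OUTER_MATCHES
}
```

The division spreads the whole difference across the ten outer matches, so it folds any change in the shared upper layers into the per-match figure.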

Proof display:

Structured proof (8 layers; L8 uses two-sided ACOR boundary walks per brand, `PushInverted` outer-walk ops for descending direction):

GroveDBProofV1 {
  LayerProof {
    proof: Merk(... root-level descent, identical to every other chapter query ...)
    lower_layers: {
      @ => { ... contract_id descent ... }
      // L2..L4 byte-identical to every 8-layer carrier query in this chapter
    }
  }
  // L5 widget doctype: brand queried (same as G3 / G5 / G7 / G8)
  // L6 byBrand merk-tree: walked LEFT-TO-RIGHT=FALSE (descending).
  //                       Outer query item: RangeAfterTo("brand_050".."brand_065")
  //                       Inlined targets: brand_064 → brand_063 → ... → brand_055
  //                       via `PushInverted(KVValueHash(brand_NNN, CountTree, ...))` ops.
  //                       Boundary KVDigest nodes name brand_065 (upper-exclusive cut)
  //                       and brand_050 (lower-exclusive cut, capped by SizedQuery::limit).
  // L7 brand_NNN's value tree: single key `color` with NonCounted(ProvableCountTree)
  //    — repeated 10 times, once per resolved outer brand (in descending order).
  // L8 brand_NNN's byBrandColor color subtree:
  //    proof: Merk(
  //      ... ACOR boundary walk for color > "color_00000200" AND color < "color_00000400"
  //          (two-sided cut, ~2× the boundary ops of G8's one-sided ACOR),
  //          summing to count = 199 per brand ...
  //    )
  //    — repeated 10 times in parallel, each with its own per-brand boundary hashes.
}

The 902-line full verbatim sits in the bench's [gproof] G8a output. The schematic compresses the 10 parallel L7+L8 descents and the per-brand boundary commitments — they share the same template (single-key continuation + ~50-op two-sided ACOR boundary walk), differing only in per-brand hashes and the resulting subtree commits. Each per-brand L8 contributes ~2 800 B of ACOR boundary commitments (~1.6× G8's ~1 700 B due to the two-sided range walking both bounds).

The most visually distinctive feature of the descending-walk proof: every L6 carrier op is PushInverted(...) rather than Push(...), signalling grovedb's right-to-left binary-merk-tree iteration. Identical merk-root reconstruction given the same Query.left_to_right = false flag — but the wire-level encoding diverges so the verifier knows which direction to walk.

flowchart TB
  WD["@/contract_id/0x01/widget"]:::tree
  WD ==> BR["brand: NormalTree (descending walk, left_to_right=false)"]:::path
  BR ==> B064["brand_064: CountTree count=1000"]:::path
  BR ==> BMore["brand_063 … brand_056<br/>(8 more in-range brands, descending)"]:::path
  BR ==> B055["brand_055: CountTree count=1000"]:::path
  BR -.-> BBelow["brand_051 … brand_054<br/>(in range but below cap — beyond limit, opaque)"]:::faded
  BR -.-> BAbove["brand_065 (boundary key, excluded by &lt;)"]:::faded
  BR -.-> BCapBelow["brand_000 … brand_050<br/>(below floor, opaque)"]:::faded

  B064 ==> B064_C["brand_064/color: NonCounted(ProvableCountTree)<br/>two-sided ACOR (color > 200 AND color < 400)"]:::target
  BMore ==> BMore_C["8 parallel two-sided ACOR walks<br/>(color > 200 AND color < 400)"]:::target
  B055 ==> B055_C["brand_055/color: NonCounted(ProvableCountTree)<br/>two-sided ACOR (color > 200 AND color < 400)"]:::target

  SDK["Entries(10 groups, sum=1 990) — DESCENDING:<br/>(&quot;brand_064&quot;, 199)<br/>(&quot;brand_063&quot;, 199)<br/>…<br/>(&quot;brand_055&quot;, 199)"]:::sdk
  B064_C -.-> SDK
  BMore_C -.-> SDK
  B055_C -.-> SDK

  classDef tree fill:#21262d,color:#c9d1d9,stroke:#1f6feb,stroke-width:2px;
  classDef path fill:#6e7681,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef faded fill:#21262d,color:#6e7681,stroke:#484f58;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef sdk fill:#21262d,color:#39c5cf,stroke:#39c5cf,stroke-width:2px,stroke-dasharray: 4 2;

  linkStyle 0 stroke:#1f6feb,stroke-width:3px;
  linkStyle 1 stroke:#1f6feb,stroke-width:3px;
  linkStyle 2 stroke:#1f6feb,stroke-width:3px;
  linkStyle 3 stroke:#1f6feb,stroke-width:3px;
  linkStyle 7 stroke:#1f6feb,stroke-width:3px;
  linkStyle 8 stroke:#1f6feb,stroke-width:3px;
  linkStyle 9 stroke:#1f6feb,stroke-width:3px;

Diagram: per-layer merk-tree structure (Layer 5+)

L5 is identical to G7 / G8 (widget doctype with brand queried). L6 differs from G8 in two ways: the outer query item is RangeAfterTo (bounded) rather than RangeAfter (half-open), and every op is PushInverted rather than Push because of left_to_right = false. L7 + L8 fork into 10 parallel descents, each carrying a two-sided ACOR boundary walk over color > "color_00000200" AND color < "color_00000400" instead of G8's one-sided color > "color_00000500".

flowchart TB
  subgraph L5["Layer 5 — widget doctype merk-tree"]
    direction TB
    L5_q["<b>brand</b> (queried)<br/>kv_hash=HASH[68b6...]"]:::queried
  end

  subgraph L6["Layer 6 — byBrand merk-tree (bounded outer range, descending walk, 10 targets)"]
    direction TB
    L6_t064["<b>brand_064</b><br/>PushInverted(KVValueHash …)<br/>CountTree count=1000"]:::queried
    L6_tmid["… 8 more in-range targets …<br/>(brand_063 → brand_056, descending)"]:::queried
    L6_t055["<b>brand_055</b><br/>PushInverted(KVValueHash …)<br/>CountTree count=1000"]:::queried
    L6_upper["Upper-bound commitment:<br/>KVDigest(brand_065, …) — excluded by &lt;"]:::boundary
    L6_lower["Below-cap + below-floor commitments:<br/>brand_051 … brand_054 (capped)<br/>+ brand_000 … brand_050 (below floor)<br/>(opaque KVHash / Hash ops)"]:::sibling

    L6_t064 --> L6_tmid
    L6_tmid --> L6_t055
    L6_t064 --> L6_upper
    L6_t055 --> L6_lower
  end

  subgraph L7L8["Layers 7+8 — per-brand continuation + two-sided ACOR walk (×10)"]
    direction TB
    L7L8_each["For each of brand_064 … brand_055 (descending):<br/>L7: single-key `color` continuation (NonCounted(ProvableCountTree))<br/>L8: ~50 merk ops — two-sided ACOR boundary walk<br/>for color > 200 AND color < 400<br/>committing one `u64 = 199` per brand"]:::target
  end

  L5_q -. "byBrand" .-> L6_t064
  L6_t064 -. "continuation × 10" .-> L7L8_each

  classDef queried fill:#1f6feb,color:#fff,stroke:#1f6feb,stroke-width:2px;
  classDef sibling fill:#6e7681,color:#fff,stroke:#6e7681;
  classDef target fill:#39c5cf,color:#0d1117,stroke:#39c5cf,stroke-width:3px;
  classDef boundary fill:#d29922,color:#0d1117,stroke:#d29922,stroke-width:2px,stroke-dasharray: 6 3;

The size delta between G8 and G8a, per outer match: ~1 700 B (G8) → ~2 800 B (G8a). The extra ~1 100 B per brand comes from (a) the bounded inner ACOR's second boundary walk, which dominates, and (b) the per-op PushInverted discriminant overhead, which at ~1–2 bytes per op is a minor contributor. Both costs are linear in L (the platform-max outer cap), so doubling L doubles the delta. The asymptotic complexity stays O(L · (log B + log C')): the bounded-vs-unbounded distinction is a constant-factor change in the per-walk boundary commit count, not a complexity-class change.

Reading the descending result: the SDK returns Vec<(Vec<u8>, u64)> in the same wire order grovedb walked the outer dimension. For left_to_right = false, that's lex-descending serialized brand keys (brand_064 before brand_063 before … before brand_055). Callers that expect ascending output sort the result client-side. The prove-path guarantee covers the contents (which brands and which counts), not the client-visible ordering. The walk direction is nonetheless visible in the proof bytes for this chapter's deterministic fixture, via Push vs PushInverted, so the verifier knows which direction grovedb walked.
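A caller that wants ascending output can re-sort the verified entries locally. A minimal sketch, assuming the result has already been verified and is held as the Vec<(Vec<u8>, u64)> described above; the function name sort_ascending is hypothetical.

```rust
/// Re-sort verified (group_key, count) entries into ascending key order.
/// `Vec<u8>` compares lexicographically byte by byte, which matches the
/// lex order of the serialized brand keys.
fn sort_ascending(mut entries: Vec<(Vec<u8>, u64)>) -> Vec<(Vec<u8>, u64)> {
    entries.sort_by(|a, b| a.0.cmp(&b.0));
    entries
}
```

Because the descending payload is already strictly ordered, reversing it would give the same result; sorting is shown because it does not depend on the direction of the walk.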

G8b — Two-range carrier with `group_by = [brand, color]` (rejected)

select   = COUNT
where    = brand > "brand_050" AND color > "color_00000500"
group_by = [brand, color]
prove    = true

Outcome: Err(QuerySyntaxError::InvalidWhereClauseComponents("count query supports at most one range where-clause; combine two-sided ranges via `between*` instead of separate `>`/`<` clauses, or use `group_by = [outer_range_field]` with `prove = true` for the carrier-aggregate shape with one outer range and one inner ACOR range on a different field")), raised at detect_mode's range_count > 1 short-circuit, before any index picking or path-query building.

Why. The two-range carrier shape (outer_range AND inner_range on distinct fields) is opened by mode detection only when mode == GroupByRange and group_by.len() == 1 and prove = true. G8b violates the first two: with group_by = [brand, color] the request maps to CountMode::GroupByCompound, which routes to distinct_count_path_query — a builder that knows how to walk an In + range fan-out but not a range + range cartesian product. Two design points:

GroupByCompound is specifically the (In, range) shape. Its path-query builder emits outer Key(serialized_in_value) items (one per In branch) and an inner Range* subquery; the walk is |In|-bounded by construction. Extending it to accept range + range would mean replacing the outer Keys with an outer Range* (and a SizedQuery::limit to bound the walk) and swapping the inner from "enumerate distinct values" to "single ACOR aggregate" — at which point the result shape stops being "per-distinct-value entries" and becomes "per-outer-key u64s," i.e. G8's shape with a redundant second group_by field. There's no information gain from adding color to the group_by — the carrier already commits one u64 per outer brand, and the inner range collapses into that u64 rather than being enumerated.
The carrier primitive returns one u64 per outer key, not per (outer, inner) pair. Per-distinct-color counts inside an outer-range brand walk would require the alternative RangeDistinctProof shape (the G5 compound-distinct path) running on a byBrandColor + rangeCountable: true cartesian fan-out — which works for In + range (a finite outer key set) but would explode for range + range (potentially B × C' distinct entries, dwarfing the MAX_CARRIER_AGGREGATE_OUTER_RANGE_LIMIT = 10 cap that bounds G8). The dispatcher rejects rather than silently routing to a path that'd produce a proof orders of magnitude larger than the caller likely expected.
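The gating rule described above can be sketched as a single predicate. This is a simplified stand-in, not the real detect_mode: the enum lists only the three modes named in this section, the function name check_range_count is hypothetical, and the error is a plain String rather than QuerySyntaxError.

```rust
// Simplified stand-in for the modes named in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum CountMode {
    Aggregate,
    GroupByRange,
    GroupByCompound,
}

/// Allow more than one range clause only for the carrier shape:
/// GroupByRange mode, exactly one group_by field, and prove = true.
fn check_range_count(
    mode: CountMode,
    group_by_len: usize,
    prove: bool,
    range_count: usize,
) -> Result<(), String> {
    if range_count > 1 {
        let carrier_shape =
            mode == CountMode::GroupByRange && group_by_len == 1 && prove;
        if !carrier_shape {
            // Illustrative message; the real error text is quoted above.
            return Err(
                "count query supports at most one range where-clause".to_string(),
            );
        }
    }
    Ok(())
}
```

Under this model G8 (GroupByRange, one group_by field, prove = true, two ranges) passes, while G8b (GroupByCompound, two group_by fields) is rejected before any index picking happens.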

What to use instead.

If you want per-brand totals across an in-range color window (the most common interpretation of this request), use G8 (group_by = [brand]): one u64 per brand, capped at 10 outer matches.
If you want per-(brand, color) distinct counts across both ranges, the dispatcher has no path today — you'd need a byBrandColor + rangeCountable: true index plus a new mode that extends GroupByCompound to range + range with a per-pair SizedQuery::limit. Out of scope for this contract.
If you want a single sum across the whole brand > X AND color > Y window, you'd need to call G8 and sum the returned u64s client-side (server-side aggregation across the carrier's per-branch counts isn't supported on the prove path — see G8c below).

G8c — Two-range carrier with `group_by = []` (rejected)

select   = COUNT
where    = brand > "brand_050" AND color > "color_00000500"
group_by = []
prove    = true

Outcome: same rejection as G8b — Err(QuerySyntaxError::InvalidWhereClauseComponents("count query supports at most one range where-clause; …")). Mode-detection's range_count > 1 short-circuit checks mode == GroupByRange, and the dispatcher maps group_by = [] to CountMode::Aggregate, so the check fails for the same structural reason as G8b.

Why. With no group_by the request asks for a single scalar u64 covering every document matching both ranges. The carrier-ACOR primitive emits one u64 per outer-range key (10 brands in G8's case), not a single sum across the whole walk. Two paths to a single sum, neither viable today:

Server-side sum across the carrier's branches. Would require a new grovedb primitive that takes the carrier shape and emits Σ branch_counts as a single ACOR-style aggregate. Not implemented — the carrier's commitment is per branch, which is what gives the verifier the cryptographic granularity to verify each entry independently. Summing in the server would lose that and force the verifier to trust the server's sum.
Client-side sum after running G8. Allowed and easy — call G8, get back Vec<(brand, u64)>, sum the u64s. The proof still cryptographically commits to each branch, and the client's sum is over verified data. This is the pragmatic path for "give me one number" callers; the chapter recommends it instead of opening up Aggregate for the two-range carrier shape.
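The client-side path amounts to a fold over the verified entries. A minimal sketch, assuming the entries have already passed proof verification; the function name sum_verified_counts is hypothetical, and the checked addition is a defensive choice rather than something the SDK is known to require.

```rust
/// Sum per-group counts from an already-verified carrier result.
/// Returns None if the total would overflow u64.
fn sum_verified_counts(entries: &[(Vec<u8>, u64)]) -> Option<u64> {
    entries
        .iter()
        .try_fold(0u64, |acc, (_, count)| acc.checked_add(*count))
}
```

For a ten-entry result with 199 documents per brand (G8a's payload shape), the sum is 1 990, computed entirely over counts the proof commits to individually.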

The deeper reason Aggregate can't shortcut this. Per chapter 29's Q7 (Range Aggregate byColor), Aggregate + single range uses the leaf-level AggregateCountOnRange primitive directly, which DOES return a single u64. That works because the range is rooted at the index's terminator property — there's a single CountTree under which the boundary walk runs. With G8c's two ranges, the outer range walks the byBrand merk tree (no ProvableCountTree involved) and only the inner range hits the rangeCountable terminator. Collapsing across the outer walk would mean a ProvableCountTree over CountTrees, which grovedb's primitive set doesn't have. The walk could in principle compute and emit a sum at the outer layer, but the verifier wouldn't be able to recompute the per-branch counts to check the sum — defeating the prove-path's whole point.

Future Work

This chapter now mirrors chapter 29's per-query structure: every section above carries a path query, verified payload, proof size, verbatim or schematic proof display, narrative, conceptual flowchart, and per-layer merk-tree diagram.

Two pieces of infrastructure made this possible:

query_g1_* … query_g6_* criterion bench_function calls in document_count_worst_case.rs — produce the Avg time column in Queries in this Chapter.
display_group_by_proofs (a sibling of display_proofs in the same bench file) — emits each group_by shape's verbatim merk-proof structure via bincode decode + GroveDBProof::Display. Tagged with [gproof] prefix in stderr so reviewers can grep deterministically.

Open follow-ups:

Inline the full G4 / G5 / G1b verbatim rather than the schematic-with-elision form. The bench captures every byte; the chapter's <details> blocks currently summarise the 100-target enumerations because reproducing 100 near-identical KVValueHashFeatureTypeWithChildHash lines per case is more noise than signal. If a reader needs byte-exact output, they can run the bench and grep [gproof].
Wire path-query reconstruction + verified-payload printing into display_group_by_proofs. Today it only dumps the proof-display block; chapter 29's display_proofs also reconstructs the PathQuery and prints the verifier's structured result (the verified: block). Adding that to the group_by side would give the chapter parity with chapter 29's verified: sections — currently rendered manually from the [matrix] output's Entries(len=N, sum=M) figures.
A high-fanout byColor variant of G1b (color IN [100 values], group_by = [color]) — captured implicitly in the bench's existing group_by_color_in_proof_100_rangecountable_branches (10 512 B) but not given its own G* section, since it's structurally G1b with ProvableCountTree overhead.

Cross-Reference to Chapter 29

For background on the building blocks every query in this chapter uses:

Document Count Trees — CountTree / ProvableCountTree / NormalTree mechanics.
Count Index Examples § How To Read The Proofs — the four-section per-query template plus the LayerProof / Merk / Push / Parent / Child op grammar.
Count Index Examples § Worked Example: How node_hash_with_count Rebuilds the Merk Root — exact Blake3 formulas underpinning every count proof in either chapter.

The path-query builder (packages/rs-drive/src/query/drive_document_count_query/path_query.rs) and verifier mirror (packages/rs-drive/src/verify/document_count/) live in the same modules for both chapters' queries — the only difference is which point_lookup_* / aggregate_* / group_by_* function the dispatcher calls based on the CountMode carried in the request.

The Dash Platform Book