Indexes
Drive stores documents in GroveDB. Every document type has a primary-key tree (documents keyed by document ID), plus zero or more secondary indexes the contract author declares in the document schema. This chapter is a reference for the Index struct's fields, what they mean for the on-disk layout, and how Drive walks indexes during inserts and queries.
What an Index Is
A document type's indices array tells Drive: "for queries that filter or sort by these properties, build a sorted lookup so they don't have to enumerate every document." Each entry in indices becomes one secondary index; Drive maintains it on every insert/update/delete so that queries which match the index prefix are O(prefix walk) rather than O(documents).
Concrete example. Given:
{
"person": {
"type": "object",
"indices": [
{
"name": "byLastName",
"properties": [{ "lastName": "asc" }]
}
],
"properties": {
"firstName": { "type": "string", "position": 0 },
"lastName": { "type": "string", "position": 1 }
},
"required": ["firstName", "lastName"],
"additionalProperties": false
}
}
With this schema, a query like where lastName = "Smith" reaches the matching documents through the byLastName index in O(log n) plus per-result I/O. Without that index it would require a full document-type scan.
The Index Struct
The compiled-Rust shape — the JSON schema fields are deserialized into this — lives in packages/rs-dpp/src/data_contract/document_type/index/mod.rs:
pub struct Index {
    pub name: String,
    pub properties: Vec<IndexProperty>,
    pub unique: bool,
    pub null_searchable: bool,
    pub contested_index: Option<ContestedIndexInformation>,
    pub countable: IndexCountability,
}

pub struct IndexProperty {
    pub name: String,
    pub ascending: bool,
}
name
A short, human-readable identifier for the index (e.g. "byOwnerAndType"). It shows up in error messages and is the key used in document_type.indexes() (BTreeMap<String, Index>). If omitted in the schema, a random alphanumeric name is generated. Two indexes within the same document type cannot share a name.
properties: Vec<IndexProperty>
The ordered list of columns this index covers. Each IndexProperty is a (name, ascending) pair. Order matters: a query has to match a prefix of these properties for the index to be useful. An index [lastName, firstName] answers where lastName = X and where lastName = X AND firstName = Y but not where firstName = Y alone.
The schema form is:
"properties": [
{ "lastName": "asc" },
{ "firstName": "asc" }
]
asc / desc controls sort order on result enumeration. Drive currently only uses ascending storage, but the field is preserved through the contract.
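The prefix rule described above can be sketched as a small predicate. This is a minimal illustration, not Drive's index picker: the function name `covers_equality_query` is hypothetical, and it assumes the caller already lists the query's equality-filtered properties in index order.

```rust
/// Returns true when the equality-filtered properties of a query form a
/// non-empty prefix of the index's property list, in declaration order.
/// Both slices are illustrative inputs, not Drive types.
pub fn covers_equality_query(index_properties: &[&str], query_properties: &[&str]) -> bool {
    !query_properties.is_empty()
        && query_properties.len() <= index_properties.len()
        && index_properties
            .iter()
            .zip(query_properties)
            .all(|(indexed, queried)| indexed == queried)
}
```

With an index [lastName, firstName], the predicate accepts [lastName] and [lastName, firstName] and rejects [firstName] alone, matching the three cases listed above.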
unique: bool
If true, no two documents may share the same combination of values for the indexed properties. The platform enforces this on insert: a duplicate trips a DuplicateUniqueIndexError consensus error.
A unique index changes the on-disk layout at the terminal level: instead of a sub-tree of document references keyed by document ID, the terminal stores a single bare Reference element pointing at the one document that matched. See Tree Type at the Terminal Level below.
Uniqueness can't be enforced when an indexed property is null, so a document with any null in the index path falls back to the non-unique storage shape for that document. See Null Handling.
null_searchable: bool
Defaults to true. Controls what happens when all indexed properties of a document are null:
- With null_searchable: true, the document is still indexed at the all-null path, so a query against the all-null prefix can find it.
- With null_searchable: false, Drive skips the index insertion entirely. Documents with all-null index values exist (in the primary-key tree) but are not reachable via this index.
The flag only affects the all-null case. A document with some null values gets indexed regardless.
contested_index: Option<ContestedIndexInformation>
When set, this index identifies a scarce, contested resource (the canonical example is a DPNS name like dash). Documents trying to register the same value under a contested index don't auto-fail with a uniqueness error — they enter a masternode-vote resolution where each contender's claim is held until voting concludes. Contested indexes must also be unique: true; the parser rejects the combination otherwise.
Out of scope for this chapter; see DPNS / contested-resource docs for the full lifecycle.
countable: IndexCountability
Controls whether the terminal tree under each indexed value carries a count, and which count-tree variant. Three variants:
| Value | Tree variant | Capabilities |
|---|---|---|
| NotCountable (default) | NormalTree | No count fast path |
| Countable | CountTree | O(1) totals at the root |
| CountableAllowingOffset | ProvableCountTree | O(1) totals plus per-node counts that will enable future O(log n) range / offset queries |
The schema accepts both the legacy boolean form (true → Countable, false → NotCountable) and the camelCase string form ("notCountable" / "countable" / "countableAllowingOffset"). For the full design rationale see Document Count Trees.
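As a rough sketch of the two accepted schema forms, the mapping could look like the following. The `RawCountable` enum and `parse_countable` function are hypothetical stand-ins for whatever the real parser uses, and the sketch assumes an omitted field means NotCountable, the default given in the table above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IndexCountability {
    NotCountable,
    Countable,
    CountableAllowingOffset,
}

/// Hypothetical stand-in for the raw schema value of `countable`.
pub enum RawCountable<'a> {
    Bool(bool),
    Text(&'a str),
}

/// Maps the legacy boolean form and the camelCase string form onto the enum.
/// `None` models an omitted field and yields the default (assumption).
pub fn parse_countable(raw: Option<RawCountable<'_>>) -> Result<IndexCountability, String> {
    match raw {
        None => Ok(IndexCountability::NotCountable),
        Some(RawCountable::Bool(true)) => Ok(IndexCountability::Countable),
        Some(RawCountable::Bool(false)) => Ok(IndexCountability::NotCountable),
        Some(RawCountable::Text("notCountable")) => Ok(IndexCountability::NotCountable),
        Some(RawCountable::Text("countable")) => Ok(IndexCountability::Countable),
        Some(RawCountable::Text("countableAllowingOffset")) => {
            Ok(IndexCountability::CountableAllowingOffset)
        }
        Some(RawCountable::Text(other)) => Err(format!("unknown countable value: {}", other)),
    }
}
```

The error branch for unknown strings is a design choice of this sketch; how the real parser reports an unrecognized value is not covered in this chapter.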
How Drive Builds the IndexLevel Trie
The flat list of Indexes declared on a document type is compiled, at contract-load time, into an IndexLevel trie (packages/rs-dpp/src/data_contract/document_type/index_level/mod.rs):
pub struct IndexLevel {
    sub_index_levels: BTreeMap<String, IndexLevel>,
    has_index_with_type: Option<IndexLevelTypeInfo>,
    level_identifier: u64,
}
Each property name in any index becomes an edge in this trie; indexes that share a prefix share their initial path. An index "terminates" at a level by setting has_index_with_type = Some(...) — that's how the recursive insert / lookup code knows it's at the last property of a defined index, vs. just walking through a shared prefix.
Given two indexes:
- byOwnerAndType = [ownerId, docType]
- byOwnerAndStatus = [ownerId, status]
the trie that gets built is:
flowchart TD
Root["(root)"]
Owner["<b>ownerId</b><br/><i>shared prefix</i>"]
DocType["<b>docType</b><br/>terminates <code>byOwnerAndType</code><br/><i>has_index_with_type = Some(...)</i>"]
Status["<b>status</b><br/>terminates <code>byOwnerAndStatus</code><br/><i>has_index_with_type = Some(...)</i>"]
Root --> Owner
Owner --> DocType
Owner --> Status
style DocType fill:#e0f7fa,stroke:#006064,color:#000
style Status fill:#e0f7fa,stroke:#006064,color:#000
The ownerId level is shared between both indexes. The docType and status levels each set has_index_with_type on themselves with their own unique / countable / null_searchable flags. A third index that also started with ownerId would attach further sub-levels under ownerId instead of duplicating it.
This trie shape directly mirrors the GroveDB path shape used at insert / query time.
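To make the shared-prefix behavior concrete, here is a simplified sketch of how such a trie could be built. `TrieLevel` and `build_trie` are hypothetical names; the real `IndexLevel` stores an `IndexLevelTypeInfo` and a level identifier, which this toy replaces with just the terminating index's name.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for `IndexLevel`: only the terminating index's
/// name is kept, instead of the full type info.
#[derive(Debug, Default)]
pub struct TrieLevel {
    pub sub_levels: BTreeMap<String, TrieLevel>,
    pub terminates: Option<String>,
}

/// Builds the trie from `(index name, ordered property names)` pairs.
/// Indexes that share a leading property reuse the same edge.
pub fn build_trie(indexes: &[(&str, Vec<&str>)]) -> TrieLevel {
    let mut root = TrieLevel::default();
    for (index_name, properties) in indexes {
        let mut level = &mut root;
        for property in properties {
            // Descend, creating the edge only if it does not exist yet.
            level = level.sub_levels.entry(property.to_string()).or_default();
        }
        // Mark the level reached by the last property as terminating.
        level.terminates = Some(index_name.to_string());
    }
    root
}
```

Feeding it byOwnerAndType and byOwnerAndStatus yields a single ownerId edge at the root with two terminating children, the same shape as the diagram above.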
GroveDB Layout
A document under contract C of type T with index property propA = vA, propB = vB lives at the grove path:
[ DataContractDocuments, contract_id, 1, doc_type_name,
propA_name, vA, propB_name, vB, 0 → <terminal element> ]
Let's break that down:
- DataContractDocuments: root tree byte (u8 constant) for "this is a document index, not a contract definition or identity record".
- contract_id: 32-byte contract identifier.
- 1: separator distinguishing the document storage area from the contract definition area within contract_id.
- doc_type_name: UTF-8 bytes of the document type ("person", "contactRequest", etc.).
- propA_name, vA, propB_name, vB: alternating property key and serialized value, one pair per index property, in declaration order.
- 0: the conventional "terminal slot" byte under each value level; it's where the actual reference (or sub-tree-of-references) lives.
The intermediate levels (propA_name, vA, propB_name, vB) are all NormalTrees. The terminal element at [0] varies — see the next section.
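The path shape can be sketched as a function that assembles the segments in order. This is illustrative only: `index_path` and `TERMINAL_KEY` are hypothetical names, the root tree byte is passed in as a parameter because its concrete value is not given in this chapter, and values are assumed to be already serialized by the caller.

```rust
/// Key of the terminal slot under the last value level.
pub const TERMINAL_KEY: [u8; 1] = [0];

/// Builds the path segments down to the last index value level.
/// `root_tree_key` stands in for the `DataContractDocuments` root byte.
pub fn index_path(
    root_tree_key: u8,
    contract_id: [u8; 32],
    document_type_name: &str,
    property_values: &[(&str, Vec<u8>)],
) -> Vec<Vec<u8>> {
    let mut path = vec![
        vec![root_tree_key],
        contract_id.to_vec(),
        vec![1], // documents area inside the contract subtree
        document_type_name.as_bytes().to_vec(),
    ];
    // One (property name, serialized value) pair per index property,
    // in declaration order.
    for (name, serialized_value) in property_values {
        path.push(name.as_bytes().to_vec());
        path.push(serialized_value.clone());
    }
    path
}
```

The terminal element is then looked up under `TERMINAL_KEY` at the returned path; what kind of element lives there is the subject of the terminal-level section.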
Concretely, suppose a widget-store contract has document type widget with a non-unique countable index byColor = [color], and three documents stored: A and B with color = "red", C with color = "blue". Then Drive lays out:
flowchart TD
R["<b>DataContractDocuments</b><br/>(root tree byte)"]
CID["<b>widget-store-id</b><br/>(32 bytes)"]
Sep["<b>1</b><br/>(documents area)"]
DT["<b>'widget'</b><br/>(document type)"]
PK["primary-key tree<br/>(docs keyed by ID)<br/><i>not shown — own subtree</i>"]
Color["<b>'color'</b><br/>(index property name)"]
Red["<b>'red'</b><br/>(serialized value)"]
Blue["<b>'blue'</b>"]
RedT["<b>[0]: CountTree</b><br/>count = 2"]
BlueT["<b>[0]: CountTree</b><br/>count = 1"]
RA(["<b>doc_id_A</b><br/>Reference"])
RB(["<b>doc_id_B</b><br/>Reference"])
RC(["<b>doc_id_C</b><br/>Reference"])
R --> CID --> Sep --> DT
DT --> PK
DT --> Color
Color --> Red --> RedT
Color --> Blue --> BlueT
RedT --> RA
RedT --> RB
BlueT --> RC
classDef countTree fill:#fff4e5,stroke:#bf6900,color:#000
classDef reference fill:#e8f5e9,stroke:#1b5e20,color:#000
classDef placeholder fill:#f5f5f5,stroke:#888,color:#000
class RedT,BlueT countTree
class RA,RB,RC reference
class PK placeholder
Legend: rectangles are tree-type elements (intermediate NormalTrees holding sub-keys, terminal CountTrees holding per-doc refs); rounded green nodes are Reference elements (leaf pointers to documents). Amber highlight marks count-bearing trees specifically.
A query like where color = "red" walks the path down to the red value, opens the [0] CountTree, and either reads count_value (for GetDocumentsCount) or enumerates the inner references (for GetDocuments). Because the count is stored on the wrapping element, the count read is O(1) regardless of how many docs are inside.
Shared-Prefix Indexes
Now extend the same widget document type with a shape property and a second, compound, countable index byColorShape = [color, shape]. The IndexLevel trie is:
flowchart TD
Root["(root)"]
ColorLevel["<b>color</b><br/>terminates <code>byColor</code><br/><i>has_index_with_type = Some(...)</i><br/>+ sub-level <code>shape</code>"]
ShapeLevel["<b>shape</b><br/>terminates <code>byColorShape</code><br/><i>has_index_with_type = Some(...)</i>"]
Root --> ColorLevel --> ShapeLevel
style ColorLevel fill:#e0f7fa,stroke:#006064,color:#000
style ShapeLevel fill:#e0f7fa,stroke:#006064,color:#000
Importantly, color is a level that both terminates an index (byColor) and has a sub-level (shape) continuing past it. That dual role shows up directly in the on-disk path: at every [..., color, <value>] subtree, key [0] (the byColor terminal) and key 'shape' (the continuation into byColorShape) live as siblings.
With three documents — A: (red, circle), B: (red, square), C: (blue, square) — the layout is:
flowchart TD
DT["<b>'widget'</b><br/>(document type)"]
ColorKey["<b>'color'</b><br/>(index property)"]
Red["<b>'red'</b>"]
Blue["<b>'blue'</b>"]
%% byColor terminals (at the color-value level)
RedColorT["<b>[0]: CountTree</b><br/>count = 2<br/><i>byColor</i>"]
BlueColorT["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColor</i>"]
%% byColorShape continuation (sibling to [0] under each color value)
RedShape["<b>'shape'</b>"]
BlueShape["<b>'shape'</b>"]
RedCircle["<b>'circle'</b>"]
RedSquare["<b>'square'</b>"]
BlueSquare["<b>'square'</b>"]
%% byColorShape terminals
RCT["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColorShape</i>"]
RST["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColorShape</i>"]
BST["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColorShape</i>"]
%% References — each indexed path stores its own reference, so docs appear
%% multiple times across the diagram (same key, same doc id, but a separate
%% Reference element under each terminal that matches the document).
RA1(["<b>doc_id_A</b><br/>Reference"])
RB1(["<b>doc_id_B</b><br/>Reference"])
RC1(["<b>doc_id_C</b><br/>Reference"])
RA2(["<b>doc_id_A</b><br/>Reference"])
RB2(["<b>doc_id_B</b><br/>Reference"])
RC2(["<b>doc_id_C</b><br/>Reference"])
DT --> ColorKey
ColorKey --> Red
ColorKey --> Blue
Red --> RedColorT
Red --> RedShape
Blue --> BlueColorT
Blue --> BlueShape
RedColorT --> RA1
RedColorT --> RB1
BlueColorT --> RC1
RedShape --> RedCircle
RedShape --> RedSquare
BlueShape --> BlueSquare
RedCircle --> RCT --> RA2
RedSquare --> RST --> RB2
BlueSquare --> BST --> RC2
classDef countTree fill:#fff4e5,stroke:#bf6900,color:#000
classDef reference fill:#e8f5e9,stroke:#1b5e20,color:#000
class RedColorT,BlueColorT,RCT,RST,BST countTree
class RA1,RB1,RC1,RA2,RB2,RC2 reference
Two things to notice:
- [0] and the sub-property name ('shape') are siblings under each color value. The [0] count tree is the byColor terminal at that color value; the 'shape' subtree is the continuation that byColorShape walks past for the next index property. Drive descends one or the other depending on which index covers the query.
- The same document is stored as a separate Reference under every index path that matches it. Doc A appears under byColor[red] and under byColorShape[red, circle]; doc B under byColor[red] and byColorShape[red, square]; doc C under byColor[blue] and byColorShape[blue, square]. That's why each of A, B, C shows up twice in the diagram, once per index that covers the document. Insert/delete touches all of them; queries walk only the one path their picker selected.
A query like where color = "red" resolves through the byColor terminal ([0] under red) — count = 2, O(1). A query like where color = "red" AND shape = "circle" resolves through byColorShape instead, taking the 'shape' sub-tree past red and reading the terminal under circle — count = 1, also O(1). Both queries are served by the same shared-prefix layout, just descending different branches at the red node.
Range-Countable Indexes
Status: design. Not yet implemented at the time of writing. Depends on a parallel grovedb change that adds NonCounted<ElementType> element variants: element types that behave exactly like their counterparts except that their count value is not propagated to the parent count tree, and which are only insertable inside a CountTree / ProvableCountTree / CountSumTree / ProvableCountSumTree.
range_countable is a separate per-index property from countable. Where countable makes the count of docs at one specific value O(1), range_countable makes the count of docs between two values O(log n) — answering queries like "how many widgets have a color between red and tomato alphabetically" without enumerating every distinct color value.
Constraints
- range_countable: true requires countable to be Countable or CountableAllowingOffset. It is additive to countability, not a replacement: range queries are useful only on indexes you'd already want to count by.
- The combination is meaningful only on non-unique indexes (or unique indexes whose entries can be null-bearing), for the same reason countable is mostly inert on unique-with-required-fields: a unique non-null terminal is a bare Reference, with no tree to hang per-node counts off of.
- Sibling sub-trees that share a prefix with a range-countable index (e.g., the 'shape' continuation when byColor is range-countable but byColorShape shares its color prefix) must use NonCounted<*> variants so their counts do not pollute the range-countable value tree's count.
Mechanism
Where today's countable upgrades only the terminal [0] element under each indexed value to a count tree, range_countable additionally upgrades two more levels:
| Level | Without range_countable | With range_countable |
|---|---|---|
| Property-name tree (e.g. 'color') | NormalTree | ProvableCountTree |
| Value tree (e.g. 'red', 'blue') | NormalTree | CountTree |
| Terminal at [0] under each value | NormalTree / CountTree / ProvableCountTree (per countable) | unchanged, still driven by countable |
| Sibling continuations inside the value tree (e.g. 'shape' for a compound index sharing the prefix) | NormalTree | NonCounted<NormalTree> |
The property-name tree is a ProvableCountTree rather than a plain CountTree because the merk-tree internal-node counts are exactly what makes range queries O(log n): walk the boundary path between the lower and upper bound, sum sub-counts at each internal node along the way. (See Document Count Trees for the underlying mechanic.)
The value trees become CountTrees because the property-name ProvableCountTree's aggregate is computed by summing each value tree's count_value. For that aggregate to mean "total docs at this property" rather than "number of distinct values", each value tree's count_value must equal "docs at this exact value" — which is only true if (a) the terminal [0] CountTree contributes its doc count, and (b) every sibling under the value tree (continuation sub-property names like 'shape', etc.) contributes zero rather than the default 1-per-Tree. That's what NonCounted<NormalTree> is for.
Layout
Same byColor + byColorShape example as before, with the same three documents (A: (red, circle), B: (red, square), C: (blue, square)), but now byColor.range_countable: true:
flowchart TD
DT["<b>'widget'</b><br/>(document type)<br/>NormalTree"]
ColorKey["<b>'color'</b><br/><b><i>ProvableCountTree</i></b><br/>count = 3"]
Red["<b>'red'</b><br/><b><i>CountTree</i></b><br/>count = 2"]
Blue["<b>'blue'</b><br/><b><i>CountTree</i></b><br/>count = 1"]
%% byColor terminals (unchanged shape — same as before)
RedColorT["<b>[0]: CountTree</b><br/>count = 2<br/><i>byColor terminal</i>"]
BlueColorT["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColor terminal</i>"]
%% byColorShape continuation — now NonCounted to avoid double-counting
RedShape["<b>'shape'</b><br/><b><i>NonCounted<NormalTree></i></b>"]
BlueShape["<b>'shape'</b><br/><b><i>NonCounted<NormalTree></i></b>"]
RedCircle["<b>'circle'</b><br/>NormalTree"]
RedSquare["<b>'square'</b><br/>NormalTree"]
BlueSquare["<b>'square'</b><br/>NormalTree"]
%% byColorShape terminals
RCT["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColorShape</i>"]
RST["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColorShape</i>"]
BST["<b>[0]: CountTree</b><br/>count = 1<br/><i>byColorShape</i>"]
%% References (one per matching index path, per the earlier section)
RA1(["<b>doc_id_A</b><br/>Reference"])
RB1(["<b>doc_id_B</b><br/>Reference"])
RC1(["<b>doc_id_C</b><br/>Reference"])
RA2(["<b>doc_id_A</b><br/>Reference"])
RB2(["<b>doc_id_B</b><br/>Reference"])
RC2(["<b>doc_id_C</b><br/>Reference"])
DT --> ColorKey
ColorKey --> Red
ColorKey --> Blue
Red --> RedColorT
Red --> RedShape
Blue --> BlueColorT
Blue --> BlueShape
RedColorT --> RA1
RedColorT --> RB1
BlueColorT --> RC1
RedShape --> RedCircle
RedShape --> RedSquare
BlueShape --> BlueSquare
RedCircle --> RCT --> RA2
RedSquare --> RST --> RB2
BlueSquare --> BST --> RC2
classDef provableCount fill:#ede7f6,stroke:#311b92,color:#000
classDef countTree fill:#fff4e5,stroke:#bf6900,color:#000
classDef nonCounted fill:#f3e5f5,stroke:#6a1b9a,color:#000,stroke-dasharray:5 5
classDef reference fill:#e8f5e9,stroke:#1b5e20,color:#000
class ColorKey provableCount
class Red,Blue,RedColorT,BlueColorT,RCT,RST,BST countTree
class RedShape,BlueShape nonCounted
class RA1,RB1,RC1,RA2,RB2,RC2 reference
Legend additions for this diagram: purple = ProvableCountTree; amber = CountTree; dashed lavender = NonCounted<*> (the new grovedb variants); rounded green = Reference.
Walking through how the counts add up:
- 'red' (CountTree, count=2): its children are [0] (CountTree, contributes its count_value = 2) and 'shape' (NonCounted<NormalTree>, contributes 0, which is the whole point of the new variant). Aggregate = 2. ✓
- 'blue' (CountTree, count=1): same shape, 1 doc + 0. ✓
- 'color' (ProvableCountTree, count=3): its children are 'red' (CountTree, contributes 2) and 'blue' (CountTree, contributes 1). Aggregate = 3. The provable variant additionally stores per-internal-node counts inside its merk structure, which is what enables the range walk.
If 'shape' were a plain NormalTree instead of NonCounted<NormalTree>, it would contribute 1 to 'red' (every non-count-tree element contributes 1 by default — see Document Count Trees § How Counts Aggregate). Then 'red' would read as 3, 'blue' as 2, 'color' as 5 — a count of "docs + sub-property-trees", not "docs". The NonCounted<*> variant exists exactly to fix this.
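The arithmetic above can be replayed with a toy model of the aggregation rule. The `Node` enum below is not GroveDB's `Element` type; it is a hypothetical reduction that keeps only what matters for counting, and it folds ProvableCountTree into `CountTree` since both contribute their count value.

```rust
/// Toy model of the element kinds discussed above.
pub enum Node {
    Reference,
    NormalTree(Vec<Node>),
    CountTree(Vec<Node>),
    NonCounted(Box<Node>),
}

impl Node {
    /// What this element adds to a parent count tree's aggregate.
    pub fn contribution(&self) -> u64 {
        match self {
            // Non-count elements contribute 1 by default.
            Node::Reference | Node::NormalTree(_) => 1,
            // A count tree contributes its own aggregate.
            Node::CountTree(children) => {
                children.iter().map(Node::contribution).sum::<u64>()
            }
            // The wrapper suppresses the contribution entirely.
            Node::NonCounted(_) => 0,
        }
    }
}

/// The 'red' value tree: the [0] terminal holding docs A and B, plus
/// whatever element is used for the 'shape' continuation.
pub fn red_value_tree(shape: Node) -> Node {
    Node::CountTree(vec![
        Node::CountTree(vec![Node::Reference, Node::Reference]),
        shape,
    ])
}
```

Passing a wrapped continuation gives an aggregate of 2 for 'red'; passing a plain `NormalTree` gives 3, which is the over-count described above.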
Query — "count between two values"
With the layout above, a query like WHERE color BETWEEN 'red' AND 'tomato' resolves at the 'color' ProvableCountTree level:
- Walk the merk tree from the root of 'color', finding the boundary node between 'red' (lower bound) and 'tomato' (upper bound). This takes O(log of the number of distinct color values).
- At each step, decide what to do with the off-boundary subtree using its pre-computed count: include its full count_value (subtree fully inside the range), exclude (fully outside), or recurse (straddles the boundary).
- Sum the contributions; the result is the count of all docs whose color falls in [red, tomato].
No leaf-level enumeration of distinct color values, no enumeration of individual documents — the count is computed entirely from the tree's pre-aggregated structure.
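The boundary walk can be illustrated with a toy binary tree whose nodes carry pre-aggregated subtree counts. This is only a sketch of the counting idea: `CountNode`, `build` and `count_range` are hypothetical, the tree is built balanced from sorted input rather than maintained as a merk tree, and no hashing or proof generation is modeled.

```rust
/// Toy tree keyed by value; `total` is the count of the whole subtree.
pub struct CountNode {
    key: String,
    own: u64,
    total: u64,
    left: Option<Box<CountNode>>,
    right: Option<Box<CountNode>>,
}

fn total(node: &Option<Box<CountNode>>) -> u64 {
    node.as_ref().map_or(0, |n| n.total)
}

/// Builds a balanced tree from `(value, docs at that value)` entries
/// that are already sorted by value.
pub fn build(entries: &[(&str, u64)]) -> Option<Box<CountNode>> {
    if entries.is_empty() {
        return None;
    }
    let mid = entries.len() / 2;
    let left = build(&entries[..mid]);
    let right = build(&entries[mid + 1..]);
    let (key, own) = entries[mid];
    let total = own + total(&left) + total(&right);
    Some(Box::new(CountNode { key: key.to_string(), own, total, left, right }))
}

/// Sums counts of keys below `bound`, visiting one node per level.
fn count_below(root: &Option<Box<CountNode>>, bound: &str, inclusive: bool) -> u64 {
    let mut current = root;
    let mut sum = 0;
    while let Some(node) = current {
        let key = node.key.as_str();
        if key < bound || (inclusive && key == bound) {
            // The whole left subtree is in range: take its stored total.
            sum += total(&node.left) + node.own;
            current = &node.right;
        } else {
            current = &node.left;
        }
    }
    sum
}

/// Counts docs whose value lies in the inclusive range [lower, upper].
pub fn count_range(root: &Option<Box<CountNode>>, lower: &str, upper: &str) -> u64 {
    count_below(root, upper, true).saturating_sub(count_below(root, lower, false))
}
```

Each call to `count_below` follows a single root-to-leaf path, so the range count touches a number of nodes proportional to the tree height rather than to the number of distinct values.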
Compound indexes
range_countable: true on a compound index applies at the index's terminating level (its last property). For byColorShape = [color, shape] with range_countable: true:
- 'shape' (the property-name tree under each color value) becomes a ProvableCountTree.
- Each 'circle' / 'square' value tree becomes a CountTree.
- Documents are referenced as Element::Reference leaves under those CountTrees, contributing 1 each to the count aggregate.
When the compound's leading prefix is also indexed by another range_countable index (e.g. byColor is also range_countable), sibling continuations under each color CountTree are wrapped with Element::NonCounted so a doc routed via byColorShape doesn't double-count under byColor's color aggregate. The walker (add_indices_for_index_level_for_contract_operations) threads a parent_value_tree_is_range_countable flag down the recursion to decide when to wrap, regardless of whether the inner tree is itself a ProvableCountTree, CountTree, or plain NormalTree.
End-to-end coverage in range_countable_index_e2e_tests (in packages/rs-drive/src/drive/contract/insert/insert_contract/v0/mod.rs) pins the storage layout against a real grovedb — including the count_tree_value_count_excludes_compound_continuation_via_non_counted test that proves NonCounted-wrapping is load-bearing for compound-index correctness.
Tree Type at the Terminal Level
The decision happens in add_reference_for_index_level_for_contract_operations/v0/mod.rs:
if !index_type.index_type.is_unique() || any_fields_null {
    // Non-unique branch: insert an empty tree at [0], then put
    // each document's reference inside that tree. The tree's variant
    // is governed by `countable`:
    //   NotCountable            → NormalTree
    //   Countable               → CountTree
    //   CountableAllowingOffset → ProvableCountTree
} else {
    // Unique branch: store a single Reference element at [0] directly.
}
So the matrix:
| unique | any_fields_null | countable | What lives at [0] |
|---|---|---|---|
| false | (any) | NotCountable | empty NormalTree containing per-doc references |
| false | (any) | Countable | empty CountTree containing per-doc references |
| false | (any) | CountableAllowingOffset | empty ProvableCountTree containing per-doc references |
| true | false | (any) | bare Reference to the one matching document |
| true | true | NotCountable | empty NormalTree containing per-doc references |
| true | true | Countable | empty CountTree containing per-doc references |
| true | true | CountableAllowingOffset | empty ProvableCountTree containing per-doc references |
Note the last three rows: a unique index does go through the count-tree branch when any indexed field is null. That's why countable on a unique index is meaningful exactly when at least one of the indexed properties is optional in the schema.
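The matrix can be expressed as a small pure function, which may be easier to reason about than the table. `TerminalShape` and `terminal_shape` are hypothetical names for this sketch; only the branching condition is taken from the excerpt above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum IndexCountability {
    NotCountable,
    Countable,
    CountableAllowingOffset,
}

/// What ends up at key [0] under the last value level.
#[derive(Debug, PartialEq, Eq)]
pub enum TerminalShape {
    BareReference,
    NormalTree,
    CountTree,
    ProvableCountTree,
}

pub fn terminal_shape(
    unique: bool,
    any_fields_null: bool,
    countable: IndexCountability,
) -> TerminalShape {
    if !unique || any_fields_null {
        // Non-unique-style storage: a tree holding per-doc references,
        // with the variant chosen by `countable`.
        match countable {
            IndexCountability::NotCountable => TerminalShape::NormalTree,
            IndexCountability::Countable => TerminalShape::CountTree,
            IndexCountability::CountableAllowingOffset => TerminalShape::ProvableCountTree,
        }
    } else {
        // Unique with no nulls: a single reference directly at [0].
        TerminalShape::BareReference
    }
}
```

Note that `countable` is ignored entirely in the unique, non-null case, which is the fourth row of the table.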
Visualizing the three terminal shapes side by side:
flowchart TD
subgraph SA["Non-unique, countable"]
direction TB
A1["[..., color, 'red']"]
A2["<b>[0]: CountTree</b><br/>count = 2"]
A3(["<b>doc_id_A</b><br/>Reference"])
A4(["<b>doc_id_B</b><br/>Reference"])
A1 --> A2 --> A3
A2 --> A4
end
subgraph SB["Unique, all fields non-null"]
direction TB
B1["[..., email, 'alice@x']"]
B2(["<b>[0]: Reference</b><br/>→ doc_id_X"])
B1 --> B2
end
subgraph SC["Unique with null in path"]
direction TB
C1["[..., a, 'X', b, <empty>]"]
C2["<b>[0]: CountTree</b><br/>count = 1"]
C3(["<b>doc_id_W</b><br/>Reference"])
C1 --> C2 --> C3
end
classDef countTree fill:#fff4e5,stroke:#bf6900,color:#000
classDef reference fill:#e8f5e9,stroke:#1b5e20,color:#000
class A2,C2 countTree
class A3,A4,B2,C3 reference
Same convention as the layout diagram above: rectangles are tree-type elements, rounded green nodes are Reference elements. Same key ([0]) at the terminal in all three panels — what lives there is what differs. The middle case is the one that's "special" — a bare Reference directly at [0] instead of a sub-tree containing references — and it's specifically scoped to the unique-and-no-nulls scenario.
Null Handling
The any_fields_null and all_fields_null flags are accumulated as Drive descends the index property list during insertion (add_indices_for_index_level_for_contract_operations/v0/mod.rs:170-171):
any_fields_null |= document_index_field.is_empty();
all_fields_null &= document_index_field.is_empty();
any_fields_null becomes true the moment the walker hits any null/empty value at any level (first, middle, or last) and stays true for the rest of the descent. all_fields_null only stays true if every value seen so far is null.
By the time the recursion reaches the terminal:
- any_fields_null = false and the index is unique → unique branch (bare Reference).
- any_fields_null = true (regardless of unique) → non-unique-style branch (sub-tree containing references).
- all_fields_null = true AND null_searchable = false → the terminal call returns early without inserting anything; this document is not findable through this index.
This means different documents under the same unique index can land in different storage shapes depending on which of their indexed fields are null. A document with all required fields populated takes the bare-Reference shape; a document with a null in an optional indexed property takes the sub-tree shape, side by side under the same index.
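A compact way to see how the two flags interact is to compute them over a document's indexed values and then classify the result. In this sketch a null is modeled as `None`, whereas the real code checks `is_empty()` on the serialized field; `Outcome`, `null_flags` and `outcome` are hypothetical names.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    /// All values null and `null_searchable` is false: nothing inserted.
    Skipped,
    /// Unique index, no nulls: single reference at [0].
    BareReference,
    /// Everything else: reference stored inside a tree at [0].
    ReferenceInSubTree,
}

/// Accumulates `(any_fields_null, all_fields_null)` across the indexed
/// values, in the order the walker would visit them.
pub fn null_flags(values: &[Option<&str>]) -> (bool, bool) {
    let mut any_fields_null = false;
    let mut all_fields_null = true;
    for value in values {
        let is_empty = value.is_none();
        any_fields_null |= is_empty;
        all_fields_null &= is_empty;
    }
    (any_fields_null, all_fields_null)
}

pub fn outcome(values: &[Option<&str>], unique: bool, null_searchable: bool) -> Outcome {
    let (any_fields_null, all_fields_null) = null_flags(values);
    if all_fields_null && !null_searchable {
        Outcome::Skipped
    } else if unique && !any_fields_null {
        Outcome::BareReference
    } else {
        Outcome::ReferenceInSubTree
    }
}
```

Two documents under the same unique index, one with values (X, Y) and one with (X, null), therefore classify as `BareReference` and `ReferenceInSubTree` respectively.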
Insert Flow Summary
Putting it together, when Drive inserts a document into a contract C of type T:
1. add_indices_for_top_index_level_for_contract_operations: for each top-level entry in the document type's index trie (each first-property of any declared index), pushes the property name and the document's value for that property onto the path, computes the initial any_fields_null / all_fields_null for that single value, and recurses.
2. add_indices_for_index_level_for_contract_operations (recursive): for each sub-level of the trie, pushes the property name and value onto the path, OR-accumulates any_fields_null, AND-accumulates all_fields_null, and recurses. If the current level has has_index_with_type = Some(...), it also calls into step 3 before recursing further (because an index can terminate at a non-leaf trie level when another index continues past it).
3. add_reference_for_index_level_for_contract_operations: the terminal call. Decides between unique and non-unique-style storage using the matrix above; for the non-unique-style path it picks a NormalTree / CountTree / ProvableCountTree based on countable; finally inserts the document reference (or sub-tree containing it).
Deletion mirrors the same walk in reverse — see packages/rs-drive/src/drive/document/delete/.
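The recursive walk can be sketched as follows. This is a simplified model under stated assumptions, not the Drive implementation: `Level`, `TerminalCall` and `walk` are hypothetical, paths are kept as strings, a null value is rendered as an empty segment, and instead of writing to GroveDB the walk records which terminal calls it would make. The top-level and recursive steps are folded into one function, seeded with any_null = false and all_null = true.

```rust
use std::collections::BTreeMap;

/// Simplified trie level: sub-levels by property name, plus the name
/// of the index terminating here, if any.
#[derive(Default)]
pub struct Level {
    pub sub_levels: BTreeMap<&'static str, Level>,
    pub terminates: Option<&'static str>,
}

/// One recorded invocation of the terminal step.
#[derive(Debug, PartialEq, Eq)]
pub struct TerminalCall {
    pub index_name: &'static str,
    pub path: Vec<String>,
    pub any_fields_null: bool,
    pub all_fields_null: bool,
}

pub fn walk(
    level: &Level,
    document: &BTreeMap<&str, Option<&str>>,
    path: &mut Vec<String>,
    any_null: bool,
    all_null: bool,
    out: &mut Vec<TerminalCall>,
) {
    for (property, sub_level) in &level.sub_levels {
        // A missing property is treated like a null value.
        let value = document.get(*property).copied().flatten();
        let is_empty = value.is_none();
        path.push(property.to_string());
        path.push(value.unwrap_or("").to_string());
        let (any, all) = (any_null || is_empty, all_null && is_empty);
        // An index may terminate here even if other indexes continue.
        if let Some(index_name) = sub_level.terminates {
            out.push(TerminalCall {
                index_name,
                path: path.clone(),
                any_fields_null: any,
                all_fields_null: all,
            });
        }
        walk(sub_level, document, path, any, all, out);
        path.pop();
        path.pop();
    }
}
```

For the byColor / byColorShape trie and a document with color = red and no shape, the walk records one terminal call for byColor with no nulls and one for byColorShape with any_fields_null set.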
Query Traversal
When a query arrives at drive-abci, the document-query construction path picks one of the document type's indexes that "covers" the query — i.e., whose property prefix matches the query's equality clauses, in order. The picker is in packages/rs-drive/src/query/mod.rs (look for fn construct_path_query and the index-selection helpers it calls). For count queries specifically there's a separate, count-tree-aware picker (drive_document_count_query/mod.rs) — see Document Count Trees for that path.
Once an index is picked, the query engine builds a PathQuery whose path is exactly the prefix shape the insert code produced: [DataContractDocuments, contract_id, 1, doc_type, prop, value, prop, value, …]. GroveDB then walks the path in O(log n) per level, reading the terminal sub-tree (or single reference) and returning matching documents.
A query whose where-clauses don't form a prefix of any index can't take this fast path and falls back to a full-scan plan — which dapi-grpc surfaces as an error in most cases, since unbounded scans are deliberately discouraged.
Choosing Index Settings
Quick checklist for contract authors:
- Don't index what you won't query. Each index costs storage on every insert/delete and counts against the per-document-type index limit (10 indexes per type currently).
- Order index properties so that the ones your queries constrain on their own come first, since a query must match a prefix of the index. A [country, city] index is more useful than [city, country] for queries like where country = "FR".
- unique: true when the platform should reject duplicates at the consensus layer. This is the right place for "this should be unique" invariants; don't enforce them application-side.
- countable: "countable" when you'll regularly call GetDocumentsCount with == (or in) clauses on exactly this index's properties. Adds a constant-factor overhead on insert/delete; reads become O(1). A countable index serves only count queries whose where clauses match its properties exactly; partial-prefix queries are rejected with WhereClauseOnNonIndexedProperty rather than falling through to a slow scan. Define a separate index per distinct count-query shape you want to support, or set documentsCountable: true on the document type for unfiltered totals.
- countable: "countableAllowingOffset" when you'll also want offset / range queries on this index in a future release. Strictly more expensive than plain "countable"; only worth it if you need the capability.
- null_searchable: true (the default) is right for almost all cases. Set to false only when documents with all-null indexed values shouldn't be findable through this index, typically a niche optimization to avoid a hot all-null prefix.
For specifically count-related concerns — primary-key-tree flags (documentsCountable / rangeCountable), the no-prove-vs-prove paths, and the operator restrictions — see the dedicated Document Count Trees chapter.