AgglomerateAttachment Specification¶
Current version: 4
An AgglomerateAttachment stores the agglomeration graph for a segmentation layer.
It maps every segment to an agglomerate and stores, for each agglomerate, its constituent segments, the edges between them, the affinity scores of those edges, and a representative voxel position per segment.
Usually, there are multiple agglomerate attachments with varying degrees of agglomeration.
File Format¶
The artifact is stored as a Zarr v3 hierarchy on disk.
Directory Structure¶
{attachment_name}/
zarr.json # group metadata (version, class)
segment_to_agglomerate/ # array
agglomerate_to_segments_offsets/ # array
agglomerate_to_segments/ # array
agglomerate_to_edges_offsets/ # array
agglomerate_to_edges/ # array
agglomerate_to_affinities/ # array
agglomerate_to_positions/ # array
The attachment name is arbitrary.
When computed with Voxelytics it is usually agglomerate_view_{mapping_id}, where mapping_id is either an integer (commonly a percentile of the agglomeration score) or a string identifier.
The directory is referenced from the layer's datasource-properties.json via AttachmentsProperties.agglomerates.
Group Metadata (zarr.json)¶
The group zarr.json stores the following attributes under the voxelytics key:
| Key | Value |
|---|---|
zarr_format |
3 |
node_type |
"group" |
attributes.voxelytics.artifact_schema_version |
4 |
attributes.voxelytics.artifact_class |
"AgglomerateViewArtifact" |
Example:
{
"zarr_format": 3,
"node_type": "group",
"attributes": {
"voxelytics": {
"artifact_schema_version": 4,
"artifact_class": "AgglomerateViewArtifact"
}
}
}
Notation¶
Let:
n_segments= total number of segments (segment IDs are 1-based and dense: every integer from 1 ton_segmentsmust be present; segment 0 is the background)n_agglomerates= number of real agglomerates (agglomerate 0 is reserved and always empty)n_edges= total number of edges across all agglomerates
The segmentation_dtype must be uint32 or uint64 and must match the dtype of the corresponding segmentation layer.
Arrays¶
segment_to_agglomerate¶
| Property | Value |
|---|---|
| Shape | (n_segments + 1,) |
| Dtype | uint64 |
Maps each segment ID to its agglomerate ID. Index 0 is the background segment and maps to agglomerate 0.
Example: [0, 1, 1, 1, 1, 2, 2, 1] — segments 1–4 and 7 belong to agglomerate 1, segments 5–6 belong to agglomerate 2.
agglomerate_to_segments_offsets¶
| Property | Value |
|---|---|
| Shape | (n_agglomerates + 2,) |
| Dtype | uint64 |
CSR-style offset array into agglomerate_to_segments.
The segments belonging to agglomerate i are at indices [offsets[i], offsets[i+1]) in agglomerate_to_segments.
Agglomerate 0 is always empty: offsets[0] == offsets[1] == 0.
The last entry equals n_segments.
Example: [0, 0, 5, 7] — agglomerate 0 is empty, agglomerate 1 has 5 segments (indices 0–4), agglomerate 2 has 2 segments (indices 5–6).
agglomerate_to_segments¶
| Property | Value |
|---|---|
| Shape | (n_segments,) |
| Dtype | segmentation_dtype |
All segment IDs, grouped by agglomerate.
The segments for agglomerate i occupy agglomerate_to_segments[offsets[i]:offsets[i+1]] and are sorted in ascending order within each agglomerate.
Example: [1, 2, 3, 4, 7, 5, 6] — agglomerate 1 contains segments {1, 2, 3, 4, 7} and agglomerate 2 contains {5, 6}.
agglomerate_to_edges_offsets¶
| Property | Value |
|---|---|
| Shape | (n_agglomerates + 2,) |
| Dtype | uint64 |
CSR-style offset array into agglomerate_to_edges and agglomerate_to_affinities.
The edges for agglomerate i are at indices [offsets[i], offsets[i+1]).
The last entry equals n_edges.
Agglomerate 0 is always empty: offsets[0] == offsets[1] == 0.
Example: [0, 0, 4, 5] — agglomerate 1 has 4 edges, agglomerate 2 has 1 edge.
agglomerate_to_edges¶
| Property | Value |
|---|---|
| Shape | (n_edges, 2) |
| Dtype | segmentation_dtype |
All edges, grouped by agglomerate.
Values are zero-based local node indices within each agglomerate (i.e. positions within the agglomerate's slice of agglomerate_to_segments).
For each edge (n1, n2):
- n1 < n2
- Edges within each agglomerate are sorted lexicographically: first by n1, then by n2.
Example: [[0,1], [0,4], [1,2], [2,3], [0,1]]
To convert local indices to global segment IDs, index into agglomerate_to_segments using the agglomerate's offset from agglomerate_to_segments_offsets.
agglomerate_to_affinities¶
| Property | Value |
|---|---|
| Shape | (n_edges,) |
| Dtype | float32 |
Affinity score for each edge, co-indexed with agglomerate_to_edges.
Higher values indicate stronger evidence for merging.
Example: [124.0, 65.5, 0.0, 250.5, 80.0]
agglomerate_to_positions¶
| Property | Value |
|---|---|
| Shape | (n_segments, 3) |
| Dtype | int32 |
Representative voxel position (x, y, z) for each segment, co-indexed with agglomerate_to_segments.
The positions for agglomerate i occupy agglomerate_to_positions[offsets[i]:offsets[i+1]] where offsets is agglomerate_to_segments_offsets.
Recommended Chunking and Sharding¶
All arrays are written with Zarr v3 sharding (sharding_indexed codec).
The recommendended chunk and shard sizes are derived from the array's shape and dtype to approximate the targets below.
| Array | Target chunk size | Target shard size |
|---|---|---|
segment_to_agglomerate |
256 KB | 1 GB |
agglomerate_to_segments |
256 KB | 1 GB |
agglomerate_to_segments_offsets |
64 KB | 256 MB |
agglomerate_to_edges_offsets |
64 KB | 256 MB |
agglomerate_to_edges |
256 KB | 1 GB |
agglomerate_to_affinities |
256 KB | 1 GB |
agglomerate_to_positions |
256 KB | 1 GB |
The first axis is used as the "row" axis for size calculations; all remaining axes are kept whole in every chunk and shard. The shard shape is always rounded up to the nearest multiple of the chunk shape.
Codec stack (inner chunks): bytes (little-endian) → zstd (level 5, checksum enabled)
Shard index codecs: bytes (little-endian) → crc32c
Invariants¶
- Segment IDs are 1-based and dense: every integer from 1 to
n_segmentsmust appear as a node. - Agglomerate 0 is always empty: no segments and no edges.
- Segment IDs within each agglomerate are sorted in ascending order.
- For each edge
(n1, n2):n1 < n2. - Edges within each agglomerate are sorted lexicographically
(n1, n2). - Edge node indices are zero-based local indices into the agglomerate's segment list.
agglomerate_to_positionsis co-indexed withagglomerate_to_segments.- All offset arrays have shape
(n_agglomerates + 2,); the last entry equals the total element count of the corresponding data array.
Example¶
Consider the following 7 segments and 5 edges:
| Segments | 1, 2, 3, 4, 5, 6, 7 |
| Edges | (1, 2), (2, 3), (3, 4), (5, 6), (1, 7) |
These result in two agglomerates: | | | | --- | --- | | Agglomerate 1 | 1, 2, 3, 4, 7 | | Agglomerate 2 | 5, 6 |
Now, let's rewrite the segment IDs to local indices: | Segment | Agglomerate | Localized Segment | | --- | --- | --- | | 1 | 1 | 0 | | 2 | 1 | 1 | | 3 | 1 | 2 | | 4 | 1 | 3 | | 5 | 2 | 0 | | 6 | 2 | 1 | | 7 | 1 | 4 |
With this, we can rewrite and sort the edges: | Edge | Agglomerate | Localized Edge | | --- | --- | --- | | (1, 2) | 1 | (0, 1) | | (1, 7) | 1 | (0, 4) | | (2, 3) | 1 | (1, 2) | | (3, 4) | 1 | (2, 3) | | (5, 6) | 2 | (0, 1) |
This would be the content of the arrays:
| Array | Content | Shape |
| --- | --- | --- |
| segment_to_agglomerate | [0, 1, 1, 1, 1, 2, 2, 1] | (8,) |
| agglomerate_to_segments_offsets | [0, 0, 5, 7] | (4,) |
| agglomerate_to_segments | [1, 2, 3, 4, 7, 5, 6] | (7,) |
| agglomerate_to_edges_offsets | [0, 0, 4, 5] | (4,) |
| agglomerate_to_edges | [[0, 1], [0, 4], [1, 2], [2, 3], [0, 1]] | (5, 2) |
| agglomerate_to_affinities | [124.0, 65.5, 0.0, 250.5, 80.0] | (5,) |
| agglomerate_to_positions | [[x1, y1, z1], ..., [x7, y7, z7]] | (7, 3) |
- Get Help
- Community Forums
- Email Support