Tutorial: Annotating Ground Truth Data for ML in WEBKNOSSOS

When preparing ground truth data for machine learning, accurate and consistent annotations are essential. The following guidelines will help you create high-quality training data in WEBKNOSSOS for training AI segmentation models.

For detailed instructions on using WEBKNOSSOS for annotation, please watch our Beginner’s Guide and/or our tutorial on Volume Annotations.


1. Annotation Area

  • Bounding Box: Set up a number of bounding boxes, equally spaced throughout the dataset (see guide on choosing bounding boxes), to mark your training data regions. Perform all annotations within these bounding boxes. Annotate every segment that appears within a bounding box, including those that are only partially visible. Segmentation may extend beyond the bounding box boundaries; content outside the box does not matter, as long as the segmentation inside the box is complete. A small scripting sketch for placing such boxes follows this list.
  • Segment Distinction: Since the ML model processes only the data within the bounding box, any cell parts that are not connected or directly adjacent inside the box must be assigned separate segment IDs, even if they appear to be part of the same cellular process.

    Assign separate segment IDs to cell parts that are not connected within the bounding box.

    Data attributions: Motta et al., 2019, Dense connectomic reconstruction in layer 4 of the somatosensory cortex
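As a rough, optional sketch of the box-placement step above: if you know the voxel extent of your dataset, you can compute a set of equally spaced bounding-box origins with a few lines of Python and then enter them in WEBKNOSSOS. All concrete numbers below (dataset extent, box size, grid counts) are hypothetical placeholders, not values from this tutorial.

    # Sketch: compute origins for a grid of equally spaced training bounding boxes.
    # All concrete numbers are hypothetical placeholders - adjust them to your dataset.
    import itertools

    dataset_shape = (4096, 4096, 1024)   # x, y, z extent of the dataset in voxels (assumed)
    box_size = (512, 512, 256)           # edge lengths of each training box in voxels (assumed)
    grid = (3, 3, 2)                     # number of boxes to place along x, y, z (assumed)

    def spaced_origins(extent, size, count):
        """Evenly spaced origins along one axis, keeping every box inside the dataset."""
        if count == 1:
            return [(extent - size) // 2]
        step = (extent - size) / (count - 1)
        return [round(i * step) for i in range(count)]

    per_axis = [spaced_origins(e, s, c) for e, s, c in zip(dataset_shape, box_size, grid)]

    bounding_boxes = [{"topleft": (x, y, z), "size": box_size}
                      for x, y, z in itertools.product(*per_axis)]

    for bb in bounding_boxes:
        print(bb)

The resulting topleft/size pairs can then be created as bounding boxes in the WEBKNOSSOS UI.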


2. Annotating Cell Segments

  • Unique Segment IDs: Each cell within the bounding box should be annotated with its own distinct segment ID. Create new segments as needed.
  • Sequential Approach:
    • Start at the top-left-front corner of the bounding box.
    • Annotate the cell in that area, following its structure across all sections.
    • Once that cell is complete, return to the top and assign a new segment ID for the next cell. We advise you to leave the cell membranes unannotated.

This systematic approach minimizes errors and ensures that no parts of a cell are missed.


3. Maintain Consistency Across Annotations

Consistency is key to creating reliable ground truth data. Here are a few areas where inconsistencies are likely to creep in:

  • Uniform Quality: If your annotation quality improves during the process, revisit and update earlier annotations for uniformity.
  • Membrane Boundaries: If you leave a thin gap between cells (i.e., you do not annotate the membrane), ensure this convention is applied consistently across all annotations.
  • Extracellular Space: Decide whether or not to annotate extracellular space and apply this choice consistently throughout your dataset.

4. Final Quality Check

A careful review of your annotations can prevent mistakes that might impact model performance. Consider these tips for your final quality check:

3D Verification: Review the annotation in all three viewports (XY, XZ, and YZ) simultaneously. If you observe any unnatural lines or separations, especially in the XZ or YZ views, this may indicate an error that needs correction.

Inspect the annotation in XY, XZ, and YZ views simultaneously for errors.

Data attributions: Motta et al., 2019, Dense connectomic reconstruction in layer 4 of the somatosensory cortex

Complete Coverage: Make sure no part of a cell is left unannotated. Even a small fragment in a corner of the bounding box should be included.

Ensure all cell fragments within the bounding box are completely annotated.

Data attributions: Motta et al., 2019, Dense connectomic reconstruction in layer 4 of the somatosensory cortex
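As a complement to the visual coverage check above, a quick scripted sanity check is possible if you export the annotated bounding box as a label volume (for example as a NumPy array in which 0 means unannotated). A nonzero unlabeled fraction is expected when membranes or extracellular space are deliberately left out, so the warning threshold below is only an illustrative assumption.

    import numpy as np

    def unlabeled_fraction(labels: np.ndarray) -> float:
        """Fraction of voxels in the exported bounding box that carry no segment ID (0)."""
        return float((labels == 0).sum()) / labels.size

    # Hypothetical stand-in for a real export of the bounding box.
    labels = np.random.randint(0, 5, size=(64, 64, 32)).astype(np.uint32)

    frac = unlabeled_fraction(labels)
    print(f"{frac:.1%} of voxels are unlabeled")

    if frac > 0.4:  # illustrative threshold; tune it to your membrane/extracellular convention
        print("Warning: a large part of the bounding box is unannotated - check coverage.")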

Gaps and Holes: Look for any gaps or holes within a segment. Every cell should be annotated as a continuous, complete structure. If you see a darker line that suggests a membrane inside the same cell, it should still be annotated; otherwise, a gap is created within the segment.

Avoid internal holes or unannotated membranes within a single segment.

Image attribution: Briggman et al., 2024, GAUSS-EM, guided accumulation of ultrathin serial sections with a static magnetic field for volume electron microscopy
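If you have such an exported label volume, fully enclosed cavities can also be found programmatically. The sketch below (assuming a NumPy label array with background 0 and SciPy available) reports, per segment ID, how many voxels a hole-filling operation would add; holes that open onto a bounding box face are not detected this way and still need the visual check.

    import numpy as np
    from scipy import ndimage

    def segments_with_holes(labels: np.ndarray) -> dict:
        """Map segment ID -> number of enclosed hole voxels, for segments that have any."""
        report = {}
        for seg_id in np.unique(labels):
            if seg_id == 0:          # 0 = background / unannotated
                continue
            mask = labels == seg_id
            filled = ndimage.binary_fill_holes(mask)
            holes = int(filled.sum()) - int(mask.sum())
            if holes > 0:
                report[int(seg_id)] = holes
        return report

    # print(segments_with_holes(labels))  # `labels` as in the coverage sketch above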

Disconnected Regions: Ensure that segments which are not connected or directly adjacent within the bounding box have distinct segment IDs — even if they appear to be part of the same process.

Use distinct segment IDs for disconnected regions in 3D.

Data attributions: Motta et al., 2019, Dense connectomic reconstruction in layer 4 of the somatosensory cortex
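The same kind of exported label volume can be screened for IDs that fall apart into several pieces. The sketch below uses connected-component labeling with 26-connectivity; background is assumed to be 0, and SciPy is assumed to be available.

    import numpy as np
    from scipy import ndimage

    def split_segments(labels: np.ndarray) -> dict:
        """Map segment ID -> number of connected components, for IDs split into several pieces."""
        structure = np.ones((3, 3, 3), dtype=bool)   # 26-connectivity: faces, edges, corners
        report = {}
        for seg_id in np.unique(labels):
            if seg_id == 0:
                continue
            _, n_components = ndimage.label(labels == seg_id, structure=structure)
            if n_components > 1:
                report[int(seg_id)] = n_components
        return report

Any ID reported here either needs to be merged back into one connected piece or split into separate segment IDs, depending on what the underlying data shows.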

Membrane Annotation Consistency between Cells: Membranes separating two cells should consistently be left unannotated throughout your work.

Consistently leave membranes unannotated between adjacent cells.

Data attributions: Motta et al., 2019, Dense connectomic reconstruction in layer 4 of the somatosensory cortex

Natural Cell Geometry: Unnatural or irregular segment shapes in 3D often indicate a misunderstanding of the data or a mistake such as switching segment IDs during annotation.

Identify unnatural geometries in 3D that may signal annotation mistakes.

Data attribution: Briggman et al., 2024, GAUSS-EM, guided accumulation of ultrathin serial sections with a static magnetic field for volume electron microscopy

Isolated Voxels: Check for stray voxels or remnants that don’t belong to any cell. These artifacts can occur when using automated tools or through a complex annotation process.

Remove isolated voxels or artifacts that do not belong to any segment.

Data attribution: Briggman et al., 2024, GAUSS-EM, guided accumulation of ultrathin serial sections with a static magnetic field for volume electron microscopy
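Stray voxels usually show up as segment IDs with an implausibly small total size. Assuming the same exported label volume, the sketch below lists IDs below a voxel-count threshold; the threshold is purely illustrative and should be chosen from your voxel size and the smallest structure you genuinely want to keep.

    import numpy as np

    def tiny_segments(labels: np.ndarray, min_voxels: int = 50) -> dict:
        """Map segment ID -> voxel count, for segments smaller than min_voxels."""
        ids, counts = np.unique(labels, return_counts=True)
        return {int(i): int(c) for i, c in zip(ids, counts) if i != 0 and c < min_voxels}

    # print(tiny_segments(labels, min_voxels=50))  # 50 is an arbitrary example threshold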

Segment Registration and Check: Use the “Register segments in bounding box” function to compile a list of all annotated segments. Then, enable “Selective Visibility” under the annotation layer name to inspect each segment individually. Look for any inconsistencies such as:

  • Holes or gaps within segments
  • Isolated voxels that do not belong
  • “Unnatural” segment geometries

This method is especially relevant when using automated segmentation as the base for ground truth annotation, since it may generate new segments within the bounding box that you do not directly control.

Data attribution: Briggman et al., 2024, GAUSS-EM, guided accumulation of ultrathin serial sections with a static magnetic field for volume electron microscopy


By following these guidelines, you help ensure that your annotations are both accurate and consistent, thereby improving the overall quality of the training data for your ML models.