WEBKNOSSOS cuber (wkcuber)

Python library for creating and working with WEBKNOSSOS WKW datasets. WKW is a container format for efficiently storing large-scale 3D image data as found in (electron) microscopy.

The tools are modular components to allow easy integration into existing pipelines and workflows.

Features

  • wkcuber: Convert supported input files to fully ready WKW datasets (includes type detection, downsampling, compressing and metadata generation)
  • wkcuber.convert_image_stack_to_wkw: Convert image stacks to fully ready WKW datasets (includes downsampling, compressing and metadata generation)
  • wkcuber.export_wkw_as_tiff: Convert WKW datasets to a tiff stack (writing as tiles to a z/y/x.tiff folder structure is also supported)
  • wkcuber.cubing: Convert image stacks (e.g., tiff, jpg, png, bmp, dm3, dm4) to WKW cubes
  • wkcuber.tile_cubing: Convert tiled image stacks (e.g. in z/y/x.ext folder structure) to WKW cubes
  • wkcuber.convert_knossos: Convert KNOSSOS cubes to WKW cubes
  • wkcuber.convert_nifti: Convert NIFTI files to WKW files (currently without applying transformations)
  • wkcuber.convert_raw: Convert RAW binary data (.raw, .vol) files to WKW datasets
  • wkcuber.downsampling: Create downsampled magnifications (with median, mode and linear interpolation modes). Downsampling compresses the new magnifications by default (disable via --no_compress).
  • wkcuber.compress: Compress WKW cubes for efficient file storage (especially useful for segmentation data)
  • wkcuber.metadata: Create (or refresh) metadata (with guessing of most parameters)
  • wkcuber.recubing: Read existing WKW cubes and write them again with a specified WKW file length. Useful when a dataset was written with, e.g., file length 1.
  • wkcuber.check_equality: Compare two WKW datasets to check whether they are equal (e.g., after compressing a dataset, this task can be useful to double-check that the compressed dataset contains the same data).
  • Most modules support multiprocessing

Supported input formats

  • Standard image formats, e.g. tiff, jpg, png, bmp
  • Proprietary image formats, e.g. dm3
  • Tiled image stacks (used for Catmaid)
  • KNOSSOS cubes
  • NIFTI files
  • Raw binary files

Installation

Python 3 with pip from PyPI

  • wkcuber requires at least Python 3.8
# Make sure to have lz4 installed:
# Mac: brew install lz4
# Ubuntu/Debian: apt-get install liblz4-1
# CentOS/RHEL: yum install lz4

pip install wkcuber
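
# Optionally verify the installation by printing the built-in help
# (the README notes that --help is available for more information):
python -m wkcuber --help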

Docker

Use the CI-built image: scalableminds/webknossos-cuber. Example usage:

docker run -v <host path>:/data --rm scalableminds/webknossos-cuber \
  wkcuber --layer_name color --scale 11.24,11.24,25 --name great_dataset \
  /data/source/color /data/target

Usage

# Convert any supported input files into wkw datasets. This sets reasonable defaults, but see the other commands for customization.
python -m wkcuber \
  --scale 11.24,11.24,25 \
  data/source data/target

# Convert image stacks into wkw datasets
python -m wkcuber.convert_image_stack_to_wkw \
  --layer_name color \
  --scale 11.24,11.24,25 \
  --name great_dataset \
  data/source/color data/target

# Convert image files to wkw cubes
python -m wkcuber.cubing --layer_name color data/source/color data/target
python -m wkcuber.cubing --layer_name segmentation data/source/segmentation data/target

# Convert tiled image files to wkw cubes
python -m wkcuber.tile_cubing --layer_name color data/source data/target

# Convert Knossos cubes to wkw cubes
python -m wkcuber.convert_knossos --layer_name color data/source/mag1 data/target

# Convert NIFTI file to wkw file
python -m wkcuber.convert_nifti --layer_name color --scale 10,10,30 data/source/nifti_file data/target

# Convert folder with NIFTI files to wkw files
python -m wkcuber.convert_nifti --color_file one_nifti_file --segmentation_file another_nifti --scale 10,10,30 data/source/ data/target

# Convert RAW file to wkw file
python -m wkcuber.convert_raw --layer_name color --scale 10,10,30 --input_dtype uint8 --shape 2048,2048,1024 data/source/raw_file.raw data/target

# Create downsampled magnifications
python -m wkcuber.downsampling --layer_name color data/target
python -m wkcuber.downsampling --layer_name segmentation --interpolation_mode mode data/target
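
# Create downsampled magnifications without compressing them
# (--no_compress, see Features above; compression is on by default)
python -m wkcuber.downsampling --layer_name color --no_compress data/target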

# Compress data in-place (mostly useful for segmentation)
python -m wkcuber.compress --layer_name segmentation data/target

# Compress data copy (mostly useful for segmentation)
python -m wkcuber.compress --layer_name segmentation data/target data/target_compress

# Create metadata
python -m wkcuber.metadata --name great_dataset --scale 11.24,11.24,25 data/target

# Refresh metadata so that new layers and/or magnifications are picked up
python -m wkcuber.metadata --refresh data/target

# Recubing an existing dataset
python -m wkcuber.recubing --layer_name color --dtype uint8 /data/source/wkw /data/target

# Check two datasets for equality
python -m wkcuber.check_equality /data/source /data/target

Parallelization

Most tasks can be configured to run in parallel. Via --distribution_strategy you can pass multiprocessing, slurm or kubernetes. The multiprocessing strategy can be further configured with --jobs, while slurm and kubernetes are configured via --job_resources='{"mem": "10M"}'. Use --help to get more information.
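
For example (a sketch; paths and resource values are placeholders, and --help shows which flags a given task accepts):

# Parallelize cubing across 4 local processes
python -m wkcuber.cubing --layer_name color \
  --distribution_strategy multiprocessing --jobs 4 \
  data/source/color data/target

# Distribute cubing via Slurm, requesting 10M of memory per job
python -m wkcuber.cubing --layer_name color \
  --distribution_strategy slurm --job_resources='{"mem": "10M"}' \
  data/source/color data/target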

Zarr support

Most conversion commands can be configured with --data_format zarr. This will produce a Zarr-based dataset instead of WKW. Zarr-based datasets can also be stored on remote storage (e.g. S3, GCS, HTTP). For that, storage-specific credentials and configurations need to be passed in as environment variables.

Example S3

export AWS_SECRET_ACCESS_KEY="..."
export AWS_ACCESS_KEY_ID="..."
export AWS_REGION="..."

python -m wkcuber \
  --scale 11.24,11.24,25 \
  --data_format zarr \
  data/source s3://bucket/data/target

Example HTTPS

export HTTP_BASIC_USER="..."
export HTTP_BASIC_PASSWORD="..."

python -m wkcuber \
  --scale 11.24,11.24,25 \
  --data_format zarr \
  data/source https://example.org/data/target

To use WebDAV, replace https:// with webdav+https://.
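
Example WebDAV (a sketch following the HTTPS example above; the URL is a placeholder):

export HTTP_BASIC_USER="..."
export HTTP_BASIC_PASSWORD="..."

python -m wkcuber \
  --scale 11.24,11.24,25 \
  --data_format zarr \
  data/source webdav+https://example.org/data/target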

Development

Make sure to install all the required dependencies using Poetry:

pip install poetry
poetry install

Please format, lint, and unit test your code changes before merging them.

poetry run black .
poetry run pylint -j4 wkcuber
poetry run pytest tests

Please also run the extended test suite:

tests/scripts/all_tests.sh

PyPI releases are automatically pushed when a new Git tag/GitHub release is created.

API documentation

Check out the latest version of the API documentation.

Generate the API documentation

Run docs/generate.sh to start a server displaying the API docs. docs/generate.sh --persist persists the HTML to docs/api.
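
For example:

# Serve the API docs locally
docs/generate.sh

# Persist the generated HTML to docs/api
docs/generate.sh --persist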

Test Data Credits

Excerpts for testing purposes have been sampled from:

  • Dow Jacobo Hossain Siletti Hudspeth (2018). Connectomics of the zebrafish's lateral-line neuromast reveals wiring and miswiring in a simple microcircuit. eLife. DOI:10.7554/eLife.33988
  • Zheng Lauritzen Perlman Robinson Nichols Milkie Torrens Price Fisher Sharifi Calle-Schuler Kmecova Ali Karsh Trautman Bogovic Hanslovsky Jefferis Kazhdan Khairy Saalfeld Fetter Bock (2018). A Complete Electron Microscopy Volume of the Brain of Adult Drosophila melanogaster. Cell. DOI:10.1016/j.cell.2018.06.019. License: CC BY-NC 4.0

License

AGPLv3 Copyright scalable minds