WEBKNOSSOS cuber (wkcuber)
Python library for creating and working with WEBKNOSSOS WKW datasets. WKW is a container format for efficiently storing large-scale 3D image data as found in (electron) microscopy.
The tools are modular components to allow easy integration into existing pipelines and workflows.
Features
- wkcuber: Convert supported input files to fully ready WKW datasets (includes type detection, downsampling, compressing and metadata generation)
- wkcuber.convert_image_stack_to_wkw: Convert image stacks to fully ready WKW datasets (includes downsampling, compressing and metadata generation)
- wkcuber.export_wkw_as_tiff: Convert WKW datasets to a tiff stack (writing as tiles to a z/y/x.tiff folder structure is also supported)
- wkcuber.cubing: Convert image stacks (e.g., tiff, jpg, png, bmp, dm3, dm4) to WKW cubes
- wkcuber.tile_cubing: Convert tiled image stacks (e.g. in a z/y/x.ext folder structure) to WKW cubes
- wkcuber.convert_knossos: Convert KNOSSOS cubes to WKW cubes
- wkcuber.convert_nifti: Convert NIFTI files to WKW files (currently without applying transformations)
- wkcuber.convert_raw: Convert RAW binary data files (.raw, .vol) to WKW datasets
- wkcuber.downsampling: Create downsampled magnifications (with median, mode and linear interpolation modes). Downsampling compresses the new magnifications by default (disable via --no_compress).
- wkcuber.compress: Compress WKW cubes for efficient file storage (especially useful for segmentation data)
- wkcuber.metadata: Create (or refresh) metadata (with guessing of most parameters)
- wkcuber.recubing: Read existing WKW cubes and write them again with a specified WKW file length. Useful when a dataset was written e.g. with file length 1.
- wkcuber.check_equality: Compare two WKW datasets to check whether they are equal (e.g., after compressing a dataset, this task can be useful to double-check that the compressed dataset contains the same data)
- Most modules support multiprocessing
Supported input formats
- Standard image formats, e.g. tiff, jpg, png, bmp
- Proprietary image formats, e.g. dm3
- Tiled image stacks (used for Catmaid)
- KNOSSOS cubes
- NIFTI files
- Raw binary files
Installation
Python 3 with pip from PyPI
wkcuber requires at least Python 3.8.
# Make sure to have lz4 installed:
# Mac: brew install lz4
# Ubuntu/Debian: apt-get install liblz4-1
# CentOS/RHEL: yum install lz4
pip install wkcuber
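To verify the installation, you can print the CLI help; this relies only on the python -m wkcuber entry point shown in the usage examples below.
python -m wkcuber --help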
Docker
Use the CI-built image: scalableminds/webknossos-cuber. Example usage:
docker run -v <host path>:/data --rm scalableminds/webknossos-cuber wkcuber --layer_name color --scale 11.24,11.24,25 --name great_dataset /data/source/color /data/target
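If the source and target directories live in different places on the host, they can also be mounted separately; the host paths below are placeholders:
docker run -v /path/to/source:/data/source -v /path/to/target:/data/target --rm scalableminds/webknossos-cuber wkcuber --layer_name color --scale 11.24,11.24,25 --name great_dataset /data/source /data/target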
Usage
# Convert arbitrary, supported input files into wkw datasets. This sets reasonable defaults, but see other commands for customization.
python -m wkcuber \
--scale 11.24,11.24,25 \
data/source data/target
# Convert image stacks into wkw datasets
python -m wkcuber.convert_image_stack_to_wkw \
--layer_name color \
--scale 11.24,11.24,25 \
--name great_dataset \
data/source/color data/target
# Convert image files to wkw cubes
python -m wkcuber.cubing --layer_name color data/source/color data/target
python -m wkcuber.cubing --layer_name segmentation data/source/segmentation data/target
# Convert tiled image files to wkw cubes
python -m wkcuber.tile_cubing --layer_name color data/source data/target
# Convert Knossos cubes to wkw cubes
python -m wkcuber.convert_knossos --layer_name color data/source/mag1 data/target
# Convert NIFTI file to wkw file
python -m wkcuber.convert_nifti --layer_name color --scale 10,10,30 data/source/nifti_file data/target
# Convert folder with NIFTI files to wkw files
python -m wkcuber.convert_nifti --color_file one_nifti_file --segmentation_file another_nifti --scale 10,10,30 data/source/ data/target
# Convert RAW file to wkw file
python -m wkcuber.convert_raw --layer_name color --scale 10,10,30 --input_dtype uint8 --shape 2048,2048,1024 data/source/raw_file.raw data/target
# Create downsampled magnifications
python -m wkcuber.downsampling --layer_name color data/target
python -m wkcuber.downsampling --layer_name segmentation --interpolation_mode mode data/target
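# Create downsampled magnifications without compressing them
# (sketch based on the --no_compress flag described in the Features section)
python -m wkcuber.downsampling --layer_name color --no_compress data/target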
# Compress data in-place (mostly useful for segmentation)
python -m wkcuber.compress --layer_name segmentation data/target
# Compress data copy (mostly useful for segmentation)
python -m wkcuber.compress --layer_name segmentation data/target data/target_compress
# Create metadata
python -m wkcuber.metadata --name great_dataset --scale 11.24,11.24,25 data/target
# Refresh metadata so that new layers and/or magnifications are picked up
python -m wkcuber.metadata --refresh data/target
# Recubing an existing dataset
python -m wkcuber.recubing --layer_name color --dtype uint8 /data/source/wkw /data/target
# Check two datasets for equality
python -m wkcuber.check_equality /data/source /data/target
Parallelization
Most tasks can be configured to be executed in a parallelized manner. Via --distribution_strategy
you can pass multiprocessing
, slurm
or kubernetes
. The first can be further configured with --jobs
and the latter via --job_resources='{"mem": "10M"}'
. Use --help
to get more information.
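For example, cubing could be parallelized as follows (a sketch using only the flags listed above; paths and layer name mirror the usage examples):
# Run with 8 local processes
python -m wkcuber.cubing --layer_name color --distribution_strategy multiprocessing --jobs 8 data/source/color data/target
# Submit via Slurm, requesting memory per job
python -m wkcuber.cubing --layer_name color --distribution_strategy slurm --job_resources='{"mem": "10M"}' data/source/color data/target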
Zarr support
Most conversion commands can be configured with --data_format zarr. This will produce a Zarr-based dataset instead of WKW. Zarr-based datasets can also be stored on remote storage (e.g. S3, GCS, HTTP). For that, storage-specific credentials and configurations need to be passed in as environment variables.
Example S3
export AWS_SECRET_ACCESS_KEY="..."
export AWS_ACCESS_KEY_ID="..."
export AWS_REGION="..."
python -m wkcuber \
--scale 11.24,11.24,25 \
--data_format zarr \
data/source s3://bucket/data/target
Example HTTPS
export HTTP_BASIC_USER="..."
export HTTP_BASIC_PASSWORD="..."
python -m wkcuber \
--scale 11.24,11.24,25 \
--data_format zarr \
data/source https://example.org/data/target
Exchange https:// with webdav+https:// for WebDAV.
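For example, the HTTPS invocation above would then read (same HTTP_BASIC_* credentials; example.org is a placeholder host):
python -m wkcuber \
--scale 11.24,11.24,25 \
--data_format zarr \
data/source webdav+https://example.org/data/target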
Development
Make sure to install all the required dependencies using Poetry:
pip install poetry
poetry install
Please format, lint, and unit test your code changes before merging them.
poetry run black .
poetry run pylint -j4 wkcuber
poetry run pytest tests
Please also run the extended test suite:
tests/scripts/all_tests.sh
PyPI releases are pushed automatically when a new Git tag/GitHub release is created.
API documentation
Check out the latest version of the API documentation.
Generate the API documentation
Run docs/generate.sh to open a server displaying the API docs. docs/generate.sh --persist persists the HTML to docs/api.
Test Data Credits
Excerpts for testing purposes have been sampled from:
- Dow, Jacobo, Hossain, Siletti, Hudspeth (2018). Connectomics of the zebrafish's lateral-line neuromast reveals wiring and miswiring in a simple microcircuit. eLife. DOI: 10.7554/eLife.33988
- Zheng, Lauritzen, Perlman, Robinson, Nichols, Milkie, Torrens, Price, Fisher, Sharifi, Calle-Schuler, Kmecova, Ali, Karsh, Trautman, Bogovic, Hanslovsky, Jefferis, Kazhdan, Khairy, Saalfeld, Fetter, Bock (2018). A Complete Electron Microscopy Volume of the Brain of Adult Drosophila melanogaster. Cell. DOI: 10.1016/j.cell.2018.06.019. License: CC BY-NC 4.0
License
AGPLv3 Copyright scalable minds