webknossos.dataset.dataset

Dataset

Dataset(dataset_path: Union[str, PathLike], voxel_size: Optional[Tuple[float, float, float]] = None, name: Optional[str] = None, exist_ok: bool = _UNSET, *, voxel_size_with_unit: Optional[VoxelSize] = None, scale: Optional[Tuple[float, float, float]] = None, read_only: bool = False)

A dataset is the entry point of the Dataset API.

An existing dataset on disk can be opened or new datasets can be created.

A dataset stores the data in .wkw files on disk with metadata in datasource-properties.json. The information in those files is kept in sync with the object.

Each dataset consists of one or more layers (webknossos.dataset.layer.Layer), which themselves can comprise multiple magnifications (webknossos.dataset.mag_view.MagView).

When using Dataset.open_remote() an instance of the RemoteDataset subclass is returned.

Examples:

Create a new dataset:

ds = Dataset("path/to/dataset", voxel_size=(11.2, 11.2, 25))

Open an existing dataset:

ds = Dataset.open("path/to/dataset")

Open a remote dataset:

ds = Dataset.open_remote("my_dataset", "organization_id")

Create a new dataset or open an existing one.

Creates a new dataset and the associated datasource-properties.json if one does not exist. If the dataset already exists and exist_ok is True, it is opened (the provided voxel_size and name are asserted to match the existing dataset).

Currently, exist_ok=True is the deprecated default and will change in future releases. Please use Dataset.open if you intend to open an existing dataset and don't want/need the creation behavior.

Parameters:

  • dataset_path (Union[str, PathLike]) –

    Path where the dataset should be created/opened

  • voxel_size (Optional[Tuple[float, float, float]], default: None ) –

    Optional tuple of floats (x, y, z) specifying voxel size in nanometers

  • name (Optional[str], default: None ) –

    Optional name for the dataset, defaults to last part of dataset_path if not provided

  • exist_ok (bool, default: _UNSET ) –

    Whether to open an existing dataset at the path rather than failing

  • voxel_size_with_unit (Optional[VoxelSize], default: None ) –

    Optional voxel size with unit specification

  • scale (Optional[Tuple[float, float, float]], default: None ) –

    Deprecated, use voxel_size instead

  • read_only (bool, default: False ) –

    Whether to open dataset in read-only mode

Raises:

  • RuntimeError

    If dataset exists and exist_ok=False

  • AssertionError

    If opening existing dataset with mismatched voxel size or name

default_view_configuration property writable

default_view_configuration: Optional[DatasetViewConfiguration]

Default view configuration for this dataset in webknossos.

Controls how the dataset is displayed in webknossos when first opened by a user, including position, zoom level, rotation etc.

Returns:

Examples:

ds.default_view_configuration = DatasetViewConfiguration(
    zoom=1.5,
    position=(100, 100, 100)
)

layers property

layers: Dict[str, Layer]

Dictionary containing all layers of this dataset.

Returns:

  • Dict[str, Layer]

    Dictionary mapping layer names to Layer objects

Examples:

for layer_name, layer in ds.layers.items():
    print(layer_name)

name property writable

name: str

Name of this dataset as specified in datasource-properties.json.

Can be modified to rename the dataset. Changes are persisted to the properties file.

Returns:

  • str ( str ) –

    Current dataset name

Examples:

ds.name = "my_renamed_dataset"  # Updates the name in properties file

path instance-attribute

path: Path = strip_trailing_slash(UPath(dataset_path))

read_only property

read_only: bool

Whether this dataset is opened in read-only mode.

When True, operations that would modify the dataset (adding layers, changing properties, etc.) are not allowed and will raise RuntimeError.

Returns:

  • bool ( bool ) –

    True if dataset is read-only, False otherwise

scale property

scale: Tuple[float, float, float]

Deprecated, use voxel_size instead.

voxel_size property

voxel_size: Tuple[float, float, float]

Size of each voxel in nanometers along each dimension (x, y, z).

Returns:

  • Tuple[float, float, float]

    Size of each voxel in nanometers for the x, y, z dimensions

Examples:

vx, vy, vz = ds.voxel_size
print(f"X resolution is {vx}nm")
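Because voxel_size is given in nanometers per voxel, the physical extent of a volume follows from multiplying it by the volume's shape in voxels. The following is a minimal sketch in plain Python: physical_extent_um is a hypothetical helper (not part of the webknossos API), and the shape is made up for illustration.

```python
from typing import Tuple


def physical_extent_um(
    voxel_size_nm: Tuple[float, float, float],
    shape_voxels: Tuple[int, int, int],
) -> Tuple[float, ...]:
    """Return the physical size in micrometers of a volume with the given shape."""
    # nanometers per voxel * number of voxels = nanometers; divide by 1000 for micrometers
    return tuple(v * s / 1000.0 for v, s in zip(voxel_size_nm, shape_voxels))


# Voxel size taken from the constructor example; the shape is an assumption.
extent = physical_extent_um((11.2, 11.2, 25), (1024, 1024, 512))
print(extent)
```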

voxel_size_with_unit property

voxel_size_with_unit: VoxelSize

Size of voxels including unit information.

Size of each voxel along each dimension (x, y, z), including unit specification. The default unit is nanometers.

Returns:

  • VoxelSize ( VoxelSize ) –

    Object containing voxel sizes and their units

ConversionLayerMapping

Bases: Enum

Strategies for mapping file paths to layers when importing images.

These strategies determine how input image files are grouped into layers during dataset creation using Dataset.from_images(). If no strategy is provided, INSPECT_SINGLE_FILE is used as the default.

If none of the pre-defined strategies fit your needs, you can provide a custom callable that takes a Path and returns a layer name string.

Examples:

Using default strategy:

ds = Dataset.from_images("images/", "dataset/")

Explicit strategy:

ds = Dataset.from_images(
    "images/",
    "dataset/",
    map_filepath_to_layer_name=ConversionLayerMapping.ENFORCE_SINGLE_LAYER
)

Custom mapping function:

ds = Dataset.from_images(
    "images/",
    "dataset/",
    map_filepath_to_layer_name=lambda p: p.stem
)
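A custom callable receives the path of an image file and returns the name of the layer that file belongs to. The sketch below is a slightly richer variant of the lambda: layer_name_from_channel_suffix is a hypothetical helper, and the `_ch<N>` file naming scheme is an assumption chosen only for illustration.

```python
import re
from pathlib import Path

# Assumed naming scheme: files end in "_ch<N>" before the extension.
_CHANNEL_SUFFIX = re.compile(r"_ch(\d+)$")


def layer_name_from_channel_suffix(path: Path) -> str:
    """Map a file path to a layer name based on a trailing channel suffix."""
    match = _CHANNEL_SUFFIX.search(path.stem)
    if match is None:
        # Files without a channel suffix all end up in one fallback layer.
        return "color"
    return f"channel_{match.group(1)}"


names = [
    layer_name_from_channel_suffix(Path(p))
    for p in (
        "images/slice_0001_ch0.tif",
        "images/slice_0001_ch1.tif",
        "images/overview.tif",
    )
]
print(names)
```

Such a function could then be passed as map_filepath_to_layer_name=layer_name_from_channel_suffix.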

ENFORCE_LAYER_PER_FILE class-attribute instance-attribute

ENFORCE_LAYER_PER_FILE = 'enforce_layer_per_file'

Creates a new layer for each input file. Useful for converting multiple 3D images or when each 2D image should become its own layer.

ENFORCE_LAYER_PER_FOLDER class-attribute instance-attribute

ENFORCE_LAYER_PER_FOLDER = 'enforce_layer_per_folder'

Groups files by their containing folder. Each folder becomes one layer. Useful for organized 2D image stacks.

ENFORCE_LAYER_PER_TOPLEVEL_FOLDER class-attribute instance-attribute

ENFORCE_LAYER_PER_TOPLEVEL_FOLDER = 'enforce_layer_per_toplevel_folder'

Groups files by their top-level folder. Useful when multiple layers each have their stacks split across subfolders.

ENFORCE_SINGLE_LAYER class-attribute instance-attribute

ENFORCE_SINGLE_LAYER = 'enforce_single_layer'

Combines all input files into a single layer. Only useful when all images are 2D slices that should be combined.

INSPECT_EVERY_FILE class-attribute instance-attribute

INSPECT_EVERY_FILE = 'inspect_every_file'

Like INSPECT_SINGLE_FILE but determines strategy separately for each file. More flexible but slower for many files.

INSPECT_SINGLE_FILE class-attribute instance-attribute

INSPECT_SINGLE_FILE = 'inspect_single_file'

Default strategy. Inspects the first image file to determine whether the data is 2D or 3D. For 2D data, ENFORCE_LAYER_PER_FOLDER is used; for 3D data, ENFORCE_LAYER_PER_FILE.
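To make the difference between the ENFORCE_* strategies concrete, the following sketch groups a few made-up relative paths the way each description reads. It is not the library's implementation: group_files and the way group keys are derived from path parts are assumptions for illustration only.

```python
from collections import defaultdict
from pathlib import PurePosixPath
from typing import Callable, Dict, Iterable, List


def group_files(
    paths: Iterable[str], key: Callable[[PurePosixPath], str]
) -> Dict[str, List[str]]:
    """Group file paths by the layer key computed for each path."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for p in paths:
        groups[key(PurePosixPath(p))].append(p)
    return dict(groups)


paths = [
    "color/part1/0001.tif",
    "color/part1/0002.tif",
    "color/part2/0003.tif",
    "segmentation/part1/0001.tif",
]

# One group per file, per containing folder, per top-level folder, or a single group.
per_file = group_files(paths, lambda p: str(p.with_suffix("")))
per_folder = group_files(paths, lambda p: str(p.parent))
per_toplevel_folder = group_files(paths, lambda p: p.parts[0])
single_layer = group_files(paths, lambda p: "layer")

for label, groups in [
    ("per file", per_file),
    ("per folder", per_folder),
    ("per top-level folder", per_toplevel_folder),
    ("single layer", single_layer),
]:
    print(label, "->", len(groups), "group(s)")
```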

add_copy_layer

add_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, executor: Optional[Executor] = None) -> Layer

Copy layer from another dataset to this one.

Creates a new layer in this dataset by copying data and metadata from a layer in another dataset.

Parameters:

  • foreign_layer (Union[str, Path, Layer]) –

    Layer to copy (path or Layer object)

  • new_layer_name (Optional[str], default: None ) –

    Optional name for the new layer, uses original name if None

  • chunk_shape (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional shape of chunks for storage

  • chunks_per_shard (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional number of chunks per shard

  • data_format (Optional[Union[str, DataFormat]], default: None ) –

    Optional format to store copied data ('wkw', 'zarr', etc.)

  • compress (Optional[bool], default: None ) –

    Optional whether to compress copied data

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel copying

Returns:

  • Layer ( Layer ) –

    The newly created copy of the layer

Raises:

  • IndexError

    If target layer name already exists

  • RuntimeError

    If dataset is read-only

Examples:

Copy layer keeping same name:

other_ds = Dataset.open("other/dataset")
copied = ds.add_copy_layer(other_ds.get_layer("color"))

Copy with new name:

copied = ds.add_copy_layer(
    other_ds.get_layer("color"),
    new_layer_name="color_copy",
    compress=True
)

add_fs_copy_layer

add_fs_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None) -> Layer

Copies the files of foreign_layer, which belongs to another dataset, to the current dataset via the filesystem. The relevant information from the other dataset's datasource-properties.json is copied as well. If new_layer_name is None, the name of the foreign layer is used.

add_layer

add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, bounding_box: Optional[NDBoundingBox] = None, **kwargs: Any) -> Layer

Create a new layer in the dataset.

Creates a new layer with the given name, category, and data type.

Parameters:

  • layer_name (str) –

    Name for the new layer

  • category (LayerCategoryType) –

    Either 'color' or 'segmentation'

  • dtype_per_layer (Optional[DTypeLike], default: None ) –

    Optional data type for entire layer, e.g. np.uint8

  • dtype_per_channel (Optional[DTypeLike], default: None ) –

    Optional data type per channel, e.g. np.uint8

  • num_channels (Optional[int], default: None ) –

    Number of channels (default 1)

  • data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT ) –

    Format to store data ('wkw', 'zarr', 'zarr3')

  • bounding_box (Optional[NDBoundingBox], default: None ) –

    Optional initial bounding box of layer

  • **kwargs (Any, default: {} ) –

    Additional arguments:

      • largest_segment_id: For segmentation layers, initial largest ID
      • mappings: For segmentation layers, optional ID mappings

Returns:

  • Layer ( Layer ) –

    The newly created layer

Raises:

  • IndexError

    If layer with given name already exists

  • RuntimeError

    If invalid category specified

  • AttributeError

    If both dtype_per_layer and dtype_per_channel specified

  • AssertionError

    If invalid layer name or WKW format used with remote dataset

Examples:

Create color layer:

layer = ds.add_layer(
    "my_raw_microscopy_layer",
    "color",
    dtype_per_channel=np.uint8,
)

Create segmentation layer:

layer = ds.add_layer(
    "my_segmentation_labels",
    "segmentation",
    dtype_per_channel=np.uint64
)

Note

The dtype can be specified either per layer or per channel, but not both. If neither is specified, uint8 per channel is used by default. WKW format can only be used with local datasets.

add_layer_for_existing_files

add_layer_for_existing_files(layer_name: str, category: LayerCategoryType, **kwargs: Any) -> Layer

Create a new layer from existing data files.

Adds a layer by discovering and incorporating existing data files that were created externally, rather than creating new ones. The layer properties are inferred from the existing files unless overridden.

Parameters:

  • layer_name (str) –

    Name for the new layer

  • category (LayerCategoryType) –

    Layer category ('color' or 'segmentation')

  • **kwargs (Any, default: {} ) –

    Additional arguments:

      • num_channels: Override detected number of channels
      • dtype_per_channel: Override detected data type
      • data_format: Override detected data format
      • bounding_box: Override detected bounding box

Returns:

  • Layer ( Layer ) –

    The newly created layer referencing the existing files

Raises:

  • AssertionError

    If layer already exists or no valid files found

  • RuntimeError

    If dataset is read-only

Examples:

Basic usage:

layer = ds.add_layer_for_existing_files(
    "external_data",
    "color"
)

Override properties:

layer = ds.add_layer_for_existing_files(
    "segmentation_data",
    "segmentation",
    dtype_per_channel=np.uint64
)

Note

The data files must already exist in the dataset directory under the layer name. Files are analyzed to determine properties like data type and number of channels. Magnifications are discovered automatically.
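The automatic discovery of magnifications can be pictured as scanning the layer directory for subdirectories whose names look like magnifications. The sketch below assumes that mag directories are named like 1, 2 or 2-2-1. parse_mag_dirname and discover_mags are hypothetical helpers, not the library's code.

```python
import re
import tempfile
from pathlib import Path
from typing import List, Optional, Tuple

# Assumed naming: "4" for an isotropic mag, "4-4-2" for an anisotropic one.
_MAG_PATTERN = re.compile(r"^(\d+)(?:-(\d+)-(\d+))?$")


def parse_mag_dirname(name: str) -> Optional[Tuple[int, int, int]]:
    """Return the (x, y, z) mag encoded in a directory name, or None."""
    match = _MAG_PATTERN.match(name)
    if match is None:
        return None
    x = int(match.group(1))
    if match.group(2) is None:
        return (x, x, x)
    return (x, int(match.group(2)), int(match.group(3)))


def discover_mags(layer_dir: Path) -> List[Tuple[int, int, int]]:
    """List all mags found as subdirectories of a layer directory."""
    mags = []
    for child in layer_dir.iterdir():
        mag = parse_mag_dirname(child.name) if child.is_dir() else None
        if mag is not None:
            mags.append(mag)
    return sorted(mags)


with tempfile.TemporaryDirectory() as tmp:
    layer_dir = Path(tmp) / "external_data"
    for dirname in ("1", "2-2-1", "4-4-2", "mappings"):
        (layer_dir / dirname).mkdir(parents=True)
    found = discover_mags(layer_dir)

print(found)
```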

add_layer_from_images

add_layer_from_images(images: Union[str, FramesSequence, List[Union[str, PathLike]]], layer_name: str, category: Optional[LayerCategoryType] = 'color', data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, mag: Union[int, str, list, tuple, ndarray, Mag] = Mag(1), chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, *, topleft: VecIntLike = zeros(), swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, dtype: Optional[DTypeLike] = None, use_bioformats: Optional[bool] = None, channel: Optional[int] = None, timepoint: Optional[int] = None, czi_channel: Optional[int] = None, batch_size: Optional[int] = None, allow_multiple_layers: bool = False, max_layers: int = 20, truncate_rgba_to_rgb: bool = True, executor: Optional[Executor] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None) -> Layer

Creates a new layer called layer_name with mag mag from images. images can be one of the following:

  • glob-string
  • list of paths
  • pims.FramesSequence instance

Please see the pims docs for more information.

This method needs extra packages like tifffile or pylibczirw. Please install the respective extras, e.g. using python -m pip install "webknossos[all]".

Further Arguments:

  • category: color by default, may be set to "segmentation"
  • data_format: by default wkw files are written, may be set to "zarr"
  • mag: magnification to use for the written data
  • chunk_shape, chunks_per_shard, compress: adjust how the data is stored on disk
  • topleft: set an offset in Mag(1) to start writing the data, only affecting the output
  • swap_xy: set to True to interchange x and y axis before writing to disk
  • flip_x, flip_y, flip_z: set to True to reverse the respective axis before writing to disk
  • dtype: the read image data will be converted to this dtype using numpy.ndarray.astype
  • use_bioformats: set to True to use only the pims bioformats adapter directly (needs a JVM), or to False to forbid using the bioformats adapter; by default it is tried as a last option
  • channel: may be used to select a single channel, if multiple are available
  • timepoint: for timeseries, select a timepoint to use by specifying it as an int, starting from 0
  • czi_channel: may be used to select a channel for .czi images, which differs from normal color-channels
  • batch_size: size to process the images (influences RAM consumption), must be a multiple of the chunk-size z-axis for uncompressed and the shard-size z-axis for compressed layers, default is the chunk-size or shard-size respectively
  • allow_multiple_layers: set to True if timepoints or channels may result in multiple layers being added (only the first is returned)
  • max_layers: only applies if allow_multiple_layers=True, limits the number of layers added via different channels or timepoints
  • truncate_rgba_to_rgb: only applies if allow_multiple_layers=True, set to False to write four channels into layers instead of an RGB channel
  • executor: pass a ClusterExecutor instance to parallelize the conversion jobs across the batches
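The batch_size constraint described above can be checked up front. This is a minimal sketch, assuming that the shard size along z equals the chunk size along z times the number of chunks per shard along z. is_valid_batch_size is a hypothetical helper and the numbers are illustrative, not library defaults.

```python
def is_valid_batch_size(
    batch_size: int, chunk_z: int, chunks_per_shard_z: int, compress: bool
) -> bool:
    """Check whether batch_size is a multiple of the relevant z extent."""
    # Compressed layers are constrained by whole shards, uncompressed ones by chunks.
    required_multiple = chunk_z * chunks_per_shard_z if compress else chunk_z
    return batch_size > 0 and batch_size % required_multiple == 0


uncompressed_ok = is_valid_batch_size(64, chunk_z=32, chunks_per_shard_z=32, compress=False)
compressed_ok = is_valid_batch_size(64, chunk_z=32, chunks_per_shard_z=32, compress=True)
print(uncompressed_ok, compressed_ok)
```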

add_layer_like

add_layer_like(other_layer: Layer, layer_name: str) -> Layer

add_remote_layer

add_remote_layer(foreign_layer: Union[str, UPath, Layer], new_layer_name: Optional[str] = None) -> Layer

Add a remote layer from another dataset.

Creates a layer that references data from a remote dataset. The image data will be streamed on-demand when accessed.

Parameters:

  • foreign_layer (Union[str, UPath, Layer]) –

    Remote layer to add (path or Layer object)

  • new_layer_name (Optional[str], default: None ) –

    Optional name for the new layer, uses original name if None

Returns:

  • Layer ( Layer ) –

    The newly created remote layer referencing the foreign data

Raises:

  • IndexError

    If target layer name already exists

  • AssertionError

    If trying to add non-remote layer or same origin dataset

  • RuntimeError

    If dataset is read-only

Examples:

ds = Dataset.open("other/dataset")
remote_ds = Dataset.open_remote("my_dataset", "my_org_id")
new_layer = ds.add_remote_layer(
    remote_ds.get_layer("color")
)

Note

Changes to the original layer's properties afterwards won't affect this dataset. Data is only referenced, not copied.

add_symlink_layer

add_symlink_layer(foreign_layer: Union[str, Path, Layer], make_relative: bool = False, new_layer_name: Optional[str] = None) -> Layer

Create symbolic link to layer from another dataset.

Instead of copying data, creates a symbolic link to the original layer's data and copies only the layer metadata. Changes to the original layer's properties, e.g. bounding box, afterwards won't affect this dataset and vice-versa.

Parameters:

  • foreign_layer (Union[str, Path, Layer]) –

    Layer to link to (path or Layer object)

  • make_relative (bool, default: False ) –

    Whether to create relative symlinks

  • new_layer_name (Optional[str], default: None ) –

    Optional name for the linked layer, uses original name if None

Returns:

  • Layer ( Layer ) –

    The newly created symbolic link layer

Raises:

  • IndexError

    If target layer name already exists

  • AssertionError

    If trying to create symlinks in/to remote datasets

  • RuntimeError

    If dataset is read-only

Examples:

other_ds = Dataset.open("other/dataset")
linked = ds.add_symlink_layer(
    other_ds.get_layer("color"),
    make_relative=True
)

Note

Only works with local file systems, cannot link remote datasets or create symlinks in remote datasets.
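What make_relative changes can be illustrated with plain filesystem calls. The sketch below is not the library's implementation: link_layer is a hypothetical helper that links one directory to another, and the directory names are made up. It assumes a POSIX-like local file system where creating symlinks is permitted.

```python
import os
import tempfile
from pathlib import Path


def link_layer(source_layer: Path, target_layer: Path, make_relative: bool = False) -> None:
    """Create a symlink at target_layer that points to source_layer."""
    target_layer.parent.mkdir(parents=True, exist_ok=True)
    if make_relative:
        # Store the link target relative to the directory containing the link.
        link_target = os.path.relpath(source_layer, start=target_layer.parent)
    else:
        link_target = str(source_layer.resolve())
    os.symlink(link_target, target_layer, target_is_directory=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    source = root / "other_dataset" / "color"
    source.mkdir(parents=True)
    link = root / "my_dataset" / "color"
    link_layer(source, link, make_relative=True)
    stored_target = os.readlink(link)
    resolves_to_source = link.resolve() == source

print(stored_target, resolves_to_source)
```

A relative link keeps working when both datasets are moved together, whereas an absolute link keeps working when only the linking dataset is moved.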

announce_manual_upload classmethod

announce_manual_upload(dataset_name: str, organization: str, initial_team_ids: List[str], folder_id: str, token: Optional[str] = None) -> None

Announce a manual dataset upload to WEBKNOSSOS.

Used when manually uploading datasets to the file system of a datastore. Creates database entries and sets access rights on the webknossos instance before the actual data upload.

Parameters:

  • dataset_name (str) –

    Name for the new dataset

  • organization (str) –

    Organization ID to upload to

  • initial_team_ids (List[str]) –

    List of team IDs to grant initial access

  • folder_id (str) –

    ID of folder where dataset should be placed

  • token (Optional[str], default: None ) –

    Optional authentication token

Note

This is typically only used by administrators with direct file system access to the WEBKNOSSOS datastore. Most users should use upload() instead.

Examples:

Dataset.announce_manual_upload(
    "my_dataset",
    "my_organization",
    ["team_a", "team_b"],
    "folder_123"
)

calculate_bounding_box

calculate_bounding_box() -> NDBoundingBox

Calculate the enclosing bounding box of all layers.

Finds the smallest box that contains all data from all layers in the dataset.

Returns:

  • NDBoundingBox ( NDBoundingBox ) –

    Bounding box containing all layer data

Examples:

bbox = ds.calculate_bounding_box()
print(f"Dataset spans {bbox.size} voxels")
print(f"Dataset starts at {bbox.topleft}")
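The enclosing box is the component-wise minimum of all top-left corners together with the component-wise maximum of all bottom-right corners. The following is a minimal 3D sketch in plain Python: enclosing_box and the (topleft, size) tuple representation are assumptions for illustration and do not reflect the NDBoundingBox API.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[int, int, int]
Box = Tuple[Vec3, Vec3]  # (topleft, size)


def enclosing_box(boxes: Iterable[Box]) -> Box:
    """Return the smallest box that contains all given boxes."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("at least one box is required")
    topleft = tuple(min(b[0][i] for b in boxes) for i in range(3))
    bottomright = tuple(max(b[0][i] + b[1][i] for b in boxes) for i in range(3))
    size = tuple(bottomright[i] - topleft[i] for i in range(3))
    return topleft, size


# Two made-up layer bounding boxes.
color_box = ((0, 0, 0), (1024, 1024, 512))
segmentation_box = ((128, 128, 64), (1024, 512, 512))
result = enclosing_box([color_box, segmentation_box])
print(result)
```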

compress

compress(executor: Optional[Executor] = None) -> None

Compress all uncompressed magnifications in-place.

Compresses the data of all magnification levels that aren't already compressed, for all layers in the dataset.

Parameters:

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel compression

Raises:

  • RuntimeError

    If dataset is read-only

Examples:

ds.compress()

Note

If data is already compressed, this will have no effect.

copy_dataset

copy_dataset(new_dataset_path: Union[str, Path], voxel_size: Optional[Tuple[float, float, float]] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, args: Optional[Namespace] = None, executor: Optional[Executor] = None, *, voxel_size_with_unit: Optional[VoxelSize] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None, block_len: Optional[int] = None, file_len: Optional[int] = None) -> Dataset

Creates an independent copy of the dataset with all layers at a new location. Data storage parameters can be customized for the copied dataset.

Parameters:

  • new_dataset_path (Union[str, Path]) –

    Path where new dataset should be created

  • voxel_size (Optional[Tuple[float, float, float]], default: None ) –

    Optional tuple of floats (x,y,z) specifying voxel size in nanometers

  • chunk_shape (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional shape of chunks for data storage

  • chunks_per_shard (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional number of chunks per shard

  • data_format (Optional[Union[str, DataFormat]], default: None ) –

    Optional format to store data ('wkw', 'zarr', 'zarr3')

  • compress (Optional[bool], default: None ) –

    Optional whether to compress data

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel copying

  • voxel_size_with_unit (Optional[VoxelSize], default: None ) –

    Optional voxel size specification with units

  • **kwargs

    Additional deprecated arguments:

      • chunk_size: Use chunk_shape instead
      • block_len: Use chunk_shape instead
      • file_len: Use chunks_per_shard instead
      • args: Use executor instead

Returns:

  • Dataset ( Dataset ) –

    The newly created copy

Raises:

  • AssertionError

    If trying to copy WKW layers to remote dataset

Examples:

Basic copy:

copied = ds.copy_dataset("path/to/copy")

Copy with different storage:

copied = ds.copy_dataset(
    "path/to/copy",
    data_format="zarr",
    compress=True
)

Note

WKW layers can only be copied to datasets on local file systems. For remote datasets, use data_format='zarr'.

create classmethod

create(dataset_path: Union[str, PathLike], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset

Deprecated, please use the constructor Dataset() instead.

delete_layer

delete_layer(layer_name: str) -> None

Delete a layer from the dataset.

Removes the layer's data and metadata from disk completely. This deletes both the datasource-properties.json entry and all data files for the layer.

Parameters:

  • layer_name (str) –

    Name of layer to delete

Raises:

  • IndexError

    If no layer with the given name exists

  • RuntimeError

    If dataset is read-only

Examples:

ds.delete_layer("old_layer")
print("Remaining layers:", list(ds.layers))

download classmethod

download(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None, bbox: Optional[BoundingBox] = None, layers: Union[List[str], str, None] = None, mags: Optional[List[Mag]] = None, path: Optional[Union[PathLike, str]] = None, exist_ok: bool = False) -> Dataset

Downloads a dataset and returns the Dataset instance.

  • dataset_name_or_url may be a dataset name or a full URL to a dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view. If a URL is used, organization_id, webknossos_url and sharing_token must not be set.
  • organization_id may be supplied if a dataset name was used in the previous argument. It defaults to your current organization from the webknossos_context.
  • sharing_token may be supplied if a dataset name was used, to specify a sharing token.
  • webknossos_url may be supplied if a dataset name was used, and specifies which webknossos instance to search for the dataset. It defaults to the URL from your current webknossos_context, using https://webknossos.org as a fallback.
  • bbox, layers, and mags specify which parts of the dataset to download. If nothing is specified, the whole image, all layers, and all mags are downloaded, respectively.
  • path and exist_ok specify where to save the downloaded dataset and whether to overwrite if the path exists.
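The URL form of dataset_name_or_url already carries the instance, organization and dataset name, which is why the other three arguments must not be set alongside it. The sketch below splits such a URL with the standard library. The path layout /datasets/<organization>/<dataset>/view is inferred only from the example URL, and split_dataset_url is a hypothetical helper. The library may accept other URL forms.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_dataset_url(url: str) -> Tuple[str, str, str]:
    """Split a dataset view URL into (webknossos_url, organization_id, dataset_name)."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed layout: /datasets/<organization>/<dataset>/view
    if len(parts) != 4 or parts[0] != "datasets" or parts[3] != "view":
        raise ValueError(f"not a dataset view URL: {url}")
    return f"{parsed.scheme}://{parsed.netloc}", parts[1], parts[2]


webknossos_url, organization_id, dataset_name = split_dataset_url(
    "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
)
print(webknossos_url, organization_id, dataset_name)
```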

downsample

downsample(sampling_mode: SamplingModes = ANISOTROPIC, coarsest_mag: Optional[Mag] = None, executor: Optional[Executor] = None) -> None

Generate downsampled magnifications for all layers.

Creates lower resolution versions (coarser magnifications) of all layers that are not yet downsampled, up to the specified coarsest magnification.

Parameters:

  • sampling_mode (SamplingModes, default: ANISOTROPIC ) –

    Strategy for downsampling (e.g. ANISOTROPIC, MAX)

  • coarsest_mag (Optional[Mag], default: None ) –

    Optional maximum/coarsest magnification to generate

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel processing

Raises:

  • RuntimeError

    If dataset is read-only

Examples:

Basic downsampling:

ds.downsample()

With custom parameters:

ds.downsample(
    sampling_mode=SamplingModes.ANISOTROPIC,
    coarsest_mag=Mag(8),
)

Note

  • ANISOTROPIC sampling downsamples anisotropically until the dataset is isotropic
  • Other modes like MAX, CONSTANT etc. create regular downsampling patterns
  • If magnifications already exist, they will not be regenerated
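The idea behind anisotropic downsampling can be sketched as follows: at each step, either all axes are doubled or one axis is left unchanged, whichever makes the physical voxel extents most similar. This is a simplified heuristic written for illustration, not the algorithm the library uses. next_mag and mags_up_to are hypothetical helpers.

```python
from typing import List, Tuple

Vec3 = Tuple[int, int, int]


def next_mag(mag: Vec3, voxel_size: Tuple[float, float, float]) -> Vec3:
    """Pick the next coarser mag that keeps voxels as close to isotropic as possible."""
    candidates = [
        (mag[0] * 2, mag[1] * 2, mag[2] * 2),  # double all axes (preferred on ties)
        (mag[0] * 2, mag[1] * 2, mag[2]),      # leave z unchanged
        (mag[0] * 2, mag[1], mag[2] * 2),      # leave y unchanged
        (mag[0], mag[1] * 2, mag[2] * 2),      # leave x unchanged
    ]

    def anisotropy(candidate: Vec3) -> float:
        extents = [m * v for m, v in zip(candidate, voxel_size)]
        return max(extents) / min(extents)

    return min(candidates, key=anisotropy)


def mags_up_to(voxel_size: Tuple[float, float, float], coarsest: int) -> List[Vec3]:
    """List the mags generated from mag 1 until any axis would exceed coarsest."""
    mags = []
    mag = (1, 1, 1)
    while True:
        mag = next_mag(mag, voxel_size)
        if max(mag) > coarsest:
            break
        mags.append(mag)
    return mags


anisotropic = mags_up_to((11.2, 11.2, 25), coarsest=8)
isotropic = mags_up_to((1, 1, 1), coarsest=4)
print(anisotropic)
print(isotropic)
```

With the voxel size from the constructor example, the z axis is skipped in the first step because its voxels are already more than twice as large as those in x and y.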

from_images classmethod

from_images(input_path: Union[str, PathLike], output_path: Union[str, PathLike], voxel_size: Optional[Tuple[float, float, float]] = None, name: Optional[str] = None, *, map_filepath_to_layer_name: Union[ConversionLayerMapping, Callable[[Path], str]] = INSPECT_SINGLE_FILE, z_slices_sort_key: Callable[[Path], Any] = natsort_keygen(), voxel_size_with_unit: Optional[VoxelSize] = None, layer_name: Optional[str] = None, layer_category: Optional[LayerCategoryType] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, use_bioformats: Optional[bool] = None, max_layers: int = 20, batch_size: Optional[int] = None, executor: Optional[Executor] = None) -> Dataset

This method imports image data in a folder or from a file as a webknossos dataset.

The image data can be 3D images (such as multipage tiffs) or stacks of 2D images. Multiple 3D images or image stacks are mapped to different layers based on the mapping strategy.

The exact mapping is handled by the argument map_filepath_to_layer_name, which can be a pre-defined strategy from the enum ConversionLayerMapping, or a custom callable, taking a path of an image file and returning the corresponding layer name. All files belonging to the same layer name are then grouped. In case of multiple files per layer, those are usually mapped to the z-dimension. The order of the z-slices can be customized by setting z_slices_sort_key.

For more fine-grained control, please create an empty dataset and use add_layer_from_images.

Parameters:

  • input_path (Union[str, PathLike]) –

    Path to input image files

  • output_path (Union[str, PathLike]) –

    Output path for created dataset

  • voxel_size (Optional[Tuple[float, float, float]], default: None ) –

    Optional tuple of floats (x,y,z) for voxel size in nm

  • name (Optional[str], default: None ) –

    Optional name for dataset

  • map_filepath_to_layer_name (Union[ConversionLayerMapping, Callable[[Path], str]], default: INSPECT_SINGLE_FILE ) –

    Strategy for mapping files to layers, either a ConversionLayerMapping enum value or callable taking Path and returning str

  • z_slices_sort_key (Callable[[Path], Any], default: natsort_keygen() ) –

    Optional key function for sorting z-slices

  • voxel_size_with_unit (Optional[VoxelSize], default: None ) –

    Optional voxel size with unit specification

  • layer_name (Optional[str], default: None ) –

    Optional name for layer(s)

  • layer_category (Optional[LayerCategoryType], default: None ) –

    Optional category override ('color' or 'segmentation')

  • data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT ) –

    Format to store data in ('wkw'/'zarr')

  • chunk_shape (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional shape of chunks to store data in

  • chunks_per_shard (Optional[Union[int, Vec3IntLike]], default: None ) –

    Optional number of chunks per shard

  • compress (bool, default: False ) –

    Whether to compress the data

  • swap_xy (bool, default: False ) –

    Whether to swap x and y axes

  • flip_x (bool, default: False ) –

    Whether to flip the x axis

  • flip_y (bool, default: False ) –

    Whether to flip the y axis

  • flip_z (bool, default: False ) –

    Whether to flip the z axis

  • use_bioformats (Optional[bool], default: None ) –

    Whether to use bioformats for reading

  • max_layers (int, default: 20 ) –

    Maximum number of layers to create

  • batch_size (Optional[int], default: None ) –

    Size of batches for processing

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallelization

Returns:

  • Dataset ( Dataset ) –

    The created dataset instance

Examples:

ds = Dataset.from_images(
    "path/to/images/",
    "path/to/dataset/",
    voxel_size=(1, 1, 1),
)

Note

This method needs extra packages like tifffile or pylibczirw. Install with pip install "webknossos[all]" and pip install --extra-index-url https://pypi.scm.io/simple/ "webknossos[czi]".

get_color_layer

get_color_layer() -> Layer

Deprecated, please use get_color_layers().

Returns the only color layer. Fails with a RuntimeError if there are multiple color layers or none.

get_color_layers

get_color_layers() -> List[Layer]

Get all color layers in the dataset.

Provides access to all layers with category 'color'. Useful when a dataset contains multiple color layers.

Returns:

  • List[Layer]

    List of all color layers in order

Examples:

Print all color layer names:

for layer in ds.get_color_layers():
    print(layer.name)

Note

If you need only a single color layer, consider using get_layer() with the specific layer name instead.

get_layer

get_layer(layer_name: str) -> Layer

Get a specific layer from this dataset.

Parameters:

  • layer_name (str) –

    Name of the layer to retrieve

Returns:

  • Layer ( Layer ) –

    The requested layer object

Raises:

  • IndexError

    If no layer with the given name exists

Examples:

color_layer = ds.get_layer("color")
seg_layer = ds.get_layer("segmentation")

Note

Use layers property to access all layers at once.

get_or_add_layer

get_or_add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, **kwargs: Any) -> Layer

Get an existing layer or create a new one.

Gets a layer with the given name if it exists, otherwise creates a new layer with the specified parameters.

Parameters:

  • layer_name (str) –

    Name of the layer to get or create

  • category (LayerCategoryType) –

    Layer category ('color' or 'segmentation')

  • dtype_per_layer (Optional[DTypeLike], default: None ) –

    Optional data type for entire layer

  • dtype_per_channel (Optional[DTypeLike], default: None ) –

    Optional data type per channel

  • num_channels (Optional[int], default: None ) –

    Optional number of channels

  • data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT ) –

    Format to store data ('wkw', 'zarr', etc.)

  • **kwargs (Any, default: {} ) –

    Additional arguments passed to add_layer()

Returns:

  • Layer ( Layer ) –

    The existing or newly created layer

Raises:

  • AssertionError

    If existing layer's properties don't match specified parameters

  • ValueError

    If both dtype_per_layer and dtype_per_channel specified

  • RuntimeError

    If invalid category specified

Examples:

import numpy as np

layer = ds.get_or_add_layer(
    "segmentation",
    "segmentation",  # category: 'color' or 'segmentation'
    dtype_per_channel=np.uint64,
)
Note

The dtype can be specified either per layer or per channel, but not both. For existing layers, the parameters are validated against the layer properties.
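
The get-or-create behavior described above can be modeled with a plain dictionary. This is a toy sketch under stated assumptions, not the library's implementation: layers are represented as dicts, get_or_add is a hypothetical function, and the "uint8" fallback mirrors the default documented for add_layer().

```python
from typing import Dict, Optional


def get_or_add(
    layers: Dict[str, dict],
    name: str,
    category: str,
    dtype_per_channel: Optional[str] = None,
) -> dict:
    """Toy model of get-or-create with validation of existing properties."""
    if name in layers:
        existing = layers[name]
        # Existing layer: the requested properties must match.
        assert existing["category"] == category, "category mismatch"
        if dtype_per_channel is not None:
            assert existing["dtype_per_channel"] == dtype_per_channel, "dtype mismatch"
        return existing
    # Unknown name: create the layer with the requested properties.
    layers[name] = {
        "category": category,
        "dtype_per_channel": dtype_per_channel or "uint8",
    }
    return layers[name]


layers: Dict[str, dict] = {}
first = get_or_add(layers, "segmentation", "segmentation", "uint64")
second = get_or_add(layers, "segmentation", "segmentation", "uint64")

try:
    get_or_add(layers, "segmentation", "color")
    mismatch_raised = False
except AssertionError:
    mismatch_raised = True
```

The second call returns the same object as the first, and asking for the same name with a different category fails, which matches the AssertionError listed under Raises.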

get_or_create classmethod

get_or_create(dataset_path: Union[str, Path], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset

Deprecated, please use the constructor Dataset() instead.

get_remote_datasets staticmethod

get_remote_datasets(organization_id: Optional[str] = None, tags: Optional[Union[str, Sequence[str]]] = None) -> Mapping[str, RemoteDataset]

Get available datasets from WEBKNOSSOS.

Returns a mapping of dataset names to lazy-initialized RemoteDataset objects for all datasets visible to the specified organization or current user.

Parameters:

  • organization_id (Optional[str], default: None ) –

    Optional organization to get datasets from. Defaults to organization of logged in user.

  • tags (Optional[Union[str, Sequence[str]]], default: None ) –

    Optional tag(s) to filter datasets by. Can be a single tag string or sequence of tags. Only returns datasets with all specified tags.

Returns:

  • Mapping[str, RemoteDataset]

    Dict mapping dataset names to RemoteDataset objects

Examples:

List all available datasets:

datasets = Dataset.get_remote_datasets()
print(sorted(datasets.keys()))

Get datasets for specific organization:

org_datasets = Dataset.get_remote_datasets("my_organization")
ds = org_datasets["dataset_name"]

Filter datasets by tag:

published = Dataset.get_remote_datasets(tags="published")
tagged = Dataset.get_remote_datasets(tags=["tag1", "tag2"])
Note

RemoteDataset objects are initialized lazily when accessed for the first time. The mapping object provides a fast way to list and look up available datasets.
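
The tags parameter accepts a single string or a sequence, and only datasets carrying all given tags are returned. The sketch below illustrates that "all tags" rule on plain data; filter_by_tags and the sample dictionary are hypothetical and do not contact a WEBKNOSSOS server.

```python
from typing import Dict, List, Sequence, Union


def filter_by_tags(
    tags_by_dataset: Dict[str, List[str]],
    tags: Union[str, Sequence[str]],
) -> Dict[str, List[str]]:
    """Keep only the entries that carry every requested tag."""
    # A single string is treated as one tag, not as a sequence of characters.
    required = {tags} if isinstance(tags, str) else set(tags)
    return {
        name: ds_tags
        for name, ds_tags in tags_by_dataset.items()
        if required.issubset(ds_tags)
    }


sample = {
    "dataset_a": ["published", "verified"],
    "dataset_b": ["published"],
    "dataset_c": [],
}
published = filter_by_tags(sample, "published")
published_and_verified = filter_by_tags(sample, ["published", "verified"])
```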

get_segmentation_layer

get_segmentation_layer() -> SegmentationLayer

Deprecated, please use get_segmentation_layers().

Returns the only segmentation layer. Fails with an IndexError if there are multiple segmentation layers or none.

get_segmentation_layers

get_segmentation_layers() -> List[SegmentationLayer]

Get all segmentation layers in the dataset.

Provides access to all layers with category 'segmentation'. Useful when a dataset contains multiple segmentation layers.

Returns:

  • List[SegmentationLayer]

    List of all segmentation layers in order

Examples:

Print all segmentation layer names:

for layer in ds.get_segmentation_layers():
    print(layer.name)
Note

If you need only a single segmentation layer, consider using get_layer() with the specific layer name instead.

open classmethod

open(dataset_path: Union[str, PathLike]) -> Dataset

To open an existing dataset on disk, simply call Dataset.open("your_path"). This requires datasource-properties.json to exist in this folder. Based on the datasource-properties.json, a dataset object is constructed. Only layers and magnifications that are listed in the properties are loaded (even though there might exist more layers or magnifications on disk).

The dataset_path refers to the top level directory of the dataset (excluding layer or magnification names).
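
Since opening requires datasource-properties.json in the top-level directory, a quick pre-check can produce a clearer error message before calling Dataset.open(). The helper below is a sketch using only the standard library; looks_like_dataset is a hypothetical name, and it checks only that the file exists, not that its contents are valid.

```python
import tempfile
from pathlib import Path

PROPERTIES_FILE_NAME = "datasource-properties.json"


def looks_like_dataset(dataset_path) -> bool:
    """Return True if the directory contains the dataset properties file."""
    return (Path(dataset_path) / PROPERTIES_FILE_NAME).is_file()


# Demonstration on a temporary directory instead of a real dataset.
with tempfile.TemporaryDirectory() as tmp:
    before = looks_like_dataset(tmp)
    (Path(tmp) / PROPERTIES_FILE_NAME).write_text("{}")
    after = looks_like_dataset(tmp)
```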

open_remote classmethod

open_remote(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None) -> RemoteDataset

Opens a remote webknossos dataset. Image data is accessed via network requests. Dataset metadata such as allowed teams or the sharing token can be read and set via the respective RemoteDataset properties.

Parameters:

  • dataset_name_or_url (str) –

    Either dataset name or full URL to dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view

  • organization_id (Optional[str], default: None ) –

    Optional organization ID if using dataset name. Can be found here

  • sharing_token (Optional[str], default: None ) –

    Optional sharing token for dataset access

  • webknossos_url (Optional[str], default: None ) –

    Optional custom webknossos URL, defaults to context URL, usually https://webknossos.org

Returns:

  • RemoteDataset ( RemoteDataset ) –

    Dataset instance for remote access

Examples:

ds = Dataset.open_remote("https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view")
Note

If supplying a URL, organization_id, webknossos_url and sharing_token must not be set.
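
The rule in this note can be checked before making any network request. The sketch below is illustrative only: check_open_remote_args and is_dataset_url are hypothetical helpers, the http(s) prefix test is a simplification, and ValueError is this sketch's choice rather than a statement about which exception the library raises.

```python
from typing import Optional


def is_dataset_url(dataset_name_or_url: str) -> bool:
    """Heuristic used by this sketch only: treat http(s) strings as URLs."""
    return dataset_name_or_url.startswith(("http://", "https://"))


def check_open_remote_args(
    dataset_name_or_url: str,
    organization_id: Optional[str] = None,
    sharing_token: Optional[str] = None,
    webknossos_url: Optional[str] = None,
) -> None:
    """Raise if a URL is combined with arguments that must not be set."""
    extras = {
        "organization_id": organization_id,
        "sharing_token": sharing_token,
        "webknossos_url": webknossos_url,
    }
    given = [name for name, value in extras.items() if value is not None]
    if is_dataset_url(dataset_name_or_url) and given:
        # ValueError is this sketch's choice, not necessarily the library's.
        raise ValueError(f"Do not pass {', '.join(given)} together with a URL")


# A dataset name may be combined with an organization ID.
check_open_remote_args("my_dataset", organization_id="organization_id")

# A URL on its own is fine; a URL plus organization_id is rejected.
url = "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
check_open_remote_args(url)
try:
    check_open_remote_args(url, organization_id="scalable_minds")
    url_conflict_detected = False
except ValueError:
    url_conflict_detected = True
```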

shallow_copy_dataset

shallow_copy_dataset(new_dataset_path: Union[str, PathLike], name: Optional[str] = None, make_relative: bool = False, layers_to_ignore: Optional[Iterable[str]] = None) -> Dataset

Create a new dataset that uses symlinks to reference data.

Links all magnifications and layer directories from the original dataset via symlinks rather than copying data. Useful for creating alternative views or exposing datasets to webknossos.

Parameters:

  • new_dataset_path (Union[str, PathLike]) –

    Path where new dataset should be created

  • name (Optional[str], default: None ) –

    Optional name for the new dataset, uses original name if None

  • make_relative (bool, default: False ) –

    Whether to create relative symlinks

  • layers_to_ignore (Optional[Iterable[str]], default: None ) –

    Optional iterable of layer names to exclude

Returns:

  • Dataset ( Dataset ) –

    The newly created dataset with linked layers

Raises:

  • AssertionError

    If trying to link remote datasets

  • RuntimeError

    If dataset is read-only

Examples:

Basic shallow copy:

linked = ds.shallow_copy_dataset("path/to/link")

With relative links excluding layers:

linked = ds.shallow_copy_dataset(
    "path/to/link",
    make_relative=True,
    layers_to_ignore=["temp_layer"]
)
Note

Only works with datasets on local filesystems. Cannot create shallow copies of remote datasets or create shallow copies in remote locations.
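
The difference between absolute and relative symlinks, which make_relative controls, can be shown with the standard library alone. This is a sketch of the general technique, not the library's code: link_directory is a hypothetical helper and the directory names are made up.

```python
import os
import tempfile
from pathlib import Path


def link_directory(target: Path, link: Path, make_relative: bool = False) -> None:
    """Create a symlink at `link` that points to the directory `target`."""
    link.parent.mkdir(parents=True, exist_ok=True)
    if make_relative:
        # Express the target relative to the directory that holds the link.
        source = Path(os.path.relpath(target, start=link.parent))
    else:
        source = target.resolve()
    link.symlink_to(source, target_is_directory=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    original = root / "original" / "color"
    original.mkdir(parents=True)
    (original / "data.bin").write_bytes(b"\x00")

    link_directory(original, root / "copy_abs" / "color")
    link_directory(original, root / "copy_rel" / "color", make_relative=True)

    abs_target = os.readlink(root / "copy_abs" / "color")
    rel_target = os.readlink(root / "copy_rel" / "color")
    # Both links expose the same underlying file.
    abs_link_resolves = (root / "copy_abs" / "color" / "data.bin").is_file()
    rel_link_resolves = (root / "copy_rel" / "color" / "data.bin").is_file()
```

Relative links keep working when the original and the copy are moved together (for example when a parent directory is mounted at a different path), whereas absolute links keep working when only the copy is moved.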

trigger_reload_in_datastore classmethod

trigger_reload_in_datastore(dataset_name: str, organization: str, token: Optional[str] = None) -> None

Trigger a manual reload of the dataset's properties.

For manually uploaded datasets, properties are normally updated automatically after a few minutes. This method forces an immediate reload.

This is typically only needed after manual changes to the dataset's files. Cannot be used for local datasets.

Parameters:

  • dataset_name (str) –

    Name of dataset to reload

  • organization (str) –

    Organization ID where dataset is located

  • token (Optional[str], default: None ) –

    Optional authentication token

Examples:

# Force reload after manual file changes
Dataset.trigger_reload_in_datastore(
    "my_dataset",
    "organization_id"
)

upload

upload(new_dataset_name: Optional[str] = None, layers_to_link: Optional[List[Union[LayerToLink, Layer]]] = None, jobs: Optional[int] = None) -> RemoteDataset

Upload this dataset to webknossos.

Copies all data and metadata to webknossos, creating a new dataset that can be accessed remotely. For large datasets, existing layers can be linked instead of re-uploaded.

Parameters:

  • new_dataset_name (Optional[str], default: None ) –

    Optional name for the uploaded dataset

  • layers_to_link (Optional[List[Union[LayerToLink, Layer]]], default: None ) –

    Optional list of layers that should link to existing data instead of being uploaded

  • jobs (Optional[int], default: None ) –

    Optional number of parallel upload jobs, defaults to 5

Returns:

  • RemoteDataset ( RemoteDataset ) –

    Reference to the newly created remote dataset

Examples:

Simple upload:

remote_ds = ds.upload("my_new_dataset")
print(remote_ds.url)

Link existing layers:

link = LayerToLink.from_remote_layer(existing_layer)
remote_ds = ds.upload(layers_to_link=[link])

RemoteDataset

RemoteDataset(dataset_path: UPath, dataset_name: str, organization_id: str, sharing_token: Optional[str], context: ContextManager)

Bases: Dataset

A representation of a dataset on a webknossos server.

This class is returned from Dataset.open_remote() and provides read-only access to image data streamed from the webknossos server. It uses the same interface as Dataset but additionally allows metadata manipulation through properties.

Properties

  • metadata: Dataset metadata as key-value pairs
  • display_name: Human readable name
  • description: Dataset description
  • tags: Dataset tags
  • is_public: Whether dataset is public
  • sharing_token: Dataset sharing token
  • allowed_teams: Teams with dataset access
  • folder: Dataset folder location

Examples:

Opening a remote dataset with organization ID:

ds = Dataset.open_remote("my_dataset", "org_id")

Opening with dataset URL:

ds = Dataset.open_remote("https://webknossos.org/datasets/org/dataset/view")

Setting metadata:

ds.metadata = {"key": "value", "tags": ["tag1", "tag2"]}
ds.display_name = "My Dataset"
ds.allowed_teams = [Team.get_by_name("Lab_A")]
Note

Do not instantiate directly, use Dataset.open_remote() instead.

Initialize a remote dataset instance.

Parameters:

  • dataset_path (UPath) –

    Path to remote dataset location

  • dataset_name (str) –

    Name of dataset in WEBKNOSSOS

  • organization_id (str) –

    Organization that owns the dataset

  • sharing_token (Optional[str]) –

    Optional token for shared access

  • context (ContextManager) –

    Context manager for WEBKNOSSOS connection

Raises:

  • FileNotFoundError

    If dataset cannot be opened as zarr format and no metadata exists

Note

Do not call this constructor directly, use Dataset.open_remote() instead. This class provides access to remote WEBKNOSSOS datasets with additional metadata manipulation.

allowed_teams property writable

allowed_teams: Tuple[Team, ...]

Teams that are allowed to access this dataset.

Controls which teams have read access to view and use this dataset. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • Tuple[Team, ...]

    Teams currently having access

Examples:

from webknossos import Team
team = Team.get_by_name("Lab_A")
ds.allowed_teams = [team]
print([t.name for t in ds.allowed_teams])

# Give access to multiple teams:
ds.allowed_teams = [
    Team.get_by_name("Lab_A"),
    Team.get_by_name("Lab_B")
]
Note
  • Teams must be from the same organization as the dataset
  • Can be set using Team objects or team ID strings
  • An empty list makes the dataset private

default_view_configuration property writable

default_view_configuration: Optional[DatasetViewConfiguration]

Default view configuration for this dataset in webknossos.

Controls how the dataset is displayed in webknossos when first opened by a user, including position, zoom level, rotation etc.

Returns:

  • Optional[DatasetViewConfiguration]

    Current default view configuration if set, None otherwise

Examples:

ds.default_view_configuration = DatasetViewConfiguration(
    zoom=1.5,
    position=(100, 100, 100)
)

description deletable property writable

description: Optional[str]

Free-text description of the dataset.

Can be edited with markdown formatting. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • Optional[str]

    Current description if set, None otherwise

Examples:

ds.description = "Dataset acquired on *June 1st*"
ds.description = None  # Remove description

display_name deletable property writable

display_name: Optional[str]

The human-readable name for the dataset in the webknossos interface.

Can be set to a different value than the dataset name used in URLs and downloads. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • Optional[str]

    Current display name if set, None otherwise

Examples:

remote_ds.display_name = "Mouse Brain Sample A"

folder property writable

folder: RemoteFolder

The (virtual) folder containing this dataset in WEBKNOSSOS.

Represents the folder location in the WEBKNOSSOS UI folder structure. Can be changed to move the dataset to a different folder. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • RemoteFolder ( RemoteFolder ) –

    Current folder containing the dataset

Examples:

folder = RemoteFolder.get_by_path("Datasets/Published")
ds.folder = folder
print(ds.folder.path) # 'Datasets/Published'

is_public property writable

is_public: bool

Control whether the dataset is publicly accessible.

When True, anyone can view the dataset without logging in to WEBKNOSSOS. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • bool ( bool ) –

    True if dataset is public, False if private

Examples:

ds.is_public = True
ds.is_public = False
print("Public" if ds.is_public else "Private")  # Private

layers property

layers: Dict[str, Layer]

Dictionary containing all layers of this dataset.

Returns:

  • Dict[str, Layer]

    Dictionary mapping layer names to Layer objects

Examples:

for layer_name, layer in ds.layers.items():
    print(layer_name)

metadata property writable

metadata: DatasetMetadata

Get or set metadata key-value pairs for the dataset.

The metadata can contain strings, numbers, and lists of strings as values. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • DatasetMetadata ( DatasetMetadata ) –

    Current metadata key-value pairs

Examples:

ds.metadata = {
    "species": "mouse",
    "age_days": 42,
    "tags": ["verified", "published"]
}
print(ds.metadata["species"])

name property writable

name: str

Name of this dataset as specified in datasource-properties.json.

Can be modified to rename the dataset. Changes are persisted to the properties file.

Returns:

  • str ( str ) –

    Current dataset name

Examples:

ds.name = "my_renamed_dataset"  # Updates the name in properties file

path instance-attribute

path = None

read_only property

read_only: bool

Whether this dataset is opened in read-only mode.

When True, operations that would modify the dataset (adding layers, changing properties, etc.) are not allowed and will raise RuntimeError.

Returns:

  • bool ( bool ) –

    True if dataset is read-only, False otherwise

scale property

scale: Tuple[float, float, float]

Deprecated, use voxel_size instead.

sharing_token property

sharing_token: str

Get a new token for sharing access to this dataset.

Each call generates a fresh token that allows viewing the dataset without logging in. The token can be appended to dataset URLs as a query parameter.

Returns:

  • str ( str ) –

    Fresh sharing token for dataset access

Examples:

token = ds.sharing_token
url = f"{ds.url}?token={token}"
print("Share this link:", url)
Note
  • A new token is generated on each access
  • The token provides read-only access
  • Anyone with the token can view the dataset
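
Plain string concatenation with "?token=" works for URLs that have no query string yet. For URLs that might already carry query parameters, a slightly more defensive sketch is shown below. with_token is a hypothetical helper, and the parameter name token is taken from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url: str, token: str) -> str:
    """Return `url` with its `token` query parameter set to `token`."""
    parts = urlsplit(url)
    # Keep existing parameters, but drop any previous token.
    query = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "token"
    ]
    query.append(("token", token))
    return urlunsplit(parts._replace(query=urlencode(query)))


plain = with_token("https://webknossos.org/datasets/my_org/my_dataset", "abc")
existing = with_token("https://webknossos.org/datasets/my_org/my_dataset?foo=1", "abc")
replaced = with_token("https://webknossos.org/datasets/my_org/my_dataset?token=old", "abc")
```

Because each access to sharing_token generates a new token, read the property once and reuse the value when building several links.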

tags property writable

tags: Tuple[str, ...]

User-assigned tags for organizing and filtering datasets.

Tags allow categorizing and filtering datasets in the webknossos dashboard interface. Changes are immediately synchronized with WEBKNOSSOS.

Returns:

  • Tuple[str, ...]

    Currently assigned tags, in string tuple form

Examples:

ds.tags = ["verified", "published"]
print(ds.tags)  # ('verified', 'published')
ds.tags = []  # Remove all tags

url property

url: str

URL to access this dataset in webknossos.

Constructs the full URL to the dataset in the webknossos web interface.

Returns:

  • str ( str ) –

    Full dataset URL including organization and dataset name

Examples:

print(ds.url) # 'https://webknossos.org/datasets/my_org/my_dataset'

voxel_size property

voxel_size: Tuple[float, float, float]

Size of each voxel in nanometers along each dimension (x, y, z).

Returns:

  • Tuple[float, float, float]

    Size of each voxel in nanometers for x,y,z dimensions

Examples:

vx, vy, vz = ds.voxel_size
print(f"X resolution is {vx}nm")
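
A common use of the voxel size is converting voxel coordinates or extents into physical units. The sketch below multiplies component-wise; voxel_to_nm is a hypothetical helper, and the voxel size (11.2, 11.2, 25) is the example value used when creating a dataset, not a property of any particular dataset.

```python
from typing import Sequence, Tuple


def voxel_to_nm(
    position_voxels: Sequence[float],
    voxel_size: Sequence[float],
) -> Tuple[float, ...]:
    """Scale a voxel-space position or extent to nanometers, axis by axis."""
    if len(position_voxels) != len(voxel_size):
        raise ValueError("position and voxel_size must have the same length")
    return tuple(p * s for p, s in zip(position_voxels, voxel_size))


example_voxel_size = (11.2, 11.2, 25.0)  # nanometers per voxel (x, y, z)
extent_nm = voxel_to_nm((1000, 1000, 100), example_voxel_size)
# Convert nanometers to micrometers for display.
extent_um = tuple(value / 1000 for value in extent_nm)
```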

voxel_size_with_unit property

voxel_size_with_unit: VoxelSize

Size of voxels including unit information.

Size of each voxel along each dimension (x, y, z), including unit specification. The default unit is nanometers.

Returns:

  • VoxelSize ( VoxelSize ) –

    Object containing voxel sizes and their units

ConversionLayerMapping

Bases: Enum

Strategies for mapping file paths to layers when importing images.

These strategies determine how input image files are grouped into layers during dataset creation using Dataset.from_images(). If no strategy is provided, INSPECT_SINGLE_FILE is used as the default.

If none of the pre-defined strategies fit your needs, you can provide a custom callable that takes a Path and returns a layer name string.

Examples:

Using default strategy:

ds = Dataset.from_images("images/", "dataset/")

Explicit strategy:

ds = Dataset.from_images(
    "images/",
    "dataset/",
    map_filepath_to_layer_name=ConversionLayerMapping.ENFORCE_SINGLE_LAYER
)

Custom mapping function:

ds = Dataset.from_images(
    "images/",
    "dataset/",
    map_filepath_to_layer_name=lambda p: p.stem
)
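
Before passing a custom callable to Dataset.from_images(), it can help to dry-run it on the file list and inspect the resulting grouping. The sketch below does that with the standard library only. layer_name_from_prefix and group_by_layer are hypothetical helpers, and the file names are invented.

```python
from collections import defaultdict
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def layer_name_from_prefix(path: Path) -> str:
    """Hypothetical mapping: 'color_0001.tif' becomes 'color'."""
    return path.stem.rsplit("_", 1)[0]


def group_by_layer(
    paths: Iterable[Path],
    mapping: Callable[[Path], str],
) -> Dict[str, List[Path]]:
    """Group file paths by the layer name that `mapping` assigns to them."""
    groups: Dict[str, List[Path]] = defaultdict(list)
    for path in sorted(paths):
        groups[mapping(path)].append(path)
    return dict(groups)


files = [
    Path("images/color_0002.tif"),
    Path("images/color_0001.tif"),
    Path("images/seg_0001.tif"),
]
groups = group_by_layer(files, layer_name_from_prefix)
```

If the grouping looks right, the same function could then be passed as map_filepath_to_layer_name=layer_name_from_prefix.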

ENFORCE_LAYER_PER_FILE class-attribute instance-attribute

ENFORCE_LAYER_PER_FILE = 'enforce_layer_per_file'

Creates a new layer for each input file. Useful for converting multiple 3D images or when each 2D image should become its own layer.

ENFORCE_LAYER_PER_FOLDER class-attribute instance-attribute

ENFORCE_LAYER_PER_FOLDER = 'enforce_layer_per_folder'

Groups files by their containing folder. Each folder becomes one layer. Useful for organized 2D image stacks.

ENFORCE_LAYER_PER_TOPLEVEL_FOLDER class-attribute instance-attribute

ENFORCE_LAYER_PER_TOPLEVEL_FOLDER = 'enforce_layer_per_toplevel_folder'

Groups files by their top-level folder. Useful when multiple layers each have their stacks split across subfolders.

ENFORCE_SINGLE_LAYER class-attribute instance-attribute

ENFORCE_SINGLE_LAYER = 'enforce_single_layer'

Combines all input files into a single layer. Only useful when all images are 2D slices that should be combined.

INSPECT_EVERY_FILE class-attribute instance-attribute

INSPECT_EVERY_FILE = 'inspect_every_file'

Like INSPECT_SINGLE_FILE but determines strategy separately for each file. More flexible but slower for many files.

INSPECT_SINGLE_FILE class-attribute instance-attribute

INSPECT_SINGLE_FILE = 'inspect_single_file'

Default strategy. Inspects first image file to determine if data is 2D or 3D. For 2D data uses ENFORCE_LAYER_PER_FOLDER, for 3D uses ENFORCE_LAYER_PER_FILE.

add_copy_layer

add_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, executor: Optional[Executor] = None) -> Layer

Copy layer from another dataset to this one.

Creates a new layer in this dataset by copying data and metadata from a layer in another dataset.

Parameters:

  • foreign_layer (Union[str, Path, Layer]) –

    Layer to copy (path or Layer object)

  • new_layer_name (Optional[str], default: None ) –

    Optional name for the new layer, uses original name if None

  • chunk_shape (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional shape of chunks for storage

  • chunks_per_shard (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional number of chunks per shard

  • data_format (Optional[Union[str, DataFormat]], default: None ) –

    Optional format to store copied data ('wkw', 'zarr', etc.)

  • compress (Optional[bool], default: None ) –

    Optional whether to compress copied data

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel copying

Returns:

  • Layer ( Layer ) –

    The newly created copy of the layer

Raises:

  • IndexError

    If target layer name already exists

  • RuntimeError

    If dataset is read-only

Examples:

Copy layer keeping same name:

other_ds = Dataset.open("other/dataset")
copied = ds.add_copy_layer(other_ds.get_layer("color"))

Copy with new name:

copied = ds.add_copy_layer(
    other_ds.get_layer("color"),
    new_layer_name="color_copy",
    compress=True
)

add_fs_copy_layer

add_fs_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None) -> Layer

Copies the files at foreign_layer, which belongs to another dataset, to the current dataset via the filesystem. The relevant information from the other dataset's datasource-properties.json is copied as well. If new_layer_name is None, the name of the foreign layer is used.

add_layer

add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, bounding_box: Optional[NDBoundingBox] = None, **kwargs: Any) -> Layer

Create a new layer in the dataset.

Creates a new layer with the given name, category, and data type.

Parameters:

  • layer_name (str) –

    Name for the new layer

  • category (LayerCategoryType) –

    Either 'color' or 'segmentation'

  • dtype_per_layer (Optional[DTypeLike], default: None ) –

    Optional data type for entire layer, e.g. np.uint8

  • dtype_per_channel (Optional[DTypeLike], default: None ) –

    Optional data type per channel, e.g. np.uint8

  • num_channels (Optional[int], default: None ) –

    Number of channels (default 1)

  • data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT ) –

    Format to store data ('wkw', 'zarr', 'zarr3')

  • bounding_box (Optional[NDBoundingBox], default: None ) –

    Optional initial bounding box of layer

  • **kwargs (Any, default: {} ) –

    Additional arguments:

      • largest_segment_id: For segmentation layers, initial largest ID
      • mappings: For segmentation layers, optional ID mappings

Returns:

  • Layer ( Layer ) –

    The newly created layer

Raises:

  • IndexError

    If layer with given name already exists

  • RuntimeError

    If invalid category specified

  • AttributeError

    If both dtype_per_layer and dtype_per_channel specified

  • AssertionError

    If invalid layer name or WKW format used with remote dataset

Examples:

Create color layer:

layer = ds.add_layer(
    "my_raw_microscopy_layer",
    "color",  # category: 'color' or 'segmentation'
    dtype_per_channel=np.uint8,
)

Create segmentation layer:

layer = ds.add_layer(
    "my_segmentation_labels",
    "segmentation",  # category: 'color' or 'segmentation'
    dtype_per_channel=np.uint64,
)
Note

The dtype can be specified either per layer or per channel, but not both. If neither is specified, uint8 per channel is used by default. WKW format can only be used with local datasets.
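
The dtype rule in this note can be summarized as a small decision function. This is an illustrative sketch, not the library's code: resolve_dtype is a hypothetical helper, dtypes are represented as strings to avoid extra dependencies, and AttributeError is used because it is the exception listed under Raises above.

```python
from typing import Optional, Tuple


def resolve_dtype(
    dtype_per_layer: Optional[str] = None,
    dtype_per_channel: Optional[str] = None,
) -> Tuple[str, str]:
    """Return which kind of dtype applies and its value."""
    if dtype_per_layer is not None and dtype_per_channel is not None:
        raise AttributeError(
            "Specify either dtype_per_layer or dtype_per_channel, not both"
        )
    if dtype_per_layer is not None:
        return ("per_layer", dtype_per_layer)
    # Neither given: fall back to the documented default of uint8 per channel.
    return ("per_channel", dtype_per_channel or "uint8")


default_choice = resolve_dtype()
explicit_choice = resolve_dtype(dtype_per_channel="uint64")
try:
    resolve_dtype(dtype_per_layer="uint8", dtype_per_channel="uint8")
    both_rejected = False
except AttributeError:
    both_rejected = True
```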

add_layer_for_existing_files

add_layer_for_existing_files(layer_name: str, category: LayerCategoryType, **kwargs: Any) -> Layer

Create a new layer from existing data files.

Adds a layer by discovering and incorporating existing data files that were created externally, rather than creating new ones. The layer properties are inferred from the existing files unless overridden.

Parameters:

  • layer_name (str) –

    Name for the new layer

  • category (LayerCategoryType) –

    Layer category ('color' or 'segmentation')

  • **kwargs (Any, default: {} ) –

    Additional arguments:

      • num_channels: Override detected number of channels
      • dtype_per_channel: Override detected data type
      • data_format: Override detected data format
      • bounding_box: Override detected bounding box

Returns:

  • Layer ( Layer ) –

    The newly created layer referencing the existing files

Raises:

  • AssertionError

    If layer already exists or no valid files found

  • RuntimeError

    If dataset is read-only

Examples:

Basic usage:

layer = ds.add_layer_for_existing_files(
    "external_data",
    "color"
)

Override properties:

layer = ds.add_layer_for_existing_files(
    "segmentation_data",
    "segmentation",
    dtype_per_channel=np.uint64
)
Note

The data files must already exist in the dataset directory under the layer name. Files are analyzed to determine properties like data type and number of channels. Magnifications are discovered automatically.

add_layer_from_images

add_layer_from_images(images: Union[str, FramesSequence, List[Union[str, PathLike]]], layer_name: str, category: Optional[LayerCategoryType] = 'color', data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, mag: Union[int, str, list, tuple, ndarray, Mag] = Mag(1), chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, *, topleft: VecIntLike = zeros(), swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, dtype: Optional[DTypeLike] = None, use_bioformats: Optional[bool] = None, channel: Optional[int] = None, timepoint: Optional[int] = None, czi_channel: Optional[int] = None, batch_size: Optional[int] = None, allow_multiple_layers: bool = False, max_layers: int = 20, truncate_rgba_to_rgb: bool = True, executor: Optional[Executor] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None) -> Layer

Creates a new layer called layer_name with mag mag from images. images can be one of the following:

  • glob-string
  • list of paths
  • pims.FramesSequence instance

Please see the pims docs for more information.

This method needs extra packages like tifffile or pylibczirw. Please install the respective extras, e.g. using python -m pip install "webknossos[all]".

Further Arguments:

  • category: color by default, may be set to "segmentation"
  • data_format: by default wkw files are written, may be set to "zarr"
  • mag: magnification to use for the written data
  • chunk_shape, chunks_per_shard, compress: adjust how the data is stored on disk
  • topleft: set an offset in Mag(1) to start writing the data, only affecting the output
  • swap_xy: set to True to interchange x and y axis before writing to disk
  • flip_x, flip_y, flip_z: set to True to reverse the respective axis before writing to disk
  • dtype: the read image data will be converted to this dtype using numpy.ndarray.astype
  • use_bioformats: set to True to only use the pims bioformats adapter directly, needs a JVM, set to False to forbid using the bioformats adapter, by default it is tried as a last option
  • channel: may be used to select a single channel, if multiple are available
  • timepoint: for timeseries, select a timepoint to use by specifying it as an int, starting from 0
  • czi_channel: may be used to select a channel for .czi images, which differs from normal color-channels
  • batch_size: size to process the images (influences RAM consumption), must be a multiple of the chunk-size z-axis for uncompressed and the shard-size z-axis for compressed layers, default is the chunk-size or shard-size respectively
  • allow_multiple_layers: set to True if timepoints or channels may result in multiple layers being added (only the first is returned)
  • max_layers: only applies if allow_multiple_layers=True, limits the number of layers added via different channels or timepoints
  • truncate_rgba_to_rgb: only applies if allow_multiple_layers=True, set to False to write four channels into layers instead of an RGB channel
  • executor: pass a ClusterExecutor instance to parallelize the conversion jobs across the batches
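
The batch_size constraint above (a multiple of the chunk size along z for uncompressed layers, of the shard size along z for compressed layers) can be checked ahead of a long conversion. The sketch below is a hedged illustration: is_valid_batch_size is a hypothetical helper, and the chunk and shard sizes of 32 and 1024 are example values, not defaults taken from the library.

```python
def required_z_multiple(chunk_z: int, shard_z: int, compress: bool) -> int:
    """Return the z size that batch_size has to be a multiple of."""
    return shard_z if compress else chunk_z


def is_valid_batch_size(
    batch_size: int, chunk_z: int, shard_z: int, compress: bool
) -> bool:
    """Check the documented multiple-of rule for batch_size."""
    multiple = required_z_multiple(chunk_z, shard_z, compress)
    return batch_size > 0 and batch_size % multiple == 0


CHUNK_Z = 32     # example chunk size along z
SHARD_Z = 1024   # example shard size along z

ok_uncompressed = is_valid_batch_size(64, CHUNK_Z, SHARD_Z, compress=False)
bad_compressed = is_valid_batch_size(64, CHUNK_Z, SHARD_Z, compress=True)
ok_compressed = is_valid_batch_size(2048, CHUNK_Z, SHARD_Z, compress=True)
```

Larger batches mean fewer write passes but more images held in memory at once, which is why batch_size influences RAM consumption.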

add_layer_like

add_layer_like(other_layer: Layer, layer_name: str) -> Layer

add_remote_layer

add_remote_layer(foreign_layer: Union[str, UPath, Layer], new_layer_name: Optional[str] = None) -> Layer

Add a remote layer from another dataset.

Creates a layer that references data from a remote dataset. The image data will be streamed on-demand when accessed.

Parameters:

  • foreign_layer (Union[str, UPath, Layer]) –

    Remote layer to add (path or Layer object)

  • new_layer_name (Optional[str], default: None ) –

    Optional name for the new layer, uses original name if None

Returns:

  • Layer ( Layer ) –

    The newly created remote layer referencing the foreign data

Raises:

  • IndexError

    If target layer name already exists

  • AssertionError

    If trying to add non-remote layer or same origin dataset

  • RuntimeError

    If dataset is read-only

Examples:

ds = Dataset.open("other/dataset")
remote_ds = Dataset.open_remote("my_dataset", "my_org_id")
new_layer = ds.add_remote_layer(
    remote_ds.get_layer("color")
)
Note

Changes to the original layer's properties afterwards won't affect this dataset. Data is only referenced, not copied.

add_symlink_layer

add_symlink_layer(foreign_layer: Union[str, Path, Layer], make_relative: bool = False, new_layer_name: Optional[str] = None) -> Layer

Create symbolic link to layer from another dataset.

Instead of copying data, creates a symbolic link to the original layer's data and copies only the layer metadata. Changes to the original layer's properties, e.g. bounding box, afterwards won't affect this dataset and vice-versa.

Parameters:

  • foreign_layer (Union[str, Path, Layer]) –

    Layer to link to (path or Layer object)

  • make_relative (bool, default: False ) –

    Whether to create relative symlinks

  • new_layer_name (Optional[str], default: None ) –

    Optional name for the linked layer, uses original name if None

Returns:

  • Layer ( Layer ) –

    The newly created symbolic link layer

Raises:

  • IndexError

    If target layer name already exists

  • AssertionError

    If trying to create symlinks in/to remote datasets

  • RuntimeError

    If dataset is read-only

Examples:

other_ds = Dataset.open("other/dataset")
linked = ds.add_symlink_layer(
    other_ds.get_layer("color"),
    make_relative=True
)
Note

Only works with local file systems, cannot link remote datasets or create symlinks in remote datasets.

announce_manual_upload classmethod

announce_manual_upload(dataset_name: str, organization: str, initial_team_ids: List[str], folder_id: str, token: Optional[str] = None) -> None

Announce a manual dataset upload to WEBKNOSSOS.

Used when manually uploading datasets to the file system of a datastore. Creates database entries and sets access rights on the webknossos instance before the actual data upload.

Parameters:

  • dataset_name (str) –

    Name for the new dataset

  • organization (str) –

    Organization ID to upload to

  • initial_team_ids (List[str]) –

    List of team IDs to grant initial access

  • folder_id (str) –

    ID of folder where dataset should be placed

  • token (Optional[str], default: None ) –

    Optional authentication token

Note

This is typically only used by administrators with direct file system access to the WEBKNOSSOS datastore. Most users should use upload() instead.

Examples:

Dataset.announce_manual_upload(
    "my_dataset",
    "my_organization",
    ["team_a", "team_b"],
    "folder_123"
)

calculate_bounding_box

calculate_bounding_box() -> NDBoundingBox

Calculate the enclosing bounding box of all layers.

Finds the smallest box that contains all data from all layers in the dataset.

Returns:

  • NDBoundingBox ( NDBoundingBox ) –

    Bounding box containing all layer data

Examples:

bbox = ds.calculate_bounding_box()
print(f"Dataset spans {bbox.size} voxels")
print(f"Dataset starts at {bbox.topleft}")
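
Conceptually, the enclosing box is the component-wise minimum of all top-left corners together with the component-wise maximum of all bottom-right corners. The sketch below shows that computation for 3D boxes given as (topleft, size) tuples. enclosing_box is a hypothetical helper and does not use the library's NDBoundingBox class.

```python
from typing import Iterable, Tuple

Vec3 = Tuple[int, int, int]
Box = Tuple[Vec3, Vec3]  # (topleft, size)


def enclosing_box(boxes: Iterable[Box]) -> Box:
    """Return the smallest axis-aligned box that contains all given boxes."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("need at least one box")
    topleft = tuple(min(box[0][axis] for box in boxes) for axis in range(3))
    bottomright = tuple(
        max(box[0][axis] + box[1][axis] for box in boxes) for axis in range(3)
    )
    size = tuple(hi - lo for lo, hi in zip(topleft, bottomright))
    return topleft, size


layer_boxes = [
    ((0, 0, 0), (100, 100, 50)),      # e.g. a color layer
    ((50, 50, 10), (100, 100, 100)),  # e.g. a segmentation layer
]
union_topleft, union_size = enclosing_box(layer_boxes)
```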

compress

compress(executor: Optional[Executor] = None) -> None

Compress all uncompressed magnifications in-place.

Compresses the data of all magnification levels that aren't already compressed, for all layers in the dataset.

Parameters:

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel compression

Raises:

  • RuntimeError

    If dataset is read-only

Examples:

ds.compress()

Note

If data is already compressed, this will have no effect.

copy_dataset

copy_dataset(new_dataset_path: Union[str, Path], voxel_size: Optional[Tuple[float, float, float]] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, args: Optional[Namespace] = None, executor: Optional[Executor] = None, *, voxel_size_with_unit: Optional[VoxelSize] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None, block_len: Optional[int] = None, file_len: Optional[int] = None) -> Dataset

Creates an independent copy of the dataset with all layers at a new location. Data storage parameters can be customized for the copied dataset.

Parameters:

  • new_dataset_path (Union[str, Path]) –

    Path where new dataset should be created

  • voxel_size (Optional[Tuple[float, float, float]], default: None ) –

    Optional tuple of floats (x,y,z) specifying voxel size in nanometers

  • chunk_shape (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional shape of chunks for data storage

  • chunks_per_shard (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional number of chunks per shard

  • data_format (Optional[Union[str, DataFormat]], default: None ) –

    Optional format to store data ('wkw', 'zarr', 'zarr3')

  • compress (Optional[bool], default: None ) –

    Optional whether to compress data

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel copying

  • voxel_size_with_unit (Optional[VoxelSize], default: None ) –

    Optional voxel size specification with units

  • **kwargs

    Additional deprecated arguments:

      • chunk_size: Use chunk_shape instead
      • block_len: Use chunk_shape instead
      • file_len: Use chunks_per_shard instead
      • args: Use executor instead

Returns:

  • Dataset ( Dataset ) –

    The newly created copy

Raises:

  • AssertionError

    If trying to copy WKW layers to remote dataset

Examples:

Basic copy:

copied = ds.copy_dataset("path/to/copy")

Copy with different storage:

copied = ds.copy_dataset(
    "path/to/copy",
    data_format="zarr",
    compress=True
)

Note

WKW layers can only be copied to datasets on local file systems. For remote datasets, use data_format='zarr'.
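When migrating older calling code, the deprecated argument names listed above can be renamed mechanically. The following is a sketch of such a migration helper, assuming the simple one-to-one renames from the parameter list; `translate_deprecated_kwargs` is a hypothetical name and not part of webknossos. The `args` argument is left out because replacing it with `executor` is not a plain rename.

```python
import warnings
from typing import Any, Dict

# Old name -> replacement, taken from the deprecated-argument list above.
_DEPRECATED = {
    "chunk_size": "chunk_shape",
    "block_len": "chunk_shape",
    "file_len": "chunks_per_shard",
}


def translate_deprecated_kwargs(kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of kwargs with deprecated names replaced by current ones."""
    result: Dict[str, Any] = {}
    for key, value in kwargs.items():
        new_key = _DEPRECATED.get(key, key)
        if new_key != key:
            warnings.warn(
                f"{key} is deprecated, use {new_key} instead",
                DeprecationWarning,
                stacklevel=2,
            )
        # Refuse silently conflicting values for the same target argument.
        if new_key in result and result[new_key] != value:
            raise ValueError(f"conflicting values for {new_key}")
        result[new_key] = value
    return result


legacy_call = {"chunk_size": 32, "file_len": 16, "compress": True}
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    modern_call = translate_deprecated_kwargs(legacy_call)
print(modern_call)
```

The translated dictionary could then be passed on as keyword arguments, for example `ds.copy_dataset("path/to/copy", **modern_call)`.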

create classmethod

create(dataset_path: Union[str, PathLike], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset

Deprecated, please use the constructor Dataset() instead.

delete_layer

delete_layer(layer_name: str) -> None

Delete a layer from the dataset.

Removes the layer's data and metadata from disk completely. This deletes both the datasource-properties.json entry and all data files for the layer.

Parameters:

  • layer_name (str) –

    Name of layer to delete

Raises:

  • IndexError

    If no layer with the given name exists

  • RuntimeError

    If dataset is read-only

Examples:

ds.delete_layer("old_layer")
print("Remaining layers:", list(ds.layers))

download classmethod

download(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None, bbox: Optional[BoundingBox] = None, layers: Union[List[str], str, None] = None, mags: Optional[List[Mag]] = None, path: Optional[Union[PathLike, str]] = None, exist_ok: bool = False) -> Dataset

Downloads a dataset and returns the Dataset instance.

  • dataset_name_or_url may be a dataset name or a full URL to a dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view. If a URL is used, organization_id, webknossos_url and sharing_token must not be set.
  • organization_id may be supplied if a dataset name was used in the previous argument. It defaults to your current organization from the webknossos_context. You can find your organization_id here.
  • sharing_token may be supplied if a dataset name was used and can specify a sharing token.
  • webknossos_url may be supplied if a dataset name was used, and specifies which webknossos instance to search for the dataset. It defaults to the URL from your current webknossos_context, using https://webknossos.org as a fallback.
  • bbox, layers, and mags specify which parts of the dataset to download. If nothing is specified the whole image, all layers, and all mags are downloaded respectively.
  • path and exist_ok specify where to save the downloaded dataset and whether to overwrite if the path exists.
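The rule that a URL excludes the other identifying arguments can be sketched as follows. This is an illustration only, not the library's code: `resolve_dataset_reference` is a hypothetical helper, the URL layout `/datasets/<organization>/<dataset>/view` is inferred from the example URL above, and the sketch raises `ValueError` where the library may use a different exception.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def resolve_dataset_reference(
    dataset_name_or_url: str,
    organization_id: Optional[str] = None,
    sharing_token: Optional[str] = None,
    webknossos_url: Optional[str] = None,
) -> Tuple[str, Optional[str], Optional[str], Optional[str]]:
    """Return (dataset_name, organization_id, sharing_token, webknossos_url)."""
    parsed = urlparse(dataset_name_or_url)
    if parsed.scheme not in ("http", "https"):
        # A plain dataset name: the optional arguments are passed through unchanged.
        return dataset_name_or_url, organization_id, sharing_token, webknossos_url
    if any(v is not None for v in (organization_id, sharing_token, webknossos_url)):
        raise ValueError(
            "When a URL is used, organization_id, sharing_token "
            "and webknossos_url must not be set"
        )
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed layout: /datasets/<organization>/<dataset>/view
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"Unrecognized dataset URL: {dataset_name_or_url}")
    return parts[2], parts[1], None, f"{parsed.scheme}://{parsed.netloc}"


from_url = resolve_dataset_reference(
    "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
)
from_name = resolve_dataset_reference("l4_sample_dev", organization_id="scalable_minds")
print(from_url)
print(from_name)
```

Both calls identify the same dataset; the URL form carries the organization and server in one string, which is why the separate arguments must stay unset.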

downsample

downsample(sampling_mode: SamplingModes = ANISOTROPIC, coarsest_mag: Optional[Mag] = None, executor: Optional[Executor] = None) -> None

Generate downsampled magnifications for all layers.

Creates lower resolution versions (coarser magnifications) of all layers that are not yet downsampled, up to the specified coarsest magnification.

Parameters:

  • sampling_mode (SamplingModes, default: ANISOTROPIC ) –

    Strategy for downsampling (e.g. ANISOTROPIC, MAX)

  • coarsest_mag (Optional[Mag], default: None ) –

    Optional maximum/coarsest magnification to generate

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallel processing

Raises:

  • RuntimeError

    If dataset is read-only

Examples:

Basic downsampling:

ds.downsample()

With custom parameters:

ds.downsample(
    sampling_mode=SamplingModes.ANISOTROPIC,
    coarsest_mag=Mag(8),
)

Note
  • ANISOTROPIC sampling downsamples anisotropically until the dataset is isotropic
  • Other modes like MAX, CONSTANT etc. create regular downsampling patterns
  • If magnifications already exist they will not be regenerated
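To build intuition for anisotropic downsampling, the sketch below derives a magnification sequence from a voxel size. It uses a simple heuristic chosen for illustration (an axis is doubled only while its effective voxel size is within a factor of √2 of the finest axis) and should not be read as the algorithm webknossos actually uses; `sketch_anisotropic_mags` is a hypothetical name.

```python
from typing import List, Sequence, Tuple


def sketch_anisotropic_mags(
    voxel_size: Sequence[float], coarsest: int
) -> List[Tuple[int, ...]]:
    """Illustrative heuristic: list per-axis mags up to a coarsest factor."""
    mag: Tuple[int, ...] = (1, 1, 1)
    mags: List[Tuple[int, ...]] = []
    while max(mag) < coarsest:
        # Physical size of one voxel at the current magnification.
        effective = [v * m for v, m in zip(voxel_size, mag)]
        smallest = min(effective)
        # Double axes that are still fine; axes that are already much
        # coarser than the finest one wait until the others catch up.
        mag = tuple(
            m * 2 if e < smallest * 2 ** 0.5 else m
            for m, e in zip(mag, effective)
        )
        mags.append(mag)
    return mags


# Voxel size from the constructor example: x and y are finer than z.
anisotropic = sketch_anisotropic_mags((11.2, 11.2, 25), coarsest=8)
isotropic = sketch_anisotropic_mags((1, 1, 1), coarsest=4)
print(anisotropic)
print(isotropic)
```

In the first case z is skipped once, so that the voxels become roughly cubic before all axes are downsampled together.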

explore_and_add_remote classmethod

explore_and_add_remote(dataset_uri: Union[str, PathLike], dataset_name: str, folder_path: str) -> RemoteDataset

Explore and add an external dataset as a remote dataset.

Adds a dataset from an external location (e.g. S3, Google Cloud Storage, or HTTPS) to WEBKNOSSOS by inspecting its layout and metadata without copying the data.

Parameters:

  • dataset_uri (Union[str, PathLike]) –

    URI pointing to the remote dataset location

  • dataset_name (str) –

    Name to register dataset under in WEBKNOSSOS

  • folder_path (str) –

    Path in WEBKNOSSOS folder structure where dataset should appear

Returns:

  • RemoteDataset ( RemoteDataset ) –

    The newly added dataset accessible via WEBKNOSSOS

Examples:

remote = Dataset.explore_and_add_remote(
    "s3://bucket/dataset",
    "my_dataset",
    "Datasets/Research"
)

Note

The dataset files must be accessible from the WEBKNOSSOS server for this to work. The data will be streamed directly from the source.

from_images classmethod

from_images(input_path: Union[str, PathLike], output_path: Union[str, PathLike], voxel_size: Optional[Tuple[float, float, float]] = None, name: Optional[str] = None, *, map_filepath_to_layer_name: Union[ConversionLayerMapping, Callable[[Path], str]] = INSPECT_SINGLE_FILE, z_slices_sort_key: Callable[[Path], Any] = natsort_keygen(), voxel_size_with_unit: Optional[VoxelSize] = None, layer_name: Optional[str] = None, layer_category: Optional[LayerCategoryType] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, use_bioformats: Optional[bool] = None, max_layers: int = 20, batch_size: Optional[int] = None, executor: Optional[Executor] = None) -> Dataset

This method imports image data in a folder or from a file as a webknossos dataset.

The image data can be 3D images (such as multipage tiffs) or stacks of 2D images. Multiple 3D images or image stacks are mapped to different layers based on the mapping strategy.

The exact mapping is handled by the argument map_filepath_to_layer_name, which can be a pre-defined strategy from the enum ConversionLayerMapping, or a custom callable, taking a path of an image file and returning the corresponding layer name. All files belonging to the same layer name are then grouped. In case of multiple files per layer, those are usually mapped to the z-dimension. The order of the z-slices can be customized by setting z_slices_sort_key.

For more fine-grained control, please create an empty dataset and use add_layer_from_images.
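The grouping described above can be previewed before running a conversion. The following sketch assumes a folder-per-layer layout and uses only the standard library: `layer_name_from_parent` is a hypothetical custom callable with the `Callable[[Path], str]` shape expected by map_filepath_to_layer_name, and `natural_key` is a simplified stand-in for the default `natsort_keygen()` sort key. It is not how the library groups files internally.

```python
import re
from collections import defaultdict
from pathlib import Path
from typing import Dict, Iterable, List


def layer_name_from_parent(path: Path) -> str:
    """Custom mapping: every folder of images becomes one layer."""
    return path.parent.name


def natural_key(path: Path) -> list:
    """Sort key that orders 'z2.tif' before 'z10.tif'."""
    tokens = re.split(r"(\d+)", path.name)
    return [int(tok) if tok.isdigit() else tok.lower() for tok in tokens]


def group_into_layers(paths: Iterable[Path]) -> Dict[str, List[Path]]:
    """Group files by layer name and order each group as z-slices."""
    layers: Dict[str, List[Path]] = defaultdict(list)
    for path in paths:
        layers[layer_name_from_parent(path)].append(path)
    return {name: sorted(files, key=natural_key) for name, files in layers.items()}


files = [
    Path("scan/color/z10.tif"),
    Path("scan/color/z2.tif"),
    Path("scan/segmentation/z1.tif"),
]
grouped = group_into_layers(files)
for name, slices in grouped.items():
    print(name, [p.name for p in slices])
```

Given the signature above, the same callables could presumably be passed as `map_filepath_to_layer_name=layer_name_from_parent` and `z_slices_sort_key=natural_key`.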

Parameters:

  • input_path (Union[str, PathLike]) –

    Path to input image files

  • output_path (Union[str, PathLike]) –

    Output path for created dataset

  • voxel_size (Optional[Tuple[float, float, float]], default: None ) –

    Optional tuple of floats (x,y,z) for voxel size in nm

  • name (Optional[str], default: None ) –

    Optional name for dataset

  • map_filepath_to_layer_name (Union[ConversionLayerMapping, Callable[[Path], str]], default: INSPECT_SINGLE_FILE ) –

    Strategy for mapping files to layers, either a ConversionLayerMapping enum value or callable taking Path and returning str

  • z_slices_sort_key (Callable[[Path], Any], default: natsort_keygen() ) –

    Optional key function for sorting z-slices

  • voxel_size_with_unit (Optional[VoxelSize], default: None ) –

    Optional voxel size with unit specification

  • layer_name (Optional[str], default: None ) –

    Optional name for layer(s)

  • layer_category (Optional[LayerCategoryType], default: None ) –

    Optional category override ('color' / 'segmentation')

  • data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT ) –

    Format to store data in ('wkw'/'zarr')

  • chunk_shape (Optional[Union[Vec3IntLike, int]], default: None ) –

    Optional shape of chunks to store data in

  • chunks_per_shard (Optional[Union[int, Vec3IntLike]], default: None ) –

    Optional number of chunks per shard

  • compress (bool, default: False ) –

    Whether to compress the data

  • swap_xy (bool, default: False ) –

    Whether to swap x and y axes

  • flip_x (bool, default: False ) –

    Whether to flip the x axis

  • flip_y (bool, default: False ) –

    Whether to flip the y axis

  • flip_z (bool, default: False ) –

    Whether to flip the z axis

  • use_bioformats (Optional[bool], default: None ) –

    Whether to use bioformats for reading

  • max_layers (int, default: 20 ) –

    Maximum number of layers to create

  • batch_size (Optional[int], default: None ) –

    Size of batches for processing

  • executor (Optional[Executor], default: None ) –

    Optional executor for parallelization

Returns:

  • Dataset ( Dataset ) –

    The created dataset instance

Examples:

ds = Dataset.from_images("path/to/images/",
                        "path/to/dataset/",
                        voxel_size=(1, 1, 1))

Note

This method needs extra packages like tifffile or pylibczirw. Install with pip install "webknossos[all]" and pip install --extra-index-url https://pypi.scm.io/simple/ "webknossos[czi]".

get_color_layer

get_color_layer() -> Layer

Deprecated, please use get_color_layers().

Returns the only color layer. Fails with a RuntimeError if there are multiple color layers or none.

get_color_layers

get_color_layers() -> List[Layer]

Get all color layers in the dataset.

Provides access to all layers with category 'color'. Useful when a dataset contains multiple color layers.

Returns:

  • List[Layer]

    List of all color layers in order

Examples:

Print all color layer names:

for layer in ds.get_color_layers():
    print(layer.name)

Note

If you need only a single color layer, consider using get_layer() with the specific layer name instead.

get_layer

get_layer(layer_name: str) -> Layer

Get a specific layer from this dataset.

Parameters:

  • layer_name (str) –

    Name of the layer to retrieve

Returns:

  • Layer ( Layer ) –

    The requested layer object

Raises:

  • IndexError

    If no layer with the given name exists

Examples:

color_layer = ds.get_layer("color")
seg_layer = ds.get_layer("segmentation")

Note

Use the layers property to access all layers at once.

get_or_add_layer

get_or_add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, **kwargs: Any) -> Layer

Get an existing layer or create a new one.

Gets a layer with the given name if it exists, otherwise creates a new layer with the specified parameters.

Parameters:

  • layer_name (str) –

    Name of the layer to get or create

  • category (LayerCategoryType) –

    Layer category ('color' or 'segmentation')

  • dtype_per_layer (Optional[DTypeLike], default: None ) –

    Optional data type for entire layer

  • dtype_per_channel (Optional[DTypeLike], default: None ) –

    Optional data type per channel

  • num_channels (Optional[int], default: None ) –

    Optional number of channels

  • data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT ) –

    Format to store data ('wkw', 'zarr', etc.)

  • **kwargs (Any, default: {} ) –

    Additional arguments passed to add_layer()

Returns:

  • Layer ( Layer ) –

    The existing or newly created layer

Raises:

  • AssertionError

    If existing layer's properties don't match specified parameters

  • ValueError

    If both dtype_per_layer and dtype_per_channel specified

  • RuntimeError

    If invalid category specified

Examples:

import numpy as np

layer = ds.get_or_add_layer(
    "segmentation",
    "segmentation",  # category: 'color' or 'segmentation'
    dtype_per_channel=np.uint64,
)

Note

The dtype can be specified either per layer or per channel, but not both. For existing layers, the parameters are validated against the layer properties.
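The either-or rule for the two dtype arguments can be sketched as below. This is an illustration under stated assumptions, not the library's code: `resolve_channel_dtype` is a hypothetical helper, dtypes are plain strings such as `"uint8"`, the per-layer dtype is assumed to be the per-channel bit width multiplied by the number of channels (for example `"uint24"` for three `uint8` channels), and the `"uint8"` default is an assumption as well.

```python
import re
from typing import Optional


def resolve_channel_dtype(
    dtype_per_layer: Optional[str] = None,
    dtype_per_channel: Optional[str] = None,
    num_channels: int = 1,
) -> str:
    """Return the per-channel dtype name implied by the arguments."""
    if dtype_per_layer is not None and dtype_per_channel is not None:
        raise ValueError("Specify dtype_per_layer or dtype_per_channel, not both")
    if dtype_per_channel is not None:
        return dtype_per_channel
    if dtype_per_layer is None:
        return "uint8"  # assumed default for this sketch
    match = re.fullmatch(r"([a-z]+)(\d+)", dtype_per_layer)
    if match is None:
        raise ValueError(f"Cannot parse dtype {dtype_per_layer!r}")
    kind, bits = match.group(1), int(match.group(2))
    if bits % num_channels != 0:
        raise ValueError("Bit width is not divisible by the number of channels")
    # Assumption: the layer dtype spreads its bits evenly over all channels.
    return f"{kind}{bits // num_channels}"


rgb = resolve_channel_dtype(dtype_per_layer="uint24", num_channels=3)
seg = resolve_channel_dtype(dtype_per_channel="uint64")
print(rgb, seg)
```

Passing both arguments raises `ValueError`, mirroring the error listed under Raises.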

get_or_create classmethod

get_or_create(dataset_path: Union[str, Path], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset

Deprecated, please use the constructor Dataset() instead.

get_remote_datasets staticmethod

get_remote_datasets(organization_id: Optional[str] = None, tags: Optional[Union[str, Sequence[str]]] = None) -> Mapping[str, RemoteDataset]

Get available datasets from WEBKNOSSOS.

Returns a mapping of dataset names to lazy-initialized RemoteDataset objects for all datasets visible to the specified organization or current user.

Parameters:

  • organization_id (Optional[str], default: None ) –

    Optional organization to get datasets from. Defaults to organization of logged in user.

  • tags (Optional[Union[str, Sequence[str]]], default: None ) –

    Optional tag(s) to filter datasets by. Can be a single tag string or sequence of tags. Only returns datasets with all specified tags.

Returns:

  • Mapping[str, RemoteDataset]

    Dict mapping dataset names to RemoteDataset objects

Examples:

List all available datasets:

datasets = Dataset.get_remote_datasets()
print(sorted(datasets.keys()))

Get datasets for specific organization:

org_datasets = Dataset.get_remote_datasets("my_organization")
ds = org_datasets["dataset_name"]

Filter datasets by tag:

published = Dataset.get_remote_datasets(tags="published")
tagged = Dataset.get_remote_datasets(tags=["tag1", "tag2"])

Note

RemoteDataset objects are initialized lazily when accessed for the first time. The mapping object provides a fast way to list and look up available datasets.
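The lazy behaviour described in the note can be modelled with a small mapping class. The sketch below is a generic illustration of the pattern, not the class webknossos uses: `LazyMapping` and `open_dataset` are hypothetical names, and a plain dict stands in for a `RemoteDataset`.

```python
from collections.abc import Mapping
from typing import Any, Callable, Dict, Iterable, Iterator, List


class LazyMapping(Mapping):
    """Mapping whose values are created on first access and then cached."""

    def __init__(self, names: Iterable[str], factory: Callable[[str], Any]) -> None:
        self._names: List[str] = list(names)
        self._factory = factory
        self._cache: Dict[str, Any] = {}

    def __getitem__(self, name: str) -> Any:
        if name not in self._names:
            raise KeyError(name)
        if name not in self._cache:
            # The expensive object is only built when it is first requested.
            self._cache[name] = self._factory(name)
        return self._cache[name]

    def __iter__(self) -> Iterator[str]:
        return iter(self._names)

    def __len__(self) -> int:
        return len(self._names)


created: List[str] = []


def open_dataset(name: str) -> Dict[str, str]:
    """Stand-in for an expensive remote lookup; records each call."""
    created.append(name)
    return {"name": name}


datasets = LazyMapping(["a", "b"], open_dataset)
names = sorted(datasets.keys())  # listing names does not call the factory
first = datasets["a"]            # first access builds and caches the object
print(names, created)
```

Listing and counting the keys stays cheap because only the names are known up front; each object is constructed once, on demand.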

get_segmentation_layer

get_segmentation_layer() -> SegmentationLayer

Deprecated, please use get_segmentation_layers().

Returns the only segmentation layer. Fails with an IndexError if there are multiple segmentation layers or none.

get_segmentation_layers

get_segmentation_layers() -> List[SegmentationLayer]

Get all segmentation layers in the dataset.

Provides access to all layers with category 'segmentation'. Useful when a dataset contains multiple segmentation layers.

Returns:

  • List[SegmentationLayer]

    List of all segmentation layers in order

Examples:

Print all segmentation layer names:

for layer in ds.get_segmentation_layers():
    print(layer.name)

Note

If you need only a single segmentation layer, consider using get_layer() with the specific layer name instead.

open classmethod

open(dataset_path: Union[str, PathLike]) -> Dataset

Opens an existing dataset on disk. To open a dataset hosted on a webknossos instance, use Dataset.open_remote() instead.

open_remote classmethod

open_remote(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None) -> RemoteDataset

Opens a remote webknossos dataset. Image data is accessed via network requests. Dataset metadata such as allowed teams or the sharing token can be read and set via the respective RemoteDataset properties.

Parameters:

  • dataset_name_or_url (str) –

    Either dataset name or full URL to dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view

  • organization_id (Optional[str], default: None ) –

    Optional organization ID if using dataset name. Can be found here

  • sharing_token (Optional[str], default: None ) –

    Optional sharing token for dataset access

  • webknossos_url (Optional[str], default: None ) –

    Optional custom webknossos URL, defaults to context URL, usually https://webknossos.org

Returns:

  • RemoteDataset ( RemoteDataset ) –

    Dataset instance for remote access

Examples:

ds = Dataset.open_remote("https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view")

Note

If supplying a URL, organization_id, webknossos_url and sharing_token must not be set.

shallow_copy_dataset

shallow_copy_dataset(new_dataset_path: Union[str, PathLike], name: Optional[str] = None, make_relative: bool = False, layers_to_ignore: Optional[Iterable[str]] = None) -> Dataset

Create a new dataset that uses symlinks to reference data.

Links all magnifications and layer directories from the original dataset via symlinks rather than copying data. Useful for creating alternative views or exposing datasets to webknossos.

Parameters:

  • new_dataset_path (Union[str, PathLike]) –

    Path where new dataset should be created

  • name (Optional[str], default: None ) –

    Optional name for the new dataset, uses original name if None

  • make_relative (bool, default: False ) –

    Whether to create relative symlinks

  • layers_to_ignore (Optional[Iterable[str]], default: None ) –

    Optional iterable of layer names to exclude

Returns:

  • Dataset ( Dataset ) –

    The newly created dataset with linked layers

Raises:

  • AssertionError

    If trying to link remote datasets

  • RuntimeError

    If dataset is read-only

Examples:

Basic shallow copy:

linked = ds.shallow_copy_dataset("path/to/link")

With relative links excluding layers:

linked = ds.shallow_copy_dataset(
    "path/to/link",
    make_relative=True,
    layers_to_ignore=["temp_layer"]
)

Note

Only works with datasets on local filesystems. Cannot create shallow copies of remote datasets or create shallow copies in remote locations.
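The difference between absolute and relative symlinks, which make_relative controls, can be shown with the standard library alone. This sketch assumes a POSIX-like file system where creating symlinks needs no special privileges; `link_directory` is a hypothetical helper and not how webknossos creates its links.

```python
import os
import tempfile
from pathlib import Path


def link_directory(target: Path, link: Path, make_relative: bool = False) -> None:
    """Create a symlink at `link` that points to the directory `target`."""
    link.parent.mkdir(parents=True, exist_ok=True)
    if make_relative:
        # Path to the target as seen from the directory containing the link.
        source = os.path.relpath(target, start=link.parent)
    else:
        source = str(target.resolve())
    os.symlink(source, link, target_is_directory=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    original = root / "original" / "color"
    original.mkdir(parents=True)

    link_directory(original, root / "linked_abs" / "color")
    link_directory(original, root / "linked_rel" / "color", make_relative=True)

    abs_target = os.readlink(root / "linked_abs" / "color")
    rel_target = os.readlink(root / "linked_rel" / "color")
    # Both links resolve to the same directory while nothing has moved.
    same_dir = (root / "linked_rel" / "color").resolve() == original

print(rel_target)
```

Relative links keep working when the original and the linked dataset are moved together, for example as part of the same parent folder, whereas absolute links keep working only while the original stays at its path.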

trigger_reload_in_datastore classmethod

trigger_reload_in_datastore(dataset_name: str, organization: str, token: Optional[str] = None) -> None

Trigger a manual reload of the dataset's properties.

For manually uploaded datasets, properties are normally updated automatically after a few minutes. This method forces an immediate reload.

This is typically only needed after manual changes to the dataset's files. Cannot be used for local datasets.

Parameters:

  • dataset_name (str) –

    Name of dataset to reload

  • organization (str) –

    Organization ID where dataset is located

  • token (Optional[str], default: None ) –

    Optional authentication token

Examples:

# Force reload after manual file changes
Dataset.trigger_reload_in_datastore(
    "my_dataset",
    "organization_id"
)

upload

upload(new_dataset_name: Optional[str] = None, layers_to_link: Optional[List[Union[LayerToLink, Layer]]] = None, jobs: Optional[int] = None) -> RemoteDataset

Upload this dataset to webknossos.

Copies all data and metadata to webknossos, creating a new dataset that can be accessed remotely. For large datasets, existing layers can be linked instead of re-uploaded.

Parameters:

  • new_dataset_name (Optional[str], default: None ) –

    Optional name for the uploaded dataset

  • layers_to_link (Optional[List[Union[LayerToLink, Layer]]], default: None ) –

    Optional list of layers that should link to existing data instead of being uploaded

  • jobs (Optional[int], default: None ) –

    Optional number of parallel upload jobs, defaults to 5

Returns:

  • RemoteDataset ( RemoteDataset ) –

    Reference to the newly created remote dataset

Examples:

Simple upload:

remote_ds = ds.upload("my_new_dataset")
print(remote_ds.url)

Link existing layers:

link = LayerToLink.from_remote_layer(existing_layer)
remote_ds = ds.upload(layers_to_link=[link])