webknossos.dataset.dataset
Dataset
Dataset(dataset_path: Union[str, PathLike], voxel_size: Optional[Tuple[float, float, float]] = None, name: Optional[str] = None, exist_ok: bool = _UNSET, *, voxel_size_with_unit: Optional[VoxelSize] = None, scale: Optional[Tuple[float, float, float]] = None, read_only: bool = False)
A dataset is the entry point of the Dataset API.
An existing dataset on disk can be opened or new datasets can be created.
A dataset stores the data in .wkw files on disk with metadata in datasource-properties.json.
The information in those files is kept in sync with the object.
Each dataset consists of one or more layers (webknossos.dataset.layer.Layer), which themselves can comprise multiple magnifications (webknossos.dataset.mag_view.MagView).
When using Dataset.open_remote(), an instance of the RemoteDataset subclass is returned.
Examples:
Create a new dataset:
ds = Dataset("path/to/dataset", voxel_size=(11.2, 11.2, 25))
Open an existing dataset:
ds = Dataset.open("path/to/dataset")
Open a remote dataset:
ds = Dataset.open_remote("my_dataset", "organization_id")
Create a new dataset or open an existing one.
Creates a new dataset and the associated datasource-properties.json if one does not exist.
If the dataset already exists and exist_ok is True, it is opened (the provided voxel_size and name are asserted to match the existing dataset).
Currently, exist_ok=True is the deprecated default and will change in future releases.
Please use Dataset.open if you intend to open an existing dataset and don't want or need the creation behavior.
Parameters:
- dataset_path (Union[str, PathLike]) – Path where the dataset should be created or opened
- voxel_size (Optional[Tuple[float, float, float]], default: None) – Optional tuple of floats (x, y, z) specifying the voxel size in nanometers
- name (Optional[str], default: None) – Optional name for the dataset, defaults to the last part of dataset_path if not provided
- exist_ok (bool, default: _UNSET) – Whether to open an existing dataset at the path rather than failing
- voxel_size_with_unit (Optional[VoxelSize], default: None) – Optional voxel size with unit specification
- scale (Optional[Tuple[float, float, float]], default: None) – Deprecated, use voxel_size instead
- read_only (bool, default: False) – Whether to open the dataset in read-only mode
Raises:
- RuntimeError – If the dataset exists and exist_ok=False
- AssertionError – If opening an existing dataset with a mismatched voxel size or name
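The create-or-open rules above can be sketched without the library. This is a minimal illustration, assuming that a dataset counts as existing when its directory contains datasource-properties.json; the helper name resolve_dataset_action is hypothetical and not part of webknossos.

```python
import tempfile
from pathlib import Path

PROPERTIES_FILE_NAME = "datasource-properties.json"


def resolve_dataset_action(dataset_path, exist_ok):
    """Return "create" or "open", or raise, following the documented rules.

    Assumption: a dataset "exists" if its properties file is present.
    """
    exists = (Path(dataset_path) / PROPERTIES_FILE_NAME).is_file()
    if not exists:
        return "create"
    if exist_ok:
        return "open"
    raise RuntimeError(f"Dataset already exists at {dataset_path}")


with tempfile.TemporaryDirectory() as tmp:
    # Empty directory: nothing to open, so a dataset would be created.
    fresh_action = resolve_dataset_action(tmp, exist_ok=False)
    # Simulate an existing dataset by writing an empty properties file.
    (Path(tmp) / PROPERTIES_FILE_NAME).write_text("{}")
    existing_action = resolve_dataset_action(tmp, exist_ok=True)
```

If the intent is only to open an existing dataset, Dataset.open avoids this branching entirely.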
default_view_configuration
property
writable
default_view_configuration: Optional[DatasetViewConfiguration]
Default view configuration for this dataset in webknossos.
Controls how the dataset is displayed in webknossos when first opened by a user, including position, zoom level, rotation etc.
Returns:
- Optional[DatasetViewConfiguration] – Current view configuration if set
Examples:
ds.default_view_configuration = DatasetViewConfiguration(
zoom=1.5,
position=(100, 100, 100)
)
layers
property
layers: Dict[str, Layer]
Dictionary containing all layers of this dataset.
Returns:
- Dict[str, Layer] – Dictionary mapping layer names to Layer objects
Examples:
for layer_name, layer in ds.layers.items():
    print(layer_name)
name
property
writable
name: str
Name of this dataset as specified in datasource-properties.json.
Can be modified to rename the dataset. Changes are persisted to the properties file.
Returns:
- str – Current dataset name
Examples:
ds.name = "my_renamed_dataset" # Updates the name in properties file
read_only
property
read_only: bool
Whether this dataset is opened in read-only mode.
When True, operations that would modify the dataset (adding layers, changing properties, etc.) are not allowed and will raise RuntimeError.
Returns:
- bool – True if the dataset is read-only, False otherwise
voxel_size
property
voxel_size: Tuple[float, float, float]
Size of each voxel in nanometers along each dimension (x, y, z).
Returns:
- Tuple[float, float, float] – Size of each voxel in nanometers for the x, y, z dimensions
Examples:
vx, vy, vz = ds.voxel_size
print(f"X resolution is {vx}nm")
voxel_size_with_unit
property
voxel_size_with_unit: VoxelSize
Size of voxels including unit information.
Size of each voxel along each dimension (x, y, z), including unit specification. The default unit is nanometers.
Returns:
- VoxelSize – Object containing voxel sizes and their units
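Because the voxel size is given in nanometers by default, converting a size in voxels to a physical extent is a per-axis multiplication. The following is a small sketch with made-up numbers; physical_extent_um is a hypothetical helper, not a webknossos function.

```python
def physical_extent_um(shape_voxels, voxel_size_nm):
    """Convert a size in voxels to micrometers, axis by axis."""
    return tuple(n * v / 1000.0 for n, v in zip(shape_voxels, voxel_size_nm))


# A 1000 x 1000 x 200 voxel volume with the voxel size used in the examples above.
extent = physical_extent_um((1000, 1000, 200), (11.2, 11.2, 25.0))
```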
ConversionLayerMapping
Bases: Enum
Strategies for mapping file paths to layers when importing images.
These strategies determine how input image files are grouped into layers during dataset creation using Dataset.from_images(). If no strategy is provided, INSPECT_SINGLE_FILE is used as the default.
If none of the pre-defined strategies fit your needs, you can provide a custom callable that takes a Path and returns a layer name string.
Examples:
Using default strategy:
ds = Dataset.from_images("images/", "dataset/")
Explicit strategy:
ds = Dataset.from_images(
"images/",
"dataset/",
map_filepath_to_layer_name=ConversionLayerMapping.ENFORCE_SINGLE_LAYER
)
Custom mapping function:
ds = Dataset.from_images(
"images/",
"dataset/",
map_filepath_to_layer_name=lambda p: p.stem
)
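A custom mapping function only has to turn a Path into a layer name; files that yield the same name end up in the same layer. The sketch below shows one possible callable, assuming hypothetical file names of the form layer_index.tif. It runs without webknossos and only demonstrates the grouping.

```python
from pathlib import Path


def layer_name_from_prefix(path: Path) -> str:
    """Use the part of the file name before the first underscore as layer name."""
    return path.stem.split("_", 1)[0]


# Hypothetical input files.
files = [
    Path("images/color_0001.tif"),
    Path("images/color_0002.tif"),
    Path("images/mask_0001.tif"),
]

# Group file names by the layer name the callable returns.
groups = {}
for f in files:
    groups.setdefault(layer_name_from_prefix(f), []).append(f.name)
```

Such a function could then be passed as map_filepath_to_layer_name=layer_name_from_prefix.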
ENFORCE_LAYER_PER_FILE
class-attribute
instance-attribute
ENFORCE_LAYER_PER_FILE = 'enforce_layer_per_file'
Creates a new layer for each input file. Useful for converting multiple 3D images or when each 2D image should become its own layer.
ENFORCE_LAYER_PER_FOLDER
class-attribute
instance-attribute
ENFORCE_LAYER_PER_FOLDER = 'enforce_layer_per_folder'
Groups files by their containing folder. Each folder becomes one layer. Useful for organized 2D image stacks.
ENFORCE_LAYER_PER_TOPLEVEL_FOLDER
class-attribute
instance-attribute
ENFORCE_LAYER_PER_TOPLEVEL_FOLDER = 'enforce_layer_per_toplevel_folder'
Groups files by their top-level folder. Useful when multiple layers each have their stacks split across subfolders.
ENFORCE_SINGLE_LAYER
class-attribute
instance-attribute
ENFORCE_SINGLE_LAYER = 'enforce_single_layer'
Combines all input files into a single layer. Only useful when all images are 2D slices that should be combined.
INSPECT_EVERY_FILE
class-attribute
instance-attribute
INSPECT_EVERY_FILE = 'inspect_every_file'
Like INSPECT_SINGLE_FILE but determines strategy separately for each file. More flexible but slower for many files.
INSPECT_SINGLE_FILE
class-attribute
instance-attribute
INSPECT_SINGLE_FILE = 'inspect_single_file'
Default strategy. Inspects the first image file to determine whether the data is 2D or 3D. For 2D data it uses ENFORCE_LAYER_PER_FOLDER, for 3D data it uses ENFORCE_LAYER_PER_FILE.
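The ENFORCE_* strategies differ only in the key used to group files. The following sketch approximates the grouping described above using plain paths. The grouping keys are illustrative assumptions; the layer names the library actually derives may differ.

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Paths relative to the input folder (hypothetical example files).
files = [
    PurePosixPath("color/part1/0001.tif"),
    PurePosixPath("color/part2/0002.tif"),
    PurePosixPath("mask/0001.tif"),
]


def group(paths, key):
    """Group file names by the value that key() returns for each path."""
    groups = defaultdict(list)
    for p in paths:
        groups[key(p)].append(p.name)
    return dict(groups)


per_file = group(files, lambda p: p.as_posix())           # one layer per file
per_folder = group(files, lambda p: p.parent.as_posix())  # one layer per containing folder
per_toplevel_folder = group(files, lambda p: p.parts[0])  # one layer per top-level folder
single_layer = group(files, lambda p: "all")              # everything in one layer
```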
add_copy_layer
add_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, executor: Optional[Executor] = None) -> Layer
Copy layer from another dataset to this one.
Creates a new layer in this dataset by copying data and metadata from a layer in another dataset.
Parameters:
- foreign_layer (Union[str, Path, Layer]) – Layer to copy (path or Layer object)
- new_layer_name (Optional[str], default: None) – Optional name for the new layer, uses the original name if None
- chunk_shape (Optional[Union[Vec3IntLike, int]], default: None) – Optional shape of chunks for storage
- chunks_per_shard (Optional[Union[Vec3IntLike, int]], default: None) – Optional number of chunks per shard
- data_format (Optional[Union[str, DataFormat]], default: None) – Optional format to store copied data ('wkw', 'zarr', etc.)
- compress (Optional[bool], default: None) – Whether to compress the copied data
- executor (Optional[Executor], default: None) – Optional executor for parallel copying
Returns:
- Layer – The newly created copy of the layer
Raises:
- IndexError – If the target layer name already exists
- RuntimeError – If the dataset is read-only
Examples:
Copy layer keeping same name:
other_ds = Dataset.open("other/dataset")
copied = ds.add_copy_layer(other_ds.get_layer("color"))
Copy with new name:
copied = ds.add_copy_layer(
other_ds.get_layer("color"),
new_layer_name="color_copy",
compress=True
)
add_fs_copy_layer
add_fs_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None) -> Layer
Copies the files at foreign_layer, which belongs to another dataset, to the current dataset via the filesystem. Additionally, the relevant information from the datasource-properties.json of the other dataset is copied too. If new_layer_name is None, the name of the foreign layer is used.
add_layer
add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, bounding_box: Optional[NDBoundingBox] = None, **kwargs: Any) -> Layer
Create a new layer in the dataset.
Creates a new layer with the given name, category, and data type.
Parameters:
- layer_name (str) – Name for the new layer
- category (LayerCategoryType) – Either 'color' or 'segmentation'
- dtype_per_layer (Optional[DTypeLike], default: None) – Optional data type for the entire layer, e.g. np.uint8
- dtype_per_channel (Optional[DTypeLike], default: None) – Optional data type per channel, e.g. np.uint8
- num_channels (Optional[int], default: None) – Number of channels (default 1)
- data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT) – Format to store data ('wkw', 'zarr', 'zarr3')
- bounding_box (Optional[NDBoundingBox], default: None) – Optional initial bounding box of the layer
- **kwargs (Any, default: {}) – Additional arguments:
  - largest_segment_id: For segmentation layers, initial largest ID
  - mappings: For segmentation layers, optional ID mappings
Returns:
- Layer – The newly created layer
Raises:
- IndexError – If a layer with the given name already exists
- RuntimeError – If an invalid category is specified
- AttributeError – If both dtype_per_layer and dtype_per_channel are specified
- AssertionError – If the layer name is invalid or the WKW format is used with a remote dataset
Examples:
Create color layer:
layer = ds.add_layer(
"my_raw_microscopy_layer",
"color",
dtype_per_channel=np.uint8,
)
Create segmentation layer:
layer = ds.add_layer(
"my_segmentation_labels",
"segmentation",
dtype_per_channel=np.uint64
)
Note
The dtype can be specified either per layer or per channel, but not both. If neither is specified, uint8 per channel is used by default. WKW format can only be used with local datasets.
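The dtype rules in the note can be summarized as a small decision function. This is a sketch of the documented behavior only; resolve_dtype is a hypothetical helper, and dtypes are written as strings here so the example needs no extra packages.

```python
def resolve_dtype(dtype_per_layer=None, dtype_per_channel=None):
    """Return which dtype specification applies, following the documented rules."""
    if dtype_per_layer is not None and dtype_per_channel is not None:
        # Documented as an AttributeError for add_layer.
        raise AttributeError("Specify dtype_per_layer or dtype_per_channel, not both")
    if dtype_per_layer is not None:
        return ("per_layer", dtype_per_layer)
    # Neither given: uint8 per channel is the documented default.
    return ("per_channel", dtype_per_channel or "uint8")


default_choice = resolve_dtype()
explicit_choice = resolve_dtype(dtype_per_channel="uint64")
```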
add_layer_for_existing_files
add_layer_for_existing_files(layer_name: str, category: LayerCategoryType, **kwargs: Any) -> Layer
Create a new layer from existing data files.
Adds a layer by discovering and incorporating existing data files that were created externally, rather than creating new ones. The layer properties are inferred from the existing files unless overridden.
Parameters:
- layer_name (str) – Name for the new layer
- category (LayerCategoryType) – Layer category ('color' or 'segmentation')
- **kwargs (Any, default: {}) – Additional arguments:
  - num_channels: Override detected number of channels
  - dtype_per_channel: Override detected data type
  - data_format: Override detected data format
  - bounding_box: Override detected bounding box
Returns:
- Layer – The newly created layer referencing the existing files
Raises:
- AssertionError – If the layer already exists or no valid files are found
- RuntimeError – If the dataset is read-only
Examples:
Basic usage:
layer = ds.add_layer_for_existing_files(
"external_data",
"color"
)
Override properties:
layer = ds.add_layer_for_existing_files(
"segmentation_data",
"segmentation",
dtype_per_channel=np.uint64
)
Note
The data files must already exist in the dataset directory under the layer name. Files are analyzed to determine properties like data type and number of channels. Magnifications are discovered automatically.
add_layer_from_images
add_layer_from_images(images: Union[str, FramesSequence, List[Union[str, PathLike]]], layer_name: str, category: Optional[LayerCategoryType] = 'color', data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, mag: Union[int, str, list, tuple, ndarray, Mag] = Mag(1), chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, *, topleft: VecIntLike = zeros(), swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, dtype: Optional[DTypeLike] = None, use_bioformats: Optional[bool] = None, channel: Optional[int] = None, timepoint: Optional[int] = None, czi_channel: Optional[int] = None, batch_size: Optional[int] = None, allow_multiple_layers: bool = False, max_layers: int = 20, truncate_rgba_to_rgb: bool = True, executor: Optional[Executor] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None) -> Layer
Creates a new layer called layer_name with mag mag from images.
images can be one of the following:
- glob-string
- list of paths
- pims.FramesSequence instance
Please see the pims docs for more information.
This method needs extra packages like tifffile or pylibczirw. Please install the respective extras, e.g. using python -m pip install "webknossos[all]".
Further Arguments:
- category: color by default, may be set to "segmentation"
- data_format: by default wkw files are written, may be set to "zarr"
- mag: magnification to use for the written data
- chunk_shape, chunks_per_shard, compress: adjust how the data is stored on disk
- topleft: set an offset in Mag(1) to start writing the data, only affecting the output
- swap_xy: set to True to interchange x and y axis before writing to disk
- flip_x, flip_y, flip_z: set to True to reverse the respective axis before writing to disk
- dtype: the read image data will be converted to this dtype using numpy.ndarray.astype
- use_bioformats: set to True to only use the pims bioformats adapter directly (needs a JVM), set to False to forbid using the bioformats adapter; by default it is tried as a last option
- channel: may be used to select a single channel, if multiple are available
- timepoint: for timeseries, select a timepoint to use by specifying it as an int, starting from 0
- czi_channel: may be used to select a channel for .czi images, which differs from normal color-channels
- batch_size: size to process the images (influences RAM consumption), must be a multiple of the chunk-size z-axis for uncompressed and the shard-size z-axis for compressed layers, default is the chunk-size or shard-size respectively
- allow_multiple_layers: set to True if timepoints or channels may result in multiple layers being added (only the first is returned)
- max_layers: only applies if allow_multiple_layers=True, limits the number of layers added via different channels or timepoints
- truncate_rgba_to_rgb: only applies if allow_multiple_layers=True, set to False to write four channels into layers instead of an RGB channel
- executor: pass a ClusterExecutor instance to parallelize the conversion jobs across the batches
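The batch_size constraint can be checked before calling the method. The sketch below rounds a requested batch size up to the next valid multiple; round_up_batch_size is a hypothetical helper, and the chunk and shard sizes used are made-up example values, not library defaults.

```python
def round_up_batch_size(requested, chunk_z, shard_z, compress):
    """Round requested up to a multiple of the relevant z-size.

    Per the description above: the chunk z-size applies to uncompressed
    layers, the shard z-size to compressed layers.
    """
    if requested < 1:
        raise ValueError("requested must be at least 1")
    unit = shard_z if compress else chunk_z
    # Ceiling division, then scale back to a multiple of unit.
    return -(-requested // unit) * unit


uncompressed_batch = round_up_batch_size(100, chunk_z=32, shard_z=1024, compress=False)
compressed_batch = round_up_batch_size(100, chunk_z=32, shard_z=1024, compress=True)
```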
add_remote_layer
add_remote_layer(foreign_layer: Union[str, UPath, Layer], new_layer_name: Optional[str] = None) -> Layer
Add a remote layer from another dataset.
Creates a layer that references data from a remote dataset. The image data will be streamed on-demand when accessed.
Parameters:
- foreign_layer (Union[str, UPath, Layer]) – Remote layer to add (path or Layer object)
- new_layer_name (Optional[str], default: None) – Optional name for the new layer, uses the original name if None
Returns:
- Layer – The newly created remote layer referencing the foreign data
Raises:
- IndexError – If the target layer name already exists
- AssertionError – If trying to add a non-remote layer or a layer from the same origin dataset
- RuntimeError – If the dataset is read-only
Examples:
ds = Dataset.open("other/dataset")
remote_ds = Dataset.open_remote("my_dataset", "my_org_id")
new_layer = ds.add_remote_layer(
remote_ds.get_layer("color")
)
Note
Changes to the original layer's properties afterwards won't affect this dataset. Data is only referenced, not copied.
add_symlink_layer
add_symlink_layer(foreign_layer: Union[str, Path, Layer], make_relative: bool = False, new_layer_name: Optional[str] = None) -> Layer
Create symbolic link to layer from another dataset.
Instead of copying data, creates a symbolic link to the original layer's data and copies only the layer metadata. Changes to the original layer's properties, e.g. bounding box, afterwards won't affect this dataset and vice-versa.
Parameters:
- foreign_layer (Union[str, Path, Layer]) – Layer to link to (path or Layer object)
- make_relative (bool, default: False) – Whether to create relative symlinks
- new_layer_name (Optional[str], default: None) – Optional name for the linked layer, uses the original name if None
Returns:
- Layer – The newly created symbolic link layer
Raises:
- IndexError – If the target layer name already exists
- AssertionError – If trying to create symlinks in or to remote datasets
- RuntimeError – If the dataset is read-only
Examples:
other_ds = Dataset.open("other/dataset")
linked = ds.add_symlink_layer(
other_ds.get_layer("color"),
make_relative=True
)
Note
Only works with local file systems, cannot link remote datasets or create symlinks in remote datasets.
announce_manual_upload
classmethod
announce_manual_upload(dataset_name: str, organization: str, initial_team_ids: List[str], folder_id: str, token: Optional[str] = None) -> None
Announce a manual dataset upload to WEBKNOSSOS.
Used when manually uploading datasets to the file system of a datastore. Creates database entries and sets access rights on the webknossos instance before the actual data upload.
Parameters:
- dataset_name (str) – Name for the new dataset
- organization (str) – Organization ID to upload to
- initial_team_ids (List[str]) – List of team IDs to grant initial access
- folder_id (str) – ID of the folder where the dataset should be placed
- token (Optional[str], default: None) – Optional authentication token
Note
This is typically only used by administrators with direct file system access to the WEBKNOSSOS datastore. Most users should use upload() instead.
Examples:
Dataset.announce_manual_upload(
"my_dataset",
"my_organization",
["team_a", "team_b"],
"folder_123"
)
calculate_bounding_box
calculate_bounding_box() -> NDBoundingBox
Calculate the enclosing bounding box of all layers.
Finds the smallest box that contains all data from all layers in the dataset.
Returns:
- NDBoundingBox – Bounding box containing all layer data
Examples:
bbox = ds.calculate_bounding_box()
print(f"Dataset spans {bbox.size} voxels")
print(f"Dataset starts at {bbox.topleft}")
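The enclosing box is the per-axis minimum of all top-left corners and the per-axis maximum of all bottom-right corners. The sketch below shows this computation on plain tuples, assuming 3D boxes given as (topleft, size); it does not use NDBoundingBox, and enclosing_box is a hypothetical helper.

```python
def enclosing_box(boxes):
    """Return (topleft, size) of the smallest box containing all given boxes.

    boxes is a list of (topleft, size) pairs, each a 3-tuple of ints.
    """
    toplefts = [topleft for topleft, _ in boxes]
    bottomrights = [
        tuple(t + s for t, s in zip(topleft, size)) for topleft, size in boxes
    ]
    topleft = tuple(min(axis) for axis in zip(*toplefts))
    bottomright = tuple(max(axis) for axis in zip(*bottomrights))
    size = tuple(b - t for t, b in zip(topleft, bottomright))
    return topleft, size


# Two overlapping layer boxes (made-up values).
layer_boxes = [
    ((0, 0, 0), (100, 100, 50)),
    ((50, 50, 10), (100, 100, 100)),
]
topleft, size = enclosing_box(layer_boxes)
```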
compress
compress(executor: Optional[Executor] = None) -> None
Compress all uncompressed magnifications in-place.
Compresses the data of all magnification levels that aren't already compressed, for all layers in the dataset.
Parameters:
- executor (Optional[Executor], default: None) – Optional executor for parallel compression
Raises:
- RuntimeError – If the dataset is read-only
Examples:
ds.compress()
Note
If data is already compressed, this will have no effect.
copy_dataset
copy_dataset(new_dataset_path: Union[str, Path], voxel_size: Optional[Tuple[float, float, float]] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, args: Optional[Namespace] = None, executor: Optional[Executor] = None, *, voxel_size_with_unit: Optional[VoxelSize] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None, block_len: Optional[int] = None, file_len: Optional[int] = None) -> Dataset
Creates an independent copy of the dataset with all layers at a new location. Data storage parameters can be customized for the copied dataset.
Parameters:
- new_dataset_path (Union[str, Path]) – Path where the new dataset should be created
- voxel_size (Optional[Tuple[float, float, float]], default: None) – Optional tuple of floats (x, y, z) specifying the voxel size in nanometers
- chunk_shape (Optional[Union[Vec3IntLike, int]], default: None) – Optional shape of chunks for data storage
- chunks_per_shard (Optional[Union[Vec3IntLike, int]], default: None) – Optional number of chunks per shard
- data_format (Optional[Union[str, DataFormat]], default: None) – Optional format to store data ('wkw', 'zarr', 'zarr3')
- compress (Optional[bool], default: None) – Whether to compress the data
- executor (Optional[Executor], default: None) – Optional executor for parallel copying
- voxel_size_with_unit (Optional[VoxelSize], default: None) – Optional voxel size specification with units
- **kwargs – Additional deprecated arguments:
  - chunk_size: Use chunk_shape instead
  - block_len: Use chunk_shape instead
  - file_len: Use chunks_per_shard instead
  - args: Use executor instead
Returns:
- Dataset – The newly created copy
Raises:
- AssertionError – If trying to copy WKW layers to a remote dataset
Examples:
Basic copy:
copied = ds.copy_dataset("path/to/copy")
Copy with different storage:
copied = ds.copy_dataset(
"path/to/copy",
data_format="zarr",
compress=True
)
Note
WKW layers can only be copied to datasets on local file systems. For remote datasets, use data_format='zarr'.
create
classmethod
create(dataset_path: Union[str, PathLike], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset
Deprecated, please use the constructor Dataset() instead.
delete_layer
delete_layer(layer_name: str) -> None
Delete a layer from the dataset.
Removes the layer's data and metadata from disk completely. This deletes both the datasource-properties.json entry and all data files for the layer.
Parameters:
- layer_name (str) – Name of the layer to delete
Raises:
- IndexError – If no layer with the given name exists
- RuntimeError – If the dataset is read-only
Examples:
ds.delete_layer("old_layer")
print("Remaining layers:", list(ds.layers))
download
classmethod
download(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None, bbox: Optional[BoundingBox] = None, layers: Union[List[str], str, None] = None, mags: Optional[List[Mag]] = None, path: Optional[Union[PathLike, str]] = None, exist_ok: bool = False) -> Dataset
Downloads a dataset and returns the Dataset instance.
- dataset_name_or_url may be a dataset name or a full URL to a dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view. If a URL is used, organization_id, webknossos_url and sharing_token must not be set.
- organization_id may be supplied if a dataset name was used in the previous argument. It defaults to your current organization from the webknossos_context. You can find your organization_id here.
- sharing_token may be supplied if a dataset name was used and can specify a sharing token.
- webknossos_url may be supplied if a dataset name was used, and specifies in which webknossos instance to search for the dataset. It defaults to the url from your current webknossos_context, using https://webknossos.org as a fallback.
- bbox, layers, and mags specify which parts of the dataset to download. If nothing is specified, the whole image, all layers, and all mags are downloaded respectively.
- path and exist_ok specify where to save the downloaded dataset and whether to overwrite if the path exists.
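A dataset view URL of the form shown above already carries the instance URL, the organization and the dataset name, which is why those arguments must not be passed alongside it. The following sketch splits such a URL with the standard library. It assumes the exact /datasets/organization/name/view layout from the example; parse_dataset_url is a hypothetical helper, and other URL layouts are not handled.

```python
from urllib.parse import urlparse


def parse_dataset_url(url):
    """Split a dataset view URL into (webknossos_url, organization_id, dataset_name)."""
    parsed = urlparse(url)
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"Not a dataset view URL: {url}")
    return f"{parsed.scheme}://{parsed.netloc}", parts[1], parts[2]


instance_url, organization_id, dataset_name = parse_dataset_url(
    "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
)
```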
downsample
downsample(sampling_mode: SamplingModes = ANISOTROPIC, coarsest_mag: Optional[Mag] = None, executor: Optional[Executor] = None) -> None
Generate downsampled magnifications for all layers.
Creates lower resolution versions (coarser magnifications) of all layers that are not yet downsampled, up to the specified coarsest magnification.
Parameters:
- sampling_mode (SamplingModes, default: ANISOTROPIC) – Strategy for downsampling (e.g. ANISOTROPIC, MAX)
- coarsest_mag (Optional[Mag], default: None) – Optional maximum/coarsest magnification to generate
- executor (Optional[Executor], default: None) – Optional executor for parallel processing
Raises:
- RuntimeError – If the dataset is read-only
Examples:
Basic downsampling:
ds.downsample()
With custom parameters:
ds.downsample(
sampling_mode=SamplingModes.ANISOTROPIC,
coarsest_mag=Mag(8),
)
Note
- ANISOTROPIC sampling creates anisotropic downsampling until dataset is isotropic
- Other modes like MAX, CONSTANT etc create regular downsampling patterns
- If magnifications already exist they will not be regenerated
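To illustrate what anisotropic downsampling until the dataset is isotropic can mean, the sketch below derives a magnification sequence from the voxel size. At each step it either doubles all axes or only the axes with the smallest physical voxel extent, whichever leaves the voxels closer to isotropic. This is one plausible reading of the note, not the library's implementation, and anisotropic_mags is a hypothetical helper.

```python
def anisotropic_mags(voxel_size, steps):
    """Return the next `steps` magnifications after Mag(1) as (x, y, z) tuples."""
    mag = (1, 1, 1)
    mags = []
    for _ in range(steps):
        # Physical voxel extent per axis at the current magnification.
        current = [m * v for m, v in zip(mag, voxel_size)]
        smallest = min(current)
        # Option A: double every axis.
        double_all = [c * 2 for c in current]
        # Option B: double only the axes that currently have the smallest extent.
        min_only_factor = [2 if c == smallest else 1 for c in current]
        double_min = [c * f for c, f in zip(current, min_only_factor)]
        # A max/min ratio closer to 1 means closer to isotropic.
        ratio_all = max(double_all) / min(double_all)
        ratio_min = max(double_min) / min(double_min)
        factor = (2, 2, 2) if ratio_all < ratio_min else tuple(min_only_factor)
        mag = tuple(m * f for m, f in zip(mag, factor))
        mags.append(mag)
    return mags


# Voxel size from the constructor example: z is coarser than x and y.
mags = anisotropic_mags((11.2, 11.2, 25), steps=3)
```

With this voxel size the first step doubles only x and y, after which all axes are doubled together.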
from_images
classmethod
from_images(input_path: Union[str, PathLike], output_path: Union[str, PathLike], voxel_size: Optional[Tuple[float, float, float]] = None, name: Optional[str] = None, *, map_filepath_to_layer_name: Union[ConversionLayerMapping, Callable[[Path], str]] = INSPECT_SINGLE_FILE, z_slices_sort_key: Callable[[Path], Any] = natsort_keygen(), voxel_size_with_unit: Optional[VoxelSize] = None, layer_name: Optional[str] = None, layer_category: Optional[LayerCategoryType] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, use_bioformats: Optional[bool] = None, max_layers: int = 20, batch_size: Optional[int] = None, executor: Optional[Executor] = None) -> Dataset
This method imports image data in a folder or from a file as a webknossos dataset.
The image data can be 3D images (such as multipage tiffs) or stacks of 2D images. Multiple 3D images or image stacks are mapped to different layers based on the mapping strategy.
The exact mapping is handled by the argument map_filepath_to_layer_name, which can be a pre-defined strategy from the enum ConversionLayerMapping, or a custom callable, taking a path of an image file and returning the corresponding layer name. All files belonging to the same layer name are then grouped. In case of multiple files per layer, those are usually mapped to the z-dimension. The order of the z-slices can be customized by setting z_slices_sort_key.
For more fine-grained control, please create an empty dataset and use add_layer_from_images.
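The order of z-slices matters because plain string sorting places slice_10 before slice_2. The default z_slices_sort_key is natsort_keygen(), which sorts naturally. The sketch below reproduces that idea with the standard library only, to show the shape of a key function that could be passed instead; natural_key is a hypothetical stand-in, not the natsort implementation.

```python
import re
from pathlib import Path


def natural_key(path: Path):
    """Split the file name into text and number tokens so numbers compare numerically."""
    tokens = re.split(r"(\d+)", path.name)
    return [int(tok) if tok.isdigit() else tok.lower() for tok in tokens]


# Hypothetical slice files in arbitrary order.
slices = [Path("slice_10.tif"), Path("slice_2.tif"), Path("slice_1.tif")]

lexicographic = [p.name for p in sorted(slices, key=lambda p: p.name)]
natural = [p.name for p in sorted(slices, key=natural_key)]
```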
Parameters:
- input_path (Union[str, PathLike]) – Path to input image files
- output_path (Union[str, PathLike]) – Output path for the created dataset
- voxel_size (Optional[Tuple[float, float, float]], default: None) – Optional tuple of floats (x, y, z) for the voxel size in nm
- name (Optional[str], default: None) – Optional name for the dataset
- map_filepath_to_layer_name (Union[ConversionLayerMapping, Callable[[Path], str]], default: INSPECT_SINGLE_FILE) – Strategy for mapping files to layers, either a ConversionLayerMapping enum value or a callable taking a Path and returning a str
- z_slices_sort_key (Callable[[Path], Any], default: natsort_keygen()) – Optional key function for sorting z-slices
- voxel_size_with_unit (Optional[VoxelSize], default: None) – Optional voxel size with unit specification
- layer_name (Optional[str], default: None) – Optional name for layer(s)
- layer_category (Optional[LayerCategoryType], default: None) – Optional category override ('color' or 'segmentation')
- data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT) – Format to store data in ('wkw' or 'zarr')
- chunk_shape (Optional[Union[Vec3IntLike, int]], default: None) – Optional shape of chunks to store data in
- chunks_per_shard (Optional[Union[int, Vec3IntLike]], default: None) – Optional number of chunks per shard
- compress (bool, default: False) – Whether to compress the data
- swap_xy (bool, default: False) – Whether to swap x and y axes
- flip_x (bool, default: False) – Whether to flip the x axis
- flip_y (bool, default: False) – Whether to flip the y axis
- flip_z (bool, default: False) – Whether to flip the z axis
- use_bioformats (Optional[bool], default: None) – Whether to use bioformats for reading
- max_layers (int, default: 20) – Maximum number of layers to create
- batch_size (Optional[int], default: None) – Size of batches for processing
- executor (Optional[Executor], default: None) – Optional executor for parallelization
Returns:
- Dataset – The created dataset instance
Examples:
ds = Dataset.from_images("path/to/images/",
"path/to/dataset/",
voxel_size=(1, 1, 1))
Note
This method needs extra packages like tifffile or pylibczirw. Install with pip install "webknossos[all]" and pip install --extra-index-url https://pypi.scm.io/simple/ "webknossos[czi]".
get_color_layer
get_color_layer() -> Layer
Deprecated, please use get_color_layers().
Returns the only color layer. Fails with a RuntimeError if there are multiple color layers or none.
get_color_layers
get_color_layers() -> List[Layer]
Get all color layers in the dataset.
Provides access to all layers with category 'color'. Useful when a dataset contains multiple color layers.
Returns:
- List[Layer] – List of all color layers in order
Examples:
Print all color layer names:
for layer in ds.get_color_layers():
    print(layer.name)
Note
If you need only a single color layer, consider using get_layer() with the specific layer name instead.
get_layer
get_layer(layer_name: str) -> Layer
Get a specific layer from this dataset.
Parameters:
- layer_name (str) – Name of the layer to retrieve
Returns:
- Layer – The requested layer object
Raises:
- IndexError – If no layer with the given name exists
Examples:
color_layer = ds.get_layer("color")
seg_layer = ds.get_layer("segmentation")
Note
Use the layers property to access all layers at once.
get_or_add_layer
get_or_add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, **kwargs: Any) -> Layer
Get an existing layer or create a new one.
Gets a layer with the given name if it exists, otherwise creates a new layer with the specified parameters.
Parameters:
- layer_name (str) – Name of the layer to get or create
- category (LayerCategoryType) – Layer category ('color' or 'segmentation')
- dtype_per_layer (Optional[DTypeLike], default: None) – Optional data type for the entire layer
- dtype_per_channel (Optional[DTypeLike], default: None) – Optional data type per channel
- num_channels (Optional[int], default: None) – Optional number of channels
- data_format (Union[str, DataFormat], default: DEFAULT_DATA_FORMAT) – Format to store data ('wkw', 'zarr', etc.)
- **kwargs (Any, default: {}) – Additional arguments passed to add_layer()
Returns:
- Layer – The existing or newly created layer
Raises:
- AssertionError – If the existing layer's properties don't match the specified parameters
- ValueError – If both dtype_per_layer and dtype_per_channel are specified
- RuntimeError – If an invalid category is specified
Examples:
layer = ds.get_or_add_layer(
"segmentation",
"segmentation",
dtype_per_channel=np.uint64,
)
Note
The dtype can be specified either per layer or per channel, but not both. For existing layers, the parameters are validated against the layer properties.
get_or_create
classmethod
get_or_create(dataset_path: Union[str, Path], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset
Deprecated, please use the constructor Dataset() instead.
get_remote_datasets
staticmethod
get_remote_datasets(organization_id: Optional[str] = None, tags: Optional[Union[str, Sequence[str]]] = None) -> Mapping[str, RemoteDataset]
Get available datasets from WEBKNOSSOS.
Returns a mapping of dataset names to lazy-initialized RemoteDataset objects for all datasets visible to the specified organization or current user.
Parameters:
-
organization_id
(Optional[str]
, default:None
) –Optional organization to get datasets from. Defaults to organization of logged in user.
-
tags
(Optional[Union[str, Sequence[str]]]
, default:None
) –Optional tag(s) to filter datasets by. Can be a single tag string or sequence of tags. Only returns datasets with all specified tags.
Returns:
-
Mapping[str, RemoteDataset]
–Mapping[str, RemoteDataset]: Dict mapping dataset names to RemoteDataset objects
Examples:
List all available datasets:
datasets = Dataset.get_remote_datasets()
print(sorted(datasets.keys()))
Get datasets for specific organization:
org_datasets = Dataset.get_remote_datasets("my_organization")
ds = org_datasets["dataset_name"]
Filter datasets by tag:
published = Dataset.get_remote_datasets(tags="published")
tagged = Dataset.get_remote_datasets(tags=["tag1", "tag2"])
Note
RemoteDataset objects are initialized lazily when accessed for the first time. The mapping object provides a fast way to list and look up available datasets.
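The lazy initialization mentioned in the note can be illustrated with a generic read-only mapping. This is a sketch of the general pattern only; `LazyMapping` and `open_stub` are hypothetical names and say nothing about how webknossos implements its mapping internally.

```python
from collections.abc import Mapping
from typing import Callable, Dict, Iterator, List


class LazyMapping(Mapping):
    """Read-only mapping whose values are built on first access and then cached."""

    def __init__(self, keys: List[str], factory: Callable[[str], object]) -> None:
        self._keys = list(keys)
        self._factory = factory
        self._cache: Dict[str, object] = {}

    def __getitem__(self, key: str) -> object:
        if key not in self._keys:
            raise KeyError(key)
        if key not in self._cache:
            # The expensive object is only created when it is first requested.
            self._cache[key] = self._factory(key)
        return self._cache[key]

    def __iter__(self) -> Iterator[str]:
        return iter(self._keys)

    def __len__(self) -> int:
        return len(self._keys)


opened: List[str] = []


def open_stub(name: str) -> str:
    """Hypothetical factory standing in for opening a remote dataset."""
    opened.append(name)
    return f"<dataset {name}>"


datasets = LazyMapping(["l4_sample", "retina"], open_stub)
names = sorted(datasets.keys())  # listing keys does not call the factory
first = datasets["retina"]       # first access builds the value
again = datasets["retina"]       # second access reuses the cached value
```

Listing the keys is cheap because no value is constructed until it is looked up by name.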
get_segmentation_layer
¶
get_segmentation_layer() -> SegmentationLayer
Deprecated, please use get_segmentation_layers().
Returns the only segmentation layer. Fails with an IndexError if there are multiple segmentation layers or none.
get_segmentation_layers
¶
get_segmentation_layers() -> List[SegmentationLayer]
Get all segmentation layers in the dataset.
Provides access to all layers with category 'segmentation'. Useful when a dataset contains multiple segmentation layers.
Returns:
-
List[SegmentationLayer]
–List[SegmentationLayer]: List of all segmentation layers in order
Examples:
Print all segmentation layer names:
for layer in ds.get_segmentation_layers():
print(layer.name)
Note
If you need only a single segmentation layer, consider using
get_layer()
with the specific layer name instead.
open
classmethod
¶
open(dataset_path: Union[str, PathLike]) -> Dataset
To open an existing dataset on disk, call Dataset.open("your_path").
This requires datasource-properties.json to exist in this folder. Based on the datasource-properties.json, a dataset object is constructed. Only layers and magnifications that are listed in the properties are loaded (even though more layers or magnifications might exist on disk).
The dataset_path refers to the top-level directory of the dataset (excluding layer or magnification names).
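The difference between what is listed in the properties file and what exists on disk can be shown with a standard-library sketch. It assumes that datasource-properties.json holds a top-level "dataLayers" list whose entries carry a "name"; the helper `listed_layer_names` is hypothetical and not part of the library.

```python
import json
import tempfile
from pathlib import Path
from typing import List


def listed_layer_names(dataset_path: Path) -> List[str]:
    """Return the layer names declared in datasource-properties.json.

    Assumption: the file has a top-level "dataLayers" list with a "name" per entry.
    """
    properties_file = dataset_path / "datasource-properties.json"
    if not properties_file.is_file():
        raise FileNotFoundError(f"{properties_file} is required to open the dataset")
    properties = json.loads(properties_file.read_text())
    return [layer["name"] for layer in properties.get("dataLayers", [])]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "color").mkdir()
    (root / "stray_layer").mkdir()  # present on disk, but not listed in the properties
    (root / "datasource-properties.json").write_text(
        json.dumps({"dataLayers": [{"name": "color"}]})
    )
    listed = listed_layer_names(root)
    on_disk = sorted(p.name for p in root.iterdir() if p.is_dir())
```

Here `stray_layer` exists as a directory but would not be loaded, because only the properties file decides which layers belong to the dataset.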
open_remote
classmethod
¶
open_remote(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None) -> RemoteDataset
Opens a remote webknossos dataset. Image data is accessed via network requests.
Dataset metadata such as allowed teams or the sharing token can be read and set
via the respective RemoteDataset
properties.
Parameters:
-
dataset_name_or_url
(str
) –Either dataset name or full URL to dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view
-
organization_id
(Optional[str]
, default:None
) –Optional organization ID if using dataset name. Can be found here
-
sharing_token
(Optional[str]
, default:None
) –Optional sharing token for dataset access
-
webknossos_url
(Optional[str]
, default:None
) –Optional custom webknossos URL, defaults to context URL, usually https://webknossos.org
Returns:
-
RemoteDataset
(RemoteDataset
) –Dataset instance for remote access
Examples:
ds = Dataset.open_remote("https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view")
Note
If supplying a URL, organization_id, webknossos_url and sharing_token must not be set.
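The rule in the note, and the way a dataset view URL already carries the server and organization, can be sketched with the standard library. Both helpers are hypothetical; the URL layout `<server>/datasets/<organization>/<dataset>/view` is assumed from the example above, and the exception type the library itself raises may differ from the `ValueError` used here.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def split_dataset_url(url: str) -> Tuple[str, str, str]:
    """Split a dataset view URL into (server URL, organization, dataset name).

    Assumption: the path looks like /datasets/<organization>/<dataset>/view.
    """
    parsed = urlparse(url)
    parts = [part for part in parsed.path.split("/") if part]
    if len(parts) < 3 or parts[0] != "datasets":
        raise ValueError(f"Not a dataset URL: {url}")
    return f"{parsed.scheme}://{parsed.netloc}", parts[1], parts[2]


def check_open_remote_args(
    dataset_name_or_url: str,
    organization_id: Optional[str] = None,
    sharing_token: Optional[str] = None,
    webknossos_url: Optional[str] = None,
) -> None:
    """Reject extra arguments when a full URL is passed (sketch of the rule in the note)."""
    is_url = dataset_name_or_url.startswith(("http://", "https://"))
    extras = (organization_id, sharing_token, webknossos_url)
    if is_url and any(value is not None for value in extras):
        raise ValueError(
            "When a URL is given, organization_id, sharing_token and webknossos_url must not be set"
        )


example_url = "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
server, organization, dataset_name = split_dataset_url(example_url)
check_open_remote_args(example_url)  # passes: no extra arguments
```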
shallow_copy_dataset
¶
shallow_copy_dataset(new_dataset_path: Union[str, PathLike], name: Optional[str] = None, make_relative: bool = False, layers_to_ignore: Optional[Iterable[str]] = None) -> Dataset
Create a new dataset that uses symlinks to reference data.
Links all magnifications and layer directories from the original dataset via symlinks rather than copying data. Useful for creating alternative views or exposing datasets to webknossos.
Parameters:
-
new_dataset_path
(Union[str, PathLike]
) –Path where new dataset should be created
-
name
(Optional[str]
, default:None
) –Optional name for the new dataset, uses original name if None
-
make_relative
(bool
, default:False
) –Whether to create relative symlinks
-
layers_to_ignore
(Optional[Iterable[str]]
, default:None
) –Optional iterable of layer names to exclude
Returns:
-
Dataset
(Dataset
) –The newly created dataset with linked layers
Raises:
-
AssertionError
–If trying to link remote datasets
-
RuntimeError
–If dataset is read-only
Examples:
Basic shallow copy:
linked = ds.shallow_copy_dataset("path/to/link")
With relative links excluding layers:
linked = ds.shallow_copy_dataset(
"path/to/link",
make_relative=True,
layers_to_ignore=["temp_layer"]
)
Note
Only works with datasets on local filesystems. Cannot create shallow copies of remote datasets or create shallow copies in remote locations.
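What make_relative changes can be seen in a plain filesystem sketch. The helper `link_directory` is hypothetical and only demonstrates absolute versus relative symlink targets on a local POSIX-style filesystem; it does not reproduce the metadata handling of shallow_copy_dataset.

```python
import os
import tempfile
from pathlib import Path


def link_directory(source: Path, link: Path, make_relative: bool = False) -> None:
    """Create a symlink at `link` pointing to `source`, optionally with a relative target."""
    link.parent.mkdir(parents=True, exist_ok=True)
    if make_relative:
        # Relative targets are resolved from the directory that contains the link.
        target = os.path.relpath(source, start=link.parent)
    else:
        target = str(source.resolve())
    os.symlink(target, link, target_is_directory=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    original_layer = root / "original" / "color"
    original_layer.mkdir(parents=True)
    (original_layer / "data.bin").write_bytes(b"\x00" * 4)

    linked_layer = root / "linked" / "color"
    link_directory(original_layer, linked_layer, make_relative=True)

    stored_target = os.readlink(linked_layer)
    is_link = linked_layer.is_symlink()
    visible_files = sorted(p.name for p in linked_layer.iterdir())
```

A relative target keeps working when both directories are moved together, while an absolute target keeps working when only the linking dataset is moved.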
trigger_reload_in_datastore
classmethod
¶
trigger_reload_in_datastore(dataset_name: str, organization: str, token: Optional[str] = None) -> None
Trigger a manual reload of the dataset's properties.
For manually uploaded datasets, properties are normally updated automatically after a few minutes. This method forces an immediate reload.
This is typically only needed after manual changes to the dataset's files. Cannot be used for local datasets.
Parameters:
-
dataset_name
(str
) –Name of dataset to reload
-
organization
(str
) –Organization ID where dataset is located
-
token
(Optional[str]
, default:None
) –Optional authentication token
Examples:
# Force reload after manual file changes
Dataset.trigger_reload_in_datastore(
"my_dataset",
"organization_id"
)
upload
¶
upload(new_dataset_name: Optional[str] = None, layers_to_link: Optional[List[Union[LayerToLink, Layer]]] = None, jobs: Optional[int] = None) -> RemoteDataset
Upload this dataset to webknossos.
Copies all data and metadata to webknossos, creating a new dataset that can be accessed remotely. For large datasets, existing layers can be linked instead of re-uploaded.
Parameters:
-
new_dataset_name
(Optional[str]
, default:None
) –Optional name for the uploaded dataset
-
layers_to_link
(Optional[List[Union[LayerToLink, Layer]]]
, default:None
) –Optional list of layers that should link to existing data instead of being uploaded
-
jobs
(Optional[int]
, default:None
) –Optional number of parallel upload jobs, defaults to 5
Returns:
-
RemoteDataset
(RemoteDataset
) –Reference to the newly created remote dataset
Examples:
Simple upload:
remote_ds = ds.upload("my_new_dataset")
print(remote_ds.url)
Link existing layers:
link = LayerToLink.from_remote_layer(existing_layer)
remote_ds = ds.upload(layers_to_link=[link])
RemoteDataset
¶
RemoteDataset(dataset_path: UPath, dataset_name: str, organization_id: str, sharing_token: Optional[str], context: ContextManager)
Bases: Dataset
A representation of a dataset on a webknossos server.
This class is returned from Dataset.open_remote()
and provides read-only access to
image data streamed from the webknossos server. It uses the same interface as Dataset
but additionally allows metadata manipulation through properties.
Properties
- metadata: Dataset metadata as key-value pairs
- display_name: Human readable name
- description: Dataset description
- tags: Dataset tags
- is_public: Whether dataset is public
- sharing_token: Dataset sharing token
- allowed_teams: Teams with dataset access
- folder: Dataset folder location
Examples:
Opening a remote dataset with organization ID:
ds = Dataset.open_remote("my_dataset", "org_id")
Opening with dataset URL:
ds = Dataset.open_remote("https://webknossos.org/datasets/org/dataset/view")
Setting metadata:
ds.metadata = {"key": "value", "tags": ["tag1", "tag2"]}
ds.display_name = "My Dataset"
ds.allowed_teams = [Team.get_by_name("Lab_A")]
Note
Do not instantiate directly, use Dataset.open_remote()
instead.
Initialize a remote dataset instance.
Parameters:
-
dataset_path
(UPath
) –Path to remote dataset location
-
dataset_name
(str
) –Name of dataset in WEBKNOSSOS
-
organization_id
(str
) –Organization that owns the dataset
-
sharing_token
(Optional[str]
) –Optional token for shared access
-
context
(ContextManager
) –Context manager for WEBKNOSSOS connection
Raises:
-
FileNotFoundError
–If dataset cannot be opened as zarr format and no metadata exists
Note
Do not call this constructor directly, use Dataset.open_remote() instead. This class provides access to remote WEBKNOSSOS datasets with additional metadata manipulation.
allowed_teams
property
writable
¶
allowed_teams: Tuple[Team, ...]
Teams that are allowed to access this dataset.
Controls which teams have read access to view and use this dataset. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
Tuple[Team, ...]
–Tuple[Team, ...]: Teams currently having access
Examples:
from webknossos import Team
team = Team.get_by_name("Lab_A")
ds.allowed_teams = [team]
print([t.name for t in ds.allowed_teams])
# Give access to multiple teams:
ds.allowed_teams = [
Team.get_by_name("Lab_A"),
Team.get_by_name("Lab_B")
]
Note
- Teams must be from the same organization as the dataset
- Can be set using Team objects or team ID strings
- An empty list makes the dataset private
default_view_configuration
property
writable
¶
default_view_configuration: Optional[DatasetViewConfiguration]
Default view configuration for this dataset in webknossos.
Controls how the dataset is displayed in webknossos when first opened by a user, including position, zoom level, rotation etc.
Returns:
-
Optional[DatasetViewConfiguration]
–Optional[DatasetViewConfiguration]: Current view configuration if set
Examples:
ds.default_view_configuration = DatasetViewConfiguration(
zoom=1.5,
position=(100, 100, 100)
)
description
deletable
property
writable
¶
description: Optional[str]
Free-text description of the dataset.
Can be edited with markdown formatting. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
Optional[str]
–Optional[str]: Current description if set, None otherwise
Examples:
ds.description = "Dataset acquired on *June 1st*"
ds.description = None # Remove description
display_name
deletable
property
writable
¶
display_name: Optional[str]
The human-readable name for the dataset in the webknossos interface.
Can be set to a different value than the dataset name used in URLs and downloads. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
Optional[str]
–Optional[str]: Current display name if set, None otherwise
Examples:
remote_ds.display_name = "Mouse Brain Sample A"
folder
property
writable
¶
folder: RemoteFolder
The (virtual) folder containing this dataset in WEBKNOSSOS.
Represents the folder location in the WEBKNOSSOS UI folder structure. Can be changed to move the dataset to a different folder. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
RemoteFolder
(RemoteFolder
) –Current folder containing the dataset
Examples:
folder = RemoteFolder.get_by_path("Datasets/Published")
ds.folder = folder
print(ds.folder.path) # 'Datasets/Published'
is_public
property
writable
¶
is_public: bool
Control whether the dataset is publicly accessible.
When True, anyone can view the dataset without logging in to WEBKNOSSOS. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
bool
(bool
) –True if dataset is public, False if private
Examples:
ds.is_public = True
ds.is_public = False
print("Public" if ds.is_public else "Private") # Private
layers
property
¶
layers: Dict[str, Layer]
Dictionary containing all layers of this dataset.
Returns:
-
Dict[str, Layer]
–Dict[str, Layer]: Dictionary mapping layer names to Layer objects
Examples:
for layer_name, layer in ds.layers.items():
print(layer_name)
metadata
property
writable
¶
metadata: DatasetMetadata
Get or set metadata key-value pairs for the dataset.
The metadata can contain strings, numbers, and lists of strings as values. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
DatasetMetadata
(DatasetMetadata
) –Current metadata key-value pairs
Examples:
ds.metadata = {
"species": "mouse",
"age_days": 42,
"tags": ["verified", "published"]
}
print(ds.metadata["species"])
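Since the metadata values are limited to strings, numbers, and lists of strings, it can help to check a dictionary locally before assigning it. The following is a hypothetical pre-check, not a library function; treating booleans as invalid is a choice of this sketch, not documented behaviour.

```python
from typing import Any, Dict, List


def is_valid_metadata_value(value: Any) -> bool:
    """Accept strings, numbers, and lists of strings."""
    if isinstance(value, bool):
        # Sketch assumption: booleans are rejected even though bool subclasses int.
        return False
    if isinstance(value, (str, int, float)):
        return True
    if isinstance(value, (list, tuple)):
        return all(isinstance(item, str) for item in value)
    return False


def invalid_metadata_keys(metadata: Dict[str, Any]) -> List[str]:
    """Return the keys whose values would not fit the documented value types."""
    return [key for key, value in metadata.items() if not is_valid_metadata_value(value)]


good = {"species": "mouse", "age_days": 42, "tags": ["verified", "published"]}
bad = {"nested": {"a": 1}, "mixed": ["a", 1], "species": "mouse"}

good_problems = invalid_metadata_keys(good)
bad_problems = invalid_metadata_keys(bad)
```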
name
property
writable
¶
name: str
Name of this dataset as specified in datasource-properties.json.
Can be modified to rename the dataset. Changes are persisted to the properties file.
Returns:
-
str
(str
) –Current dataset name
Examples:
ds.name = "my_renamed_dataset" # Updates the name in properties file
read_only
property
¶
read_only: bool
Whether this dataset is opened in read-only mode.
When True, operations that would modify the dataset (adding layers, changing properties, etc.) are not allowed and will raise RuntimeError.
Returns:
-
bool
(bool
) –True if dataset is read-only, False otherwise
sharing_token
property
¶
sharing_token: str
Get a new token for sharing access to this dataset.
Each call generates a fresh token that allows viewing the dataset without logging in. The token can be appended to dataset URLs as a query parameter.
Returns:
-
str
(str
) –Fresh sharing token for dataset access
Examples:
token = ds.sharing_token
url = f"{ds.url}?token={token}"
print("Share this link:", url)
Note
- A new token is generated on each access
- The token provides read-only access
- Anyone with the token can view the dataset
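When the dataset URL may already contain query parameters, appending the token by string formatting can produce an invalid link. A minimal sketch using only the standard library follows; `with_token` is a hypothetical helper, and the `token` parameter name is taken from the example above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_token(url: str, token: str) -> str:
    """Return `url` with a `token` query parameter added or replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query))
    query["token"] = token
    return urlunsplit(parts._replace(query=urlencode(query)))


plain = with_token("https://webknossos.org/datasets/my_org/my_dataset", "abc123")
existing_query = with_token("https://webknossos.org/datasets/my_org/my_dataset?foo=1", "abc123")
```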
tags
property
writable
¶
tags: Tuple[str, ...]
User-assigned tags for organizing and filtering datasets.
Tags allow categorizing and filtering datasets in the webknossos dashboard interface. Changes are immediately synchronized with WEBKNOSSOS.
Returns:
-
Tuple[str, ...]
–Tuple[str, ...]: Currently assigned tags, in string tuple form
Examples:
ds.tags = ["verified", "published"]
print(ds.tags) # ('verified', 'published')
ds.tags = [] # Remove all tags
url
property
¶
url: str
URL to access this dataset in webknossos.
Constructs the full URL to the dataset in the webknossos web interface.
Returns:
-
str
(str
) –Full dataset URL including organization and dataset name
Examples:
print(ds.url) # 'https://webknossos.org/datasets/my_org/my_dataset'
voxel_size
property
¶
voxel_size: Tuple[float, float, float]
Size of each voxel in nanometers along each dimension (x, y, z).
Returns:
-
Tuple[float, float, float]
–Tuple[float, float, float]: Size of each voxel in nanometers for x,y,z dimensions
Examples:
vx, vy, vz = ds.voxel_size
print(f"X resolution is {vx}nm")
voxel_size_with_unit
property
¶
voxel_size_with_unit: VoxelSize
Size of voxels including unit information.
Size of each voxel along each dimension (x, y, z), including unit specification. The default unit is nanometers.
Returns:
-
VoxelSize
(VoxelSize
) –Object containing voxel sizes and their units
ConversionLayerMapping
¶
Bases: Enum
Strategies for mapping file paths to layers when importing images.
These strategies determine how input image files are grouped into layers during
dataset creation using Dataset.from_images()
. If no strategy is provided,
INSPECT_SINGLE_FILE
is used as the default.
If none of the pre-defined strategies fit your needs, you can provide a custom callable that takes a Path and returns a layer name string.
Examples:
Using default strategy:
ds = Dataset.from_images("images/", "dataset/")
Explicit strategy:
ds = Dataset.from_images(
"images/",
"dataset/",
map_filepath_to_layer_name=ConversionLayerMapping.ENFORCE_SINGLE_LAYER
)
Custom mapping function:
ds = Dataset.from_images(
"images/",
"dataset/",
map_filepath_to_layer_name=lambda p: p.stem
)
ENFORCE_LAYER_PER_FILE
class-attribute
instance-attribute
¶
ENFORCE_LAYER_PER_FILE = 'enforce_layer_per_file'
Creates a new layer for each input file. Useful for converting multiple 3D images or when each 2D image should become its own layer.
ENFORCE_LAYER_PER_FOLDER
class-attribute
instance-attribute
¶
ENFORCE_LAYER_PER_FOLDER = 'enforce_layer_per_folder'
Groups files by their containing folder. Each folder becomes one layer. Useful for organized 2D image stacks.
ENFORCE_LAYER_PER_TOPLEVEL_FOLDER
class-attribute
instance-attribute
¶
ENFORCE_LAYER_PER_TOPLEVEL_FOLDER = 'enforce_layer_per_toplevel_folder'
Groups files by their top-level folder. Useful when multiple layers each have their stacks split across subfolders.
ENFORCE_SINGLE_LAYER
class-attribute
instance-attribute
¶
ENFORCE_SINGLE_LAYER = 'enforce_single_layer'
Combines all input files into a single layer. Only useful when all images are 2D slices that should be combined.
INSPECT_EVERY_FILE
class-attribute
instance-attribute
¶
INSPECT_EVERY_FILE = 'inspect_every_file'
Like INSPECT_SINGLE_FILE but determines strategy separately for each file. More flexible but slower for many files.
INSPECT_SINGLE_FILE
class-attribute
instance-attribute
¶
INSPECT_SINGLE_FILE = 'inspect_single_file'
Default strategy. Inspects first image file to determine if data is 2D or 3D. For 2D data uses ENFORCE_LAYER_PER_FOLDER, for 3D uses ENFORCE_LAYER_PER_FILE.
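To get a feeling for how a custom callable groups files, the mapping can be tried on a list of paths before running a conversion. The functions below are hypothetical approximations written for this sketch, not the built-in strategies, and they assume the callable receives paths relative to the input directory.

```python
from collections import defaultdict
from pathlib import Path
from typing import Callable, Dict, Iterable, List


def layer_per_folder(path: Path) -> str:
    """Name the layer after the folder path that contains the file (joined with '_')."""
    return "_".join(path.parent.parts)


def layer_per_toplevel_folder(path: Path) -> str:
    """Name the layer after the first component of the relative path."""
    return path.parts[0]


def group_by_layer(paths: Iterable[Path], mapping: Callable[[Path], str]) -> Dict[str, List[str]]:
    """Group file paths by the layer name that `mapping` assigns to them."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for path in paths:
        groups[mapping(path)].append(path.as_posix())
    return dict(groups)


files = [
    Path("color/part1/0001.tif"),
    Path("color/part2/0002.tif"),
    Path("segmentation/part1/0001.tif"),
]
per_folder = group_by_layer(files, layer_per_folder)
per_toplevel = group_by_layer(files, layer_per_toplevel_folder)
```

With the top-level mapping, the two `color` subfolders end up in one layer; with the per-folder mapping, each subfolder becomes its own layer.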
add_copy_layer
¶
add_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, executor: Optional[Executor] = None) -> Layer
Copy layer from another dataset to this one.
Creates a new layer in this dataset by copying data and metadata from a layer in another dataset.
Parameters:
-
foreign_layer
(Union[str, Path, Layer]
) –Layer to copy (path or Layer object)
-
new_layer_name
(Optional[str]
, default:None
) –Optional name for the new layer, uses original name if None
-
chunk_shape
(Optional[Union[Vec3IntLike, int]]
, default:None
) –Optional shape of chunks for storage
-
chunks_per_shard
(Optional[Union[Vec3IntLike, int]]
, default:None
) –Optional number of chunks per shard
-
data_format
(Optional[Union[str, DataFormat]]
, default:None
) –Optional format to store copied data ('wkw', 'zarr', etc.)
-
compress
(Optional[bool]
, default:None
) –Optional whether to compress copied data
-
executor
(Optional[Executor]
, default:None
) –Optional executor for parallel copying
Returns:
-
Layer
(Layer
) –The newly created copy of the layer
Raises:
-
IndexError
–If target layer name already exists
-
RuntimeError
–If dataset is read-only
Examples:
Copy layer keeping same name:
other_ds = Dataset.open("other/dataset")
copied = ds.add_copy_layer(other_ds.get_layer("color"))
Copy with new name:
copied = ds.add_copy_layer(
other_ds.get_layer("color"),
new_layer_name="color_copy",
compress=True
)
add_fs_copy_layer
¶
add_fs_copy_layer(foreign_layer: Union[str, Path, Layer], new_layer_name: Optional[str] = None) -> Layer
Copies the files at foreign_layer, which belongs to another dataset, to the current dataset via the filesystem. Additionally, the relevant information from the datasource-properties.json of the other dataset is copied too. If new_layer_name is None, the name of the foreign layer is used.
add_layer
¶
add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, bounding_box: Optional[NDBoundingBox] = None, **kwargs: Any) -> Layer
Create a new layer in the dataset.
Creates a new layer with the given name, category, and data type.
Parameters:
-
layer_name
(str
) –Name for the new layer
-
category
(LayerCategoryType
) –Either 'color' or 'segmentation'
-
dtype_per_layer
(Optional[DTypeLike]
, default:None
) –Optional data type for entire layer, e.g. np.uint8
-
dtype_per_channel
(Optional[DTypeLike]
, default:None
) –Optional data type per channel, e.g. np.uint8
-
num_channels
(Optional[int]
, default:None
) –Number of channels (default 1)
-
data_format
(Union[str, DataFormat]
, default:DEFAULT_DATA_FORMAT
) –Format to store data ('wkw', 'zarr', 'zarr3')
-
bounding_box
(Optional[NDBoundingBox]
, default:None
) –Optional initial bounding box of layer
-
**kwargs
(Any
, default:{}
) –Additional arguments: - largest_segment_id: For segmentation layers, initial largest ID - mappings: For segmentation layers, optional ID mappings
Returns:
-
Layer
(Layer
) –The newly created layer
Raises:
-
IndexError
–If layer with given name already exists
-
RuntimeError
–If invalid category specified
-
AttributeError
–If both dtype_per_layer and dtype_per_channel specified
-
AssertionError
–If invalid layer name or WKW format used with remote dataset
Examples:
Create color layer:
layer = ds.add_layer(
"my_raw_microscopy_layer",
LayerCategoryType.COLOR_CATEGORY,
dtype_per_channel=np.uint8,
)
Create segmentation layer:
layer = ds.add_layer(
"my_segmentation_labels",
LayerCategoryType.SEGMENTATION_CATEGORY,
dtype_per_channel=np.uint64
)
Note
The dtype can be specified either per layer or per channel, but not both. If neither is specified, uint8 per channel is used by default. WKW format can only be used with local datasets.
add_layer_for_existing_files
¶
add_layer_for_existing_files(layer_name: str, category: LayerCategoryType, **kwargs: Any) -> Layer
Create a new layer from existing data files.
Adds a layer by discovering and incorporating existing data files that were created externally, rather than creating new ones. The layer properties are inferred from the existing files unless overridden.
Parameters:
-
layer_name
(str
) –Name for the new layer
-
category
(LayerCategoryType
) –Layer category ('color' or 'segmentation')
-
**kwargs
(Any
, default:{}
) –Additional arguments: - num_channels: Override detected number of channels - dtype_per_channel: Override detected data type - data_format: Override detected data format - bounding_box: Override detected bounding box
Returns:
-
Layer
(Layer
) –The newly created layer referencing the existing files
Raises:
-
AssertionError
–If layer already exists or no valid files found
-
RuntimeError
–If dataset is read-only
Examples:
Basic usage:
layer = ds.add_layer_for_existing_files(
"external_data",
"color"
)
Override properties:
layer = ds.add_layer_for_existing_files(
"segmentation_data",
"segmentation",
dtype_per_channel=np.uint64
)
Note
The data files must already exist in the dataset directory under the layer name. Files are analyzed to determine properties like data type and number of channels. Magnifications are discovered automatically.
add_layer_from_images
¶
add_layer_from_images(images: Union[str, FramesSequence, List[Union[str, PathLike]]], layer_name: str, category: Optional[LayerCategoryType] = 'color', data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, mag: Union[int, str, list, tuple, ndarray, Mag] = Mag(1), chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, *, topleft: VecIntLike = zeros(), swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, dtype: Optional[DTypeLike] = None, use_bioformats: Optional[bool] = None, channel: Optional[int] = None, timepoint: Optional[int] = None, czi_channel: Optional[int] = None, batch_size: Optional[int] = None, allow_multiple_layers: bool = False, max_layers: int = 20, truncate_rgba_to_rgb: bool = True, executor: Optional[Executor] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None) -> Layer
Creates a new layer called layer_name with mag mag from images.
images can be one of the following:
- glob-string
- list of paths
- pims.FramesSequence instance
Please see the pims docs for more information.
This method needs extra packages like tifffile or pylibczirw. Please install the respective extras, e.g. using python -m pip install "webknossos[all]".
Further Arguments:
- category: "color" by default, may be set to "segmentation"
- data_format: by default wkw files are written, may be set to "zarr"
- mag: magnification to use for the written data
- chunk_shape, chunks_per_shard, compress: adjust how the data is stored on disk
- topleft: set an offset in Mag(1) to start writing the data, only affecting the output
- swap_xy: set to True to interchange x and y axis before writing to disk
- flip_x, flip_y, flip_z: set to True to reverse the respective axis before writing to disk
- dtype: the read image data will be converted to this dtype using numpy.ndarray.astype
- use_bioformats: set to True to only use the pims bioformats adapter directly, needs a JVM, set to False to forbid using the bioformats adapter, by default it is tried as a last option
- channel: may be used to select a single channel, if multiple are available
- timepoint: for timeseries, select a timepoint to use by specifying it as an int, starting from 0
- czi_channel: may be used to select a channel for .czi images, which differs from normal color-channels
- batch_size: size to process the images (influences RAM consumption), must be a multiple of the chunk-size z-axis for uncompressed and the shard-size z-axis for compressed layers, default is the chunk-size or shard-size respectively
- allow_multiple_layers: set to True if timepoints or channels may result in multiple layers being added (only the first is returned)
- max_layers: only applies if allow_multiple_layers=True, limits the number of layers added via different channels or timepoints
- truncate_rgba_to_rgb: only applies if allow_multiple_layers=True, set to False to write four channels into layers instead of an RGB channel
- executor: pass a ClusterExecutor instance to parallelize the conversion jobs across the batches
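The batch_size constraint can be checked up front. The helper below is a hypothetical sketch, not part of the library; it assumes that the shard size along z equals the chunk size along z multiplied by the number of chunks per shard along z.

```python
from typing import Optional, Tuple

Vec3 = Tuple[int, int, int]


def resolve_batch_size(
    batch_size: Optional[int],
    chunk_shape: Vec3,
    chunks_per_shard: Vec3,
    compress: bool,
) -> int:
    """Return a valid batch size or raise if the requested one does not fit.

    Uncompressed layers need a multiple of the chunk size along z,
    compressed layers a multiple of the shard size along z.
    """
    shard_z = chunk_shape[2] * chunks_per_shard[2]  # sketch assumption, see lead-in
    required = shard_z if compress else chunk_shape[2]
    if batch_size is None:
        return required  # default: chunk size or shard size respectively
    if batch_size <= 0 or batch_size % required != 0:
        raise ValueError(f"batch_size must be a positive multiple of {required}")
    return batch_size


default_uncompressed = resolve_batch_size(None, (32, 32, 32), (32, 32, 32), compress=False)
default_compressed = resolve_batch_size(None, (32, 32, 32), (32, 32, 32), compress=True)
explicit = resolve_batch_size(64, (32, 32, 32), (32, 32, 32), compress=False)
```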
add_remote_layer
¶
add_remote_layer(foreign_layer: Union[str, UPath, Layer], new_layer_name: Optional[str] = None) -> Layer
Add a remote layer from another dataset.
Creates a layer that references data from a remote dataset. The image data will be streamed on-demand when accessed.
Parameters:
-
foreign_layer
(Union[str, UPath, Layer]
) –Remote layer to add (path or Layer object)
-
new_layer_name
(Optional[str]
, default:None
) –Optional name for the new layer, uses original name if None
Returns:
-
Layer
(Layer
) –The newly created remote layer referencing the foreign data
Raises:
-
IndexError
–If target layer name already exists
-
AssertionError
–If trying to add non-remote layer or same origin dataset
-
RuntimeError
–If dataset is read-only
Examples:
ds = Dataset.open("other/dataset")
remote_ds = Dataset.open_remote("my_dataset", "my_org_id")
new_layer = ds.add_remote_layer(
remote_ds.get_layer("color")
)
Note
Changes to the original layer's properties afterwards won't affect this dataset. Data is only referenced, not copied.
add_symlink_layer
¶
add_symlink_layer(foreign_layer: Union[str, Path, Layer], make_relative: bool = False, new_layer_name: Optional[str] = None) -> Layer
Create symbolic link to layer from another dataset.
Instead of copying data, creates a symbolic link to the original layer's data and copies only the layer metadata. Changes to the original layer's properties, e.g. bounding box, afterwards won't affect this dataset and vice-versa.
Parameters:
-
foreign_layer
(Union[str, Path, Layer]
) –Layer to link to (path or Layer object)
-
make_relative
(bool
, default:False
) –Whether to create relative symlinks
-
new_layer_name
(Optional[str]
, default:None
) –Optional name for the linked layer, uses original name if None
Returns:
-
Layer
(Layer
) –The newly created symbolic link layer
Raises:
-
IndexError
–If target layer name already exists
-
AssertionError
–If trying to create symlinks in/to remote datasets
-
RuntimeError
–If dataset is read-only
Examples:
other_ds = Dataset.open("other/dataset")
linked = ds.add_symlink_layer(
other_ds.get_layer("color"),
make_relative=True
)
Note
Only works with local file systems, cannot link remote datasets or create symlinks in remote datasets.
announce_manual_upload
classmethod
¶
announce_manual_upload(dataset_name: str, organization: str, initial_team_ids: List[str], folder_id: str, token: Optional[str] = None) -> None
Announce a manual dataset upload to WEBKNOSSOS.
Used when manually uploading datasets to the file system of a datastore. Creates database entries and sets access rights on the webknossos instance before the actual data upload.
Parameters:
-
dataset_name
(str
) –Name for the new dataset
-
organization
(str
) –Organization ID to upload to
-
initial_team_ids
(List[str]
) –List of team IDs to grant initial access
-
folder_id
(str
) –ID of folder where dataset should be placed
-
token
(Optional[str]
, default:None
) –Optional authentication token
Note
This is typically only used by administrators with direct file system access to the WEBKNOSSOS datastore. Most users should use upload() instead.
Examples:
Dataset.announce_manual_upload(
"my_dataset",
"my_organization",
["team_a", "team_b"],
"folder_123"
)
calculate_bounding_box
¶
calculate_bounding_box() -> NDBoundingBox
Calculate the enclosing bounding box of all layers.
Finds the smallest box that contains all data from all layers in the dataset.
Returns:
-
NDBoundingBox
(NDBoundingBox
) –Bounding box containing all layer data
Examples:
bbox = ds.calculate_bounding_box()
print(f"Dataset spans {bbox.size} voxels")
print(f"Dataset starts at {bbox.topleft}")
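The enclosing box is the component-wise minimum of all top-left corners together with the component-wise maximum of all bottom-right corners. A minimal 3D sketch follows; the `Box` class and `enclosing_box` function are hypothetical and independent of the library's NDBoundingBox.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

Vec3 = Tuple[int, int, int]


@dataclass(frozen=True)
class Box:
    """Hypothetical axis-aligned box given by its top-left corner and size."""

    topleft: Vec3
    size: Vec3

    @property
    def bottomright(self) -> Vec3:
        return tuple(t + s for t, s in zip(self.topleft, self.size))


def enclosing_box(boxes: Iterable[Box]) -> Box:
    """Return the smallest box that contains every box in `boxes`."""
    boxes = list(boxes)
    if not boxes:
        raise ValueError("need at least one box")
    topleft = tuple(min(box.topleft[axis] for box in boxes) for axis in range(3))
    bottomright = tuple(max(box.bottomright[axis] for box in boxes) for axis in range(3))
    size = tuple(br - tl for tl, br in zip(topleft, bottomright))
    return Box(topleft, size)


color = Box(topleft=(0, 0, 0), size=(100, 100, 50))
segmentation = Box(topleft=(50, 50, 10), size=(100, 100, 20))
combined = enclosing_box([color, segmentation])
```

In this example the combined box starts at the origin and spans 150 x 150 x 50 voxels, because the segmentation box extends further in x and y while the color box extends further in z.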
compress
¶
compress(executor: Optional[Executor] = None) -> None
Compress all uncompressed magnifications in-place.
Compresses the data of all magnification levels that aren't already compressed, for all layers in the dataset.
Parameters:
-
executor
(Optional[Executor]
, default:None
) –Optional executor for parallel compression
Raises:
-
RuntimeError
–If dataset is read-only
Examples:
ds.compress()
Note
If data is already compressed, this will have no effect.
copy_dataset
¶
copy_dataset(new_dataset_path: Union[str, Path], voxel_size: Optional[Tuple[float, float, float]] = None, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[Vec3IntLike, int]] = None, data_format: Optional[Union[str, DataFormat]] = None, compress: Optional[bool] = None, args: Optional[Namespace] = None, executor: Optional[Executor] = None, *, voxel_size_with_unit: Optional[VoxelSize] = None, chunk_size: Optional[Union[Vec3IntLike, int]] = None, block_len: Optional[int] = None, file_len: Optional[int] = None) -> Dataset
Creates an independent copy of the dataset with all layers at a new location. Data storage parameters can be customized for the copied dataset.
Parameters:
-
new_dataset_path
(Union[str, Path]
) –Path where new dataset should be created
-
voxel_size
(Optional[Tuple[float, float, float]]
, default:None
) –Optional tuple of floats (x,y,z) specifying voxel size in nanometers
-
chunk_shape
(Optional[Union[Vec3IntLike, int]]
, default:None
) –Optional shape of chunks for data storage
-
chunks_per_shard
(Optional[Union[Vec3IntLike, int]]
, default:None
) –Optional number of chunks per shard
-
data_format
(Optional[Union[str, DataFormat]]
, default:None
) –Optional format to store data ('wkw', 'zarr', 'zarr3')
-
compress
(Optional[bool]
, default:None
) –Optional whether to compress data
-
executor
(Optional[Executor]
, default:None
) –Optional executor for parallel copying
-
voxel_size_with_unit
(Optional[VoxelSize]
, default:None
) –Optional voxel size specification with units
-
**kwargs
–Additional deprecated arguments: - chunk_size: Use chunk_shape instead - block_len: Use chunk_shape instead - file_len: Use chunks_per_shard instead - args: Use executor instead
Returns:
-
Dataset
(Dataset
) –The newly created copy
Raises:
-
AssertionError
–If trying to copy WKW layers to remote dataset
Examples:
Basic copy:
copied = ds.copy_dataset("path/to/copy")
Copy with different storage:
copied = ds.copy_dataset(
"path/to/copy",
data_format="zarr",
compress=True
)
Note
WKW layers can only be copied to datasets on local file systems. For remote datasets, use data_format='zarr'.
create
classmethod
¶
create(dataset_path: Union[str, PathLike], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset
Deprecated, please use the constructor Dataset()
instead.
delete_layer
¶
delete_layer(layer_name: str) -> None
Delete a layer from the dataset.
Removes the layer's data and metadata from disk completely. This deletes both the datasource-properties.json entry and all data files for the layer.
Parameters:
-
layer_name
(str
) –Name of layer to delete
Raises:
-
IndexError
–If no layer with the given name exists
-
RuntimeError
–If dataset is read-only
Examples:
ds.delete_layer("old_layer")
print("Remaining layers:", list(ds.layers))
download
classmethod
¶
download(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None, bbox: Optional[BoundingBox] = None, layers: Union[List[str], str, None] = None, mags: Optional[List[Mag]] = None, path: Optional[Union[PathLike, str]] = None, exist_ok: bool = False) -> Dataset
Downloads a dataset and returns the Dataset instance.
dataset_name_or_url may be a dataset name or a full URL to a dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view. If a URL is used, organization_id, webknossos_url and sharing_token must not be set.
organization_id may be supplied if a dataset name was used in the previous argument. It defaults to your current organization from the webknossos_context. You can find your organization_id here.
sharing_token may be supplied if a dataset name was used and can specify a sharing token.
webknossos_url may be supplied if a dataset name was used. It specifies which webknossos instance to search for the dataset and defaults to the URL from your current webknossos_context, using https://webknossos.org as a fallback.
bbox, layers, and mags specify which parts of the dataset to download. If nothing is specified, the whole image, all layers, and all mags are downloaded, respectively.
path and exist_ok specify where to save the downloaded dataset and whether to overwrite if the path exists.
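To see why the extra arguments are redundant when a URL is given, the sketch below splits a dataset view URL into its parts. It assumes the URL layout of the example above (`/datasets/<organization_id>/<dataset_name>/view`); other layouts may exist. `split_dataset_view_url` is a hypothetical helper, not a webknossos function.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_dataset_view_url(url: str) -> Tuple[str, str, str]:
    """Return (webknossos_url, organization_id, dataset_name) for a view URL."""
    parsed = urlparse(url)
    parts = [p for p in parsed.path.split("/") if p]
    # Assumed layout: datasets/<organization_id>/<dataset_name>/view
    if len(parts) != 4 or parts[0] != "datasets" or parts[3] != "view":
        raise ValueError(f"Not a dataset view URL: {url}")
    return f"{parsed.scheme}://{parsed.netloc}", parts[1], parts[2]


print(split_dataset_view_url(
    "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
))
# ('https://webknossos.org', 'scalable_minds', 'l4_sample_dev')
```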
downsample
¶
downsample(sampling_mode: SamplingModes = ANISOTROPIC, coarsest_mag: Optional[Mag] = None, executor: Optional[Executor] = None) -> None
Generate downsampled magnifications for all layers.
Creates lower resolution versions (coarser magnifications) of all layers that are not yet downsampled, up to the specified coarsest magnification.
Parameters:
-
sampling_mode
(SamplingModes
, default:ANISOTROPIC
) –Strategy for downsampling (e.g. ANISOTROPIC, MAX)
-
coarsest_mag
(Optional[Mag]
, default:None
) –Optional maximum/coarsest magnification to generate
-
executor
(Optional[Executor]
, default:None
) –Optional executor for parallel processing
Raises:
-
RuntimeError
–If dataset is read-only
Examples:
Basic downsampling:
ds.downsample()
With custom parameters:
ds.downsample(
sampling_mode=SamplingModes.ANISOTROPIC,
coarsest_mag=Mag(8),
)
Note
- ANISOTROPIC sampling creates anisotropic downsampling until dataset is isotropic
- Other modes like MAX, CONSTANT etc create regular downsampling patterns
- If magnifications already exist they will not be regenerated
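The idea behind anisotropic downsampling can be illustrated with a simplified heuristic: at each step, double only the axes whose physical voxel extent is still close to the smallest one. This is a sketch for intuition only. The function name and the square-root-of-two threshold are assumptions and do not reproduce the library's actual algorithm.

```python
from typing import List, Tuple

Vec3 = Tuple[int, int, int]


def sketch_anisotropic_mags(
    voxel_size: Tuple[float, float, float], steps: int
) -> List[Vec3]:
    """Return a plausible sequence of magnifications (simplified heuristic)."""
    mag = [1, 1, 1]
    result: List[Vec3] = []
    for _ in range(steps):
        # Physical extent of one voxel at the current magnification.
        extent = [v * m for v, m in zip(voxel_size, mag)]
        threshold = min(extent) * 2 ** 0.5
        # Double the axes within a factor sqrt(2) of the finest axis;
        # the finest axis itself is always doubled, so progress is guaranteed.
        mag = [m * 2 if e <= threshold else m for m, e in zip(mag, extent)]
        result.append((mag[0], mag[1], mag[2]))
    return result


print(sketch_anisotropic_mags((11.2, 11.2, 25), steps=3))
# [(2, 2, 1), (4, 4, 2), (8, 8, 4)]
```

With a voxel size of (11.2, 11.2, 25), the first step leaves z untouched because z is already coarser than x and y. Later steps double all three axes.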
explore_and_add_remote
classmethod
¶
explore_and_add_remote(dataset_uri: Union[str, PathLike], dataset_name: str, folder_path: str) -> RemoteDataset
Explore and add an external dataset as a remote dataset.
Adds a dataset from an external location (e.g. S3, Google Cloud Storage, or HTTPS) to WEBKNOSSOS by inspecting its layout and metadata without copying the data.
Parameters:
-
dataset_uri
(Union[str, PathLike]
) –URI pointing to the remote dataset location
-
dataset_name
(str
) –Name to register dataset under in WEBKNOSSOS
-
folder_path
(str
) –Path in WEBKNOSSOS folder structure where dataset should appear
Returns:
-
RemoteDataset
(RemoteDataset
) –The newly added dataset accessible via WEBKNOSSOS
Examples:
remote = Dataset.explore_and_add_remote(
"s3://bucket/dataset",
"my_dataset",
"Datasets/Research"
)
Note
The dataset files must be accessible from the WEBKNOSSOS server for this to work. The data will be streamed directly from the source.
from_images
classmethod
¶
from_images(input_path: Union[str, PathLike], output_path: Union[str, PathLike], voxel_size: Optional[Tuple[float, float, float]] = None, name: Optional[str] = None, *, map_filepath_to_layer_name: Union[ConversionLayerMapping, Callable[[Path], str]] = INSPECT_SINGLE_FILE, z_slices_sort_key: Callable[[Path], Any] = natsort_keygen(), voxel_size_with_unit: Optional[VoxelSize] = None, layer_name: Optional[str] = None, layer_category: Optional[LayerCategoryType] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, chunk_shape: Optional[Union[Vec3IntLike, int]] = None, chunks_per_shard: Optional[Union[int, Vec3IntLike]] = None, compress: bool = False, swap_xy: bool = False, flip_x: bool = False, flip_y: bool = False, flip_z: bool = False, use_bioformats: Optional[bool] = None, max_layers: int = 20, batch_size: Optional[int] = None, executor: Optional[Executor] = None) -> Dataset
This method imports image data in a folder or from a file as a webknossos dataset.
The image data can be 3D images (such as multipage tiffs) or stacks of 2D images. Multiple 3D images or image stacks are mapped to different layers based on the mapping strategy.
The exact mapping is handled by the argument map_filepath_to_layer_name
, which can be a pre-defined
strategy from the enum ConversionLayerMapping
, or a custom callable, taking
a path of an image file and returning the corresponding layer name. All
files belonging to the same layer name are then grouped. In case of
multiple files per layer, those are usually mapped to the z-dimension.
The order of the z-slices can be customized by setting
z_slices_sort_key
.
For more fine-grained control, please create an empty dataset and use add_layer_from_images
.
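As an illustration of a custom mapping, the sketch below defines a callable that takes the path of an image file and returns a layer name, then groups a few example paths the way the description above outlines. The function name and the example file paths are made up for this sketch.

```python
from collections import defaultdict
from pathlib import Path
from typing import Dict, List


def layer_name_from_parent_dir(path: Path) -> str:
    """Use the name of the directory containing the image as the layer name."""
    return path.parent.name


# Hypothetical input files: two z-slices for "color", one for "mask".
files = [
    Path("scan/color/0001.tif"),
    Path("scan/color/0002.tif"),
    Path("scan/mask/0001.tif"),
]

# Files that map to the same layer name end up in the same group.
groups: Dict[str, List[Path]] = defaultdict(list)
for file_path in files:
    groups[layer_name_from_parent_dir(file_path)].append(file_path)

print({name: len(paths) for name, paths in groups.items()})
# {'color': 2, 'mask': 1}
```

Such a callable could be passed as `map_filepath_to_layer_name=layer_name_from_parent_dir`.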
Parameters:
-
input_path
(Union[str, PathLike]
) –Path to input image files
-
output_path
(Union[str, PathLike]
) –Output path for created dataset
-
voxel_size
(Optional[Tuple[float, float, float]]
, default:None
) –Optional tuple of floats (x,y,z) for voxel size in nm
-
name
(Optional[str]
, default:None
) –Optional name for dataset
-
map_filepath_to_layer_name
(Union[ConversionLayerMapping, Callable[[Path], str]]
, default:INSPECT_SINGLE_FILE
) –Strategy for mapping files to layers, either a ConversionLayerMapping enum value or callable taking Path and returning str
-
z_slices_sort_key
(Callable[[Path], Any]
, default:natsort_keygen()
) –Optional key function for sorting z-slices
-
voxel_size_with_unit
(Optional[VoxelSize]
, default:None
) –Optional voxel size with unit specification
-
layer_name
(Optional[str]
, default:None
) –Optional name for layer(s)
-
layer_category
(Optional[LayerCategoryType]
, default:None
) –Optional category override (LayerCategoryType.color / LayerCategoryType.segmentation)
-
data_format
(Union[str, DataFormat]
, default:DEFAULT_DATA_FORMAT
) –Format to store data in ('wkw'/'zarr')
-
chunk_shape
(Optional[Union[Vec3IntLike, int]]
, default:None
) –Optional shape of chunks to store data in
-
chunks_per_shard
(Optional[Union[int, Vec3IntLike]]
, default:None
) –Optional number of chunks per shard
-
compress
(bool
, default:False
) –Whether to compress the data
-
swap_xy
(bool
, default:False
) –Whether to swap x and y axes
-
flip_x
(bool
, default:False
) –Whether to flip the x axis
-
flip_y
(bool
, default:False
) –Whether to flip the y axis
-
flip_z
(bool
, default:False
) –Whether to flip the z axis
-
use_bioformats
(Optional[bool]
, default:None
) –Whether to use bioformats for reading
-
max_layers
(int
, default:20
) –Maximum number of layers to create
-
batch_size
(Optional[int]
, default:None
) –Size of batches for processing
-
executor
(Optional[Executor]
, default:None
) –Optional executor for parallelization
Returns:
-
Dataset
(Dataset
) –The created dataset instance
Examples:
ds = Dataset.from_images("path/to/images/",
"path/to/dataset/",
voxel_size=(1, 1, 1))
Note
This method needs extra packages like tifffile or pylibczirw.
Install with pip install "webknossos[all]"
and pip install --extra-index-url https://pypi.scm.io/simple/ "webknossos[czi]"
.
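The z_slices_sort_key argument accepts any callable that maps a path to a sortable value. The default natural sort handles most numbered files. A custom key, such as the hypothetical one sketched here, may help when file names contain several numbers and only the last one is the slice index.

```python
import re
from pathlib import Path


def z_index_from_filename(path: Path) -> int:
    """Sort key: the last run of digits in the file name (without extension)."""
    numbers = re.findall(r"\d+", path.stem)
    if not numbers:
        raise ValueError(f"No slice number found in {path.name}")
    return int(numbers[-1])


# Hypothetical file names; a plain string sort would put s_10 before s_9.
slices = [Path("s_10.tif"), Path("s_9.tif"), Path("s_100.tif")]
print([p.name for p in sorted(slices, key=z_index_from_filename)])
# ['s_9.tif', 's_10.tif', 's_100.tif']
```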
get_color_layer
¶
get_color_layer() -> Layer
Deprecated, please use get_color_layers()
.
Returns the only color layer. Fails with a RuntimeError if there are multiple color layers or none.
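When migrating away from the deprecated method, the "exactly one layer" check can be kept with a small helper around get_color_layers(). The helper `only` below is a hypothetical utility written for this sketch, not part of webknossos.

```python
from typing import List, TypeVar

T = TypeVar("T")


def only(items: List[T], what: str) -> T:
    """Return the single element of items, or raise if there is not exactly one."""
    if len(items) != 1:
        raise RuntimeError(f"Expected exactly one {what}, found {len(items)}")
    return items[0]


print(only(["color"], "color layer"))  # color
```

With a dataset at hand, this might be used as `only(ds.get_color_layers(), "color layer")`.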
get_color_layers
¶
get_color_layers() -> List[Layer]
Get all color layers in the dataset.
Provides access to all layers with category 'color'. Useful when a dataset contains multiple color layers.
Returns:
-
List[Layer]
–List of all color layers in order
Examples:
Print all color layer names:
for layer in ds.get_color_layers():
print(layer.name)
Note
If you need only a single color layer, consider using
get_layer()
with the specific layer name instead.
get_layer
¶
get_layer(layer_name: str) -> Layer
Get a specific layer from this dataset.
Parameters:
-
layer_name
(str
) –Name of the layer to retrieve
Returns:
-
Layer
(Layer
) –The requested layer object
Raises:
-
IndexError
–If no layer with the given name exists
Examples:
color_layer = ds.get_layer("color")
seg_layer = ds.get_layer("segmentation")
Note
Use layers
property to access all layers at once.
get_or_add_layer
¶
get_or_add_layer(layer_name: str, category: LayerCategoryType, dtype_per_layer: Optional[DTypeLike] = None, dtype_per_channel: Optional[DTypeLike] = None, num_channels: Optional[int] = None, data_format: Union[str, DataFormat] = DEFAULT_DATA_FORMAT, **kwargs: Any) -> Layer
Get an existing layer or create a new one.
Gets a layer with the given name if it exists, otherwise creates a new layer with the specified parameters.
Parameters:
-
layer_name
(str
) –Name of the layer to get or create
-
category
(LayerCategoryType
) –Layer category ('color' or 'segmentation')
-
dtype_per_layer
(Optional[DTypeLike]
, default:None
) –Optional data type for entire layer
-
dtype_per_channel
(Optional[DTypeLike]
, default:None
) –Optional data type per channel
-
num_channels
(Optional[int]
, default:None
) –Optional number of channels
-
data_format
(Union[str, DataFormat]
, default:DEFAULT_DATA_FORMAT
) –Format to store data ('wkw', 'zarr', etc.)
-
**kwargs
(Any
, default:{}
) –Additional arguments passed to add_layer()
Returns:
-
Layer
(Layer
) –The existing or newly created layer
Raises:
-
AssertionError
–If existing layer's properties don't match specified parameters
-
ValueError
–If both dtype_per_layer and dtype_per_channel specified
-
RuntimeError
–If invalid category specified
Examples:
layer = ds.get_or_add_layer(
"segmentation",
LayerCategoryType.SEGMENTATION_CATEGORY,
dtype_per_channel=np.uint64,
)
Note
The dtype can be specified either per layer or per channel, but not both. For existing layers, the parameters are validated against the layer properties.
get_or_create
classmethod
¶
get_or_create(dataset_path: Union[str, Path], voxel_size: Tuple[float, float, float], name: Optional[str] = None) -> Dataset
Deprecated, please use the constructor Dataset()
instead.
get_remote_datasets
staticmethod
¶
get_remote_datasets(organization_id: Optional[str] = None, tags: Optional[Union[str, Sequence[str]]] = None) -> Mapping[str, RemoteDataset]
Get available datasets from WEBKNOSSOS.
Returns a mapping of dataset names to lazy-initialized RemoteDataset objects for all datasets visible to the specified organization or current user.
Parameters:
-
organization_id
(Optional[str]
, default:None
) –Optional organization to get datasets from. Defaults to organization of logged in user.
-
tags
(Optional[Union[str, Sequence[str]]]
, default:None
) –Optional tag(s) to filter datasets by. Can be a single tag string or sequence of tags. Only returns datasets with all specified tags.
Returns:
-
Mapping[str, RemoteDataset]
–Dict mapping dataset names to RemoteDataset objects
Examples:
List all available datasets:
datasets = Dataset.get_remote_datasets()
print(sorted(datasets.keys()))
Get datasets for specific organization:
org_datasets = Dataset.get_remote_datasets("my_organization")
ds = org_datasets["dataset_name"]
Filter datasets by tag:
published = Dataset.get_remote_datasets(tags="published")
tagged = Dataset.get_remote_datasets(tags=["tag1", "tag2"])
Note
RemoteDataset objects are initialized lazily when accessed for the first time. The mapping object provides a fast way to list and look up available datasets.
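The tag filter keeps only datasets that carry all requested tags. The following sketch mimics that semantics on a plain dictionary. The function name and the sample catalog are invented for illustration, and no request to WEBKNOSSOS is made.

```python
from typing import Dict, List, Sequence, Set, Union


def names_with_all_tags(
    tags_by_dataset: Dict[str, Set[str]], tags: Union[str, Sequence[str]]
) -> List[str]:
    """Return the sorted names of datasets that have every requested tag."""
    # A single string counts as one tag, not as a sequence of characters.
    required = {tags} if isinstance(tags, str) else set(tags)
    return sorted(
        name for name, have in tags_by_dataset.items() if required <= have
    )


catalog = {
    "ds_a": {"published", "tag1", "tag2"},
    "ds_b": {"tag1"},
    "ds_c": {"published"},
}
print(names_with_all_tags(catalog, "published"))       # ['ds_a', 'ds_c']
print(names_with_all_tags(catalog, ["tag1", "tag2"]))  # ['ds_a']
```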
get_segmentation_layer
¶
get_segmentation_layer() -> SegmentationLayer
Deprecated, please use get_segmentation_layers()
.
Returns the only segmentation layer. Fails with an IndexError if there are multiple segmentation layers or none.
get_segmentation_layers
¶
get_segmentation_layers() -> List[SegmentationLayer]
Get all segmentation layers in the dataset.
Provides access to all layers with category 'segmentation'. Useful when a dataset contains multiple segmentation layers.
Returns:
-
List[SegmentationLayer]
–List of all segmentation layers in order
Examples:
Print all segmentation layer names:
for layer in ds.get_segmentation_layers():
print(layer.name)
Note
If you need only a single segmentation layer, consider using
get_layer()
with the specific layer name instead.
open
classmethod
¶
open(dataset_path: Union[str, PathLike]) -> Dataset
Opens an existing dataset at dataset_path. For datasets hosted on a webknossos server, use Dataset.open_remote()
instead.
open_remote
classmethod
¶
open_remote(dataset_name_or_url: str, organization_id: Optional[str] = None, sharing_token: Optional[str] = None, webknossos_url: Optional[str] = None) -> RemoteDataset
Opens a remote webknossos dataset. Image data is accessed via network requests.
Dataset metadata such as allowed teams or the sharing token can be read and set
via the respective RemoteDataset
properties.
Parameters:
-
dataset_name_or_url
(str
) –Either dataset name or full URL to dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view
-
organization_id
(Optional[str]
, default:None
) –Optional organization ID if using dataset name. Can be found here
-
sharing_token
(Optional[str]
, default:None
) –Optional sharing token for dataset access
-
webknossos_url
(Optional[str]
, default:None
) –Optional custom webknossos URL, defaults to context URL, usually https://webknossos.org
Returns:
-
RemoteDataset
(RemoteDataset
) –Dataset instance for remote access
Examples:
ds = Dataset.open_remote("https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view")
Note
If supplying a URL, organization_id, webknossos_url and sharing_token must not be set.
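The rule in the note can be checked before making a request. The sketch below is a hypothetical pre-check, assuming that a URL is recognised by an http:// or https:// prefix. It does not reflect how the library validates its arguments internally.

```python
from typing import Optional


def check_open_remote_args(
    dataset_name_or_url: str,
    organization_id: Optional[str] = None,
    sharing_token: Optional[str] = None,
    webknossos_url: Optional[str] = None,
) -> None:
    """Raise ValueError if a URL is combined with any of the extra arguments."""
    is_url = dataset_name_or_url.startswith(("http://", "https://"))
    extras = {
        "organization_id": organization_id,
        "sharing_token": sharing_token,
        "webknossos_url": webknossos_url,
    }
    conflicting = [name for name, value in extras.items() if value is not None]
    if is_url and conflicting:
        raise ValueError(
            "When a URL is given, do not set: " + ", ".join(conflicting)
        )


# A dataset name may be combined with an organization; a URL may not.
check_open_remote_args("my_dataset", organization_id="organization_id")
```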
shallow_copy_dataset
¶
shallow_copy_dataset(new_dataset_path: Union[str, PathLike], name: Optional[str] = None, make_relative: bool = False, layers_to_ignore: Optional[Iterable[str]] = None) -> Dataset
Create a new dataset that uses symlinks to reference data.
Links all magnifications and layer directories from the original dataset via symlinks rather than copying data. Useful for creating alternative views or exposing datasets to webknossos.
Parameters:
-
new_dataset_path
(Union[str, PathLike]
) –Path where new dataset should be created
-
name
(Optional[str]
, default:None
) –Optional name for the new dataset, uses original name if None
-
make_relative
(bool
, default:False
) –Whether to create relative symlinks
-
layers_to_ignore
(Optional[Iterable[str]]
, default:None
) –Optional iterable of layer names to exclude
Returns:
-
Dataset
(Dataset
) –The newly created dataset with linked layers
Raises:
-
AssertionError
–If trying to link remote datasets
-
RuntimeError
–If dataset is read-only
Examples:
Basic shallow copy:
linked = ds.shallow_copy_dataset("path/to/link")
With relative links excluding layers:
linked = ds.shallow_copy_dataset(
"path/to/link",
make_relative=True,
layers_to_ignore=["temp_layer"]
)
Note
Only works with datasets on local filesystems. Cannot create shallow copies of remote datasets or create shallow copies in remote locations.
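The difference between absolute and relative symlinks can be shown with the standard library alone. This sketch illustrates the general idea behind make_relative on a temporary directory. `link_dir` is a hypothetical helper and not the library's implementation, and it assumes a platform that permits creating symlinks.

```python
import os
import tempfile
from pathlib import Path


def link_dir(source: Path, link: Path, make_relative: bool = False) -> None:
    """Create a symlink at link that points to the directory source."""
    link.parent.mkdir(parents=True, exist_ok=True)
    target = source.resolve()
    if make_relative:
        # Express the target relative to the directory that holds the link.
        target = Path(os.path.relpath(target, start=link.parent.resolve()))
    link.symlink_to(target, target_is_directory=True)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "original" / "color").mkdir(parents=True)
    link_dir(root / "original" / "color", root / "linked" / "color", make_relative=True)
    # The stored link target is relative, so the pair can be moved together.
    print(os.path.isabs(os.readlink(root / "linked" / "color")))  # False
```

Relative links keep working when both datasets are moved together, while absolute links keep working when only the linking dataset is moved.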
trigger_reload_in_datastore
classmethod
¶
trigger_reload_in_datastore(dataset_name: str, organization: str, token: Optional[str] = None) -> None
Trigger a manual reload of the dataset's properties.
For manually uploaded datasets, properties are normally updated automatically after a few minutes. This method forces an immediate reload.
This is typically only needed after manual changes to the dataset's files. Cannot be used for local datasets.
Parameters:
-
dataset_name
(str
) –Name of dataset to reload
-
organization
(str
) –Organization ID where dataset is located
-
token
(Optional[str]
, default:None
) –Optional authentication token
Examples:
# Force reload after manual file changes
Dataset.trigger_reload_in_datastore(
"my_dataset",
"organization_id"
)
upload
¶
upload(new_dataset_name: Optional[str] = None, layers_to_link: Optional[List[Union[LayerToLink, Layer]]] = None, jobs: Optional[int] = None) -> RemoteDataset
Upload this dataset to webknossos.
Copies all data and metadata to webknossos, creating a new dataset that can be accessed remotely. For large datasets, existing layers can be linked instead of re-uploaded.
Parameters:
-
new_dataset_name
(Optional[str]
, default:None
) –Optional name for the uploaded dataset
-
layers_to_link
(Optional[List[Union[LayerToLink, Layer]]]
, default:None
) –Optional list of layers that should link to existing data instead of being uploaded
-
jobs
(Optional[int]
, default:None
) –Optional number of parallel upload jobs, defaults to 5
Returns:
-
RemoteDataset
(RemoteDataset
) –Reference to the newly created remote dataset
Examples:
Simple upload:
remote_ds = ds.upload("my_new_dataset")
print(remote_ds.url)
Link existing layers:
link = LayerToLink.from_remote_layer(existing_layer)
remote_ds = ds.upload(layers_to_link=[link])