webknossos.dataset.dataset
A dataset is the entry point of the Dataset API. An existing dataset on disk can be opened, or new datasets can be created.
A dataset stores its data in .wkw files on disk, with metadata in datasource-properties.json. The information in those files is kept in sync with the object.
Each dataset consists of one or more layers (webknossos.dataset.layer.Layer), which themselves can comprise multiple magnifications (webknossos.dataset.mag_view.MagView).
When using Dataset.open_remote(), an instance of the RemoteDataset subclass is returned.
Creates a new dataset and the associated datasource-properties.json.
If the dataset already exists and exist_ok is set to True, it is opened (the provided voxel_size and name are asserted to match the existing dataset). Currently, exist_ok=True is the deprecated default and will change in future releases. Please use Dataset.open if you intend to open an existing dataset and don't want/need the creation behavior.
scale is deprecated, please use voxel_size instead.
To open an existing dataset on disk, simply call Dataset.open("your_path"). This requires datasource-properties.json to exist in this folder. Based on the datasource-properties.json, a dataset object is constructed. Only layers and magnifications that are listed in the properties are loaded (even though there might exist more layers or magnifications on disk).
The dataset_path refers to the top-level directory of the dataset (excluding layer or magnification names).
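As a minimal sketch of the create-then-open workflow described above (the helper function, path, and voxel size are hypothetical examples, not part of the API):

```python
def create_then_open(wk, path="testoutput/my_dataset"):
    # Sketch: `wk` is the imported webknossos module.
    # Creating writes datasource-properties.json into the folder:
    wk.Dataset(path, voxel_size=(11.2, 11.2, 25.0))
    # Re-opening requires datasource-properties.json to exist there:
    return wk.Dataset.open(path)
```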
Opens a remote webknossos dataset. Image data is accessed via network requests. Dataset metadata such as allowed teams or the sharing token can be read and set via the respective RemoteDataset properties.
dataset_name_or_url may be a dataset name or a full URL to a dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view. If a URL is used, organization_id, webknossos_url and sharing_token must not be set.
organization_id may be supplied if a dataset name was used in the previous argument; it defaults to your current organization from the webknossos_context. You can find your organization_id here.
sharing_token may be supplied if a dataset name was used and can specify a sharing token.
webknossos_url may be supplied if a dataset name was used, and allows specifying in which webknossos instance to search for the dataset. It defaults to the URL from your current webknossos_context, using https://webknossos.org as a fallback.
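The URL-versus-name distinction above can be sketched as follows; looks_like_dataset_url is a hypothetical helper (not part of the webknossos API), and the wrapper assumes the webknossos module is passed in:

```python
from urllib.parse import urlparse

def looks_like_dataset_url(dataset_name_or_url):
    # A value with a scheme and host is treated as a full dataset-view
    # URL, anything else as a bare dataset name.
    parsed = urlparse(dataset_name_or_url)
    return bool(parsed.scheme and parsed.netloc)

def open_remote_example(wk):
    # With a full URL, organization_id, webknossos_url and sharing_token
    # must not be set:
    url = "https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view"
    assert looks_like_dataset_url(url)
    return wk.Dataset.open_remote(url)
```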
Downloads a dataset and returns the Dataset instance.
dataset_name_or_url may be a dataset name or a full URL to a dataset view, e.g. https://webknossos.org/datasets/scalable_minds/l4_sample_dev/view. If a URL is used, organization_id, webknossos_url and sharing_token must not be set.
organization_id may be supplied if a dataset name was used in the previous argument; it defaults to your current organization from the webknossos_context. You can find your organization_id here.
sharing_token may be supplied if a dataset name was used and can specify a sharing token.
webknossos_url may be supplied if a dataset name was used, and allows specifying in which webknossos instance to search for the dataset. It defaults to the URL from your current webknossos_context, using https://webknossos.org as a fallback.
bbox, layers, and mags specify which parts of the dataset to download. If nothing is specified, the whole image, all layers, and all mags are downloaded respectively.
path and exist_ok specify where to save the downloaded dataset and whether to overwrite if the path exists.
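A hedged sketch of a partial download (the dataset name, organization, layer name, and target path are hypothetical examples; wk is the imported webknossos module):

```python
def download_subset(wk, path="testoutput/l4_sample"):
    # Sketch: download only one layer at one magnification instead of
    # the whole dataset.
    return wk.Dataset.download(
        "l4_sample_dev",                   # a dataset name, not a URL
        organization_id="scalable_minds",  # only allowed with a name
        layers=["color"],                  # restrict to a single layer
        mags=[wk.Mag(1)],                  # restrict to one magnification
        path=path,
        exist_ok=True,                     # overwrite if `path` exists
    )
```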
This method imports image data in a folder as a WEBKNOSSOS dataset. The image data can be 3D images (such as multipage tiffs) or stacks of 2D images. In case of multiple 3D images or image stacks, those are mapped to different layers. The exact mapping is handled by the argument map_filepath_to_layer_name, which can be a pre-defined strategy from the enum ConversionLayerMapping, or a custom callable taking a path of an image file and returning the corresponding layer name. All files belonging to the same layer name are then grouped. In case of multiple files per layer, those are usually mapped to the z-dimension. The order of the z-slices can be customized by setting z_slices_sort_key.
The category of layers (color vs segmentation) is determined automatically by checking if segmentation is part of the path. Alternatively, a category can be enforced by passing layer_category.
Further arguments behave as in add_layer_from_images; please also refer to its documentation.
For more fine-grained control, please create an empty dataset and use add_layer_from_images.
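A sketch of from_images, including a hypothetical numeric z_slices_sort_key (the paths and voxel size are example values):

```python
import re
from pathlib import Path

def numeric_z_sort_key(image_path):
    # Hypothetical z_slices_sort_key: order slices by the last number in
    # the file name, so "slice_2.tif" sorts before "slice_10.tif".
    numbers = re.findall(r"\d+", Path(image_path).stem)
    return int(numbers[-1]) if numbers else -1

def convert_image_folder(wk, input_path, output_path):
    # Sketch: each subfolder of input_path becomes one layer here.
    return wk.Dataset.from_images(
        input_path,
        output_path,
        voxel_size=(11.2, 11.2, 25.0),
        map_filepath_to_layer_name=wk.Dataset.ConversionLayerMapping.ENFORCE_LAYER_PER_FOLDER,
        z_slices_sort_key=numeric_z_sort_key,
    )
```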
Uploads this dataset to WEBKNOSSOS.
The new_dataset_name parameter allows assigning a specific name to the dataset.
layers_to_link allows adding (or overriding) a layer in the uploaded dataset so that it links to a layer of an existing dataset in WEBKNOSSOS. That way, already existing layers don't need to be uploaded again.
If supplied, the jobs parameter determines the number of simultaneous chunk uploads. Defaults to 5.
Returns the RemoteDataset upon successful upload.
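A minimal upload sketch (the dataset name is a hypothetical example):

```python
def upload_dataset(dataset):
    # Sketch: `dataset` is an opened webknossos Dataset instance.
    # `jobs` sets the number of simultaneous chunk uploads (5 is the
    # documented default).
    remote_ds = dataset.upload(new_dataset_name="my_upload", jobs=5)
    return remote_ds  # a RemoteDataset on success
```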
Returns the layer called layer_name of this dataset. The return type is webknossos.dataset.layer.Layer.
This function raises an IndexError if the specified layer_name does not exist.
Creates a new layer called layer_name and adds it to the dataset. The dtype can either be specified per layer or per channel. If neither is specified, uint8 per channel is used as default.
Creates the folder layer_name in the directory of self.path. WKW layers can only be added to datasets on local file systems.
The return type is webknossos.dataset.layer.Layer.
This function raises an IndexError if the specified layer_name already exists.
Creates a new layer called layer_name and adds it to the dataset, in case it did not exist before. Then, returns the layer. For more information see add_layer.
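The relationship between add_layer, get_layer, and get_or_add_layer can be sketched like this (the layer name and dtype are hypothetical examples):

```python
def add_or_get_color_layer(dataset):
    # Sketch: `dataset` is a webknossos Dataset instance.
    try:
        layer = dataset.add_layer(
            "color",                    # hypothetical layer name
            "color",                    # layer category
            dtype_per_channel="uint8",  # the default if nothing is given
        )
    except IndexError:
        # add_layer raises IndexError if the layer already exists;
        # get_or_add_layer wraps exactly this pattern.
        layer = dataset.get_layer("color")
    return layer
```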
Creates a new layer called layer_name with mag mag from images.
images can be one of the following:
- a glob string
- a list of paths
- a pims.FramesSequence instance
Please see the pims docs for more information.
This method needs extra packages such as pims. Please install the respective extras, e.g. using python -m pip install "webknossos[all]".
Further arguments:
- category: color by default, may be set to "segmentation"
- data_format: by default wkw files are written, may be set to "zarr"
- mag: magnification to use for the written data
- chunk_shape, chunks_per_shard, compress: adjust how the data is stored on disk
- topleft: set an offset in Mag(1) to start writing the data, only affecting the output
- swap_xy: set to True to interchange the x and y axes before writing to disk
- flip_x, flip_y, flip_z: set to True to reverse the respective axis before writing to disk
- dtype: the read image data will be converted to this dtype using numpy.ndarray.astype
- use_bioformats: set to True to only use the pims bioformats adapter directly (needs a JVM); set to False to forbid using the bioformats adapter; by default it is tried as a last option
- channel: may be used to select a single channel, if multiple are available
- timepoint: for timeseries, select a timepoint to use by specifying it as an int, starting from 0
- czi_channel: may be used to select a channel for .czi images, which differs from normal color channels
- batch_size: size to process the images, must be a multiple of the chunk-size z-axis for uncompressed and the shard-size z-axis for compressed layers; default is the chunk-size or shard-size respectively
- allow_multiple_layers: set to True if timepoints or channels may result in multiple layers being added (only the first is returned)
- max_layers: only applies if allow_multiple_layers=True, limits the number of layers added via different channels or timepoints
- truncate_rgba_to_rgb: only applies if allow_multiple_layers=True; set to False to write four channels into layers instead of an RGB channel
- executor: pass a ClusterExecutor instance to parallelize the conversion jobs across the batches
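For instance, the batch_size constraint above can be sketched with a small helper (valid_batch_size, the glob string, and the sizes are hypothetical, not part of the API):

```python
def valid_batch_size(desired, z_chunk):
    # Round `desired` down to a multiple of the chunk (or shard) size
    # along z, falling back to one chunk if `desired` is smaller.
    return max(z_chunk, (desired // z_chunk) * z_chunk)

def layer_from_tiffs(dataset, images_glob):
    # Sketch: `dataset` is a webknossos Dataset instance.
    return dataset.add_layer_from_images(
        images_glob,  # e.g. "stack/*.tif"
        layer_name="color",
        category="color",
        batch_size=valid_batch_size(100, 32),
    )
```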
Deprecated, please use get_segmentation_layers().
Returns the only segmentation layer. Fails with an IndexError if there are multiple segmentation layers or none.
Deprecated, please use get_color_layers().
Returns the only color layer. Fails with a RuntimeError if there are multiple color layers or none.
Deletes the layer from the datasource-properties.json and the data from disk.
Copies the data at foreign_layer, which belongs to another dataset, to the current dataset. Additionally, the relevant information from the datasource-properties.json of the other dataset is copied too. If new_layer_name is None, the name of the foreign layer is used.
Creates a symlink to the data at foreign_layer, which belongs to another dataset. The relevant information from the datasource-properties.json of the other dataset is copied to this dataset. Note: if the other dataset modifies its bounding box afterwards, the change does not affect these properties (or vice versa).
If make_relative is True, the symlink is made relative to the current dataset path. If new_layer_name is None, the name of the foreign layer is used.
Symlinked layers can only be added to datasets on local file systems.
Copies the files at foreign_layer, which belongs to another dataset, to the current dataset via the filesystem. Additionally, the relevant information from the datasource-properties.json of the other dataset is copied too. If new_layer_name is None, the name of the foreign layer is used.
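A sketch contrasting the three variants (the layer name and the helper are hypothetical examples):

```python
def import_foreign_layer(dataset, foreign_dataset, via="copy"):
    # Sketch: pull the "color" layer of another dataset into `dataset`.
    foreign_layer = foreign_dataset.get_layer("color")
    if via == "copy":
        return dataset.add_copy_layer(foreign_layer)
    if via == "symlink":  # local file systems only
        return dataset.add_symlink_layer(foreign_layer, make_relative=True)
    return dataset.add_fs_copy_layer(foreign_layer)
```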
Creates a new dataset at new_dataset_path and copies the data from the current dataset to empty_target_ds. If not specified otherwise, the voxel_size, chunk_shape, chunks_per_shard and compress of the current dataset are also used for the new dataset.
WKW layers can only be copied to datasets on local file systems.
Creates a new dataset at the given path and links all mags of all existing layers. In addition, all other directories in all layer directories are linked, to make this method robust against additional files, e.g. layer/mappings/agglomerate_view.hdf5. This method is useful when exposing a dataset to webknossos. Only datasets on local filesystems can be shallow-copied.
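As a minimal sketch (the target path is a hypothetical example):

```python
def expose_dataset(dataset, target_path):
    # Sketch: create a symlink-based shallow copy at `target_path`,
    # e.g. inside a directory served by a webknossos instance.
    # `dataset` is a webknossos Dataset on a local filesystem.
    return dataset.shallow_copy_dataset(target_path)
```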
Compresses all mag views in-place that are not yet compressed.
Downsamples all layers that are not yet downsampled.
Deprecated, please use the constructor Dataset() instead.
Deprecated, please use the constructor Dataset() instead.
Returns a dict of all remote datasets visible for the selected organization, or the organization of the logged-in user by default. The dict contains lazy-initialized RemoteDataset values for keys indicating the dataset name.
import webknossos as wk
print(sorted(wk.Dataset.get_remote_datasets()))
ds = wk.Dataset.get_remote_datasets(
    organization_id="scalable_minds"
)["l4dense_motta_et_al_demo"]
Strategies for mapping file paths to layers, for use in Dataset.from_images for the map_filepath_to_layer_name argument. If none of the strategies fit, the mapping can also be specified by a callable.
The first found image file is opened. If it appears to be a 2D image, ENFORCE_LAYER_PER_FOLDER is used; if it appears to be 3D, ENFORCE_LAYER_PER_FILE is used. This is the default mapping.
Like INSPECT_SINGLE_FILE, but the strategy is determined for each image file separately.
Enforce a new layer per file. This is useful for 2D images that should each be converted to a separate 2D layer.
Combines all found files into a single layer. This is only useful if all images are 2D.
Combine all files in a folder into one layer.
The top-level folders of the input path are each converted to one layer. This might be useful if multiple layers have stacks of 2D images, but parts of the stacks are in different folders.
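If none of the strategies fit, a custom callable can be passed instead. A minimal sketch (the fallback layer name for top-level files is a hypothetical choice):

```python
from pathlib import Path

def parent_folder_as_layer_name(image_path):
    # Custom map_filepath_to_layer_name callable: use the name of the
    # file's parent folder as the layer name, falling back to "color"
    # for files at the top level (resembles ENFORCE_LAYER_PER_FOLDER).
    return Path(image_path).parent.name or "color"
```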
Inherited Members
- enum.Enum
- name
- value
Representation of a dataset on the webknossos server, returned from Dataset.open_remote(). Read-only image data is streamed from the webknossos server using the same interface as Dataset. Additionally, metadata can be set via the additional properties below.
Do not call manually, please use Dataset.open_remote() instead.
Do not call manually, please use Dataset.open_remote() instead.
Assign the teams that are allowed to access the dataset. Specify the teams like this: [Team.get_by_name("Lab_A"), ...].
Move the dataset to a folder. Specify the folder like this: RemoteFolder.get_by_path("Datasets/Folder_A").
Inherited Members
- Dataset
- ConversionLayerMapping
- path
- open_remote
- download
- from_images
- layers
- voxel_size
- scale
- name
- default_view_configuration
- read_only
- upload
- get_layer
- add_layer
- get_or_add_layer
- add_layer_like
- add_layer_for_existing_files
- add_layer_from_images
- get_segmentation_layer
- get_segmentation_layers
- get_color_layer
- get_color_layers
- delete_layer
- add_copy_layer
- add_symlink_layer
- add_fs_copy_layer
- copy_dataset
- shallow_copy_dataset
- compress
- downsample
- create
- get_or_create
- get_remote_datasets