
Managing Datasets

Working with 3D (and 2D) image datasets is at the heart of webKnossos.

Read the section on file and data formats if you are interested in the technical background and concepts behind webKnossos datasets.

Importing Datasets

Uploading through the web browser

The easiest way to get started with working on your datasets is through the webKnossos web interface. You can directly upload your dataset through the browser.

  1. From the My Datasets tab in the user dashboard, click the Add Dataset button.
  2. Provide some metadata information:
     • a name
     • access permissions for one or more teams (use the default team if all members of your organization should be able to see it)
     • the scale of each voxel (in nanometers)
  3. Drag and drop your data into the upload section.
  4. Click the Upload button.

webKnossos uses the WKW format internally to display your data. If your data is already in WKW, you can simply drag your folder (or a zip archive of that folder) into the upload view.

If your data is not in WKW, you can either:

  • Upload the data in a supported file format and let webKnossos convert it to WKW automatically (webknossos.org only). Depending on the size of the dataset, the conversion will take some time. You can check the progress on the "Jobs" page or the "My Datasets" page in the dashboard (both update automatically).
  • Convert your data to WKW manually.

In particular, the following file formats are supported for uploading (and conversion):

Once the data is uploaded (and potentially converted), you can further configure the dataset in its Settings: double-check layer properties, fine-tune access rights and permissions, or set default values for rendering.

Working with Neuroglancer and BossDB datasets

On webKnossos.org you can work directly with:

  • datasets in the Neuroglancer precomputed format stored in the Google Cloud
  • datasets provided by a BossDB server

To import these datasets:

  1. From the My Datasets tab in the user dashboard, click the Add Dataset button.
  2. Select the Add Neuroglancer Dataset or Add BossDB Dataset tab.
  3. Provide some metadata information:
     • a name
     • a URL or domain/collection identifier to locate the dataset on the remote service
     • authentication credentials for accessing the resources on the remote service
  4. Click the Add button.

webKnossos will NOT download or copy any data from these third-party data providers. Instead, any data viewed in webKnossos is streamed read-only, directly from the remote source. Everything else you create in webKnossos, e.g., annotations and access rights, is stored in webKnossos and does not affect these services.

Note that streaming may count against any usage limits or minutes defined by these third-party services. Check with the service provider or dataset owner.

Working with Zarr datasets

We are working on integrating full Zarr support into webKnossos. If you have datasets in the Zarr format and would like to work with us on building, testing, and refining the Zarr integration into webKnossos then please contact us.

Uploading through the Python API

For those wishing to automate dataset upload or to do it programmatically, check out the webKnossos Python library. It allows you to create, manage and upload datasets as well.

Uploading through the File System

-- Self-Hosted Instances Only --

On self-hosted instances, large datasets can be efficiently imported by placing them directly in the file system:

  • Place the dataset at <webKnossos directory>/binaryData/<Organization name>/<Dataset name>, for example, /opt/webknossos/binaryData/Springfield_University/great_dataset.
  • Go to the dataset view on the dashboard.
  • Use the refresh button on the dashboard or wait for webKnossos to detect the dataset (this can take up to 10 minutes).
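The directory layout from the first step can be sketched in a few lines of Python. This is only an illustration of the path convention described above: `dataset_directory` is a hypothetical helper, not part of webKnossos, and the check for "/" in names is an assumption made for this sketch.

```python
from pathlib import PurePosixPath


def dataset_directory(webknossos_dir: str, organization: str, dataset: str) -> PurePosixPath:
    """Build <webKnossos directory>/binaryData/<Organization name>/<Dataset name>.

    Hypothetical helper for illustration only.
    """
    for part in (organization, dataset):
        # Assumption for this sketch: each name must be a single path component.
        if not part or "/" in part:
            raise ValueError(f"invalid path component: {part!r}")
    return PurePosixPath(webknossos_dir) / "binaryData" / organization / dataset


print(dataset_directory("/opt/webknossos", "Springfield_University", "great_dataset"))
# prints /opt/webknossos/binaryData/Springfield_University/great_dataset
```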

Typically, webKnossos can infer all the required metadata for a dataset and import it automatically on refresh. In some cases, you will need to manually import a dataset and provide more information:

  • On the dashboard, click Import for your new dataset.
  • Provide the requested properties, such as scale and largestSegmentId. See the section on configuring datasets below for more detailed explanations of these parameters.

Info

If you uploaded the dataset along with a datasource-properties.json metadata file, the dataset will be imported automatically without any additional manual steps.

-- Self-Hosted Instances Only --

When you have direct file system access, you can also use symbolic links to import your data into webKnossos. This might be useful when you want to create new datasets based on potentially very large raw microscopy data and symlink it to one or several segmentation layers.

Note, when using Docker, the targets of the link also need to be available to the container through mounts.

For example, you could have a link from /opt/webknossos/binaryData/sample_organization/awesome_dataset to /cluster/path/to/dataset123. To make this dataset available to the Docker container, you need to add /cluster as another volume mount. You can add this directly to the docker-compose.yml:

...
services:
  webknossos:
    ...
    volumes:
      - ./data:/webknossos/binaryData
      - /cluster:/cluster
...
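The symbolic link from the example above could be created with a short script. This is a minimal sketch, not part of webKnossos: `link_dataset` is a hypothetical helper, and the demo runs inside a temporary directory that stands in for the real /opt/webknossos/binaryData and /cluster paths.

```python
import os
import tempfile
from pathlib import Path


def link_dataset(source: Path, binary_data_dir: Path, organization: str, dataset: str) -> Path:
    """Create <binary_data_dir>/<organization>/<dataset> as a symlink to source.

    Hypothetical helper for illustration only.
    """
    link = binary_data_dir / organization / dataset
    link.parent.mkdir(parents=True, exist_ok=True)
    os.symlink(source, link, target_is_directory=True)
    return link


# Demo in a throwaway directory; on a real instance, source would be
# something like /cluster/path/to/dataset123.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    source = root / "cluster" / "path" / "to" / "dataset123"
    source.mkdir(parents=True)
    link = link_dataset(source, root / "binaryData", "sample_organization", "awesome_dataset")
    print(link.is_symlink(), link.resolve() == source.resolve())
```

Inside Docker, the link only resolves if its target path is also mounted into the container, which is what the extra /cluster volume is for.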

Converting Datasets

Any dataset uploaded through the web interface at webknossos.org is automatically converted for compatibility.

For manual conversion, we provide the following software tools and libraries:

See page on software tooling for more.

Configuring Datasets

You can configure the metadata, permission, and other properties of a dataset at any time.

Note that any changes made to a dataset may affect all users in your organization working with that dataset, e.g., removing access rights, adding or removing layers, or setting default values for rendering the data.

To make changes, click on the "Settings" action next to a dataset in the "My Datasets" tab of your dashboard. Editing these settings requires your account to have enough access rights and permissions. Read more about this.

Data Tab

The Data tab contains the settings that determine how the dataset is read (e.g., its data type, such as uint8) and how its layers are set up and configured.

  • Scale: The physical size of a voxel in nanometers, e.g., 11, 11, 24
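To illustrate what the scale means in practice, the sketch below converts a voxel position into physical coordinates by multiplying each axis with its voxel size. `voxel_to_nanometers` is a hypothetical helper written for this example, not a webKnossos function.

```python
from typing import Sequence, Tuple


def voxel_to_nanometers(position: Sequence[int], scale: Sequence[float]) -> Tuple[float, ...]:
    """Multiply a voxel position (x, y, z) by the per-axis voxel size in nanometers."""
    if len(position) != 3 or len(scale) != 3:
        raise ValueError("position and scale must both have three components")
    return tuple(p * s for p, s in zip(position, scale))


# With the example scale of 11, 11, 24 nm per voxel:
print(voxel_to_nanometers((100, 200, 10), (11, 11, 24)))
# prints (1100, 2200, 240)
```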

For each detected layer:

  • Bounding Box: The position and extents of the dataset layer in voxel coordinates. The format is x, y, z, x_size, y_size, z_size or, equivalently, min_x, min_y, min_z, (max_x - min_x), (max_y - min_y), (max_z - min_z).
  • Largest Segment ID: The highest ID that is currently used in the respective segmentation layer. This is required for volume annotations where new objects with incrementing IDs are created. Only applies to segmentation layers.
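If you know a layer's minimum and maximum corners, the bounding box string in the format above can be derived as follows. This is a small sketch; `bounding_box_string` is a hypothetical helper, and rejecting empty boxes is an assumption made here.

```python
from typing import Sequence


def bounding_box_string(min_corner: Sequence[int], max_corner: Sequence[int]) -> str:
    """Format a box as 'x, y, z, x_size, y_size, z_size' from its min and max corners."""
    if len(min_corner) != 3 or len(max_corner) != 3:
        raise ValueError("corners must have three components")
    sizes = [hi - lo for lo, hi in zip(min_corner, max_corner)]
    # Assumption for this sketch: a box must have a positive extent on every axis.
    if any(size <= 0 for size in sizes):
        raise ValueError("max corner must be larger than min corner on every axis")
    return ", ".join(str(value) for value in (*min_corner, *sizes))


print(bounding_box_string((128, 256, 0), (1152, 1280, 512)))
# prints 128, 256, 0, 1024, 1024, 512
```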

The Advanced view lets you edit the underlying JSON configuration directly. Toggle between the Advanced and Simple views in the upper right. Advanced mode is only recommended for low-level access to dataset properties and for users familiar with the datasource-properties.json format.

webKnossos periodically checks for changes to a dataset's metadata (datasource-properties.json) on disk and suggests them as updates (only relevant for self-hosted instances). Before applying these suggestions, users can preview all the new settings (as JSON) and inspect just the detected difference (as JSON).

Dataset Editing: Data Tab

Sharing & Permissions Tab

  • Make dataset publicly accessible: By default, a dataset can only be accessed by users from your organization with the correct access permissions. Making a dataset public allows anyone with a link to the dataset to view it without a webKnossos account. Anyone can start using this dataset to create annotations. Enable this setting if you want to share a dataset in a publication, on social media, or on any other public website.
  • Teams allowed to access this dataset: Defines which teams of your organization have permission to work with this dataset. By default, no team has access, but users with admin and team manager roles can see and edit the dataset.
  • Sharing Link: A web URL pointing to this dataset for easy sharing that allows any user to view your dataset. The URL contains an access token to allow people to view the dataset without a webKnossos account. The access token is random, and therefore the URL cannot be guessed by visitors. You may also revoke the access token to create a new one. Anyone with a URL containing a revoked token will no longer have access to this dataset. Read more in the Sharing guide.
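The idea behind the random access token can be sketched as follows. This is a conceptual illustration only: the `SharingLink` class, the `?token=` URL format, and the example.org address are all hypothetical and do not describe how webKnossos implements sharing links.

```python
import secrets


class SharingLink:
    """Toy model of a sharing URL that embeds a random, revocable access token."""

    def __init__(self, base_url: str) -> None:
        self.base_url = base_url
        # A cryptographically random token, so the URL cannot be guessed.
        self.token = secrets.token_urlsafe(16)

    def url(self) -> str:
        # Hypothetical URL layout chosen for this sketch.
        return f"{self.base_url}?token={self.token}"

    def revoke(self) -> None:
        # Replacing the token invalidates every previously shared URL.
        self.token = secrets.token_urlsafe(16)

    def grants_access(self, url: str) -> bool:
        return secrets.compare_digest(url.encode(), self.url().encode())


link = SharingLink("https://example.org/datasets/demo/view")
old_url = link.url()
link.revoke()
print(link.grants_access(old_url), link.grants_access(link.url()))
```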

Dataset Editing: Sharing Tab

Metadata Tab

  • Display Name: A meaningful name for a dataset other than its (automatically assigned) technical name, which is usually limited by the naming rules of file systems. It is displayed in various parts of webKnossos. The display name may contain special characters and can be changed without invalidating already created sharing URLs. It is also useful when sharing datasets with outsiders, either to hide internal naming schemes or to make the name more approachable, e.g., L. Simpson et al.: Full Neuron Segmentation instead of neuron_seg_v4_2022.
  • Description: A free-text field for providing more information about your datasets, e.g., authors, paper reference, descriptions, etc. Supports Markdown formatting. The description will be featured in the webKnossos UI when opening a dataset in view mode.

Dataset Editing: Metadata Tab

View Configuration Tab

The View configuration tab lets you set defaults for viewing this dataset. Anytime a user opens a dataset or creates a new annotation based on this dataset, these default values will be applied.

Defaults include:

  • Position: Default position of the dataset in voxel coordinates. When opening the dataset, users will be located at this position.
  • Zoom: Default zoom.
  • Interpolation: Whether interpolation should be enabled by default.
  • Layer Configuration: Advanced feature to control the default settings on a per-layer basis. It needs to be configured in JSON format. E.g., layer visibility & opacity, color, contrast/brightness/intensity range ("histogram sliders"), and many more.

Dataset Editing: View Configuration Tab

Of course, all of these defaults can be overridden once a user opens the dataset in the main webKnossos interface and changes any of these settings in their viewports.

For self-hosted webKnossos instances, there are two ways to set default View Configuration settings:

  • in the web UI as described above
  • inside the datasource-properties.json on disk

The View Configuration from the web UI takes precedence over the datasource-properties.json. You don't have to set complete View Configurations in either option, as webKnossos will fill missing attributes with sensible defaults.
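The precedence rule can be pictured as a layered merge: built-in defaults first, then the on-disk configuration, then the web UI settings on top. The sketch below is an assumption-laden illustration, not the webKnossos implementation; the key names (`position`, `zoom`, `interpolation`), the default values, and the flat (non-nested) merge are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical fallback values used only for this sketch.
BUILT_IN_DEFAULTS: Dict[str, Any] = {
    "position": [0, 0, 0],
    "zoom": 1.0,
    "interpolation": False,
}


def effective_view_configuration(on_disk: Dict[str, Any], web_ui: Dict[str, Any]) -> Dict[str, Any]:
    """Merge partial configurations; later sources override earlier ones."""
    merged = dict(BUILT_IN_DEFAULTS)  # 1. fill in defaults for missing attributes
    merged.update(on_disk)            # 2. values from datasource-properties.json
    merged.update(web_ui)             # 3. web UI settings take precedence
    return merged


print(effective_view_configuration(
    on_disk={"zoom": 2.0, "interpolation": True},
    web_ui={"zoom": 0.5},
))
# prints {'position': [0, 0, 0], 'zoom': 0.5, 'interpolation': True}
```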

Delete Tab

Offers an option to delete a dataset and completely remove it from webKnossos. Careful, this cannot be undone!

Dataset Editing: Delete Tab

Dataset Sharing

Read more in the Sharing guide

Using External Datastores

The system architecture of webKnossos allows for versatile deployment options where you can install a dedicated datastore server directly on your lab's cluster infrastructure. This may be useful when dealing with large datasets that should remain in your data center. Please contact us or write a post if you require any assistance with your setup.

scalable minds also offers a dataset alignment tool called Voxelytics Align. Learn more.

Sample Datasets

For convenience and testing, we provide a list of sample datasets for webKnossos:
