Digital Slide Archive (DSA)

Digital Slide Archive (DSA) is an open-source resource for organizing large whole slide images (WSIs). It also provides an interface (HistomicsUI) for annotating images and running computational analyses, and a RESTful API that gives programmatic access to the data stored on a given DSA instance through standard HTTP requests (GET, POST, PUT, etc.).
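
As a rough illustration of programmatic access, the sketch below builds (but does not send) a GET request for a single item. It assumes the instance exposes the usual Girder-style /item/{id} route and reads the session token from a Girder-Token header; build_item_request, the base URL, the item ID, and the token are all placeholders rather than part of fusion-tools.

```python
from __future__ import annotations

from urllib.request import Request


def build_item_request(api_url: str, item_id: str, token: str | None = None) -> Request:
    """Build a GET request for one item (hypothetical helper, nothing is sent)."""
    url = f"{api_url.rstrip('/')}/item/{item_id}"
    request = Request(url, method="GET")
    if token is not None:
        # Assumption: the server reads the session token from this header.
        request.add_header("Girder-Token", token)
    return request


# Placeholder values; substitute a real instance URL, item ID, and token.
example_request = build_item_request("http://localhost:8080/api/v1/", "ITEM_ID", "TOKEN")
print(example_request.get_method(), example_request.full_url)
```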

fusion-tools provides several components that integrate with a running DSA instance, offering alternative interfaces for visualizing data stored within image annotations alongside histology images. It also provides a format for defining upload templates (UploadTypes) that allow administrators to pre-specify the files, metadata, and processing steps used for a specific type of data. fusion-tools does not implement every process available in DSA (for example, copying/moving items, modifying user details, and several others). It may still be a valuable resource for developers who use DSA and want to design custom visualization and interaction pages (in Python) to share with collaborators, or to connect plugins with specific sets of inputs to user interactions.

fusion_tools.handler module

Classes related to the DSAHandler and Tool components

class fusion_tools.handler.dsa_handler.DSAHandler(girderApiUrl: str, username: str | None = None, password: str | None = None)

Bases: Handler

Handler for DSA (digital slide archive) instance

add_metadata(item: str, metadata: dict, user_token: str | None = None)

Add metadata key/value to a specific item

Parameters:
  • item (str) – ID for item that is receiving the metadata

  • metadata (dict) – Metadata key/value pairs in JSON format (may contain multiple keys and values)
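
Since the metadata is described as JSON formatted, it can help to check serializability before calling add_metadata. The following is a minimal sketch under that assumption; attach_metadata and _RecordingHandler are hypothetical names, and the stand-in handler only records calls so the example runs without a DSA instance.

```python
import json


def attach_metadata(handler, item_id, metadata, user_token=None):
    """Check that metadata is JSON-serializable, then pass it to add_metadata."""
    json.dumps(metadata)  # raises TypeError for values JSON cannot represent
    return handler.add_metadata(item_id, metadata, user_token)


class _RecordingHandler:
    """Stand-in for DSAHandler that records calls instead of contacting DSA."""

    def __init__(self):
        self.calls = []

    def add_metadata(self, item, metadata, user_token=None):
        self.calls.append((item, metadata, user_token))


demo_handler = _RecordingHandler()
attach_metadata(demo_handler, "ITEM_ID", {"stain": "H&E", "magnification": 40})
```

With a real handler, the same helper would be called with a DSAHandler instance in place of the stand-in.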

add_plugin(image_name: str | list, user_token: str | None = None)

Add a plugin/CLI to the current DSA instance by name of the Docker image (requires admin login)

Parameters:

image_name (Union[str,list]) – Name, or list of names, of Docker image(s) on Docker Hub

authenticate_new(username: str, password: str)
cancel_job(job_id: str, user_token: str)
create_dataset_builder(include: list | None = None)

Table view allowing parsing of dataset/slide-level metadata and adding remote/local slides to current session.

Parameters:

include (Union[list,None], optional) – List of collections to only include (None = include everything accessible to this user), defaults to None

Returns:

DatasetBuilder instance

Return type:

DatasetBuilder

create_login_component()

Creates login button for multiple DSA users to use the same fusion-tools instance

create_metadata_table(metadata_args: dict)

Create table of metadata keys/values for a folder or collection

Parameters:

metadata_args (dict) – Additional arguments specifying location of folder and any keys to ignore.

create_new_user(username, password, email, firstName, lastName)

Create a new user on this DSA instance

Parameters:
  • username (str) – Username (publicly visible)

  • password (str) – Password (not publicly visible)

  • email (str) – Email (must not overlap with other users)

  • firstName (str) – First Name to use for user

  • lastName (str) – Last Name to use for user

create_plugin_progress()

Creates component that monitors current and past job logs.

create_survey(survey_args: dict)

Create a survey component which will route collected data to a specific file in the connected DSA instance.

Parameters:

survey_args (dict) – Setup arguments for survey questions (keys = questions list, usernames list, storage folder)

create_uploader(upload_types: list)

Create the layout for an uploader component that uploads to a specific folder. The available upload procedures are defined by “upload_types”.

Parameters:

upload_types (list) – List of DSAUploadType objects describing the formatted, linked upload procedures to offer

create_user_folder(parent_path, folder_name, user_token, description='')

Creating a folder in user’s public folder

get_annotation_names(item: str, user_token: str | None = None)
get_annotations(item: str, annotation_id: str | list | None = None, format: str | None = 'geojson', user_token: str | None = None)

Get annotations for an item in DSA

Parameters:
  • item (str) – Girder item Id for desired image

  • annotation_id (Union[str,list,None], optional) – If only a subset of annotations is desired, pass their ids here, defaults to None

  • format (Union[str,None], optional) – Desired format of annotations, defaults to ‘geojson’

Raises:
  • NotImplementedError – Invalid format passed

Returns:

Annotations for the queried item Id

Return type:

list
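
The returned list can be summarized with a few lines of plain Python. The sketch below assumes that, with format=‘geojson’, each element is a GeoJSON FeatureCollection dictionary with a "features" list; that structure and the helper name count_features are assumptions, so inspect an actual response before relying on it.

```python
def count_features(annotations):
    """Return the number of GeoJSON features in each annotation layer."""
    counts = []
    for layer in annotations:
        # Assumption: each layer is a GeoJSON FeatureCollection dict.
        counts.append(len(layer.get("features", [])))
    return counts


# Hand-written sample data standing in for a get_annotations() result.
sample_annotations = [
    {
        "type": "FeatureCollection",
        "features": [
            {
                "type": "Feature",
                "geometry": {"type": "Point", "coordinates": [10, 20]},
                "properties": {},
            }
        ],
    },
    {"type": "FeatureCollection", "features": []},
]
print(count_features(sample_annotations))
```

With a live handler, the input would come from handler.get_annotations(item_id, format='geojson') instead of the sample list.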

get_collection_slide_count(collection_name, ignore_histoqc=True, user_token: str | None = None) int

Get a count of all of the slides in a given collection across all child folders

Parameters:
  • collection_name (str) – Name of collection (‘/collection/{}’)

  • ignore_histoqc (bool, optional) – Whether to ignore folders containing histoqc outputs (not slides), defaults to True

Returns:

Total count of slides (large-image objects) in a given collection

Return type:

int

get_collections(user_token: str | None = None) list

Get list of all available collections in DSA instance.

Returns:

List of available collections info.

Return type:

list

get_file_info(fileId: str, user_token: str | None = None) dict

Getting information for a given file (specifically what item it’s attached to).

Parameters:
  • fileId (str) – Girder Id of a file

  • user_token (Union[str,None], optional) – User session token, defaults to None

Returns:

Information on file

Return type:

dict

get_folder_folders(folder_id: str, folder_type: str = 'folder', user_token: str | None = None)

Get the folders within a folder

Parameters:
  • folder_id (str) – Girder Id for a folder

  • folder_type (str, optional) – Either “folder” or “collection”, defaults to ‘folder’

get_folder_info(folder_id: str, user_token: str | None = None) dict

Getting folder info from ID

Parameters:

folder_id (str) – ID assigned to that folder

Returns:

Dictionary with details like name, parentType, meta, updated, size, etc.

Return type:

dict

get_folder_rootpath(folder_id: str, user_token: str | None = None) list

Get the rootpath for a given folder Id.

Parameters:

folder_id (str) – Girder Id for a folder

Returns:

List of objects in that folder’s path that are parents

Return type:

list

get_folder_slides(folder_path: str, folder_type: str = 'folder', ignore_histoqc: bool = True, user_token: str | None = None) list

Get all slides in a folder

Parameters:
  • folder_path (str) – Path in DSA for a folder

  • folder_type (str, optional) – Whether it’s a folder or a collection, defaults to ‘folder’

  • ignore_histoqc (bool, optional) – Whether or not to ignore images in the histoqc_outputs folder, defaults to True

Returns:

List of image items contained within a folder

Return type:

list

get_image_region(item_id: str, coords_list: list, style: dict | None = None, user_token: str | None = None) ndarray

Grabbing image region from list of bounding box coordinates

get_image_thumbnail(item_id: str, user_token: str | None = None) ndarray
get_item_info(itemId: str, user_token: str | None = None)
get_path_info(path: str, user_token: str | None = None) dict

Get information for a given resource path

Parameters:

path (str) – Path in DSA instance for a given resource

Returns:

Dictionary containing id and metadata, etc.

Return type:

dict

get_specific_job(job_id: str, user_token: str)
get_tile_server(item: str) DSATileServer

Create a tileserver for a given item

Parameters:

item (str) – Girder Item Id for the slide you want to create a tileserver for

Returns:

DSATileServer instance

Return type:

DSATileServer

get_user_jobs(user_id: str, user_token: str, offset: int = 0, limit: int = 0)
list_plugins(user_token: str)

List all of the plugins/CLIs available for the current DSA instance

make_boundary_mask(exterior_coords: list) ndarray

Making boundary mask for a set of exterior coordinates

Parameters:

exterior_coords (list) – List of exterior vertex coordinates

Returns:

Binary mask of external boundaries of object

Return type:

np.ndarray
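
To make the idea of a boundary mask concrete, here is a pure-Python sketch that rasterizes a polygon with the even-odd rule, sampling at pixel centers. It is not the library's implementation: the function name boundary_mask is hypothetical, it returns nested lists rather than an np.ndarray, and it assumes the coordinates are [x, y] pairs.

```python
def boundary_mask(exterior_coords):
    """Rasterize a polygon into a 0/1 grid using the even-odd rule."""
    xs = [x for x, _ in exterior_coords]
    ys = [y for _, y in exterior_coords]
    min_x, min_y = min(xs), min(ys)
    width = int(max(xs) - min_x)
    height = int(max(ys) - min_y)
    n = len(exterior_coords)
    mask = []
    for row in range(height):
        py = min_y + row + 0.5  # sample at the pixel center
        mask_row = []
        for col in range(width):
            px = min_x + col + 0.5
            inside = False
            for i in range(n):
                x1, y1 = exterior_coords[i]
                x2, y2 = exterior_coords[(i + 1) % n]
                # Only edges that straddle the horizontal line through py count.
                if (y1 > py) != (y2 > py):
                    x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                    if px < x_cross:
                        inside = not inside
            mask_row.append(1 if inside else 0)
        mask.append(mask_row)
    return mask


square = [[0, 0], [4, 0], [4, 3], [0, 3]]
for mask_row in boundary_mask(square):
    print(mask_row)
```

Pixel centers that fall exactly on an edge are treated as outside by this rule, which is one of several reasonable conventions.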

post_annotations(item: str, annotations: str | list | dict | None = None, user_token: str | None = None)

Add annotations to an item in Girder.

Parameters:
  • item (str) – ID for the item that is receiving the annotations

  • annotations (Union[str,list,dict,None], optional) – Formatted dictionary, path, or list of dictionaries/paths with the annotations, defaults to None

query_annotation_count(item: str | list, user_token: str | None = None) DataFrame

Get count of structures in an item

Parameters:

item (Union[str,list]) – Girder item Id for image of interest

Returns:

Dataframe containing name and count of annotated structures

Return type:

pd.DataFrame

run_plugin(plugin_id: str, arguments: dict, user_token: str)

Run a plugin given a set of input arguments

Parameters:
  • plugin_id (str) – ID for plugin to run.

  • arguments (dict) – Dictionary containing keys/values for each input argument to a plugin

Interactive DSA components

DSALoginComponent

class fusion_tools.handler.login.DSALoginComponent(handler, default_user: dict | None = None)

Bases: DSATool

Handler for DSALoginComponent, enabling login to the running DSA instance

Parameters:

DSATool (None) – Sub-class of Tool specific to DSA components. Updates with session data by default.

display_login_fields(login_clicked, create_account_clicked, back_clicked)

Displaying login fields depending on if login, create account, or back is clicked

Parameters:
  • login_clicked (list) – Login button clicked

  • create_account_clicked (list) – Create Account button clicked

  • back_clicked (list) – Back icon clicked

gen_layout(session_data: dict)

Creating the layout for this component, assigning it to the DashBlueprint object

Parameters:

session_data (dict) – Dictionary containing relevant information for the current session

get_callbacks()
load(component_prefix: int)
submit_create_account(clicked, firstname_input, lastname_input, email_input, username_input, password_input, session_data)
submit_login(login_clicked, username_input, password_input, session_data)
update_layout(session_data: dict, use_prefix: bool)

Components and classes related to user surveys hosted in DSA

class fusion_tools.handler.survey.DSASurvey(dsa_handler, survey: SurveyType)

Bases: DSATool

Handler for DSASurvey component, letting users add a survey questionnaire to a layout (with optional login for targeting specific users).

Parameters:

Tool (None) – General class for components that perform visualization and analysis of data.

gen_layout(session_data: dict | None)
get_callbacks()
load(component_prefix: int)
class fusion_tools.handler.survey.SurveyType(question_list: list = [], users: list = [], storage_folder: str = '')

Bases: object

Components related to DSA Plugins

class fusion_tools.handler.plugin.DSAPluginProgress(handler)

Bases: DSATool

Handler for DSAPluginProgress component, letting users check the progress of currently running or previously run plugins and cancel running plugins.

Parameters:

Tool (None) – General class for components that perform visualization and analysis of data.

gen_layout(session_data: dict | None)
generate_plugin_table(session_data: dict, offset: int, limit: int, next_clicks: int, prev_clicks: int, use_prefix: bool)
get_callbacks()
get_logs_or_cancel(logs_clicked, cancel_clicked, table_rows, session_data)
load(component_prefix: int)
load_new_jobs(next_clicked, prev_clicked, session_data)
update_layout(session_data: dict, use_prefix: bool)
class fusion_tools.handler.plugin.DSAPluginRunner(handler: None)

Bases: DSATool

Handler for DSAPluginRunner component, letting users specify input arguments to plugins to run on connected DSA instance.

Parameters:

Tool (None) – General class for components that perform visualization and analysis of data.

find_executable_input(executable_dict, input_name) dict
gen_layout(session_data: dict)
get_callbacks()
get_executable_dict(plugin_info, session_data)
load(component_prefix: int)
load_plugin(plugin_dict, session_data, uploaded_files_data, component_index)
make_file_component(select_type: str, value: str | dict, component_index: int, disabled: bool)
make_input_component(input_dict, input_index)
parse_executable(exe_xml) dict
populate_plugin_inputs(cli_select, docker_select, session_data)
run_plugin_request(plugin_id, session_data, input_params_dict)
submit_plugin(clicked, docker_select, cli_select, plugin_inputs, session_data)
update_cli_options(docker_select, session_data)
update_layout(session_data: dict, use_prefix: bool)

DatasetBuilder Component

class fusion_tools.handler.dataset_builder.DatasetBuilder(handler, include_only: list | None = None)

Bases: DSATool

Handler for DatasetBuilder component, enabling selection/deselection of folders and slides to add to current visualization session.

Parameters:

DSATool (None) – Sub-class of Tool specific to DSA components. Updates with session data by default.

collection_selection(collection_rows, builder_data, collection_div_children)

Callback for when one/multiple collections are selected from the collections table

Parameters:
  • collection_rows (list) – Row indices of selected collections

  • builder_data (list) – Data store on available collections and currently included slides

  • collection_div_children (list) – Child cards created by collection_selection

Returns:

Children of collection-contents-div (items/folders within selected collections)

Return type:

list

extract_path_parts(current_parts: list | dict, search_key: list = ['props', 'children']) tuple

Recursively extract pieces of folder paths stored as clickable components.

Parameters:
  • current_parts (Union[list,dict]) – list or dictionary containing html.A or dbc.Stack of html.A components.

  • search_key (list, optional) – Property keys to search for in nested dicts, defaults to [‘props’,’children’]

Returns:

Tuple containing all the parts of the folder path

Return type:

tuple
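
The recursion described above can be sketched as follows. Dash passes components to callbacks as nested dictionaries with "props" and "children" keys, so the sketch walks those keys and collects string leaves. It is a simplified illustration rather than the library's code; extract_text_parts and the sample breadcrumb data are hypothetical.

```python
def extract_text_parts(current_parts, search_key=("props", "children")):
    """Collect string leaves found by following search_key through nested parts."""
    if isinstance(current_parts, str):
        return (current_parts,)
    if isinstance(current_parts, list):
        parts = ()
        for part in current_parts:
            parts += extract_text_parts(part, search_key)
        return parts
    if isinstance(current_parts, dict):
        parts = ()
        for key in search_key:
            if key in current_parts:
                parts += extract_text_parts(current_parts[key], search_key)
        return parts
    return ()  # None, numbers, and other leaves carry no path text


# Hand-written stand-in for a serialized stack of clickable path links.
crumbs = {
    "props": {
        "children": [
            {"type": "A", "props": {"children": "Collections/", "id": {"index": 0}}},
            {"type": "A", "props": {"children": "demo/", "id": {"index": 1}}},
        ]
    }
}
print(extract_text_parts(crumbs))
```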

gen_collections_dataframe(session_data: dict)
gen_layout(session_data: dict | None)

Generating DatasetBuilder layout, adding to DashBlueprint() object to be embedded in larger layout.

Parameters:

session_data (Union[dict,None]) – Data on current session, not used in this component.

get_callbacks()
get_clicked_part(current_parts: list | dict) list

Get the “n_clicks” value for components which have “id”. If they have “id” but not “n_clicks”, assign 0

Parameters:

current_parts (Union[list,dict]) – Either a list or dictionary containing components

Returns:

List of values corresponding to “n_clicks”

Return type:

list
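
A simplified version of this lookup, again operating on serialized components, might look like the sketch below. It is an illustration under the assumption that "id" and "n_clicks" live under each component's "props"; collect_n_clicks and the sample data are hypothetical, not the library's implementation.

```python
def collect_n_clicks(current_parts):
    """Gather n_clicks for every component with an id, defaulting to 0."""
    if isinstance(current_parts, list):
        clicks = []
        for part in current_parts:
            clicks.extend(collect_n_clicks(part))
        return clicks
    if isinstance(current_parts, dict):
        props = current_parts.get("props", {})
        clicks = []
        if "id" in props:
            # Components that were never clicked may lack n_clicks entirely.
            clicks.append(props.get("n_clicks", 0))
        clicks.extend(collect_n_clicks(props.get("children")))
        return clicks
    return []  # strings and None have no click state


path_links = [
    {"props": {"id": {"index": 0}, "n_clicks": 2, "children": "Collections/"}},
    {"props": {"id": {"index": 1}, "children": "demo/"}},
    {"props": {"children": "plain text, no id"}},
]
print(collect_n_clicks(path_links))
```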

get_component_indices(components: list | dict) list
load(component_prefix: int)
make_selectable_dash_table(dataframe: DataFrame, id: dict, multi_row: bool = True, selected_rows: list = [])

Generate a selectable DataTable to add to the layout

Parameters:
  • dataframe (pd.DataFrame) – Pandas DataFrame containing columns/rows of interest

  • id (dict) – Dictionary containing “type” and “index” keys for interactivity

  • multi_row (bool, optional) – Whether to allow selection of multiple rows in the table or just single, defaults to True

Returns:

dash_table.DataTable component to be added to layout

Return type:

dash_table.DataTable

make_selected_slide(slide_id: str, idx: int, local_slide: bool = False, use_prefix: bool = True)

Creating a visualization session component for a selected slide

Parameters:
  • slide_id (str) – Girder Id for the slide to be added

  • local_slide (bool) – Whether or not the slide is from the LocalTileServer or if it’s in the cloud

  • use_prefix (bool) – Whether or not to add the component prefix (omitted initially, added when updating the layout)

organize_folder_contents(folder_info: dict, show_empty: bool = False, ignore_histoqc: bool = True) list

For a given folder selection, return a list containing the slides (index 0) and the folders (index 1)

Parameters:
  • folder_info (dict) – Folder info dict returned by self.handler.get_path_info(path)

  • show_empty (bool, optional) – Whether or not to display folders which contain 0 slides, defaults to False

Returns:

List of slides within the current folder as well as folders within that folder

Return type:

list

slide_selection(slide_rows, slide_all, slide_rem_all, slide_rem, current_crumbs, slide_table_data, builder_data, current_slide_components, current_collection_components, vis_session_data)
update_folder_div(folder_row, crumb_click, collection_folders, current_crumbs, builder_data)

Selecting a folder from the collection’s folder table

Parameters:
  • folder_row (list) – Selected folder (list of 1 index)

  • crumb_click (int) – If one of the folder path parts was clicked it will trigger this.

  • collection_folders (list) – Current data in the collection’s folder table

  • current_crumbs (list) – List of current path parts that can be selected to go up a folder

  • builder_data (list) – Current contents of data store for dataset-builder, used for determining if a slide is already selected

Returns:

Sub-folder and slide selection tables for further selection

Return type:

tuple

update_layout(session_data: dict, use_prefix: bool)

Generating DatasetBuilder layout

Returns:

Div object containing interactive components for the SlideMap object.

Return type:

dash.html.Div.Div

update_vis_store(new_slide_data, current_vis_data)

Updating current visualization session based on selected slide(s)

Parameters:
  • new_slide_data (list) – New slides to be added to the Visualization Session

  • current_vis_data (list) – Current Visualization Session data

Returns:

Updated Visualization Session

Return type:

str

DSAUploader component and UploadType

class fusion_tools.handler.dataset_uploader.DSAUploadHandler(server, upload_folder, use_upload_id)

Bases: object

get()
get_after()
get_before()
post()
post_after()
post_annotations(upload_info)
post_before(req_data)
post_file_chunk(api_url, token, parentId, chunk, offset)
remove_file(path)
save_annotation_chunk()
class fusion_tools.handler.dataset_uploader.DSAUploadType(name: str, description: str, input_files: list = [], processing_plugins: list | None = None, required_metadata: list | None = None)

Bases: object

Formatted upload type for a DSAUploader Component.
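
To show the shape of an upload template, the sketch below mirrors the constructor's field names in a stand-in dataclass. It is not the real DSAUploadType: the class name UploadTypeSketch is hypothetical, and the entries in input_files, processing_plugins, and required_metadata are placeholder strings because their expected format is not described here.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UploadTypeSketch:
    """Stand-in with the same field names as DSAUploadType's constructor."""

    name: str
    description: str
    input_files: list = field(default_factory=list)
    processing_plugins: Optional[list] = None
    required_metadata: Optional[list] = None


# Placeholder entries; the real class may expect richer structures.
example_upload_type = UploadTypeSketch(
    name="Basic slide upload",
    description="One slide image with optional annotations",
    input_files=["slide image", "annotation file"],
    required_metadata=["stain"],
)
print(example_upload_type)
```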

class fusion_tools.handler.dataset_uploader.DSAUploader(handler, dsa_upload_types: DSAUploadType | list = [])

Bases: DSATool

Handler for DSAUploader component, handling uploading data to a specific folder, adding metadata, and running sets of preprocessing plugins.

Parameters:

DSATool (None) – Sub-class of Tool specific to DSA components. Updates with session data by default.

add_row_custom_metadata(clicked, custom_metadata)
create_upload_component(file_info, user_info, idx)
enable_submit_metadata(all_table_data, upload_type)
enable_upload_done(uploads_complete, upload_type, current_filenames, session_data, upload_file_data)

Enabling the “Done” button when all required uploads are uploaded

Parameters:
  • uploads_complete (list) – Current uploadComplete flags from active UploadComponents

  • upload_type (list) – Selected type of upload

extract_path_parts(current_parts: list | dict, search_key: list = ['props', 'children']) tuple

Recursively extract pieces of folder paths stored as clickable components.

Parameters:
  • current_parts (Union[list,dict]) – list or dictionary containing html.A or dbc.Stack of html.A components.

  • search_key (list, optional) – Property keys to search for in nested dicts, defaults to [‘props’,’children’]

Returns:

Tuple containing all the parts of the folder path

Return type:

tuple

gen_collections_dataframe()

Generating dataframe containing current collections

Returns:

Dataframe with each Collection

Return type:

pd.DataFrame

gen_layout(session_data: dict | None)
gen_metadata_table(required_metadata: list, upload_items: list)
get_callbacks()
get_clicked_part(current_parts: list | dict) list

Get the “n_clicks” value for components which have “id”. If they have “id” but not “n_clicks”, assign 0

Parameters:

current_parts (Union[list,dict]) – Either a list or dictionary containing components

Returns:

List of values corresponding to “n_clicks”

Return type:

list

load(component_prefix: int)
make_file_uploads(upload_type_value, upload_folder_path, session_data)

Making the file upload components for the selected UploadType

Parameters:
  • upload_type_value (list) – Selected UploadType from the dropdown menu

  • upload_folder_path (list) – Folder path to upload to

  • session_data (str) – Current visualization session data

make_selectable_dash_table(dataframe: DataFrame, id: dict, multi_row: bool = True, selected_rows: list = [])

Generate a selectable DataTable to add to the layout

Parameters:
  • dataframe (pd.DataFrame) – Pandas DataFrame containing columns/rows of interest

  • id (dict) – Dictionary containing “type” and “index” keys for interactivity

  • multi_row (bool, optional) – Whether to allow selection of multiple rows in the table or just single, defaults to True

Returns:

dash_table.DataTable component to be added to layout

Return type:

dash_table.DataTable

organize_folder_contents(folder_info: dict, show_empty: bool = True, ignore_histoqc: bool = True) list

For a given folder selection, return a list containing the slides (index 0) and the folders (index 1)

Parameters:
  • folder_info (dict) – Folder info dict returned by self.handler.get_path_info(path)

  • show_empty (bool, optional) – Whether or not to display folders which contain 0 slides, defaults to True

Returns:

List of slides within the current folder as well as folders within that folder

Return type:

list

populate_folder_div(collection_clicked, user_clicked, folder_table_rows, folder_crumb, back_clicked, folder_table_data, folder_crumb_parent, session_data)

Generate the collection/user folder div.

Parameters:
  • collection_clicked (list) – Whether “Collection” was clicked

  • user_clicked (list) – Whether “User” was clicked

  • folder_table_rows (list) – Which rows in the folder table were clicked (all set to multi_row=False).

  • folder_crumb (list) – Whether a part of the file path was clicked.

  • back_clicked (list) – Whether the back arrow was clicked.

  • folder_table_data (list) – The current row data in the folder table.

  • folder_crumb_parent (list) – The parent container of all the folder path parts.

  • session_data (list) – Current visualization session data

populate_new_folder(create_clicked, submit_clicked, cancel_clicked, new_folder_name, parent_path, session_data)

Callback for creating a new folder at a specific location.

Parameters:
  • create_clicked (list) – Whether Create Folder was clicked

  • submit_clicked (list) – Whether Submit folder was clicked

  • cancel_clicked (list) – Whether Cancel was clicked

  • new_folder_name (list) – Name of new folder

  • parent_path (list) – Parent of folder path parts

  • session_data (list) – Current Visualization Session data

populate_processing_plugins(done_clicked, upload_type, path_parts, upload_files_data, session_data)
populate_upload_type(select_clicked, path_parts)
submit_metadata(clicked, upload_files_data, tables_data, session_data)
submit_plugin(clicked, plugin_info, plugin_inputs, upload_type, session_data)
update_layout(session_data: dict, use_prefix: bool)
class fusion_tools.handler.dataset_uploader.RequestData(request)

Bases: object