Non-Interactive Functions and Utilities

Non-interactive functions included in fusion-tools include annotation readers, mask generators and image region extractors, annotation filters, some statistical methods, and omics data handlers.

Shape Utilities

Shape utility functions include all functions which create annotations or perform spatial queries on sets of annotations.

Utility functions for data derived from FUSION

fusion_tools.utils.shapes.align_object_props(geo_ann: dict, prop_df: DataFrame | list, prop_cols: list | str, alignment_type: str, prop_key: None | str = None) → dict

Aligning GeoJSON formatted annotations with an external file containing properties for each feature

Parameters:
  • geo_ann (dict) – GeoJSON formatted annotations to align with external file

  • prop_df (Union[pd.DataFrame,list]) – Property DataFrame or list containing DataFrames to align with GeoJSON

  • prop_cols (Union[list,str]) – Column(s) containing property information in each DataFrame

  • alignment_type (str) – Process to use for aligning rows of property DataFrame to GeoJSON

  • prop_key (Union[None,str], optional) – Name of property to assign to the new aligned property (only one), defaults to None

Returns:

GeoJSON annotations with aligned properties applied

Return type:

dict
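
A minimal sketch of the idea behind aligning an external property table with GeoJSON features, assuming an index-based alignment in which row i of the table belongs to feature i. Plain dictionaries stand in for DataFrame rows, and the helper name align_by_index is hypothetical; the values accepted by alignment_type are not listed in this section, so this is not necessarily how align_object_props behaves.

```python
import copy


def align_by_index(geo_ann, prop_rows, prop_cols):
    """Copy the selected columns of row i onto feature i (hypothetical helper)."""
    if isinstance(prop_cols, str):
        prop_cols = [prop_cols]
    aligned = copy.deepcopy(geo_ann)  # leave the input annotations untouched
    for feature, row in zip(aligned["features"], prop_rows):
        props = feature.setdefault("properties", {})
        for col in prop_cols:
            if col in row:
                props[col] = row[col]
    return aligned


geo_ann = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [0, 0]}, "properties": {}},
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [5, 5]}, "properties": {}},
    ],
}
# Each dict plays the role of one row of the property table
prop_rows = [{"area": 10.5, "label": "a"}, {"area": 22.0, "label": "b"}]
aligned = align_by_index(geo_ann, prop_rows, "area")
```

Only the requested column is copied, and the original annotations are left unchanged because the sketch works on a deep copy.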

fusion_tools.utils.shapes.detect_geojson(query_annotations: list | dict)

Check whether a list/dict of annotations is in GeoJSON format

Parameters:

query_annotations (Union[list,dict]) – Input query annotation

fusion_tools.utils.shapes.detect_histomics(query_annotations: list | dict)

Check whether a list/dict of annotations is in histomics format

Parameters:

query_annotations (Union[list,dict]) – Input query annotation
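
The two format checks above can be illustrated with a rough sketch. It assumes that GeoJSON annotations carry a "type" of "FeatureCollection" or "Feature" and that histomics (large-image) annotations carry an "elements" list, either at the top level or under an "annotation" key. The function names are hypothetical and the library's own checks may be stricter or looser.

```python
def looks_like_geojson(ann):
    """True if every item has a GeoJSON-style "type" member (assumed check)."""
    items = ann if isinstance(ann, list) else [ann]
    return bool(items) and all(
        isinstance(a, dict) and a.get("type") in ("FeatureCollection", "Feature")
        for a in items
    )


def looks_like_histomics(ann):
    """True if every item has an "elements" list (assumed check)."""
    items = ann if isinstance(ann, list) else [ann]

    def has_elements(a):
        if not isinstance(a, dict):
            return False
        body = a.get("annotation", a)  # elements may be nested under "annotation"
        return isinstance(body, dict) and isinstance(body.get("elements"), list)

    return bool(items) and all(has_elements(a) for a in items)


geojson_ann = {"type": "FeatureCollection", "features": []}
histomics_ann = {"annotation": {"name": "example", "elements": []}}
```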

fusion_tools.utils.shapes.detect_image_overlay(query_annotations: list | dict)

Check whether a list/dict of annotations contains an image overlay

fusion_tools.utils.shapes.export_annotations(ann_geojson: dict | list, format: str, save_path: str, ann_options: dict = {})

Exporting GeoJSON annotations to a desired format

Parameters:
  • ann_geojson (Union[dict,list]) – Individual or list of GeoJSON formatted annotations

  • format (str) – What format to export these annotations to

  • save_path (str) – Where to save the exported annotations

  • ann_options (dict, optional) – Additional options to pass to export (used to add an id or layer name for Aperio formatted annotations)

fusion_tools.utils.shapes.extract_geojson_properties(geo_list: list, reference_object: str | None = None, ignore_list: list | None = None, nested_depth: int = 4) → list

Extract property names and info for provided list of GeoJSON structures.

Parameters:
  • geo_list (list) – List of GeoJSON dictionaries containing properties

  • reference_object (Union[str,None], optional) – File path to reference object containing more information for each structure, defaults to None

  • ignore_list (Union[list,None], optional) – List of properties to hide from the main view, defaults to None

  • nested_depth (int, optional) – For properties stored as nested dictionaries, specify desired depth (depth of 2 = {‘property_name’: {‘sub-prop1’: val, etc.}}), defaults to 4

Returns:

List of accessible properties in visualization session.

Return type:

list

fusion_tools.utils.shapes.extract_nested_prop(main_prop_dict: dict, depth: int, path: tuple = (), values_list: list = None, _seen: set = None)

Extract nested properties up to the given depth level.

Parameters:
  • main_prop_dict (dict) – Main dictionary containing nested properties. ex: {‘main_prop’: {‘sub_prop1’: value1, ‘sub_prop2’: value2}}

  • depth (int) – Number of levels to extend into nested dictionary
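
As an illustration of depth-limited extraction, the sketch below flattens a nested property dictionary into (path, value) pairs and stops descending once the path reaches the requested depth. The name flatten_to_depth and the (path, value) output shape are assumptions made for this example, not the return format of extract_nested_prop.

```python
def flatten_to_depth(prop_dict, depth, path=()):
    """Return (path, value) pairs, descending at most `depth` levels."""
    flat = []
    for key, value in prop_dict.items():
        new_path = path + (key,)
        if isinstance(value, dict) and len(new_path) < depth:
            # Still above the depth limit: keep descending
            flat.extend(flatten_to_depth(value, depth, new_path))
        else:
            # Leaf value, or a dict that sits at the depth limit
            flat.append((new_path, value))
    return flat


props = {"main_prop": {"sub_prop1": 1, "sub_prop2": {"deep": 2}}, "area": 3.5}
shallow = flatten_to_depth(props, depth=1)
deeper = flatten_to_depth(props, depth=2)
```

With depth=1 the nested dictionary under "main_prop" is returned whole; with depth=2 its direct children are listed and anything deeper stays nested.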

fusion_tools.utils.shapes.find_intersecting(geo_source: dict | str, geo_query: Polygon, return_props: bool = True, return_shapes: bool = True)

Return properties and/or shapes of features from geo_source that intersect with geo_query

Parameters:
  • geo_source (Union[dict,str]) – Source GeoJSON where you are searching for intersecting features

  • geo_query (shapely.geometry.Polygon) – Query polygon used to filter source GeoJSON features

  • return_props (bool, optional) – Whether or not to return properties of intersecting features

  • return_shapes (bool, optional) – Whether or not to return shape information of intersecting features

Returns:

Intersecting properties and/or shapes from geo_source

Return type:

tuple
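
A coarse, dependency-free sketch of an intersection query. It compares axis-aligned bounding boxes instead of exact geometries, so it is only a stand-in for a true polygon intersection test and can report features whose boxes overlap even though the shapes do not. The helper names are hypothetical, and every feature is assumed to be a Polygon with a single exterior ring.

```python
def bbox(coords):
    """Axis-aligned bounding box (min_x, min_y, max_x, max_y) of a ring."""
    xs = [x for x, _ in coords]
    ys = [y for _, y in coords]
    return min(xs), min(ys), max(xs), max(ys)


def bboxes_overlap(a, b):
    """True if two bounding boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def find_bbox_intersecting(geo_source, query_ring):
    """Return (properties, geometries) of features whose box overlaps the query box."""
    query_box = bbox(query_ring)
    props, shapes = [], []
    for feature in geo_source["features"]:
        ring = feature["geometry"]["coordinates"][0]  # exterior ring of a Polygon
        if bboxes_overlap(bbox(ring), query_box):
            props.append(feature["properties"])
            shapes.append(feature["geometry"])
    return props, shapes


def square(x0, y0, size):
    """Closed square ring with its lower-left corner at (x0, y0)."""
    return [[x0, y0], [x0 + size, y0], [x0 + size, y0 + size], [x0, y0 + size], [x0, y0]]


geo_source = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [square(0, 0, 10)]}, "properties": {"name": "a"}},
        {"type": "Feature", "geometry": {"type": "Polygon", "coordinates": [square(50, 50, 10)]}, "properties": {"name": "b"}},
    ],
}
hit_props, hit_shapes = find_bbox_intersecting(geo_source, square(5, 5, 10))
```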

fusion_tools.utils.shapes.find_nested_levels(nested_dict) → int

Find number of levels for nested dictionary

Parameters:

nested_dict (dict) – dictionary containing nested values

Returns:

number of levels for nested dictionary

Return type:

int
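
Counting nesting levels is naturally recursive. The sketch below assumes the convention that a flat dictionary has one level and each additional layer of nested dictionaries adds one more; the function name count_levels is hypothetical and the library may count differently.

```python
def count_levels(value):
    """Number of dictionary layers along the deepest branch of `value`."""
    if not isinstance(value, dict):
        return 0  # plain values contribute no further levels
    if not value:
        return 1  # an empty dict still counts as one level
    return 1 + max(count_levels(v) for v in value.values())


flat_levels = count_levels({"a": 1, "b": 2})
nested_levels = count_levels({"a": {"b": {"c": 1}}, "d": 2})
```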

fusion_tools.utils.shapes.indices_to_path(indices)

From numpy array of coordinates to path string for adding to plotly figure layout

fusion_tools.utils.shapes.load_aperio(xml_path: str) → list

Loading Aperio formatted annotations

Parameters:

xml_path (str) – Path to Aperio formatted annotations (XML)

Returns:

GeoJSON FeatureCollection formatted annotations for each layer in XML

Return type:

list
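
To show how one FeatureCollection per layer can be derived from Aperio XML, here is a simplified sketch built on the standard library. It assumes the common Annotations > Annotation > Regions > Region > Vertices > Vertex layout with X and Y attributes, reads from a string rather than a file, and stores the layer name under a "properties" member of the collection, which is a choice made for this example. The function name aperio_to_geojson is hypothetical.

```python
import xml.etree.ElementTree as ET

APERIO_XML = """
<Annotations>
  <Annotation Id="1" Name="Layer A">
    <Regions>
      <Region Id="1">
        <Vertices>
          <Vertex X="0" Y="0"/>
          <Vertex X="10" Y="0"/>
          <Vertex X="10" Y="10"/>
        </Vertices>
      </Region>
    </Regions>
  </Annotation>
  <Annotation Id="2" Name="Layer B">
    <Regions/>
  </Annotation>
</Annotations>
"""


def aperio_to_geojson(xml_text):
    """Build one GeoJSON FeatureCollection per Annotation layer."""
    root = ET.fromstring(xml_text)
    collections = []
    for layer in root.iter("Annotation"):
        features = []
        for region in layer.iter("Region"):
            ring = [[float(v.get("X")), float(v.get("Y"))] for v in region.iter("Vertex")]
            if len(ring) < 3:
                continue  # not enough points to form a polygon
            if ring[0] != ring[-1]:
                ring.append(list(ring[0]))  # GeoJSON rings must be closed
            features.append({
                "type": "Feature",
                "geometry": {"type": "Polygon", "coordinates": [ring]},
                "properties": {"name": layer.get("Name")},
            })
        collections.append({
            "type": "FeatureCollection",
            "features": features,
            "properties": {"name": layer.get("Name")},
        })
    return collections


layers = aperio_to_geojson(APERIO_XML)
```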

fusion_tools.utils.shapes.load_geojson(geojson_path: str, name: str | None = None) → dict

Load GeoJSON annotations from file path. Optionally add names for GeoJSON FeatureCollections

Parameters:
  • geojson_path (str) – Path to GeoJSON file

  • name (Union[str,None], optional) – Name for structure present in FeatureCollection, defaults to None

Returns:

GeoJSON FeatureCollection dictionary

Return type:

dict
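
A small sketch of loading a GeoJSON file and attaching a structure name. Where the name is stored (here under "properties" of both the collection and each feature) is an assumption for illustration, and read_geojson is a hypothetical stand-in rather than the library function.

```python
import json
import os
import tempfile


def read_geojson(path, name=None):
    """Read a GeoJSON file and optionally tag it with a structure name."""
    with open(path) as f:
        geo = json.load(f)
    if name is not None:
        geo.setdefault("properties", {})["name"] = name
        for feature in geo.get("features", []):
            if feature.get("properties") is None:
                feature["properties"] = {}  # GeoJSON allows null properties
            feature["properties"]["name"] = name
    return geo


example = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": {"type": "Point", "coordinates": [1, 2]}, "properties": None},
    ],
}

# Write the example to a temporary file, then load it back with a name
with tempfile.TemporaryDirectory() as tmp_dir:
    geojson_path = os.path.join(tmp_dir, "example.geojson")
    with open(geojson_path, "w") as f:
        json.dump(example, f)
    loaded = read_geojson(geojson_path, name="Structure A")
```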

fusion_tools.utils.shapes.load_histomics(json_path: str) → list

Load large-image annotation from filepath

Parameters:

json_path (str) – Path to large-image formatted annotations

Returns:

GeoJSON FeatureCollection formatted annotation

Return type:

list

fusion_tools.utils.shapes.load_polygon_csv(csv_path: str, name: str, shape_cols: list | str, group_by_col: str | None, property_cols: str | list | None, shape_options: dict) → dict

Load CSV formatted annotations from file path

Parameters:
  • shape_cols (Union[list,str]) – Names of the columns containing the x and then y coordinates

  • group_by_col (Union[str,None]) – Name of column to group features by (used to determine which coordinates belong to the same structure for non-point annotations)

  • property_cols (Union[str,list,None]) – List of columns containing properties

  • shape_options (dict) – Dictionary with “radius” for point annotations (can be a number or the name of a column that contains a number)
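
The role of group_by_col can be illustrated with a short sketch: rows that share a value in the grouping column are collected into one polygon, in file order. The column names, the in-memory CSV text, and the helper csv_to_polygons are all hypothetical, and point annotations with a radius are not covered.

```python
import csv
import io

CSV_TEXT = """structure_id,x,y,label
1,0,0,a
1,10,0,a
1,10,10,a
2,20,20,b
2,30,20,b
2,30,30,b
"""


def csv_to_polygons(csv_text, shape_cols, group_by_col, property_cols):
    """Group coordinate rows by `group_by_col` and build one Polygon per group."""
    x_col, y_col = shape_cols
    groups = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        entry = groups.setdefault(row[group_by_col], {"ring": [], "props": {}})
        entry["ring"].append([float(row[x_col]), float(row[y_col])])
        for col in property_cols:
            entry["props"][col] = row[col]  # last row of the group wins
    features = []
    for entry in groups.values():
        ring = entry["ring"]
        if ring[0] != ring[-1]:
            ring.append(list(ring[0]))  # close the ring
        features.append({
            "type": "Feature",
            "geometry": {"type": "Polygon", "coordinates": [ring]},
            "properties": entry["props"],
        })
    return {"type": "FeatureCollection", "features": features}


polygons = csv_to_polygons(CSV_TEXT, ["x", "y"], "structure_id", ["label"])
```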

fusion_tools.utils.shapes.load_visium(visium_path: str, include_var_names: list = [], include_obs: list = [], mpp: float | None = None, scale_factor: float | str | None = None, verbose: bool = True)

Loading 10x Visium Spot annotations from an h5ad file or csv file containing spot center coordinates. Adds any of the variables listed in include_var_names and also the barcodes associated with each spot (if the path is an h5ad file).

Parameters:
  • visium_path (str) – Path to the h5ad (anndata) formatted Visium data or csv file containing “imagerow” and “imagecol” columns

  • include_var_names (list, optional) – List of additional variables to add to the generated annotations (barcode is added by default), defaults to []

  • mpp (Union[float,None], optional) – If the Microns Per Pixel (MPP) is known for this image, pass it here to save time calculating spot diameter, defaults to None

fusion_tools.utils.shapes.load_visiumhd(visiumhd_path: str, resolution_level: int, include_analysis_path: str | list | None = None, include_analysis_name: str | list | None = None, verbose: bool = True)

Generating annotations for a VisiumHD dataset

Parameters:
  • visiumhd_path (str) – Path to “binned_outputs”

  • resolution_level (int) – Number representing the length of one side of the square

  • include_analysis_path (Union[str,list,None], optional) – Path to various analyses performed on these ROIs; can either be output of spaceranger or any csv file with a “barcode” column to be used for alignment, defaults to None

  • include_analysis_name (Union[str,list,None], optional) – Name to use for each included analysis; if none are provided, the name is inferred from {path}.split(os.sep)[-2], defaults to None
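
The name inference rule quoted above, {path}.split(os.sep)[-2], picks the directory that directly contains the analysis file. A tiny sketch with a made-up path (the folder and file names are hypothetical):

```python
import os


def infer_analysis_name(path):
    """Return the name of the folder that directly contains the file."""
    return path.split(os.sep)[-2]


# Hypothetical analysis file location, built with the platform's separator
example_path = os.sep.join(["outs", "analysis", "clustering", "clusters.csv"])
inferred = infer_analysis_name(example_path)
```

Here the inferred name is "clustering", the second-to-last path component.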

fusion_tools.utils.shapes.path_to_indices(path)

From SVG path to numpy array of coordinates, each row being a (row, col) point
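
The conversion between coordinates and SVG path strings can be sketched as a round trip. This version assumes paths that use only the M, L and Z commands, uses plain lists of tuples instead of numpy arrays, and keeps coordinate pairs in the order they appear in the path. The function names are hypothetical.

```python
def coords_to_path(coords):
    """Build a closed SVG path string such as 'M 0,0 L 10,0 L 10,5 Z'."""
    return "M " + " L ".join(f"{a},{b}" for a, b in coords) + " Z"


def path_to_coords(path):
    """Parse an M/L/Z path back into a list of integer coordinate pairs."""
    body = path.replace("M", "").replace("Z", "")
    points = []
    for segment in body.split("L"):
        a_str, b_str = segment.strip().split(",")
        points.append((round(float(a_str)), round(float(b_str))))
    return points


coords = [(0, 0), (10, 0), (10, 5)]
svg_path = coords_to_path(coords)
round_trip = path_to_coords(svg_path)
```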

fusion_tools.utils.shapes.path_to_mask(path, shape)

From SVG path to a boolean array where all pixels enclosed by the path are True, and the other pixels are False.
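
One way to picture the mask construction is an even-odd (ray casting) test applied to every pixel. The sketch below is not the library's implementation: it works on a list of polygon vertices rather than a path string, returns nested lists instead of a numpy array, and treats pixels exactly on the boundary according to the ray casting rule. Names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Even-odd rule: count crossings of a ray cast to the right of (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def polygon_to_mask(polygon, shape):
    """Boolean mask of size (rows, cols); mask[row][col] tests the point (col, row)."""
    rows, cols = shape
    return [[point_in_polygon(col, row, polygon) for col in range(cols)] for row in range(rows)]


square = [(2, 2), (2, 6), (6, 6), (6, 2)]  # vertices given as (x, y)
mask = polygon_to_mask(square, (8, 8))
filled = sum(sum(row) for row in mask)
```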

fusion_tools.utils.shapes.process_filters_queries(filter_list: list, spatial_list: list, structures: list, all_geo_list: list)

Filter GeoJSON list based on lists of both spatial and property filters.

Parameters:
  • filter_list (list) – List of property filters (keys = name: “name of property”, range: “either a list of categorical values or min-max for quantitative”)

  • spatial_list (list) – List of spatial filters (keys= type: “predicate”, structure: “name of structure that is basis of predicate”)

  • structures (list) – List of included structures in final GeoJSON

  • all_geo_list (list) – List of GeoJSON objects to search

Returns:

Filtered GeoJSON where all included structures are included as one FeatureCollection and a reference list containing original structure name and index

Return type:

tuple
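
The property filters described above (a name plus either a list of categorical values or a min-max pair) can be sketched as follows. This example covers only property filters, assumes that multiple filters are combined with a logical AND, and uses hypothetical helper names; spatial filters and the reference list in the return value are left out.

```python
def passes_filter(props, flt):
    """Check one property filter of the form {"name": ..., "range": [...]}."""
    value = props.get(flt["name"])
    if value is None:
        return False
    rng = flt["range"]
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return rng[0] <= value <= rng[1]  # quantitative: inclusive min-max
    return value in rng  # categorical: membership in allowed values


def filter_features(geo, filter_list):
    """Keep features that satisfy every filter (AND is an assumption)."""
    kept = [
        f for f in geo["features"]
        if all(passes_filter(f["properties"], flt) for flt in filter_list)
    ]
    return {"type": "FeatureCollection", "features": kept}


geo = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "geometry": None, "properties": {"area": 12.0, "label": "a"}},
        {"type": "Feature", "geometry": None, "properties": {"area": 48.0, "label": "a"}},
        {"type": "Feature", "geometry": None, "properties": {"area": 15.0, "label": "b"}},
    ],
}
filters = [
    {"name": "area", "range": [10, 20]},
    {"name": "label", "range": ["a"]},
]
filtered = filter_features(geo, filters)
```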

fusion_tools.utils.shapes.spatially_aggregate(child_geo: dict, parent_geos: list, separate: bool = True, summarize: bool = True, ignore_list: list = ['_id', '_index'])

Aggregate intersecting feature properties to a provided GeoJSON

Parameters:
  • child_geo (dict) – GeoJSON object that is receiving aggregated properties

  • parent_geos (list) – List of GeoJSON objects which are intersecting with child_geo

Returns:

Updated child_geo object with new properties from intersecting parent_geos

Return type:

dict

Image Utilities

Image utility functions include methods which read images/regions of images based on user inputs.

Omics Utilities

Omics utilities include methods which query external APIs to derive further information from Omics data queries as well as some utility functions for enforcing cell type hierarchies.

Utility functions for handling -omics data

fusion_tools.utils.omics.get_asct(id: str | int)

Get Anatomical Structure & Cell Type associated with a given HGNC Id.

fusion_tools.utils.omics.get_cell(id: str)

Get all the cell types available within a given anatomical structure. Input must be a UBERON ID (“UBERON_######…”).

fusion_tools.utils.omics.get_gene_info(id: str | list, species: str = 'human', fields: list = ['HGNC', 'alias', 'summary'], size: int = 5)

Get information about a given gene id or list of ids. By default returns HGNC, alias, and summary. Can be expanded to include go, pubmed articles, etc.

fusion_tools.utils.omics.group_subtypes(geo_props: dict, name: str, key: dict, keep_zeros: bool = True, normalize: bool = True) → dict

Grouping together properties into a lower-level descriptor

Parameters:
  • geo_props (dict) – Property dict for a single Feature

  • name (str) – Name of property containing “sub-properties” to be aggregated

  • key (dict) – Dictionary containing keys and values pertaining to sub-properties to be aggregated to each key

  • keep_zeros (bool, optional) – Whether or not to keep aggregated properties which sum to zero, defaults to True

  • normalize (bool, optional) – Whether to normalize this set of keys to sum to 1 or keep as sums, defaults to True

Returns:

Updated property dictionary containing lower-level descriptor and keys

Return type:

dict
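
A sketch of the grouping step, assuming that key maps each group name to the list of sub-property names that are summed into it. The output property name (the original name with " (grouped)" appended) and the helper group_props are inventions for this example; the library may store the result under a different key.

```python
def group_props(geo_props, name, key, keep_zeros=True, normalize=True):
    """Sum sub-properties of `name` into the groups defined by `key`."""
    sub_props = geo_props.get(name, {})
    grouped = {
        group: sum(sub_props.get(member, 0) for member in members)
        for group, members in key.items()
    }
    if not keep_zeros:
        grouped = {g: v for g, v in grouped.items() if v != 0}
    total = sum(grouped.values())
    if normalize and total > 0:
        grouped = {g: v / total for g, v in grouped.items()}  # groups sum to 1
    updated = dict(geo_props)
    updated[f"{name} (grouped)"] = grouped  # hypothetical output key
    return updated


feature_props = {"subtypes": {"A1": 1, "A2": 1, "B1": 2, "C1": 0}}
group_key = {"A": ["A1", "A2"], "B": ["B1"], "C": ["C1"]}

with_zeros = group_props(feature_props, "subtypes", group_key)
without_zeros = group_props(feature_props, "subtypes", group_key, keep_zeros=False)
raw_sums = group_props(feature_props, "subtypes", group_key, normalize=False)
```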

fusion_tools.utils.omics.selective_aggregation(child_geo: dict, parent_geo: dict, include_keys: dict = {}, aggregate_dropped: bool = True, dropped_name: str = 'undefined', re_normalize: bool = True)

Selectively aggregate different fields for each intersecting structure. Useful for only aggregating cell types which should be found within a specific structure

Parameters:
  • child_geo (dict) – Child structure to be receiving aggregated properties

  • parent_geo (dict) – Parent structure that contains properties that are going to be aggregated within the child geos.

  • include_keys (dict, optional) – Dictionary containing keys for each property that is modified and a list of values which should be included for that key, defaults to {}

  • aggregate_dropped (bool, optional) – Whether or not to include dropped values in the final aggregation, defaults to True

  • dropped_name (str, optional) – Name to use for dropped names that are aggregated. Ignored if aggregate_dropped=False, defaults to ‘undefined’

  • re_normalize (bool, optional) – Whether or not to re-normalize values after dropping keys, defaults to True

Returns:

Child structure with selectively aggregated properties from the parent structure.

Return type:

dict
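
The filtering part of selective aggregation can be sketched on a single property dictionary: keep the included keys, optionally pool everything else under dropped_name, then optionally re-normalize. The spatial intersection between child and parent structures is omitted, and select_keys is a hypothetical helper rather than the library function.

```python
def select_keys(values, include, aggregate_dropped=True,
                dropped_name="undefined", re_normalize=True):
    """Keep `include` keys; pool or discard the rest; optionally rescale to sum to 1."""
    kept = {k: v for k, v in values.items() if k in include}
    dropped_total = sum(v for k, v in values.items() if k not in include)
    if aggregate_dropped and dropped_total:
        kept[dropped_name] = kept.get(dropped_name, 0) + dropped_total
    total = sum(kept.values())
    if re_normalize and total > 0:
        kept = {k: v / total for k, v in kept.items()}
    return kept


parent_values = {"type_a": 0.5, "type_b": 0.25, "type_c": 0.25}
pooled = select_keys(parent_values, ["type_a", "type_b"])
discarded = select_keys(parent_values, ["type_a", "type_b"], aggregate_dropped=False)
```

With pooling enabled, the excluded "type_c" value is carried over as "undefined"; with pooling disabled, the remaining two values are rescaled so that they sum to 1.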