Non-Interactive Functions and Utilities

Non-interactive functions included in fusion-tools include annotation readers, mask generators and image region extractors, annotation filters, some statistical methods, and omics data handlers.

Shape Utilities

Shape utility functions include all functions which create annotations or perform spatial queries on sets of annotations.

Utility functions for data derived from FUSION

fusion_tools.utils.shapes.align_object_props(geo_ann: dict, prop_df: DataFrame | list, prop_cols: list | str, alignment_type: str, prop_key: None | str = None) → dict

Aligning GeoJSON formatted annotations with an external file containing properties for each feature

Parameters:

geo_ann (dict) – GeoJSON formatted annotations to align with external file
prop_df (Union[pd.DataFrame,list]) – Property DataFrame or list containing DataFrames to align with GeoJSON
prop_cols (Union[list,str]) – Column(s) containing property information in each DataFrame
alignment_type (str) – Process to use for aligning rows of property DataFrame to GeoJSON
prop_key (Union[None,str], optional) – Name of property to assign to the new aligned property (only one), defaults to None

Returns:

GeoJSON annotations with aligned properties applied

Return type:

dict

fusion_tools.utils.shapes.detect_geojson(query_annotations: list | dict)

Check whether a list/dict of annotations are in GeoJSON format

Parameters:

query_annotations (Union[list,dict]) – Input query annotation
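As a rough illustration of what a format check like this might look at, the sketch below tests for the `type` member that the GeoJSON specification requires on every object. The name `looks_like_geojson` is hypothetical, and this is an assumption about the kind of check involved, not the library's implementation.

```python
# GeoJSON object types that annotation files typically use at the top level
GEOJSON_TYPES = {"FeatureCollection", "Feature"}


def looks_like_geojson(query_annotations):
    """Return True if a dict, or every dict in a list, has a GeoJSON 'type'."""
    if isinstance(query_annotations, dict):
        items = [query_annotations]
    elif isinstance(query_annotations, list):
        items = query_annotations
    else:
        return False
    # all() on an empty list is True, so reject empty input explicitly
    return bool(items) and all(
        isinstance(item, dict) and item.get("type") in GEOJSON_TYPES
        for item in items
    )


feature_collection = {"type": "FeatureCollection", "features": []}
other_format = [{"annotation": {"elements": []}}]

print(looks_like_geojson(feature_collection))
print(looks_like_geojson(other_format))
```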

fusion_tools.utils.shapes.detect_histomics(query_annotations: list | dict)

Check whether a list/dict of annotations are in histomics format

Parameters:

query_annotations (Union[list,dict]) – Input query annotation

fusion_tools.utils.shapes.detect_image_overlay(query_annotations: list | dict)

Check whether a list/dict of annotations contains an image overlay

fusion_tools.utils.shapes.export_annotations(ann_geojson: dict | list, format: str, save_path: str, ann_options: dict = {})

Exporting GeoJSON annotations to a desired format

Parameters:

ann_geojson (Union[dict,list]) – Individual or list of GeoJSON formatted annotations
format (str) – What format to export these annotations to
save_path (str) – Where to save the exported annotations
ann_options (dict, optional) – Additional options to pass to export (used to add an id or layer name for Aperio formatted annotations)

fusion_tools.utils.shapes.extract_geojson_properties(geo_list: list, reference_object: str | None = None, ignore_list: list | None = None, nested_depth: int = 4) → list

Extract property names and info for provided list of GeoJSON structures.

Parameters:

geo_list (list) – List of GeoJSON dictionaries containing properties
reference_object (Union[str,None], optional) – File path to reference object containing more information for each structure, defaults to None
ignore_list (Union[list,None], optional) – List of properties to hide from the main view, defaults to None
nested_depth (int, optional) – For properties stored as nested dictionaries, specify desired depth (depth of 2 = {‘property_name’: {‘sub-prop1’: val, etc.}}), defaults to 4

Returns:

List of accessible properties in visualization session.

Return type:

list

fusion_tools.utils.shapes.extract_nested_prop(main_prop_dict: dict, depth: int, path: tuple = (), values_list: list = None, _seen: set = None)

Extract nested properties up to a given depth level.

Parameters:

main_prop_dict (dict) – Main dictionary containing nested properties. ex: {‘main_prop’: {‘sub_prop1’: value1, ‘sub_prop2’: value2}}
depth (int) – Number of levels to extend into nested dictionary
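To make the depth-limited behavior concrete, here is a minimal sketch that collects (path, value) pairs from a nested property dictionary and stops descending at `depth` levels. The helper `flatten_props` is hypothetical and only approximates the idea. It does not reproduce the `values_list` or `_seen` arguments of the documented function.

```python
def flatten_props(prop_dict, depth, path=()):
    """Collect (path, value) pairs, descending at most `depth` levels."""
    pairs = []
    for key, value in prop_dict.items():
        new_path = path + (key,)
        if isinstance(value, dict) and len(new_path) < depth:
            # Still above the depth limit: keep descending
            pairs.extend(flatten_props(value, depth, new_path))
        else:
            # Leaf value, or a dictionary left intact at the depth limit
            pairs.append((new_path, value))
    return pairs


props = {"main_prop": {"sub_prop1": 1.0, "sub_prop2": {"deep": 2.0}}}

print(flatten_props(props, depth=2))
print(flatten_props(props, depth=3))
```

With `depth=2` the dictionary stored under `sub_prop2` is returned as a single value. With `depth=3` it is expanded one level further.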

fusion_tools.utils.shapes.find_intersecting(geo_source: dict | str, geo_query: Polygon, return_props: bool = True, return_shapes: bool = True)

Return properties and/or shapes of features from geo_source that intersect with geo_query

Parameters:

geo_source (Union[dict,str]) – Source GeoJSON where you are searching for intersecting features
geo_query (shapely.geometry.Polygon) – Query polygon used to filter source GeoJSON features
return_props (bool, optional) – Whether or not to return properties of intersecting features
return_shapes (bool, optional) – Whether or not to return shape information of intersecting features

Returns:

Intersecting properties and/or shapes from geo_source

Return type:

tuple

fusion_tools.utils.shapes.find_nested_levels(nested_dict) → int

Find number of levels for nested dictionary

Parameters:

nested_dict (dict) – dictionary containing nested values

Returns:

number of levels for nested dictionary

Return type:

int
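One way to count nesting levels is a short recursion, sketched below. The helper `count_levels` is hypothetical, and its convention (a flat dictionary counts as one level) is an assumption that may differ from what `find_nested_levels` returns.

```python
def count_levels(nested_dict):
    """Count levels of dictionary nesting; a flat dict counts as 1."""
    if not isinstance(nested_dict, dict):
        return 0
    # Depth of the deepest child, or 0 when the dictionary is empty
    child_levels = [count_levels(value) for value in nested_dict.values()]
    return 1 + max(child_levels, default=0)


flat = {"Area": 120.0}
nested = {"main_prop": {"sub_prop": {"value": 1}}, "other": 2}

print(count_levels(flat))
print(count_levels(nested))
```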

fusion_tools.utils.shapes.indices_to_path(indices)

From numpy array of coordinates to path string for adding to plotly figure layout
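For readers unfamiliar with path strings, the sketch below builds a closed SVG-style path of the form `M x,y L x,y ... Z` from a sequence of coordinate pairs. The helper `coords_to_svg_path` is hypothetical and uses plain Python sequences. The exact string format produced by `indices_to_path` is not documented here, so treat this as an assumption.

```python
def coords_to_svg_path(coords):
    """Build a closed SVG path 'M x,y L x,y ... Z' from (x, y) pairs."""
    if len(coords) == 0:
        return ""
    # One "x,y" token per vertex
    segments = [f"{x},{y}" for x, y in coords]
    # "M" moves to the first vertex, "L" draws lines, "Z" closes the shape
    return "M " + " L ".join(segments) + " Z"


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(coords_to_svg_path(square))
```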

fusion_tools.utils.shapes.load_aperio(xml_path: str) → list

Loading Aperio formatted annotations

Parameters:

xml_path (str) – Path to Aperio formatted annotations (XML)

Returns:

GeoJSON FeatureCollection formatted annotations for each layer in XML

Return type:

list

fusion_tools.utils.shapes.load_geojson(geojson_path: str, name: str | None = None) → dict

Load GeoJSON annotations from file path. Optionally add names for GeoJSON FeatureCollections

Parameters:

geojson_path (str) – Path to GeoJSON file
name (Union[str,None], optional) – Name for structure present in FeatureCollection, defaults to None

Returns:

GeoJSON FeatureCollection dictionary

Return type:

dict

fusion_tools.utils.shapes.load_histomics(json_path: str) → list

Load large-image annotation from filepath

Parameters:

json_path (str) – Path to large-image formatted annotations

Returns:

GeoJSON FeatureCollection formatted annotation

Return type:

list

fusion_tools.utils.shapes.load_polygon_csv(csv_path: str, name: str, shape_cols: list | str, group_by_col: str | None, property_cols: str | list | None, shape_options: dict) → dict

Load csv formatted annotations from filepath

Parameters:

shape_cols (Union[list,str]) – Names of the columns containing the x and then the y coordinates
group_by_col (Union[str,None]) – Name of column to group features by (used to determine which coordinates belong to the same structure for non-point annotations)
property_cols (Union[str,list,None]) – List of columns containing properties
shape_options (dict) – Dictionary with “radius” for point annotations (can be a number or a column that contains a number)

fusion_tools.utils.shapes.load_visium(visium_path: str, include_var_names: list = [], include_obs: list = [], mpp: float | None = None, scale_factor: float | str | None = None, verbose: bool = True)

Loading 10x Visium Spot annotations from an h5ad file or csv file containing spot center coordinates. Adds any of the variables listed in include_var_names and also the barcodes associated with each spot (if the path is an h5ad file).

Parameters:

visium_path (str) – Path to the h5ad (anndata) formatted Visium data or csv file containing “imagerow” and “imagecol” columns
include_var_names (list, optional) – List of additional variables to add to the generated annotations (barcode is added by default), defaults to []
mpp (Union[float,None], optional) – If the Microns Per Pixel (MPP) is known for this image then pass it here to save time calculating spot diameter, defaults to None

fusion_tools.utils.shapes.load_visiumhd(visiumhd_path: str, resolution_level: int, include_analysis_path: str | list | None = None, include_analysis_name: str | list | None = None, verbose: bool = True)

Generating annotations for a VisiumHD dataset

Parameters:

visiumhd_path (str) – Path to “binned_outputs”
resolution_level (int) – Number representing the length of one side of the square
include_analysis_path (Union[str,list,None], optional) – Path to various analyses performed on these ROIs, can either be output of spaceranger or any csv file with “barcode” column to be used for alignment, defaults to None
include_analysis_name (Union[str,list,None], optional) – Name to use for each included analysis, if none are provided, name is inferred from {path}.split(os.sep)[-2], defaults to None
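The name inference described for include_analysis_name can be checked with a couple of lines. The path below is purely hypothetical and is built with `os.sep` so the example behaves the same on any platform.

```python
import os

# Hypothetical analysis file path, joined with the platform separator
analysis_path = os.sep.join(
    ["binned_outputs", "square_008um", "analysis", "clustering", "clusters.csv"]
)

# The second-to-last component is the directory that contains the file
inferred_name = analysis_path.split(os.sep)[-2]
print(inferred_name)  # clustering
```

If the inferred directory name is not descriptive enough, passing include_analysis_name explicitly avoids relying on the folder layout.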

fusion_tools.utils.shapes.path_to_indices(path)

From SVG path to numpy array of coordinates, each row being a (row, col) point
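The reverse conversion can be sketched with a regular expression. The helper `svg_path_to_points` is hypothetical, handles only straight-line `M`/`L`/`Z` paths with plain decimal coordinates, and returns a list of integer pairs rather than a numpy array. Pairs are kept in the order they appear in the path, which is an assumption about coordinate ordering.

```python
import re


def svg_path_to_points(path):
    """Parse an 'M a,b L a,b ... Z' path into rounded integer pairs."""
    # Pull out every signed decimal number, ignoring the command letters
    numbers = re.findall(r"-?\d+(?:\.\d+)?", path)
    values = [round(float(number)) for number in numbers]
    # Consecutive numbers form one coordinate pair
    return list(zip(values[0::2], values[1::2]))


path = "M 0,0 L 10.4,0 L 10,9.6 L 0,10 Z"
print(svg_path_to_points(path))
```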

fusion_tools.utils.shapes.path_to_mask(path, shape)

From SVG path to a boolean array where all pixels enclosed by the path are True, and the other pixels are False.

fusion_tools.utils.shapes.process_filters_queries(filter_list: list, spatial_list: list, structures: list, all_geo_list: list)

Filter GeoJSON list based on lists of both spatial and property filters.

Parameters:

filter_list (list) – List of property filters (keys = name: “name of property”, range: “either a list of categorical values or min-max for quantitative”)
spatial_list (list) – List of spatial filters (keys= type: “predicate”, structure: “name of structure that is basis of predicate”)
structures (list) – List of included structures in final GeoJSON
all_geo_list (list) – List of GeoJSON objects to search

Returns:

Filtered GeoJSON where all included structures are included as one FeatureCollection and a reference list containing original structure name and index

Return type:

tuple
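The property filter structure described above (a name plus either a list of categories or a min-max pair) can be applied to a single feature as in the sketch below. The helper `passes_property_filter`, the property names, and the choice to combine filters with a logical AND are all assumptions for illustration. Spatial filters are not covered.

```python
def passes_property_filter(properties, prop_filter):
    """Check one feature's properties against a {'name', 'range'} filter."""
    value = properties.get(prop_filter["name"])
    if value is None:
        return False
    allowed = prop_filter["range"]
    if isinstance(value, str):
        # Categorical property: value must be one of the listed categories
        return value in allowed
    # Quantitative property: range is [minimum, maximum], inclusive
    low, high = allowed
    return low <= value <= high


# Hypothetical features and filters
features = [
    {"type": "Feature", "geometry": None,
     "properties": {"Name": "A", "Area": 120.0, "Label": "glomerulus"}},
    {"type": "Feature", "geometry": None,
     "properties": {"Name": "B", "Area": 250.0, "Label": "glomerulus"}},
    {"type": "Feature", "geometry": None,
     "properties": {"Name": "C", "Area": 150.0, "Label": "tubule"}},
]
filter_list = [
    {"name": "Area", "range": [100, 200]},
    {"name": "Label", "range": ["glomerulus"]},
]

# Keep features that satisfy every filter (AND is an assumption here)
kept = [
    feature for feature in features
    if all(passes_property_filter(feature["properties"], f) for f in filter_list)
]
print([feature["properties"]["Name"] for feature in kept])
```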

fusion_tools.utils.shapes.spatially_aggregate(child_geo: dict, parent_geos: list, separate: bool = True, summarize: bool = True, ignore_list: list = ['_id', '_index'])

Aggregate intersecting feature properties to a provided GeoJSON

Parameters:

child_geo (dict) – GeoJSON object that is receiving aggregated properties
parent_geos (list) – List of GeoJSON objects which are intersecting with child_geo

Returns:

Updated child_geo object with new properties from intersecting parent_geos

Return type:

dict

Image Utilities

Image utility functions include methods which read images/regions of images based on user inputs.

Omics Utilities

Omics utilities include methods which query external APIs to derive further information from Omics data queries as well as some utility functions for enforcing cell type hierarchies.

Utility functions for handling -omics data

fusion_tools.utils.omics.get_asct(id: str | int)

Get Anatomical Structure & Cell Type associated with a given HGNC Id.

fusion_tools.utils.omics.get_cell(id: str)

Get all the cell types available within a given anatomical structure. Input has to be an UBERON id (“UBERON_######…”).

fusion_tools.utils.omics.get_gene_info(id: str | list, species: str = 'human', fields: list = ['HGNC', 'alias', 'summary'], size: int = 5)

Get information about a given gene id or list of ids. By default returns HGNC, alias, and summary. Can be expanded to include go, pubmed articles, etc.

fusion_tools.utils.omics.group_subtypes(geo_props: dict, name: str, key: dict, keep_zeros: bool = True, normalize: bool = True) → dict

Grouping together properties into a lower-level descriptor

Parameters:

geo_props (dict) – Property dict for a single Feature
name (str) – Name of property containing “sub-properties” to be aggregated
key (dict) – Dictionary containing keys and values pertaining to sub-properties to be aggregated to each key
keep_zeros (bool, optional) – Whether or not to keep aggregated properties which sum to zero, defaults to True
normalize (bool, optional) – Whether to normalize this set of keys to sum to 1 or keep as sums, defaults to True

Returns:

Updated property dictionary containing lower-level descriptor and keys

Return type:

dict
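The effect of keep_zeros and normalize is easiest to see on a small example. The sketch below is a hypothetical re-implementation (`group_subtypes_sketch`), not the library's code. It assumes that `key` maps each group name to a list of sub-property names and that the result is stored under a new `<name>_grouped` property. The subtype names are illustrative only.

```python
def group_subtypes_sketch(geo_props, name, key, keep_zeros=True, normalize=True):
    """Sum sub-properties of geo_props[name] into the groups defined by key."""
    sub_props = geo_props.get(name, {})
    # Sum the members of each group; missing members count as zero
    grouped = {
        group: sum(sub_props.get(member, 0) for member in members)
        for group, members in key.items()
    }
    if normalize:
        total = sum(grouped.values())
        if total > 0:
            grouped = {group: value / total for group, value in grouped.items()}
    if not keep_zeros:
        grouped = {group: value for group, value in grouped.items() if value != 0}
    # Output property name is an assumption made for this sketch
    return {**geo_props, name + "_grouped": grouped}


props = {"Cell Subtypes": {"PT-S1": 2, "PT-S2": 1, "TAL": 1}}
key = {"PT": ["PT-S1", "PT-S2"], "TAL": ["TAL"], "POD": ["POD"]}

print(group_subtypes_sketch(props, "Cell Subtypes", key))
print(group_subtypes_sketch(props, "Cell Subtypes", key, keep_zeros=False, normalize=False))
```

With the defaults, the grouped values sum to 1 and the empty `POD` group is kept with a value of zero. Turning off both options returns raw sums and drops the empty group.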

fusion_tools.utils.omics.selective_aggregation(child_geo: dict, parent_geo: dict, include_keys: dict = {}, aggregate_dropped: bool = True, dropped_name: str = 'undefined', re_normalize: bool = True)

Selectively aggregate different fields for each intersecting structure. Useful for only aggregating cell types which should be found within a specific structure

Parameters:

child_geo (dict) – Child structure to be receiving aggregated properties
parent_geo (dict) – Parent structure that contains properties that are going to be aggregated within the child geos.
include_keys (dict, optional) – Dictionary containing keys for each property that is modified and a list of values which should be included for that key, defaults to {}
aggregate_dropped (bool, optional) – Whether or not to include dropped values in the final aggregation, defaults to True
dropped_name (str, optional) – Name to use for dropped names that are aggregated. Ignored if aggregate_dropped=False, defaults to ‘undefined’
re_normalize (bool, optional) – Whether or not to re-normalize values after dropping keys, defaults to True

Returns:

Child structure with selectively aggregated properties from the parent structure.

Return type:

dict