Non-Interactive Functions and Utilities
The non-interactive functions in fusion-tools cover annotation readers, mask generators, image region extractors, annotation filters, some statistical methods, and omics data handlers.
Shape Utilities
Shape utility functions include all functions which create annotations or perform spatial queries on sets of annotations.
Utility functions for data derived from FUSION
- fusion_tools.utils.shapes.align_object_props(geo_ann: dict, prop_df: DataFrame | list, prop_cols: list | str, alignment_type: str, prop_key: None | str = None) dict
Aligning GeoJSON formatted annotations with an external file containing properties for each feature
- Parameters:
geo_ann (dict) – GeoJSON formatted annotations to align with external file
prop_df (Union[pd.DataFrame,list]) – Property DataFrame or list containing DataFrames to align with GeoJSON
prop_cols (Union[list,str]) – Column(s) containing property information in each DataFrame
alignment_type (str) – Process to use for aligning rows of property DataFrame to GeoJSON
prop_key (Union[None,str], optional) – Name of property to assign to the new aligned property (only one), defaults to None
- Returns:
GeoJSON annotations with aligned properties applied
- Return type:
dict
- fusion_tools.utils.shapes.detect_geojson(query_annotations: list | dict)
Check whether a list/dict of annotations is in GeoJSON format
- Parameters:
query_annotations (Union[list,dict]) – Input query annotation
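As a rough illustration of what a format check of this kind can involve, the sketch below tests for the top-level members that the GeoJSON specification requires. It is an assumption about the sort of check performed, not the implementation of detect_geojson; the helper name looks_like_geojson is hypothetical.

```python
def looks_like_geojson(query_annotations):
    """Hypothetical helper: True if every item looks like a GeoJSON Feature(Collection)."""
    # Accept either a single annotation dict or a list of them.
    if isinstance(query_annotations, list):
        items = query_annotations
    else:
        items = [query_annotations]
    if not items:
        return False

    for item in items:
        if not isinstance(item, dict):
            return False
        if item.get("type") == "FeatureCollection":
            # A FeatureCollection must carry a list of features.
            if not isinstance(item.get("features"), list):
                return False
        elif item.get("type") == "Feature":
            # A Feature must carry a geometry member.
            if "geometry" not in item:
                return False
        else:
            return False
    return True


feature = {
    "type": "Feature",
    "geometry": {"type": "Point", "coordinates": [0, 0]},
    "properties": {},
}
collection = {"type": "FeatureCollection", "features": [feature]}
```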
- fusion_tools.utils.shapes.detect_histomics(query_annotations: list | dict)
Check whether a list/dict of annotations is in histomics format
- Parameters:
query_annotations (Union[list,dict]) – Input query annotation
- fusion_tools.utils.shapes.detect_image_overlay(query_annotations: list | dict)
Check whether a list/dict of annotations contains an image overlay
- fusion_tools.utils.shapes.export_annotations(ann_geojson: dict | list, format: str, save_path: str, ann_options: dict = {})
Exporting GeoJSON annotations to a desired format
- Parameters:
ann_geojson (Union[dict,list]) – Individual or list of GeoJSON formatted annotations
format (str) – What format to export these annotations to
save_path (str) – Where to save the exported annotations
ann_options (dict, optional) – Additional options to pass to export (used to add an id or layer name for Aperio formatted annotations)
- fusion_tools.utils.shapes.extract_geojson_properties(geo_list: list, reference_object: str | None = None, ignore_list: list | None = None, nested_depth: int = 4) list
Extract property names and info for provided list of GeoJSON structures.
- Parameters:
geo_list (list) – List of GeoJSON dictionaries containing properties
reference_object (Union[str,None], optional) – File path to reference object containing more information for each structure, defaults to None
ignore_list (Union[list,None], optional) – List of properties to hide from the main view, defaults to None
nested_depth (int, optional) – For properties stored as nested dictionaries, specify desired depth (depth of 2 = {'property_name': {'sub-prop1': val, etc.}}), defaults to 4
- Returns:
List of accessible properties in visualization session.
- Return type:
list
- fusion_tools.utils.shapes.extract_nested_prop(main_prop_dict: dict, depth: int, path: tuple = (), values_list: list = None, _seen: set = None)
Extract nested properties up to the given depth level.
- Parameters:
main_prop_dict (dict) – Main dictionary containing nested properties. ex: {‘main_prop’: {‘sub_prop1’: value1, ‘sub_prop2’: value2}}
depth (int) – Number of levels to extend into nested dictionary
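The idea of descending a fixed number of levels into a nested property dictionary can be sketched as follows. This is a minimal illustration, not the library's code: the helper flatten_nested is hypothetical, and it assumes that depth counts key levels, so a depth of 2 reaches {'main_prop': {'sub_prop1': value1}}.

```python
def flatten_nested(prop_dict, depth, path=()):
    """Hypothetical helper: list (path, value) pairs, descending at most `depth` key levels."""
    pairs = []
    for key, value in prop_dict.items():
        new_path = path + (key,)
        if isinstance(value, dict) and len(new_path) < depth:
            # Still above the depth limit: keep descending.
            pairs.extend(flatten_nested(value, depth, new_path))
        else:
            # Leaf value, or depth limit reached: record the value as-is.
            pairs.append((new_path, value))
    return pairs


props = {"main_prop": {"sub_prop1": 1.0, "sub_prop2": {"deep": 2.0}}, "area": 10}
two_levels = flatten_nested(props, depth=2)
one_level = flatten_nested(props, depth=1)
```

With depth=2 the dictionary under sub_prop2 is returned whole rather than expanded, because expanding it would require a third key level.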
- fusion_tools.utils.shapes.find_intersecting(geo_source: dict | str, geo_query: Polygon, return_props: bool = True, return_shapes: bool = True)
Return properties and/or shapes of features from geo_source that intersect with geo_query
- Parameters:
geo_source (Union[dict,str]) – Source GeoJSON where you are searching for intersecting features
geo_query (shapely.geometry.Polygon) – Query polygon used to filter source GeoJSON features
return_props (bool, optional) – Whether or not to return properties of intersecting features
return_shapes (bool, optional) – Whether or not to return shape information of intersecting features
- Returns:
Intersecting properties and/or shapes from geo_source
- Return type:
tuple
- fusion_tools.utils.shapes.find_nested_levels(nested_dict) int
Find number of levels for nested dictionary
- Parameters:
nested_dict (dict) – dictionary containing nested values
- Returns:
number of levels for nested dictionary
- Return type:
int
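Counting the levels of a nested dictionary is a short recursion. The sketch below shows one way to do it; the helper nested_levels is hypothetical, and the counting convention (a flat dictionary has 1 level, an empty one has 0) is an assumption rather than documented behavior of find_nested_levels.

```python
def nested_levels(nested_dict):
    """Hypothetical helper: number of key levels in a nested dictionary."""
    # Non-dict values and empty dicts contribute no further levels.
    if not isinstance(nested_dict, dict) or not nested_dict:
        return 0
    # One level for this dict plus the deepest level found among its values.
    return 1 + max(nested_levels(value) for value in nested_dict.values())


flat = {"area": 10, "name": "glomerulus"}
nested = {"main_prop": {"sub_prop1": {"deep": 1}}, "area": 10}
```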
- fusion_tools.utils.shapes.indices_to_path(indices)
From numpy array of coordinates to path string for adding to plotly figure layout
- fusion_tools.utils.shapes.load_aperio(xml_path: str) list
Loading Aperio formatted annotations
- Parameters:
xml_path (str) – Path to Aperio formatted annotations (XML)
- Returns:
GeoJSON FeatureCollection formatted annotations for each layer in XML
- Return type:
list
- fusion_tools.utils.shapes.load_geojson(geojson_path: str, name: str | None = None) dict
Load GeoJSON annotations from file path. Optionally add names for GeoJSON FeatureCollections
- Parameters:
geojson_path (str) – Path to GeoJSON file
name (Union[str,None], optional) – Name for structure present in FeatureCollection, defaults to None
- Returns:
GeoJSON FeatureCollection dictionary
- Return type:
dict
- fusion_tools.utils.shapes.load_histomics(json_path: str) list
Load large-image annotation from filepath
- Parameters:
json_path (str) – Path to large-image formatted annotations
- Returns:
GeoJSON FeatureCollection formatted annotation
- Return type:
list
- fusion_tools.utils.shapes.load_polygon_csv(csv_path: str, name: str, shape_cols: list | str, group_by_col: str | None, property_cols: str | list | None, shape_options: dict) dict
Load CSV formatted annotations from file path
- Parameters:
shape_cols (Union[list,str]) – Names of the columns holding the x and then the y coordinates
group_by_col (Union[str,None]) – Name of column to group features by (used to determine which coordinates belong to the same structure for non-point annotations)
property_cols (Union[str,list,None]) – List of columns containing properties
shape_options (dict) – Dictionary with “radius” for point annotations (can be a number or a column that holds a number)
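The role of the grouping column can be illustrated with a small standard-library sketch: rows that share a value in that column are collected into one polygon. The helper rows_to_polygons and the column names in the example are hypothetical, and the code is not the implementation of load_polygon_csv.

```python
import csv
import io
from collections import defaultdict


def rows_to_polygons(csv_text, x_col, y_col, group_col):
    """Hypothetical helper: group CSV rows by group_col into GeoJSON Polygon features."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        grouped[row[group_col]].append([float(row[x_col]), float(row[y_col])])

    features = []
    for group_id, ring in grouped.items():
        if ring[0] != ring[-1]:
            ring = ring + [ring[0]]  # GeoJSON linear rings must be closed
        features.append({
            "type": "Feature",
            "geometry": {"type": "Polygon", "coordinates": [ring]},
            "properties": {group_col: group_id},
        })
    return {"type": "FeatureCollection", "features": features}


example_csv = "x,y,structure_id\n0,0,a\n4,0,a\n4,3,a\n10,10,b\n12,10,b\n12,14,b\n"
polygons = rows_to_polygons(example_csv, "x", "y", "structure_id")
```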
- fusion_tools.utils.shapes.load_visium(visium_path: str, include_var_names: list = [], include_obs: list = [], mpp: float | None = None, scale_factor: float | str | None = None, verbose: bool = True)
Loading 10x Visium Spot annotations from an h5ad file or csv file containing spot center coordinates. Adds any of the variables listed in include_var_names and also the barcodes associated with each spot (if the path is an h5ad file).
- Parameters:
visium_path (str) – Path to the h5ad (anndata) formatted Visium data or csv file containing “imagerow” and “imagecol” columns
include_var_names (list, optional) – List of additional variables to add to the generated annotations (barcode is added by default), defaults to []
mpp (Union[float,None], optional) – If the Microns Per Pixel (MPP) is known for this image, pass it here to save time calculating spot diameter, defaults to None
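When the MPP is known, converting a physical spot size to pixels is a single division. The sketch below assumes the nominal 55 µm spot diameter of standard Visium slides; the constant and the helper spot_diameter_pixels are hypothetical names, and how load_visium derives the diameter internally is not shown here.

```python
# Assumption: nominal spot diameter of a standard 10x Visium slide, in microns.
VISIUM_SPOT_DIAMETER_UM = 55.0


def spot_diameter_pixels(mpp):
    """Hypothetical helper: spot diameter in pixels for a given microns-per-pixel value."""
    if mpp <= 0:
        raise ValueError("mpp must be a positive number of microns per pixel")
    return VISIUM_SPOT_DIAMETER_UM / mpp


diameter_at_half_micron = spot_diameter_pixels(0.5)
```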
- fusion_tools.utils.shapes.load_visiumhd(visiumhd_path: str, resolution_level: int, include_analysis_path: str | list | None = None, include_analysis_name: str | list | None = None, verbose: bool = True)
Generating annotations for a VisiumHD dataset
- Parameters:
visiumhd_path (str) – Path to “binned_outputs”
resolution_level (int) – Number representing the length of one side of the square
include_analysis_path (Union[str,list,None], optional) – Path(s) to analyses performed on these ROIs. Each can be either an output of spaceranger or any csv file with a “barcode” column to be used for alignment, defaults to None
include_analysis_name (Union[str,list,None], optional) – Name to use for each included analysis. If none are provided, the name is inferred from {path}.split(os.sep)[-2], defaults to None
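The name inference described for include_analysis_name takes the second-to-last path component, which is the folder that contains the analysis file. A minimal sketch, using a made-up path and a hypothetical helper name:

```python
import os


def infer_analysis_name(path):
    """Hypothetical helper: name an analysis after the folder that contains its file."""
    return path.split(os.sep)[-2]


# Built with os.sep so that the example behaves the same on every platform.
example_path = os.sep.join(["outs", "analysis", "clustering", "clusters.csv"])
inferred_name = infer_analysis_name(example_path)
```

Two analysis files stored in the same folder would receive the same inferred name, which is one reason to pass explicit names when including several analyses.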
- fusion_tools.utils.shapes.path_to_indices(path)
From SVG path to numpy array of coordinates, each row being a (row, col) point
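The path strings involved here follow the SVG convention of a move command (M), line commands (L), and a closing Z. The pure-Python sketch below shows a round trip between coordinate pairs and such a string. Both helper names are hypothetical, the pairs are kept in the order in which they appear in the path, and the code returns plain tuples rather than a numpy array, so it illustrates the format and is not the library's implementation.

```python
def coords_to_path(coords):
    """Hypothetical helper: build a closed 'M..L..Z' SVG path from coordinate pairs."""
    first, *rest = coords
    path = f"M{first[0]},{first[1]}"
    for a, b in rest:
        path += f"L{a},{b}"
    return path + "Z"


def path_to_coords(path):
    """Hypothetical helper: parse an 'M..L..Z' path into rounded integer pairs."""
    body = path.replace("M", "").replace("Z", "")
    return [
        tuple(round(float(value)) for value in point.split(","))
        for point in body.split("L")
    ]


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
square_path = coords_to_path(square)
recovered = path_to_coords(square_path)
```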
- fusion_tools.utils.shapes.path_to_mask(path, shape)
From SVG path to a boolean array where all pixels enclosed by the path are True, and the other pixels are False.
- fusion_tools.utils.shapes.process_filters_queries(filter_list: list, spatial_list: list, structures: list, all_geo_list: list)
Filter GeoJSON list based on lists of both spatial and property filters.
- Parameters:
filter_list (list) – List of property filters (keys = name: “name of property”, range: “either a list of categorical values or min-max for quantitative”)
spatial_list (list) – List of spatial filters (keys= type: “predicate”, structure: “name of structure that is basis of predicate”)
structures (list) – List of included structures in final GeoJSON
all_geo_list (list) – List of GeoJSON objects to search
- Returns:
Filtered GeoJSON in which all included structures are combined into one FeatureCollection, and a reference list containing the original structure name and index of each feature
- Return type:
tuple
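A property filter of the form described above (a name plus either a list of categorical values or a min-max pair) can be applied to a feature's properties as sketched below. The helper passes_property_filter is hypothetical and makes an assumption that the source does not state: numeric property values are always paired with a two-element [min, max] range.

```python
def passes_property_filter(properties, prop_filter):
    """Hypothetical helper: apply one {'name': ..., 'range': ...} filter to a property dict."""
    value = properties.get(prop_filter["name"])
    if value is None:
        return False  # features lacking the property are excluded
    allowed = prop_filter["range"]
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        low, high = allowed  # quantitative: [min, max], assumed inclusive
        return low <= value <= high
    return value in allowed  # categorical: list of accepted values


features = [
    {"type": "Feature", "geometry": None, "properties": {"Area": 120.0, "Label": "normal"}},
    {"type": "Feature", "geometry": None, "properties": {"Area": 40.0, "Label": "sclerotic"}},
    {"type": "Feature", "geometry": None, "properties": {"Label": "normal"}},
]
area_filter = {"name": "Area", "range": [100, 200]}
label_filter = {"name": "Label", "range": ["normal"]}

# Keep only the features that satisfy every filter in the list.
kept = [
    f for f in features
    if all(passes_property_filter(f["properties"], flt) for flt in (area_filter, label_filter))
]
```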
- fusion_tools.utils.shapes.spatially_aggregate(child_geo: dict, parent_geos: list, separate: bool = True, summarize: bool = True, ignore_list: list = ['_id', '_index'])
Aggregate intersecting feature properties to a provided GeoJSON
- Parameters:
child_geo (dict) – GeoJSON object that is receiving aggregated properties
parent_geos (list) – List of GeoJSON objects which are intersecting with child_geo
- Returns:
Updated child_geo object with new properties from intersecting parent_geos
- Return type:
dict
Image Utilities
Image utility functions include methods which read images/regions of images based on user inputs.
Omics Utilities
Omics utilities include methods which query external APIs to derive further information from Omics data queries as well as some utility functions for enforcing cell type hierarchies.
Utility functions for handling -omics data
- fusion_tools.utils.omics.get_asct(id: str | int)
Get Anatomical Structure & Cell Type associated with a given HGNC Id.
- fusion_tools.utils.omics.get_cell(id: str)
Get all the cell types available within a given anatomical structure. Input has to be an UBERON id “UBERON_######…”
- fusion_tools.utils.omics.get_gene_info(id: str | list, species: str = 'human', fields: list = ['HGNC', 'alias', 'summary'], size: int = 5)
Get information about a given gene id or list of ids. By default returns HGNC, alias, and summary. Can be expanded to include go, pubmed articles, etc.
- fusion_tools.utils.omics.group_subtypes(geo_props: dict, name: str, key: dict, keep_zeros: bool = True, normalize: bool = True) dict
Grouping together properties into a lower-level descriptor
- Parameters:
geo_props (dict) – Property dict for a single Feature
name (str) – Name of property containing “sub-properties” to be aggregated
key (dict) – Dictionary containing keys and values pertaining to sub-properties to be aggregated to each key
keep_zeros (bool, optional) – Whether or not to keep aggregated properties which sum to zero, defaults to True
normalize (bool, optional) – Whether to normalize this set of keys to sum to 1 or keep as sums, defaults to True
- Returns:
Updated property dictionary containing lower-level descriptor and keys
- Return type:
dict
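The aggregation controlled by key, keep_zeros, and normalize can be sketched with plain dictionaries. This is an illustration under assumptions, not the implementation of group_subtypes: the helper group_sub_properties is hypothetical, key is assumed to map each group name to a list of sub-property names, and the cell type labels are made up for the example.

```python
def group_sub_properties(geo_props, name, key, keep_zeros=True, normalize=True):
    """Hypothetical helper: sum sub-properties of geo_props[name] into the groups in `key`."""
    sub_props = geo_props.get(name, {})
    grouped = {
        group: sum(sub_props.get(member, 0) for member in members)
        for group, members in key.items()
    }
    if not keep_zeros:
        grouped = {group: total for group, total in grouped.items() if total != 0}
    overall = sum(grouped.values())
    if normalize and overall > 0:
        # Rescale so that the grouped values sum to 1.
        grouped = {group: total / overall for group, total in grouped.items()}
    return grouped


feature_props = {"Cell Subtypes": {"PT-S1": 2, "PT-S2": 2, "TAL": 4}}
grouping = {"PT": ["PT-S1", "PT-S2"], "TAL": ["TAL"], "Immune": ["MAC"]}

normalized = group_sub_properties(feature_props, "Cell Subtypes", grouping)
raw_sums = group_sub_properties(
    feature_props, "Cell Subtypes", grouping, keep_zeros=False, normalize=False
)
```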
- fusion_tools.utils.omics.selective_aggregation(child_geo: dict, parent_geo: dict, include_keys: dict = {}, aggregate_dropped: bool = True, dropped_name: str = 'undefined', re_normalize: bool = True)
Selectively aggregate different fields for each intersecting structure. Useful for only aggregating cell types which should be found within a specific structure
- Parameters:
child_geo (dict) – Child structure to be receiving aggregated properties
parent_geo (dict) – Parent structure that contains properties that are going to be aggregated within the child geos.
include_keys (dict, optional) – Dictionary containing keys for each property that is modified and a list of values which should be included for that key, defaults to {}
aggregate_dropped (bool, optional) – Whether or not to include dropped values in the final aggregation, defaults to True
dropped_name (str, optional) – Name to use for dropped values that are aggregated. Ignored if aggregate_dropped=False, defaults to 'undefined'
re_normalize (bool, optional) – Whether or not to re-normalize values after dropping keys, defaults to True
- Returns:
Child structure with selectively aggregated properties from the parent structure.
- Return type:
dict
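The interaction between aggregate_dropped, dropped_name, and re_normalize can be illustrated on a single property dictionary. The sketch below is a simplified stand-in that leaves out the spatial intersection step entirely; the helper keep_selected and the cell type labels are hypothetical.

```python
def keep_selected(values, include, aggregate_dropped=True,
                  dropped_name="undefined", re_normalize=True):
    """Hypothetical helper: keep only the `include` keys of `values`, pooling the rest."""
    kept = {k: v for k, v in values.items() if k in include}
    dropped_total = sum(v for k, v in values.items() if k not in include)
    if aggregate_dropped:
        # Pool everything that was dropped under a single placeholder name.
        kept[dropped_name] = dropped_total
    total = sum(kept.values())
    if re_normalize and total > 0:
        kept = {k: v / total for k, v in kept.items()}
    return kept


cell_counts = {"POD": 3, "EC": 1, "PT": 4}
expected_in_structure = ["POD", "EC"]

with_dropped = keep_selected(cell_counts, expected_in_structure)
without_dropped = keep_selected(cell_counts, expected_in_structure, aggregate_dropped=False)
```

Pooling the dropped values preserves how much of the original signal fell outside the expected cell types, while discarding them and re-normalizing redistributes the full weight over the expected types only.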