API
Clustering
- constclust.cluster(adata, n_neighbors, resolutions, random_state, n_procs=1, neighbor_kwargs={}, leiden_kwargs={}, progress_bar=True)
Generate clusterings for each combination of n_neighbors, resolutions, and random_state.
- Parameters
- adata : AnnData
AnnData object to be clustered.
- n_neighbors : Collection[int]
Values for numbers of neighbors.
- resolutions : Collection[float]
Values for the resolution parameter for modularity optimization.
- random_state : Collection[int]
Random seeds to start with.
- n_procs : int (default: 1)
Number of processes to use.
- neighbor_kwargs : dict (default: {})
Keyword arguments to pass to all calls to scanpy.pp.neighbors(). For example: {"use_rep": "X"}.
- leiden_kwargs : dict (default: {})
Keyword arguments to pass to all calls to leidenalg.find_partition(). For example: {"partition_type": leidenalg.CPMVertexPartition}.
- progress_bar : bool (default: True)
Whether to display a progress bar for the clustering process.
- Return type
- Returns
Pair of dataframes, where the first contains the settings for each partitioning,
and the second contains the partitionings.
Example
>>> params, clusterings = cluster(
...     adata,
...     n_neighbors=np.linspace(15, 90, 4, dtype=int),
...     resolutions=np.geomspace(0.05, 20, 50),
...     random_state=[0, 1, 2, 3],
...     n_procs=4
... )
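The number of clusterings grows as the product of the three collections. The following is a minimal standard-library sketch of how such a parameter grid can be enumerated. The `settings` list of dicts is illustrative only and is not constclust's internal representation; the resolution values are shortened for readability.

```python
from itertools import product

# Illustrative values; n_neighbors matches np.linspace(15, 90, 4, dtype=int).
n_neighbors = [15, 40, 65, 90]
resolutions = [0.05, 0.5, 5.0]
random_state = [0, 1, 2, 3]

# One settings record per (n_neighbors, resolution, random_state) combination.
# The dict layout is an assumption made for this sketch.
settings = [
    {"n_neighbors": n, "resolution": r, "random_state": s}
    for n, r, s in product(n_neighbors, resolutions, random_state)
]

print(len(settings))  # 4 * 3 * 4 = 48 clusterings
```

With the 50 resolutions used in the example, the same grid would contain 4 * 50 * 4 = 800 clusterings, which is why `n_procs` may be worth raising.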
Reconciling
- constclust.reconcile(settings, clusterings, paramtypes='oou', nprocs=1)
Reconcile clusterings and parameters into a graph of clusters.
- Parameters
Example
>>> params, clusterings = cluster(adata, ...)
>>> reconciler = reconcile(params, clusterings)
- Return type
- class constclust.aggregate.Component(reconciler, cluster_ids)
A connected component from a Reconciler.
- _parent
The Reconciler which generated this component.
- Type
- settings
Subset of the parent's settings. Contains only settings for clusterings which appear in this component.
- Type
- cluster_ids
Which clusters are in this component.
- Type
- intersect
Intersection of samples in this component.
- Type
- intersect_names
Names of samples in the intersection of this component.
- Type
- union
Union of samples in this component.
- Type
- union_names
Names of samples in the union of this component.
- Type
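To make the intersect and union attributes concrete, here is a minimal sketch using plain Python sets. The cluster ids, sample positions, and names are hypothetical, and the real attributes may be stored as arrays rather than sets.

```python
# Hypothetical clusters belonging to one component, keyed by cluster id.
# Each value is the set of integer sample positions in that cluster.
clusters = {
    101: {0, 1, 2, 3},
    205: {1, 2, 3, 4},
    309: {1, 2, 3},
}
# Hypothetical observation names, indexed by integer position.
obs_names = ["cell_a", "cell_b", "cell_c", "cell_d", "cell_e"]

# Samples present in every cluster of the component.
intersect = set.intersection(*clusters.values())
# Samples present in at least one cluster of the component.
union = set.union(*clusters.values())

# Translate integer positions back to names.
intersect_names = [obs_names[i] for i in sorted(intersect)]
union_names = [obs_names[i] for i in sorted(union)]

print(intersect_names)  # ['cell_b', 'cell_c', 'cell_d']
print(union_names)      # ['cell_a', 'cell_b', 'cell_c', 'cell_d', 'cell_e']
```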
- class constclust.aggregate.ComponentList(components)
A set of consistent components identified from many clustering solutions.
This is considered to be an immutable list, so the results of operations will be cached.
- property obs_names: pandas.core.indexes.base.Index
The set of observations these components were found on.
- to_graph(overlap='intersect')
Builds a hierarchical graph of the components.
- Return type
DiGraph
- describe()
Calculates summary statistics for components.
Example
>>> stats = comp_list.describe()
- filter(func=None, *, min_intersect=None, max_intersect=None, min_union=None, max_union=None, min_solutions=None, max_solutions=None)
Filter components from this collection, returns a copy.
Example
>>> to_examine = comp_list.filter(min_intersect=20, min_solutions=100)
- Return type
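The filtering pattern can be sketched as follows. `ToyComponent` and `filter_components` are hypothetical stand-ins written for this illustration, not constclust classes or functions, and treating the thresholds as inclusive is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyComponent:
    """Hypothetical stand-in for a component; not constclust's Component."""
    intersect_size: int
    union_size: int
    n_solutions: int


def filter_components(components, *, min_intersect=None, min_solutions=None):
    """Return a new list of components meeting every threshold that was given.

    Thresholds left as None are ignored. Bounds are inclusive (an assumption).
    """
    kept = []
    for comp in components:
        if min_intersect is not None and comp.intersect_size < min_intersect:
            continue
        if min_solutions is not None and comp.n_solutions < min_solutions:
            continue
        kept.append(comp)
    return kept


comps = [
    ToyComponent(intersect_size=50, union_size=80, n_solutions=300),
    ToyComponent(intersect_size=10, union_size=400, n_solutions=500),
    ToyComponent(intersect_size=25, union_size=30, n_solutions=40),
]
# Mirrors the thresholds in the example above; the input list is left unchanged.
to_examine = filter_components(comps, min_intersect=20, min_solutions=100)
```

Only the first toy component passes both thresholds: the second has too small an intersection and the third appears in too few solutions.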
- plot_components(adata, *, x_param='n_neighbors', y_param='resolution', embedding_basis='X_umap', embedding_kwargs=mappingproxy({}))
Plot parameter space and scatter plot for each component.
The parameter space is a heatmap, showing the range of parameters each component was found in. The scatter plot shows which observations were included in the component in a 2D embedding of the dataset.
- Parameters
- x_param : str (default: 'n_neighbors')
Which key from the parameters will be along the x-axis of the heatmaps.
- y_param : str (default: 'resolution')
Which key from the parameters will be along the y-axis of the heatmaps.
- embedding_basis : str (default: 'X_umap')
Basis from adata to use for embedding plot.
- embedding_kwargs : Mapping (default: mappingproxy({}))
Keyword arguments to pass to sc.pl.embedding.
Example
>>> comps.plot_components(adata)
- plot_hierarchies(coords, *, overlap='intersect', scatter_kwargs=mappingproxy({}))
Find and plot interactive hierarchies of components.
- Parameters
Example
>>> from bokeh.io import show
>>> comps = reconciler.get_components(0.9, min_cells=5)
>>> show(
...     comps
...     .filter(min_solutions=100)
...     .plot_hierarchies(coords=adata.obsm["X_umap"])
... )
- class constclust.aggregate.ReconcilerBase
Base type for reconciler.
Implements the subsetting methods; providing the data is up to the subclass.
- property obs_names: pandas.core.indexes.base.Index
The set of observations clusters were found on.
- get_param_range(clusters)
Given a set of clusters, returns the range of parameters for which they were calculated.
- Parameters
- clusters : Collection[int]
If it is a collection of ints, it is treated as a range of parameter ids.
- subset_clusterings(clusterings_to_keep)
Take subset of Reconciler, where only clusterings_to_keep are present.
Reduces size of both .settings and .clusterings.
- Parameters
- clusterings_to_keep
Indexer into Reconciler.settings. Anything that should give the correct result for reconciler.settings.loc[clusterings_to_keep].
- Returns
- Return type
ReconcilerSubset
- subset_cells(cells_to_keep)
Take subset of Reconciler, where only cells_to_keep are present.
- Parameters
- cells_to_keep
Indexer into Reconciler.clusterings. Anything that should give the correct result for reconciler.clusterings.loc[cells_to_keep].
- Returns
- Return type
ReconcilerSubset
- describe_clusters(log1p=False)
Describe the clusters in this Reconciler.
- Parameters
- Return type
- Returns
DataFrame containing summary statistics on the clusters in this reconciler. Good
for plotting.
Example
>>> import hvplot.pandas
>>> clusters = reconciler.describe_clusters(log1p=True)
>>> clusters.hvplot.scatter(
...     "log1p_resolution",
...     "log1p_n_obs",
...     datashade=True,
...     dynspread=True
... )
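The column names in the example suggest that `log1p=True` adds log1p-transformed versions of numeric columns. That reading is an assumption, since the parameter is not documented here. The transform itself can be sketched with the standard library; the rows and field names below are hypothetical.

```python
import math

# Hypothetical per-cluster summary rows; field names are illustrative.
rows = [
    {"resolution": 0.05, "n_obs": 1200},
    {"resolution": 1.0, "n_obs": 85},
    {"resolution": 20.0, "n_obs": 3},
]

# log1p(x) = log(1 + x): finite at x = 0 and compresses large values,
# which helps when cluster sizes span several orders of magnitude.
for row in rows:
    row["log1p_resolution"] = math.log1p(row["resolution"])
    row["log1p_n_obs"] = math.log1p(row["n_obs"])
```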
- class constclust.aggregate.ReconcilerSubset(parent, settings, clusterings, mapping, graph)
Subset of a Reconciler.
- _parent
Reconciler this subset was derived from.
- Type
- settings
Settings for clusterings in this subset.
- Type
- clusterings
Clusterings contained in this subset.
- Type
- graph
Reference to graph from parent.
- Type
igraph.Graph
- _mapping
pd.Series with a MultiIndex. Unlike the _mapping from Reconciler, this does not necessarily have all clusters, so ranges of clusters cannot be assumed to be contiguous. Additionally, you can’t just index into this with cluster_ids as positions.
- Type
- _obs_names
Maps from integer position to input cell name.
- Type
pd.Series
- find_contained_components(min_presence, min_weight=0.9, min_cells=2)
Find components contained in a subset.
- class constclust.aggregate.Reconciler(settings, clusterings, mapping, graph)
Collects and reconciles many clusterings by local (in parameter space) stability.
- settings
Contains settings for all clusterings. Index corresponds to .clusterings columns, while columns should correspond to the parameters which were varied.
- Type
- clusterings
Contains cluster assignments for each cell, for each clustering. Columns correspond to .settings index, while the index corresponds to the cells. Each cluster is encoded with a unique cluster id.
- Type
- graph
Weighted graph. Nodes are clusters (identified by unique cluster id integer, same as in .clusterings). Edges connect clusters with shared contents. Weight is the Jaccard similarity between the contents of the clusters.
- Type
igraph.Graph
- cluster_ids
Integer ids of all clusters in this Reconciler.
- Type
- _obs_names
Ordered set for names of the cells. Internally they are referred to by integer positions.
- Type
- _mapping
pd.Series with a MultiIndex. Index has levels clustering and cluster. Each position in index should have a unique value at level “cluster”, which corresponds to a cluster in the clustering dataframe. Values are np.arrays with indices of cells in relevant cluster. This should be considered immutable, though this is not the case for ReconcilerSubsets.
- Type
- find_components(min_weight, clusters, min_cells=2)
Return components from filtered graph which contain specified clusters.
- Parameters
- min_weight : float
Minimum weight for edges to be kept in graph. Should be over 0.5.
- clusters : np.array[int]
Clusters which you’d like to search from.
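The idea behind the weighted graph and find_components can be sketched with the standard library: weight edges by Jaccard similarity, drop edges below min_weight, then collect the connected components reachable from the requested clusters. This is a simplified illustration, not constclust's implementation. The helper names and cluster contents are hypothetical, every pair of overlapping clusters is connected, the threshold is treated as inclusive, and min_cells is omitted.

```python
from collections import defaultdict, deque
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity: size of the intersection over size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def components_containing(edges, min_weight, clusters):
    """Connected components of the weight-filtered graph that contain a seed."""
    # Keep only edges at or above the threshold (inclusive bound is an assumption).
    adjacency = defaultdict(set)
    for u, v, w in edges:
        if w >= min_weight:
            adjacency[u].add(v)
            adjacency[v].add(u)

    seen, found = set(), []
    for seed in clusters:
        if seed in seen:
            continue
        # Breadth-first search from the seed over the filtered graph.
        component, queue = {seed}, deque([seed])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in component:
                    component.add(neighbor)
                    queue.append(neighbor)
        seen |= component
        found.append(sorted(component))
    return found


# Hypothetical cluster contents keyed by cluster id.
contents = {0: {0, 1, 2, 3}, 1: {0, 1, 2, 3}, 2: {0, 1, 2, 3, 4, 5}, 3: {4, 5}}

# Edges connect clusters with shared contents, weighted by Jaccard similarity.
edges = [
    (u, v, jaccard(contents[u], contents[v]))
    for u, v in combinations(sorted(contents), 2)
    if contents[u] & contents[v]
]

result = components_containing(edges, min_weight=0.9, clusters=[0, 3])
print(result)  # [[0, 1], [3]]
```

Lowering min_weight to 0.6 keeps the 4/6 edges between cluster 2 and clusters 0 and 1, so the first component grows to `[0, 1, 2]` while cluster 3 stays isolated.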
Plotting
- constclust.plotting.component_param_range(component, x='n_neighbors', y='resolution', ax=None)
Given a component, show which parameters it’s found at as a heatmap.
- Parameters
- component : ForwardRef
The component to plot.
- x : str (default: 'n_neighbors')
The parameter for the x axis.
- y : str (default: 'resolution')
The parameter to place on the y axis.
- ax : Optional[Axis] (default: None)
Optional axis to plot on.
Example
>>> comps = reconciler.get_comps(0.9)
>>> plotting.component_param_range(comps[0])
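The data behind such a heatmap can be thought of as a count of solutions per (x, y) parameter cell. A minimal sketch follows, with hypothetical settings records. How constclust actually aggregates over the remaining parameters, such as random_state, is an assumption here.

```python
from collections import Counter

# Hypothetical settings of the clusterings in which one component was found.
found_at = [
    {"n_neighbors": 15, "resolution": 0.5, "random_state": 0},
    {"n_neighbors": 15, "resolution": 0.5, "random_state": 1},
    {"n_neighbors": 40, "resolution": 0.5, "random_state": 0},
    {"n_neighbors": 40, "resolution": 1.0, "random_state": 2},
]

# Count solutions per (x, y) cell; other parameters are collapsed.
counts = Counter((s["n_neighbors"], s["resolution"]) for s in found_at)

xs = sorted({s["n_neighbors"] for s in found_at})
ys = sorted({s["resolution"] for s in found_at})

# Rows follow y values, columns follow x values; absent cells count as 0.
grid = [[counts[(x, y)] for x in xs] for y in ys]
print(grid)  # [[2, 1], [0, 1]]
```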
- constclust.plotting.component(component, adata, x='n_neighbors', y='resolution', embedding_basis='X_umap', plot_global=False, aspect=None, embedding_kwargs={})
Plot stability and embedding for component.
- Parameters
- component : ForwardRef
Component object to plot.
- adata : AnnData
AnnData to use for plotting UMAP. Should have the same cell names as the Component's parent Reconciler.
- x : str (default: 'n_neighbors')
Parameter to plot on the X-axis of the heatmap.
- y : str (default: 'resolution')
Parameter to plot on the Y-axis of the heatmap.
- embedding_basis : str (default: 'X_umap')
Which basis from the AnnData object to use for embedding.
- aspect : Optional[float] (default: None)
Aspect ratio of entire plot. Defaults to 1/2.
- embedding_kwargs : dict (default: {})
Arguments passed to sc.pl.embedding.