arcfish.tl.ABCaller

class arcfish.tl.ABCaller(min_cpmt_size: float, ref_genome: str | None = None, centromere: bool = True, cutoff: float | None = None, sigma: float | None = 1, method: Literal['axes', 'pca'] = 'axes')

Call A/B compartments from multiplexed imaging data using PCA.

Parameters:

min_cpmt_size (float) – Minimum compartment size in bp.
ref_genome (str, optional) –
Reference genome assembly ID used to assign A/B compartment based on clustering result, by default None.

If None, use the average pairwise distance to assign A/B compartment (smaller distance is A and larger distance is B). It is highly recommended to pass in a reference genome string.

Available assembly IDs are listed in the UCSC Genome Browser. Common assembly IDs are: “hg19”, “hg38”, “mm10”.
centromere (bool, optional) –
If True, use centromere position to split the chromosome into two parts and call A/B compartments separately, by default True. If False, call A/B compartments for the whole chromosome.

If ref_genome is None, this parameter is ignored.
cutoff (float, optional) – Required only if method is “pca”. Distances below cutoff are defined as contacts, by default None.
sigma (float, optional) – Required only if method is “pca”. Gaussian kernel size, by default 1.
method (Literal["axes", "pca"], optional) – A/B compartment calling algorithm used, by default “axes”.
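The fallback rule for ref_genome=None (smaller average pairwise distance is A, larger is B) can be pictured with a small sketch. It assumes a precomputed matrix of mean pairwise distances between loci and a preliminary two-cluster split. The function name assign_ab_by_distance and the choice to average each locus's distance to all other loci are illustrative assumptions, not arcfish's actual implementation.

```python
import numpy as np


def assign_ab_by_distance(dist, clusters):
    """Relabel a two-cluster split so that 0 = A and 1 = B.

    dist: (n, n) symmetric matrix of mean pairwise distances between loci.
    clusters: length-n array of 0/1 cluster ids from a prior clustering step.
    Assumption (hypothetical rule): the cluster whose loci are, on average,
    closer to all other loci is labelled A.
    """
    dist = np.array(dist, dtype=float)  # copy so the input is not modified
    clusters = np.asarray(clusters)
    n = dist.shape[0]
    np.fill_diagonal(dist, 0.0)
    # Average distance from each locus to the n - 1 other loci.
    per_locus = dist.sum(axis=1) / (n - 1)
    mean0 = per_locus[clusters == 0].mean()
    mean1 = per_locus[clusters == 1].mean()
    # Keep the labels if cluster 0 already has the smaller distance, else flip.
    return clusters if mean0 <= mean1 else 1 - clusters
```

With a reference genome supplied, the documented behaviour uses that reference to orient the labels instead, which is why passing ref_genome is recommended.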

__init__(min_cpmt_size: float, ref_genome: str | None = None, centromere: bool = True, cutoff: float | None = None, sigma: float | None = 1, method: Literal['axes', 'pca'] = 'axes')

Methods

`__init__`(min_cpmt_size[, ref_genome, ...])
`by_axes_pc`(adata)	Call A/B compartments by weighting the 2nd PC from different axes.
`by_first_pc`(adata)	Call A/B compartments by the first principal component.
`call_cpmt`(adata)	Call A/B compartments from adata.

Attributes

`tss`	Transcription start sites (TSS) of the reference genome.

property tss: DataFrame | None

Transcription start sites (TSS) of the reference genome.

call_cpmt(adata: AnnData) → DataFrame

Call A/B compartments from adata.

Parameters: adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().
Returns: A dataframe with length equal to the number of loci. The column cpmt indicates A/B compartment assignments: 0 indicates the A compartment and 1 indicates the B compartment.
Return type: pd.DataFrame
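To make the 0/1 encoding of the cpmt column concrete, the sketch below collapses per-locus codes into contiguous A/B runs. It works on a plain sequence of integers so that it runs without arcfish. The helper name cpmt_runs and the example values are hypothetical, and in practice the codes might come from something like result["cpmt"].tolist() on the returned dataframe.

```python
from itertools import groupby

# Encoding as stated in the Returns description: 0 = A, 1 = B.
CPMT_LABELS = {0: "A", 1: "B"}


def cpmt_runs(cpmt):
    """Collapse per-locus 0/1 codes into (label, first_index, last_index) runs."""
    runs = []
    start = 0
    for code, group in groupby(cpmt):
        length = sum(1 for _ in group)
        runs.append((CPMT_LABELS[code], start, start + length - 1))
        start += length
    return runs


# Hypothetical per-locus calls for six consecutive loci.
example = [0, 0, 1, 1, 1, 0]
print(cpmt_runs(example))  # [('A', 0, 1), ('B', 2, 4), ('A', 5, 5)]
```

The indices here are row positions, not genomic coordinates. Mapping them back to bp would rely on the locus annotation stored alongside the calls.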

by_axes_pc(adata: AnnData) → DataFrame

Call A/B compartments by weighting the 2nd PC from different axes.

Parameters: adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().
Returns: A dataframe with length equal to the number of loci. The column cpmt indicates A/B compartment assignments: 0 indicates the A compartment and 1 indicates the B compartment.
Return type: pd.DataFrame

by_first_pc(adata: AnnData) → DataFrame

Call A/B compartments by the first principal component. Adapted from Su, J.-H., Zheng, P., Kinrot, S. S., Bintu, B. & Zhuang, X. Genome-Scale Imaging of the 3D Organization and Transcriptional Activity of Chromatin. Cell 182, 1641-1659.e26 (2020).

Parameters: adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().
Returns: A dataframe with length equal to the number of loci. The column cpmt indicates A/B compartment assignments: 0 indicates the A compartment and 1 indicates the B compartment.
Return type: pd.DataFrame
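As a rough illustration of contact-based PCA calling, the sketch below thresholds per-cell distances at a cutoff, averages them into a contact-frequency matrix, and splits loci by the sign of the first principal component of its correlation matrix. This is a simplified stand-in under stated assumptions, not the arcfish code or the exact procedure of Su et al.: the function name first_pc_compartments is hypothetical, the input is a plain coordinate array with no missing loci rather than an AnnData object, and the Gaussian smoothing controlled by sigma is omitted.

```python
import numpy as np


def first_pc_compartments(coords, cutoff):
    """Split loci into two groups by the sign of the first PC.

    coords: (n_cells, n_loci, 3) array of 3D positions (assumed complete).
    cutoff: distance below which two loci count as being in contact.
    Returns a length-n_loci array of 0/1 codes; which group is A or B
    is left undecided here because the sign of a PC is arbitrary.
    """
    coords = np.asarray(coords, dtype=float)
    # Pairwise distances within each cell: (n_cells, n_loci, n_loci).
    diff = coords[:, :, None, :] - coords[:, None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Contact frequency: fraction of cells with distance below the cutoff.
    contact = (dist < cutoff).mean(axis=0)
    # Correlation between the contact profiles of every pair of loci.
    corr = np.corrcoef(contact)
    # eigh returns eigenvalues in ascending order, so the last column
    # is the eigenvector of the largest eigenvalue (the first PC).
    _, eigvecs = np.linalg.eigh(corr)
    pc1 = eigvecs[:, -1]
    return (pc1 > 0).astype(int)
```

Orienting the two groups as A and B would still need an external signal, such as the reference-genome annotation or the average-distance rule described for ref_genome.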