arcfish.tl.ABCaller

class arcfish.tl.ABCaller(min_cpmt_size: float, ref_genome: str | None = None, centromere: bool = True, cutoff: float | None = None, sigma: float | None = 1, method: Literal['axes', 'pca'] = 'axes')

Call A/B compartments from multiplexed imaging data using PCA.

Parameters:
  • min_cpmt_size (float) – Minimum compartment size in bp.

  • ref_genome (str, optional) –

    Reference genome assembly ID used to assign A/B compartment labels to the clustering result, by default None.

    If None, the average pairwise distance is used to assign A/B compartments instead (the group with the smaller distance is A, the group with the larger distance is B). Passing a reference genome string is highly recommended.

    Available assembly IDs are listed in the UCSC Genome Browser. Common assembly IDs are: “hg19”, “hg38”, “mm10”.

  • centromere (bool, optional) –

    If True, use the centromere position to split the chromosome into two arms and call A/B compartments on each arm separately, by default True. If False, call A/B compartments on the whole chromosome.

    If ref_genome is None, this parameter is ignored.

  • cutoff (float, optional) – Required only if method is “pca”. Distances below cutoff are treated as contacts, by default None.

  • sigma (float, optional) – Required only if method is “pca”. Gaussian kernel size, by default 1.

  • method (Literal["axes", "pca"], optional) – A/B compartment calling algorithm used, by default “axes”.
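The fallback rule for ref_genome=None (smaller average pairwise distance is A, larger is B) can be illustrated with a small standalone sketch. This is not the library's implementation: the helper name assign_ab_by_distance, the toy distance matrix, and the reading of "average pairwise distance" as the mean distance between loci within each cluster are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def assign_ab_by_distance(dist, clusters):
    """Relabel two arbitrary clusters so that 0 = A and 1 = B.

    Hypothetical helper: the cluster whose loci are closer together on
    average (smaller mean within-cluster distance) is labelled A.
    Expects exactly two distinct cluster labels.
    """
    def mean_within(label):
        idx = [i for i, c in enumerate(clusters) if c == label]
        pairs = [dist[i][j] for i, j in combinations(idx, 2)]
        # A cluster with a single locus has no pairs; treat it as B.
        return mean(pairs) if pairs else float("inf")

    first, second = sorted(set(clusters))
    a_label = first if mean_within(first) <= mean_within(second) else second
    return [0 if c == a_label else 1 for c in clusters]


# Toy symmetric distance matrix for four loci (made-up values).
dist = [
    [0.0, 0.2, 0.9, 1.0],
    [0.2, 0.0, 0.8, 0.9],
    [0.9, 0.8, 0.0, 0.6],
    [1.0, 0.9, 0.6, 0.0],
]
clusters = [1, 1, 0, 0]  # arbitrary labels from some clustering step
cpmt = assign_ab_by_distance(dist, clusters)
```

Here loci 0 and 1 form the more compact cluster, so they receive code 0 (A) regardless of the label the clustering step happened to give them.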

__init__(min_cpmt_size: float, ref_genome: str | None = None, centromere: bool = True, cutoff: float | None = None, sigma: float | None = 1, method: Literal['axes', 'pca'] = 'axes')

Methods

__init__(min_cpmt_size[, ref_genome, ...])

by_axes_pc(adata)

Call A/B compartments by weighting the 2nd PC from different axes.

by_first_pc(adata)

Call A/B compartments using the first principal component (PC).

call_cpmt(adata)

Call A/B compartments from adata.

Attributes

tss

Transcript start sites (TSS) of the reference genome.

property tss: DataFrame | None

Transcript start sites (TSS) of the reference genome.

call_cpmt(adata: AnnData) → DataFrame

Call A/B compartments from adata.

Parameters:

adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

Returns:

A dataframe with length equal to the number of loci. The column cpmt holds the A/B compartment assignments: 0 indicates the A compartment and 1 indicates the B compartment.

Return type:

pd.DataFrame
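As a rough illustration of reading the cpmt codes (0 = A, 1 = B), the sketch below turns a sequence of codes into contiguous compartment runs. It works on a plain list so that it stays self-contained; with a real result one would presumably pass something like df["cpmt"].tolist(). The helper name compartment_runs and the example codes are hypothetical.

```python
from itertools import groupby

# Mapping stated in the documentation: 0 is A, 1 is B.
CPMT_LABELS = {0: "A", 1: "B"}


def compartment_runs(cpmt):
    """Collapse per-locus codes into (label, first_index, last_index) runs."""
    runs = []
    start = 0
    for code, group in groupby(cpmt):
        length = len(list(group))
        runs.append((CPMT_LABELS[code], start, start + length - 1))
        start += length
    return runs


# Made-up codes for six consecutive loci.
runs = compartment_runs([0, 0, 0, 1, 1, 0])
```

Each tuple gives the compartment label and the inclusive range of locus positions it covers, which is a convenient form for checking run lengths against a minimum compartment size.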

by_axes_pc(adata: AnnData) → DataFrame

Call A/B compartments by weighting the 2nd PC from different axes.

Parameters:

adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

Returns:

A dataframe with length equal to the number of loci. The column cpmt holds the A/B compartment assignments: 0 indicates the A compartment and 1 indicates the B compartment.

Return type:

pd.DataFrame

by_first_pc(adata: AnnData) → DataFrame

Call A/B compartments using the first principal component (PC). Adapted from Su, J.-H., Zheng, P., Kinrot, S. S., Bintu, B. & Zhuang, X. Genome-Scale Imaging of the 3D Organization and Transcriptional Activity of Chromatin. Cell 182, 1641-1659.e26 (2020).

Parameters:

adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

Returns:

A dataframe with length equal to the number of loci. The column cpmt holds the A/B compartment assignments: 0 indicates the A compartment and 1 indicates the B compartment.

Return type:

pd.DataFrame
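The general idea of splitting loci by the sign of a first principal component can be sketched in pure Python as follows. This is a generic illustration only, not the procedure used by by_first_pc or by the cited paper: it skips the Gaussian smoothing and any normalization, starts from a made-up contact-frequency matrix, and finds the leading component by simple power iteration. The function name first_pc_split is hypothetical, and the two groups it returns have arbitrary orientation (which one is A still has to be decided separately).

```python
import math


def first_pc_split(matrix, n_iter=200):
    """Split loci into two groups by the sign of the first PC loading."""
    n = len(matrix)
    # Center each column so the covariance is taken around the mean.
    col_means = [sum(row[j] for row in matrix) / n for j in range(n)]
    centered = [[row[j] - col_means[j] for j in range(n)] for row in matrix]
    # Covariance between columns (one column per locus).
    cov = [
        [sum(centered[k][i] * centered[k][j] for k in range(n)) / (n - 1)
         for j in range(n)]
        for i in range(n)
    ]
    # Power iteration converges to the leading eigenvector of cov.
    vec = [1.0 / (i + 1) for i in range(n)]
    for _ in range(n_iter):
        nxt = [sum(cov[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Group by sign; which group is A or B is not determined here.
    return [0 if x >= 0 else 1 for x in vec]


# Toy contact-frequency matrix: loci 0-1 and loci 2-3 contact each other often.
freq = [
    [1.0, 0.8, 0.1, 0.2],
    [0.8, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.7],
    [0.2, 0.1, 0.7, 1.0],
]
groups = first_pc_split(freq)
```

In this toy case loci 0 and 1 end up in one group and loci 2 and 3 in the other, mirroring the block structure of the matrix.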