arcfish.tl.TADCaller

class arcfish.tl.TADCaller(fdr_cutoff: float = 0.1, window: float = 100000.0, tree: bool = True, min_tad_size: float = 0, prominence: float | None = None, distance: int | None = None, method: Literal['pval', 'insulation'] = 'pval')

Class for calling topologically associating domains (TADs) from the adata of a single chromosome.

Parameters:
  • fdr_cutoff (float, optional) – Boundary peaks with FDR below this cutoff are defined as TAD boundaries, by default 0.1.

  • window (float, optional) – Domain size in bp used to calculate intra- and inter-domain distances, by default 1e5.

  • tree (bool, optional) – Whether to return hierarchical TADs, by default True.

  • min_tad_size (float, optional) – The minimum TAD size allowed, by default 0.

  • prominence (float, optional) – Minimum height difference in the normalized insulation score for a locus to be defined as a peak. Required only if method is “insulation”, by default None. Passed to find_peaks().

  • distance (int, optional) – Minimum number of loci between two peaks. Required only if method is “insulation”, by default None. Passed to find_peaks().

  • method (Literal["pval", "insulation"], optional) – TAD calling algorithm used, by default “pval”.

__init__(fdr_cutoff: float = 0.1, window: float = 100000.0, tree: bool = True, min_tad_size: float = 0, prominence: float | None = None, distance: int | None = None, method: Literal['pval', 'insulation'] = 'pval')

Methods

  • __init__([fdr_cutoff, window, tree, ...])

  • by_insulation(adata) – Call TADs by insulation score.

  • by_pval(adata) – Call TADs by thresholding FDR values.

  • call_tads(adata) – Call TADs from adata.

  • to_bedpe(result) – Convert TAD calling result to a dataframe where each row is a TAD (a pair of boundaries instead of a single boundary).

Attributes

  • distance – Minimum distance between peaks.

  • fdr_cutoff – FDR cut-off for TAD boundaries.

  • method – TAD calling method used.

  • min_tad_size – Minimum TAD size.

  • prominence – Prominence of peaks.

  • tree – Call hierarchical TADs.

  • window – Size of the window used to compute inter/intra contacts.

property fdr_cutoff: float

FDR cut-off for TAD boundaries.

Type:

float

property window: int

Size of the window used to compute inter/intra contacts.

Type:

int

property tree: bool

Call hierarchical TADs.

Type:

bool

property min_tad_size: int

Minimum TAD size.

Type:

int

property prominence: float

Prominence of peaks.

Type:

float

property distance: int

Minimum distance between peaks.

Type:

int

property method: Literal['pval', 'insulation']

TAD calling method used.

Type:

str

call_tads(adata: AnnData) → DataFrame

Call TADs from adata.

Parameters:

adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

Returns:

A dataframe with one row per locus. The peak column indicates whether the locus is a boundary.

Return type:

pd.DataFrame
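
As a rough usage sketch, the signatures documented on this page can be combined as follows. The import path `from arcfish.tl import TADCaller` is inferred from the qualified class name, the `prominence` and `distance` values are arbitrary placeholders rather than recommendations, and passing the output of call_tads() to to_bedpe() is an assumption (the documentation only names by_insulation() and by_pval() as accepted inputs). The helper name `call_chromosome_tads` is hypothetical.

```python
def call_chromosome_tads(adata, method="pval"):
    """Call TADs on a single-chromosome AnnData; return (boundaries, tads)."""
    # Imported lazily so the helper can be defined without arcfish installed.
    # The import path is an assumption based on the documented class name.
    from arcfish.tl import TADCaller

    if method == "insulation":
        # prominence and distance are only needed for the insulation method;
        # these values are placeholders, not tuned settings.
        caller = TADCaller(method="insulation", prominence=0.1, distance=3)
    else:
        # Documented defaults, written out for clarity.
        caller = TADCaller(fdr_cutoff=0.1, window=1e5, tree=True, method="pval")

    boundaries = caller.call_tads(adata)  # one row per locus, with a peak column
    tads = caller.to_bedpe(boundaries)    # one row per TAD; documented as possibly None
    return boundaries, tads
```

Here `adata` is expected to be the single-chromosome object produced by arcfish.pp.FOF_CT_Loader.create_adata(), as described in the parameter list above.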

by_pval(adata: AnnData) → DataFrame

Call TADs by thresholding FDR values.

Parameters:

adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

Returns:

A dataframe with one row per locus. The peak column indicates whether the locus is a boundary.

Return type:

pd.DataFrame
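
To illustrate what thresholding FDR values at fdr_cutoff means, here is a minimal pure-Python sketch. It assumes a Benjamini-Hochberg adjustment, which this page does not confirm, and it skips whatever peak detection by_pval() performs before thresholding. The function names `benjamini_hochberg` and `boundaries_by_fdr` are hypothetical and not part of arcfish.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


def boundaries_by_fdr(pvals, fdr_cutoff=0.1):
    """Flag loci whose adjusted p-value is below the cutoff (assumed rule)."""
    return [q < fdr_cutoff for q in benjamini_hochberg(pvals)]


# Toy per-locus p-values, invented for illustration only.
pvals = [0.001, 0.5, 0.02, 0.8, 0.04]
print(boundaries_by_fdr(pvals))  # [True, False, True, False, True]
```

Lowering `fdr_cutoff` makes the call more conservative: with a cutoff of 0.01, only the first locus in this toy input would still be flagged.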

by_insulation(adata: AnnData) → DataFrame

Call TADs by insulation score. Method from Su, J.-H., Zheng, P., Kinrot, S. S., Bintu, B. & Zhuang, X. Genome-Scale Imaging of the 3D Organization and Transcriptional Activity of Chromatin. Cell 182, 1641-1659.e26 (2020).

Parameters:

adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

Returns:

A dataframe with one row per locus. The peak column indicates whether the locus is a boundary.

Return type:

pd.DataFrame
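
The roles of prominence and distance in peak picking can be sketched in plain Python. This is a simplified stand-in for a find_peaks()-style routine, not the arcfish implementation: it ignores plateaus, applies the distance filter before the prominence filter, and uses an invented score vector. The helper names are hypothetical.

```python
def local_maxima(x):
    """Indices of strict local maxima (plateaus are ignored for simplicity)."""
    return [i for i in range(1, len(x) - 1) if x[i - 1] < x[i] > x[i + 1]]


def prominence_of(x, i):
    """Height of peak i above the higher of its two surrounding valleys."""
    left_min = x[i]
    j = i - 1
    while j >= 0 and x[j] <= x[i]:  # stop at the first strictly higher sample
        left_min = min(left_min, x[j])
        j -= 1
    right_min = x[i]
    j = i + 1
    while j < len(x) and x[j] <= x[i]:
        right_min = min(right_min, x[j])
        j += 1
    return x[i] - max(left_min, right_min)


def pick_peaks(x, prominence=None, distance=None):
    """Select peaks, optionally filtered by minimum spacing and prominence."""
    peaks = local_maxima(x)
    if distance is not None:
        kept = []
        # Keep the tallest peaks first; drop any peak too close to a kept one.
        for i in sorted(peaks, key=lambda i: x[i], reverse=True):
            if all(abs(i - k) >= distance for k in kept):
                kept.append(i)
        peaks = sorted(kept)
    if prominence is not None:
        peaks = [i for i in peaks if prominence_of(x, i) >= prominence]
    return peaks


# Toy normalized insulation scores, invented for illustration only.
score = [0.0, 0.2, 0.1, 0.9, 0.3, 0.35, 0.1, 0.8, 0.0]
print(pick_peaks(score, prominence=0.3, distance=2))  # [3, 7]
```

In this toy input, the small bumps at indices 1 and 5 are local maxima but are rejected because their prominence falls below 0.3.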

to_bedpe(result: DataFrame) → DataFrame | None

Convert TAD calling result to a dataframe where each row is a TAD (a pair of boundaries instead of a single boundary).

Parameters:

result (pd.DataFrame) – Result returned by by_insulation() or by_pval().

Returns:

Dataframe where each row is a TAD. Columns include c1, s1, e1, c2, s2, e2, {score_col}1, {score_col}2, level, idx1, and idx2, describing the two boundaries of the TAD. If tree is False, level is 0 for all rows; otherwise, level is an integer that increases from smaller to larger TADs.

Return type:

pd.DataFrame | None
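
The boundary-to-TAD conversion can be pictured with a small pure-Python sketch of the flat (tree is False) case. It assumes each TAD is formed by two consecutive boundary loci, which this page does not state explicitly. It omits the {score_col} columns and the hierarchical levels, and it returns a list of dictionaries rather than a dataframe. The function name `boundaries_to_tads` is hypothetical.

```python
def boundaries_to_tads(chrom, starts, ends, is_boundary):
    """Pair consecutive boundary loci into flat TAD records (level 0)."""
    idx = [i for i, flag in enumerate(is_boundary) if flag]
    tads = []
    for i1, i2 in zip(idx, idx[1:]):
        tads.append({
            "c1": chrom, "s1": starts[i1], "e1": ends[i1],  # first boundary
            "c2": chrom, "s2": starts[i2], "e2": ends[i2],  # second boundary
            "level": 0,  # flat calling: every TAD sits at level 0
            "idx1": i1, "idx2": i2,
        })
    return tads


# Toy loci on one chromosome, invented for illustration only.
starts = [0, 1000, 2000, 3000, 4000]
ends = [1000, 2000, 3000, 4000, 5000]
flags = [True, False, True, False, True]
for tad in boundaries_to_tads("chr1", starts, ends, flags):
    print(tad["idx1"], tad["idx2"], tad["s1"], tad["e2"])
```

With three boundaries, the sketch yields two TADs, one spanning loci 0 to 2 and one spanning loci 2 to 4.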