arcfish.tl.TADCaller
- class arcfish.tl.TADCaller(fdr_cutoff: float = 0.1, window: float = 100000.0, tree: bool = True, min_tad_size: float = 0, prominence: float | None = None, distance: int | None = None, method: Literal['pval', 'insulation'] = 'pval')
TAD calling class.
- Parameters:
fdr_cutoff (float, optional) – Boundary peaks with FDR below this cutoff are defined as TAD boundaries, by default 0.1.
window (float, optional) – Domain size in bp to calculate intra- and inter-domain distance, by default 1e5.
tree (bool, optional) – Whether to return hierarchical TADs, by default True.
min_tad_size (float, optional) – The minimum TAD size allowed, by default 0.
prominence (float, optional) – Required only if method is “insulation”, by default None. Minimum height difference in normalized insulation score required for a locus to be called a peak. Passed to find_peaks().
distance (int, optional) – Required only if method is “insulation”, by default None. Minimum number of loci between two peaks. Passed to find_peaks().
method (Literal["pval", "insulation"], optional) – TAD calling algorithm used, by default “pval”.
- __init__(fdr_cutoff: float = 0.1, window: float = 100000.0, tree: bool = True, min_tad_size: float = 0, prominence: float | None = None, distance: int | None = None, method: Literal['pval', 'insulation'] = 'pval')
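The parameter list says that prominence and distance are needed only when method is “insulation”. The sketch below illustrates that relationship with a hypothetical stand-in, `TADCallerConfig`, that mirrors the documented constructor arguments and defaults. The `validate()` check and its error messages are assumptions made for illustration; this page does not say how, or whether, arcfish enforces the requirement.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class TADCallerConfig:
    """Hypothetical stand-in mirroring the documented constructor arguments."""

    fdr_cutoff: float = 0.1
    window: float = 1e5
    tree: bool = True
    min_tad_size: float = 0
    prominence: Optional[float] = None
    distance: Optional[int] = None
    method: Literal["pval", "insulation"] = "pval"

    def validate(self) -> None:
        # Assumed check: the docs only state that prominence and distance
        # are required for the insulation method, not how this is enforced.
        if self.method not in ("pval", "insulation"):
            raise ValueError(f"unknown method: {self.method!r}")
        if self.method == "insulation":
            missing = [name for name in ("prominence", "distance")
                       if getattr(self, name) is None]
            if missing:
                raise ValueError(
                    f"method='insulation' needs: {', '.join(missing)}")


# The defaults are sufficient for method="pval".
TADCallerConfig().validate()

# The insulation method needs both peak-finding arguments.
insulation_cfg = TADCallerConfig(method="insulation", prominence=0.1, distance=3)
insulation_cfg.validate()
```

With the real class, the same arguments would be passed directly to `arcfish.tl.TADCaller(...)`.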
Methods
__init__([fdr_cutoff, window, tree, ...])
by_insulation(adata) – Call TADs by insulation score.
by_pval(adata) – Call TADs by thresholding FDR values.
call_tads(adata) – Call TADs from adata.
to_bedpe(result) – Convert TAD calling result to a dataframe where each row is a TAD (a pair of boundaries instead of a single boundary).
Attributes
distance – Minimum distance between peaks.
fdr_cutoff – FDR cut-off for TAD boundaries.
method – TAD calling method used.
min_tad_size – Minimum TAD size.
prominence – Prominence of peaks.
tree – Call hierarchical TADs.
window – Size of the window used to compute inter/intra contacts.
- call_tads(adata: AnnData) → DataFrame
Call TADs from adata.
- Parameters:
adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().
- Returns:
A dataframe with length equal to the number of loci. The column peak defines whether the position is a boundary.
- Return type:
pd.DataFrame
- by_pval(adata: AnnData) → DataFrame
Call TADs by thresholding FDR values.
- Parameters:
adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().
- Returns:
A dataframe with length equal to the number of loci. The column peak defines whether the position is a boundary.
- Return type:
pd.DataFrame
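As a rough illustration of thresholding FDR values, the sketch below adjusts a list of per-locus p-values and flags loci whose adjusted value falls below fdr_cutoff. The Benjamini-Hochberg procedure is assumed here because it is a common way to obtain FDR values; this page does not state which procedure arcfish uses, how the p-values are derived, or whether further peak selection is applied. The p-values are made up.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    n = len(pvals)
    order = sorted(range(n), key=lambda i: pvals[i])
    fdr = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * n / rank over all ranks at or above its own.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * n / rank)
        fdr[i] = running_min
    return fdr


# Hypothetical per-locus p-values for one chromosome.
pvals = [0.80, 0.001, 0.60, 0.02, 0.90, 0.04, 0.70, 0.50]
fdr_cutoff = 0.1  # documented default

fdr = benjamini_hochberg(pvals)
# Analogue of the "peak" column: True where the locus passes the cutoff.
peak = [q < fdr_cutoff for q in fdr]
boundary_idx = [i for i, is_peak in enumerate(peak) if is_peak]
print(boundary_idx)  # [1, 3]
```

The locus with p = 0.04 is not flagged: its adjusted value is 0.04 × 8 / 3 ≈ 0.107, which is above the 0.1 cutoff.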
- by_insulation(adata: AnnData) → DataFrame
Call TADs by insulation score. Method from Su, J.-H., Zheng, P., Kinrot, S. S., Bintu, B. & Zhuang, X. Genome-Scale Imaging of the 3D Organization and Transcriptional Activity of Chromatin. Cell 182, 1641-1659.e26 (2020).
- Parameters:
adata (AnnData) – adata of a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().
- Returns:
A dataframe with length equal to the number of loci. The column peak defines whether the row is a boundary.
- Return type:
pd.DataFrame
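To show how prominence and distance constrain peak selection, here is a simplified pure-Python stand-in for a peak finder. It is not the find_peaks() routine the library calls, and it does not compute an insulation score; the scores are hypothetical. Prominence is taken as the peak height above the higher of the two valleys that separate it from taller neighbours, and distance as the minimum spacing in loci, with taller peaks winning conflicts.

```python
def local_maxima(scores):
    """Indices strictly higher than both neighbours (edges excluded)."""
    return [i for i in range(1, len(scores) - 1)
            if scores[i - 1] < scores[i] > scores[i + 1]]


def prominence_of(scores, i):
    """Height of peak i above the higher of its two bounding valleys."""
    # Scan left until a sample higher than the peak, tracking the minimum.
    left_min = scores[i]
    j = i - 1
    while j >= 0 and scores[j] <= scores[i]:
        left_min = min(left_min, scores[j])
        j -= 1
    # Same scan to the right.
    right_min = scores[i]
    j = i + 1
    while j < len(scores) and scores[j] <= scores[i]:
        right_min = min(right_min, scores[j])
        j += 1
    return scores[i] - max(left_min, right_min)


def pick_peaks(scores, prominence, distance):
    """Keep sufficiently prominent peaks, then enforce a minimum spacing."""
    candidates = [i for i in local_maxima(scores)
                  if prominence_of(scores, i) >= prominence]
    kept = []
    # Visit the tallest candidates first so they win spacing conflicts.
    for i in sorted(candidates, key=lambda k: scores[k], reverse=True):
        if all(abs(i - k) >= distance for k in kept):
            kept.append(i)
    return sorted(kept)


# Hypothetical normalized insulation scores, one value per locus.
scores = [0.0, 0.1, 0.9, 0.2, 0.3, 0.25, 0.1, 0.8, 0.75, 0.78, 0.1]

print(pick_peaks(scores, prominence=0.2, distance=3))   # [2, 7]
print(pick_peaks(scores, prominence=0.02, distance=1))  # [2, 4, 7, 9]
```

Raising prominence removes the shallow bumps at loci 4 and 9; raising distance removes peaks that sit too close to a taller one.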
- to_bedpe(result: DataFrame) → DataFrame | None
Convert TAD calling result to a dataframe where each row is a TAD (a pair of boundaries instead of a single boundary).
- Parameters:
result (pd.DataFrame) – Result returned by by_insulation() or by_pval().
- Returns:
Dataframe where each row is a TAD. Columns include c1, s1, e1, c2, s2, e2, {score_col}1, {score_col}2, level, idx1, and idx2, representing the two boundaries of the TAD. If tree is False, level is 0 for all rows; otherwise, level is an integer that increases from smaller to larger TADs.
- Return type:
pd.DataFrame | None
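The idea of turning per-locus boundary flags into one row per TAD can be sketched in plain Python for the flat case (level 0 everywhere, as when tree is False). This is an illustration only, not the library's implementation: the input keys chr, start and end are hypothetical, the {score_col}1 and {score_col}2 columns are omitted, and the treatment of chromosome ends, min_tad_size and the None return value is not covered.

```python
def boundaries_to_pairs(loci):
    """Pair consecutive boundary loci into flat (level 0) TAD rows."""
    # Keep the row index of each boundary so it can be reported as idx1/idx2.
    boundaries = [(i, locus) for i, locus in enumerate(loci) if locus["peak"]]
    rows = []
    for (idx1, left), (idx2, right) in zip(boundaries, boundaries[1:]):
        rows.append({
            "c1": left["chr"], "s1": left["start"], "e1": left["end"],
            "c2": right["chr"], "s2": right["start"], "e2": right["end"],
            "level": 0,
            "idx1": idx1, "idx2": idx2,
        })
    return rows


# Hypothetical per-locus result: six 50 kb loci with boundaries at rows 0, 2, 5.
loci = [
    {"chr": "chr1", "start": i * 50_000, "end": (i + 1) * 50_000,
     "peak": i in (0, 2, 5)}
    for i in range(6)
]

tads = boundaries_to_pairs(loci)
for tad in tads:
    print(tad["c1"], tad["s1"], tad["e1"], tad["c2"], tad["s2"], tad["e2"],
          tad["level"], tad["idx1"], tad["idx2"])
```

Three boundaries give two rows, one for the boundary pair (0, 2) and one for (2, 5).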