arcfish.tl.TwoSampleT

class arcfish.tl.TwoSampleT(adata: AnnData)

Test 3D distances with a two-sample t-test. This is the same test as implemented in the original SnapFISH paper: Lee, L. et al. SnapFISH: a computational pipeline to identify chromatin loops from multiplexed DNA FISH data. Nat. Commun. 14, 4873 (2023).

Parameters:

adata (AnnData) – AnnData object for a single chromosome, created by arcfish.pp.FOF_CT_Loader.create_adata().

__init__(adata: AnnData)

Methods

__init__(adata)

append_pval(result, cut_lo, cut_up, outer_cut)

Perform two-sample t-tests.

append_summit(result)

Treat the entry with the smallest p-value in each cluster as a potential summit.

ij_background(i, j, d1d, outer_cut)

The entries that are between 25kb and outer_cut away from the (i,j) entry are treated as the background.

static ij_background(i: int, j: int, d1d: ndarray, outer_cut: int) → ndarray

The entries that are between 25kb and outer_cut away from the (i,j) entry are treated as the background.

Parameters:
  • i (int) – Index of the first locus.

  • j (int) – Index of the second locus.

  • d1d ((p,) np.ndarray) – Array of 1D genomic locations of imaging loci.

  • outer_cut (int) – Loci whose 1D genomic distance from the target locus is within outer_cut are included in the local background.

Returns:

A boolean matrix with background entries equal to True.

Return type:

(p, p) np.ndarray
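The masking rule above can be sketched with NumPy broadcasting. This is an illustration only, not the arcfish source: the function name background_mask, the inner_cut argument, the Chebyshev-style definition of "distance from the (i,j) entry" (the larger of the two per-locus 1D distances), and the choice of inclusive or exclusive bounds are all assumptions. The actual method may also restrict the mask, for example to the upper triangle.

```python
import numpy as np


def background_mask(i, j, d1d, outer_cut, inner_cut=25_000):
    """Hypothetical sketch of a local-background mask (not the arcfish code).

    Assumes entry (a, b) lies at distance max(|d1d[a] - d1d[i]|,
    |d1d[b] - d1d[j]|) from entry (i, j).
    """
    d1d = np.asarray(d1d)
    di = np.abs(d1d - d1d[i])  # 1D distance of every locus to locus i
    dj = np.abs(d1d - d1d[j])  # 1D distance of every locus to locus j
    # Broadcast to a (p, p) matrix of distances from each entry to (i, j).
    dist = np.maximum(di[:, None], dj[None, :])
    # Background: farther than inner_cut but no farther than outer_cut.
    return (dist > inner_cut) & (dist <= outer_cut)


d1d = np.arange(6) * 25_000  # six loci at 25 kb spacing (made-up positions)
mask = background_mask(2, 4, d1d, outer_cut=50_000)
```

The result is a (p, p) boolean array in which the target entry and its immediate neighbours are False and the surrounding ring is True.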

append_pval(result: dict, cut_lo: int, cut_up: int, outer_cut: int)

Perform two-sample t-tests.

Parameters:
  • result (dict) – The result dictionary to which the test results are added.

  • cut_lo (float) – Minimum loop size.

  • cut_up (float) – Maximum loop size.

  • outer_cut (int) – Loci whose 1D genomic distance from the target locus is within outer_cut are included in the local background.
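As a rough illustration of the underlying statistic, a two-sample t-test between the 3D distances of a target locus pair and those of its local background could look like the following, using scipy.stats.ttest_ind. The simulated distances, the variable names, and the use of SciPy's default options (equal variances, two-sided) are assumptions; this page does not state which variant arcfish applies.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Simulated 3D distances in nm (made-up values, not real FISH data).
target = rng.normal(loc=300.0, scale=50.0, size=200)      # pair (i, j)
background = rng.normal(loc=400.0, scale=50.0, size=800)  # local background

# Two-sample t-test with SciPy defaults; a negative statistic means the
# target pair is closer in 3D, on average, than its background.
t_stat, p_val = stats.ttest_ind(target, background)
```

A small p-value together with a negative statistic would mark the pair as a loop candidate in this toy setting.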

append_summit(result: dict)

Treat the entry with the smallest p-value in each cluster as a potential summit. Filter summits by contact frequency.

1. If the summit is a singleton (i.e. its cluster contains only one candidate), it is marked as a summit if its contact frequency is larger than 1/2.

2. If the summit is not a singleton (i.e. its cluster contains multiple candidates), it is marked as a summit if its contact frequency is larger than 1/3.

Parameters:

result (dict) – The result dictionary to which the summits are added.
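The selection and filtering rule can be sketched in plain Python. The record layout (keys pair, cluster, pval, contact_freq) and the function name pick_summits are hypothetical and do not reflect the structure of the actual result dictionary. Only the rule itself comes from the description above: the smallest p-value per cluster, then a 1/2 threshold for singletons and a 1/3 threshold otherwise.

```python
from collections import defaultdict


def pick_summits(candidates):
    """Hypothetical sketch: one potential summit per cluster, then filter."""
    clusters = defaultdict(list)
    for cand in candidates:
        clusters[cand["cluster"]].append(cand)

    summits = []
    for members in clusters.values():
        # The entry with the smallest p-value is the potential summit.
        best = min(members, key=lambda c: c["pval"])
        # Singletons must pass a stricter contact-frequency threshold.
        threshold = 1 / 2 if len(members) == 1 else 1 / 3
        if best["contact_freq"] > threshold:
            summits.append(best)
    return summits


# Made-up candidates for illustration only.
candidates = [
    {"pair": (3, 9), "cluster": 0, "pval": 1e-4, "contact_freq": 0.40},
    {"pair": (3, 10), "cluster": 0, "pval": 1e-6, "contact_freq": 0.45},
    {"pair": (12, 20), "cluster": 1, "pval": 1e-5, "contact_freq": 0.40},
    {"pair": (15, 30), "cluster": 2, "pval": 1e-3, "contact_freq": 0.60},
]
summits = pick_summits(candidates)
```

Here cluster 0 yields (3, 10) because 0.45 exceeds 1/3, the singleton (12, 20) is dropped because 0.40 does not exceed 1/2, and the singleton (15, 30) is kept.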