arcfish.pp.add_cell_type

arcfish.pp.add_cell_type(adata: AnnData, df: DataFrame | list | dict, shared_col: str, cell_col: str)

Add cell type information to adata.

adata is created from either a single FOF_CT-core file or a list/dictionary of FOF_CT-core files. df must take the matching form: a single dataframe when adata comes from one file, or a list/dictionary of dataframes otherwise. In every case, df holds the cell type information.

Parameters:

adata (AnnData) – Object created by FOF_CT_Loader.create_adata().
df (pd.DataFrame | list | dict) – Object created by FOF_CT_Loader.read_data().
shared_col (str) – Column shared between adata and df used as an identifier.
cell_col (str) – Column in df containing cell type information.
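
As a rough illustration of how shared_col and cell_col relate, the sketch below performs the same kind of lookup with plain pandas. It is not the arcfish implementation: the obs table stands in for adata.obs, the data values are made up, and how add_cell_type itself treats cells with no match in df is not stated here.

```python
import pandas as pd

# Stand-in for adata.obs: one row per cell, carrying the shared identifier.
obs = pd.DataFrame({"Cell_ID": [1, 2, 3, 4]})

# Stand-in for df: the cell type information table (illustrative values).
cell_types = pd.DataFrame({
    "Cell_ID": [1, 2, 3],
    "cluster label": ["neuron", "astrocyte", "neuron"],
})

shared_col, cell_col = "Cell_ID", "cluster label"

# Build an identifier -> cell type lookup, then map it onto the cells.
lookup = cell_types.set_index(shared_col)[cell_col]
obs[cell_col] = obs[shared_col].map(lookup)

# In this plain-pandas sketch, cell 4 has no entry in cell_types,
# so its label is left as NaN.
print(obs)
```

In this sketch the lookup requires the values of shared_col in the cell type table to be unique; pandas raises an error when mapping through a Series with a duplicated index.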

Examples

1. df is a dataframe. Cell_ID is a column present in both the original FOF_CT-core file and the cell type information file. Map the "cluster label" column from the cell type information file to adata.

>>> import arcfish as sf  # assumed alias; the examples refer to the package as sf
>>> loader = sf.pp.FOF_CT_Loader("PATH.csv")
>>> adata = loader.create_adata("chr3", obs_cols_add=["Cell_ID"])
>>> dfs = sf.pp.FOF_CT_Loader("TYPEPATH.csv").read_data()
>>> sf.pp.add_cell_type(adata, dfs, "Cell_ID", "cluster label")

2. df is a list. Cell_ID is a column present in both the original FOF_CT-core files and the cell type information files. Map the "cluster label" column from the cell type information files to adata.

>>> loader = sf.pp.FOF_CT_Loader(["PATH1.csv", "PATH2.csv"])
>>> adata = loader.create_adata("chr3", obs_cols_add=["Cell_ID"])
>>> dfs = sf.pp.FOF_CT_Loader([
...     "TYPEPATH1.csv", "TYPEPATH2.csv"
... ]).read_data()
>>> sf.pp.add_cell_type(adata, dfs, "Cell_ID", "cluster label")

3. df is a dictionary. Cell_ID is a column present in both the original FOF_CT-core files and the cell type information files. Map the "cluster label" column from the cell type information files to adata.

>>> loader = sf.pp.FOF_CT_Loader({
...     "rep1":"PATH1.csv", "rep2":"PATH2.csv"
... })
>>> adata = loader.create_adata("chr3", obs_cols_add=["Cell_ID"])
>>> dfs = sf.pp.FOF_CT_Loader({
...     "rep1":"TYPEPATH1.csv", "rep2":"TYPEPATH2.csv"
... }).read_data()
>>> sf.pp.add_cell_type(adata, dfs, "Cell_ID", "cluster label")
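
Because the mapping depends entirely on the shared column, it can help to inspect that column before calling add_cell_type. The following is a minimal sketch of such a check in plain pandas. The helper check_shared_col is hypothetical and not part of arcfish, and the data values are illustrative only.

```python
import pandas as pd

def check_shared_col(obs_ids, df, shared_col):
    """Hypothetical helper: report problems with the identifier column.

    obs_ids    -- identifiers of the cells (e.g. the Cell_ID values in adata.obs)
    df         -- one cell type information table
    shared_col -- name of the identifier column in df
    """
    ids = df[shared_col]
    return {
        # Identifiers that occur more than once in the cell type table.
        "duplicated": sorted(ids[ids.duplicated()].unique().tolist()),
        # Cells that have no row in the cell type table.
        "missing": sorted(set(obs_ids) - set(ids.tolist())),
    }

cell_types = pd.DataFrame({
    "Cell_ID": [1, 2, 2, 3],
    "cluster label": ["A", "B", "B", "A"],
})
report = check_shared_col([1, 2, 3, 4], cell_types, "Cell_ID")
print(report)  # {'duplicated': [2], 'missing': [4]}
```

When df is a list or a dictionary, the same check can be run on each table separately, paired with the identifiers of the corresponding file.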