starCAT analysis for post-transplant CD34+ genes

Using the existing gene expression program (GEP) reference from the Nature Methods dataset (Li et al., 2025), the goal is to analyze the post-transplant CD34+ dataset with starCAT. The output should report a usage score for each GEP.
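As a point of orientation for the `rf_usage_normalized` files listed further down, here is a minimal sketch of what a normalized usage table could look like. It assumes (this README does not confirm it) that rows are cells, columns are GEPs, and each cell's usages are scaled to sum to 1. The function name and toy values are hypothetical.

```python
def normalize_usage(usage_rows):
    """Scale each cell's GEP usages so they sum to 1 (all-zero rows stay zero)."""
    normalized = []
    for row in usage_rows:
        total = sum(row)
        normalized.append([v / total if total > 0 else 0.0 for v in row])
    return normalized


# Toy example: 2 cells x 3 GEPs (made-up numbers, not project data)
raw = [[2.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
normalized = normalize_usage(raw)
```

Under this assumption, a value of 0.5 for a GEP means that program accounts for half of that cell's total usage.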

Current Issues:

  • Visualizing plots in R and Python (matching the UMAP from the sc package)
  • 40k reference not working for starCAT
    • Negative values in cNMF data
  • starCAT not matching all genes in reference
    • 2k gene/GEP reference
    • 40k gene/GEP reference
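The two reference issues above (negative values and unmatched genes) can be checked before running starCAT. The sketch below is not part of the repo. It assumes the reference has been loaded into a plain dict of GEP name to gene weights, and the helper name and toy genes are hypothetical. Non-negative factorization methods generally expect non-negative inputs, so any negative weight is worth flagging.

```python
def reference_diagnostics(reference, query_genes):
    """Summarize gene overlap and negative weights.

    reference: dict mapping GEP name -> {gene: weight}
    query_genes: iterable of gene names present in the query dataset
    """
    ref_genes = set()
    negative = []
    for gep, weights in reference.items():
        ref_genes.update(weights)
        # Collect every (GEP, gene, weight) entry below zero
        negative.extend((gep, gene, w) for gene, w in weights.items() if w < 0)
    shared = ref_genes & set(query_genes)
    return {
        "n_reference_genes": len(ref_genes),
        "n_shared": len(shared),
        "missing_from_query": sorted(ref_genes - shared),
        "negative_entries": negative,
    }


# Toy reference with one negative weight and one gene absent from the query
toy_reference = {
    "GEP1": {"CD34": 0.8, "GATA1": 0.1, "MPO": -0.2},
    "GEP2": {"CD34": 0.3, "GATA1": 0.6, "MPO": 0.4},
}
report = reference_diagnostics(toy_reference, ["CD34", "MPO", "ACTB"])
```

Running the same check on the 2k and 40k references would show how many reference genes the query actually covers and whether negative entries are confined to one file.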

Non-impactful issues:

  • Errors converting Seurat objects (R) to h5ad for Python
  • Code/Repo readability

Repo Structure / File Descriptions:

postTrans_cd34
    ├───.gitignore
    ├───README.md
    ├───postTrans_v1
    │   ├───postTrans_coordinates.csv (UMAP coordinates for python)
    │   ├───postTrans_starCAT.rf_usage_normalized.txt (starCAT first run)
    │   ├───postTrans_common_starCAT.rf_usage_normalized.txt (starCAT second run with common genes)
    │   ├───starcat_analysis.ipynb (analysis notebook, R)
    │   └───starcat_visualization.ipynb (visualization notebook, python)
    ├───postTrans_v2
    │   ├───postTrans_coordinates.csv (UMAP coordinates for python)
    │   ├───seu_common_starCAT.rf_usage_normalized.txt (starCAT run with common genes)
    │   ├───starcat_analysis.ipynb (analysis notebook, R)
    │   └───starcat_visualization.ipynb (visualization notebook, python)
    └───References
        ├───cNMF4.gene_spectra_score.k_35.dt_0_15.csv (reference, 2k genes [csv file])
        ├───cNMF4.spectra.k_35.dt_0_15.consensus.txt (reference, 2k genes [txt file]) <= Used for starCAT analysis
        ├───cNMF4.spectra.k_35.dt_0_15.consensus.csv (reference, 40k genes [csv file])
        └───cNMF4.spectra.k_35.dt_0_15.consensus.txt (reference, 40k genes [txt file])

- postTrans_v1: First run of starCAT analysis and data visualization on the preliminary dataset
- postTrans_v2: Second run of starCAT analysis and data visualization on the completed dataset
- References: GEP/gene references needed for starCAT analysis
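For the Python visualization step, the UMAP coordinates and the usage table have to be joined by cell. A minimal sketch under stated assumptions: the coordinates file is a CSV with columns `cell`, `UMAP_1`, `UMAP_2`, and the usage file is tab-separated with cell barcodes in the first column. These column names and the toy barcodes are hypothetical and may not match the actual files.

```python
import csv
import io


def join_coordinates_and_usage(coord_file, usage_file):
    """Return {cell: {"umap": (x, y), "usage": {gep: value}}} for cells in both files."""
    coords = {}
    for row in csv.DictReader(coord_file):
        coords[row["cell"]] = (float(row["UMAP_1"]), float(row["UMAP_2"]))

    joined = {}
    reader = csv.reader(usage_file, delimiter="\t")
    geps = next(reader)[1:]  # first header field labels the cell column
    for row in reader:
        cell, values = row[0], row[1:]
        if cell in coords:  # keep only cells that also have coordinates
            usage = dict(zip(geps, map(float, values)))
            joined[cell] = {"umap": coords[cell], "usage": usage}
    return joined


# Toy inputs standing in for the coordinates and usage files
coord_text = "cell,UMAP_1,UMAP_2\nAAAC-1,1.5,-2.0\nAAAG-1,0.25,3.0\n"
usage_text = "cell\tGEP1\tGEP2\nAAAC-1\t0.75\t0.25\nTTTG-1\t0.5\t0.5\n"
joined = join_coordinates_and_usage(io.StringIO(coord_text), io.StringIO(usage_text))
```

Cells present in only one file are dropped, which makes barcode mismatches between the R export and the starCAT output easy to spot by comparing counts.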

References:

  • Immunogenomics. (n.d.). immunogenomics/starCAT: Implements *CellAnnotator (aka *CAT/starCAT), annotating scRNA-seq with predefined gene expression programs. GitHub. https://github.com/immunogenomics/starCAT
  • Kotliar, D., Curtis, M., Agnew, R., Weinand, K., Nathan, A., Baglaenko, Y., Zhao, Y., Sabeti, P. C., Rao, D. A., & Raychaudhuri, S. (2024). Reproducible Single Cell Annotation of Programs Underlying T-Cell Subsets, Activation States, and Functions. https://doi.org/10.1101/2024.05.03.592310
  • Kotliar, D., Veres, A., Nagy, M. A., Tabrizi, S., Hodis, E., Melton, D. A., & Sabeti, P. C. (2019). Identifying gene expression programs of cell-type identity and cellular activity with single-cell RNA-seq. eLife, 8. https://doi.org/10.7554/elife.43803
  • Li, H., Côté, P., Kuoch, M., Ezike, J., Frenis, K., Afanassiev, A., Greenstreet, L., Tanaka-Yano, M., Tarantino, G., Zhang, S., Whangbo, J., Butty, V. L., Moiso, E., Falchetti, M., Lu, K., Connelly, G. G., Morris, V., Wang, D., Chen, A. F., Bianchi, G., … Rowe, R. G. (2025). The dynamics of hematopoiesis over the human lifespan. Nature Methods, 22(2), 422–434. https://doi.org/10.1038/s41592-024-02495-0

Research under the Li Lab, University of California, San Diego

Status: completed; bugs may still appear.