Checkpoint 4 PP TX
• Align raw data to reference & sample demultiplexing: STARsolo, Demuxlet
• Ambient RNA & doublet removal: CellBender, Doublet Detection
• Process datasets (QC, Scran normalization, HVG, etc.): Scanpy & Scran
• Merging & batch correction of datasets: scVI
• Automatic cell type annotation: CellID
• Downstream analysis: Scanpy & GSEApy, etc.
Reference used for this entire pipeline: Human + IAV PR8 + SARS-CoV-2 GE
Build our own reference using the new strain and redo EVERYTHING
Build reference
• Align raw data to reference & sample demultiplexing: STARsolo, Demuxlet
• Ambient RNA & doublet removal: CellBender, Doublet Detection, Scrublet
• Process datasets (QC, Scran normalization, HVG, etc.): Scanpy & Scran
• Merging & batch correction of datasets: scVI
• Automatic cell type annotation: CellID
• Downstream analysis: Scanpy & GSEApy, etc.
Adjustments:
• Cell ID: stop randomly subsampling the Yann dataset that serves as the
reference for cell ID.
• Increase the number of generated gene signatures per cell type from 500
to 1000.
Further QC
• Visualize the number of cells per cell type among cells with fewer than 1,500 total read counts
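The low-count check above can be sketched as follows. This is a minimal illustration, not the code used for the slides: the `cells` list and the function name `low_count_cells_per_type` are hypothetical, and with an AnnData object the same values would normally come from per-cell columns in `adata.obs`.

```python
from collections import Counter

# Hypothetical (cell_type, total_counts) pairs, one per cell.
cells = [
    ("Monocyte", 800),
    ("Monocyte", 2500),
    ("T cell", 1200),
    ("B cell", 1499),
    ("T cell", 3000),
    ("Monocyte", 1500),
]

THRESHOLD = 1500  # total-count cutoff from the slide


def low_count_cells_per_type(cells, threshold=THRESHOLD):
    """Count cells per cell type whose total counts are strictly below threshold."""
    return Counter(cell_type for cell_type, total in cells if total < threshold)


low_counts = low_count_cells_per_type(cells)
for cell_type, n_cells in sorted(low_counts.items()):
    print(f"{cell_type}: {n_cells}")
```

The resulting per-type counts can then be drawn as a bar chart with any plotting library.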
Post QC UMAP
Annotation
SARS
Coverage
Junctions
• Most reads come from the end of the genome (3' end, poly(A) tail).
• Could it be an infection? Specifically, could monocytes have phagocytosed the Omicron variant? Or could this be an
effect of vaccination or prior exposure?
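One way to put a number on the 3' bias noted in the first bullet is to compute the fraction of read start positions that fall in the last part of the genome. The sketch below is an assumption-laden illustration: the function name, the 10% window, the genome length, and the read positions are all made up for the example and are not taken from the data shown on the slides.

```python
def fraction_in_3prime_window(read_starts, genome_length, window_fraction=0.1):
    """Return the fraction of reads starting in the last window_fraction of the genome."""
    if not read_starts:
        return 0.0
    cutoff = genome_length * (1 - window_fraction)
    n_tail = sum(1 for pos in read_starts if pos >= cutoff)
    return n_tail / len(read_starts)


# Illustrative values only: a 30,000-base genome and eight read start positions.
genome_length = 30000
read_starts = [100, 15000, 27500, 28000, 29000, 29500, 29800, 29900]

tail_fraction = fraction_in_3prime_window(read_starts, genome_length)
print(f"Fraction of reads in the last 10% of the genome: {tail_fraction:.2f}")
```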
SARS reads
"Spliced reads suggest that viral genes are being processed, indicating successful invasion by the virus."
Clustering
• Understand what I have: spend some time with the literature on innate immunity, viral response,
different immune cell functions and interactions, etc.
10 nearest neighbors (KNN), 30 principal components (PCA)
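To make these two settings concrete, here is a NumPy-only sketch that projects a matrix onto its first 30 principal components and then finds the 10 nearest neighbors of each cell in that space. The function name `pca_knn` and the random input matrix are assumptions for illustration. In Scanpy the corresponding step would typically be `sc.pp.neighbors(adata, n_neighbors=10, n_pcs=30)`, which uses an approximate neighbor search rather than the brute-force search shown here.

```python
import numpy as np


def pca_knn(X, n_pcs=30, n_neighbors=10):
    """Project X onto its top principal components and return each row's nearest neighbors."""
    # PCA via SVD of the mean-centered matrix.
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    pcs = U[:, :n_pcs] * S[:n_pcs]

    # Brute-force squared Euclidean distances in PC space.
    sq = (pcs ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * pcs @ pcs.T
    np.fill_diagonal(d2, np.inf)  # exclude each cell from its own neighbor list

    knn_idx = np.argsort(d2, axis=1)[:, :n_neighbors]
    return pcs, knn_idx


# Hypothetical data: 200 cells by 50 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
pcs, knn_idx = pca_knn(X, n_pcs=30, n_neighbors=10)
print(pcs.shape, knn_idx.shape)
```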
Mowgli / Optimal transport
10 KNN, OT with latent_dim = 30
TopOMetry
Obtain properly weighted eigenbases to represent the underlying data manifold.
The eigencomponents are the 'latent space' (a.k.a. the dimensionality-reduced
space), similar to the latent space learned by autoencoders such as scVI or the
principal components learned by PCA.
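As a rough illustration of what an eigenvalue-weighted eigenbasis looks like, the sketch below computes a generic diffusion-map-style embedding with NumPy. It is not TopOMetry's implementation or API: the function name `diffusion_eigenbasis`, the adaptive-bandwidth Gaussian kernel, and the random input are assumptions chosen to keep the example short.

```python
import numpy as np


def diffusion_eigenbasis(X, n_neighbors=10, n_components=5):
    """Return eigenvalues and eigenvalue-weighted eigenvectors of a diffusion operator on X."""
    # Pairwise squared Euclidean distances.
    sq = (X ** 2).sum(axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

    # Adaptive bandwidth: distance to the k-th nearest neighbor (column 0 is the point itself).
    sigma = np.sqrt(np.sort(d2, axis=1)[:, n_neighbors])
    K = np.exp(-d2 / (sigma[:, None] * sigma[None, :]))

    # Symmetric normalization; shares eigenvalues with the row-stochastic Markov matrix.
    deg = K.sum(axis=1)
    S = K / np.sqrt(deg[:, None] * deg[None, :])

    evals, evecs = np.linalg.eigh(S)
    order = np.argsort(evals)[::-1]
    evals, evecs = evals[order], evecs[:, order]

    # Drop the trivial first component (eigenvalue 1) and keep the next n_components.
    evals = evals[1:n_components + 1]
    evecs = evecs[:, 1:n_components + 1]

    # Map back to eigenvectors of the Markov matrix and weight them by their eigenvalues.
    psi = evecs / np.sqrt(deg)[:, None]
    return evals, psi * evals


# Hypothetical data: 150 cells by 20 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(150, 20))
evals, embedding = diffusion_eigenbasis(X, n_neighbors=10, n_components=5)
print(evals.shape, embedding.shape)
```

The columns of `embedding` play the role of the latent space described above, in the same way that principal components do for PCA.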