(五)单细胞数据分析——细胞亚群的表型特征刻画(补充工作)

本文主要是介绍(五)单细胞数据分析——细胞亚群的表型特征刻画(补充工作),希望对大家解决编程问题提供一定的参考价值,需要的开发者们随着小编来一起学习吧!

由于细胞表型特征这里,我们可以进行一些信息的补充,一个是从全局重新观察每个细胞亚群的基因表达,另一个是细胞亚群特征基因的小提琴图。

第一步:导入R包

library(scibet)
library(Seurat)
library(scater)
library(scran)
library(dplyr)
library(Matrix)
library(cowplot)
library(ggplot2)
library(harmony)

第二步:全局观察每个细胞亚群的基因表达

导入所有细胞亚群的数据

BRCA_SingleCell.Bcell <- readRDS("BRCA_SingleCell.B_Pcell.RDS")
BRCA_SingleCell.Tcell <- readRDS("BRCA_SingleCell.Tcell.RDS")
BRCA_SingleCell.Mycell <- readRDS("BRCA_SingleCell.Mycell.RDS")
BRCA_SingleCell.Fcell <- readRDS("BRCA_SingleCell.Fcell.RDS")

设置颜色种类

cb_palette <- c("antiquewhite", "yellowgreen", "thistle", "violet", "azure3", "azure4", "black","blue", "blue4","blueviolet", #B"brown4","brown1", "burlywood1", "burlywood4", "cadetblue1", "cadetblue4","chartreuse", "chartreuse3", #T"chartreuse4","chocolate1", "chocolate3", "chocolate4", "cyan3", "darkgoldenrod","darkgoldenrod1","darkgoldenrod4", "darkolivegreen", "khaki", "yellow" ,"hotpink" ,"hotpink4" , #My"slateblue1" ,"slateblue4", "yellow3","sienna4", "tomato1", "sienna1", "dodgerblue4", "darkmagenta","cornsilk4") #F

绘制每个细胞亚群的UMAP图

DimPlot(object = BRCA_SingleCell.Bcell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[1:10], label = TRUE)
DimPlot(object = BRCA_SingleCell.Tcell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[11:18], label = TRUE)
DimPlot(object = BRCA_SingleCell.Mycell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[19:31], label = TRUE)
DimPlot(object = BRCA_SingleCell.Fcell,reduction = 'umap',group.by = c('cell_Subtype'),cols = cb_palette[32:40], label = TRUE)

其中一个结果如下:

合并所有数据绘制总的UMAP图

BRCA_SingleCell <- merge(BRCA_SingleCell.Fcell,BRCA_SingleCell.Tcell)
BRCA_SingleCell <- merge(BRCA_SingleCell,BRCA_SingleCell.Bcell)
BRCA_SingleCell <- merge(BRCA_SingleCell,BRCA_SingleCell.Mycell)
DimPlot(object = BRCA_SingleCell,reduction = 'umap',group.by = c('cell_Subtype'), cols = cb_palette[1:40], label = FALSE)

结果如下:

 分析不同细胞亚群之间的差异基因,并绘制热图

cellType_markerGene <- findMarker(BRCA_SingleCell)
top <- cellType_markerGene %>% group_by(cluster) %>% top_n(n =10, wt = myAUC)
top$cluster %>% levels(.)
clusters <- unique(BRCA_SingleCell$cell_Subtype)
genes <- c()
for(i in clusters){gene <- as.vector(as.matrix(subset(top, cluster == i, gene)))gene <- gene[1:(ifelse(length(gene)>5,5,length(gene)))]names(gene) <- rep(i, times = length(gene))genes <- c(genes, gene)
}
table(names(genes))
table(unique(names(genes)))
B_P_gene <- c(genes[which(names(genes) == "naive B")],genes[which(names(genes) == "Memory B")],genes[which(names(genes) == "Germinal center B cell")],genes[which(names(genes) == "IgA Plasma")],genes[which(names(genes) == "IGHGP+ Plasma")],genes[which(names(genes) == "IGLC7+ Plasma")],genes[which(names(genes) == "IL7R+ Plasma")],genes[which(names(genes) == "NKG7+ Plasma")],genes[which(names(genes) == "IgG Plasma")])
T_gene <- c(genes[which(names(genes) == "CD8+ Tem")],genes[which(names(genes) == "CD4+ Tfh")],genes[which(names(genes) == "NK")],genes[which(names(genes) == "Th0")],genes[which(names(genes) == "GZMK+ Tc")],genes[which(names(genes) == "CD8+ Tex")],genes[which(names(genes) == "CD4+ Treg")],genes[which(names(genes) == "Tem")])
Stroma_gene <- c(genes[which(names(genes) == "MYH11+MCAM+ Myofibro")],genes[which(names(genes) == "RGS5+MCAM+ Myofibro")],genes[which(names(genes) == "VEGF+ Fibro")],genes[which(names(genes) == "CFD+ Fibro")],genes[which(names(genes) == "PLA2G2A+CFD+ Fibro")],genes[which(names(genes) == "PDGFC+ Fibro")],genes[which(names(genes) == "CXCL11+ Fibro")],genes[which(names(genes) == "MMP11+ Fibro")],genes[which(names(genes) == "MMP1+ Fibro")],genes[which(names(genes) == "MMP11+ Fibro")])
Myeloid_gene <- c(genes[which(names(genes) == "MM9+ Macro")],genes[which(names(genes) == "C1QC+MRC1-  Macro")],genes[which(names(genes) == "MRC1+ Macro")],genes[which(names(genes) == "cDC2")],genes[which(names(genes) == "VCAN+SPP1+ Macro")],genes[which(names(genes) == "MARCO+SPP1+ Macro")],genes[which(names(genes) == "cDC1")],genes[which(names(genes) == "Activated DCs")],genes[which(names(genes) == "Mast cell")],genes[which(names(genes) == "TACSTD2+ Macro")],genes[which(names(genes) == "CXCL10+ Macro")])
cellMarker <- c(B_P_gene,T_gene,Myeloid_gene,Stroma_gene)
length(cellMarker)
length(unique(cellMarker))
sum(table(cellMarker) == 1)
cellMarker.unique <- unique(cellMarker)
options(repr.plot.height = 14, repr.plot.width = 18)
test <- DotPlot(object = BRCA_SingleCell,features = cellMarker.unique,cols = c('blue','red'),group.by = 'cell_Subtype')+
theme_bw()+ theme(axis.text.x = element_text(angle = -90,hjust = 0,vjust = 0)) +
coord_flip()
test
test$data$id %>% unique(.) %>% setdiff(.,c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ CAFs','Endo'))
Mmarker_data  <- test$data
Mmarker_data$Zscore <- Mmarker_data$avg.exp.scaled
Mmarker_data$cluster <- factor(x = Mmarker_data$id,levels = c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibroblast','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo'),labels = c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo'))
unique(Mmarker_data$id)
Mmarker_data$geneClass <- NA
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% B_P_gene,'B',Mmarker_data$geneClass)
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% T_gene,'T',Mmarker_data$geneClass)
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% Myeloid_gene,'M',Mmarker_data$geneClass)
Mmarker_data$geneClass <- ifelse(Mmarker_data$features.plot %in% Stroma_gene,'S',Mmarker_data$geneClass)
library(aplot)
options(repr.plot.height = 14, repr.plot.width = 13)
group <- factor(c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo'),levels = c('naive B','IL7R+ Plasma','memory B','Germinal center B','IGHGP+ Plasma','IGLC7+ Plasma','NKG7+ Plasma','STMN1+ Plasma','IgA Plasma','IgG Plasma','Th0','CD4+ Tfh','CD8+ Tem','CD4+ Treg','GZMK+ Tc','CD8+ Tex','NK','Tem','TACSTD2+ Macro','MARCO+SPP1+ Macro','VCAN+SPP1+ Macro','CXCL10+ Macro','cDC2','CXCL9+ Macro','MRC1+ Macro','cDC1','Activated DCs','C1QC+MRC1- Macro','MM9+ Macro','CPB1+ Macro','Mast cell','CXCL11+ Fibro','CFD+ Fibro','RGS5+MCAM+ Myofibro','MMP11+ Fibro','VEGF+ Fibro','PLA2G2A+CFD+ Fibro','MYH11+MCAM+ Myofibro','PDGFC+ Fibro','Endo')) %>% as.data.frame() %>% mutate(group=c(rep("B cell",10),rep("T cell",8),rep("Myeloid",13),rep("Strom cell",9))) %>%mutate(p="group") %>%ggplot(aes(.,y=p,fill=group))+geom_tile() + scale_y_discrete(position="right") +theme_minimal()+xlab(NULL) + ylab(NULL) +theme(axis.text.x = element_blank())+labs(fill = "Group")
heatmap <- ggplot(Mmarker_data, aes(y=features.plot, x=cluster, fill=Zscore))+ 
geom_raster()+scale_fill_gradient2(low="#003366", high="#990033", mid="white",midpoint = 0)+
theme(axis.text.x = element_text(angle = -90,hjust = 0,vjust = 0.5, size = 15),axis.ticks.y = element_blank())+
xlab(NULL) + ylab(NULL)+theme(panel.border = element_rect(fill=NA,color="black", size=1, linetype="solid"))+geom_vline(xintercept=c(-0.5,10.5,18.5,31.5),size=.8)+geom_hline(yintercept=c(-0.5,18.5,48.5,76.5), size=.8)
heatmap <- heatmap %>% insert_top(group ,height = 0.05)
heatmap

结果如下:

第三步:绘制特征基因的小提琴图

构造绘制小提琴图的函数与特征基因筛选函数

modify_vlnplot<- function(obj, feature, pt.size = 0, plot.margin = unit(c(-0.75, 0, -0.75, 0), "cm"),...) {p<- VlnPlot(obj, features = feature, pt.size = pt.size,group.by = 'cell_Subtype', ... )  + xlab("") + ylab(feature) + ggtitle("") + theme(legend.position = "none", axis.text.x = element_blank(), axis.ticks.x = element_blank(), axis.title.y = element_text(size = rel(1), angle = 0), axis.text.y = element_text(size = rel(1)), plot.margin = plot.margin ) #+ rotate_x_text() return(p)
}
extract_max<- function(p){ymax<- max(ggplot_build(p)$layout$panel_scales_y[[1]]$range$range)return(ceiling(ymax))
}## main function
StackedVlnPlot<- function(obj, features,pt.size = 0, plot.margin = unit(c(-0.75, 0, -0.75, 0), "cm"),...) {plot_list<- purrr::map(features, function(x) modify_vlnplot(obj = obj,feature = x, ...))# Add back x-axis title to bottom plot. patchwork is going to support this?# rotate_x_text() 转置横坐标的字体,ggpubr 包,在这里添加才会只出现一次横坐标,不会出现多个横坐标plot_list[[length(plot_list)]]<- plot_list[[length(plot_list)]] +theme(axis.text.x=element_text(), axis.ticks.x = element_line()) + rotate_x_text(angle = 90)# change the y-axis tick to only max value ymaxs<- purrr::map_dbl(plot_list, extract_max)plot_list<- purrr::map2(plot_list, ymaxs, function(x,y) x + scale_y_continuous(breaks = c(y)) + expand_limits(y = y))p<- patchwork::wrap_plots(plotlist = plot_list, ncol = 1)return(p)
}
findMarker <- function(x){Idents(x) <- 'cell_Subtype'cellType_markerGene <- FindAllMarkers(x,logfc.threshold = 0.5,only.pos = T,test.use = 'roc',min.pct = 0.2)cellType_markerGene <- subset(cellType_markerGene,!grepl(pattern = 'RP[LS]',gene))cellType_markerGene <- subset(cellType_markerGene,!grepl(pattern = 'MT-',gene))top5 <- cellType_markerGene %>% group_by(cluster) %>% top_n(n = 10, wt = myAUC)print(top5)return(cellType_markerGene)
}

注:查找差异基因是因为,如果没有去了解特征基因,可以通过差异基因进行一个“预见”

绘制小提琴图(以髓系细胞为例)

cellMarker.Mycell <- findMarker(BRCA_SingleCell.Mycell)
features<- c("HLA-DOB","HLA-DQA2","HLA-DPB1","LGALS2","CLEC9A","WDFY4","IDO1","CCND1","IRF8","RGCC","SHTN1","BATF3","DNASE1L3","CD1C","XCR1","CADM1","LIMD2","CPNE3")
vlnplt <- StackedVlnPlot(obj = BRCA_SingleCell.Mycell, features = features)
vlnplt

结果如下:

综上所述,补充的工作就完成了,主要是对描述进行一个更细致的刻画,通过热图和总UMAP全局观察细胞亚群,通过小提琴图刻画特征基因在细胞亚群中的表达!

这篇关于(五)单细胞数据分析——细胞亚群的表型特征刻画(补充工作)的文章就介绍到这儿,希望我们推荐的文章对编程师们有所帮助!



http://www.chinasem.cn/article/205884

相关文章

SpringBoot整合Flowable实现工作流的详细流程

《SpringBoot整合Flowable实现工作流的详细流程》Flowable是一个使用Java编写的轻量级业务流程引擎,Flowable流程引擎可用于部署BPMN2.0流程定义,创建这些流程定义的... 目录1、流程引擎介绍2、创建项目3、画流程图4、开发接口4.1 Java 类梳理4.2 查看流程图4

LiteFlow轻量级工作流引擎使用示例详解

《LiteFlow轻量级工作流引擎使用示例详解》:本文主要介绍LiteFlow是一个灵活、简洁且轻量的工作流引擎,适合用于中小型项目和微服务架构中的流程编排,本文给大家介绍LiteFlow轻量级工... 目录1. LiteFlow 主要特点2. 工作流定义方式3. LiteFlow 流程示例4. LiteF

SpringBoot集成LiteFlow实现轻量级工作流引擎的详细过程

《SpringBoot集成LiteFlow实现轻量级工作流引擎的详细过程》LiteFlow是一款专注于逻辑驱动流程编排的轻量级框架,它以组件化方式快速构建和执行业务流程,有效解耦复杂业务逻辑,下面给大... 目录一、基础概念1.1 组件(Component)1.2 规则(Rule)1.3 上下文(Conte

详解如何使用Python构建从数据到文档的自动化工作流

《详解如何使用Python构建从数据到文档的自动化工作流》这篇文章将通过真实工作场景拆解,为大家展示如何用Python构建自动化工作流,让工具代替人力完成这些数字苦力活,感兴趣的小伙伴可以跟随小编一起... 目录一、Excel处理:从数据搬运工到智能分析师二、PDF处理:文档工厂的智能生产线三、邮件自动化:

Python数据分析与可视化的全面指南(从数据清洗到图表呈现)

《Python数据分析与可视化的全面指南(从数据清洗到图表呈现)》Python是数据分析与可视化领域中最受欢迎的编程语言之一,凭借其丰富的库和工具,Python能够帮助我们快速处理、分析数据并生成高质... 目录一、数据采集与初步探索二、数据清洗的七种武器1. 缺失值处理策略2. 异常值检测与修正3. 数据

基于Python开发一个有趣的工作时长计算器

《基于Python开发一个有趣的工作时长计算器》随着远程办公和弹性工作制的兴起,个人及团队对于工作时长的准确统计需求日益增长,本文将使用Python和PyQt5打造一个工作时长计算器,感兴趣的小伙伴可... 目录概述功能介绍界面展示php软件使用步骤说明代码详解1.窗口初始化与布局2.工作时长计算核心逻辑3

RabbitMQ工作模式中的RPC通信模式详解

《RabbitMQ工作模式中的RPC通信模式详解》在RabbitMQ中,RPC模式通过消息队列实现远程调用功能,这篇文章给大家介绍RabbitMQ工作模式之RPC通信模式,感兴趣的朋友一起看看吧... 目录RPC通信模式概述工作流程代码案例引入依赖常量类编写客户端代码编写服务端代码RPC通信模式概述在R

Go 语言中的select语句详解及工作原理

《Go语言中的select语句详解及工作原理》在Go语言中,select语句是用于处理多个通道(channel)操作的一种控制结构,它类似于switch语句,本文给大家介绍Go语言中的select语... 目录Go 语言中的 select 是做什么的基本功能语法工作原理示例示例 1:监听多个通道示例 2:带

kotlin中的模块化结构组件及工作原理

《kotlin中的模块化结构组件及工作原理》本文介绍了Kotlin中模块化结构组件,包括ViewModel、LiveData、Room和Navigation的工作原理和基础使用,本文通过实例代码给大家... 目录ViewModel 工作原理LiveData 工作原理Room 工作原理Navigation 工

SSID究竟是什么? WiFi网络名称及工作方式解析

《SSID究竟是什么?WiFi网络名称及工作方式解析》SID可以看作是无线网络的名称,类似于有线网络中的网络名称或者路由器的名称,在无线网络中,设备通过SSID来识别和连接到特定的无线网络... 当提到 Wi-Fi 网络时,就避不开「SSID」这个术语。简单来说,SSID 就是 Wi-Fi 网络的名称。比如