曼哈顿图展示差异OTUs

简介

曼哈顿图(Manhattan Plot)本质上是散点图,一般用于展示大量非零的波动数据,散点在y轴的高度突出其属性异于其他低点:最早应用于全基因组关联分析(GWAS)研究中,y轴高点显示出具有强相关性的位点。

在微生物组中,曼哈顿图在展示差异OTUs上下调情况、差异OTUs归属情况等中有比较好的效果。一般来说,在扩增子中的曼哈顿图包括:

  1. X轴:按OTUs不同分类学水平的字母顺序排列,一般不展示OTU编号标签:如果想以门水平着色,则建议按门——纲——目进行排序。

  2. Y轴:-log10(Pvalue);由于Pvalue的范围是从0-1,并且我们希望它越小越好,但直接展示时位于最下方,不符合直觉认知;因此,对Pvalue进行-log10转换是非常好的方法,此时高显著性(Pvalue趋近零)的值被高位显示,独立于其他值。

  3. 水平线:一般表示显著性水平阈值,方便直观了解每个点的显著性水平;

  4. 散点色彩:一般用于展示分类学水平;

  5. 散点形状:一般用于展示富集情况;

  6. 散点大小:一般用于展示相对丰度值;

注:该版本为初稿,后期将更新至MicrobiomeStatPlot

实例解读

例1. 曼哈顿图展示差异富OTUs分布情况1

knitr::include_graphics('曼哈顿图展示差异富OTUs分布情况.png')
image.png

图1. 曼哈顿图展示差异富OTUs分布情况(Rafal Zgadzaj et al., 2016)

图片描述:

  1. 结果总体描述:我们根据OTUs的分类情况(Phylum, Class, Order)对群落变化进行了剖析,并通过曼哈顿图展示了野生型和突变体在根或根际的富集情况.

  2. 图1结果描述(Fig. 5 C and D):与WT相比,突变株中显著富集的OTUs的数量和多样性都显著增加(实心点大小和数量均增加)。

  3. 元素解释:

    • …respect to root:根内菌为统计对照,计算富集的OTUs;
    • 散点:代表单个OTU;
    • 散点大小:相对丰度;
    • 散点形状:显著富集实心圆点,否则为圆环;
    • 灰色背景:间隔每个目水平(或强调是否存在显著富集OTUs);
    • 水平线:显著性水平p = 0.05;

Manhattan plots showing root-enriched OTUs in WT (A) or in the mutants (B) with respect to rhizosphere and rhizosphere-enriched OTUs in WT (C) or in the mutants (D) with respect to root. OTUs that are significantly enriched (also with respect to soil) are depicted as full circles. The dashed line corresponds to the false discovery rate-corrected P value threshold of significance (α = 0.05). The color of each dot represents the different taxonomic affiliation of the OTUs (order level), and the size corresponds to their RAs in the respective samples [WT root samples (A), mutant root samples (B), WT rhizosphere samples (C), and mutant rhizosphere samples (D)]. Gray boxes are used to denote the different taxonomic groups (order level).

注意事项:

  1. 此处数据通过每个OTU的总体相对丰度 > 0.05%筛选;

例2. 曼哈顿图展示籼稻或粳稻中富集的OTUs2

knitr::include_graphics('曼哈顿图展示籼稻或粳稻中富集的OTUs.png')
image.png

图2. 曼哈顿图展示籼稻或粳稻中富集的OTUs(Jingying Zhang et al., 2019)

图片描述:

  1. 总体结果描述:首先,我们利用曼哈顿图对OTUs进行富集分析。

    • 在籼稻中富集门水平包括:Acidobacteria, Proteobacteria, Actinobacteria, Bacteroidetes, Chloroflexi, Firmicutes and Verrucomicrobia;

    • 在粳稻中富集门水平主要是:Proteobacteria, Bacteroidetes and Firmicutes;

    • 并且发现其了重叠的富集区域;

  2. 元素解释:

    • 每个点或三角形代表一个OTU;
    • 点的形状:indica中富集OTU为实心三角形;japonica中富集OTU为空心三角形;实心点表明无显著富集;
    • OTUs按照门的分类字母顺序排列和着色,变形杆菌则按照纲排列;
    • 与x平行虚线:应该是显著富集的OTU中最小的-log10(Pvalue);

Manhattan plot showing OTUs enriched in indica or japonica in field I (a) and field II (b). Each dot or triangle represents a single OTU. OTUs enriched in indica or japonica are represented by filled or empty triangles, respectively (FDR adjusted P < 0.05, Wilcoxon rank sum test). OTUs are arranged in taxonomic order and colored according to the phylum or, for Proteobacteria, the class. CPM, counts per million.

例3. 曼哈顿图展示三萜突变体对微生物组的调控作用3

knitr::include_graphics('曼哈顿图展示三萜突变体对微生物组的调控作用.png')
image.png

图3. 曼哈顿图展示三萜突变体对微生物组的调控作用(Ancheng C. Huang et al., 2019)

图片描述:

  1. 元素解释:

    • 每个点代表一个OTU;
    • 散点形状:显著富集——实心上三角;显著下调——空心下三角;
    • OTUs按物种分类水平字母排序,并按门或纲水平(个别)着色;

Manhattan plots showing modulation of OTUs in the microbiota of triterpene mutants. Manhattan plot showing enriched and depleted OTUs. Each dot or triangle represents a single OTU. OTUs enriched or depleted in mutants are shown with filled or empty triangles (P < 0.05), respectively. OTUs are arranged according to taxonomy and colored at the phylum or class levels. CPM represents counts per million.

最近发表的其他文章4,5也用到曼哈顿图对富集OTUs进行了展示,图片元素基本与上述三个例子相同。

绘图实战

为了便于后期添加元素,此处将数据处理和可视化分开,使用时可以随时检查,便于使用。

环境设置

本教程需要在R语言环境下运行,推荐在RStudio界面中学习。目前测试版本为:Windows 7环境,R 3.6.3和 RStudio 1.2.5033。理论上Mac、Linux系统,以及R或RStudio的更新版本是兼容的,但并没有广泛测试,有问题欢迎自行解决并在群在与同行分享经验。

按需求安装,没必要每次都运行该安装代码。

if (!requireNamespace("tidyverse", quietly = TRUE))
  install.packages("tidyverse")

if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")

if (!requireNamespace("edgeR", quietly = TRUE))
  BiocManager::install("edgeR")

数据处理函数介绍

otu_table数据源:MicrobiomeStatPlot项目中的示例数据。

  1. 加载需要用的包
library(edgeR)
library(tidyverse)
  1. enrichment_analyses参数介绍:

    • otu: otu_table
    • design: 实验设计信息,样品信息的列名设置为samples,分组信息列名设置为group
    • taxonomy_table: OTUs的分类水平表
    • threshold: 相对丰度阈值,希望丢弃(获取)的相对丰度大小
    • contrasts: 指定对比矩阵(“KO-WT”:表示KO相对于WT显著富集的OTUs)
enrichment_analyses <- function(otu_table, design, taxonomy_table, threshold, contrasts){

  ## filtering relative abundance by thresholding
  otu_relative <- apply(otu_table, 2, function(x){x/sum(x)})
  if (missing(threshold))
    threshold = 0.0005
  idx <- rowSums(otu_relative > threshold) >= 1
  otu <- as.data.frame(otu_table[idx, ])
  otu_relative <- as.data.frame(otu_relative[idx, ])

  ## construct a DGEList
  dge_list <- DGEList(counts = otu, group = design$group)

  ## Remove the lower abundance(In this case, it's usually useless)
  keep <- rowSums(dge_list$counts) >= 0
  dge_keep <- dge_list[keep, ,keep.lib.sizes = FALSE]

  ## scale the raw library sizes dgelist
  dge <- calcNormFactors(dge_keep)

  ## set the design_mat
  design_mat <- model.matrix(~0 + dge$samples$group)
  colnames(design_mat) <- gsub("([dge$samples$group])", 
                               "", colnames(design_mat))
  ## fit the GLM
  GLMC = estimateGLMCommonDisp(dge, design_mat)
  GLMT = estimateGLMTagwiseDisp(GLMC, design_mat)
  fit = glmFit(GLMT, design_mat)

  ## conducts likelihood ratio tests
  contrast_mat <- makeContrasts(
    contrasts = contrasts, 
    levels = colnames(design_mat))

  ## Fit a negative binomial generalized log-linear model to the read counts
  lrt = glmLRT(fit, contrast = contrast_mat)

  ## Multiple Testing
  de_lrt <- decideTestsDGE(lrt, adjust.method = "fdr", p.value = 0.05)

  ## extract values
  data <- lrt$table
  data$sign_level <- de_lrt@.Data

  ## enrichment status
  data$enrichment <- as.factor(ifelse(data$sign_level == 1, "enriched", ifelse(data$sign_level == -1, "depleted","nosig")))
  ## get the out's names
  data$otus <- rownames(data)

  ## negative logarithmes transformation
  data$neglog_p = -log(data$PValue)

  ## remove low foldchange
  idx <- data$logFC < 0
  data$neglog_p[idx] <- 0

  ## reorder OTUs according to taxonomy(Phylum, Class, Order)
  taxonomy <- taxonomy_table[order(taxonomy_table[, 3], 
                                   taxonomy_table[, 4], 
                                   taxonomy_table[, 5]), ]

  idx <- taxonomy[, 1] %in% data$otu
  taxonomy <- taxonomy[idx, ]

  idx <- match(taxonomy[, 1], data$otu)
  data <- data[idx, ]

  data$classification  <- taxonomy[, 3]
  data$otu <- factor(data$otu, levels = data$otu)

  ## calculating the relative abundance 
  ra <- apply(otu_relative, 1, mean)
  data$ra <- ra[match(data$otu, names(ra))]

  return(data)
} 
  1. threshold_line 参数介绍:
    • Pvalue:期望的显著性P值得阈值(0.01, 0.05)
    • data:enrichment_analyses 计算结果
threshold_line <- function(data, Pvalue){
  if (missing(Pvalue)){
    FDR <- min(data$neglog_p[data$enrichment == "enriched"])}
  else {
    FDR <- -log(Pvalue)}
 return(FDR) 
}
  1. basic_theme:包含基本作图主题参数
basic_theme <- theme(panel.background = element_blank(),
                     panel.grid = element_blank(),
                     axis.ticks.x = element_blank(),
                     axis.text.x = element_blank(),
                     axis.line = element_line(size = 1),
                     legend.background = element_blank(),
                     legend.key = element_blank(),
                     legend.position = "top")

数据处理示例

## file_otu <-  "...\\otutab.txt"
## file_tax <- "...\\taxonomy.txt"
## loading data
otu <- read.delim("otutab.txt", header = TRUE, 
                  sep = "\t", row.names = 1,
                  stringsAsFactors = FALSE)

taxonomy <- read.delim("taxonomy.txt", header = TRUE, 
                       sep = "\t", 
                       stringsAsFactors = FALSE)

design <- data.frame(
  samples = colnames(otu),
  group = rep(c("KO", "WT", "OE" ),each = 6))

## Running enrichment_analyses

data <- enrichment_analyses(otu_table = otu, design = design, taxonomy_table = taxonomy, contrasts = "KO-WT")

FDR <- threshold_line(data = data)

ggpplot2 可视化示例

ggplot(data, aes(x = otu, y = neglog_p, color = classification, size = ra, shape = enrichment)) +
  geom_point(alpha = 0.8) +
  geom_hline(yintercept = FDR, linetype = 2, color = "red") +
  scale_shape_manual(values=c("enriched" = 17, 
                                "depleted" = 25, 
                                "nosig" = 20))+
  labs(x="OTUs", y = expression(~-log[10](P))) +
  guides(colour = "none") +
  basic_theme
image.png

图4. 曼哈顿图展示富集OTUs

间隔灰色设置

  1. rect_coordinates:

    • 计算灰度注释时,需要先设定x索引
    • data:enrichment_analyses 计算结果
rect_coordinates <- function(data){
  manhattan_rect <- subset(data, select = c(classification,
                                            x_index))

  rect_coordinates <- manhattan_rect %>% 
    group_by(classification) %>% 
    summarise(xmin = min(x_index),
              xmax = max(x_index))
  return(rect_coordinates)}
  1. ggplot2 可视化
data$x_index <- 1:nrow(data)
rect_coordinates <- rect_coordinates(data = data)

p <- ggplot(data, aes(x = x_index, y =  neglog_p))

for (i in 1:(nrow(rect_coordinates))) 
  p <- p + annotate('rect', 
                    xmin = rect_coordinates$xmin[i], 
                    xmax = rect_coordinates$xmax[i], 
                    ymin = 0, 
                    ymax = Inf,
                    fill = ifelse(i %% 2 == 0, 
                                  'gray50', 
                                  'gray95'))

p + geom_point(aes(color = classification, 
               size = ra, shape = enrichment)) + 
  geom_hline(yintercept = FDR, linetype = 2, color = "red") +
  scale_shape_manual(values=c("enriched" = 17, 
                              "depleted" = 25, 
                              "nosig" = 20))+
  labs(x="OTUs", y = expression(~-log[10](P))) +
  scale_x_continuous(expand = c(0, 0)) +
  guides(colour = "none") +
  basic_theme
image.png

图5. 灰区间隔的曼哈顿图展示富集OTUs

丰度排序着色

  1. top_class:

    • 计算灰度注释时,需要先设定x索引
    • data:enrichment_analyses 计算结果
    • top: 需要突出的丰度排名
top_class <- function(data, top){
  manhattan_rect <- subset(data, select = c(classification,
                                            x_index, 
                                            enrichment, 
                                            ra))
  rect_coordinates <- manhattan_rect %>% 
    group_by(classification) %>% 
    summarise(xmin = min(x_index),
              xmax = max(x_index),
              center = (xmin + xmax)/2)

  enriched_data <- subset(manhattan_rect, enrichment == "enriched")

  enriched_data %>% 
    group_by(classification) %>% 
    summarise(sum_ra = sum(ra)) -> enriched_ra             

  fill_data <- enriched_ra[order(enriched_ra$sum_ra, decreasing = TRUE), 1]

  fill_data <- as.data.frame(fill_data[c(1:top),])

  class_name <- merge(fill_data, rect_coordinates, by = "classification",
                      sort = FALSE)
  class_name <- class_name[c(1:top), ]

  reslut <- list(rect_coordinates = rect_coordinates, 
         fill_data = unlist(fill_data), 
         class_name = class_name)
  return(reslut)
}
  1. ggplot 可视化

将显著富集的OTUs 按总相对丰度排序,用较深灰色突出并注明分类水平名称。

data$x_index <- 1:nrow(data)
reslut <- top_class(data = data, top = 3)

rect_coordinates <- reslut$rect_coordinates
fill_data <- reslut$ fill_data
class_name<- reslut$class_name

p <- ggplot(data, aes(x = x_index, y=  neglog_p))

for (i in 1:(nrow(rect_coordinates))) 
  p <- p + annotate('rect', 
                    xmin = rect_coordinates$xmin[i], 
                    xmax = rect_coordinates$xmax[i], 
                    ymin = 0, 
                    ymax = ceiling(max(data$neglog_p)/5)*5,
                    fill = ifelse(rect_coordinates$classification[i] %in% fill_data, 
                                  'gray65', 
                                  'gray85'))

p + geom_point(aes(color = classification, size = ra, 
                   shape = enrichment)) +
  geom_hline(yintercept=FDR, linetype = 2, color = "red") +
  scale_shape_manual(values=c("enriched" = 17, 
                              "depleted" = 25, 
                              "nosig" = 20))+
  labs(x="OTUs", y="-log10(P)") +
  guides(colour = "none") +
  basic_theme + 
  scale_x_continuous(expand = c(0, 0)) +
  coord_cartesian(clip = "off") +
  geom_text(data = class_name, aes(x = center,
                                   y = Inf,
                                   label = classification))
image.png

图6. 深灰色突出高丰度富集OTUs

1. Zgadzaj, R. et al. Root nodule symbiosis in lotus japonicus drives the establishment of distinctive rhizosphere, root, and nodule bacterial communities. Proc Natl Acad Sci U S A 113, E7996–E8005 (2016).

2. Zhang, J. et al. NRT1.1B is associated with root microbiota composition and nitrogen use in field-grown rice. Nat Biotechnol 37, 676–684 (2019).

3. Huang, A. C. et al. A specialized metabolic network selectively modulates arabidopsis root microbiota. Science 364, (2019).

4. Wang, J. et al. Arsenic concentrations, diversity and co-occurrence patterns of bacterial and fungal communities in the feces of mice under sub-chronic arsenic exposure through food. Environ Int 138, 105600 (2020).

5. Zhu, M. et al. Synchronous response in methanogenesis and anaerobic degradation of pentachlorophenol in flooded soil. J Hazard Mater 374, 258–266 (2019).

最后编辑于
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 201,681评论 5 474
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 84,710评论 2 377
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 148,623评论 0 334
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 54,202评论 1 272
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 63,232评论 5 363
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,368评论 1 281
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 37,795评论 3 393
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,461评论 0 256
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 40,647评论 1 295
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,476评论 2 317
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,525评论 1 329
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,226评论 3 318
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 38,785评论 3 303
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,857评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,090评论 1 258
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 42,647评论 2 348
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,215评论 2 341