基因课 (Genek) FTP address: ftp://http://gsx.genek.tv/2020-3-10%E7%9B%B4%E6%92%AD%E4%B8%80%E4%B8%AA%E5%AE%8C%E6%95%B4%E7%9A%84%E8%BD%AC%E5%BD%95%E7%BB%84%E9%A1%B9%E7%9B%AE/
Notes from Zhang Xudong's course
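R Markdown
- Can generate HTML files
- When finished, click Knit to render the document; knitr produces an intermediate Markdown file along the way
- Bookdown is an extended version of R Markdown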
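Import the data
de_result <- read.table(file = 'de_result.tab', sep = '\t', header = TRUE) # assign the result; header = TRUE assumes the first row holds the column names used below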
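The later commands rely on columns named id, log2FoldChange, padj, and direction. A minimal sketch of an import that yields such a table, using made-up in-memory text in place of de_result.tab (the gene IDs and values are hypothetical, and the real file's columns may differ):

```r
# Hypothetical tab-separated text standing in for the contents of de_result.tab
demo_text <- paste(
  "id\tlog2FoldChange\tpadj\tdirection",
  "g1\t2.5\t1e-10\tup",
  "g2\t-0.2\t0.6\tns",
  "g3\t-3.1\t1e-20\tdown",
  sep = "\n"
)

# header = TRUE takes the column names from the first row
de_demo <- read.table(text = demo_text, sep = "\t", header = TRUE,
                      stringsAsFactors = FALSE)

# Check the column names and types before plotting
str(de_demo)
```

If str() shows names such as V1, V2, the header row was not read and the aes() calls below will not find their columns.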
Load packages
library(ggplot2)
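library(ggsci) # ggplot2 extension package with color scales modeled on various journals
Set up the framework
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) # create the canvas
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) + geom_point() + # draw the scatter plot by adding a point geom; '+' chains the next component
theme_bw() # switch theme; theme_classic, theme_test, etc. are also worth trying, and theme_bw suits scientific figures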
Map direction to point color
- Give up-regulated, down-regulated, and ns (not significant) genes different point colors in the volcano plot
- commands
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction)) +
scale_color_npg() +
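theme_bw()
- Notes
- aes() is the function that declares aesthetic mappings
- scale_color_xxx() applies a ggsci palette; npg is the Nature Publishing Group palette, jco is another, and there are many more
Scales
- Different journals have different preferences for figure colors and styles
- ggsci is a ggplot2 extension package that collects palettes in the style of various scientific journals
- scale_color_xxx() # palette scale; replace xxx with the journal's abbreviation, see the help documentation
Customizing colors
- This requires fixing the order of the categories the colors map to, and defining a custom palette
- commands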
library(tidyverse)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction)) +
scale_color_manual(values = my_palette) +
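theme_bw()
- Notes
- The direction column holds only three values (up/ns/down), so it is a discrete variable with no inherent order; converting it to a factor with explicit levels gives it a defined order
- mutate() requires tidyverse (dplyr) to be loaded
- mutate() can change a column's data type, here to a factor with a defined level order
- my_palette holds the custom colors; try my_palette <- c('#E64B35FF', '#999999', '#4DBBD5FF'). Colors are assigned in factor-level order; if no level order is defined, discrete values are sorted alphabetically by default
- scale_color_manual() takes your own values
- Color tuning
- Darker, more muted colors tend to look better
- Search online for "hexadecimal color codes"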
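How the level order decides which color each category receives can be checked without drawing anything. A minimal base-R sketch with made-up direction values; the pairing by position mirrors what scale_color_manual() does with an unnamed values vector:

```r
direction <- c("up", "ns", "down", "ns", "up")   # made-up values

# Without explicit levels, the levels are sorted alphabetically
default_levels <- levels(factor(direction))

# With explicit levels, the order is the one given
ordered_levels <- levels(factor(direction, levels = c("up", "ns", "down")))

my_palette <- c("green", "grey", "red")

# Colors are paired with levels by position
default_pairing <- setNames(my_palette, default_levels)
ordered_pairing <- setNames(my_palette, ordered_levels)

print(default_pairing)
print(ordered_pairing)
```

With the default alphabetical order, 'green' would land on down rather than up. Passing a named vector such as c(up = 'green', ns = 'grey', down = 'red') to scale_color_manual(values = ...) matches colors by name and avoids relying on level order.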
Changing point size
- Points with a larger |log2FC| are drawn larger
- commands
library(ggplot2)
library(tidyverse)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction, size = abs(log2FoldChange))) +
scale_color_manual(values = my_palette) +
scale_size(range = c(0.1, 2)) +
theme_bw()
- Notes
- geom_point(aes(size = ?)) sets the variable mapped to point size
- abs() takes the absolute value
- scale_size(range = ?) sets the range of point sizes, typically 0.1 to 2 or 0.1 to 3
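To see what scale_size(range = ...) does to the mapped values, the computed sizes can be inspected with layer_data(). A minimal sketch on made-up data (the data frame demo is hypothetical; the column names follow these notes):

```r
library(ggplot2)

# Made-up data; column names follow the notes above
demo <- data.frame(
  log2FoldChange = c(-4, -1, 0.5, 2, 6),
  padj = c(1e-8, 0.2, 0.7, 1e-3, 1e-12)
)

p <- ggplot(demo, aes(x = log2FoldChange, y = -log10(padj))) +
  geom_point(aes(size = abs(log2FoldChange))) +
  scale_size(range = c(0.1, 2))

# layer_data() returns the computed aesthetics of the first layer
built_sizes <- layer_data(p, 1)$size
range(built_sizes)
```

The smallest |log2FoldChange| is drawn at the lower end of the range and the largest at the upper end, so widening the range exaggerates the difference between points.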
Add transparency (optional)
library(tidyverse)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction,
size = abs(log2FoldChange),
alpha = abs(log2FoldChange))) +
scale_color_manual(values = my_palette) +
scale_size(range = c(0.1, 2)) +
theme_bw()
Point shape
- Points with a border behave differently from points without one
- commands
library(ggplot2)
library(tidyverse)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(shape = 21,
alpha = 1/2,
color = 'black',
aes(fill = direction,
size = abs(log2FoldChange))) +
scale_fill_manual(values = my_palette) + # the palette goes on the fill scale, because direction is mapped to fill
scale_size(range = c(0.1, 2)) +
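theme_bw()
- Notes
- shape = 21 selects a point with a border
- alpha = 1/2 sets the transparency of every point
- color sets the border color
- fill maps a variable to the fill color
Adding threshold lines
- Line types in R
- commands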
library(ggplot2)
library(tidyverse)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction, size = abs(log2FoldChange))) +
geom_hline(yintercept = -log10(0.05), linetype = 'dashed', size = 0.2) +
geom_vline(xintercept = c(-1, 1), linetype = 'dashed') +
scale_color_manual(values = my_palette) +
scale_size(range = c(0.1, 2)) +
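theme_bw()
- Notes
- geom_hline() draws horizontal lines; geom_vline() draws vertical lines
- yintercept gives the line's position on the y axis; xintercept gives its position on the x axis
- The y axis shows a computed value, -log10(padj), so the threshold must be transformed the same way, as -log10(0.05), rather than passed as the raw 0.05
- linetype = 'dashed' draws a dashed line
- size adjusts line thickness (ggplot2 3.4.0 and later prefer linewidth for lines); the default is usually fine
- color adjusts line color; the default is black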
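The three dashed lines presumably mark the cutoffs behind the direction column. A minimal sketch, assuming the rule padj < 0.05 and |log2FoldChange| > 1 (the notes do not state how direction was computed) and using made-up values:

```r
log2FoldChange <- c(2.5, -0.2, -3.1, 1.8, 0.4)   # made-up values
padj <- c(1e-10, 0.6, 1e-20, 0.2, 0.01)

# Position of the horizontal line on the transformed y axis
y_cut <- -log10(0.05)

# Assumed classification rule matching the three threshold lines
direction <- ifelse(padj < 0.05 & log2FoldChange > 1, "up",
             ifelse(padj < 0.05 & log2FoldChange < -1, "down", "ns"))

# A point lies above the horizontal line when padj < 0.05
above_line <- -log10(padj) > y_cut

data.frame(log2FoldChange, padj, direction, above_line)
```

The last row shows a gene that is above the horizontal line but still ns, because its fold change falls between the two vertical lines.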
Adding labels
- commands
library(ggplot2)
library(tidyverse)
library(ggrepel)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
top_de <- filter(de_result,
abs(log2FoldChange) > 2 & padj < 1e-50)
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction, size = abs(log2FoldChange))) +
geom_hline(yintercept = -log10(0.05), linetype = 'dashed', size = 0.2) +
geom_vline(xintercept = c(-1, 1), linetype = 'dashed') +
geom_label_repel(data = top_de, aes(label = id)) +
ylim(c(0, 200)) +
scale_color_manual(values = my_palette) +
scale_size(range = c(0.1, 2)) +
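theme_bw()
- Notes
- The ggrepel package adds labels and automatically keeps them from overlapping
- Label the key points
- filter() requires tidyverse (dplyr) to be loaded
- Note the scientific notation used for the padj cutoff (1e-50)
- To label specific points instead, assign top_de as follows
top_de <- filter(de_result, id == 'HF01786')
- geom_label_repel() specifies the labels and the variable mapped to them
- ylim() drops points with extremely large values (typically genes that are already well studied), so they do not compress the display of the other genes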
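Which rows the label filter keeps, and which rows ylim(c(0, 200)) would drop, can be previewed on a small table first. A minimal sketch with made-up genes, written with base subset() so it runs without extra packages:

```r
# Made-up table; padj values chosen to span the cutoffs
de_demo <- data.frame(
  id = c("g1", "g2", "g3", "g4"),
  log2FoldChange = c(3.2, -2.6, 5.0, 0.3),
  padj = c(1e-60, 1e-10, 1e-250, 0.5)
)

# Same condition as the filter() call above, written with base subset()
top_demo <- subset(de_demo, abs(log2FoldChange) > 2 & padj < 1e-50)

# Rows whose y value exceeds the upper limit would be dropped by ylim(c(0, 200))
dropped <- subset(de_demo, -log10(padj) > 200)

top_demo$id
dropped$id
```

Here g3 passes the label filter but lies above the y limit, so neither its point nor its label would appear in the plot.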
Setting axis names and the title
- commands
library(ggplot2)
library(tidyverse)
library(ggrepel)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
top_de <- filter(de_result,
abs(log2FoldChange) > 2 & padj < 1e-50)
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction, size = abs(log2FoldChange))) +
geom_hline(yintercept = -log10(0.05), linetype = 'dashed', size = 0.2) +
geom_vline(xintercept = c(-1, 1), linetype = 'dashed') +
geom_label_repel(data = top_de, aes(label = id)) +
scale_color_manual(values = my_palette) +
scale_size(range = c(0.1, 2)) +
labs(x = 'log2 fold change',
y = '-log10(P value)',
title = 'Volcano Plot',
size = 'log2 fold change') +
theme_bw() +
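theme(plot.title = element_text(size = 18, hjust = 0.5))
- Notes
- labs() sets the titles of the x axis, the y axis, the legend (size), and the whole plot
- In theme_bw the title is left-aligned; centering it has to be set by hand. Order matters: call theme_bw() first, then theme()
- theme() changes individual theme settings; plot.title = element_text() styles the title: size sets the font size, hjust = 0/0.5/1 means left/center/right, and vjust adjusts vertical alignment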
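Why theme() has to come after theme_bw() can be checked by comparing the two orders. A minimal sketch using the built-in mtcars data as a stand-in (the plot itself is hypothetical; only the order of the theme calls matters):

```r
library(ggplot2)

base_plot <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "Demo")

centered <- theme(plot.title = element_text(size = 18, hjust = 0.5))

# theme_bw() first, then theme(): the centered title is kept
p_kept <- base_plot + theme_bw() + centered

# theme() first, then theme_bw(): the complete theme replaces the earlier setting
p_reset <- base_plot + centered + theme_bw()

p_kept$theme$plot.title$hjust
p_reset$theme$plot.title$hjust
```

theme_bw() is a complete theme, so adding it last resets every element, including the title alignment set just before it.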
Modifying the legend
- commands
library(ggplot2)
library(tidyverse)
library(ggrepel)
de_result <- mutate(de_result, direction = factor(direction, levels = c('up', 'ns', 'down')))
top_de <- filter(de_result,
abs(log2FoldChange) > 2 & padj < 1e-50)
my_palette <- c('green', 'grey', 'red')
ggplot(data = de_result, aes(x = log2FoldChange, y = -log10(padj))) +
geom_point(aes(color = direction, size = abs(log2FoldChange))) +
geom_hline(yintercept = -log10(0.05), linetype = 'dashed', size = 0.2) +
geom_vline(xintercept = c(-1, 1), linetype = 'dashed') +
geom_label_repel(data = top_de, aes(label = id)) +
scale_color_manual(values = my_palette) +
scale_size(range = c(0.1, 2)) +
labs(x = 'log2 fold change',
y = '-log10(P value)',
title = 'Volcano Plot',
size = 'log2 fold change') +
guides(size = 'none') + # 'none' replaces FALSE, which is deprecated since ggplot2 3.3.4
theme_bw() +
theme(plot.title = element_text(size = 18, hjust = 0.5),
legend.background = element_blank(),
legend.key = element_blank(),
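legend.position = c(0.93, 0.85))
- Notes
- guides() removes a legend
- In theme(), legend.background sets the background of the legend box and legend.key sets the background behind the legend symbols; making both blank keeps the legend from covering the grid lines
- legend.position sets the legend location; the default is legend.position = 'right', and placing it inside the plot requires manual adjustment
- legend.position = c(x, y), where 0 to 1 spans the axis minimum to maximum; adjust by trying several values
Exporting the figure
- Method 1
Export as PDF from RStudio; do not use PNG
- Method 2
Open a PDF device before plotting, then close it afterwards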
pdf(file = 'p1.pdf')
ggplot(... ...) # plotting code omitted
dev.off()
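A runnable version of Method 2, as a minimal sketch: the plot uses the built-in mtcars data as a stand-in, and the output path and page size are hypothetical choices.

```r
library(ggplot2)

out_file <- file.path(tempdir(), "p1_demo.pdf")   # hypothetical output path

p <- ggplot(mtcars, aes(x = wt, y = mpg)) + geom_point()

pdf(file = out_file, width = 6, height = 5)  # open the PDF device (size in inches)
print(p)                                      # explicit print() is needed inside loops, functions, and source()d files
invisible(dev.off())                          # close the device so the file is written

file.exists(out_file)
```

ggplot2 also provides ggsave('p1.pdf', plot = p, width = 6, height = 5), which opens and closes the device in one call.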
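Summary
- 1. Layers: color, transparency, shape, etc. set outside aes() apply globally to the layer
- 2. Mappings: color, transparency, shape, etc. set inside aes() vary with the data category
- 3. Scales: control how data values are translated into those properties
- 4. Themes: control the non-data appearance, that is, whether the figure looks good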
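The difference between a layer-wide setting and a mapping can be seen side by side. A minimal sketch on a made-up three-row table (the data frame demo and its values are hypothetical):

```r
library(ggplot2)

demo <- data.frame(
  x = c(1, 2, 3),
  y = c(3, 1, 2),
  direction = factor(c("up", "ns", "down"), levels = c("up", "ns", "down"))
)

# Setting: color outside aes() applies to every point in the layer
p_set <- ggplot(demo, aes(x, y)) + geom_point(color = "red")

# Mapping: color inside aes() varies with direction; the scale decides which colors
p_map <- ggplot(demo, aes(x, y)) +
  geom_point(aes(color = direction)) +
  scale_color_manual(values = c("green", "grey", "red")) +
  theme_bw()

# Inspect the colors actually assigned to the points
unique(layer_data(p_set, 1)$colour)
unique(layer_data(p_map, 1)$colour)
```

The first plot has a single color for all points, while the second has one color per level of direction, chosen by the scale and displayed under the theme.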