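When running GO enrichment analysis on a species, we usually face one of two situations: a model species or a non-model species. Model species have dedicated OrgDb packages, but only around 20 such packages are currently available. For non-model species, one solution is to create an OrgDb package with the AnnotationForge package. This section shows how to build an OrgDb for a non-model species and use it for GO enrichment analysis.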
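The data can be obtained from the original article.
Original article: Building an OrgDb package for non-model species with the AnnotationForge package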
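1. Prepare the annotation file
The annotation file can be obtained by uploading your sequence file to the eggNOG website.
2. Install the AnnotationForge package
# Install BiocManager first if it is not already available
if (!requireNamespace("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install("AnnotationForge")
3. Load the R packages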
library(tidyverse)
library(AnnotationForge)
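4. Import the data
The column names (GID, GO, EVIDENCE) are prescribed by the makeOrgPackage function; please do not change them.
# Read the eggNOG annotation table and keep the columns we need.
# X.3 and X.4 are presumably the names R auto-generated for unnamed
# columns in this particular file; adjust them if your file differs.
emapper <- read.delim("emapper.annotations.xls") %>%
  dplyr::select(GID = query_name, Gene_Symbol = Preferred_name,
                GO = GOs, KO = KEGG_ko, Pathway = KEGG_Pathway,
                OG = X.3, Gene_Name = X.4)
# Gene information table: one row per gene, GID as the first column
gene_info <- dplyr::select(emapper, GID, Gene_Name) %>%
  dplyr::filter(!is.na(Gene_Name))
# Gene-to-GO table: split the comma-separated GO terms into one row per term
gene2go <- dplyr::select(emapper, GID, GO) %>%
  separate_rows(GO, sep = ',', convert = FALSE) %>%
  filter(GO != "", !is.na(GO)) %>%
  mutate(EVIDENCE = 'A')  # placeholder evidence code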
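To make the reshaping step concrete, here is a minimal sketch on a made-up three-gene table (the gene IDs and GO terms are invented for illustration and do not come from the real annotation file). It shows how separate_rows turns one comma-separated GO string per gene into one row per gene-term pair, then runs a few sanity checks before the table is handed to makeOrgPackage. The distinct() call and the checks are additions that are not part of the original pipeline; they assume that duplicated rows in the input tables are not wanted.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for the emapper table; IDs and GO terms are made up
toy <- data.frame(
  GID = c("gene1", "gene2", "gene3"),
  GO  = c("GO:0008150,GO:0003674", "", "GO:0005575,GO:0005575"),
  stringsAsFactors = FALSE
)

toy_gene2go <- toy %>%
  separate_rows(GO, sep = ",", convert = FALSE) %>%  # one row per GO term
  filter(GO != "", !is.na(GO)) %>%                   # drop genes without GO terms
  distinct() %>%                                     # assumption: remove repeated pairs
  mutate(EVIDENCE = "A")                             # placeholder evidence code

# Sanity checks: expected column names, no duplicated rows, well-formed GO IDs
stopifnot(
  identical(colnames(toy_gene2go), c("GID", "GO", "EVIDENCE")),
  !any(duplicated(toy_gene2go)),
  all(grepl("^GO:[0-9]{7}$", toy_gene2go$GO))
)

print(toy_gene2go)
```

The same three checks can be pointed at the real gene2go table before moving on to the build step.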
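5. Build the OrgDb package
# genus and species determine the package name (here org.MA.eg.db);
# goTable names the argument that holds the GO table.
AnnotationForge::makeOrgPackage(gene_info = gene_info,
                                go = gene2go,
                                maintainer = 'YJA<yanjunan@gmail.com>',
                                author = 'YJA',
                                version = "0.1",
                                outputDir = ".",
                                tax_id = "59729",
                                genus = "M",
                                species = "A",
                                goTable = "go")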
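The directory name org.MA.eg.db used in the following steps is derived from the genus and species arguments. As far as I understand, makeOrgPackage joins the capitalized first letter of genus with species. The helper below (guess_orgdb_name is a hypothetical name, not part of AnnotationForge) mimics that convention so the output name can be predicted before building. Treat it as an assumption and confirm against the directory that is actually created.

```r
# Hypothetical helper: mimics the naming convention makeOrgPackage appears to use
guess_orgdb_name <- function(genus, species) {
  paste0("org.", toupper(substr(genus, 1, 1)), species, ".eg.db")
}

print(guess_orgdb_name("M", "A"))                  # the values used in this post
print(guess_orgdb_name("Taeniopygia", "guttata"))  # a full genus/species pair
```

Using a real genus and species instead of single letters gives a more recognizable package name, at the cost of a longer one.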
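6. Build the R package
Run the following command in a terminal to build the generated org.MA.eg.db directory into an installable package: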
R CMD build org.MA.eg.db
{21:27}~/Desktop/GO ➭ R CMD build org.MA.eg.db
* checking for file ‘org.MA.eg.db/DESCRIPTION’ ... OK
* preparing ‘org.MA.eg.db’:
* checking DESCRIPTION meta-information ... OK
* checking for LF line-endings in source and make files and shell scripts
* checking for empty or unneeded directories
* building ‘org.MA.eg.db_0.1.tar.gz’
7. Install the R package
install.packages("org.MA.eg.db_0.1.tar.gz",repos=NULL)
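8. GO enrichment analysis
library(clusterProfiler)
library(org.MA.eg.db)
# Select differentially expressed genes; the IDs must match the GID column of the OrgDb
gene <- read.delim("genes.counts.DESeq2.xls") %>%
  filter(abs(log2FoldChange) > 1 & padj < 0.05) %>%
  pull(id)
# Run enrichment across all three GO ontologies, using GID as the key type
ego <- enrichGO(gene = gene, OrgDb = org.MA.eg.db, keyType = "GID",
                ont = "ALL", qvalueCutoff = 0.05, pvalueCutoff = 0.05)
ego %>% as.data.frame()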
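As a small illustration of the gene-selection step, the sketch below applies the same thresholds to a made-up DESeq2-style table. The column names id, log2FoldChange and padj follow the code above; the values are invented. It also shows that rows whose padj is NA are silently dropped by filter, which is worth knowing when the resulting gene list looks shorter than expected.

```r
library(dplyr)

# Toy DESeq2-style results; all values are made up
toy_deseq <- data.frame(
  id = c("gene1", "gene2", "gene3", "gene4"),
  log2FoldChange = c(2.3, -1.8, 0.4, 3.1),
  padj = c(0.001, 0.03, 0.0001, NA),
  stringsAsFactors = FALSE
)

toy_genes <- toy_deseq %>%
  filter(abs(log2FoldChange) > 1 & padj < 0.05) %>%  # NA conditions are dropped
  pull(id)

print(toy_genes)
```

Here gene3 fails the fold-change threshold and gene4 is removed because its adjusted p-value is missing.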