clusterProfiler花式用法

Chapter 3 Universal enrichment analysis

clusterProfiler supports both hypergeometric test and gene set enrichment analyses of many ontology/pathway, but it’s still not enough for users may want to analyze their data with unsupported organisms, slim version of GO, novel functional annotation (e.g. GO via BlastGO or KEGG via KAAS), unsupported ontologies/pathways or customized annotations.

clusterProfiler provides enricher function for hypergeometric test and GSEA function for gene set enrichment analysis that are designed to accept user defined annotation. They accept two additional parameters TERM2GENE and TERM2NAME. As indicated in the parameter names, TERM2GENE is a data.frame with first column of term ID and second column of corresponding mapped gene and TERM2NAME is a data.frame with first column of term ID and second column of corresponding term name. TERM2NAME is optional.

3.1 Input data

For over representation analysis, all we need is a gene vector, that is a vector of gene IDs. These gene IDs can be obtained by differential expression analysis (e.g. with DESeq2 package).

For gene set enrichment analysis, we need a ranked list of genes. DOSE provides an example dataset geneList which was derived from R package breastCancerMAINZ that contained 200 samples, including 29 samples in grade I, 136 samples in grade II and 35 samples in grade III. We computed the ratios of geometric means of grade III samples versus geometric means of grade I samples. Logarithm of these ratios (base 2) were stored in geneList dataset.

The geneList contains three features:

  1. numeric vector: fold change or other type of numerical variable
  2. named vector: every number was named by the corresponding gene ID
  3. sorted vector: number should be sorted in decreasing order

Suppose you are importing your own data from a csv file and the file contains two columns, one for gene ID (no duplicated allowed) and another one for fold change, you can prepare your own geneList via the following command:

d <- read.csv(your_csv_file)
## assume that 1st column is ID
## 2nd column is fold change

## feature 1: numeric vector
geneList <- d[,2]

## feature 2: named vector
names(geneList) <- as.character(d[,1])

## feature 3: decreasing order
geneList <- sort(geneList, decreasing = TRUE)

We can load the sample data into R via:

data(geneList, package="DOSE")
head(geneList)
##     4312     8318    10874    55143    55388      991 
## 4.572613 4.514594 4.418218 4.144075 3.876258 3.677857

Suppose we define fold change greater than 2 as DEGs:

gene <- names(geneList)[abs(geneList) > 2]
head(gene)
## [1] "4312"  "8318"  "10874" "55143" "55388" "991"

3.2 WikiPathways analysis

WikiPathways is a continuously updated pathway database curated by a community of researchers and pathway enthusiasts. WikiPathways produces monthly releases of gmt files for supported organisms at data.wikipathways.org. Download the appropriate gmt file and then generate TERM2GENE and TERM2NAME to use enricher and GSEA functions.

library(magrittr)
library(clusterProfiler)

data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]

wpgmtfile <- system.file("extdata/wikipathways-20180810-gmt-Homo_sapiens.gmt", package="clusterProfiler")
wp2gene <- read.gmt(wpgmtfile)
wp2gene <- wp2gene %>% tidyr::separate(ont, c("name","version","wpid","org"), "%")
wpid2gene <- wp2gene %>% dplyr::select(wpid, gene) #TERM2GENE
wpid2name <- wp2gene %>% dplyr::select(wpid, name) #TERM2NAME

ewp <- enricher(gene, TERM2GENE = wpid2gene, TERM2NAME = wpid2name)
head(ewp)
##            ID
## WP2446 WP2446
## WP2361 WP2361
## WP179   WP179
## WP3942 WP3942
## WP4240 WP4240
## WP2328 WP2328
##                                                                           Description
## WP2446                                                  Retinoblastoma (RB) in Cancer
## WP2361                                                       Gastric Cancer Network 1
## WP179                                                                      Cell Cycle
## WP3942                                                         PPAR signaling pathway
## WP4240 Regulation of sister chromatid separation at the metaphase-anaphase transition
## WP2328                                                            Allograft Rejection
##        GeneRatio  BgRatio       pvalue     p.adjust
## WP2446     11/95  88/6249 6.801697e-08 1.054263e-05
## WP2361      6/95  29/6249 3.772735e-06 2.923870e-04
## WP179      10/95 122/6249 1.384549e-05 7.153503e-04
## WP3942      7/95  67/6249 6.210513e-05 2.406574e-03
## WP4240      4/95  16/6249 7.931988e-05 2.458916e-03
## WP2328      7/95  90/6249 4.016758e-04 1.037663e-02
##              qvalue
## WP2446 9.450779e-06
## WP2361 2.621058e-04
## WP179  6.412648e-04
## WP3942 2.157336e-03
## WP4240 2.204258e-03
## WP2328 9.301967e-03
##                                                       geneID
## WP2446 8318/9133/7153/6241/890/983/81620/7272/1111/891/24137
## WP2361                       4605/7153/11065/22974/6286/6790
## WP179          8318/991/9133/890/983/7272/1111/891/4174/9232
## WP3942                    4312/9415/9370/5105/2167/3158/5346
## WP4240                                    991/1062/4085/9232
## WP2328                   10563/6373/4283/3002/10578/3117/730
##        Count
## WP2446    11
## WP2361     6
## WP179     10
## WP3942     7
## WP4240     4
## WP2328     7
ewp2 <- GSEA(geneList, TERM2GENE = wpid2gene, TERM2NAME = wpid2name, verbose=FALSE)
head(ewp2)
##            ID
## WP4172 WP4172
## WP3932 WP3932
## WP306   WP306
## WP236   WP236
## WP474   WP474
## WP2911 WP2911
##                                           Description
## WP4172                     PI3K-Akt Signaling Pathway
## WP3932 Focal Adhesion-PI3K-Akt-mTOR-signaling pathway
## WP306                                  Focal Adhesion
## WP236                                    Adipogenesis
## WP474                       Endochondral Ossification
## WP2911    miRNA targets in ECM and membrane receptors
##        setSize enrichmentScore       NES      pvalue
## WP4172     311      -0.3406070 -1.467847 0.001347709
## WP3932     281      -0.3903410 -1.664489 0.001383126
## WP306      188      -0.4230017 -1.731019 0.001449275
## WP236      125      -0.4387536 -1.694197 0.001526718
## WP474       59      -0.5111776 -1.751890 0.001572327
## WP2911      21      -0.6953475 -1.888706 0.001757469
##          p.adjust    qvalues rank
## WP4172 0.03001658 0.02487562 2133
## WP3932 0.03001658 0.02487562 1994
## WP306  0.03001658 0.02487562 2221
## WP236  0.03001658 0.02487562 2301
## WP474  0.03001658 0.02487562 2142
## WP2911 0.03001658 0.02487562 2629
##                          leading_edge
## WP4172 tags=24%, list=17%, signal=21%
## WP3932 tags=26%, list=16%, signal=22%
## WP306  tags=28%, list=18%, signal=23%
## WP236  tags=34%, list=18%, signal=28%
## WP474  tags=47%, list=17%, signal=40%
## WP2911 tags=76%, list=21%, signal=60%
##                                                                                                                                                                                                                                                                                                                                                                                        core_enrichment
## WP4172 5228/7424/7157/627/2252/7059/92579/5563/5295/6794/1288/7010/3910/3371/3082/1291/4602/3791/1027/90993/3441/3643/1129/2322/1975/7450/596/3685/1942/2149/1280/4804/3675/595/2261/7248/2246/4803/2259/3912/1902/1278/1277/2846/2057/1293/2247/55970/5618/7058/10161/56034/3693/4254/3480/4908/5159/1292/3908/2690/3909/8817/4915/3551/2791/63923/3913/3667/1287/3679/7060/3479/80310/1311/5105/1101
## WP3932           2252/7059/51719/8660/5563/5295/6794/1288/7010/3910/3371/3082/3791/1301/1027/90993/3643/1129/1975/7450/3685/2034/1942/2149/1280/4804/3675/2261/7248/2246/4803/2259/3912/1902/2308/1278/1277/81617/2846/2057/2247/1281/50509/1290/55970/5618/7058/10161/56034/3693/4254/6720/3480/5159/3991/1289/1292/3908/2690/3909/8817/3551/2791/63923/3913/3667/3679/7060/3479/80310/1311/1101/3169
## WP306                                                                                                                       10420/5595/5228/7424/1499/4636/83660/7059/5295/1288/23396/3910/3371/3082/394/3791/7450/596/3685/1280/3675/595/2318/3912/1793/1278/5753/1277/10398/55742/50509/1290/2317/7058/25759/56034/3693/3480/5159/857/1292/3908/3909/63923/3913/3679/7060/3479/10451/80310/1311/1101
## WP236                                                                                                                                                                                 9611/8321/5925/8609/1499/2908/4088/8660/3399/6778/8648/6258/2662/5914/6776/2034/196/8204/4692/4208/2308/5468/6695/4023/1592/7350/81029/3952/1675/5618/6720/3991/2487/3667/3572/3479/6424/9370/5105/652/5346/2625
## WP474                                                                                                                                                                                                                                                    9759/4921/51719/860/5745/1028/1280/2261/4256/7042/4208/4322/8100/7078/2247/11096/85477/3480/2690/8817/5167/2737/5744/2487/5327/1300/1473/3479
## WP2911                                                                                                                                                                                                                                                                                                                6383/3672/7057/3915/3910/1291/1278/1293/1281/50509/1290/7058/3693/1289/1292/3913

You may want to convert the gene IDs to gene symbols, which can be done by setReadable function.

library(org.Hs.eg.db)
ewp <- setReadable(ewp, org.Hs.eg.db, keyType = "ENTREZID")
ewp2 <- setReadable(ewp2, org.Hs.eg.db, keyType = "ENTREZID")
head(ewp)
##            ID
## WP2446 WP2446
## WP2361 WP2361
## WP179   WP179
## WP3942 WP3942
## WP4240 WP4240
## WP2328 WP2328
##                                                                           Description
## WP2446                                                  Retinoblastoma (RB) in Cancer
## WP2361                                                       Gastric Cancer Network 1
## WP179                                                                      Cell Cycle
## WP3942                                                         PPAR signaling pathway
## WP4240 Regulation of sister chromatid separation at the metaphase-anaphase transition
## WP2328                                                            Allograft Rejection
##        GeneRatio  BgRatio       pvalue     p.adjust
## WP2446     11/95  88/6249 6.801697e-08 1.054263e-05
## WP2361      6/95  29/6249 3.772735e-06 2.923870e-04
## WP179      10/95 122/6249 1.384549e-05 7.153503e-04
## WP3942      7/95  67/6249 6.210513e-05 2.406574e-03
## WP4240      4/95  16/6249 7.931988e-05 2.458916e-03
## WP2328      7/95  90/6249 4.016758e-04 1.037663e-02
##              qvalue
## WP2446 9.450779e-06
## WP2361 2.621058e-04
## WP179  6.412648e-04
## WP3942 2.157336e-03
## WP4240 2.204258e-03
## WP2328 9.301967e-03
##                                                              geneID
## WP2446 CDC45/CCNB2/TOP2A/RRM2/CCNA2/CDK1/CDT1/TTK/CHEK1/CCNB1/KIF4A
## WP2361                           MYBL2/TOP2A/UBE2C/TPX2/S100P/AURKA
## WP179       CDC45/CDC20/CCNB2/CCNA2/CDK1/TTK/CHEK1/CCNB1/MCM5/PTTG1
## WP3942                    MMP1/FADS2/ADIPOQ/PCK1/FABP4/HMGCS2/PLIN1
## WP4240                                     CDC20/CENPE/MAD2L1/PTTG1
## WP2328                    CXCL13/CXCL11/CXCL9/GZMB/GNLY/HLA-DQA1/C7
##        Count
## WP2446    11
## WP2361     6
## WP179     10
## WP3942     7
## WP4240     4
## WP2328     7
head(ewp2)
##            ID
## WP4172 WP4172
## WP3932 WP3932
## WP306   WP306
## WP236   WP236
## WP474   WP474
## WP2911 WP2911
##                                           Description
## WP4172                     PI3K-Akt Signaling Pathway
## WP3932 Focal Adhesion-PI3K-Akt-mTOR-signaling pathway
## WP306                                  Focal Adhesion
## WP236                                    Adipogenesis
## WP474                       Endochondral Ossification
## WP2911    miRNA targets in ECM and membrane receptors
##        setSize enrichmentScore       NES      pvalue
## WP4172     311      -0.3406070 -1.467847 0.001347709
## WP3932     281      -0.3903410 -1.664489 0.001383126
## WP306      188      -0.4230017 -1.731019 0.001449275
## WP236      125      -0.4387536 -1.694197 0.001526718
## WP474       59      -0.5111776 -1.751890 0.001572327
## WP2911      21      -0.6953475 -1.888706 0.001757469
##          p.adjust    qvalues rank
## WP4172 0.03001658 0.02487562 2133
## WP3932 0.03001658 0.02487562 1994
## WP306  0.03001658 0.02487562 2221
## WP236  0.03001658 0.02487562 2301
## WP474  0.03001658 0.02487562 2142
## WP2911 0.03001658 0.02487562 2629
##                          leading_edge
## WP4172 tags=24%, list=17%, signal=21%
## WP3932 tags=26%, list=16%, signal=22%
## WP306  tags=28%, list=18%, signal=23%
## WP236  tags=34%, list=18%, signal=28%
## WP474  tags=47%, list=17%, signal=40%
## WP2911 tags=76%, list=21%, signal=60%
##                                                                                                                                                                                                                                                                                                                                                                                                                                      core_enrichment
## WP4172 PGF/VEGFC/TP53/BDNF/FGF7/THBS3/G6PC3/PRKAA2/PIK3R1/STK11/COL4A6/TEK/LAMA4/TNC/HGF/COL6A1/MYB/KDR/CDKN1B/CREB3L1/IFNA4/INSR/CHRM2/FLT3/EIF4B/VWF/BCL2/ITGAV/EFNA1/F2R/COL2A1/NGFR/ITGA3/CCND1/FGFR3/TSC1/FGF1/NGF/FGF14/LAMB1/LPAR1/COL1A2/COL1A1/LPAR4/EPOR/COL6A3/FGF2/GNG12/PRLR/THBS2/LPAR6/PDGFC/ITGB5/KITLG/IGF1R/NTF3/PDGFRB/COL6A2/LAMA2/GHR/LAMA3/FGF18/NTRK2/IKBKB/GNG11/TNN/LAMB2/IRS1/COL4A5/ITGA7/THBS4/IGF1/PDGFD/COMP/PCK1/CHAD
## WP3932      FGF7/THBS3/CAB39/IRS2/PRKAA2/PIK3R1/STK11/COL4A6/TEK/LAMA4/TNC/HGF/KDR/COL11A1/CDKN1B/CREB3L1/INSR/CHRM2/EIF4B/VWF/ITGAV/EPAS1/EFNA1/F2R/COL2A1/NGFR/ITGA3/FGFR3/TSC1/FGF1/NGF/FGF14/LAMB1/LPAR1/FOXO1/COL1A2/COL1A1/CAB39L/LPAR4/EPOR/FGF2/COL3A1/COL5A3/COL5A2/GNG12/PRLR/THBS2/LPAR6/PDGFC/ITGB5/KITLG/SREBF1/IGF1R/PDGFRB/LIPE/COL5A1/COL6A2/LAMA2/GHR/LAMA3/FGF18/IKBKB/GNG11/TNN/LAMB2/IRS1/ITGA7/THBS4/IGF1/PDGFD/COMP/CHAD/FOXA1
## WP306                                                                                                                                   TESK2/MAPK3/PGF/VEGFC/CTNNB1/MYL5/TLN2/THBS3/PIK3R1/COL4A6/PIP5K1C/LAMA4/TNC/HGF/ARHGAP5/KDR/VWF/BCL2/ITGAV/COL2A1/ITGA3/CCND1/FLNC/LAMB1/DOCK1/COL1A2/PTK6/COL1A1/MYL9/PARVA/COL5A3/COL5A2/FLNB/THBS2/SHC2/PDGFC/ITGB5/IGF1R/PDGFRB/CAV1/COL6A2/LAMA2/LAMA3/TNN/LAMB2/ITGA7/THBS4/IGF1/VAV3/PDGFD/COMP/CHAD
## WP236                                                                                                                                                                                                        NCOR1/FZD1/RB1/KLF7/CTNNB1/NR3C1/SMAD3/IRS2/ID3/STAT6/NCOA1/RXRG/GDF10/RARA/STAT5A/EPAS1/AHR/NRIP1/NDN/MEF2C/FOXO1/PPARG/SPOCK1/LPL/CYP26A1/UCP1/WNT5B/LEP/CFD/PRLR/SREBF1/LIPE/FRZB/IRS1/IL6ST/IGF1/SFRP4/ADIPOQ/PCK1/BMP4/PLIN1/GATA3
## WP474                                                                                                                                                                                                                                                                              HDAC4/DDR2/CAB39/RUNX2/PTH1R/CDKN1C/COL2A1/FGFR3/MGP/TGFB2/MEF2C/MMP13/IFT88/TIMP3/FGF2/ADAMTS5/SCIN/IGF1R/GHR/FGF18/ENPP1/GLI3/PTHLH/FRZB/PLAT/COL10A1/CST5/IGF1
## WP2911                                                                                                                                                                                                                                                                                                                                        SDC2/ITGA1/THBS1/LAMC1/LAMA4/COL6A1/COL1A2/COL6A3/COL3A1/COL5A3/COL5A2/THBS2/ITGB5/COL5A1/COL6A2/LAMB2

As an alternative to manually downloading gmt files, install the rWikiPathways package to gain scripting access to the latest gmt files using the downloadPathwayArchive function.

3.3 Cell Marker

cell_markers <- vroom::vroom('http://bio-bigdata.hrbmu.edu.cn/CellMarker/download/Human_cell_markers.txt') %>%
   tidyr::unite("cellMarker", tissueType, cancerType, cellName, sep=", ") %>% 
   dplyr::select(cellMarker, geneID) %>%
   dplyr::mutate(geneID = strsplit(geneID, ', '))
cell_markers
## # A tibble: 2,868 x 2
##    cellMarker                                       geneID  
##    <chr>                                            <list>  
##  1 Kidney, Normal, Proximal tubular cell            <chr [1…
##  2 Liver, Normal, Ito cell (hepatic stellate cell)  <chr [1…
##  3 Endometrium, Normal, Trophoblast cell            <chr [1…
##  4 Germ, Normal, Primordial germ cell               <chr [1…
##  5 Corneal epithelium, Normal, Epithelial cell      <chr [1…
##  6 Placenta, Normal, Cytotrophoblast                <chr [1…
##  7 Periosteum, Normal, Periosteum-derived progenit… <chr [4…
##  8 Amniotic membrane, Normal, Amnion epithelial ce… <chr [2…
##  9 Primitive streak, Normal, Primitive streak cell  <chr [2…
## 10 Adipose tissue, Normal, Stromal vascular fracti… <chr [1…
## # … with 2,858 more rows
y <- enricher(gene, TERM2GENE=cell_markers, minGSSize=1)
DT::datatable(as.data.frame(y))

<label style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none;">Show <select name="DataTables_Table_0_length" aria-controls="DataTables_Table_0" class="" style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none; font-family: inherit; font-size: 16px; margin: 0px; text-transform: none;"><option value="10" style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none;">10</option><option value="25" style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none;">25</option><option value="50" style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none;">50</option><option value="100" style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none;">100</option></select> entries</label>

<label style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none;">Search:<input type="search" class="" placeholder="" aria-controls="DataTables_Table_0" style="-webkit-font-smoothing: antialiased; box-sizing: border-box; -webkit-tap-highlight-color: transparent; text-size-adjust: none; font-family: inherit; font-size: 16px; margin: 0px 0px 0px 0.5em; line-height: normal; -webkit-appearance: textfield;"></label>

ID Description GeneRatio BgRatio pvalue p.adjust qvalue geneID Count
Embryonic prefrontal cortex, Normal, Neural progenitor cell Embryonic prefrontal cortex, Normal, Neural progenitor cell Embryonic prefrontal cortex, Normal, Neural progenitor cell 33/153 166/11582 4.43518066168319e-30 6.34230834620697e-28 4.80866955950915e-28 991/9493/9833/9133/10403/7153/259266/6241/9787/11065/55355/55872/51203/83461/22974/10460/79019/55839/890/983/4085/5080/332/3832/9212/51659/9055/891/4174/9232/1602/2018/4137 33
Large intestine, Normal, MKI67+ progenitor cell Large intestine, Normal, MKI67+ progenitor cell Large intestine, Normal, MKI67+ progenitor cell 20/153 111/11582 1.42606245514518e-17 1.0196346554288e-15 7.7307596252607e-16 991/9133/10403/7153/259266/6241/9787/11065/51203/22974/10460/890/983/332/9212/9055/891/9232/2167/51705 20
Kidney, Normal, Neutrophil Kidney, Normal, Neutrophil Kidney, Normal, Neutrophil 6/153 42/11582 0.0000170645935795602 0.000813412293959037 0.000616720399542001 6280/6279/597/7850/6286/366 6
Large intestine, Normal, DCLK1+ progenitor cell Large intestine, Normal, DCLK1+ progenitor cell Large intestine, Normal, DCLK1+ progenitor cell 7/153 99/11582 0.000329875455341899 0.0117930475284729 0.00894136102637251 1307/730/54829/10234/79901/57758/4969 7
Kidney, Renal Cell Carcinoma, Neutrophil Kidney, Renal Cell Carcinoma, Neutrophil Kidney, Renal Cell Carcinoma, Neutrophil 6/153 77/11582 0.0005276542867749 0.0150909126017621 0.0114417666395399 6280/6279/597/7850/6286/366 6
Fetal gonad, Normal, Endothelial cell Fetal gonad, Normal, Endothelial cell Fetal gonad, Normal, Endothelial cell 9/153 209/11582 0.00181179940206466 0.0431812190825411 0.0327395330548526 9493/23397/9787/10460/55839/332/7272/4521/9232 9

Showing 1 to 6 of 6 entries

Previous1Next

3.4 MSigDb analysis

Molecular Signatures Database contains 8 major collections:

  • H: hallmark gene sets
  • C1: positional gene sets
  • C2: curated gene sets
  • C3: motif gene sets
  • C4: computational gene sets
  • C5: GO gene sets
  • C6: oncogenic signatures
  • C7: immunologic signatures

Users can download GMT files from Broad Institute and use read.gmt to parse the file to be used in enricher() and GSEA().

There is an R package, msigdbr, that already packed the MSigDB gene sets in tidy data format that can be used directly with clusterProfiler.

It supports several specices:

library(msigdbr)
msigdbr_show_species()
##  [1] "Bos taurus"               "Caenorhabditis elegans"  
##  [3] "Canis lupus familiaris"   "Danio rerio"             
##  [5] "Drosophila melanogaster"  "Gallus gallus"           
##  [7] "Homo sapiens"             "Mus musculus"            
##  [9] "Rattus norvegicus"        "Saccharomyces cerevisiae"
## [11] "Sus scrofa"

We can retrieve all human gene sets:

m_df <- msigdbr(species = "Homo sapiens")
head(m_df, 2) %>% as.data.frame
##          gs_name  gs_id gs_cat gs_subcat human_gene_symbol
## 1 AAACCAC_MIR140 M12609     C3       MIR             ABCC4
## 2 AAACCAC_MIR140 M12609     C3       MIR          ABRAXAS2
##   species_name entrez_gene gene_symbol sources
## 1 Homo sapiens       10257       ABCC4    <NA>
## 2 Homo sapiens       23172    ABRAXAS2    <NA>

Or specific collection. Here we use C6, oncogenic gene sets as an example:

m_t2g <- msigdbr(species = "Homo sapiens", category = "C6") %>% 
  dplyr::select(gs_name, entrez_gene)
head(m_t2g)
## # A tibble: 6 x 2
##   gs_name              entrez_gene
##   <chr>                      <int>
## 1 AKT_UP_MTOR_DN.V1_DN       25864
## 2 AKT_UP_MTOR_DN.V1_DN          95
## 3 AKT_UP_MTOR_DN.V1_DN      137872
## 4 AKT_UP_MTOR_DN.V1_DN         134
## 5 AKT_UP_MTOR_DN.V1_DN       55326
## 6 AKT_UP_MTOR_DN.V1_DN         216
em <- enricher(gene, TERM2GENE=m_t2g)
em2 <- GSEA(geneList, TERM2GENE = m_t2g)
head(em)
##                                            ID
## RPS14_DN.V1_DN                 RPS14_DN.V1_DN
## GCNP_SHH_UP_LATE.V1_UP GCNP_SHH_UP_LATE.V1_UP
## VEGF_A_UP.V1_DN               VEGF_A_UP.V1_DN
## PRC2_EZH2_UP.V1_DN         PRC2_EZH2_UP.V1_DN
## CSR_LATE_UP.V1_UP           CSR_LATE_UP.V1_UP
## ATF2_S_UP.V1_UP               ATF2_S_UP.V1_UP
##                                   Description GeneRatio
## RPS14_DN.V1_DN                 RPS14_DN.V1_DN    22/185
## GCNP_SHH_UP_LATE.V1_UP GCNP_SHH_UP_LATE.V1_UP    16/185
## VEGF_A_UP.V1_DN               VEGF_A_UP.V1_DN    15/185
## PRC2_EZH2_UP.V1_DN         PRC2_EZH2_UP.V1_DN    14/185
## CSR_LATE_UP.V1_UP           CSR_LATE_UP.V1_UP    12/185
## ATF2_S_UP.V1_UP               ATF2_S_UP.V1_UP    12/185
##                          BgRatio       pvalue     p.adjust
## RPS14_DN.V1_DN         185/10962 4.855041e-13 8.205020e-11
## GCNP_SHH_UP_LATE.V1_UP 186/10962 9.357887e-08 7.907414e-06
## VEGF_A_UP.V1_DN        193/10962 8.887046e-07 5.006369e-05
## PRC2_EZH2_UP.V1_DN     190/10962 3.907976e-06 1.651120e-04
## CSR_LATE_UP.V1_UP      166/10962 2.368079e-05 8.004105e-04
## ATF2_S_UP.V1_UP        190/10962 8.879028e-05 2.143651e-03
##                              qvalue
## RPS14_DN.V1_DN         6.694846e-11
## GCNP_SHH_UP_LATE.V1_UP 6.452017e-06
## VEGF_A_UP.V1_DN        4.084923e-05
## PRC2_EZH2_UP.V1_DN     1.347223e-04
## CSR_LATE_UP.V1_UP      6.530911e-04
## ATF2_S_UP.V1_UP        1.749102e-03
##                                                                                                                                        geneID
## RPS14_DN.V1_DN         10874/55388/991/9493/1062/4605/9133/23397/79733/9787/55872/83461/54821/51659/9319/9055/10112/4174/5105/2532/7021/79901
## GCNP_SHH_UP_LATE.V1_UP                                      55388/7153/79733/6241/9787/51203/983/9212/1111/9319/9055/3833/6790/4174/3169/1580
## VEGF_A_UP.V1_DN                                                   8318/9493/1062/9133/10403/6241/9787/4085/332/3832/7272/891/23362/2167/10234
## PRC2_EZH2_UP.V1_DN                                               8318/55388/4605/23397/9787/55355/10460/81620/2146/7272/9212/11182/3887/24137
## CSR_LATE_UP.V1_UP                                                               55143/2305/4605/6241/11065/55872/983/332/2146/51659/9319/1580
## ATF2_S_UP.V1_UP                                                               4312/6280/2305/9493/7153/2146/10112/3887/8544/22885/10234/10974
##                        Count
## RPS14_DN.V1_DN            22
## GCNP_SHH_UP_LATE.V1_UP    16
## VEGF_A_UP.V1_DN           15
## PRC2_EZH2_UP.V1_DN        14
## CSR_LATE_UP.V1_UP         12
## ATF2_S_UP.V1_UP           12
head(em2)
##                              ID     Description setSize
## LEF1_UP.V1_DN     LEF1_UP.V1_DN   LEF1_UP.V1_DN     186
## LTE2_UP.V1_UP     LTE2_UP.V1_UP   LTE2_UP.V1_UP     183
## RAF_UP.V1_DN       RAF_UP.V1_DN    RAF_UP.V1_DN     189
## IL2_UP.V1_DN       IL2_UP.V1_DN    IL2_UP.V1_DN     182
## ATF2_S_UP.V1_DN ATF2_S_UP.V1_DN ATF2_S_UP.V1_DN     181
## ATF2_UP.V1_DN     ATF2_UP.V1_DN   ATF2_UP.V1_DN     177
##                 enrichmentScore       NES      pvalue
## LEF1_UP.V1_DN        -0.4857056 -1.987140 0.001381215
## LTE2_UP.V1_UP        -0.3938357 -1.608416 0.001383126
## RAF_UP.V1_DN         -0.5521243 -2.254389 0.001383126
## IL2_UP.V1_DN         -0.4252756 -1.732758 0.001388889
## ATF2_S_UP.V1_DN      -0.4460160 -1.815767 0.001394700
## ATF2_UP.V1_DN        -0.5008413 -2.032113 0.001402525
##                   p.adjust     qvalues rank
## LEF1_UP.V1_DN   0.01354839 0.007394831 1705
## LTE2_UP.V1_UP   0.01354839 0.007394831 2110
## RAF_UP.V1_DN    0.01354839 0.007394831 2035
## IL2_UP.V1_DN    0.01354839 0.007394831 2774
## ATF2_S_UP.V1_DN 0.01354839 0.007394831 2531
## ATF2_UP.V1_DN   0.01354839 0.007394831 2055
##                                   leading_edge
## LEF1_UP.V1_DN   tags=35%, list=14%, signal=31%
## LTE2_UP.V1_UP   tags=33%, list=17%, signal=28%
## RAF_UP.V1_DN    tags=43%, list=16%, signal=37%
## IL2_UP.V1_DN    tags=39%, list=22%, signal=31%
## ATF2_S_UP.V1_DN tags=39%, list=20%, signal=32%
## ATF2_UP.V1_DN   tags=38%, list=16%, signal=32%
##                                                                                                                                                                                                                                                                                                                                                                                                                                                  core_enrichment
## LEF1_UP.V1_DN                                                                        79153/22998/10325/23221/79170/9187/4300/51479/25956/7552/8644/11162/7168/1490/54463/65084/25837/80303/1809/3397/4208/11223/79762/6451/481/54795/51626/8991/5950/5627/64699/6414/10103/221078/9961/5874/56898/90865/83989/5002/6653/56034/55314/85458/956/54502/4832/5783/1363/53832/54985/80221/214/50853/51149/57088/5157/2947/79932/7031/6505/9338/25924/80736/3169/10551
## LTE2_UP.V1_UP                                                                                                       23074/64759/1601/6558/7138/11138/10497/81031/665/2878/3554/4925/55005/9252/2034/5283/54869/5025/6019/65124/10979/51435/26999/4212/444/23588/55556/2954/93129/79762/54453/79789/316/79679/151011/51280/9630/57037/5627/9867/55198/6653/10455/5101/65055/8309/9429/23303/9590/253190/4982/4128/4680/1846/51760/80129/4857/5507/3158/5304/10974
## RAF_UP.V1_DN    23314/221037/1960/51340/55105/10265/22996/9687/5357/1952/12/4602/9231/596/51454/57419/595/4256/23022/8777/7837/7042/3397/57613/323/1831/6451/54843/7358/2353/8773/6938/8991/64699/57007/23389/10560/7227/3485/79068/10769/4254/26018/3480/2674/23327/857/116039/6542/2690/1955/1363/26353/7033/5376/89927/51363/8821/2239/6947/4886/214/3487/3667/5157/54847/54898/7031/6505/57535/10451/18/771/80129/5174/5507/56521/8839/8614/5241/10551/57758
## IL2_UP.V1_DN                                                  51276/10858/5333/2977/79874/10014/7773/64798/26959/6588/579/777/5158/23361/9915/5087/64759/2535/6123/7010/51285/126353/80201/79841/1028/2322/4222/947/9187/225/9976/5592/54996/3202/55779/23148/5334/8605/57326/26115/54453/55766/64927/6591/55204/9940/57608/653483/9886/10279/55805/10628/901/9891/7102/27244/79921/4674/8292/5125/79843/6857/79940/23541/3357/50853/9863/79932/90362/55663/4250
## ATF2_S_UP.V1_DN                                                               8714/53616/55667/51279/4642/7130/7139/2/710/80352/4608/11145/881/2252/3399/22846/10687/12/1397/1301/3554/2619/26011/78987/9284/1306/6586/9627/4054/54813/115207/55890/26249/4803/11075/3400/79789/8527/2353/10580/79987/80760/5493/90865/3485/3751/2202/2199/10610/1047/1294/23492/3908/5364/658/5350/2121/2331/4330/9737/4982/57088/8490/9358/6424/1307/3708/125/1311/56521/10351
## ATF2_UP.V1_DN                                                                             11145/9738/2252/8660/10687/5357/80201/12/22998/3554/2619/23556/8840/9284/57419/5737/6586/9627/8863/10085/115207/55890/23015/629/3212/3400/50650/8527/2353/79987/80760/9668/1012/10614/5493/8553/23126/90865/3485/3751/2202/57212/2199/10610/3075/6387/22834/1909/10129/57161/5350/2121/2331/4330/2952/4982/57088/9358/6424/6038/51313/1311/56521/10351/5304/4239/79901

We can test with other collections, for example, using C3 to test whether the genes are up/down-regulated by sharing specific motif.

m_t2g <- msigdbr(species = "Homo sapiens", category = "C3") %>% 
  dplyr::select(gs_name, entrez_gene)
head(m_t2g)
## # A tibble: 6 x 2
##   gs_name        entrez_gene
##   <chr>                <int>
## 1 AAACCAC_MIR140       10257
## 2 AAACCAC_MIR140       23172
## 3 AAACCAC_MIR140          81
## 4 AAACCAC_MIR140          90
## 5 AAACCAC_MIR140        8754
## 6 AAACCAC_MIR140       11096
em3 <- GSEA(geneList, TERM2GENE = m_t2g)
head(em3)
##                                                            ID
## GGATTA_PITX2_Q2                               GGATTA_PITX2_Q2
## TTTGCAC_MIR19A_MIR19B                   TTTGCAC_MIR19A_MIR19B
## CAGTATT_MIR200B_MIR200C_MIR429 CAGTATT_MIR200B_MIR200C_MIR429
## AAAYRNCTG_UNKNOWN                           AAAYRNCTG_UNKNOWN
## TTGCACT_MIR130A_MIR301_MIR130B TTGCACT_MIR130A_MIR301_MIR130B
## ATGTACA_MIR493                                 ATGTACA_MIR493
##                                                   Description
## GGATTA_PITX2_Q2                               GGATTA_PITX2_Q2
## TTTGCAC_MIR19A_MIR19B                   TTTGCAC_MIR19A_MIR19B
## CAGTATT_MIR200B_MIR200C_MIR429 CAGTATT_MIR200B_MIR200C_MIR429
## AAAYRNCTG_UNKNOWN                           AAAYRNCTG_UNKNOWN
## TTGCACT_MIR130A_MIR301_MIR130B TTGCACT_MIR130A_MIR301_MIR130B
## ATGTACA_MIR493                                 ATGTACA_MIR493
##                                setSize enrichmentScore
## GGATTA_PITX2_Q2                    455      -0.3365909
## TTTGCAC_MIR19A_MIR19B              397      -0.3672122
## CAGTATT_MIR200B_MIR200C_MIR429     380      -0.3607152
## AAAYRNCTG_UNKNOWN                  290      -0.4355277
## TTGCACT_MIR130A_MIR301_MIR130B     313      -0.3890571
## ATGTACA_MIR493                     251      -0.3980657
##                                      NES      pvalue
## GGATTA_PITX2_Q2                -1.495575 0.001230012
## TTTGCAC_MIR19A_MIR19B          -1.612649 0.001254705
## CAGTATT_MIR200B_MIR200C_MIR429 -1.581860 0.001257862
## AAAYRNCTG_UNKNOWN              -1.863237 0.001319261
## TTGCACT_MIR130A_MIR301_MIR130B -1.675646 0.001322751
## ATGTACA_MIR493                 -1.676300 0.001347709
##                                  p.adjust    qvalues rank
## GGATTA_PITX2_Q2                0.04123119 0.02860412 2464
## TTTGCAC_MIR19A_MIR19B          0.04123119 0.02860412 4078
## CAGTATT_MIR200B_MIR200C_MIR429 0.04123119 0.02860412 3186
## AAAYRNCTG_UNKNOWN              0.04123119 0.02860412 2952
## TTGCACT_MIR130A_MIR301_MIR130B 0.04123119 0.02860412 2861
## ATGTACA_MIR493                 0.04123119 0.02860412 3911
##                                                  leading_edge
## GGATTA_PITX2_Q2                tags=26%, list=20%, signal=22%
## TTTGCAC_MIR19A_MIR19B          tags=44%, list=33%, signal=30%
## CAGTATT_MIR200B_MIR200C_MIR429 tags=34%, list=25%, signal=26%
## AAAYRNCTG_UNKNOWN              tags=38%, list=24%, signal=29%
## TTGCACT_MIR130A_MIR301_MIR130B tags=36%, list=23%, signal=28%
## ATGTACA_MIR493                 tags=45%, list=31%, signal=32%
##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       core_enrichment
## GGATTA_PITX2_Q2                                                                                                                                                                                                                                                                                                                               25836/311/3199/4604/57556/8019/3321/57835/8929/6671/1760/10401/4430/9497/5530/6532/54894/23528/5228/5087/2295/6453/56171/3232/4921/3005/221037/56912/10265/10521/10497/23767/8322/37/23429/6561/4303/57082/845/94/51447/9922/6258/23243/2571/10253/4602/29119/55068/126393/7402/56675/6302/5624/7716/7779/596/4300/57804/79923/1280/100506658/9627/3675/26037/2261/2804/65084/11030/3216/57326/140890/23255/5396/9854/1907/54843/57037/862/23452/55152/1848/8835/89795/57007/64221/5136/7227/10324/55273/6310/2202/56034/956/54502/51474/4254/6925/6097/744/443/23446/4306/4915/4487/25803/4081/4223/10472/388677/55107/347902/3708/10742/51313/4857/730/2532/4036/3169
## TTTGCAC_MIR19A_MIR19B          22911/1605/659/10318/4299/6249/81609/57194/9686/10425/9839/23013/22936/989/51742/4154/55625/51460/51592/11127/3976/54980/9751/56937/6045/4204/27252/4862/27236/2115/5869/152006/57698/57018/23032/56145/10150/55054/4090/25852/10618/2060/7107/10771/7071/9044/10904/9752/6431/8473/83452/392/10395/6777/51230/22889/23236/55652/158471/6595/905/56146/4026/79665/5932/115/8325/546/10154/9655/23499/54521/26959/23612/3321/56929/54778/26060/56848/9098/107/10420/23001/29922/9759/29116/23405/94134/2295/6453/83660/51719/8654/11138/2186/23341/8038/860/2004/9256/6548/23621/23503/29/394/220594/831/92689/55082/54014/1901/23111/7552/27303/1490/51454/57419/66008/2318/6252/7248/4929/23041/9900/55727/91694/22863/1389/54796/3400/9649/23253/54453/7048/22862/22905/23365/79899/5412/60481/54838/89795/5793/23389/7227/6310/3625/26018/23492/2099/23116/744/775/90355/388/23414/11069/6857/26960/55638/10472/2152/2697/3131/7786/55107/9863/23261/3479/3708/10742/2018/4036/2066
## CAGTATT_MIR200B_MIR200C_MIR429                                                                                                                                                                                                                                                    7071/31/23291/5179/57092/9969/9865/8473/9644/10395/6383/5569/1392/4008/9647/8613/6405/4884/8491/9706/2263/6567/53353/64766/5310/546/5128/55145/54469/200576/8434/7003/687/3915/1456/3321/10499/23189/26060/55578/1998/51421/22864/10484/10140/23001/4661/253782/9807/83660/95681/1960/2908/51719/2186/22846/108/9710/80205/2104/9522/7320/23243/23258/4602/129642/2669/51439/7716/54014/23047/5099/2823/3340/1942/7552/1843/100506658/8204/10979/79618/23015/8848/4692/5334/57515/26249/6563/23122/6196/1902/4077/2494/10370/3953/51339/79365/54838/89795/8076/5577/3751/6310/1009/29994/10769/1501/57509/3899/2273/23116/4908/27244/775/5783/65055/4035/23414/2737/5744/6857/55800/10186/3131/9828/6935/4982/219654/1602/3708/4857
## AAAYRNCTG_UNKNOWN                                                                                                                                                                                                                                                                                                                                                                                                        598/1392/4008/11080/79893/4724/8515/8613/81847/4297/3218/27314/7025/6595/8516/29966/11334/90/54329/89797/120/783/4750/8289/5071/3386/8929/2776/5530/23564/25959/4776/607/8038/5926/1288/2104/4303/845/51447/23243/22902/3082/54879/5745/4211/9024/23047/1942/79625/27303/4628/55911/51585/3202/6252/23338/64112/7482/444/55128/26249/64236/3397/64393/23635/23253/4013/1278/5252/323/1831/10370/1384/5549/26575/1272/89795/81603/55799/8470/7227/54875/761/477/29951/10468/6444/4254/7102/744/2045/4915/658/23414/11069/55800/10186/10472/2205/1287/7122/3479/1846/2018/2066/3169/8614/79901
## TTGCACT_MIR130A_MIR301_MIR130B                                                                                                                                                                                                                                                                                                                                                   9044/1808/196441/10904/27430/192670/8473/392/2909/83482/4762/4297/23236/27314/5932/4884/9706/90/546/9655/84196/5562/8626/757/26959/26060/56848/9098/23328/113791/107/10420/23001/29922/4661/23405/94134/2295/60468/55914/11138/2186/23341/6500/108/8038/51257/860/8648/80205/2004/9256/23243/23503/23469/4602/220594/8452/51111/55082/54014/1901/23111/7552/57419/389136/171586/3213/3202/6252/7248/23181/7482/55727/5797/91694/22863/10612/23253/8829/7048/753/23365/79365/79899/5468/5793/7227/6310/3625/26018/2099/23116/744/90355/23446/4306/23414/11069/64061/90627/26960/4223/55638/10472/2697/7786/23261/3708/10742/2018/2066
## ATGTACA_MIR493                                                                                         
©著作权归作者所有,转载或内容合作请联系作者
  • 序言:七十年代末,一起剥皮案震惊了整个滨河市,随后出现的几起案子,更是在滨河造成了极大的恐慌,老刑警刘岩,带你破解...
    沈念sama阅读 203,098评论 5 476
  • 序言:滨河连续发生了三起死亡事件,死亡现场离奇诡异,居然都是意外死亡,警方通过查阅死者的电脑和手机,发现死者居然都...
    沈念sama阅读 85,213评论 2 380
  • 文/潘晓璐 我一进店门,熙熙楼的掌柜王于贵愁眉苦脸地迎上来,“玉大人,你说我怎么就摊上这事。” “怎么了?”我有些...
    开封第一讲书人阅读 149,960评论 0 336
  • 文/不坏的土叔 我叫张陵,是天一观的道长。 经常有香客问我,道长,这世上最难降的妖魔是什么? 我笑而不...
    开封第一讲书人阅读 54,519评论 1 273
  • 正文 为了忘掉前任,我火速办了婚礼,结果婚礼上,老公的妹妹穿的比我还像新娘。我一直安慰自己,他们只是感情好,可当我...
    茶点故事阅读 63,512评论 5 364
  • 文/花漫 我一把揭开白布。 她就那样静静地躺着,像睡着了一般。 火红的嫁衣衬着肌肤如雪。 梳的纹丝不乱的头发上,一...
    开封第一讲书人阅读 48,533评论 1 281
  • 那天,我揣着相机与录音,去河边找鬼。 笑死,一个胖子当着我的面吹牛,可吹牛的内容都是我干的。 我是一名探鬼主播,决...
    沈念sama阅读 37,914评论 3 395
  • 文/苍兰香墨 我猛地睁开眼,长吁一口气:“原来是场噩梦啊……” “哼!你这毒妇竟也来了?” 一声冷哼从身侧响起,我...
    开封第一讲书人阅读 36,574评论 0 256
  • 序言:老挝万荣一对情侣失踪,失踪者是张志新(化名)和其女友刘颖,没想到半个月后,有当地人在树林里发现了一具尸体,经...
    沈念sama阅读 40,804评论 1 296
  • 正文 独居荒郊野岭守林人离奇死亡,尸身上长有42处带血的脓包…… 初始之章·张勋 以下内容为张勋视角 年9月15日...
    茶点故事阅读 35,563评论 2 319
  • 正文 我和宋清朗相恋三年,在试婚纱的时候发现自己被绿了。 大学时的朋友给我发了我未婚夫和他白月光在一起吃饭的照片。...
    茶点故事阅读 37,644评论 1 329
  • 序言:一个原本活蹦乱跳的男人离奇死亡,死状恐怖,灵堂内的尸体忽然破棺而出,到底是诈尸还是另有隐情,我是刑警宁泽,带...
    沈念sama阅读 33,350评论 4 318
  • 正文 年R本政府宣布,位于F岛的核电站,受9级特大地震影响,放射性物质发生泄漏。R本人自食恶果不足惜,却给世界环境...
    茶点故事阅读 38,933评论 3 307
  • 文/蒙蒙 一、第九天 我趴在偏房一处隐蔽的房顶上张望。 院中可真热闹,春花似锦、人声如沸。这庄子的主人今日做“春日...
    开封第一讲书人阅读 29,908评论 0 19
  • 文/苍兰香墨 我抬头看了看天上的太阳。三九已至,却和暖如春,着一层夹袄步出监牢的瞬间,已是汗流浃背。 一阵脚步声响...
    开封第一讲书人阅读 31,146评论 1 259
  • 我被黑心中介骗来泰国打工, 没想到刚下飞机就差点儿被人妖公主榨干…… 1. 我叫王不留,地道东北人。 一个月前我还...
    沈念sama阅读 42,847评论 2 349
  • 正文 我出身青楼,却偏偏与公主长得像,于是被迫代替她去往敌国和亲。 传闻我的和亲对象是个残疾皇子,可洞房花烛夜当晚...
    茶点故事阅读 42,361评论 2 342

推荐阅读更多精彩内容