网站:https://sourceforge.net/projects/apatrap/
说明书:https://sourceforge.net/p/apatrap/wiki/User%20Manual/
1.fatsq的QC
2.fastq做trim
3.比对:需要用转录本:
hisat2 网站上下载的GRCm38.fasta.tran.index
这是ensembl的数据,比对后给他加"chr"
hisat2 -x $index -p 10 -1 T1-R1.fastq.gz -2 T1-R2.fastq.gz -S T1.sam --add-chrname
hisat2 -x $index -p 10 -1 T2-R1.fastq.gz -2 T2-R2.fastq.gz -S T2.sam --add-chrname
hisat2 -x $index -p 10 -1 WT1-R1.fastq.gz -2 WT1-R2.fastq.gz -S WT1.sam --add-chrname
hisat2 -x $index -p 10 -1 WT2-R1.fastq.gz -2 WT2-R2.fastq.gz -S WT2.sam --add-chrname
4.samtools转换成bam:
sort不需要加n 正常sort即可!
samtools sort --threads 20 -m 3G -o T-1.bam T-1.sam
samtools sort --threads 20 -m 3G -o T-2.bam T-2.sam
samtools sort --threads 20 -m 3G -o WT-1.bam WT-1.sam
samtools sort --threads 20 -m 3G -o WT-2.bam WT-2.sam
5.转成bedgraph:
genomeCoverageBed -bg -ibam T-1.bam -split > T-1.bedgraph
genomeCoverageBed -bg -ibam T-2.bam -split > T-1.bedgraph
genomeCoverageBed -bg -ibam WT-1.bam -split > WT-1.bedgraph
genomeCoverageBed -bg -ibam WT-2.bam -split > WT-2.bedgraph
6.提取UTR:
参数:
'/home/pc/biosoft/APAtrap/identifyDistal3UTR' -s $genesymbol -i T-1.bedgraph T-2.bedgraph WT-1.bedgraph WT-2.bedgraph -m mm10.refgene.bed -o UTR.bed
# genesymbol文件如下:
下载ref:转录本的bed文件下载 - 简书
这里有个致命出错:就是这个genesymbol竟然是以空格分隔的!用awk改为\t分隔。
# 此处mm10的refbed时ensembl:
下载ref:转录本的bed文件下载 - 简书
7.生成APA:
参数:
'/home/pc/biosoft/APAtrap/predictAPA' -i T-1.bedgraph T-2.bedgraph WT-1.bedgraph WT-2.bedgraph -g 2 -n 2 2 -u UTR.bed -o APA.txt
8.R进行差异APA分析:
用法:deAPA(input_file, output_file, group1, group2, least_qualified_num_in_group1, least_qualified_num_in_group2, coverage_cutoff)
library('deAPA')
deAPA('APA.txt', 'APA.result.txt', 2, 2, 1, 1, 20)