2019-08-12 [Third-Generation Sequencing] Installing cDNA_Cupcake

Last Updated: 07/26/2019
Cupcake is a collection of tools for downstream analysis of third-generation sequencing data. The upstream description reads:
cDNA_Cupcake is a miscellaneous collection of Python and R scripts used for analyzing sequencing data. Most of the scripts only require Biopython. For scripts that require additional libraries, it will be specified in documentation.
https://github.com/Magdoll/cDNA_Cupcake
Current version: 8.2

I found a good introduction:

https://github.com/Magdoll/cDNA_Cupcake/wiki#refgmap

First, fetch the package with git:

git clone https://github.com/Magdoll/cDNA_Cupcake.git

This failed with an error:

(base) [jing@localhost ~]$ git clone https://github.com/Magdoll/cDNA_Cupcake.git
Cloning into 'cDNA_Cupcake'...
error: RPC failed; curl 56 OpenSSL SSL_read: SSL_ERROR_SYSCALL, errno 104
fatal: the remote end hung up unexpectedly

Searching turned up an answer:
git clone error: RPC failed

# Solution:
# Raise Git's transfer byte limit.
git config --global http.postBuffer 524288000

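After running the command above, the clone proceeded normally.
This step was slow: it ran from 14:20 to 15:05, the connection dropped, and I had to start again. About 60 minutes in total.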

(base) [jing@localhost ~]$ git clone https://github.com/Magdoll/cDNA_Cupcake.git
Cloning into 'cDNA_Cupcake'...
remote: Enumerating objects: 164, done.
remote: Counting objects: 100% (164/164), done.
remote: Compressing objects: 100% (115/115), done.
Receiving objects:  18% (301/1615), 9.45 MiB | 49.00 KiB/s
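
Since the clone was slow and dropped once, a retry wrapper combined with a shallow clone may help on an unreliable link. Git's documentation describes http.postBuffer as the buffer for data POSTed to the remote, so it may not be what actually fixed the download here. The sketch below is only one possible approach: `retry` and `clone_cupcake` are hypothetical helper names, and `--depth 1` (fetch only the latest commit) assumes the full history is not needed for installation.

```shell
# retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it succeeds or MAX_ATTEMPTS is reached.
retry() {
    local max_attempts="$1"
    local delay="$2"
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed; retrying in ${delay}s" >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}

# Hypothetical helper: shallow clone, which downloads far less than a full clone.
clone_cupcake() {
    git clone --depth 1 https://github.com/Magdoll/cDNA_Cupcake.git
}
```

For example, `retry 5 30 clone_cupcake` would try the clone up to five times, waiting 30 seconds between attempts.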

Then run the following:

export PATH=$PATH:/home/jing/cDNA_Cupcake/sequence/
export PATH=$PATH:/home/jing/cDNA_Cupcake/rarefaction/
(Replace these with your own paths.)
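
The two export lines only affect the current shell session. A minimal sketch of something that could go in ~/.bashrc instead is shown below; `path_append` and `CUPCAKE_HOME` are illustrative names, not part of cDNA_Cupcake, and the default location assumes the repository was cloned into the home directory.

```shell
# Append a directory to PATH only if it exists and is not already listed.
path_append() {
    local dir="$1"
    [ -d "$dir" ] || { echo "skipping missing directory: $dir" >&2; return 1; }
    case ":$PATH:" in
        *":$dir:"*) ;;              # already present, nothing to do
        *) PATH="$PATH:$dir" ;;
    esac
    export PATH
}

# CUPCAKE_HOME is an assumed variable name; point it at your own clone.
CUPCAKE_HOME="${CUPCAKE_HOME:-$HOME/cDNA_Cupcake}"
path_append "$CUPCAKE_HOME/sequence" || true
path_append "$CUPCAKE_HOME/rarefaction" || true
```

Checking for an existing entry keeps PATH from growing every time a new shell sources the file.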

Installing Cupcake ToFU
This is needed because, as the documentation puts it: "The only exception is Cupcake ToFU, which does require compiling and installation."
https://github.com/Magdoll/cDNA_Cupcake/wiki/Cupcake-ToFU%3A-supporting-scripts-for-Iso-Seq-after-clustering-step

Once the download has finished:
cd cDNA_Cupcake
python setup.py build
python setup.py install

This failed with an error. (The screenshot of the error is not preserved.)

Install whatever is missing:
conda install numpy
yum search zlib
After installing those, I reran the installation. (The screenshot of the result is not preserved.)

Next, search for gcc and install it:
yum search gcc
It still failed. (The screenshot of the error is not preserved.)

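Then I tried:
yum install gcc libffi-devel python-devel openssl-devel
Still no luck.
I installed a whole pile of packages and it still fails. If anyone has managed to install this, could you tell me how you did it?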
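
Because the error screenshots are not preserved, the real cause is unknown. One thing worth ruling out, offered as an assumption rather than a diagnosis: the prompt shows a conda `(base)` environment, while `yum install python-devel` provides headers for the system Python, which is not necessarily the interpreter that `python setup.py build` runs. The sketch below checks which build prerequisites the active environment can actually see; both function names are hypothetical.

```shell
# Print each name from the argument list that is not an executable on PATH.
missing_commands() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || echo "$cmd"
    done
}

# Succeeds if the active Python interpreter can find its own Python.h header.
python_headers_present() {
    python -c 'import os, sys, sysconfig
header = os.path.join(sysconfig.get_paths()["include"], "Python.h")
sys.exit(0 if os.path.exists(header) else 1)'
}

# Example usage (run in the same shell and environment used for the build):
#   missing_commands gcc git python
#   python_headers_present || echo "Python.h not found for $(command -v python)"
#   python -c 'import numpy' || echo "numpy is not importable"
```

If any of these checks fail, that component is missing from the environment doing the build, whatever yum reports as installed system-wide.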

What to do after Iso Seq Cluster? https://github.com/PacificBiosciences/IsoSeq_SA3nUP/wiki/What-to-do-after-Iso-Seq-Cluster%3F

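What can Cupcake ToFU do?

After the cluster step, we should already have high-quality (HQ) isoform sequences that satisfy the following conditions:

Full-length (includes the 5' UTR, and the sequence contains the polyA tail)
High-quality (predicted accuracy by default is >= 99%)
Supported by at least 2 full-length reads (subreads?)

Aside: reads may not need to be that high-quality; a lot of useful information can probably still be mined from them.

These high-quality isoforms still contain redundant sequences, so the output of the previous step does not truly represent all of the unique isoforms in the sample. There are two reasons: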

Clustering algorithm tradeoff between sensitivity and specificity.
Natural 5' degradation in RNA.

So the steps that remain are:
(1) Best practice for aligning Iso Seq to reference genome: minimap2, GMAP, STAR, BLAT
(2) Collapse identical isoforms to obtain final set of unique, full-length, high-quality isoforms
(3) Obtain associated count information for each unique isoform
(4) Robust ORF prediction using ANGEL
(5) Fusion finding -- tutorial to come soon

Cupcake ToFU can handle steps (2), (3), and (5).

Last edited: 2019.08.19 09:45:38
