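Back in 2021 I wrote a post about uploading data to the GEO database to get a GEO accession number, and at the time I used ncftpput. I have data to upload again and planned to reuse that code, but it turned out to be painfully slow: I started the upload on 07/12 and it still had not finished by 07/19. What a waste of time. A quick search showed that there is a faster tool, and I simply had not known about it.
That tool is lftp, so here are my notes.
The GEO documentation actually mentions it: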
https://www.ncbi.nlm.nih.gov/geo/info/submissionftp.html
1. Install lftp
conda create -n ftp
conda activate ftp
conda install -c conda-forge lftp
lftp --version
2. Upload the data
I used a script written by someone far more experienced:
#!/bin/bash
#set -x
set -e
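set -u
# Color codes used by usage(). They must be defined here, otherwise
# `set -u` aborts with "unbound variable" when the help text is printed.
txtcyn=$'\e[0;36m'   # cyan
txtbld=$'\e[1m'      # bold
bldblu=$'\e[1;34m'   # bold blue
bldred=$'\e[1;31m'   # bold red
txtrst=$'\e[0m'      # reset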
usage()
{
cat <<EOF >&2
${txtcyn}
Usage:
$0 options${txtrst}
${bldblu}Function${txtrst}:
This script is used to upload files to an FTP server using lftp.
${txtbld}OPTIONS${txtrst}:
-f FTP address ${bldred}[NECESSARY]${txtrst}
-u User name ${bldred}[NECESSARY]${txtrst}
-p Password ${bldred}[NECESSARY]${txtrst}
-t Target dir ${bldred}[NECESSARY, for GEO in format like
<fasp/GEO_metadata_zhaohui>]${txtrst}
-s Source dir ${bldred}[OPTIONAL, default current directory]${txtrst}
-r Send success information to given email.
${bldred}[OPTIONAL, default chentong_biology@163.com]${txtrst}
EOF
}
ftp=
user=
passwd=
target=
source_dir="."
email="chentong_biology@163.com"
while getopts "hf:u:p:t:s:r:" OPTION
do
    case $OPTION in
        h)
            usage
            exit 1
            ;;
        f)
            ftp=$OPTARG
            ;;
        u)
            user=$OPTARG
            ;;
        p)
            passwd=$OPTARG
            ;;
        t)
            target=$OPTARG
            ;;
        s)
            source_dir=$OPTARG
            ;;
        r)
            email=$OPTARG
            ;;
        ?)
            usage
            exit 1
            ;;
    esac
done
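# -f, -u, -p and -t are all required; quote the variables so empty values are tested correctly.
if [ -z "$ftp" ] || [ -z "$user" ] || [ -z "$passwd" ] || [ -z "$target" ]; then
    usage
    exit 1
fi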
cat <<END >lftp.script
open -u ${user},${passwd} ${ftp}
mkdir -p ${target}
cd ${target}
cache size 33554432
set cmd:parallel 10
mput -c ${source_dir}/*
END
lftp -f lftp.script
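To see exactly what the script hands to lftp before opening a connection, the heredoc step can be pulled into a small function and run offline. This is only a minimal sketch: the function name write_lftp_script, the output file dry_run.lftp and all the demo values are hypothetical, and the lftp commands are copied unchanged from the script above.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: writes the lftp command file but does not run lftp.
# Arguments: ftp host, user, password, remote target dir, local source dir, output file.
write_lftp_script() {
    local ftp="$1" user="$2" passwd="$3" target="$4" source_dir="$5" out="$6"
    cat <<END >"$out"
open -u ${user},${passwd} ${ftp}
mkdir -p ${target}
cd ${target}
cache size 33554432
set cmd:parallel 10
mput -c ${source_dir}/*
END
}

# Placeholder values for a dry run; nothing is uploaded here.
write_lftp_script ftp.example.org demo_user demo_pass uploads/demo ./data dry_run.lftp
cat dry_run.lftp
```

Note that the generated file contains the password in plain text, so it is probably worth deleting it (and lftp.script from the real run) once the upload is done.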
The command I used for the upload:
nohup bash GEO_upload.sh -f ftp-private.ncbi.nlm.nih.gov -u geoftp -p "$password" -t ./uploads/***_AtWh8SUu/submission_Jul_19 -s "$local_dir" &
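Before starting a long background job, it may help to check what the glob in the script will actually pick up from $local_dir. As far as I understand, mput transfers the files matched by the wildcard and does not descend into subdirectories, so anything nested would be skipped. The sketch below counts top-level files and subdirectories; the function name count_top_level and the demo file names are hypothetical.

```shell
#!/bin/bash
set -eu

# Hypothetical helper: summarise what a "<dir>/*" glob would match.
# Prints "<regular files> <subdirectories>" for the top level of the given directory.
count_top_level() {
    local dir="$1" files=0 dirs=0 entry
    for entry in "$dir"/*; do
        if [ -f "$entry" ]; then
            files=$((files + 1))
        elif [ -d "$entry" ]; then
            dirs=$((dirs + 1))
        fi
    done
    echo "$files $dirs"
}

# Demo on a throwaway directory so the sketch runs anywhere.
demo_dir=$(mktemp -d)
touch "$demo_dir/sample1.fastq.gz" "$demo_dir/sample2.fastq.gz"
mkdir "$demo_dir/raw"
summary=$(count_top_level "$demo_dir")
echo "top-level files and subdirectories: $summary"
rm -r "$demo_dir"
```

For a real run, call count_top_level "$local_dir" instead of the demo directory; a non-zero second number means some data sits in subdirectories and should be moved or uploaded separately.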