When you have several datasets stored as CSV files, reading them one at a time with read.csv is tedious. Is there a way to read them in bulk?
Example code
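library(magrittr) ## provides the %>% pipe used below
# setwd("./BB/Venn_13Oct") ## not needed if you work inside a project
# filenames <- list.files("./group1", full.names = TRUE) ## fine if the folder contains only CSV files; otherwise use:
filenames <- list.files("./group1", pattern = "\\.csv$", full.names = TRUE) ## pattern is a regular expression, so "\\.csv$" matches names ending in .csv. full.names = TRUE returns names with the relative path, e.g. ./R/com.report.csv; full.names = FALSE returns only the file names (compare with the second line of the for-loop example below)
ldf <- filenames %>% lapply(read.csv) ## use lapply to read every file in filenames into a list of data frames
names(ldf) <- substr(filenames, 10, 14) ## the names of ldf are empty at first; substr cuts the wanted part out of each path (positions 10 to 14 fit this folder and these file names)
df1 <- ldf[[1]] ## extract a single data frame; ldf[1] would return a one-element list instead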
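The fixed character positions passed to substr depend on the folder name and on every file name having the same length. As a minimal sketch of an alternative, the list names can be derived from the file names themselves with basename and tools::file_path_sans_ext. The temporary folder and the files a.csv and b.csv below are made up for illustration.

```r
# Hypothetical demo data: two small CSV files in a temporary folder
demo_dir <- file.path(tempdir(), "group1_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:3, x = c(10, 20, 30)),
          file.path(demo_dir, "a.csv"), row.names = FALSE)
write.csv(data.frame(id = 1:3, x = c(40, 50, 60)),
          file.path(demo_dir, "b.csv"), row.names = FALSE)

# The pattern is a regular expression, so escape the dot and anchor the end
filenames <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every file into a list of data frames
ldf <- lapply(filenames, read.csv)

# Name each element after its file, without the folder or the extension
names(ldf) <- tools::file_path_sans_ext(basename(filenames))

# [[ ]] returns the data frame itself; [ ] would return a one-element list
df_a <- ldf[["a"]]
```

If all files share the same columns, do.call(rbind, ldf) would stack them into a single data frame.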
Sometimes I need to merge several txt files into one. lapply is not as handy for that, so a for loop can be used instead, as follows:
fil_path_nam <- list.files("./R", full.names = TRUE)
fil_nam <- list.files("./R", full.names = FALSE)
temp <- c()
for (i in seq_along(fil_path_nam)) {
  temp0 <- readLines(fil_path_nam[i])
  temp <- c(temp, paste0("#----", fil_nam[i]), temp0) # the second element puts the file name above that file's contents
}
write(temp, file = "./R/all.txt")
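One thing to watch with the loop above is that all.txt is written into the same folder that list.files scans, so a second run would read the previous output as input. The sketch below is one way to avoid that. It also collects the pieces in a list before flattening them, instead of growing a vector. The folder and file names are hypothetical.

```r
# Hypothetical demo data: two small text files in a temporary folder
txt_dir <- file.path(tempdir(), "txt_demo")
dir.create(txt_dir, showWarnings = FALSE)
writeLines(c("line 1", "line 2"), file.path(txt_dir, "one.txt"))
writeLines("line 3", file.path(txt_dir, "two.txt"))

out_file <- file.path(txt_dir, "all.txt")

# List the inputs and leave out the output file from any earlier run
paths <- list.files(txt_dir, pattern = "\\.txt$", full.names = TRUE)
paths <- paths[basename(paths) != "all.txt"]

# One list slot per file: a header line with the file name, then the contents
pieces <- vector("list", length(paths))
for (i in seq_along(paths)) {
  pieces[[i]] <- c(paste0("#----", basename(paths[i])), readLines(paths[i]))
}

# Flatten the list into a single character vector and write it out
combined <- unlist(pieces)
writeLines(combined, out_file)
```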
This borrows from "How do I save the vectors produced inside a loop?".
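Some examples
In the example below I have 6 CSV files that all share one column, and I want to go a step further and merge them by that shared column into one data frame with the merge function.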
filenames <- names(ldf)
# write a function
d_merge_jj <- function(file1, file2, filename) {
  names(file1)[1] <- names(file2)[1] <- "id" # give the key column the same name in both data frames
  merge.file <- merge(file1, file2, by = "id", all.x = TRUE, sort = FALSE) # keep every row of file1
  # write the merged content to merge.<filename>.csv
  file <- paste0("merge.", filename, ".csv")
  write.table(merge.file, file, sep = ",", row.names = FALSE)
  print("Job done!")
}
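## Implementation
## pilot is assumed to be a data frame that is already loaded, with the shared column first
for (i in seq_along(ldf)) { ## 6 files in this example
  ao <- ldf[[i]] ## [[ ]] returns the data frame itself
  ao <- ao[, -1, drop = FALSE] ## drop the first column and keep the result a data frame
  d_merge_jj(ao, pilot, filenames[i])
}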
The idea was from Opening all files in a folder, and applying a function on
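The loop above merges each file with pilot and writes one output file per input. If the aim is a single data frame that joins all files on the shared column, one possible sketch uses Reduce together with merge. The three small data frames and the column name id are made up for illustration.

```r
# Hypothetical stand-ins for the data frames read from the CSV files
ldf_demo <- list(
  a = data.frame(id = c(1, 2, 3), a_val = c(10, 20, 30)),
  b = data.frame(id = c(1, 2, 3), b_val = c(0.1, 0.2, 0.3)),
  c = data.frame(id = c(1, 2),    c_val = c("x", "y"))
)

# Merge the data frames pairwise, left to right, on the shared "id" column.
# all = TRUE keeps ids that are missing from some files and fills in NA.
merged <- Reduce(function(x, y) merge(x, y, by = "id", all = TRUE), ldf_demo)
```

Reduce passes the running result as x and the next data frame as y, so the same call works for any number of files. Switching to all = FALSE would keep only the ids present in every file.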