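Preface
This article records some problems I ran into while parallelizing for loops in R.
Experiment
The functions provided by the foreach and doParallel packages are used here to run a for loop in parallel.
The for-loop script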
func <- function(x, y, z) {
  return(x^y/z)
}

# >>> main <<<
x <- 2
y <- 3
z <- 1:100000

# Element 3 of proc.time() is the elapsed wall-clock time in seconds
start <- (proc.time())[3][[1]]
a <- 0
for (i_z in z) {
  a <- a + func(x, y, i_z)
}
end <- (proc.time())[3][[1]]
print(paste('Result = ', round(a, 2), ', time = ', (end-start), 's', sep=''))
Output:
[1] "Result = 96.72, time = 0.177s"
Parallelized version
library(foreach)
library(doParallel)

func <- function(x, y, z) {
  return(x^y/z)
}

# >>> main <<<
x <- 2
y <- 3
z <- 1:100000

start <- (proc.time())[3][[1]]
# Start 12 worker processes and register them as the %dopar% backend
cl <- makeCluster(12)
registerDoParallel(cl)
# One task per element of z; the results are stacked with rbind
a <- foreach(z=z, .combine='rbind') %dopar% func(x, y, z)
a <- sum(a)
stopCluster(cl)
end <- (proc.time())[3][[1]]
print(paste('Result = ', round(a, 2), ', time = ', (end-start), 's', sep=''))
Output:
[1] "Result = 96.72, time = 37.988s"
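Summary
1. The parallel version takes longer than the sequential one because the operation being executed (func) is too simple, while foreach carries extra overhead of its own. Here that overhead far exceeds the cost of the operation itself, so the total time goes up after parallelization. Parallelization therefore pays off mainly for more expensive operations.
2. The .packages argument of foreach makes additional packages available to the code that runs in parallel.
3. The arguments passed to foreach are the variables to loop over; fixed values are passed directly in the call to func. An argument may also be a data.frame.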
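The three points above can be illustrated with a small sketch. It is not the script measured earlier: it assumes 2 workers (an arbitrary choice to keep it light), splits z into one chunk per worker so that each task does a meaningful amount of work, and uses file_ext() from the base tools package purely as a stand-in for "a function from an extra package". The names n_workers, chunks, files and exts are illustrative only.

```r
library(foreach)
library(doParallel)

func <- function(x, y, z) {
  x^y / z
}

x <- 2
y <- 3
z <- 1:100000

# Assumption: 2 workers, chosen only to keep the sketch light.
n_workers <- 2
cl <- makeCluster(n_workers)
registerDoParallel(cl)

# Point 1: split z into one chunk per worker so that each task does
# substantial work instead of a single division.
chunks <- split(z, cut(seq_along(z), n_workers, labels = FALSE))

# Point 3: only `chunk` is iterated by foreach; x and y are fixed and
# are simply passed in the call to func.
a <- foreach(chunk = chunks, .combine = '+') %dopar% {
  sum(func(x, y, chunk))
}

# Point 2: .packages attaches the named packages on each worker, so
# file_ext() from the tools package can be called without a prefix.
files <- c('report.csv', 'notes.txt')
exts <- foreach(f = files, .combine = 'c', .packages = 'tools') %dopar% {
  file_ext(f)
}

stopCluster(cl)

print(round(a, 2))
print(exts)
```

The value of a should agree with the result reported by the two scripts above. Whether the chunked version actually beats the plain loop depends on the machine, since starting the cluster still costs time, so it is worth timing it in your own environment.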
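Supplement: using for loops in R
Using for loops in R (multiple nested for loops)
With several nested for loops (three in this example), R runs them from the outside in: the innermost loop runs to completion for each iteration of the loop that encloses it, and so on outward until every loop has finished. Here is an example:
Code: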
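library(data.table)

mm <- data.table()
m <- c(1, 2, 3, 4, 5)
n <- c('a', 'b', 'c', 'd', 'e')
o <- c(6, 7, 8, 9, 10)
for (i1 in m) {
  for (i2 in n) {
    for (i3 in o) {
      # c() coerces the mixed values to character, hence the quoted output
      print(c(i1, i2, i3))
      aa <- data.table(i1, i2, i3)
      # Append the current combination to mm
      mm <- rbind(mm, aa)
    }
  }
}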
Output:
[1] "1" "a" "6"
[1] "1" "a" "7"
[1] "1" "a" "8"
[1] "1" "a" "9"
[1] "1" "a" "10"
[1] "1" "b" "6"
[1] "1" "b" "7"
[1] "1" "b" "8"
[1] "1" "b" "9"
[1] "1" "b" "10"
[1] "1" "c" "6"
[1] "1" "c" "7"
[1] "1" "c" "8"
[1] "1" "c" "9"
[1] "1" "c" "10"
[1] "1" "d" "6"
[1] "1" "d" "7"
[1] "1" "d" "8"
[1] "1" "d" "9"
[1] "1" "d" "10"
[1] "1" "e" "6"
[1] "1" "e" "7"
[1] "1" "e" "8"
[1] "1" "e" "9"
[1] "1" "e" "10"
[1] "2" "a" "6"
[1] "2" "a" "7"
[1] "2" "a" "8"
[1] "2" "a" "9"
[1] "2" "a" "10"
[1] "2" "b" "6"
[1] "2" "b" "7"
[1] "2" "b" "8"
[1] "2" "b" "9"
[1] "2" "b" "10"
[1] "2" "c" "6"
[1] "2" "c" "7"
[1] "2" "c" "8"
[1] "2" "c" "9"
[1] "2" "c" "10"
[1] "2" "d" "6"
[1] "2" "d" "7"
[1] "2" "d" "8"
[1] "2" "d" "9"
[1] "2" "d" "10"
[1] "2" "e" "6"
[1] "2" "e" "7"
[1] "2" "e" "8"
[1] "2" "e" "9"
[1] "2" "e" "10"
[1] "3" "a" "6"
[1] "3" "a" "7"
[1] "3" "a" "8"
[1] "3" "a" "9"
[1] "3" "a" "10"
[1] "3" "b" "6"
[1] "3" "b" "7"
[1] "3" "b" "8"
[1] "3" "b" "9"
[1] "3" "b" "10"
[1] "3" "c" "6"
[1] "3" "c" "7"
[1] "3" "c" "8"
[1] "3" "c" "9"
[1] "3" "c" "10"
[1] "3" "d" "6"
[1] "3" "d" "7"
[1] "3" "d" "8"
[1] "3" "d" "9"
[1] "3" "d" "10"
[1] "3" "e" "6"
[1] "3" "e" "7"
[1] "3" "e" "8"
[1] "3" "e" "9"
[1] "3" "e" "10"
[1] "4" "a" "6"
[1] "4" "a" "7"
[1] "4" "a" "8"
[1] "4" "a" "9"
[1] "4" "a" "10"
[1] "4" "b" "6"
[1] "4" "b" "7"
[1] "4" "b" "8"
[1] "4" "b" "9"
[1] "4" "b" "10"
[1] "4" "c" "6"
[1] "4" "c" "7"
[1] "4" "c" "8"
[1] "4" "c" "9"
[1] "4" "c" "10"
[1] "4" "d" "6"
[1] "4" "d" "7"
[1] "4" "d" "8"
[1] "4" "d" "9"
[1] "4" "d" "10"
[1] "4" "e" "6"
[1] "4" "e" "7"
[1] "4" "e" "8"
[1] "4" "e" "9"
[1] "4" "e" "10"
[1] "5" "a" "6"
[1] "5" "a" "7"
[1] "5" "a" "8"
[1] "5" "a" "9"
[1] "5" "a" "10"
[1] "5" "b" "6"
[1] "5" "b" "7"
[1] "5" "b" "8"
[1] "5" "b" "9"
[1] "5" "b" "10"
[1] "5" "c" "6"
[1] "5" "c" "7"
[1] "5" "c" "8"
[1] "5" "c" "9"
[1] "5" "c" "10"
[1] "5" "d" "6"
[1] "5" "d" "7"
[1] "5" "d" "8"
[1] "5" "d" "9"
[1] "5" "d" "10"
[1] "5" "e" "6"
[1] "5" "e" "7"
[1] "5" "e" "8"
[1] "5" "e" "9"
[1] "5" "e" "10"
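If the goal is to collect every combination into a single table, one common pattern is to store each row in a preallocated list and bind the rows once after the loops finish, rather than growing the table on every iteration. The sketch below is only an illustration of that pattern: it uses a base data.frame instead of data.table, and the names rows, k and result are made up for this example.

```r
m <- c(1, 2, 3, 4, 5)
n <- c('a', 'b', 'c', 'd', 'e')
o <- c(6, 7, 8, 9, 10)

# Preallocate one list slot per combination.
rows <- vector('list', length(m) * length(n) * length(o))
k <- 0
for (i1 in m) {
  for (i2 in n) {
    for (i3 in o) {
      k <- k + 1
      rows[[k]] <- data.frame(i1 = i1, i2 = i2, i3 = i3,
                              stringsAsFactors = FALSE)
    }
  }
}

# Bind all rows in a single call once the loops have finished.
result <- do.call(rbind, rows)
print(nrow(result))
print(head(result))
```

The rows of result appear in the same order as the printed output above, with the innermost variable i3 changing fastest.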
The above is based on personal experience; I hope it serves as a useful reference. If anything is wrong or incomplete, corrections are welcome.