R anti-tips
Not all R tips are equally good. Let's set the record straight.
Anti-tip #1: For loops are slower than functions in the apply family
Why should that be the case? Let's see what the R interpreter has to say about it. Let's get some numbers to chew on first:
> z = rnorm(10^6)
For loop first:
> system.time({x = 0; for(y in z) x = x + y})
   user  system elapsed
  0.521   0.004   0.526
To avoid the explicit loop, a good match here is the Reduce function, which may not be exactly in the apply family, but it was faster than several attempts I made using those functions.
> system.time({x = Reduce('+', z)})
   user  system elapsed
  0.461   0.030   0.491
Faster, but not by much. The true tip: use C.
> system.time({x = sum(z)})
   user  system elapsed
  0.002   0.000   0.002
Now that's 250 times faster. That's worth talking about. The reason is that the interpreter does almost nothing here; compiled code does all the work. No black magic. Let's see whether this is limited to sums or whether we can see the same effect again. First with the for loop:
> system.time({x = z; for (i in 1:length(z)) x[i] = x[i]^2})
   user  system elapsed
  2.110   0.030   2.139
Then an *apply type function:
> system.time({x = sapply(z, function(x) x^2)})
   user  system elapsed
  2.476   0.021   2.496
A tad slower, not sure if it is significant.
> system.time({x = z^2})
   user  system elapsed
  0.003   0.003   0.006
400X faster. Now that gets my attention. To do this right, one would have to put some confidence intervals around these numbers, but from experience using R, and knowing a little about R internals and about compiler and interpreter technology, I am confident the final answer will be that, for or apply, it doesn't really matter. As a matter of programming style, I believe apply functions to be far superior. I wrote a whole package using, I think, only two for loops, which seemed absolutely necessary. Speed is not the argument, though.
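The single timings in anti-tip #1 could be firmed up by repeating each measurement. What follows is only a rough sketch of that idea, not a proper benchmark: the helper time_repeatedly and the three square_* functions are names made up for this illustration, the vector is shortened to 10^5 elements so the repeats stay quick, and the 2.5% and 97.5% quantiles describe the empirical spread of the runs rather than a formal confidence interval.

```r
# Hypothetical helper (not from any package): call f() `reps` times and
# summarize the elapsed wall-clock times.
time_repeatedly <- function(f, reps = 10) {
  elapsed <- vapply(seq_len(reps),
                    function(i) system.time(f())[["elapsed"]],
                    numeric(1))
  c(mean = mean(elapsed), quantile(elapsed, c(0.025, 0.975)))
}

set.seed(1)
z <- rnorm(10^5)  # assumption: smaller than the 10^6 used above, to keep this quick

# Three ways of squaring every element of z.
square_for    <- function() { x <- z; for (i in seq_along(z)) x[i] <- x[i]^2; x }
square_sapply <- function() sapply(z, function(v) v^2)
square_vec    <- function() z^2

# The variants have to agree before their timings are worth comparing.
stopifnot(isTRUE(all.equal(square_for(), square_sapply())),
          isTRUE(all.equal(square_for(), square_vec())))

timings <- rbind(for_loop   = time_repeatedly(square_for),
                 sapply     = time_repeatedly(square_sapply),
                 vectorized = time_repeatedly(square_vec))
print(timings)
```

If the spread of the for_loop and sapply rows overlaps while the vectorized row sits far below both, that supports the reading above. The actual numbers will depend on the machine and the R version.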
Anti-tip #2: Don't use nested loops
This is a particularly pernicious anti-tip. The previous one would have resulted in people wasting time removing loops just to find out that their program was about as slow, though likely shorter and easier to understand. In this case the anti-tip discourages a very useful R optimization technique: optimizing only the innermost loop to reap most of the speed benefits. Let's see this in two steps. First, there is absolutely nothing wrong with nested loops. They are as slow as single loops with the same number of iterations:
> system.time({x = rnorm(10^6); I = numeric(10^6);
for (i in 1:10^6) {k = sample(I, 1); x[k] = x[k]^2}})
   user  system elapsed
  7.589   0.248   7.837
> system.time({x = rnorm(10^6); I = numeric(10^6);
for (i in 1:1000)
for(j in 1:1000) {k = sample(I, 1); x[k] = x[k]^2}})
   user  system elapsed
  7.486   0.233   7.770
With that notion put to rest, let's see the fast inner loop approach in action. This is with two loops:
> M = matrix(rnorm(10^6), ncol = 1000)
> system.time({for (i in 1:1000) for (j in 1:1000) M[i,j] = -M[i,j]})
   user  system elapsed
  2.369   0.041   2.410
And this is with the inner loop replaced by a vectorized operation:
> system.time({for (i in 1:1000) M[i,] = -M[i,]})
   user  system elapsed
  0.028   0.001   0.030
80X faster! You may say: yes, but you don't have nested loops any more. That is not the reason why it is faster, as the previous pair of examples showed. The reason is that the interpreter is going through only thousands of steps, while millions of steps take place in compiled code. Once you have given a 1000X "break" to the interpreter, that's enough to approach C speeds. Not completely, though:
> system.time({M = -M})
   user  system elapsed
  0.003   0.000   0.004
If you had 10 nested loops and the innermost required a large enough amount of work, say 1000 operations as a rule of thumb, then optimizing away that innermost loop would be enough to give a considerable boost. You would still have 9 nested loops, and it would approach C speeds. Nesting is not the problem; the problem is compiled vs interpreted code. The important message is that, depending on the algorithm, you may have to replace only a small fraction of your code with a fast library function or, at worst, rewrite it in C.
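To make the "optimize only the innermost loop" idea concrete with more than two levels of nesting, here is a small sketch. The array shape (20 x 20 x 1000) and the function names negate_all_loops and negate_inner_vectorized are assumptions chosen for this illustration; the innermost dimension is 1000 to match the rule of thumb above.

```r
set.seed(1)
A <- array(rnorm(20 * 20 * 1000), dim = c(20, 20, 1000))

# All three loops run in the interpreter: 20 * 20 * 1000 = 400,000 assignments.
negate_all_loops <- function(A) {
  for (i in seq_len(dim(A)[1]))
    for (j in seq_len(dim(A)[2]))
      for (k in seq_len(dim(A)[3]))
        A[i, j, k] <- -A[i, j, k]
  A
}

# The two outer loops stay nested; only the innermost loop is replaced by a
# vectorized operation over 1000 elements, leaving 20 * 20 = 400 assignments.
negate_inner_vectorized <- function(A) {
  for (i in seq_len(dim(A)[1]))
    for (j in seq_len(dim(A)[2]))
      A[i, j, ] <- -A[i, j, ]
  A
}

# Both versions must produce exactly what the fully vectorized -A produces.
stopifnot(identical(negate_all_loops(A), -A),
          identical(negate_inner_vectorized(A), -A))

print(system.time(negate_all_loops(A)))
print(system.time(negate_inner_vectorized(A)))
```

The second function still contains nested loops, so any speedup it shows comes from moving the innermost work into compiled code, not from removing the nesting.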