Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying R High Performance Programming
  • Table Of Contents Toc
  • Feedback & Rating feedback
R High Performance Programming

R High Performance Programming

By : Aloysius Shao Qin Lim, Tjhi W Chandra
4.4 (14)
close
close
R High Performance Programming

R High Performance Programming

4.4 (14)
By: Aloysius Shao Qin Lim, Tjhi W Chandra

Overview of this book

This book is for programmers and developers who want to improve the performance of their R programs by making them run faster with large data sets or who are trying to solve a pesky performance problem.
Table of Contents (12 chapters)
close
close
11
Index

Algorithm design affects time and space complexity

There is one other performance factor that we have not discussed—your code. The types of computations and algorithms that you run can have a huge impact on performance. Computer scientists describe the performance characteristics of programs in terms of complexity. In particular, we are concerned about two types of complexities:

  • Time complexity: This refers to the computing time required to run an R program in relation to the size of the data being processed
  • Space complexity: This refers to the memory that is required to run an R program in relation to the size of the data being processed

Let's look at an example of time complexity. Suppose that we need to write a function to compute the nth Fibonacci number, that is, a number in the sequence 0, 1, 1, 2, 3, 5, 8, 13, … where each number is the sum of the previous two numbers. A simple way to do this would be to write a recursive function such as:

fibonacci_rec <- function(n) {
    if (n <= 1) {
        return(n)
    }
    return(fibonacci_rec(n - 1) + fibonacci_rec(n - 2))
}

Since the nth Fibonacci number is the sum of the (n-1)th and (n-2)th Fibonacci numbers, this function simply calls itself to compute the previous two numbers, then adds them up. Let's see how long it takes to compute the 25th Fibonacci number using the microbenchmark() function from the microbenchmark package, which can be downloaded and installed from CRAN (we will take a closer look at how to use this function in Chapter 2, Measuring Code's Performance):

microbenchmark(fibonacci_rec(25), unit = "ms")
## Unit: milliseconds
##               expr      min    lq     mean   median       uq
##  fibonacci_rec(25) 170.1014 179.8 191.4213 183.5275 197.5833
##       max neval
##  253.1433   100

It took a median of 184 milliseconds. Because of the way the recursion works, there is a lot of unnecessary repetition. For example, to compute the 25th Fibonacci number, we need to compute the 23rd and 24th numbers in the sequence. But, computing the 24th number also involves computing the 23rd number, so the 23rd number is computed twice. And the 22nd number is needed to compute both the 23rd and 24th numbers, and so on.

We can reduce this repetition by computing each number only once. The following code presents an alternative implementation of the Fibonacci function that does just that. It computes the Fibonacci numbers in sequence from smallest to largest and remembers the numbers that it has computed in the numeric vector fib. Thus, each Fibonacci number is computed only once:

fibonacci_seq <- function(n) {
    if (n <= 1) {
        return(n)
    }
    # (n+1)th element of this vector is the nth Fibonacci number
    fib <- rep.int(NA_real_, n + 1)
    fib[1] <- 0
    fib[2] <- 1
    for (i in 2:n) {
        fib[i + 1] <- fib[i] + fib[i - 1]
    }
    return(fib[n + 1])
}

Tip

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

By benchmarking this sequential function, we see that it takes a median of 0.04 milliseconds to run, a reduction of 99.98 percent from the recursive version!

microbenchmark(fibonacci_seq(25), unit = "ms")
## Unit: milliseconds
##               expr     min       lq      mean    median      uq
##  fibonacci_seq(25) 0.03171 0.036133 0.0446416 0.0405555 0.04459
##                        max neval
##                   0.114714   100

To demonstrate the concept of time complexity, we ran the benchmark for different values of n ranging from 0 to 50. The median execution times are shown in the following figure:

Algorithm design affects time and space complexity

Execution time of recursive versus sequential versions of Fibonacci function

As we increase the value of n, the execution time of the recursive version of the Fibonacci function increases exponentially. It is roughly proportional to 1.6n—every time n increases by 1, it gets multiplied by about 1.6 times. The execution time increased so fast that it took too long to compute the Fibonacci numbers after the 50th one. On the other hand, though it is imperceptible from the chart, the execution time of the sequential version increases linearly—every increase in n increases the execution time by 1.3 microseconds. Since the computational complexity of the sequential version is much lower than that of the recursive version, it will perform much better as n increases. As a case in point, with a modest value of n=50, the sequential version took a fraction of a millisecond to get computed while the recursive version took over eight hours!

Though we will not do it here, a similar exercise can be conducted in order to compare the space complexity of different algorithms. Given a certain amount of computational resources, your choice of algorithm and the design of your code can have a big impact on your R program's ability to achieve the desired level of performance.

bookmark search playlist font-size

Change the font size

margin-width

Change margin width

day-mode

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Delete Bookmark

Modal Close icon
Are you sure you want to delete it?
Cancel
Yes, Delete

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY