Reading

Concurrent and Asynchronous Computing

Asynchronous computing refers to having events that occur independently of the primary control flow in our program.

In a traditional, synchronous program each statement or expression blocks while evaluating. In other words, it forces the program to wait until it finishes before execution continues. An asynchronous program, in contrast, has some statements that do not block, allowing the program to continue until either (1) the value of the earlier statement is needed or (2) execution resources such as CPU cores are exhausted.

In parallel programming we explicitly split portions of our program into chunks of code that are executed simultaneously. In concurrent programming we only specify which chunks of code can be executed independently of others, without committing to how they are run. A concurrent program can therefore be executed sequentially or in parallel.

Traditionally, concurrent programming has focused on I/O-bound tasks, where one is querying external servers or databases and would otherwise have to wait for each query to finish and return before sending the next request. Concurrency helps in this situation because it allows the program to wait in multiple queues at once. The video at this link explains how concurrency helps to load webpages more quickly.
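As a rough sketch of what "waiting in multiple queues at once" can look like, the code below simulates three slow queries. The function slow_query() and its one-second delay are hypothetical stand-ins for a real server or database request, and the sketch assumes the future package (introduced in the next section) is installed.

```r
library(future)
plan(multisession, workers = 3)  # assumption: three background R sessions

# Hypothetical stand-in for a slow external request.
slow_query <- function(id) {
  Sys.sleep(1)
  paste("result", id)
}

# Launch all three requests without waiting for any of them to finish.
queries <- lapply(1:3, function(id) future(slow_query(id)))

# Collect the results; each call to value() blocks until that query is done.
results <- vapply(queries, value, character(1))
results

plan(sequential)  # return to the default plan
```

Because the three simulated queries wait at the same time, the total wait should be closer to one second than to three, plus whatever overhead is needed to start and communicate with the background sessions.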

Concurrent Programming with Futures in R

The R package future provides utilities that allow us to write concurrent programs using an abstraction known as a future. Quoting the package author,

In programming, a future is an abstraction for a value that may be available at some point in the future.

Once the future has resolved, its value becomes available immediately. If we request the value of a future that has not yet resolved, the request blocks, causing our program to wait until the value becomes available.
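The difference between checking on a future and requesting its value can be sketched as follows. This is a minimal illustration rather than a recommended pattern; it uses future(), resolved(), and value() from the future package, and the two-second sleep and the value 42 are arbitrary choices.

```r
library(future)
plan(multisession, workers = 2)

# Start a slow computation in a background R session.
f <- future({
  Sys.sleep(2)
  42
})

# resolved() asks whether the future has finished without blocking.
# Right after creation this is usually FALSE.
early <- resolved(f)

# value() blocks until the future has resolved, then returns its value.
x <- value(f)

# Once the value has been collected, the future is resolved.
done <- resolved(f)

plan(sequential)  # return to the default plan
```

Between the calls to future() and value() the main R session is free to do other work, which is where the benefit of not blocking comes from.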

Implicit and Explicit Futures

An implicit future can be created using the future assignment operator %<-% from the future package.

Here is a pedagogical example.

First, using sequential code …

system.time(
 {
    a <- {
      Sys.sleep(2)
      2
    }

    b <- {
      Sys.sleep(1)
      3
    }
    a*b
 }
)
##    user  system elapsed 
##   0.000   0.000   3.009

Now using implicit futures …

library(future)
plan(multisession)
system.time(
 {
    a %<-% {
      Sys.sleep(2)
      2
    }
    
    b %<-% {
      Sys.sleep(1)
      3
    }
    a*b
 }
)
##    user  system elapsed 
##   0.436   0.002   2.270

We can also create explicit futures using the future() function and then call value() to retrieve the result.

system.time({
  a <- future( { Sys.sleep(2); 2 } )
  b <- future( { Sys.sleep(1); 3 } )
  value(a)*value(b)
})
##    user  system elapsed 
##   0.015   0.001   2.015

Controlling how futures are resolved

In the code above, we called plan(multisession) to specify that we want futures to be resolved in independent background R sessions. The other options we will explore are sequential, which resolves futures in the current R session, and multicore, which resolves them in forked R processes. See some examples here.
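As a hedged sketch of how the choice of plan changes the way the same code is resolved, the example below times the earlier two-future computation under each plan. The helper time_example() is hypothetical and written only for this illustration; workers = 2 is an arbitrary choice. Since multicore relies on forking, which is not available on Windows, the sketch guards it with supportsMulticore().

```r
library(future)

# Hypothetical helper: run the two-future example under the current plan
# and return the elapsed time in seconds.
time_example <- function() {
  start <- Sys.time()
  a <- future({ Sys.sleep(2); 2 })
  b <- future({ Sys.sleep(1); 3 })
  product <- value(a) * value(b)
  as.numeric(difftime(Sys.time(), start, units = "secs"))
}

# sequential: each future is resolved in the current session, one at a time.
plan(sequential)
t_seq <- time_example()

# multisession: futures are resolved in background R sessions.
plan(multisession, workers = 2)
t_multi <- time_example()

# multicore: futures are resolved in forked processes, where supported.
if (supportsMulticore()) {
  plan(multicore, workers = 2)
  t_fork <- time_example()
}

plan(sequential)  # return to the default plan
```

Under the sequential plan the two sleeps add up, so the elapsed time should be about three seconds. Under the other plans the sleeps overlap, so it should be nearer two seconds plus some overhead.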