Memory and Memory Profiling in R

Resources

The material in this lesson is largely based on:

Chapter 18 of Advanced R
Chapter 14 of The Art of R Programming

General Information on Memory

You likely already know that binary data is composed of bits (0 or 1). You may also know that computer systems generally use bytes containing multiple bits as the basic addressable unit.
0In modern computer architectures a byte usually consists of 8 bits. When describing a data storage quantity in terms of bytes we traditionally use prefixes based on powers of 2, so that 1KB \(= 2^{10} (1024)\) bytes, a MB = \(2^{20}\) (1,048,576) bytes etc.

You may also want to review the section ‘Storage’ from Professor Shedden’s notes.

An understanding of virtual memory, segmentation, and caching may be of use at some point in the future but is not required for this course.

Understanding memory usage in R

A useful tool for exploring memory management in R is Hadley Wickham’s pryr package. It contains a function pryr::object_size similar to object.size but with these differences:

It accounts for memory shared by elements in an object
It attempts to account for memory from environments associated with an object.

Here are some examples.

library(pryr); library(ggplot2)

nyc14 = data.table::fread('https://github.com/arunsrinivasan/flights/wiki/NYCflights14/flights14.csv')

# Simple size of objects
object.size(1:10)

## 88 bytes

pryr::object_size(1:10)

## 88 B

object.size(nyc14)

## 21469968 bytes

print(object.size(nyc14), units='Mb')

## 20.5 Mb

object_size(nyc14)

## 21.5 MB

compare_size(nyc14)

##     base     pryr 
## 21469968 21468872

## How to understand the difference? 
# Example from object_size documentation.
x = 1:1e5
z = list(x, x, x)
compare_size(z)

##    base    pryr 
## 1200192  400112

object_size(x)

## 400 kB

object_size(z)

## 400 kB

object_size(x,z)

## 400 kB

To understand this last example, we need to recall R’s “copy on modify semantics” discussed during the lessons on data.table. Here we will use the function tracemem() to get the memory address of an object. When this changes we can be sure that an object has been copied. Similar functionality is available via pryr::address() or data.table::address()..

## Recall R has "copy-on-modify semantics"
invisible(tracingState(on=FALSE))
tracemem(x)

## [1] "<0x10c350000>"

tracemem(z)

## [1] "<0x7ff16873cd68>"

## pryr::address doesn't work well with knitr, but would be the 
## same interactively
pryr::address(z)

## [1] "0x7ff165e8af48"

data.table::address(z)

## [1] "0x7ff16873cd68"

inspect(z)

## <VECSXP 0x7ff16873cd68>
##   <INTSXP 0x10c2ee000>
##   <INTSXP 0x10c28c000>
##   <INTSXP 0x10c3b2000>

# Modify z, tracemem causes message to print
z[[2]] = -x

sapply(z, tracemem)

## [1] "<0x10c2ee000>" "<0x10c565000>" "<0x10c3b2000>"

# Having changed z[[2]], the object is larger. 
object_size(z)

## 1.2 MB

object_size(x,z)

## 1.6 MB

Be aware that memory profiling is a compile time option, meaning it may not be available on all instances of R. By default, the mac OS and Windows builds distributed through CRAN have memory profiling enabled.

Here is an example of how environments impact object_size.

f <- function(){
  x <- 1:1e5
  a ~ b
}
compare_size(f())

##   base   pryr 
##    712 400864

g <- function(){
  a ~ b
}
compare_size(g())

## base pryr 
##  712  712

h <- function(){
  x <- 1:1e5
  x~y
}
compare_size(h())

##   base   pryr 
##    712 400808

object_size(a~b)

## 656 B

object_size(1:1e5)

## 400 kB

Two other useful functions from pryr are mem_used() which adds up the total size of all objects in memory and mem_change() which tracks changes to this quantity. When working with mem_change() ignore anything ~2KB or smaller as this mostly is impacted by changes to .Rhistory.

ls()

## [1] "f"     "g"     "h"     "nyc14" "x"     "z"

mem_used()

## 64.1 MB

mem_change({new_vec = 1:1e6})

## 4 MB

# Negatives represent memory freed
mem_change(rm(new_vec))

## -4 MB

mem_change(rm(nyc14))

## -21.5 MB

Vector size

This example is taken from section 18.1 of Advanced R.

Here we exam the size, in bytes, of R vectors of class integer with lengths 0 through 100.

sizes = sapply(0:100, function(n) object_size(seq_len(n)))
plot(0:100, sizes, xlab='Vector Length', ylab='Size (B)', type='s')
abline(h=40,lty='dashed')

sizes[1:20]

##  [1]  40  48  48  56  56  72  72  72  72  88  88  88  88 104 104 104 104
## [18] 168 168 168

It turns out that empty vectors of any type occupy 40 bytes of memory,

sapply(c('numeric', 'logical', 'integer', 'raw', 'list'),
       function(x) object.size(vector(mode=x, length=0))
       )

## numeric logical integer     raw    list 
##      40      40      40      40      40

These bytes are used to store the following components:

(4 bytes) Metadata including the type and some other information
(16 = 2*8 bytes) Pointers to the next and previous object in memory.
(8 bytes) A pointer to the attributes.
(4 bytes) Vector length
(4 bytes) “True length” used primarily for environments.

The additional 4 bytes are used for padding to ensure each of these elements starts on an 8 byte boundary. These boundaries are generally required by 64 bit CPU architectures, see here.

How do we interpret the remaining steps in the graph? First, consider the regular steps for the later vectors:

diff(sizes[41:45])

## [1] 8 0 8 0

For vectors beyond 128 bytes in size (excluding overhead) R requests memory from the OS in 8 byte chunks using the C function malloc(). Since an integer occupies 4 bytes, the memory increases every other integer.

## Adjusted sizes
plot(0:100, sizes-40, xlab='Vector Length', ylab='Size (B) less overhead', type='s')
abline(h=0,lty='dashed')
abline(v=c(41,43), lty='dotted')
abline(h={8*c(1,2,4,6,8,16)}, col='blue', lty='dotted')

For vectors, smaller than 128 bytes in size R performs its own memory management using something called the ‘small vector pool’ to avoid unnecessary requests to the OS for RAM. For simplicity, it only allocates specific multiples of 8 bytes as shown in the plot. Note that these value correspond to the data held by the vector and not the 40 B of overhead. This small vector pool is expanded by a page in increments of 2000 bytes as needed.

Exercises

Question 1: What vector lengths are shown by the vertical lines in the plot below? Where do the horizontal lines intersect the y-axis?

Question 2: Recall that an integer type is stored using 4 bytes while a numeric type uses 8 bytes. What are the values of a and b after running the R code below?

a = object_size(1:15)
b = object_size(as.numeric(1:15))

Question 3: What are the approximate values of mem_a - mem_c below?

x = 1:1e6
z = list(x, x, x)
object_size(x, z)

## 4 MB

mem_a = object_size(z) - object_size(x)                   # Exact in bytes
mem_b = mem_change(z[[1]] <- rep(2L, 1e6))                # Approximate in Mb
mem_c = mem_change(z[[2]][sample(1:1e6, 1)] <-  runif(1)) # Approximate in Mb

Strings and factors

We noted above that integers are stored using 4 bytes (32 bits) and doubles 8 bytes (64 bits). What about characters? According to the R documenation on CRAN, R uses a global pool of strings and pointers to them in actual strings.

# pointers to strings take 8 bytes each
x = 'abc'
x1e5 = rep(x,1e5)
object_size(x)

## 96 B

object_size(x1e5)

## 800 kB

object_size(x,x1e5)

## 800 kB

The global pool stores both the encoding of each string and the actual bytes.

In contrast, R objects of class factor are stored as integers encoding the levels which are in turn strings. Since integers occupy only 4 bytes, if there are only a few levels the factor can have a smaller memory footprint.

x = sample(1:3,1e5, replace=TRUE)
object_size(x) - 40

## 400 kB

x_char = LETTERS[x]
object_size(x_char) - 40

## 800 kB

x_factor = as.factor(x_char) 
object_size(x_factor) - 40

## 401 kB

In contrast, if there are many levels the factor may have a larger footprint. Read more from data.table author Matt Dowle here.

Profiling with `Rprof` and `profviz`

The Rprof() function can be used to profile R code for both speed and memory usage by sampling. This works by recording in a log every so often (by default .02 s) what functions are currently on the stack. It will also

Example: Screening Correlation Coefficients

Recall our comparisons of various R implementations for screening correlation coefficients for a large number of possible predictors in the rows of a matrix xmat with a single outcome y. Here we will add a Fisher transform as well and return just the indices where the sample coefficients is nominally significant.

# Example data
n = 3e2
m = 1e5
yvec = rnorm(n)
xmat = outer(array(1, m), yvec)
rmat = matrix(runif(m, -.8, .8), m, n)
xmat = rmat*xmat + sqrt(1 - rmat^2)*rnorm(n * m)

object_size(xmat)

## 240 MB

object_size(yvec)

## 2.44 kB

# When memory is an issue, be sure to clean up when possible. 
mem_change(rm(rmat))

## -240 MB

# Functions to compare
cor_screen_1 = function(yvec, xmat){
  r1 = NULL
  for (i in 1:m) {
    r1[i] = cor(xmat[i, ], yvec)
  }
  z = {.5*{log(1+r1) - log(1-r1)}}*sqrt(length(yvec)-3)
  
  which(abs(z)>qnorm(.975))
}

cor_screen_2 = function(yvec, xmat){
  r2 = apply(xmat, 1, function(v) cor(v, yvec))
  z = {.5*{log(1+r2) - log(1-r2)}}*sqrt(length(yvec)-3)

  which(abs(z)>qnorm(.975))
}


cor_screen_3 = function(yvec, xmat){
  rmn = rowMeans(xmat)
  xmat_c = xmat - outer(rmn, array(1, n)) 
  rsd = apply(xmat, 1, sd)
  xmat_s = xmat_c / outer(rsd, array(1, n))
  yvec_s = {yvec - mean(yvec)} / sd(yvec)
  r3 = xmat_s %*% yvec_s / {n - 1}
  
  z = as.vector({.5*{log(1+r3) - log(1-r3)}} * sqrt(length(yvec)-3))
  
  which(abs(z)>qnorm(.975))
}

cor_screen_4 = function(yvec, xmat){
  rmn = rowMeans(xmat)
  xmat_c = xmat - rmn
  rvar = rowSums(xmat_c^2) / {dim(xmat)[2] - 1}
  rsd = sqrt(rvar)
  xmat_s = xmat_c / rsd
  yvec_s = {yvec - mean(yvec)} / sd(yvec)
  r4 = xmat_s %*% yvec_s / {n - 1}
  
  z = as.vector({.5*{log(1+r4) - log(1-r4)}} * sqrt(length(yvec)-3))

  which(abs(z)>qnorm(.975))
}

# Check that all are equal
s = list(cor_screen_1(yvec, xmat), cor_screen_2(yvec, xmat),
         cor_screen_3(yvec, xmat), cor_screen_4(yvec, xmat)
         )
sapply(2:4, function(i) setdiff(s[[1]],s[[i]]))

## [[1]]
## integer(0)
## 
## [[2]]
## integer(0)
## 
## [[3]]
## integer(0)

Here is an example of profiling cor_screen_1 for speed with Rprof().

Rprof(memory.profiling = TRUE, interval=.002)
 invisible(cor_screen_1(yvec, xmat))
Rprof(NULL)
summaryRprof(memory = 'both')

## $by.self
##                 self.time self.pct total.time total.pct mem.total
## "cor"               0.216    20.65      0.976     93.31    1841.8
## "is.data.frame"     0.154    14.72      0.154     14.72     273.4
## "stopifnot"         0.130    12.43      0.396     37.86     696.7
## "match.call"        0.110    10.52      0.242     23.14     409.9
## "cor_screen_1"      0.068     6.50      1.046    100.00    1967.0
## "eval"              0.064     6.12      1.046    100.00    1967.0
## "sys.function"      0.056     5.35      0.082      7.84     142.1
## "sys.call"          0.052     4.97      0.068      6.50     112.6
## "sys.parent"        0.042     4.02      0.042      4.02      80.3
## "match.arg"         0.040     3.82      0.186     17.78     413.1
## "formals"           0.036     3.44      0.064      6.12     142.0
## "pmatch"            0.022     2.10      0.022      2.10      23.4
## "parent.frame"      0.014     1.34      0.014      1.34      21.1
## "all"               0.008     0.76      0.008      0.76       7.9
## "c"                 0.008     0.76      0.008      0.76      24.6
## "is.atomic"         0.006     0.57      0.006      0.57      16.5
## "anyNA"             0.004     0.38      0.004      0.38       5.4
## "is.pairlist"       0.004     0.38      0.004      0.38       6.9
## "length"            0.004     0.38      0.004      0.38      12.6
## "abs"               0.002     0.19      0.002      0.19       1.5
## "baseenv"           0.002     0.19      0.002      0.19       7.3
## "invisible"         0.002     0.19      0.002      0.19       4.5
## "is.na"             0.002     0.19      0.002      0.19       1.4
## 
## $by.total
##                       total.time total.pct mem.total self.time self.pct
## "cor_screen_1"             1.046    100.00    1967.0     0.068     6.50
## "eval"                     1.046    100.00    1967.0     0.064     6.12
## "block_exec"               1.046    100.00    1967.0     0.000     0.00
## "call_block"               1.046    100.00    1967.0     0.000     0.00
## "evaluate_call"            1.046    100.00    1967.0     0.000     0.00
## "evaluate"                 1.046    100.00    1967.0     0.000     0.00
## "handle"                   1.046    100.00    1967.0     0.000     0.00
## "in_dir"                   1.046    100.00    1967.0     0.000     0.00
## "knitr::knit"              1.046    100.00    1967.0     0.000     0.00
## "process_file"             1.046    100.00    1967.0     0.000     0.00
## "process_group.block"      1.046    100.00    1967.0     0.000     0.00
## "process_group"            1.046    100.00    1967.0     0.000     0.00
## "rmarkdown::render"        1.046    100.00    1967.0     0.000     0.00
## "timing_fn"                1.046    100.00    1967.0     0.000     0.00
## "withCallingHandlers"      1.046    100.00    1967.0     0.000     0.00
## "withVisible"              1.046    100.00    1967.0     0.000     0.00
## "cor"                      0.976     93.31    1841.8     0.216    20.65
## "stopifnot"                0.396     37.86     696.7     0.130    12.43
## "match.call"               0.242     23.14     409.9     0.110    10.52
## "match.arg"                0.186     17.78     413.1     0.040     3.82
## "is.data.frame"            0.154     14.72     273.4     0.154    14.72
## "sys.function"             0.082      7.84     142.1     0.056     5.35
## "sys.call"                 0.068      6.50     112.6     0.052     4.97
## "formals"                  0.064      6.12     142.0     0.036     3.44
## "sys.parent"               0.042      4.02      80.3     0.042     4.02
## "pmatch"                   0.022      2.10      23.4     0.022     2.10
## "parent.frame"             0.014      1.34      21.1     0.014     1.34
## "all"                      0.008      0.76       7.9     0.008     0.76
## "c"                        0.008      0.76      24.6     0.008     0.76
## "is.atomic"                0.006      0.57      16.5     0.006     0.57
## "anyNA"                    0.004      0.38       5.4     0.004     0.38
## "is.pairlist"              0.004      0.38       6.9     0.004     0.38
## "length"                   0.004      0.38      12.6     0.004     0.38
## "abs"                      0.002      0.19       1.5     0.002     0.19
## "baseenv"                  0.002      0.19       7.3     0.002     0.19
## "invisible"                0.002      0.19       4.5     0.002     0.19
## "is.na"                    0.002      0.19       1.4     0.002     0.19
## "which"                    0.002      0.19       1.5     0.000     0.00
## 
## $sample.interval
## [1] 0.002
## 
## $sampling.time
## [1] 1.046

Here are two other options for the memory parameter.

#summaryRprof(memory = 'tseries')
summaryRprof(memory = 'stats')

## index: "rmarkdown::render":"knitr::knit"
##      vsize.small  max.vsize.small      vsize.large  max.vsize.large 
##            12747           414647           308554        120000044 
##            nodes        max.nodes     duplications tot.duplications 
##          3282033         35004816             3060          1600184 
##          samples 
##              523

Using `profviz` to visualize profiling information

The profviz package is built on Rprof() but aims to provide more useful summary information.

#install.packages('profvis')
library(profvis)
profvis(cor_screen_1(yvec, xmat), interval = .005)

profvis(cor_screen_2(yvec, xmat), interval = .005)

profvis(cor_screen_3(yvec, xmat), interval = .005)

profvis(cor_screen_4(yvec, xmat), interval = .005)