Object Oriented Programming

When we attach values to names in an R environment, we generally refer to the name and value collectively as an ‘object’. Object oriented programming is a programming paradigm built around the notions of classes, methods, and, of course, objects. There is a wide variety of object oriented languages, and R has (at least) three object oriented (OO) systems you should be aware of.

Before digging into R’s OO systems it will be helpful to define a few terms.

Reading

The S3 system in R

The S3 system in R is based on the idea of generic functions. The basic idea is that a generic function is used to dispatch a class-specific method for a given object. Some common S3 generic functions in R include print, summary, plot, mean, head, tail, and str. If we look at the definitions of these functions, we see that each is defined simply as a call to UseMethod.

print
## function (x, ...) 
## UseMethod("print")
## <bytecode: 0x7fe209e4d758>
## <environment: namespace:base>
summary
## function (object, ...) 
## UseMethod("summary")
## <bytecode: 0x7fe20b765678>
## <environment: namespace:base>
head
## function (x, ...) 
## UseMethod("head")
## <bytecode: 0x7fe20afe9b10>
## <environment: namespace:utils>

When UseMethod is called R searches for an S3 method based on the name of the generic function and the class of its first argument. As an example, let’s construct a matrix object mat and examine a call to head(mat):

mat = matrix(1:45, nrow=9, ncol=5)
class(mat)
## [1] "matrix"
head(mat)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   10   19   28   37
## [2,]    2   11   20   29   38
## [3,]    3   12   21   30   39
## [4,]    4   13   22   31   40
## [5,]    5   14   23   32   41
## [6,]    6   15   24   33   42

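The object mat has class matrix, so UseMethod("head") searches for a function (method) called head.matrix() to apply to mat. (In R 4.0.0 and later, class(mat) returns "matrix" "array"; the output shown here is from an earlier version.)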

head.matrix
## function (x, n = 6L, ...) 
## {
##     stopifnot(length(n) == 1L)
##     n <- if (n < 0L) 
##         max(nrow(x) + n, 0L)
##     else min(n, nrow(x))
##     x[seq_len(n), , drop = FALSE]
## }
## <bytecode: 0x7fe20bc38780>
## <environment: namespace:utils>

You can see all the methods associated with a generic function using methods().

methods(head)
## [1] head.data.frame* head.default*    head.ftable*     head.function*  
## [5] head.matrix      head.table*     
## see '?methods' for accessing help and source code

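When an object has more than one class, R works through the class vector in order until it finds a class that has a method for the generic. Here there is no head.green method, so head.matrix is used.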

class(mat) = c('green', class(mat))
class(mat)
## [1] "green"  "matrix"
head(mat)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   10   19   28   37
## [2,]    2   11   20   29   38
## [3,]    3   12   21   30   39
## [4,]    4   13   22   31   40
## [5,]    5   14   23   32   41
## [6,]    6   15   24   33   42
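
The search order can be made visible with a small generic of our own. The sketch below is illustrative only: the generic describe and its methods are hypothetical names, not part of base R.

```r
# Hypothetical generic used only to illustrate S3 dispatch order.
describe <- function(x, ...) UseMethod("describe")

# Method for objects whose class vector contains "matrix".
describe.matrix <- function(x, ...) "matrix method"

# Fallback used when no class-specific method is found.
describe.default <- function(x, ...) "default method"

m <- matrix(1:4, nrow = 2)
class(m) <- c("green", "matrix")

# No describe.green exists, so dispatch moves on to "matrix".
describe(m)
## [1] "matrix method"

# A class with no method of its own falls through to the default.
describe(structure(1:3, class = "purple"))
## [1] "default method"
```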

If no class-specific method is found, an S3 generic falls back to its default method (for example head.default) if one is defined, and throws an error otherwise. Some methods can be called explicitly:

head.matrix(mat)
##      [,1] [,2] [,3] [,4] [,5]
## [1,]    1   10   19   28   37
## [2,]    2   11   20   29   38
## [3,]    3   12   21   30   39
## [4,]    4   13   22   31   40
## [5,]    5   14   23   32   41
## [6,]    6   15   24   33   42

but others, such as head.default, are not exported from their namespace. You can view the source code for such unexposed S3 methods using getS3method('generic','class').

getS3method('head', 'default')
## function (x, n = 6L, ...) 
## {
##     stopifnot(length(n) == 1L)
##     n <- if (n < 0L) 
##         max(length(x) + n, 0L)
##     else min(n, length(x))
##     x[seq_len(n)]
## }
## <bytecode: 0x7fe20c850040>
## <environment: namespace:utils>

Defining a new S3 method is as simple as defining a function and naming it generic.class. Here we define a method head.green.

head.green = function(obj, ...) {
  # Warn if obj has no class attribute
  if (!is.object(obj)) warning("Object 'obj' is not an object!")

  # Check if it's green
  if ('green' %in% class(obj)) {
    if (length(class(obj)) > 1) {
      # Report the first class other than 'green'
      next_class = setdiff(class(obj), 'green')[1]
      cat('This is a green ', next_class, '.\n', sep='')
    } else {
      cat('This is a generic green object.\n')
    }
  } else {
    cat('The object is not green!\n')
  }
}

Now we can test it under various conditions.

## We previously assigned
class(mat)
## [1] "green"  "matrix"
head(mat)
## This is a green matrix.
## Test head.green for generic class
class(mat) = 'green'
head(mat)
## This is a generic green object.
## Test on a non-green object
red_obj = 1:100
class(red_obj) = 'red'
head.green(red_obj)
## The object is not green!
head(red_obj)
## [1] 1 2 3 4 5 6
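
The head.green method above replaces the usual behaviour of head entirely. If the goal is instead to add some behaviour and then fall back on the next method in the class vector, NextMethod can be used. A minimal sketch, using a hypothetical class blue so that head.green is not overwritten:

```r
# Hypothetical method for a made-up class "blue". The signature matches
# the head generic, function(x, ...), so extra arguments such as n are
# passed along.
head.blue <- function(x, ...) {
  cat("This is a blue object.\n")
  # Hand over to the method for the next class in the class vector,
  # or to head.default if no further class has a method.
  NextMethod()
}

blue_mat <- matrix(1:45, nrow = 9, ncol = 5)
class(blue_mat) <- c("blue", "matrix")

# Prints the message, then returns the first two rows via head.matrix.
res <- head(blue_mat, n = 2)
dim(res)
## [1] 2 5
```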

We can also define our own S3 generic functions via UseMethod.

# Generic Color finder
getColor = function(obj) {
  UseMethod("getColor")
}

# Default method 
getColor.default = function(obj) {
  # Are any classes colors?
  ind = class(obj) %in% colors()
  if (any(ind)) {
    # Yes. Return first color.
    class(obj)[which(ind)[1]]
  } else {
    # No. Return a random color.
    sample(colors(), 1)
  }
}

# Specific method for green
getColor.green = function(obj) {
  "darkgreen"
}
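
To see which method is chosen, the generic can be called on a few objects directly. The block below restates the definitions above in compact form so that it runs on its own; the objects g, r, and u are hypothetical.

```r
getColor <- function(obj) UseMethod("getColor")

getColor.default <- function(obj) {
  # Use the first class that is also a color name, else a random color.
  ind <- class(obj) %in% colors()
  if (any(ind)) class(obj)[which(ind)[1]] else sample(colors(), 1)
}

getColor.green <- function(obj) "darkgreen"

g <- structure(rnorm(5), class = "green")
r <- structure(rnorm(5), class = "red")
u <- rnorm(5)

getColor(g)  # getColor.green is dispatched
## [1] "darkgreen"
getColor(r)  # no getColor.red, so the default finds "red" in colors()
## [1] "red"
getColor(u) %in% colors()  # the default picks a random, but valid, color
## [1] TRUE
```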

As a quick example of how we might use this, we could define a col_boxplot function to pick colors according to the class of the object passed.

# A box plot function that uses the class attribute to define colors.
col_boxplot = function(dat, ...) {
  if(is.atomic(dat)){
    boxplot(dat, col=getColor(dat), ...)
  } else{
    col = sapply(dat, getColor)
    boxplot(dat, col=col, ...)
  }
}
# Define some iid data
x = rnorm(100, 1, 1); class(x) = 'green'
y = rnorm(100, 0, 2); class(y) = 'red'
z = rnorm(100, 0, 1)
col_boxplot(list(x=x, y=y, z=z), las=1)

# Calling again draws a new random color for z, which has no color class.
col_boxplot(list(x=x, y=y, z=z), las=1)

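You should be aware that the class of the object returned by generic functions can depend on the class of the input. For arithmetic operators, attributes such as class are copied from both operands, with those of the first taking precedence, whereas mean returns a plain numeric value.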

class(x + y)
## [1] "green"
class(y + x)
## [1] "red"
class(mean(x))
## [1] "numeric"
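
The same behaviour can be reproduced with smaller objects. In the sketch below, a and b are hypothetical; the last line shows one way to obtain an ordinary vector, by dropping the class attributes before doing arithmetic.

```r
a <- structure(1:3, class = "green")
b <- structure(4:6, class = "red")

class(a + b)  # class attribute taken from the first operand
## [1] "green"
class(b + a)
## [1] "red"

# Dropping the class attributes first gives an ordinary vector.
class(unclass(a) + unclass(b))
## [1] "integer"
```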

Defining an S3 class

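The majority of S3 objects are simply lists plus a class attribute. As an example, consider the class lm returned by the lm function for linear regression modeling. Borrowing the approach of head.function, we can deparse lm and search its source for the lines that build the list and set the class: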

# How does head.function work?
getS3method('head','function')
## function (x, n = 6L, ...) 
## {
##     lines <- as.matrix(deparse(x))
##     dimnames(lines) <- list(seq_along(lines), "")
##     noquote(head(lines, n = n))
## }
## <bytecode: 0x7fe20cc84678>
## <environment: namespace:utils>
# Borrow the structure to read lm line by line
lines = deparse(lm)

# Print lines involving 'list' keyword
noquote(lines[grep('list', lines)])
## [1]         z <- list(coefficients = if (is.matrix(y)) matrix(, 0,
# Print lines involving 'class' keyword
noquote(lines[grep('class', lines)])
## [1]     class(z) <- c(if (is.matrix(y)) "mlm", "lm")
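
The same pattern, a list plus a class attribute, is all that is needed to define a class of our own. A minimal sketch: the class name color_sample, the constructor new_color_sample, and its fields are hypothetical and chosen only for illustration.

```r
# Hypothetical constructor: build a list and attach a class attribute.
new_color_sample <- function(data, color) {
  stopifnot(is.numeric(data), is.character(color), length(color) == 1)
  structure(list(data = data, color = color), class = "color_sample")
}

# A print method so objects of the new class display compactly.
print.color_sample <- function(x, ...) {
  cat("<color_sample>", length(x$data), "values, color:", x$color, "\n")
  invisible(x)
}

s <- new_color_sample(rnorm(100), "darkgreen")
class(s)
## [1] "color_sample"
is.list(s)
## [1] TRUE
print(s)
```

Checking the inputs inside the constructor is a convention rather than a requirement; S3 itself does not enforce anything about the contents of the list.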

The S4 System

The S3 system described above is very flexible, which makes it easy to work with, but this comes at the expense of the safety and uniformity of a more formal OO system.

The S4 system is a more formal OO system in R. One key difference is that S4 classes have formal definitions, and classes, methods, and generics must all be explicitly defined as such.

Defining an S4 class

S4 classes are defined using the setClass function:

# The slots argument is preferred to representation() in current R
setClass("color_vector",
   slots = c(
     name='character',
     data='numeric',
     color='character'
   )
)

Create a new instance of an S4 class using new:

x = new("color_vector", name="x", color="darkgreen")
x
## An object of class "color_vector"
## Slot "name":
## [1] "x"
## 
## Slot "data":
## numeric(0)
## 
## Slot "color":
## [1] "darkgreen"
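
One practical consequence of the formal definition is that slot types are checked when an object is created. A brief sketch, restating the class so the block runs on its own (it uses the slots argument of setClass, which is equivalent to representation here):

```r
library(methods)  # attached by default in interactive sessions

setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)

# A value of the wrong type for a slot is rejected by new().
res <- tryCatch(
  new("color_vector", name = 1, color = "darkgreen"),
  error = function(e) "rejected: slot 'name' must be character"
)
res
## [1] "rejected: slot 'name' must be character"
```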

The function new is used above as a constructor for creating an object of the desired class. Most S4 classes defined in packages come with their own constructors, which you should use when available. setClass invisibly returns a generator function, so we can create a default constructor by assigning its output to a name:

color_vector =
 setClass("color_vector",
   slots = c(
     name='character',
     data='numeric',
     color='character'
   )
 )
y = color_vector(name="y", data = rnorm(100, 0, 2), color="red")
class(y)
## [1] "color_vector"
## attr(,"package")
## [1] ".GlobalEnv"

You could also create an explicit constructor by writing a function that calls new and manipulates the object in some way, for example by providing default values for slots.
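
A minimal sketch of such a constructor follows. The function name make_color_vector and its default values are hypothetical; the class is restated so the block runs on its own.

```r
library(methods)

setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)

# Hypothetical explicit constructor with defaults for name and color.
make_color_vector <- function(data = numeric(0), name = "unnamed",
                              color = "black") {
  new("color_vector", name = name, data = data, color = color)
}

z <- make_color_vector(rnorm(10))
z@color
## [1] "black"
z@name
## [1] "unnamed"
```

Defaults could alternatively be supplied through the prototype argument of setClass; a wrapper function is convenient when the defaults need to be computed or the inputs checked first.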

Accessing slots in an S4 object

You can access and set the slots of an S4 object using the @ operator, the slot function, or, because slots are stored as attributes, an attr(obj, 'name') construction:

## Access slots using @
x@color
## [1] "darkgreen"
# Assign some data to the data slot
x@data = rnorm(10, 1, 1)

# Check the color
slot(x, 'color')
## [1] "darkgreen"
# Change the name of x
attr(x, 'name') = 'Green Values'
names(attributes(x))
## [1] "name"  "data"  "color" "class"
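
These approaches are not entirely interchangeable. Assignment with @ checks the new value against the declared class of the slot, whereas an attr assignment does not, so an explicit validObject call is needed to catch the problem afterwards. A short sketch with a hypothetical object cv:

```r
library(methods)

setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)
cv <- new("color_vector", name = "cv", color = "darkgreen")

slotNames(cv)
## [1] "name"  "data"  "color"

# Assignment with @ validates the type of the new value.
at_ok <- tryCatch({ cv@data <- "not numeric"; TRUE },
                  error = function(e) FALSE)
at_ok
## [1] FALSE

# attr<- performs no such check; the invalid value is stored silently ...
attr(cv, "data") <- "not numeric"
# ... and only an explicit validObject() call reports it.
valid <- tryCatch(validObject(cv), error = function(e) FALSE)
valid
## [1] FALSE
```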

S4 Methods

We can control how an object of class color_vector gets displayed by defining a show method (the S4 equivalent of print).

## This is an S4 generic
show(x)
## An object of class "color_vector"
## Slot "name":
## [1] "Green Values"
## 
## Slot "data":
##  [1]  0.98480787 -0.19682076  0.09777846  1.73153674  0.40582701
##  [6] -0.11913151 -0.78672536  1.26722958  1.10175352  2.91787208
## 
## Slot "color":
## [1] "darkgreen"
## Change how color_vector objects are shown.
setMethod('show', 'color_vector',
  function(object) {
    msg = sprintf('name: %s, color: %s\n\n', object@name, object@color) 
    cat(msg)
    cat('Data:')
    str(object@data)
    cat('\n')
  }
)
## [1] "show"

Now, when we call show on an object of class color_vector (or simply print it at the console), R will use the custom method.

show(x)
## name: Green Values, color: darkgreen
## 
## Data: num [1:10] 0.9848 -0.1968 0.0978 1.7315 0.4058 ...
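
Here show is an existing generic. For a new operation, the generic itself has to be created first with setGeneric, after which methods are added with setMethod. A minimal sketch; the generic name colorOf is hypothetical and chosen to avoid clashing with the S3 generic getColor defined earlier.

```r
library(methods)

setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)

# Hypothetical S4 generic: declared explicitly before any methods exist.
setGeneric("colorOf", function(obj) standardGeneric("colorOf"))

# Method for color_vector objects: report the color slot.
setMethod("colorOf", "color_vector", function(obj) obj@color)

cv <- new("color_vector", name = "cv", color = "darkgreen")
colorOf(cv)
## [1] "darkgreen"

isGeneric("colorOf")
## [1] TRUE
existsMethod("colorOf", "color_vector")
## [1] TRUE
```

Unlike the S3 getColor, this generic has no default method, so calling colorOf on an object of any other class is an error until a method for that class is defined.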

Resources