When we attach values to names in an R environment we generally refer to the name and value collectively as an ‘object’. More formally, we can distinguish between base objects and object-oriented objects where the latter are those with a non-null class attribute. Compare the following:
## NULL
## [1] "factor"
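The calls that produced the two results above are not shown. One pair of calls consistent with that output, offered only as a sketch, inspects the class attribute of a plain integer vector and of a factor:

```r
# A bare integer vector is a base object: it has no class attribute.
attr(1:10, "class")

# A factor is an OO object: its class attribute is "factor".
attr(factor(1:10), "class")

# is.object() reports the same distinction.
is.object(1:10)
is.object(factor(1:10))
```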
Object-oriented programming is a programming paradigm built around the notions of classes, methods, and, of course, objects. There is a wide variety of object-oriented languages, and R has (at least) three object-oriented (OO) systems you should be aware of: S3, S4, and reference classes (RC).
We will focus on the S3 and S4 systems, which predominate in R. You can read about RC and the related R6 system in Advanced R.
Before digging into R’s OO systems it will be helpful to define a few terms.
An object’s class defines its structure and behavior using attributes and its relationship with other classes.
Methods are functions that have definitions which depend on the class of an object.
Classes are often organized into hierarchies with a “child” class defined more strictly than any “parents”. Child classes often inherit from any parents. This means a parent’s structure or methods serve as a default when not explicitly defined for the child. More formally, these are known as superclasses and subclasses.
“Object Oriented Programming”: read the section introduction, and chapters 12, 13, and 15 in Advanced R.
Programmer’s Niche: A simple class, in S3 and S4 by Thomas Lumley on page 33 at this link.
The S3 system in R is based on the idea of generic functions.
The core idea is that a generic function dispatches to a class-specific method, chosen according to the class of the object passed to it.
Some common S3 generic functions in R include: print, summary, plot, mean, head, tail, and str. If we look at the definitions for these functions, we see they are all quite simply defined in terms of a call to UseMethod().
## function (x, ...)
## UseMethod("print")
## <bytecode: 0x7fef28107be8>
## <environment: namespace:base>
## function (object, ...)
## UseMethod("summary")
## <bytecode: 0x7fef265e78a8>
## <environment: namespace:base>
## function (x, ...)
## UseMethod("head")
## <bytecode: 0x7fef265a0538>
## <environment: namespace:utils>
## function (generic, object) .Primitive("UseMethod")
When UseMethod() is called, R searches for an S3 method based on the name of the generic function and the class of its first argument. The specific function it looks for follows the naming pattern generic.class; this is why it is advisable not to use dots when naming functions, classes, or other objects unless explicitly defining an S3 method.
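As a small illustration of the generic.class naming pattern, the sketch below defines a print method for a made-up class called toy (the class name and the message are assumptions for illustration only). Because the function is named print.toy, the existing print generic finds it through UseMethod().

```r
# A hypothetical class "toy": an empty list with a class attribute.
x = structure(list(), class = "toy")

# Naming the function print.toy makes it the print method for class "toy".
print.toy = function(x, ...) {
  cat("<toy object>\n")
  invisible(x)
}

print(x)  # dispatches to print.toy()
```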
As an example, let’s construct a matrix object mat and examine a call to head(mat):
## [1] "matrix" "array"
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 10 19 28 37
## [2,] 2 11 20 29 38
## [3,] 3 12 21 30 39
## [4,] 4 13 22 31 40
## [5,] 5 14 23 32 41
## [6,] 6 15 24 33 42
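The code that built mat is not shown above. One construction consistent with the printed output, assuming the matrix holds the integers 1 to 45 in nine rows, is sketched here.

```r
# Assumed construction: 9 rows x 5 columns, filled column-wise.
mat = matrix(1:45, nrow = 9, ncol = 5)

class(mat)  # implicit class of a matrix
head(mat)   # first six rows
```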
The object mat has classes matrix and array. Therefore, UseMethod("head") searches for a function (or method) called head.matrix() to apply to mat:
## function (x, n = 6L, ...)
## {
## checkHT(n, d <- dim(x))
## args <- rep(alist(x, , drop = FALSE), c(1L, length(d), 1L))
## ii <- which(!is.na(n[seq_along(d)]))
## args[1L + ii] <- lapply(ii, function(i) seq_len(if ((ni <- n[i]) <
## 0L) max(d[i] + ni, 0L) else min(ni, d[i])))
## do.call("[", args)
## }
## <bytecode: 0x7fef2475ed50>
## <environment: namespace:utils>
We can describe this sequence using a call tree:
\[ \texttt{head(mat)} \to \texttt{UseMethod("head")} \to \texttt{.Primitive("UseMethod")} \to \texttt{head.matrix(mat)}. \]
S3 methods always work by this pattern: generic \(\to\) dispatch \(\to\) method.
You can see all the methods associated with a generic function using methods().
## [1] head.array* head.data.frame* head.default* head.ftable*
## [5] head.function* head.matrix
## see '?methods' for accessing help and source code
The * following some methods denotes methods that are not exported from the namespace of the package in which they are defined. For instance, the head.data.frame method is defined in the (base) package utils, but is not exported.
## function (x, n = 6L, ...)
## {
## checkHT(n, d <- dim(x))
## args <- rep(alist(x, , drop = FALSE), c(1L, length(d), 1L))
## ii <- which(!is.na(n[seq_along(d)]))
## args[1L + ii] <- lapply(ii, function(i) seq_len(if ((ni <- n[i]) <
## 0L) max(d[i] + ni, 0L) else min(ni, d[i])))
## do.call("[", args)
## }
## <bytecode: 0x7fef266283a8>
## <environment: namespace:utils>
When an object has more than one class, R searches successively through the class attribute until a suitable method is found. The sloop package and its function s3_dispatch() are helpful for understanding how this works.
Here is an example:
## [1] "green" "matrix" "array"
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 10 19 28 37
## [2,] 2 11 20 29 38
## [3,] 3 12 21 30 39
## [4,] 4 13 22 31 40
## [5,] 5 14 23 32 41
## [6,] 6 15 24 33 42
## head.green
## => head.matrix
## * head.array
## * head.default
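The output above can be approximated with the sketch below. It assumes mat is the 9 x 5 matrix of 1 to 45 (an assumption, since the original construction is not shown) and that no head.green method has been defined yet. The sloop call is guarded because the package may not be installed.

```r
mat = matrix(1:45, nrow = 9, ncol = 5)

# Prepend a new class; the implicit classes are written out explicitly.
class(mat) = c("green", "matrix", "array")
class(mat)

# No head.green() exists here, so dispatch falls through to head.matrix().
head(mat)

# Optional: list the methods considered during dispatch.
if (requireNamespace("sloop", quietly = TRUE)) {
  print(sloop::s3_dispatch(head(mat)))
}
```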
If a suitable method is not found, S3 generics revert to a default method when defined and throw an error if not. We can call some methods explicitly:
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 10 19 28 37
## [2,] 2 11 20 29 38
## [3,] 3 12 21 30 39
## [4,] 4 13 22 31 40
## [5,] 5 14 23 32 41
## [6,] 6 15 24 33 42
but others such as head.default are not exported, as discussed above.
You can view the source code for unexported S3 methods using getS3method('generic', 'class').
## function (x, n = 6L, ...)
## {
## checkHT(n, dx <- dim(x))
## if (!is.null(dx))
## head.array(x, n, ...)
## else if (length(n) == 1L) {
## n <- if (n < 0L)
## max(length(x) + n, 0L)
## else min(n, length(x))
## x[seq_len(n)]
## }
## else stop(gettextf("no method found for %s(., n=%s) and class %s",
## "head", deparse(n), sQuote(class(x))), domain = NA)
## }
## <bytecode: 0x7fef253a74c0>
## <environment: namespace:utils>
Once you know the package namespace within which a method is defined, you can call it explicitly using, e.g., utils:::head.default().
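Both routes to an unexported method can be tried side by side. The sketch below uses only head.default from utils, as in the text; the input vector letters is an arbitrary choice.

```r
# Retrieve a method by generic name and class name.
f = getS3method("head", "default")
is.function(f)

# Call the same method directly through the namespace, bypassing dispatch.
utils:::head.default(letters, n = 3)
```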
Defining a new S3 method is as simple as defining a function and naming it accordingly. Here we define a method head.green().
head.green = function(obj) {
  # Green escape sequences, from e.g. crayon::green("green").
  g1 = '\033[32m'
  g2 = '\033[39m'
  # Make sure obj is an object
  if ( !is.object(obj) ) warning("Object 'obj' is not an object!")
  # Check if it's green
  if ( 'green' %in% class(obj) ) {
    if ( length(class(obj)) > 1 ) {
      next_class = class(obj)[-grep('green', class(obj))][1]
      cat( sprintf('This is a %sgreen %s%s.\n', g1, next_class, g2) )
      # This calls the next available method, allowing us to offload work in
      # a method for the subclass to an existing method for the superclass.
      NextMethod("head")
    } else {
      cat(sprintf('This a %sgeneric green object%s.\n', g1, g2))
    }
  } else {
    cat(sprintf('The object is not %sgreen%s!\n', g1, g2))
  }
}
Now we can test it under various conditions.
## [1] "green" "matrix" "array"
## This is a [32mgreen matrix[39m.
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1 10 19 28 37
## [2,] 2 11 20 29 38
## [3,] 3 12 21 30 39
## [4,] 4 13 22 31 40
## [5,] 5 14 23 32 41
## [6,] 6 15 24 33 42
## This a [32mgeneric green object[39m.
## The object is not [32mgreen[39m!
## [1] 1 2 3 4 5 6
In our definition of head.green, notice the use of NextMethod() to dispatch a method previously defined on one of the parent classes.
We can similarly define our own S3 generic functions via UseMethod().
Note that the first argument to both UseMethod() and NextMethod() should be a character string giving the name of the generic.
# Generic color finder
getColor = function(obj) {
  UseMethod("getColor")
}
# Default method
getColor.default = function(obj) {
  # Are any classes colors?
  ind = class(obj) %in% colors()
  if ( any(ind) ) {
    # Yes. Return the color with highest class precedence.
    class(obj)[which(ind)[1]]
  } else {
    # No. Return a random color.
    sample(colors(), 1)
  }
}
# Specific method aliasing green to "darkgreen"
getColor.green = function(obj) {
  "darkgreen"
}
As a quick, somewhat contrived, example of how we might use this, we could define a col_boxplot function to pick colors according to the class of the object passed.
# A box plot function that uses the class attribute to define colors.
col_boxplot = function(dat, ...) {
  if ( is.atomic(dat) ) {
    boxplot(dat, col = getColor(dat), ...)
  } else {
    col = sapply(dat, getColor)
    boxplot(dat, col = col, ...)
  }
}
You should be aware that the class of the object returned by some generic functions (especially primitives) can depend on the input class.
## [1] "green"
## [1] "red"
## [1] "numeric"
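The calls behind the output above are not shown. As a hedged illustration of the same point, the sketch below puts a made-up class green on an integer vector: arithmetic keeps the class attribute, while default subsetting and mean() drop it.

```r
# Hypothetical classed vector, used only for illustration.
x = structure(1:5, class = "green")

class(x + 1L)   # arithmetic copies attributes, so the class survives
class(x[1:2])   # default subsetting drops the class attribute
class(mean(x))  # mean.default() returns a plain number
```

A practical consequence is that a class which needs to survive subsetting usually has to supply its own `[` method.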
There are four common “styles” of S3 object:
vector style, e.g. the factor and Date classes, which are built on atomic vectors and use attributes to add additional structure,
scalar style, e.g. the lm and glm classes,
record style, e.g. the POSIXlt class, which has a fixed set of elements all of equal lengths that represent aspects of each datum,
data frame style, e.g. the data.frame class.
The majority of S3 objects you will encounter are in the scalar style; they are simply lists with a class attribute governing method dispatch.
As an example, consider the class lm returned by the lm() function for linear regression modeling. We can find the definition of an lm object in the R documentation.
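To make the styles concrete, the sketch below inspects one built-in example of three of them. The labels in the comments (vector, record, scalar) follow the terminology used in Advanced R, and the cars data set is just a convenient built-in choice.

```r
# Vector style: an integer vector plus attributes.
f = factor(c("a", "b", "a"))
typeof(f)
attributes(f)

# Record style: a list of components, one per aspect of the date-time.
lt = as.POSIXlt("2020-01-01 12:00:00", tz = "UTC")
names(unclass(lt))

# Scalar style: a list describing a single fitted model.
fit = lm(dist ~ speed, data = cars)
class(fit)
names(fit)
```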
The S3 system described above is very flexible, making it easy to work with, but at the expense of the safety and uniformity of a more formal OO system.
The S4 system is a more formal OO system in R. One key difference is that S4 classes have formal definitions, and classes, methods, and generics must all be explicitly defined as such. The functionality of the S4 object system comes from the (base) “methods” package.
S4 classes are defined using the setClass function:
Create a new instance of an S4 class using new:
## An object of class "color_vector"
## Slot "name":
## [1] "x"
##
## Slot "data":
## numeric(0)
##
## Slot "color":
## [1] "darkgreen"
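The setClass and new calls that produced the output above are not shown. A minimal sketch consistent with it, assuming the same three slots as the constructor defined later in this section, might look like this:

```r
library(methods)

# Assumed class definition: three typed slots.
setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)

# Slots that are not supplied start as an empty vector of their type.
x = new("color_vector", name = "x", color = "darkgreen")
x
```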
The function new is used above as a constructor for creating an object with the desired class. Most S4 classes defined in packages you download have their own constructors, which you should use when defined. We can create a default constructor by assigning the output of setClass a name:
color_vector =
  setClass("color_vector",
    slots = c(
      name = 'character',
      data = 'numeric',
      color = 'character'
    )
  )
y = color_vector(name = "y", data = rnorm(100, 0, 2), color = "red")
class(y)
## [1] "color_vector"
## attr(,"package")
## [1] ".GlobalEnv"
You could also create an explicit constructor by writing a function that calls new and manipulates the object in some way, for example by providing defaults for slots.
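A sketch of such an explicit constructor follows. The function name new_color_vector and its default values are hypothetical, chosen only for illustration; the class definition repeats the slots used in this section so that the block runs on its own.

```r
library(methods)

setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)

# Hypothetical constructor: supplies defaults, coerces the data,
# then defers to new().
new_color_vector = function(data, name = "unnamed", color = "black") {
  new("color_vector", name = name, data = as.numeric(data), color = color)
}

z = new_color_vector(1:3)
z@color
```

Wrapping new() this way keeps input coercion and defaults in one place, so callers do not need to know the slot types.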
You can access and set the slots of an S4 object using the @ operator, the slot function, or an attr(obj, 'name') construction:
## [1] "darkgreen"
## [1] "darkgreen"
## [1] "name" "data" "color" "class"
Only the slot(obj, 'name') and attr(obj, 'name') forms work for S3 objects; the @ operator raises an error.
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
## [1] "a" "b" "c" "d" "e" "f" "g" "h" "i" "j"
## <simpleError in doTryCatch(return(expr), name, parentenv, handler): trying to get slot "levels" from an object (class "factor") that is not an S4 object >
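The three access forms can be compared on an S3 object with a sketch like the one below. The factor f is an assumed stand-in for the object used above, chosen to be consistent with the printed levels.

```r
library(methods)

f = factor(letters[1:10])  # an S3 object

attr(f, "levels")  # works: levels is an ordinary attribute
slot(f, "levels")  # also works on this S3 object

# The @ operator refuses objects that are not S4 instances,
# so the error is captured rather than allowed to stop the script.
res = tryCatch(f@levels, error = function(e) e)
inherits(res, "error")
```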
In addition, authors of S4 classes often provide accessor functions to get at the most common slots. Here is an example:
color = function(obj) {
  # Accessor function for the "color" slot in the color_vector class
  # Inputs: obj - an object of class color_vector
  # Returns: the value of the color slot
  # If the object is not of class color_vector, a random color is returned
  # with a warning.
  y = as.character(match.call())
  if ( !("color_vector" %in% class(obj)) ) {
    msg = sprintf('Object %s is not of class color_vector.\n', y[2] )
    warning(msg)
    return( sample(colors(), 1) )
  }
  slot(obj, 'color')
}
color(x)
## [1] "darkgreen"
## [1] "red"
## Warning in color(LETTERS): Object LETTERS is not of class color_vector.
## [1] "deepskyblue3"
A validator is a function that ensures an object is a valid member of a given class. Here is an example validator for our color_vector class.
setValidity("color_vector", function(object) {
  if ( !(object@color %in% colors()) ) {
    sprintf('@color = %s is not a valid color. See colors().', object@color)
  } else {
    TRUE
  }
})
## Class "color_vector" [in ".GlobalEnv"]
##
## Slots:
##
## Name: name data color
## Class: character numeric character
## <simpleError in validObject(.Object): invalid class "color_vector" object: @color = A is not a valid color. See colors().>
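The call that triggered the error above is not shown. The sketch below reproduces the idea with a slightly stricter variant of the validator (it also checks that color has length one, which is an addition here, and its message text is made up), and captures the error instead of stopping.

```r
library(methods)

setClass("color_vector",
  slots = c(name = "character", data = "numeric", color = "character")
)

# Validator: return TRUE when valid, otherwise a character message.
setValidity("color_vector", function(object) {
  if (length(object@color) == 1 && object@color %in% colors()) {
    TRUE
  } else {
    "@color must be a single name listed in colors()."
  }
})

# new() runs the validator when slots are supplied.
ok = new("color_vector", name = "a", data = 1, color = "red")
bad = tryCatch(
  new("color_vector", name = "A", data = 1, color = "A"),
  error = function(e) e
)
conditionMessage(bad)
```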
We can control how an object of class color_vector gets displayed by defining a show method (the S4 equivalent of print).
## An object of class "color_vector"
## Slot "name":
## [1] "Green Values"
##
## Slot "data":
## [1] 2.0793072 1.1020231 0.0194862 0.4737538 1.2446874 1.1115460
## [7] 2.4734511 0.3636716 -1.2003157 -0.3708521
##
## Slot "color":
## [1] "darkgreen"
## Change how color_vector objects are shown.
setMethod('show', 'color_vector',
  function(object) {
    msg = sprintf('name: %s, color: %s\n\n', object@name, object@color)
    cat(msg)
    cat('Data:')
    str(object@data)
    cat('\n')
  }
)
Now, when we call show on an object of class color_vector, R will use the custom method.
## name: Green Values, color: darkgreen
##
## Data: num [1:10] 2.0793 1.102 0.0195 0.4738 1.2447 ...
## name: Green Values, color: darkgreen
##
## Data: num [1:10] 2.0793 1.102 0.0195 0.4738 1.2447 ...
We could similarly define a method that allows the user to change the value in the color slot. This is a so-called “setter” function.
## Define a new accessor for setting the color
setGeneric("color<-", function(object, value) standardGeneric('color<-'))
## [1] "color<-"
setMethod('color<-', 'color_vector',
  function(object, value) {
    object@color = value
    validObject(object)
    object
  }
)
color(x) = 'purple'
color(x)
## [1] "purple"
## name: Green Values, color: purple
##
## Data: num [1:10] 2.0793 1.102 0.0195 0.4738 1.2447 ...
“Object Oriented Programming” (chapter 9) in Norman Matloff’s The Art of R Programming.
The R Language from Professor Shedden’s 2016 course notes.
The R6 package provides C++ style classes in R.