Object-Oriented in R
Susana Eyheramendy
Introduction
Object-oriented programming (OOP) has become a widely used and valuable tool for software engineering. It is often easier to design, write and maintain software when there is a clear separation of the data representation from the operations that are to be performed on it.
In an OOP system, real physical things are generally represented by classes, and methods (functions) are written to handle the different manipulations that need to be performed on the objects. Many people's view of OOP is based on a class-centric system, in which classes define objects and there are repositories for the methods that can act on those objects. R separates the class specification from the specification of generic functions; it is a function-centric system.
R supports two internal OOP systems: S3 and S4.
Objects: encapsulate state information and control behavior.
Classes: describe general properties for groups of objects.
Inheritance: new classes can be defined in terms of existing classes.
Polymorphism: a (generic) function has different behaviors, although similar outputs, depending on the class of one or more of its arguments.
In S3, there is no formal specification for classes and hence there is weak control of objects and inheritance. The emphasis of the S3 system was on generic functions and polymorphism.
In S4, formal class definitions were included in the language and, based on these, more controlled software tools and paradigms for the creation of objects and the handling of inheritance were introduced.
If a class A extends the class B, then we say that A is a subclass of B and that B is a superclass of A. No class can be its own subclass. A class is a subclass of each of its superclasses.
A method is a type of function that is invoked depending on the class of one or more of its arguments, and this process is called dispatch. While in some systems, such as S3, methods can be invoked directly, it is more common for them to be invoked via a generic function. When a generic function is invoked, the set of methods that might apply must be sorted into a linear order, with the most specific method first and the least specific method last. This is often called method linearization, and computing it depends on being able to linearize the class hierarchy.
The actual classes of supplied arguments that match the signature of the generic function are determined. Based on these, the available methods are ordered from most specific to least. Then, after evaluating any code supplied in the generic, control is transferred to the most specific method.
In S4, a generic function has a fixed set of named formal arguments and these form the basis of the signature. Any call to the generic will be dispatched with respect to its signature.
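The dispatch rules just described can be sketched in S4 as follows. This is only a minimal illustration: the class names Base and Derived and the generic describe are hypothetical and do not come from the text.

```r
library(methods)

# Hypothetical classes for illustration: Derived extends Base
setClass("Base", representation(id = "numeric"))
setClass("Derived", contains = "Base",
         representation = representation(tag = "character"))

# The generic fixes the formal arguments; its signature is (object)
setGeneric("describe", function(object) standardGeneric("describe"))

setMethod("describe", "Base", function(object) {
  paste("Base with id", object@id)
})

setMethod("describe", "Derived", function(object) {
  # callNextMethod() hands control to the next most specific method
  paste("Derived with tag", object@tag, "/", callNextMethod())
})

b <- new("Base", id = 1)
d <- new("Derived", id = 2, tag = "x")
res_base <- describe(b)      # uses the Base method
res_derived <- describe(d)   # uses the Derived method, then the Base method
```

The most specific method is chosen from the class of the argument, and the less specific method remains reachable through callNextMethod().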
Suppose FreqFlyer is a class that extends Passenger, so that the FreqFlyer class has every slot that an instance of the Passenger class has. We then say that FreqFlyer is a subclass of Passenger and that Passenger is a superclass of FreqFlyer. The relationship between a subclass and its superclasses should be an "is a" relationship: every frequent flyer is a passenger, but not all passengers are frequent flyers. Sometimes the notion of subclass and superclass can be confusing. One reason that the more specialized class is called a subclass is that the set of objects that can be used interchangeably with the FreqFlyer class is a subset of those that can be used interchangeably with the Passenger class. In the example below, we provide a very basic S4 implementation of the Passenger and FreqFlyer classes.
> setClass("Passenger", representation(name = "character",
+     origin = "character", destination = "character"))
[1] "Passenger"
> setClass("FreqFlyer", representation(ffnumber = "numeric"),
+     contains = "Passenger")
[1] "FreqFlyer"
> getClass("FreqFlyer")
Slots:
Name:     ffnumber        name      origin destination
Class:     numeric   character   character   character
Extends: "Passenger"
Exercise
Define a class for passenger names that has slots for the first name, middle initial and last name. Change the definition of the Passenger class to reflect your new class. Does this change the inheritance properties of the Passenger class or the FreqFlyer class?
The process of determining the appropriate method to invoke is called dispatch. A call to a function, such as plot, will invoke a method that is determined by the class of the first argument in the call to plot.
For example, consider a print method for passengers that prints their names and flight details. A print method for frequent flyers can simply invoke the passenger method, and then add a line indicating the frequent flyer number. Using this approach, very little additional code is needed; and if the printing of passenger information is changed, the update is automatically applied to printing of frequent flyer information.
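A minimal sketch of this idea, using the S3 system with list-based objects. The two print methods and their output format are assumptions made for illustration, not code from the text.

```r
# Hypothetical S3 print methods for the two classes
print.Passenger <- function(x, ...) {
  cat("Name:", x$name, "\n")
  cat("Route:", x$origin, "to", x$destination, "\n")
  invisible(x)
}

print.FreqFlyer <- function(x, ...) {
  NextMethod()   # reuse the Passenger method for the shared fields
  cat("Frequent flyer number:", x$ffnumber, "\n")
  invisible(x)
}

y <- structure(
  list(name = "Josephine Physicist", origin = "SEA",
       destination = "YVR", ffnumber = 10),
  class = c("FreqFlyer", "Passenger")
)
out <- capture.output(print(y))
```

If print.Passenger is later changed, the frequent flyer output changes with it, because print.FreqFlyer only adds its own line.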
Many programmers believe that object-oriented programming (OOP) makes for clearer, more reusable code. Though very different from the familiar OOP languages like C++, Java, and Python, R is very much OOP in outlook.
The following themes are key to R:
- Everything you touch in R, ranging from numbers to character strings to matrices, is an object.
- R promotes encapsulation, which is packaging separate but related data items into one class instance. Encapsulation helps you keep track of related variables, enhancing clarity.
- R classes are polymorphic, which means that the same function call leads to different operations for objects of different classes. For instance, a call to print() on an object of a certain class triggers a call to a print function tailored to that class. Polymorphism promotes reusability.
- R allows inheritance, which allows extending a given class to a more specialized class.
The S3 system
- S3 is the original R structure for classes.
- S3 is still the dominant class paradigm in R use today.
- Most of R's built-in classes are of the S3 type.
- An S3 class consists of a list, with a class name attribute and dispatch capability added, which enables the use of generic functions.
- S4 classes were developed later with the goal of adding safety (you cannot accidentally access a class component that is not already in existence).
- Generic functions and methods are widely used, but there is little use of inheritance and classes are quite loosely defined.
- Some classes are internal or implicit and others are specified explicitly, typically by using the class attribute.
- One determines the class of an object using the function class().
One can determine the class of an object using the function class, and for most purposes this is sufficient; however, there are some important exceptions that arise with respect to internal functions. While there is no formal mechanism for organizing or representing instances of a class, they are typically lists, where the different slots are represented as named elements in the list. Using setOldClass will register an S3 class as an S4 class.
The class attribute is a vector of character values, each of which specifies a particular class. The most specific class comes first, followed by any less specific classes. For our frequent flyer example from Section 3.2.1, the class vector should always have FreqFlyer first and Passenger second.
The recommended way of testing whether an S3 object is an instance of a particular class is to use the inherits function. Direct inspection of the class attribute is not recommended since implicit classes, such as matrix and array, are not listed in the class attribute. Notice in the code below that the class of x changes when a dimension attribute is added, that there is no class attribute, and that once x is a matrix it is no longer considered to be an integer.
> x = 1:10
> class(x)
[1] "integer"
> dim(x) = c(2, 5)
> class(x)
[1] "matrix"
> attr(x, "class")
NULL
> inherits(x, "integer")
[1] FALSE
In the next example we return to our FreqFlyer example and provide an S3 implementation.
> x = list(name = "Josephine Biologist", origin = "SEA",
+     destination = "YXY")
> class(x) = "Passenger"
> y = list(name = "Josephine Physicist", origin = "SEA",
+     destination = "YVR", ffnumber = 10)
> class(y) = c("FreqFlyer", "Passenger")
> inherits(x, "Passenger")
[1] TRUE
> inherits(x, "FreqFlyer")
[1] FALSE
> inherits(y, "Passenger")
[1] TRUE
A major problem with this approach is that there is no mechanism programmers can use to ensure that all instances of the Passenger or FreqFlyer classes have the correct slots, the correct types of values in those slots, and the correct class attribute. One can easily produce an object with these classes that has none of the slots we have defined. As a result, one typically has to do a great deal of checking of arguments in every S3 method. The function is.object tests whether or not an R object has a class attribute. This is somewhat important, as the help page for class indicates that some dispatch is restricted to objects for which is.object is true.
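One common way to reduce this risk is to create instances only through a constructor function that checks its inputs. The sketch below assumes a hypothetical constructor named new_passenger; it is a convention, not a language feature, and it also shows that nothing prevents a caller from bypassing it.

```r
# Hypothetical constructor that validates its inputs before setting the class
new_passenger <- function(name, origin, destination) {
  stopifnot(
    is.character(name), length(name) == 1,
    is.character(origin), length(origin) == 1,
    is.character(destination), length(destination) == 1
  )
  structure(
    list(name = name, origin = origin, destination = destination),
    class = "Passenger"
  )
}

p <- new_passenger("Josephine Biologist", "SEA", "YXY")

# S3 does not stop anyone from bypassing the constructor
bad <- structure(list(), class = "Passenger")
bad_is_passenger <- inherits(bad, "Passenger")   # TRUE, despite having no slots

# Invalid input is rejected by the constructor itself
err <- tryCatch(new_passenger(1, "SEA", "YXY"), error = function(e) e)

# is.object() is TRUE only when a class attribute has been set
has_attr_p <- is.object(p)       # TRUE
has_attr_v <- is.object(1:10)    # FALSE: "integer" is an implicit class
```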
3.3.1 Implicit classes
> v <- 1:10
> v
 [1]  1  2  3  4  5  6  7  8  9 10
> attributes(v)
NULL
> class(v)
[1] "integer"
> class(v) <- "character"
> attributes(v)
NULL
When a generic function is called, R will dispatch the call to the proper class method, meaning that it will reroute the call to a function defined for the object's class.
9.1.2 Example: OOP in the lm() Linear Model Function
As an example, let's look at a simple regression analysis run via R's lm() function. First, let's see what lm() does:
> ?lm
The output of this help query will tell you, among other things, that this function returns an object of class "lm".
Let's try creating an instance of this object and then printing it:
> x <- c(1,2,3)
> y <- c(1,3,8)
> lmout <- lm(y ~ x)
> class(lmout)
[1] "lm"
> lmout

Call:
lm(formula = y ~ x)

Coefficients:
(Intercept)            x
       -3.0          3.5
A call to UseMethod initiates the dispatch on a single argument, usually the first argument to the generic function. An S3 generic typically has only two formal arguments, one often named x and the other the ... argument.
Methods may add new arguments that are appropriate to the computations they will perform. A disadvantage of this approach is that mistakes in naming arguments will be silently ignored. The mistyped name will not match any formal argument and hence is placed in the ... argument, where it is never used.
In R, UseMethod dispatches on the class as returned by class, not that returned by oldClass. Not all method dispatch honors implicit classes. In particular, group generics (Section 3.3.5) and internal generics do not. Group generics dispatch on the oldClass for efficiency reasons, and internal generics only dispatch on objects for which is.object is TRUE. An internal generic is a function that calls directly to C code (a primitive or internal function), and there checks to see if it should dispatch. To make use of these, you will need to explicitly set the class attribute. You can do that using class<-, oldClass<-, or by setting the attribute directly using attr<-.
For most generic functions, a default method will be needed. The default method is invoked if no applicable methods are found, or if the least specific method makes a call to NextMethod.
Methods are regular functions and are identified by their name, which is a concatenation of the name of the generic and the name of the class that they are intended to apply to, separated by a dot. A simple generic function named fun and a default method are shown below. The string default is used as if it were a class and indicates that the method is a default method for the generic.
> fun = function(x, ...) UseMethod("fun")
> fun.default = function(x, ...) print("In the default method")
> fun(2)
[1] "In the default method"
Next, consider a class system with two classes, Foo which extends Bar. Then we define two methods: fun.Foo and fun.Bar. We have them print out a message, call the function NextMethod and then print out a second message.
> fun.Foo = function(x) {
+     print("start of fun.Foo")
+     NextMethod()
+     print("end of fun.Foo")
+ }
> fun.Bar = function(x) {
+     print("start of fun.Bar")
+     NextMethod()
+     print("end of fun.Bar")
+ }
Now we can show how dispatch occurs by creating an instance that has both classes and calling fun with that instance as the first argument.
> x = 1
> class(x) = c("Foo", "Bar")
> fun(x)
[1] "start of fun.Foo"
[1] "start of fun.Bar"
[1] "In the default method"
[1] "end of fun.Bar"
[1] "end of fun.Foo"
Notice that the call to NextMethod transfers control to the next most specific method. This is one of the benefits of using an OOP paradigm. Typically, less code needs to be written, and it is easier to maintain.
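Returning to the earlier remark that a mistyped argument name is silently absorbed by the ... argument, the following sketch shows the effect. The generic center and its digits argument are hypothetical names used only for illustration.

```r
# Hypothetical generic and default method, for illustration only
center <- function(x, ...) UseMethod("center")

center.default <- function(x, digits = 2, ...) {
  round(mean(x), digits)
}

vals <- c(1, 2, 4)
good <- center(vals, digits = 1)   # digits is matched: rounds to 1 decimal
typo <- center(vals, digts = 1)    # "digts" lands in ... and is ignored
```

No error or warning is raised for the second call; the method simply runs with its default of two digits.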
Here, we printed out the object lmout. (Remember that by simply typing the name of an object in interactive mode, the object is printed.) The R interpreter then saw that lmout was an object of class "lm" and thus called print.lm(), a special print method for the "lm" class. In R terminology, the call to the generic function print() was dispatched to the method print.lm() associated with the class "lm". Let's take a look at the generic function and the class method in this case:
> print
function(x, ...) UseMethod("print")
<environment: namespace:base>
> print.lm
function (x, digits = max(3, getOption("digits") - 3), ...)
{
    cat("\nCall:\n", deparse(x$call), "\n\n", sep = "")
    if (length(coef(x))) {
        cat("Coefficients:\n")
        print.default(format(coef(x), digits = digits), print.gap = 2,
            quote = FALSE)
    }
    else cat("No coefficients\n")
    cat("\n")
    invisible(x)
}
<environment: namespace:stats>
Don't worry about the details of print.lm(). The main point is that the printing depends on context, with a special print function called for the "lm" class. Now let's see what happens when we print this object with its class attribute removed:
> unclass(lmout)
$coefficients
(Intercept)           x
       -3.0         3.5

$residuals
   1    2    3
 0.5 -1.0  0.5

$effects
(Intercept)           x
  -6.928203   -4.949747    1.224745
...
I've shown only the first few lines here; there's a lot more. (Try running this yourself.) The author of lm() decided to make print.lm() much more concise, limiting it to printing a few key quantities.
Exercise
Instances can be very large, and we want to control the default information that is printed for the PHENODS3 and EXPRS3 classes. Write S3 print methods for these classes.
3.3.3.1
Due to the somewhat simple nature of the S3 system, there is very little introspection or reflection possible. The function methods reports on all available methods for a given generic function, but it does this simply by looking at the names. We demonstrate its use on the S3 generic function mean in the code below.
> methods("mean")
[1] mean.Date       mean.POSIXct    mean.POSIXlt
[4] mean.data.frame mean.default    mean.difftime
One can also use methods to find all available methods for a given class. In the code below we find all methods for the class glm.
> methods(class = "glm")
 [1] add1.glm*           anova.glm
 [3] confint.glm*        cooks.distance.glm*
 [5] deviance.glm*       drop1.glm*
 [7] effects.glm*        extractAIC.glm*
 [9] family.glm*         formula.glm*
[11] influence.glm*      logLik.glm*
[13] model.frame.glm     predict.glm
[15] print.glm           residuals.glm
[17] rstandard.glm       rstudent.glm
[19] summary.glm         vcov.glm*
[21] weights.glm*
# S3 generic functions and methods
print                       # the print generic
print.lm                    # print method for "lm" objects
mod.prestige
print(mod.prestige)         # equivalent
print.lm(mod.prestige)      # equivalent, but bad form
methods("print")            # print methods
methods(class="lm")         # methods for objects of class "lm"
 [1] add1.lm*           alias.lm*          anova.lm           case.names.lm*
 [5] confint.lm*        cooks.distance.lm* deviance.lm*       dfbeta.lm*
 [9] dfbetas.lm*        drop1.lm*          dummy.coef.lm*     effects.lm*
[13] extractAIC.lm*     family.lm*         formula.lm*        hatvalues.lm
[17] influence.lm*      kappa.lm           labels.lm*         logLik.lm*
[21] model.frame.lm     model.matrix.lm    plot.lm            predict.lm
[25] print.lm           proj.lm*           residuals.lm       rstandard.lm
[29] rstudent.lm        simulate.lm*       summary.lm         variable.names.lm*
[33] vcov.lm*

   Non-visible functions are asterisked
You can find the invisible functions via the function getAnywhere(). Such a function can then be called with a namespace qualifier:
> utils:::print.aspell(aspout)
mispelled
  wrds:1:15
You can see all the generic methods this way:
> methods(class="default")
...
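As a small illustration of locating a method that may not be visible on the search path, the sketch below uses getAnywhere() and getS3method() from the utils package. The choice of print.lm and the built-in cars data set is only for demonstration.

```r
# Locate a method regardless of whether it is exported
found <- getAnywhere("print.lm")
found$where   # the places where an object of that name was found

# getS3method() retrieves the method for a generic/class pair directly
f <- getS3method("print", "lm")

fit <- lm(dist ~ speed, data = cars)   # cars ships with R
out_direct  <- capture.output(f(fit))       # call the method directly
out_generic <- capture.output(print(fit))   # go through the generic
```

Calling the method directly gives the same output as calling the generic, but going through the generic is the better form.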
Writing S3 classes
S3 classes have a rather cobbled-together structure. A class instance is created by forming a list, with the components of the list being the member variables of the class. (Readers who know Perl may recognize this ad hoc nature in Perl's own OOP system.) The "class" attribute is set by hand by using the attr() or class() function, and then various implementations of generic functions are defined. We can see this in the case of lm() by inspecting the function:
> lm
...
z <- list(coefficients = if (is.matrix(y))
            matrix(,0,3) else numeric(0L), residuals = y,
            fitted.values = 0 * y, weights = w, rank = 0L,
            df.residual = if (is.matrix(y)) nrow(y) else length(y))
}
...
class(z) <- c(if(is.matrix(y)) "mlm", "lm")
...
Again, don't mind the details; the basic process is there. A list was created and assigned to z, which will serve as the framework for the "lm" class instance (and which will eventually be the value returned by the function). Some components of that list, such as residuals, were already assigned when the list was created. In addition, the class attribute was set to "lm" (and possibly to "mlm", as will be explained in the next section).
As an example of how to write an S3 class, let's switch to something simpler. Continuing our employee example from Section 4.1, we could write this:
> j <- list(name="Joe", salary=55000, union=TRUE)
> class(j) <- "employee"
> attributes(j) # let's check
$names
[1] "name"   "salary" "union"

$class
[1] "employee"
Before we write a print method for this class, let's see what happens when we call the default print():
> j
$name
[1] "Joe"

$salary
[1] 55000

$union
[1] TRUE

attr(,"class")
[1] "employee"
Essentially, j was treated as a list for printing purposes. Now let's write our own print method:
print.employee <- function(wrkr) {
   cat(wrkr$name,"\n")
   cat("salary",wrkr$salary,"\n")
   cat("union member",wrkr$union,"\n")
}
So, any call to print() on an object of class "employee" should now be referred to print.employee(). We can check that formally:
> methods(,"employee")
[1] print.employee
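A slightly more conventional variant of this print method is sketched below. Matching the generic's (x, ...) signature and returning the object invisibly are common S3 practice rather than something the text requires.

```r
# Variant of print.employee using the generic's signature (x, ...)
print.employee <- function(x, ...) {
  cat(x$name, "\n")
  cat("salary", x$salary, "\n")
  cat("union member", x$union, "\n")
  invisible(x)   # return the object invisibly, as print methods usually do
}

j <- list(name = "Joe", salary = 55000, union = TRUE)
class(j) <- "employee"
out <- capture.output(print(j))   # dispatches to print.employee
```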
Using inheritance
The idea of inheritance is to form new classes as specialized versions of old ones. In our previous employee example, for instance, we could form a new class devoted to hourly employees, "hrlyemployee", as a subclass of "employee", as follows:
k <- list(name="Kate", salary= 68000, union=F, hrsthismonth= 2)
class(k) <- c("hrlyemployee","employee")
Our new class has one extra variable: hrsthismonth. The name of the new class consists of two character strings, representing the new class and the old class. Our new class inherits the methods of the old one. For instance, print.employee() still works on the new class:
> k
Kate
salary 68000
union member FALSE
S3 also supports group generics. This means that operators such as == or < can have methods defined for them. The functions and operators have been grouped into categories (Math, Ops, Summary and Complex), and group methods can be written for each of these categories.
It is possible to write methods specific to any function within a group; a method defined for a single member of a group takes precedence over the group method.
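The following sketch shows a group method for the Ops group together with a method for a single member of that group. The class name temp and the tolerance used in the equality test are assumptions made for illustration.

```r
# Hypothetical class "temp" holding a temperature as a plain number
temp <- function(x) structure(x, class = "temp")

# One group method covers all the operators in the Ops group
Ops.temp <- function(e1, e2) {
  v <- get(.Generic)(unclass(e1), unclass(e2))
  # arithmetic keeps the class; comparisons return plain logicals
  if (.Generic %in% c("+", "-", "*", "/")) temp(v) else v
}

# A method for a single member of the group takes precedence
"==.temp" <- function(e1, e2) {
  abs(unclass(e1) - unclass(e2)) < 1e-8
}

a <- temp(20)
b <- temp(25)
s  <- a + b                          # handled by Ops.temp
lt <- a < b                          # handled by Ops.temp
eq <- temp(0.1 + 0.2) == temp(0.3)   # handled by ==.temp
```

This sketch handles only binary operators applied to two temp objects or a temp object and a number; unary minus, for example, would need extra care because e2 is then missing.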
# S3 "inheritance"
# (assumes the Mroz data frame, e.g. from the carData package, is loaded)
mod.mroz <- glm(lfp ~ ., family=binomial, data=Mroz)
class(mod.mroz)
# Example: a logistic-regression function
lreg3 <- function(X, y, predictors=colnames(X), max.iter=10,
        tol=1E-6, constant=TRUE) {
    if (!is.numeric(X) || !is.matrix(X))          # data checks
        stop("X must be a numeric matrix")
    if (!is.numeric(y) || !all(y == 0 | y == 1))
        stop("y must contain only 0s and 1s")
    if (nrow(X) != length(y))
        stop("X and y contain different numbers of observations")
    if (constant) {                               # attach constant?
        X <- cbind(1, X)
        colnames(X)[1] <- "Constant"
    }
    b <- b.last <- rep(0, ncol(X))
    it <- 1
    while (it <= max.iter){
        p <- as.vector(1/(1 + exp(-X %*% b)))
        var.b <- solve(crossprod(X, p * (1 - p) * X))
        b <- b + var.b %*% crossprod(X, y - p)
        if (max(abs(b - b.last)/(abs(b.last) + 0.01*tol)) < tol) break
        b.last <- b
        it <- it + 1
    }
    if (it > max.iter) warning("maximum iterations exceeded")
    dev <- -2*sum(y*log(p) + (1 - y)*log(1 - p))
    result <- list(coefficients=as.vector(b), var=var.b, deviance=dev,
        converged= it <= max.iter, predictors=predictors)
    class(result) <- "lreg3"                      # assign class
    result
}
Mroz$lfp <- with(Mroz, ifelse(lfp == "yes", 1, 0))
Mroz$wc <- with(Mroz, ifelse(wc == "yes", 1, 0))
Mroz$hc <- with(Mroz, ifelse(hc == "yes", 1, 0))
mod.mroz.3 <- with(Mroz, lreg3(cbind(k5, k618, age, wc, hc, lwg, inc), lfp))
class(mod.mroz.3)
mod.mroz.3   # whoops!

print.lreg3 <- function(x, ...) {   # print method for class "lreg3"
    coef <- x$coefficients
    names(coef) <- x$predictors
    print(coef)
    if (!x$converged) cat("\n *** lreg did not converge ***\n")
    invisible(x)   # note: passes through argument invisible
}
mod.mroz.3
summary   # summary generic

summary.lreg3 <- function(object, ...) {   # summary method for class "lreg3"
    b <- object$coefficients
    se <- sqrt(diag(object$var))
    z <- b/se
    table <- cbind(b, se, z, 2*(1-pnorm(abs(z))))
    colnames(table) <- c("Estimate", "Std.Err", "Z value", "Pr(>|z|)")
    rownames(table) <- object$predictors
    result <- list(coef=table, deviance=object$deviance,
        converged=object$converged)
    class(result) <- "summary.lreg3"   # creates an object of class "summary.lreg3"
    result
}

print.summary.lreg3 <- function(x, ...) {   # print method for class "summary.lreg3"
    printCoefmat(x$coef)
    cat("\nDeviance =", x$deviance,"\n")
    if (!x$converged) cat("\n Note: *** lreg did not converge ***\n")
}
summary(mod.mroz.3)
# writing a generic function
names(summary(mod.prestige))

rsq <- function(model, ...) {
    UseMethod("rsq")
}

rsq.lm <- function(model, adjusted=FALSE, ...) {
    summary <- summary(model)
    if (adjusted) summary$adj.r.squared else summary$r.squared
}

rsq(mod.prestige)
rsq(mod.prestige, adjusted=TRUE)
rsq(mod.mroz)   # via inheritance (doesn't work)
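Since mod.prestige and mod.mroz depend on data not shown here, the following self-contained sketch repeats the idea with the built-in mtcars data. The models fitted are arbitrary choices for illustration; the point is only how dispatch through the class vector c("glm", "lm") behaves.

```r
rsq <- function(model, ...) UseMethod("rsq")

rsq.lm <- function(model, adjusted = FALSE, ...) {
  s <- summary(model)
  if (adjusted) s$adj.r.squared else s$r.squared
}

fit_lm  <- lm(mpg ~ wt, data = mtcars)
fit_glm <- glm(am ~ wt, family = binomial, data = mtcars)

cls   <- class(fit_glm)   # "glm" "lm"
r_lm  <- rsq(fit_lm)      # an R-squared value
r_glm <- rsq(fit_glm)     # reaches rsq.lm via inheritance, but yields NULL
```

The call on the glm object is dispatched to rsq.lm because "lm" is the second element of its class vector, but summary() of a glm object has no r.squared component, so the inherited method does not produce a useful result.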
Typing k resulted in the call print(k), which caused UseMethod() to search for a print method on the first of k's two class names, "hrlyemployee". That search failed, so UseMethod() tried the other class name, "employee", and found print.employee(). It executed the latter. Recall that in inspecting the code for "lm", you saw this line:
class(z) <- c(if(is.matrix(y)) "mlm", "lm")
You can now see that "mlm" is a subclass of "lm" for vector-valued response variables.
9.1.6 Extended Example: A Class for Storing Upper-Triangular Matrices
Now it's time for a more involved example, in which we will write an R class "ut" for upper-triangular matrices. These are square matrices whose elements below the diagonal are zeros, such as shown in Equation 9.1.
\[
\begin{pmatrix} 1 & 5 & 12 \\ 0 & 6 & 9 \\ 0 & 0 & 2 \end{pmatrix}
\tag{9.1}
\]
Our motivation here is to save storage space (though at the expense of a little extra access time) by storing only the nonzero portion of the matrix. (The R class "dist" also uses such storage, though in a more focused context and without the class functions we have here.)
The component mat of this class will store the matrix. Only the diagonal and above-diagonal elements will be stored, in column-major order. Storage for the matrix (9.1), for instance, consists of the vector (1,5,6,12,9,2), and the component mat has that value. We will include a component ix in this class, to show where in mat the various columns begin. For the preceding case, ix is c(1,2,4), meaning that column 1 begins at mat[1], column 2 begins at mat[2], and column 3 begins at mat[4]. This allows for handy access to individual elements or columns of the matrix. The following is the code for our class.
# class "ut", compact storage of upper-triangular matrices

# utility function, returns 1+...+i
sum1toi <- function(i) return(i*(i+1)/2)

# create an object of class "ut" from the full matrix inmat (0s included)
ut <- function(inmat) {
   n <- nrow(inmat)
   rtrn <- list()  # start to build the object
   class(rtrn) <- "ut"
   rtrn$mat <- vector(length=sum1toi(n))
   rtrn$ix <- sum1toi(0:(n-1)) + 1  # where in mat each column begins
   for (i in 1:n) {
      # store column i
      ixi <- rtrn$ix[i]
      rtrn$mat[ixi:(ixi+i-1)] <- inmat[1:i,i]
   }
   return(rtrn)
}
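As a quick check of the storage layout described above, the sketch below builds a "ut" object from the matrix in Equation 9.1 and expands it again. The constructor repeats the listing so that the example is self-contained; the helper expand_ut is a hypothetical name introduced here.

```r
sum1toi <- function(i) i * (i + 1) / 2

# constructor, as in the listing above
ut <- function(inmat) {
  n <- nrow(inmat)
  rtrn <- list()
  class(rtrn) <- "ut"
  rtrn$mat <- vector(mode = "numeric", length = sum1toi(n))
  rtrn$ix <- sum1toi(0:(n - 1)) + 1
  for (i in 1:n) {
    ixi <- rtrn$ix[i]
    rtrn$mat[ixi:(ixi + i - 1)] <- inmat[1:i, i]
  }
  rtrn
}

# Hypothetical helper: rebuild the full matrix from the compact form
expand_ut <- function(utmat) {
  n <- length(utmat$ix)
  full <- matrix(0, nrow = n, ncol = n)
  for (j in 1:n) {
    start <- utmat$ix[j]
    full[1:j, j] <- utmat$mat[start:(start + j - 1)]
  }
  full
}

m <- rbind(c(1, 5, 12), c(0, 6, 9), c(0, 0, 2))
u <- ut(m)
u$mat   # 1 5 6 12 9 2
u$ix    # 1 2 4
```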
# multiply one ut matrix by another, returning another ut instance;
# implement as a binary operation
"%mut%" <- function(utmat1,utmat2) {
   n <- length(utmat1$ix)  # numbers of rows and cols of matrix
   utprod <- ut(matrix(0,nrow=n,ncol=n))
   for (i in 1:n) {  # compute col i of product
      # let a[j] and b[j] denote columns j of utmat1 and utmat2, respectively,
      # so that, e.g. b[2][1] means element 1 of column 2 of utmat2;
      # then column i of product is equal to
      #    b[i][1]*a[1] + ... + b[i][i]*a[i]
      # find index of start of column i in utmat2
      startbi <- utmat2$ix[i]
      # initialize vector that will become b[i][1]*a[1] + ... + b[i][i]*a[i]
      prodcoli <- rep(0,i)
      for (j in 1:i) {  # find b[i][j]*a[j], add to prodcoli
         startaj <- utmat1$ix[j]
         bielement <- utmat2$mat[startbi+j-1]
         prodcoli[1:j] <- prodcoli[1:j] +
            bielement * utmat1$mat[startaj:(startaj+j-1)]
      }
      # store column i of the product (the lower 0s are not stored)
      startprodcoli <- sum1toi(i-1)+1
      utprod$mat[startprodcoli:(startprodcoli+i-1)] <- prodcoli
   }
   return(utprod)
}
The key point is that column i of the product can be expressed as a linear combination of the columns of the first factor. It will help to see a specific example of this property, shown in Equation 9.2.
\[
\begin{pmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 0 & 0 & 5 \end{pmatrix}
\begin{pmatrix} 4 & 3 & 2 \\ 0 & 1 & 2 \\ 0 & 0 & 1 \end{pmatrix}
=
\begin{pmatrix} 4 & 5 & 9 \\ 0 & 1 & 4 \\ 0 & 0 & 5 \end{pmatrix}
\tag{9.2}
\]
The comments say that, for instance, column 3 of the product is equal to the following:
\[
2 \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}
+ 2 \begin{pmatrix} 2 \\ 1 \\ 0 \end{pmatrix}
+ 1 \begin{pmatrix} 3 \\ 2 \\ 5 \end{pmatrix}
\]
Inspection of Equation 9.2 confirms the relation.
Couching the multiplication problem in terms of columns of the two input matrices enables us to compact the code and to likely increase the speed. The latter again stems from vectorization, a benefit discussed in detail in Chapter 14. This approach is used in the loop that accumulates prodcoli. (Arguably, in this case, the increase in speed comes at the expense of readability of the code.)
Fitting polynomials of higher and higher degree eventually leads to overfitting, so that the prediction of new, future data actually deteriorates for degrees higher than some value. The class "polyreg" aims to deal with this issue. It fits polynomials of various degrees but assesses fits via cross-validation to reduce the risk of overfitting. In this form of cross-validation, known as the leaving-one-out method, for each point we fit the regression to all the data except this observation, and then we predict that observation from the fit. An object of this class consists of outputs from the various regression models, plus the original data. The following is the code for the "polyreg" class.
# "polyreg," S3 class for polynomial regression in one predictor variable

# polyfit(y,x,maxdeg) fits all polynomials up to degree maxdeg; y is
# vector for response variable, x for predictor; creates an object of
# class "polyreg"
polyfit <- function(y,x,maxdeg) {
   # form powers of predictor variable, ith power in ith column
   pwrs <- powers(x,maxdeg)  # could use orthog polys for greater accuracy
   lmout <- list()  # start to build class
   class(lmout) <- "polyreg"  # create a new class
   for (i in 1:maxdeg) {
      lmo <- lm(y ~ pwrs[,1:i])
      # extend the lm class here, with the cross-validated predictions
      lmo$fitted.cvvalues <- lvoneout(y,pwrs[,1:i,drop=FALSE])
      lmout[[i]] <- lmo
   }
   lmout$x <- x
   lmout$y <- y
   return(lmout)
}
# print() for an object fits of class "polyreg": prints the
# cross-validated mean squared prediction errors
print.polyreg <- function(fits) {
   maxdeg <- length(fits) - 2  # all components except x and y are fits
   n <- length(fits$y)
   tbl <- matrix(nrow=maxdeg,ncol=1)
   for (i in 1:maxdeg) {
      fi <- fits[[i]]
      errs <- fits$y - fi$fitted.cvvalues
      spe <- crossprod(errs,errs)  # sum of squared prediction errors
      tbl[i,1] <- spe/n
   }
   cat("mean squared prediction errors, by degree\n")
   print(tbl)
}

# forms matrix of powers of the vector x, through degree dg
powers <- function(x,dg) {
   pw <- matrix(x,nrow=length(x))
   prod <- x
   if (dg >= 2) for (i in 2:dg) {  # guard: 2:1 would count downward
      prod <- prod * x
      pw <- cbind(pw,prod)
   }
   return(pw)
}

# finds cross-validated predicted values; could be made much faster via
# matrix-update methods
lvoneout <- function(y,xmat) {
   n <- length(y)
   predy <- vector(length=n)
   for (i in 1:n) {
      # regress, leaving out ith observation
      lmo <- lm(y[-i] ~ xmat[-i,])
      betahat <- as.vector(lmo$coef)
      # the 1 accommodates the constant term
      predy[i] <- betahat %*% c(1,xmat[i,])
   }
   return(predy)
}

# polynomial function of x, coefficients cfs
poly <- function(x,cfs) {
   val <- cfs[1]
   prod <- 1
   dg <- length(cfs) - 1
   for (i in 1:dg) {
      prod <- prod * x
      val <- val + cfs[i+1] * prod
   }
   return(val)  # without this, the function would return NULL
}
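The leaving-one-out idea can also be tried in isolation. The following is a minimal sketch, not the "polyreg" code itself: it swaps in R's built-in poly() and predict() for the hand-built powers matrix, and the simulated data and the helper name loo_mspe are assumptions made for illustration.

```r
# Simulated data (hypothetical): a quadratic signal plus a little noise
set.seed(1)
n <- 40
x <- runif(n)
y <- 1 + 2 * x - 3 * x^2 + rnorm(n, sd = 0.1)

# Leave-one-out mean squared prediction error for a polynomial of a
# given degree (loo_mspe is a hypothetical helper, not part of "polyreg")
loo_mspe <- function(y, x, degree) {
    dat <- data.frame(x = x, y = y)
    errs <- vapply(seq_along(y), function(i) {
        # fit to all observations except the ith
        fit <- lm(y ~ poly(x, degree, raw = TRUE), data = dat[-i, ])
        # prediction error for the held-out observation
        y[i] - predict(fit, newdata = dat[i, , drop = FALSE])
    }, numeric(1))
    mean(errs^2)
}

mspe <- vapply(1:4, function(d) loo_mspe(y, x, d), numeric(1))
names(mspe) <- paste0("degree", 1:4)
mspe
```

Comparing the entries of mspe shows which degree predicts held-out points best; a degree that merely fits the training data more closely is not rewarded.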
S4 classes
Some programmers feel that S3 does not provide the safety normally associated with OOP. For example, consider our earlier employee database example, where our class "employee" had three fields: name, salary, and union. Here are some possible mishaps:
• We forget to enter the union status.
• We misspell union as onion.
• We create an object of some class other than "employee" but accidentally set its class attribute to "employee".
In each of these cases, R will not complain. The goal of S4 is to elicit a complaint and prevent such accidents.
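These mishaps are easy to reproduce. The sketch below builds S3 "employee" objects by hand as lists; the field values and the variable names ann and not_an_employee are made up for illustration.

```r
# An S3 "employee" is just a list with a class attribute (values are made up)
joe <- list(name = "Joe", salary = 55000, union = TRUE)
class(joe) <- "employee"

# Mishap 1: the union field is forgotten; no error
ann <- list(name = "Ann", salary = 60000)
class(ann) <- "employee"

# Mishap 2: "union" misspelled as "onion"; R silently adds a new component
joe$onion <- FALSE

# Mishap 3: an unrelated object is labelled "employee"; again no error
not_an_employee <- structure(list(x = 1:3), class = "employee")

names(joe)                             # includes both "union" and "onion"
is.null(ann$union)                     # the field is simply absent
inherits(not_an_employee, "employee")  # accepted as an "employee" anyway
```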
Overview of the differences between S3 and S4 classes
Table 9-1: Basic R Operators
Operation                   S3                             S4
Define class                Implicit in constructor code   setClass()
Create object               Build list, set class attr     new()
Reference member variable   $                              @
Implement generic f()       Define f.classname()           setMethod()
Declare generic             UseMethod()                    setGeneric()
S4 structures are considerably richer than S3 structures, but here we present just the basics. Table 9-1 shows an overview of the differences between the two classes.
9.2.1 Writing S4 Classes
You define an S4 class by calling setClass(). Continuing our employee example, we could write the following:
> setClass("employee",
+    representation(
+       name="character",
+       salary="numeric",
+       union="logical")
+ )
[1] "employee"
This defines a new class, "employee", with three member variables of the specified types. Now let's create an instance of this class, for Joe, using new(), a built-in constructor function for S4 classes:
> joe <- new("employee",name="Joe",salary=55000,union=TRUE)
> joe
An object of class "employee"
Slot "name":
[1] "Joe"

Slot "salary":
[1] 55000

Slot "union":
[1] TRUE
Note that the member variables are called slots, referenced via the @ symbol. Here's an example:
> joe@salary
[1] 55000
We can also use the slot() function, say, as another way to query Joe's salary:
> slot(joe,"salary")
[1] 55000
As noted, an advantage of using S4 is safety. To illustrate this, suppose we were to accidentally spell salary as salry, like this:
> joe@salry <- 48000
Error in checkSlotAssignment(object, name, value) :
  "salry" is not a slot in class "employee"
By contrast, in S3 there would be no error message. S3 classes are just lists, and you are allowed to add a new component (deliberately or not) at any time.
9.2.2
To define an implementation of a generic function on an S4 class, use setMethod(). Let's do that for our class "employee" here. We'll implement the show() function, which is the S4 analog of S3's generic "print". As you know, in R, when you type the name of a variable while in interactive mode, the value of the variable is printed out:
> joe
An object of class "employee"
Slot "name":
[1] "Joe"

Slot "salary":
[1] 88000

Slot "union":
[1] TRUE
Since joe is an S4 object, the action here is that show() is called. In fact, we would get the same output by typing this:
> show(joe)
setMethod("show", "employee",
   function(object) {
      inorout <- ifelse(object@union,"is","is not")
      cat(object@name,"has a salary of",object@salary,
         "and",inorout, "in the union", "\n")
   }
)
The first argument gives the name of the generic function for which we will define a class-specific method, and the second argument gives the class name. We then define the new function.
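To see the new method in action, the pieces can be put together in one self-contained script. This is a sketch that repeats the class and method definitions from the text; the variable printed and the use of capture.output() are additions for illustration.

```r
library(methods)

setClass("employee",
         representation(name = "character",
                        salary = "numeric",
                        union = "logical"))

setMethod("show", "employee",
    function(object) {
        inorout <- if (object@union) "is" else "is not"
        cat(object@name, "has a salary of", object@salary,
            "and", inorout, "in the union", "\n")
    })

joe <- new("employee", name = "Joe", salary = 55000, union = TRUE)

# Typing joe at the prompt calls show(); here we call it explicitly
# and capture what it prints
printed <- capture.output(show(joe))
printed
```

In an interactive session, typing joe alone would now print the one-line summary instead of the default slot-by-slot listing.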
S4 system
The S4 system was designed to overcome some of the deficiencies of the S3 system as well as to provide other functionality that was simply missing from the S3 system. Among the major changes between S3 and S4 are the explicit representation of classes, together with tools that support programmatic inspection of the class definitions and properties. Multiple dispatch is supported in S4, but not in S3, and S4 methods are registered directly with the appropriate generic.
These changes greatly increase the stability of the system and make it much more likely that code will perform as intended by its authors. However, the system is slower, and it is more difficult to design and modify a system interactively.
S4 system: classes
A class definition specifies the structure, inheritance and initialization of instances of that class. A class is defined by a call to the function setClass. The following arguments can be specified in the call to setClass:
Class: a character string naming the class.
representation: a named vector of types or classes. The names correspond to the slot names in the class and the types indicate what type of value can be stored in the slot.
contains: a character vector of class names, indicating the classes extended or subclassed by the new class.
prototype: an object (usually a list) providing the default data for the slots specified in the representation.
validity: a function that checks the validity of instances of the class. It must return either TRUE or a character vector describing how the object is invalid.
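As an illustration of how these arguments fit together, here is a minimal sketch. The class "Interval" and its slots lower and upper are hypothetical and do not appear elsewhere in this material.

```r
library(methods)

# "Interval" and its slots are hypothetical, used only for illustration
setClass("Interval",
         representation(lower = "numeric", upper = "numeric"),
         prototype = prototype(lower = 0, upper = 1),
         validity = function(object) {
             if (length(object@lower) != 1 || length(object@upper) != 1)
                 return("lower and upper must each have length 1")
             if (object@lower > object@upper)
                 return("lower must not exceed upper")
             TRUE
         })

unit <- new("Interval")                        # uses the prototype values
ok   <- new("Interval", lower = 2, upper = 5)  # passes the validity check

# An invalid object is rejected at creation time; keep the error message
bad <- tryCatch(new("Interval", lower = 5, upper = 2),
                error = function(e) conditionMessage(e))
bad
```

The validity function returns TRUE for a valid object and a descriptive string otherwise, which R folds into the error message.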
Once a class has been defined by a call to setClass, it is possible to create instances of the class through calls to new. The prototype argument can be used to define default values to use for the different components of the class. Prototype values can be overridden by expressly setting the value for the slot in the call to new. In the code below, we create a new class named A that has a single slot, s1, that contains numeric data, and we set the prototype for that slot to be 0.
Example
> setClass("A", representation(s1 = "numeric"),
+     prototype = prototype(s1 = 0))
[1] "A"
> myA = new("A")
> myA
An object of class "A"
Slot "s1":
[1] 0

> m2 = new("A", s1 = 10)
> m2
An object of class "A"
Slot "s1":
[1] 10
We can create a second class B that contains A, so that B is a direct subclass of A or, put another way, B inherits from class A. Any instance of the class B will have all the slots in the A class and any additional ones defined specifically for B. Duplicate slot names are not allowed, so the slot names for B must be distinct from those for A.
Example
> setClass("B", contains = "A", representation(s2 = "character"),
+     prototype = list(s2 = "hi"))
[1] "B"
> myB = new("B")
> myB
An object of class "B"
Slot "s2":
[1] "hi"

Slot "s1":
[1] 0
Classes can be removed using the function removeClass. However, this is not especially useful since you cannot remove classes from attached packages. The removeClass function is most useful when experimenting with class creation interactively. But in most cases, users are developing classes within packages, and the simple expedient of removing the class definition and rebuilding the package is generally used instead. We demonstrate the use of this function on a user-defined class in the code below.
Example
> setClass("Ohno", representation(y = "numeric"))
[1] "Ohno"
> getClass("Ohno")

Slots:

Name:        y
Class: numeric
> removeClass("Ohno")
[1] TRUE
> tryCatch(getClass("Ohno"), error = function(x) "Ohno is gone")
[1] "Ohno is gone"
3.4.1.1 Introspection
Once a class has been defined, there are a number of software tools that can be used to find out about that class. These include getSlots, which reports the slot names and types, and slotNames, which reports only the slot names. These functions are demonstrated using the class A defined above.
> getSlots("A")
       s1
"numeric"
> slotNames("A")
[1] "s1"
The class itself can be retrieved using getClass. The function extends can be called with either the name of a single class, or two class names. If called with two class names, it returns TRUE if its first argument is a subclass of its second argument. If called with a single class name, it returns the names of all superclasses, including the class itself. However, this is slightly confusing, and additional helper functions have been defined in the RBioinf package, superClassNames and subClassNames, to print the names of the superclasses and of the subclasses, respectively. The use of these functions is shown in the code below.
Example
> extends("B")
[1] "B" "A"
> superClassNames("B")
[1] "A"
> subClassNames("A")
[1] "B"
These functions also provide information about built-in classes that have been converted via setOldClass.
> getClass("matrix")
No Slots, prototype of class "matrix"

Extends:
Class "array", directly
Class "structure", by class "array", distance 2
Class "vector", by class "array", distance 3, with explicit coerce

Known Subclasses:
Class "array", directly, with explicit test and coerce
> extends("matrix")
[1] "matrix"    "array"     "structure" "vector"
To determine whether or not a class has been defined, use isClass. You can test whether or not an R object is an instance of an S4 class using isS4.
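The introspection tools that ship with the methods package can be tried without RBioinf. The sketch below redefines the classes A and B from the earlier examples so that it is self-contained; the name "NoSuchClass" is a deliberately undefined placeholder.

```r
library(methods)

setClass("A", representation(s1 = "numeric"), prototype = prototype(s1 = 0))
setClass("B", contains = "A", representation(s2 = "character"),
         prototype = list(s2 = "hi"))

isClass("A")            # the class is defined
isClass("NoSuchClass")  # no such class has been defined
extends("B", "A")       # is B a subclass of A?
extends("A", "B")       # and the other way around?

myB <- new("B")
isS4(myB)               # an instance of an S4 class
isS4(1:3)               # an ordinary integer vector
is(myB, "A")            # instances of B are also instances of A
```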
The standard mechanism for coercing objects from one class to another is the function as, which has two forms. One form is coercion, where an instance of one class is coerced to the other class, and the second form is an assignment version, where a portion of the object supplied is coerced. The second form is really only applicable to situations where one class is a subclass of the other. In the example below, we first create an instance of B, then coerce it to be an instance of A. The method for this is automatically available since the classes are nested, and in fact you can also coerce from the superclass to the subclass, with missing slots being filled in from the prototype.
Example
> myb = new("B")
> as(myb, "A")
An object of class "A"
Slot "s1":
[1] 0

The second form is the assignment form, where we replace the A part of myb with the new values in mya.
> mya = new("A", s1 = 20)
> as(myb, "A") <- mya
> myb
An object of class "B"
Slot "s2":
[1] "hi"

Slot "s1":
[1] 20

When classes are not nested, the user must provide an explicit version of the coercion function, and optionally of the replacement function.
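For non-nested classes, the methods package provides setAs for registering the coercion and, optionally, the replacement function. The following is a minimal sketch; the classes "Celsius" and "Fahrenheit" and their slot deg are hypothetical.

```r
library(methods)

# Two unrelated (non-nested) classes; both are hypothetical
setClass("Celsius",    representation(deg = "numeric"))
setClass("Fahrenheit", representation(deg = "numeric"))

# Explicit coercion method, plus an optional replacement method
setAs("Celsius", "Fahrenheit",
      def = function(from) new("Fahrenheit", deg = from@deg * 9 / 5 + 32),
      replace = function(from, value) {
          from@deg <- (value@deg - 32) * 5 / 9
          from
      })

temp <- new("Celsius", deg = 100)
boiling_f <- as(temp, "Fahrenheit")     # coercion form
boiling_f@deg

# assignment form: update temp from a Fahrenheit value
as(temp, "Fahrenheit") <- new("Fahrenheit", deg = 32)
temp@deg
```

Without the call to setAs, as(temp, "Fahrenheit") would fail, because nothing relates the two classes.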
Once a class has been defined, users will want to create instances of that class. The creation of instances is controlled by three separate but related tools: the specification of a prototype for the class, the creation of an initialize method, or through values supplied in the call to new.
In a call to new, an object built from the prototype is passed to the initialize method hierarchy. Provided any user-supplied initialize methods have a call to callNextMethod, this hierarchy will be traversed until the default method is encountered. In this method the value is modified according to the arguments supplied to new and the result is returned. The prototype can be set using either a list or a call to prototype. In the example below, we define a class, Ex1, whose prototype has a random sample of values from the N(0, 1) distribution in its s1 slot.
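The interplay of prototype, initialize method and arguments to new can be sketched as follows. The class "Circle" and its slots radius and area are hypothetical names chosen for illustration.

```r
library(methods)

# "Circle" is a hypothetical class used only for illustration
setClass("Circle",
         representation(radius = "numeric", area = "numeric"),
         prototype = prototype(radius = 1, area = pi))

setMethod("initialize", "Circle",
    function(.Object, ...) {
        # let the default method fill slots from the arguments given to new()
        .Object <- callNextMethod(.Object, ...)
        # then derive the dependent slot from the (possibly updated) radius
        .Object@area <- pi * .Object@radius^2
        .Object
    })

c1 <- new("Circle")              # prototype radius is kept
c2 <- new("Circle", radius = 2)  # value supplied to new() overrides it
c(c1@area, c2@area)
```

Calling callNextMethod first means the default method still handles the named slot arguments, so the user-supplied method only has to add its own step.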
Example
> setClass("Ex1", representation(s1 = "numeric"),
+     prototype = prototype(s1 = rnorm(10)))
[1] "Ex1"
> b = new("Ex1")
> b
An object of class "Ex1"
Slot "s1":
 [1] -1.3730 -0.5483  0.2648  0.0487  1.4423  0.0283  1.1793
 [8] -1.6695 -0.0536  0.0729
Exercise 3.6
What happens if you generate a second instance of the Ex1 class? Why might this not be desirable? Examine the prototype for the class and see if you can understand what has happened. Will changing the prototype to list(s1=quote(rnorm(10))) fix the problem?
When a subclass, such as B from our previous example, is defined, then a prototype is constructed from the prototypes of the superclasses for slots that are not specified in the prototype for the subclass. We see, below, that the prototype for B has a value for the s1 slot, even though none was formally supplied, and that value is the one for the superclass A.
> bb = getClass("B")
> bb@prototype
<S4 Type Object>
attr(,"s2")
[1] "hi"
attr(,"s1")
[1] 0
If desired, one can define an initialize method for a class. The default initialize method takes either named arguments, where the names are those of slots, or one or more unnamed arguments that correspond to instances of superclasses of the new object.
Example
In the example below, we define two new classes, one a simple class, W, and then a class, WA, that is a subclass of both A, defined earlier, and W. When creating new instances of W and A, we made use of named arguments to the initialize method, but when creating a new instance of the WA class, we used the unnamed variant and supplied instances of the superclasses.
> setClass("W", representation(c1 = "character"))
[1] "W"
> setClass("WA", contains = (c("A", "W")))
[1] "WA"
> a1 = new("A", s1 = 20)
> w1 = new("W", c1 = "hi")
> new("WA", a1, w1)
An object of class "WA"
Slot "s1":
[1] 20

Slot "c1":
[1] "hi"
Types of classes
A class can be instantiable or virtual. Direct instances of virtual classes cannot be created.
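A short sketch makes the distinction concrete. The classes "Shape" and "Square" and the generic area() are hypothetical names used only for this illustration.

```r
library(methods)

# "Shape", "Square" and area() are hypothetical names
setClass("Shape", representation("VIRTUAL"))
setClass("Square", contains = "Shape", representation(side = "numeric"))

setGeneric("area", function(shape) standardGeneric("area"))
setMethod("area", "Square", function(shape) shape@side^2)

is_virtual <- isVirtualClass("Shape")

# Direct instantiation of the virtual class fails
res <- tryCatch(new("Shape"),
                error = function(e) "cannot instantiate Shape")
res

# An instantiable subclass works as usual
sq_area <- area(new("Square", side = 3))
sq_area
```

Virtual classes are typically used this way: they give a common parent to a family of instantiable classes without ever being instantiated themselves.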
3.4.6 Using S3 classes with S4 classes
S3 classes can be used to describe the contents of a slot in an S4 class, and they can be used for dispatch in S4 methods by first creating an S4 virtualization of the class. This is done with a call to setOldClass, and many such classes are created when the methods package is attached. The resulting S4 classes are virtual classes, so that instances cannot be created directly. All classes created with a call to setOldClass inherit from the class oldClass.
> setOldClass("mymatrix")
> getClass("mymatrix")
Virtual Class

No Slots, prototype of class "S4"

Extends: "oldClass"
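Both uses of a registered S3 class can be sketched in a few lines. The class name "mymatrix" follows the text; the class "Holder" and the generic describe() are hypothetical additions.

```r
library(methods)

# "mymatrix" follows the text; "Holder" and describe() are hypothetical
setOldClass("mymatrix")

# Use 1: an S3 class as the declared type of a slot
setClass("Holder", representation(m = "mymatrix"))

# Use 2: an S3 class in the signature of an S4 method
setGeneric("describe", function(x) standardGeneric("describe"))
setMethod("describe", "mymatrix", function(x) "an S3 mymatrix")

# An S3 instance: a matrix carrying the class attribute "mymatrix"
x <- structure(matrix(1:4, 2), class = "mymatrix")

h <- new("Holder", m = x)  # accepted as a slot value
described <- describe(x)   # S4 dispatch on the S3 class
described
```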
setMethod()
setMethod(f, signature=character(), definition,
          where = topenv(parent.frame()),
          valueClass = NULL, sealed = FALSE)

f: A generic function or the character-string name of the function.
signature: A match of formal argument names for f with the character-string names of corresponding classes. See the details below; however, if the signature is not trivial, you should use method.skeleton to generate a valid call to setMethod.
definition: A function definition, which will become the method called when the arguments in a call to f match the classes in signature, directly or through inheritance.
where: The environment in which to store the definition of the method. For setMethod, it is recommended to omit this argument and to include the call in source code that is evaluated at the top level; that is, either in an R session by something equivalent to a call to source, or as part of the R source code for a package. For removeMethod, the default is the location of the (first) instance of the method for this signature.
valueClass: Obsolete and unused, but see the same argument for setGeneric.
sealed: If TRUE, the method so defined cannot be redefined by another call to setMethod (although it can be removed and then re-assigned).
[...] generic function, and it also establishes a default method that will be used if no function with matching signature is found. The syntax is quite straightforward. The def argument is a function; each named argument can be dispatched on, and the ... argument should be used if other arguments to the generic will be permitted. These arguments cannot be dispatched on, however. So in the code below, the generic function has two named arguments, object and x, and methods can be defined that indicate different signatures for these two arguments.
Example
> setGeneric("foo", function(object, x) standardGeneric("foo"))
[1] "foo"
> setMethod("foo", signature("numeric", "character"),
+     function(object, x) print("Hi, I'm method one"))
[1] "foo"
Exercise 3.9
Define another method for the generic function foo defined above, with a different signature. Test that the correct method is dispatched to for different arguments.
Any argument passed through the ... argument cannot be dispatched on. It is possible to have named arguments that are not part of the signature of the generic function. This is achieved by explicitly stating the signature for the generic function using the signature argument in the call to setGeneric, as is demonstrated below. In that case it may make sense for a method to provide default values for the arguments not in the signature.
Example
> setGeneric("genSig", signature = c("x"),
+     function(x, y = 1) standardGeneric("genSig"))
[1] "genSig"
> setMethod("genSig", signature("numeric"),
+     function(x, y = 20) print(y))
[1] "genSig"
> genSig(10)
[1] 20
[...] this is not too useful since only generic functions defined in the user's workspace are easily removed. To list all generic functions and the packages that they are defined in, use the function getGenerics, with no arguments.
Example
> getClass("ObjectsWithPackage")
Class "ObjectsWithPackage" [package "methods"]

Slots:
[...]

Extends:
Class "character", from data part
Class "vector", by class "character", distance 2
Class "data.frameRowLabels", by class "character", distance 2
Class "characterORMIAME", by class "character", distance 2
[...] generic functions defined in that package. In the example below, we load the Biobase package and then try to find all generic functions that are defined in it.
[...]
[1] 78
[...] most specific to least specific. Dispatch is entirely determined by the signature and the registered methods at the time evaluation of the generic function begins.
They can be removed through a call to either [...]
The method should have one argument matching each argument in the signature of the generic function.
[...] method will handle some, but not all, of the arguments in the signature of the generic.
When ... is an argument to the generic function, you can define methods with named arguments that will be handled by the ... argument to the generic function. But some care is needed because these arguments, in some sense, do not count. There can be only one method with any given signature (set of classes defined for the formal arguments to the generic), regardless of whether or not other argument names match.
Example
[...]
[1] "bar"
> ##removes the method above
> setMethod("bar", signature("numeric", "numeric"),
+     function(x, y, z) print("Method2"))
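Because the transcript above is incomplete, here is a self-contained sketch of the same point. The generic bar and the second method follow the text; the first method, with its extra argument d, and the result variable names are assumptions added so that there is something to be replaced.

```r
library(methods)

setGeneric("bar", function(x, y, ...) standardGeneric("bar"))

# First method: the extra named argument d travels through "..."
# (this method is an assumption, added for illustration)
setMethod("bar", signature("numeric", "numeric"),
          function(x, y, d) "Method1")

# Same signature, so this replaces the method above even though
# the extra argument has a different name
setMethod("bar", signature("numeric", "numeric"),
          function(x, y, z) "Method2")

r1 <- bar(1, 1, z = 20)   # extra argument matched by name
r2 <- bar(2, 2, 30)       # extra argument matched by position

# The method taking d no longer exists, so d = 20 is an unused argument
r3 <- tryCatch(bar(2, 4, d = 20), error = function(e) "no Method1")
c(r1, r2, r3)
```

Only the classes of x and y take part in dispatch, which is why the two definitions collide.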
Calling foo with a combination of argument classes for which no method has been defined results in an error:
> foo(5,3)
Error in function (classes, fdef, mtable) :
  unable to find an inherited method for function "foo", for signature "numeric", "numeric"
3.4.5 Accessor functions
Accessing slots directly using the @ operator relies on the implementation details of the class, and such access will make it very difficult to change that implementation. In many cases it will be advantageous to provide accessor functions for some, or all, of the components of an object. Suppose that the class Foo has a slot named a. To create an accessor function for this slot, we create a generic function named a and a method for instances of the class Foo.
> setClass("Foo", representation(a = "ANY"))
[1] "Foo"
> setGeneric("a", function(object) standardGeneric("a"))
[1] "a"
> setMethod("a", "Foo", function(object) object@a)
[1] "a"
> b = new("Foo", a = 10)
> a(b)
[1] 10
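An accessor is often paired with a replacement function, so that callers never need @ for writing either. The sketch below repeats the Foo example and adds a replacement generic; the generic "a<-" and the variable names first and second are additions for illustration, not part of the text above.

```r
library(methods)

# "Foo" and a() follow the text; the replacement generic "a<-"
# is an addition for illustration
setClass("Foo", representation(a = "ANY"))

setGeneric("a", function(object) standardGeneric("a"))
setMethod("a", "Foo", function(object) object@a)

setGeneric("a<-", function(object, value) standardGeneric("a<-"))
setReplaceMethod("a", "Foo", function(object, value) {
    object@a <- value
    validObject(object)   # re-check validity after modification
    object
})

b <- new("Foo", a = 10)
first <- a(b)     # read through the accessor
a(b) <- 25        # write through the replacement method
second <- a(b)
c(first, second)
```

If the slot is later renamed or computed on demand, only these two methods need to change.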
The S4 system
# definition of S4 classes
setClass("lreg4",
    representation(coefficients="numeric", var="matrix",
        iterations="numeric", deviance="numeric",
        predictors="character"))
lreg4 <- function(X, y, predictors=colnames(X), constant=TRUE,
                  max.iter=10, tol=1E-6) {
    if (!is.numeric(X) || !is.matrix(X))
        stop("X must be a numeric matrix")
    if (!is.numeric(y) || !all(y == 0 | y == 1))
        stop("y must contain only 0s and 1s")
    if (nrow(X) != length(y))
        stop("X and y contain different numbers of observations")
    if (constant) {
        X <- cbind(1, X)
        colnames(X)[1] <- "Constant"
    }
    b <- b.last <- rep(0, ncol(X))
    it <- 1
    while (it <= max.iter){
        p <- as.vector(1/(1 + exp(-X %*% b)))
        var.b <- solve(crossprod(X, p * (1 - p) * X))
        b <- b + var.b %*% crossprod(X, y - p)
        if (max(abs(b - b.last)/(abs(b.last) + 0.01*tol)) < tol) break
        b.last <- b
        it <- it + 1
    }
    if (it > max.iter) warning("maximum iterations exceeded")
    # create an instance of the "lreg4" class:
    result <- new("lreg4", coefficients=as.vector(b), var=var.b,
                  iterations=it,
                  deviance=-2*sum(y*log(p) + (1 - y)*log(1 - p)),
                  predictors=predictors)
    result
}
mod.mroz.4 <- with(Mroz,
    lreg4(cbind(k5, k618, age, wc, hc, lwg, inc), lfp))
class(mod.mroz.4)
mod.mroz.4
show   # the S4 generic function show

# defining an S4 method
setMethod("show", signature(object="lreg4"),
    definition=function(object) {
        coef <- object@coefficients
        names(coef) <- object@predictors
        print(coef)
    }
)

mod.mroz.4   # invokes show method
setMethod("summary", signature(object="lreg4"),
    definition=function(object, ...) {
        b <- object@coefficients
        se <- sqrt(diag(object@var))
        z <- b/se
        # two-sided p-values, hence the label Pr(>|z|)
        table <- cbind(b, se, z, 2*(1-pnorm(abs(z))))
        colnames(table) <- c("Estimate", "Std.Err", "Z value", "Pr(>|z|)")
        rownames(table) <- object@predictors
        printCoefmat(table)
        cat("\nDeviance =", object@deviance,"\n")
    }
)

summary(mod.mroz.4)
# Lexical scope
f <- function (x) x + a
a <- 10
x <- 5
f(2)   # x bound to 2 in frame of f(), a to 10 in global frame
x      # global x is undisturbed

f <- function (x) {
    a <- 5
    g(x)
}
g <- function(y) y + a
f(2)   # a bound to 10 in global frame
a      # global a is undisturbed

f <- function (x) {
    a <- 5
    g <- function (y) y + a
    g(x)
}
f(2)   # a is bound to 5, x to 2 in frame of f(), y to 2 in frame of g()
# a function that returns a closure (function + environment)
makePower <- function(power) {
    function(x) x^power
}

square <- makePower(2)
square      # power bound to 2
square(4)

cuberoot <- makePower(1/3)
cuberoot    # power bound to 1/3
cuberoot(64)
A method is applicable to a target signature if for every argument in the signature the class specified by the method is the same as the class of the corresponding supplied argument, a superclass of that class, or has class ANY. To order the applicable methods, a distance is defined on the classes: if the classes are the same, the distance is zero; if the class in the signature of the method is a direct superclass of the class of the supplied argument, then the distance is one, and so on. The distance from a class to ANY is chosen to be larger than any other distance.
The distance between an applicable method and the target signature can then be computed by summing up the distances over all arguments in the signature of the generic function, and these distances can then be used to order the methods.
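The ordering by distance can be observed with a small class hierarchy. In this sketch the classes "Base", "Parent" and "Child" and the generic who() are hypothetical.

```r
library(methods)

# Hypothetical three-level hierarchy: Child extends Parent extends Base
setClass("Base",   representation("VIRTUAL"))
setClass("Parent", contains = "Base",   representation(p = "numeric"))
setClass("Child",  contains = "Parent", representation(k = "numeric"))

setGeneric("who", function(x) standardGeneric("who"))
setMethod("who", "ANY",    function(x) "ANY method")
setMethod("who", "Base",   function(x) "Base method")
setMethod("who", "Parent", function(x) "Parent method")

kid <- new("Child")

# No method is defined for Child itself. Of the applicable methods,
# Parent is at distance 1, Base at distance 2, and ANY is furthest away.
for_child <- who(kid)

# For an integer, only the ANY method is applicable
for_int <- who(1L)

c(for_child, for_int)
```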
Finding methods
We will often need to be able to determine which methods are registered with a particular generic function.
showMethods: shows the methods for one or more generic functions. The class argument can be used to ask for all methods that have a particular class in their signature. The output is printed to stdout by default and cannot easily be captured for programmatic use.
getMethod: returns the method for a specific generic function whose signature is congruent with the specified signature. An error is thrown if no such method exists.
findMethod: returns the packages in the search path that contain a definition for the generic and signature specified.
selectMethod: returns the method for a specific generic function and signature, but differs from getMethod in that inheritance is used to identify a method.
existsMethod: tests for a method with a congruent signature (to that provided) registered with the specified generic function. No inheritance is used. Returns either TRUE or FALSE.
hasMethod: tests for a method with a congruent signature for the specified generic function. It seems that this would always return TRUE (since there must be a default method). It does return FALSE if there is no generic function, but it seems that there are better ways to handle that.
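The difference between lookups with and without inheritance is easiest to see side by side. The sketch below redefines the classes A and B from earlier and uses a hypothetical one-argument generic foo; the result variable names are made up.

```r
library(methods)

setClass("A", representation(s1 = "numeric"))
setClass("B", contains = "A", representation(s2 = "character"))

# A hypothetical generic with a single method, registered for class A
setGeneric("foo", function(object) standardGeneric("foo"))
setMethod("foo", "A", function(object) "foo for A")

direct_A <- existsMethod("foo", "A")  # a method registered for exactly A
direct_B <- existsMethod("foo", "B")  # nothing registered for exactly B

# getMethod does not use inheritance, so asking for B is an error
get_B <- tryCatch(getMethod("foo", "B"),
                  error = function(e) "no direct method for B")

# selectMethod uses inheritance and finds the method defined for A
m <- selectMethod("foo", "B")
via_inheritance <- m(new("B"))

list(direct_A, direct_B, get_B, via_inheritance)
```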
Finding Documentation
Either a direct call to help or the use of the ? operator will obtain the help page for most functions. [...] For example, the syntax for displaying the help page for the graph class, from the graph package, is:
class?graph
help("graph-class")
Help for generic functions requires no special syntax; one just looks for help on the name of the generic function. [...] page for a method for the nodes generic function, for an argument of class graphNEL.
library(RBioinf)
S4Help()
The function takes the name of either a S4 generic [...]
The function isS4 returns TRUE for an instance of an S4 class. For primitive functions that support dispatch, S4 methods are restricted to S4 objects. The function asS4 can be used to allow an instance of an S3 class to be passed to an S4 method. In the next example we show that when x is an S3 instance, we do not dispatch to the S4 method, but once we use asS4, then dispatch to the S4 method occurs.