| xval-methods {MLInterfaces} | R Documentation |
support for cross-validatory machine learning with ExpressionSets
xval(data, classLab, proc, xvalMethod, group, indFun, niter, fsFun=NULL, fsNum=NULL, decreasing=TRUE, cluster=NULL, ...)
balKfold(K)
data |
instance of class ExpressionSet |
classLab |
character string identifying phenoData variable to label classifications |
proc |
an MLInterfaces method that returns an instance of classifOutput |
xvalMethod |
character string identifying cross-validation procedure to use: default is "LOO" (leave one out), alternatives are "LOG" (leave group out) and "FUN" (user-supplied partition extraction function, see Details below) |
group |
a vector (length equal to number of samples) enumerating groups for LOG xval method |
indFun |
a function that returns the set of indices to be used as the training set in a given iteration;
this function must have parameters data, clab, iternum; see Details |
niter |
number of iterations for user-specified partition function to be run |
fsFun |
function computing ranks of features for feature selection |
fsNum |
number of features to be kept for learning in each iteration |
decreasing |
logical; should be TRUE if fsFun gives high scores to high-performing features
(e.g., the absolute value of a test statistic) and FALSE if it gives low scores
to high-performing features (e.g., the p-value of a test). |
cluster |
NULL or an S4-class object with a defined
xvalLoop method. Use this to execute xval on
several nodes in a computer cluster. See documentation for
xvalLoop for more information |
... |
arguments passed to the MLInterfaces generic proc |
K |
number of partitions to be used if balKfold is used as indFun |
For fixed feature sets (fsFun not specified),
a vector or matrix with length equal to the number of cross-validation
assignments. Each element contains the label resulting from the
cross-validation.
For dynamic feature sets (fsFun specified), a list with element
out containing labels from cross-validations, and element
fs.memory recording features used in each cross-validation.
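As a small illustration of how the returned labels might be summarized, the sketch below tabulates predicted against true labels and computes a misclassification rate. The vectors truth and predicted are toy stand-ins invented for this sketch (predicted plays the role of the xval result, or of its out element when fsFun is specified); they are not real output.

```r
# Toy stand-ins, invented for illustration only:
# 'truth' plays the role of the phenoData variable named by classLab,
# 'predicted' the role of the labels returned by xval.
truth     <- factor(c("ALL", "ALL", "AML", "AML", "ALL", "AML"))
predicted <- factor(c("ALL", "AML", "AML", "AML", "ALL", "ALL"),
                    levels = levels(truth))

# Cross-tabulate predicted labels against the true labels.
confusion <- table(predicted, truth)

# Fraction of samples whose cross-validated label differs from the truth.
errorRate <- mean(as.character(predicted) != as.character(truth))

print(confusion)
print(errorRate)
```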
If xvalMethod is "FUN", then indFun must be a function
with parameters data, clab, and iternum.
This function returns the
indices that identify the training set for the
cross-validation iteration passed as the value of iternum. An example of such
a function (the definition of balKfold) is printed when the examples on this page are run.
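As a rough illustration of this contract, the sketch below builds a partition function with the required parameters data, clab, and iternum. The generator name holdOutBlock and its nSamples argument are hypothetical and not part of MLInterfaces. For simplicity the returned function ignores data and clab and returns, for iteration iternum, the indices of all samples outside the held-out block, that is, the training set.

```r
# Hypothetical generator (not part of MLInterfaces): assigns samples to K
# blocks in round-robin order and holds out one block per iteration.
holdOutBlock <- function(K, nSamples) {
  # Block membership of each sample: 1, 2, ..., K, 1, 2, ...
  block <- rep(seq_len(K), length.out = nSamples)
  function(data, clab, iternum) {
    # Training set = every sample that is not in block 'iternum'.
    # 'data' and 'clab' are accepted only to match the required signature.
    which(block != iternum)
  }
}

# Stand-alone check with 10 samples and 3 blocks.
f <- holdOutBlock(3, 10)
trainIter1 <- f(NULL, NULL, 1)
print(trainIter1)
```

Assuming a 72-sample data set such as smallG in the examples, a function of this kind might be supplied as indFun=holdOutBlock(6, 72) together with niter=6.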
If fsFun is not NULL, then it must be a function with two
arguments: the first can be transformed to a feature matrix (rows are objects,
columns are features) and the second is a vector of class labels.
The function returns a vector of scores, one for each feature. The
scores are interpreted according to the value of decreasing
to select fsNum features. Thanks to Stephen Henderson of University
College London for
this functionality.
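The role of decreasing can be pictured with the following sketch, which ranks a vector of scores and keeps the first fsNum positions. The helper name pickTopFeatures and the toy scores are assumptions made for illustration; this is not the package's internal code, only one plausible reading of the description above.

```r
# Hypothetical helper (not part of MLInterfaces): return the positions of the
# fsNum best-scoring features, where "best" means highest score when
# decreasing = TRUE and lowest score when decreasing = FALSE.
pickTopFeatures <- function(scores, fsNum, decreasing = TRUE) {
  ranked <- order(scores, decreasing = decreasing)
  ranked[seq_len(min(fsNum, length(scores)))]
}

# Toy scores for four features, invented for illustration.
scores <- c(0.2, 3.1, 1.5, 0.9)

# Statistic-like scores: larger is better.
topByStatistic <- pickTopFeatures(scores, fsNum = 2, decreasing = TRUE)

# p-value-like scores: smaller is better.
topByPvalue <- pickTopFeatures(scores, fsNum = 2, decreasing = FALSE)

print(topByStatistic)
print(topByPvalue)
```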
library(MLInterfaces)
library(golubEsets)
data(Golub_Merge)
smallG <- Golub_Merge[200:250,]
lk1 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", group=as.integer(0))
table(lk1,smallG$ALL.AML)
lk2 <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOG", group=as.integer(
rep(1:8,each=9)))
table(lk2,smallG$ALL.AML)
# print the definition of balKfold, which generates an example partition function
balKfold
lk3 <- xval(smallG, "ALL.AML", knnB, xvalMethod="FUN", 0:0, indFun=balKfold(5), niter=5)
table(lk3, smallG$ALL.AML)
#
# illustrate the xval FUN method in comparison to LOO
#
LOO2 <- xval(smallG, "ALL.AML", knnB, "FUN", 0:0, function(x,y,i) {
(1:ncol(exprs(x)))[-i] }, niter=72 )
table(lk1, LOO2)
#
# use Stephen Henderson's feature selection extensions
#
t.fun <- function(data, fac)
{
  require(genefilter)
  # deal with the integer storage of golubTrain@exprs!
  xd <- matrix(as.double(exprs(data)), nrow=nrow(exprs(data)))
  return(abs(rowttests(xd, pData(data)[[fac]], tstatOnly=FALSE)$statistic))
}
lk3f <- xval(smallG, "ALL.AML", knnB, xvalMethod="LOO", 0:0, fsFun=t.fun)
table(lk3f$out, smallG$ALL.AML)