| pdmClass {pdmclass} | R Documentation |
This function is used to classify microarray data. Since the underlying model fit is based on penalized discriminant methods, there is no need for a pre-filtering step to reduce the number of genes.
pdmClass(formula = formula(data), method = c("pls", "pcr", "ridge"),
data = sys.frame(sys.parent()), weights, theta, dimension = J - 1,
eps = .Machine$double.eps, ...)
formula |
A symbolic description of the model to be fit. Details given below. |
method |
One of "pls", "pcr", "ridge", corresponding to partial least squares, principal components regression and ridge regression. |
data |
An optional data.frame that contains the variables in the
model. If not found in data, the variables are taken from
environment(formula), typically the environment from which
pdmClass is called. Note that unlike most microarray
analyses, in this case rows are samples and columns are genes. |
weights |
An optional vector of sample weights. Defaults to 1. |
theta |
An optional matrix of class scores, typically with fewer than J - 1 columns, where J is the number of classes. |
dimension |
The dimension of the solution. This will be no greater than J - 1 for partial least squares and ridge regression, and no greater than J for principal components regression. Defaults to J - 1 and J, respectively. |
eps |
A threshold for excluding small discriminant
variables. Defaults to .Machine$double.eps. |
... |
Additional parameters to pass to method. |
The formula interface follows the usual R convention, Y ~ X, where Y is a numeric vector of class assignments and X is a matrix or data.frame containing the gene expression values. Note that, unlike most microarray analyses, the rows of X must be samples and the columns genes, so most calls will require something similar to Y ~ t(X).
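The orientation requirement can be illustrated with a small sketch in base R. The matrix X, the labels Y and all dimensions below are hypothetical toy data, not taken from any package; the final pdmClass call is left commented out because it assumes the pdmclass package is attached.

```r
# Hypothetical toy data: 5 genes (rows) by 6 samples (columns),
# the orientation used by most expression matrices.
set.seed(1)
X <- matrix(rnorm(5 * 6), nrow = 5,
            dimnames = list(paste0("gene", 1:5), paste0("sample", 1:6)))

# One class label per sample (two hypothetical classes).
Y <- rep(c(1, 2), each = 3)

# Transpose so that rows are samples and columns are genes.
Xt <- t(X)
dim(Xt)  # 6 samples by 5 genes

# The number of rows must now match the number of class labels.
stopifnot(nrow(Xt) == length(Y))

# With pdmclass attached, the call would then look something like:
# fit <- pdmClass(Y ~ Xt, method = "pls")
```

Checking nrow against the length of the label vector before fitting is a cheap way to catch a forgotten transpose.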
An object of class "fda". Use predict to extract
discriminant variables, posterior probabilities or predicted class
memberships. Other extractor functions are coef
and plot.
The object has the following components:
percent.explained |
the percent of between-group variance explained by each dimension (relative to the total explained). |
values |
optimal scaling regression sum-of-squares for each
dimension (see reference). The usual discriminant analysis
eigenvalues are given by values / (1-values), which are used
to define percent.explained. |
means |
class means in the discriminant space. These are also
scaled versions of the final theta's or class scores, and can be
used in a subsequent call to fda (this only makes sense if
some columns of theta are omitted—see the references). |
theta.mod |
(internal) a class scoring matrix which allows
predict to work properly. |
dimension |
dimension of discriminant space. |
prior |
class proportions for the training data. |
fit |
fit object returned by method. |
call |
the call that created this object (allowing it to be
update-able) |
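The relation between the values and percent.explained components described above can be sketched as follows. The numbers in values are hypothetical, and the code is an illustration of the stated formula rather than the package's own implementation. The text does not say whether the stored component is per-dimension or cumulative, so both forms are computed.

```r
# Hypothetical 'values' component for a three-dimensional solution.
values <- c(0.80, 0.50, 0.20)

# Usual discriminant analysis eigenvalues: values / (1 - values).
eigenvalues <- values / (1 - values)

# Share of between-group variance per dimension,
# relative to the total explained.
share <- 100 * eigenvalues / sum(eigenvalues)

# Cumulative version of the same quantity.
cumulative <- cumsum(share)

round(rbind(eigenvalues, share, cumulative), 2)
```

Because values / (1 - values) grows quickly as a value approaches 1, the leading dimension can dominate the percentages even when the raw values look similar.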
James W. MacDonald and Debashis Ghosh, based on fda in
the mda package of Trevor Hastie and Robert Tibshirani, which
was ported to R by Kurt Hornik, Brian D. Ripley, and Friedrich Leisch.
http://www.sph.umich.edu/~ghoshd/COMPBIO/POPTSCORE
"Flexible Discriminant Analysis by Optimal Scoring" by Hastie, Tibshirani and Buja, 1994, JASA, 1255-1270.
"Penalized Discriminant Analysis" by Hastie, Buja and Tibshirani, Annals of Statistics, 1995 (in press).
library(fibroEset)
data(fibroEset)
y <- as.factor(pData(fibroEset)[,2])
x <- t(exprs(fibroEset))
pdmClass(y ~ x)