rowpAUCs-methods {genefilter} | R Documentation |
Methods for fast rowwise computation of ROC curves and (partial) area under the curve (pAUC) using the simple classification rule x > theta, where theta is a value in the range of x
rowpAUCs(x, fac, p=0.1, flip=TRUE, caseNames=c("1", "2"))
x |
ExpressionSet or numeric matrix. The matrix must not contain NA values. |
fac |
A factor, or a numeric or character vector that can be coerced to a factor. If x is an ExpressionSet, this may also be a character vector of length 1 with the name of a covariate variable in x. fac must have exactly 2 levels. For better control over the classification, use the integer values 0 and 1, where 1 indicates the "Disease" class in the sense of the Pepe et al. paper (see below). |
p |
Numeric vector of length 1. Limit in (0,1) up to which the pAUC is integrated. |
flip |
Logical. If TRUE, both classification rules x > theta and x < theta are tested and the (partial) area under the curve of the better of the two is returned. This is appropriate when the classification is not necessarily linked to higher expression values but is symmetric, so that both over- and under-expressed genes are expected in both classes. Set flip to FALSE if you only want to screen for genes that discriminate Disease from Control with the x > theta rule. |
caseNames |
The class names that are used when plotting the data. If fac is the name of a covariate variable in the ExpressionSet, the function uses its levels as caseNames. |
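The 0/1 coding recommended for fac can be prepared in base R before calling the function. The following is a minimal sketch; the sample labels "healthy" and "tumor" and the variable names status and fac01 are hypothetical and serve only as an illustration.

```r
# Hypothetical phenotype labels for six samples (illustration only).
status <- c("healthy", "tumor", "tumor", "healthy", "tumor", "healthy")

# Explicit 0/1 coding: 1 marks the "Disease" class, 0 the "Control" class.
fac01 <- as.integer(status == "tumor")

# factor() orders its levels alphabetically, so the level order depends on
# the label spelling; the explicit coding above does not.
lev <- levels(factor(status))

# With genefilter loaded and a numeric matrix x that has six columns,
# the call would then look like this (not run here):
# res <- rowpAUCs(x, fac01, p = 0.1, flip = FALSE)
```

Coding the classes explicitly keeps the choice of the "Disease" class visible in the script instead of leaving it to the order of the factor levels.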
Rowwise calculation of Receiver Operating Characteristic (ROC) curves and the corresponding partial area under the curve (pAUC) for a given data matrix or ExpressionSet. The function is implemented in C and is thus reasonably fast and memory efficient. Cutpoints (theta) are calculated before the first, in between, and after the last data value. By default, both classification rules x > theta and x < theta are tested and the (partial) area under the curve of the better of the two is returned. This is only valid for symmetric cases, where the classification is independent of the magnitude of x (e.g., both over- and under-expression of different genes in the same class). For asymmetric cases in which you expect x to be consistently higher or lower in one of the two classes (e.g., presence or absence of a single biomarker), set flip=FALSE or use the functionality provided in the ROC package. For better control over the classification (i.e., the choice of "Disease" and "Control" class in the sense of the Pepe et al. paper), argument fac can be an integer vector with values 0 and 1, where 1 indicates "Disease" and 0 indicates "Control".
An object of class rowROC with the calculated specificities and sensitivities for each row and the corresponding pAUC and AUC values. See rowROC for details.
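The cutpoint scheme and the flip option described in the details can be illustrated for a single row in base R. This is only a sketch and not the package's C implementation. It assumes that the class labels are coded 0/1, that the pAUC is the trapezoidal area over 1 - specificity from 0 to p, and that flipping is decided on the total AUC. The helper names trapz_upto and roc_row are hypothetical.

```r
# Illustrative single-row ROC and pAUC in base R; NOT the package's C code.
# Assumptions: truth is coded 0/1 (1 = "Disease"), the pAUC is the
# trapezoidal area over 1 - specificity in [0, p], and the decision to
# flip is made on the total AUC.

# Trapezoidal area under a piecewise-linear curve, truncated at fx = upto.
trapz_upto <- function(fx, fy, upto) {
  area <- 0
  for (i in seq_len(length(fx) - 1)) {
    x0 <- fx[i]
    x1 <- fx[i + 1]
    if (x0 >= upto || x1 == x0) next   # past the limit, or a vertical step
    xe <- min(x1, upto)                # clip the segment at the limit
    ye <- fy[i] + (fy[i + 1] - fy[i]) * (xe - x0) / (x1 - x0)
    area <- area + (xe - x0) * (fy[i] + ye) / 2
  }
  area
}

roc_row <- function(x, truth, p = 0.1, flip = TRUE) {
  stopifnot(!anyNA(x), all(truth %in% c(0, 1)),
            any(truth == 1), any(truth == 0), p > 0, p < 1)
  curve_area <- function(v) {
    vs <- sort(unique(v))
    n <- length(vs)
    # Cutpoints before the first, in between, and after the last data value.
    theta <- c(vs[1] - 1, (vs[-1] + vs[-n]) / 2, vs[n] + 1)
    # Rule v > theta: fraction of cases and of controls called positive.
    sens <- vapply(theta, function(th) mean(v[truth == 1] > th), numeric(1))
    fpr  <- vapply(theta, function(th) mean(v[truth == 0] > th), numeric(1))
    o <- order(fpr, sens)
    list(AUC  = trapz_upto(fpr[o], sens[o], 1),
         pAUC = trapz_upto(fpr[o], sens[o], p))
  }
  res <- curve_area(x)
  flipped <- FALSE
  if (flip && res$AUC < 0.5) {
    res <- curve_area(-x)              # x < theta is the same as -x > -theta
    flipped <- TRUE
  }
  c(res, flipped = flipped)
}

# A perfectly separated row, once with each labelling.
x <- c(1, 2, 3, 4, 5, 6)
up   <- roc_row(x, truth = c(0, 0, 0, 1, 1, 1), p = 0.1)
down <- roc_row(x, truth = c(1, 1, 1, 0, 0, 0), p = 0.1, flip = FALSE)
both <- roc_row(x, truth = c(1, 1, 1, 0, 0, 0), p = 0.1, flip = TRUE)
```

In this sketch the row is a perfect classifier under x > theta for the first labelling, a useless one for the swapped labelling when flip = FALSE, and again a perfect one once flipping is allowed, which mirrors the distinction between symmetric and asymmetric screening made above.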
rowpAUCs:
signature(x="matrix", fac="factor")
signature(x="matrix", fac="numeric")
signature(x="ExpressionSet")
signature(x="ExpressionSet", fac="character")
Florian Hahne <fhahne@fhcrc.org>
Pepe MS, Longton G, Anderson GL, Schummer M.: Selecting differentially expressed genes from microarray experiments. Biometrics. 2003 Mar;59(1):133-42.
library(Biobase)
library(genefilter)
data(sample.ExpressionSet)

r1 = rowttests(sample.ExpressionSet, "sex")
r2 = rowpAUCs(sample.ExpressionSet, "sex", p=0.1)

plot(area(r2, total=TRUE), r1$statistic, pch=16)
sel <- which(area(r2, total=TRUE) > 0.7)
plot(r2[sel])

## this compares performance and output of rowpAUCs to function pAUC in
## package ROC
if(require(ROC)){
  ## performance
  myRule = function(x)
    pAUC(rocdemo.sca(truth = as.integer(sample.ExpressionSet$sex)-1,
                     data = x, rule = dxrule.sca), t0 = 0.1)
  nGenes = 200
  cat("computation time for ", nGenes, "genes:\n")
  cat("function pAUC: ")
  print(system.time(r3 <- esApply(sample.ExpressionSet[1:nGenes, ], 1, myRule)))
  cat("function rowpAUCs: ")
  print(system.time(r2 <- rowpAUCs(sample.ExpressionSet[1:nGenes, ], "sex", p=1)))

  ## compare output
  myRule2 = function(x)
    pAUC(rocdemo.sca(truth = as.integer(sample.ExpressionSet$sex)-1,
                     data = x, rule = dxrule.sca), t0 = 1)
  r4 <- esApply(sample.ExpressionSet[1:nGenes, ], 1, myRule2)
  plot(r4, area(r2), xlab="function pAUC", ylab="function rowpAUCs",
       main="pAUCs")
  plot(r4, area(rowpAUCs(sample.ExpressionSet[1:nGenes, ], "sex", p=1, flip=FALSE)),
       xlab="function pAUC", ylab="function rowpAUCs", main="pAUCs")
  r4[r4<0.5] <- 1-r4[r4<0.5]
  plot(r4, area(r2), xlab="function pAUC", ylab="function rowpAUCs",
       main="pAUCs")
}