| ebam.wilc {siggenes} | R Documentation |
Performs an Empirical Bayes Analysis of Microarrays by using Wilcoxon Rank Sums as expression scores for the genes.
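As a rough illustration of what "Wilcoxon rank sums as expression scores" means, the following base R sketch computes one rank sum per gene. The toy matrix expr, the helper rank.sum, and the choice to sum the ranks of the group labelled 1 are assumptions made for this example; they are not taken from the internals of ebam.wilc.

```r
# Toy data: 4 genes (rows) x 6 samples (columns); the values are made up.
set.seed(1)
expr <- matrix(rnorm(24), nrow = 4,
               dimnames = list(paste0("gene", 1:4), paste0("s", 1:6)))
cl <- c(0, 0, 0, 1, 1, 1)   # 0 = control, 1 = case

# Rank the samples within a gene, then sum the ranks of the case group.
# (rank.sum is a hypothetical helper, not part of siggenes.)
rank.sum <- function(x, cl) sum(rank(x)[cl == 1])
W <- apply(expr, 1, rank.sum, cl = cl)

# Under the null hypothesis the rank sum has mean n1 * (n + 1) / 2.
n1 <- sum(cl == 1)
W.null.mean <- n1 * (length(cl) + 1) / 2
```

With three case samples out of six, every score lies between 6 and 15, and scores far from the null mean of 10.5 point towards differential expression.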
ebam.wilc(data,cl,delta=.9,p0=NA,ties.rand=TRUE,zero.rand=TRUE,gene.names=NULL,
R.fold=TRUE,R.unlog=TRUE,file.out=NA,na.rm=FALSE,rand=NA)
data |
a matrix, data frame or exprSet object containing the data that should be analyzed. Every row of this data set must correspond to a gene, and each column to a sample. |
cl |
a numeric vector of length ncol(data) containing the class
labels of the samples. In the two class paired case, cl can also
be a matrix with ncol(data) rows and 2 columns. If data is
an exprSet object, cl can also be a character string naming the column
of pData(data) that contains the class labels of the samples.
In the two class unpaired case, cl should be a vector containing 0's
(specifying the samples of, e.g., the control group) and 1's (specifying,
e.g., the case group).
In the two class paired case, cl can be either a vector or a matrix.
If it is a vector, then cl has to consist of the integers between -1 and
-n/2 (e.g., before treatment group) and between 1 and n/2 (e.g.,
after treatment group), where n is the length of cl and k
is paired with -k, k=1,...,n/2. If cl is a matrix, one
column should contain -1's and 1's specifying, e.g., the before and the after
treatment samples, respectively, and the other column should contain integers
between 1 and n/2 specifying the n/2 pairs of observations.
For examples of how cl can be specified, see the manual of siggenes. |
delta |
a gene will be called significant if its posterior probability of
being differentially expressed is larger than or equal to delta. |
p0 |
prior probability that a gene is not differentially expressed. If not specified, it will be estimated automatically. |
ties.rand |
if TRUE (default), non-integer expression scores will be randomly
assigned to the next lower or upper integer. Otherwise, they are assigned to
the integer that is closer to the mean. |
zero.rand |
if TRUE (default), the sign of each zero in the computation of
the Wilcoxon signed rank sums will be randomly assigned. If FALSE, the
sign of the zeros will be set to '-'. |
gene.names |
a vector containing the names of the genes. |
R.fold |
if TRUE (default), the fold change for each differentially
expressed gene will be computed. |
R.unlog |
if TRUE (default), the anti-log of data will be used in the computation of the
fold change. This is recommended if data consists of log2 transformed gene expression
data. |
file.out |
if specified, general information (such as the number of significant genes and the estimated FDR) and gene-specific information on the differentially expressed genes (such as their expression scores, q-values and fold changes) are stored in this file. |
na.rm |
if FALSE (default), the fold change of genes with at least one
missing value will be set to NA. If TRUE, missing values will be
replaced by the genewise mean. |
rand |
if specified, the random number generator will be set to a reproducible state. |
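The three layouts of cl described above can be built in base R as follows. This is only a sketch for a hypothetical data set with six samples; the object names (cl.unpaired, cl.paired, cl.mat) and the column names of the matrix are chosen for illustration.

```r
# Two class unpaired: 0 = control, 1 = case (one entry per column of data).
cl.unpaired <- c(0, 0, 0, 1, 1, 1)

# Two class paired, vector form: sample -k is paired with sample k.
n <- 6
cl.paired <- c(-(1:(n / 2)), 1:(n / 2))   # -1 -2 -3  1  2  3

# Two class paired, matrix form: n rows and 2 columns.
cl.mat <- cbind(group = rep(c(-1, 1), each = n / 2),  # before / after treatment
                pair  = rep(1:(n / 2), times = 2))    # identifies the n/2 pairs

# Basic consistency checks on the paired vector.
stopifnot(length(cl.paired) == n,
          setequal(abs(cl.paired), 1:(n / 2)),
          sum(cl.paired) == 0)
```

In both paired forms the order of the entries has to match the order of the columns of data.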
a plot of the expression scores vs. their posterior probability of being differentially expressed and, optionally, a file containing general information (such as the FDR and the number of differentially expressed genes) and gene-specific information on the differentially expressed genes (such as their names, q-values and fold changes).
nsig |
number of significant genes. |
fdr |
estimated FDR. |
ebam.output |
table containing gene-specific information on the differentially expressed genes. |
row.sig.genes |
vector containing the row numbers that belong to the differentially expressed genes. |
... |
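The components listed above are typically used to pull the significant genes out of the original data. The sketch below uses an invented stand-in list named out in place of a real ebam.wilc result, so that it runs without siggenes; only the component names nsig, fdr and row.sig.genes come from this page, and all values are made up.

```r
# Stand-in for the list returned by ebam.wilc(); all values are invented.
set.seed(1)
expr <- matrix(rnorm(50), nrow = 10,
               dimnames = list(paste0("gene", 1:10), NULL))
out <- list(nsig = 3, fdr = 0.05, row.sig.genes = c(2, 5, 9))

# Extract the expression data of the genes that were called significant.
# drop = FALSE keeps a matrix even if only one gene is significant.
sig.expr <- expr[out$row.sig.genes, , drop = FALSE]
rownames(sig.expr)
```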
Holger Schwender, holger.schw@gmx.de
Efron, B., Storey, J.D., Tibshirani, R. (2001). Microarrays, empirical Bayes methods, and the false discovery rate, Technical Report, Department of Statistics, Stanford University.
Storey, J.D., and Tibshirani, R. (2003). Statistical significance for genome-wide experiments, Technical Report, Department of Statistics, Stanford University.
Schwender, H. (2003). Assessing the false discovery rate in a statistical analysis of gene expression data, Chapter 8, Diploma thesis, Department of Statistics, University of Dortmund, http://de.geocities.com/holgerschw/thesis.pdf.
## Not run:
library(multtest)
# Load the data of Golub et al. (1999). data(golub) provides
# a 3051 x 38 gene expression matrix called golub, a vector of
# length 38 called golub.cl that contains the class labels,
# and a matrix called golub.gnames whose third column contains
# the gene names.
data(golub)
# An EBAM-Wilc analysis of the Golub data is performed by
ebam.wilc.out <- ebam.wilc(golub, golub.cl, gene.names = golub.gnames[, 3], rand = 123)
# For further analyses, the row numbers of the differentially expressed
# genes are obtained by
ebam.wilc.out$row.sig.genes
## End(Not run)