| ebam.wilc {siggenes} | R Documentation |
Performs an Empirical Bayes Analysis of Microarrays by using Wilcoxon Rank Sums as expression scores for the genes.
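As a rough illustration of what a Wilcoxon rank sum score is, the following base R sketch ranks the samples within each gene and sums the ranks of one class. The toy matrix, the helper `rank.sum`, and the choice to sum the ranks of class 1 are assumptions made for illustration; this is not the code that ebam.wilc runs internally.

```r
# Toy log2 expression matrix: 4 genes (rows) x 6 samples (columns).
# All values are made up for illustration.
expr <- rbind(
  geneA = c(1.2, 0.8, 1.0, 3.1, 2.9, 3.4),
  geneB = c(2.0, 2.2, 1.9, 2.1, 2.0, 1.8),
  geneC = c(5.0, 4.7, 5.2, 1.1, 0.9, 1.3),
  geneD = c(0.5, 0.7, 0.6, 0.9, 0.4, 0.8)
)
cl <- c(0, 0, 0, 1, 1, 1)  # 0 = control, 1 = case

# Hypothetical helper: rank the samples of one gene and sum the ranks
# of the case group. rank() assigns mid-ranks to tied values, which is
# why a score can be a non-integer (see geneB).
rank.sum <- function(x, cl) sum(rank(x)[cl == 1])

scores <- apply(expr, 1, rank.sum, cl = cl)
scores
```

In this toy example geneB has two tied values, so its score is 9.5. Scores of this kind are what the ties.rand argument refers to.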
ebam.wilc(data,cl,delta=.9,p0=NA,ties.rand=TRUE,zero.rand=TRUE,gene.names=NULL,
R.fold=TRUE,R.unlog=TRUE,file.out=NA,na.rm=FALSE,rand=NA)
data |
the data set that should be analyzed. Every row of this data set must correspond to a gene, and each column to a biological sample. |
cl |
a vector containing the class labels of the samples. In the two class unpaired case,
the label of a sample is either 0 (e.g., control group) or 1 (e.g., case group).
In the two class paired case, the labels are the integers between 1 and n/2
(e.g., before treatment group) and between -1 and -n/2 (e.g., after treatment
group), where n is the length of cl and k is paired with -k. |
delta |
a gene will be called significant if its posterior probability of
being differentially expressed is larger than or equal to delta. |
p0 |
prior probability that a gene is not differentially expressed. If not specified, it will be computed automatically. |
ties.rand |
if TRUE (default), non-integer expression scores will be randomly
assigned to the next lower or upper integer. Otherwise, they are assigned to
the integer that is closer to the mean. |
zero.rand |
if TRUE (default), the sign of each zero difference in the computation of
the Wilcoxon signed rank sums will be randomly assigned. If FALSE, the
sign of the zeros will be set to '-'. |
gene.names |
a vector containing the names of the genes. |
R.fold |
if TRUE (default), the fold change for each differentially
expressed gene will be computed. |
R.unlog |
if TRUE (default), 2^data will be used in the computation of the
fold change. This is recommended if data consists of log2 transformed gene expression
data. |
file.out |
if specified, the results are stored in this file: general information (such as the number of significant genes and the estimated FDR) and gene-specific information on the differentially expressed genes (such as the expression scores, the q-values, and the fold change). |
na.rm |
if FALSE (default), the fold change of genes with at least one
missing value will be set to NA. If TRUE, missing values will be
replaced by the genewise mean. |
rand |
if specified, the random number generator will be set to a reproducible state. |
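The class label conventions and the effect of R.unlog can be sketched in base R as follows. The variable names and data values are made up, and the direction of the fold change (mean of class 1 over mean of class 0) is an assumption, not something this page states.

```r
# Two class unpaired case: 0 = control, 1 = case (3 samples per group).
cl.unpaired <- rep(c(0, 1), each = 3)

# Two class paired case: sample k (e.g., before treatment) is paired
# with sample -k (e.g., after treatment).
n.pairs <- 3
cl.paired <- c(seq_len(n.pairs), -seq_len(n.pairs))
cl.paired

# Fold change for one gene on log2 data, in the spirit of R.unlog = TRUE:
# transform back with 2^x first, then take the ratio of the group means.
# Assumption: the ratio is class 1 over class 0.
x <- c(1, 2, 3, 3, 4, 5)  # made-up log2 expression values
raw <- 2^x
fold <- mean(raw[cl.unpaired == 1]) / mean(raw[cl.unpaired == 0])
fold
```

For these made-up values the ratio of the unlogged group means is 4. Computing the ratio on the log2 values directly would give a different and less interpretable number, which is the reason R.unlog is recommended for log2 transformed data.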
a plot of the expression scores vs. their posterior probability of being differentially expressed, and (optionally) a file containing general information like the FDR and the number of differentially expressed genes and gene-specific information on the differentially expressed genes like their names, their q-values and their fold change.
nsig |
number of significant genes. |
fdr |
estimated FDR. |
ebam.output |
table containing gene-specific information on the differentially expressed genes. |
row.sig.genes |
vector containing the row numbers of the differentially expressed genes. |
... |
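The relation between delta, nsig and row.sig.genes can be sketched as below. The posterior probabilities are hypothetical values typed in by hand; in a real analysis they would come from the EBAM fit, and this sketch does not reproduce how ebam.wilc estimates them.

```r
# Hypothetical posterior probabilities of being differentially
# expressed for 6 genes (made-up values, one per row of the data set).
posterior <- c(0.95, 0.40, 0.90, 0.10, 0.99, 0.89)
delta <- 0.9

# A gene is called significant if its posterior probability is
# larger than or equal to delta.
row.sig.genes <- which(posterior >= delta)
nsig <- length(row.sig.genes)

row.sig.genes
nsig
```

Here rows 1, 3 and 5 pass the threshold, so nsig is 3. The row numbers can then be used to subset the original data set, for example `data[row.sig.genes, ]`.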
Holger Schwender, holger.schw@gmx.de
Efron, B., Storey, J.D., Tibshirani, R. (2001). Microarrays, empirical Bayes methods, and the false discovery rate, Technical Report, Department of Statistics, Stanford University.
Storey, J.D., and Tibshirani, R. (2003). Statistical significance for genome-wide experiments, Technical Report, Department of Statistics, Stanford University.
Schwender, H. (2003). Assessing the false discovery rate in a statistical analysis of gene expression data, Chapter 8, Diploma thesis, Department of Statistics, University of Dortmund, http://de.geocities.com/holgerschw/thesis.pdf.
if (interactive()) {
library(multtest)
# Load the data of Golub et al. (1999). data(golub) contains a 3051x38 gene expression
# matrix called golub, a vector of length 38 called golub.cl that consists of the class labels,
# and a matrix called golub.gnames whose third column contains the gene names.
data(golub)
# An EBAM-Wilc analysis of the Golub data is performed by
ebam.wilc.out<-ebam.wilc(golub,golub.cl,gene.names=golub.gnames[,3],rand=123)
# For further analyses, the row numbers of the differentially expressed genes are obtained by
ebam.wilc.out$row.sig.genes
}