run.plgem {plgem} | R Documentation |
This function automatically performs PLGEM fitting and evaluation, determination of observed and resampled PLGEM-STN values, and selection of differentially expressed genes/proteins (DEG) using the PLGEM method.
run.plgem(esdata, signLev=0.001, rank=100, covariate=1, baselineCondition=1, Iterations="automatic", trimAllZeroRows=FALSE, zeroMeanOrSD=c("replace", "trim"), fitting.eval=TRUE, plotFile=FALSE, writeFiles=FALSE, Verbose=FALSE)
esdata |
an object of class ExpressionSet; see Details for
important information on how the phenoData slot of this object will
be interpreted by the function. |
signLev |
numeric vector; significance level(s) for the DEG selection. Value(s) must be in (0,1). |
rank |
integer (or coercible to integer); the number of
genes or proteins to be selected according to their PLGEM-STN rank. Only
used if the number of available replicates is too small to perform resampling
(see Details). |
covariate |
integer, numeric or character; specifies
the covariate to be used to distinguish the various experimental conditions
from one another. See Details for how to specify the covariate. |
baselineCondition |
integer, numeric or character;
specifies the condition to be treated as the baseline. See Details for how
to specify the baselineCondition. |
Iterations |
number of iterations for the resampling step; if
"automatic" it is automatically determined. |
trimAllZeroRows |
logical; if TRUE, rows in the data set
containing only zero values are trimmed before fitting PLGEM. See the
help page of function plgem.fit for details. |
zeroMeanOrSD |
either NULL or character; what should be
done if a row with non-positive mean or zero standard deviation is
encountered before fitting PLGEM? The current options are
"replace" and "trim". Partial matching is used to switch
between the options, and setting the value to NULL enforces the
default behaviour, i.e. "replace". See the help page of
function plgem.fit for details. |
fitting.eval |
logical; if TRUE, the fit is evaluated by
generating a diagnostic plot. |
plotFile |
logical; if TRUE, the generated plot is written to a
file. |
writeFiles |
logical; if TRUE, the generated list of DEG is
written to disk file(s). |
Verbose |
logical; if TRUE, progress comments are printed while
running. |
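The partial matching and NULL handling described for zeroMeanOrSD can be illustrated with base R's match.arg(). The helper name resolve_zero_option is hypothetical and is not part of plgem; this is only a sketch of the documented behaviour, not the package's own code.

```r
# Hypothetical helper (not part of plgem): resolves the zeroMeanOrSD option
# the way the argument description above reads.
resolve_zero_option <- function(zeroMeanOrSD = c("replace", "trim")) {
  # match.arg() performs partial matching against the choices and returns
  # the first choice when the argument is NULL or left at its default.
  match.arg(zeroMeanOrSD, choices = c("replace", "trim"))
}

resolve_zero_option()       # "replace" (default)
resolve_zero_option("tr")   # "trim" (partial match)
resolve_zero_option(NULL)   # "replace" (NULL falls back to the default)
```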
The phenoData slot of the ExpressionSet given as input is expected to contain the necessary information to distinguish the various experimental conditions from one another. The columns of the pData are referred to as ‘covariates’. There has to be at least one covariate defined in the input ExpressionSet. The sample attributes according to this covariate must be distinct for samples that are to be treated as distinct experimental conditions and identical for samples that are to be treated as replicates.
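As a small illustration of how one covariate encodes both conditions and replicates, the sketch below uses a plain data frame standing in for the pData of an ExpressionSet. The column name conditionName and the sample and condition labels are made up for this example.

```r
# Mock sample annotation standing in for pData(esdata); all names are
# illustrative assumptions, not taken from a real data set.
pd <- data.frame(
  conditionName = c("ctrl", "ctrl", "ctrl", "LPS_2h", "LPS_2h", "LPS_4h"),
  row.names = paste0("sample", 1:6),
  stringsAsFactors = FALSE
)

# Samples sharing a value are replicates; distinct values are distinct conditions.
conditions <- unique(as.character(pd[, "conditionName"]))
replicates <- table(factor(pd$conditionName, levels = conditions))
print(replicates)
```

Here the three "ctrl" samples are replicates of one condition, while "LPS_2h" and "LPS_4h" are treated as two further conditions with two and one samples, respectively.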
There are two ways to specify the covariate. If an integer or a numeric is given, it is taken as the covariate number (in the order in which the covariates appear in the colnames of the pData). If a character is given, it is taken as the covariate name itself (as it appears in the colnames of the pData). By default, the first covariate appearing in the colnames of the pData is used.
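The covariate lookup rule can be sketched in base R as follows. The function resolve_covariate and the example columns are hypothetical; the package's internal implementation may differ.

```r
# Hypothetical helper (not part of plgem): maps a covariate given by number
# or by name to a column name of a pData-like data frame.
resolve_covariate <- function(pd, covariate = 1) {
  if (is.character(covariate)) {
    # A character value is taken as the covariate name itself.
    if (!covariate %in% colnames(pd)) stop("covariate '", covariate, "' not found")
    return(covariate)
  }
  # A numeric value is taken as the position within colnames(pd).
  idx <- as.integer(covariate)
  if (is.na(idx) || idx < 1 || idx > ncol(pd)) stop("covariate number out of range")
  colnames(pd)[idx]
}

# Illustrative annotation table with two covariates.
pd <- data.frame(treatment = c("ctrl", "ctrl", "LPS", "LPS"),
                 batch = c("a", "b", "a", "b"))

resolve_covariate(pd)           # "treatment" (default: first covariate)
resolve_covariate(pd, 2)        # "batch"
resolve_covariate(pd, "batch")  # "batch"
```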
Similarly, there are two ways to specify which experimental condition to treat as the baseline. The available ‘condition names’ are taken from unique(as.character(pData(esdata)[, covariate])). If baselineCondition is given as a character, it is taken as the condition name itself. If baselineCondition is given as an integer or a numeric value, it is taken as the condition number (in the order of appearance in the ‘condition names’). By default, the first condition name is used.
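A minimal base R sketch of the baseline lookup is given below. It shows that condition numbers follow order of appearance rather than alphabetical order. The function resolve_baseline and the sample table are assumptions made for illustration.

```r
# Hypothetical helper (not part of plgem): resolves the baseline condition
# from a pData-like data frame and an already chosen covariate.
resolve_baseline <- function(pd, covariate, baselineCondition = 1) {
  # Condition names in order of first appearance.
  conditionNames <- unique(as.character(pd[, covariate]))
  if (is.character(baselineCondition)) {
    if (!baselineCondition %in% conditionNames) stop("unknown condition name")
    return(baselineCondition)
  }
  idx <- as.integer(baselineCondition)
  if (is.na(idx) || idx < 1 || idx > length(conditionNames)) {
    stop("condition number out of range")
  }
  conditionNames[idx]
}

# "LPS" appears first, so it is condition number 1 even though "ctrl"
# sorts first alphabetically.
pd <- data.frame(treatment = c("LPS", "ctrl", "LPS", "ctrl"))

resolve_baseline(pd, "treatment")          # "LPS"
resolve_baseline(pd, "treatment", 2)       # "ctrl"
resolve_baseline(pd, "treatment", "ctrl")  # "ctrl"
```

If the control samples are not listed first in the pData, passing the baseline by name avoids picking the wrong condition.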
The model is fitted on the most replicated condition. When several conditions share the maximum number of replicates, the one providing the best fit (highest adjusted r^2) is chosen. If a tie remains, the first of the tied conditions is taken arbitrarily.
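The choice of the fitting condition can be sketched as below. The function choose_fit_condition is hypothetical, and the adjusted r^2 values are invented numbers supplied by hand; in practice they would come from fitting the model on each candidate condition.

```r
# Hypothetical helper (not part of plgem): picks the condition to fit on.
# replicates: named vector of replicate counts per condition.
# adjR2:      named vector of adjusted r^2 per condition (made-up values here).
choose_fit_condition <- function(replicates, adjR2) {
  # Keep only the conditions with the maximum number of replicates.
  candidates <- names(replicates)[replicates == max(replicates)]
  if (length(candidates) == 1) return(candidates)
  # Among those, take the best fit; which.max() returns the first maximum,
  # so a remaining tie resolves to the first condition.
  candidates[which.max(adjR2[candidates])]
}

replicates <- c(ctrl = 3, LPS_2h = 3, LPS_4h = 2)
adjR2 <- c(ctrl = 0.91, LPS_2h = 0.95, LPS_4h = 0.99)

choose_fit_condition(replicates, adjR2)  # "LPS_2h"
```

"LPS_4h" has the highest adjusted r^2 in this toy input but is never considered, because it has fewer replicates than the other two conditions.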
If fewer than 3 replicates are provided for the condition used for fitting, the selection is based on ranking according to the observed PLGEM-STN values. In this case the first rank genes or proteins are selected for each comparison.
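The rank-based fallback can be sketched as follows. Ranking by the absolute PLGEM-STN value is an assumption of this example (so that strong changes in either direction rank first); the text above does not state the exact ordering criterion. The function select_by_rank and the STN values are hypothetical.

```r
# Hypothetical helper (not part of plgem): selects the top `rank` features
# for one comparison from a named vector of observed STN values.
select_by_rank <- function(stn, rank = 100) {
  rank <- min(as.integer(rank), length(stn))
  # Assumption: order by absolute STN, largest first.
  names(stn)[order(abs(stn), decreasing = TRUE)][seq_len(rank)]
}

# Made-up STN values for a single comparison.
stn <- c(geneA = 0.4, geneB = -5.2, geneC = 2.9, geneD = -0.1)

select_by_rank(stn, rank = 2)  # "geneB" "geneC"
```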
Otherwise, DEG are selected by comparing the observed and resampled PLGEM-STN values at the signLev significance level(s), based on p-values obtained via a call to function plgem.pValue. See References for details.
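Given p-values for one comparison, the thresholding at several significance levels can be sketched like this. The function select_by_pvalue, the strict "<" comparison, the naming of the result by level, and the p-values themselves are all assumptions for illustration; the actual p-values come from plgem.pValue.

```r
# Hypothetical helper (not part of plgem): selects features at each
# significance level from a named vector of p-values.
select_by_pvalue <- function(pval, signLev = 0.001) {
  # signLev values must lie in the open interval (0, 1).
  stopifnot(all(signLev > 0 & signLev < 1))
  out <- lapply(signLev, function(level) names(pval)[pval < level])
  names(out) <- as.character(signLev)
  out
}

# Made-up p-values for a single comparison.
pval <- c(geneA = 0.2, geneB = 0.0004, geneC = 0.008, geneD = 0.6)

deg <- select_by_pvalue(pval, signLev = c(0.001, 0.01))
deg[["0.001"]]  # "geneB"
deg[["0.01"]]   # "geneB" "geneC"
```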
A list of four elements:
fit |
the plgemFit obtained from the model fitting step (see plgem.fit for details). |
PLGEM.STN |
a matrix of observed PLGEM-STN values (see
plgem.obsStn for details). |
p-value |
a matrix of p-values (see plgem.pValue
for details). |
significant |
a list with a number of elements equal to the number
of different significance levels (signLev) used as input. If the ranking
method is used due to an insufficient number of replicates (see Details), this
list will be of length 1 and named firstXXX, where XXX is the
number provided by argument rank. Each element of this list is again a
list, whose number of elements corresponds to the number of performed
comparisons (i.e. the number of conditions in the starting
ExpressionSet minus the baseline). Each of these second-level elements
is a character vector of the names of the genes/proteins that passed the
statistical test at the corresponding significance level. |
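The nested layout of the returned list can be explored as in the sketch below, which builds a small mock object by hand. The element names "0.001", "0.01" and "LPS_vs_ctrl" are placeholders chosen for this example; the names produced by run.plgem may differ.

```r
# Mock object mimicking the four-element return value; all contents and
# the level/comparison names are illustrative assumptions.
genes <- c("geneA", "geneB", "geneC")
mockResult <- list(
  fit = list(),
  PLGEM.STN = matrix(c(0.4, -5.2, 2.9), ncol = 1,
                     dimnames = list(genes, "LPS_vs_ctrl")),
  `p-value` = matrix(c(0.2, 0.0004, 0.008), ncol = 1,
                     dimnames = list(genes, "LPS_vs_ctrl")),
  significant = list(
    `0.001` = list(LPS_vs_ctrl = "geneB"),
    `0.01`  = list(LPS_vs_ctrl = c("geneB", "geneC"))
  )
)

# "p-value" is not a syntactic name, so use [[ ]] or backticks to access it.
pvals <- mockResult[["p-value"]]

# Number of selected features per significance level and comparison.
degCounts <- lapply(mockResult$significant, lengths)
degCounts[["0.01"]][["LPS_vs_ctrl"]]  # 2
```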
Mattia Pelizzola mattia.pelizzola@gmail.com
Norman Pavelka nxp@stowers.org
Pavelka N, Pelizzola M, Vizzardelli C, Capozzoli M, Splendiani A, Granucci F, Ricciardi-Castagnoli P. A power law global error model for the identification of differentially expressed genes in microarray data. BMC Bioinformatics. 2004 Dec 17; 5:203; http://www.biomedcentral.com/1471-2105/5/203.
Pavelka N, Fournier ML, Swanson SK, Pelizzola M, Ricciardi-Castagnoli P, Florens L, Washburn MP. Statistical similarities between transcriptomics and quantitative shotgun proteomics data. Mol Cell Proteomics. 2008 Apr; 7(4):631-44; http://www.mcponline.org/cgi/content/abstract/7/4/631.
plgem.fit, plgem.obsStn, plgem.resampledStn, plgem.pValue, plgem.deg, plgem.write.summary
data(LPSeset)
set.seed(123)
LPSdegList <- run.plgem(esdata=LPSeset, fitting.eval=FALSE)