qpPAC {qpgraph} | R Documentation |
Estimates partial correlation coefficients (PACs) and their corresponding P-values in a given undirected graph, from an input data set.
## S4 method for signature 'ExpressionSet':
qpPAC(data, g, return.K=FALSE, long.dim.are.variables=TRUE, verbose=TRUE, R.code.only=FALSE)

## S4 method for signature 'data.frame':
qpPAC(data, g, return.K=FALSE, long.dim.are.variables=TRUE, verbose=TRUE, R.code.only=FALSE)

## S4 method for signature 'matrix':
qpPAC(data, g, return.K=FALSE, long.dim.are.variables=TRUE, verbose=TRUE, R.code.only=FALSE)
data |
data set from which to estimate the partial correlation coefficients. It can be an ExpressionSet object, a data frame or a matrix. |
g |
either a graphNEL object or an incidence matrix of the given undirected graph. |
return.K |
logical; if TRUE this function also returns the concentration matrix K; if FALSE it does not return it (default). |
long.dim.are.variables |
logical; if TRUE it is assumed that, when the data are in a data frame or in a matrix, the longer dimension is the one defining the random variables (default); if FALSE, the random variables are assumed to be in the columns of the data frame or matrix. |
verbose |
logical; if TRUE, progress on the calculations is shown (default). |
R.code.only |
logical; if FALSE then the faster C implementation is used (default); if TRUE then only R code is executed. |
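The effect of long.dim.are.variables can be pictured with the following base R sketch. The helper orient_data() is hypothetical and is not part of qpgraph; it only illustrates one reading of the argument description, under the assumption that the goal is to end up with variables in the columns.

```r
# Hypothetical helper (NOT a qpgraph function): returns the data as a
# matrix with observations in rows and random variables in columns.
orient_data <- function(x, long.dim.are.variables = TRUE) {
  x <- as.matrix(x)
  # When the longer dimension defines the variables and that dimension is
  # the rows, transpose so that the variables end up in the columns.
  if (long.dim.are.variables && nrow(x) > ncol(x)) {
    x <- t(x)
  }
  x
}

# Simulated data: 50 genes (rows) measured in 30 experiments (columns)
expr <- matrix(rnorm(50 * 30), nrow = 50, ncol = 30)

dim(orient_data(expr))                                  # 30 50
dim(orient_data(expr, long.dim.are.variables = FALSE))  # 50 30
```

With the default, the 50 genes are treated as the variables; with FALSE, the columns (here the 30 experiments) are taken as the variables regardless of their number.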
The estimation of PACs requires that the sample size n is strictly larger than the number of variables p. In the context of microarray data and regulatory networks, genes play the role of variables and thus normally p >> n. For this reason, we can estimate PACs from the edges of a regulatory network represented by an undirected graph G if and only if the maximum clique size of the graph, noted w(G), is strictly smaller than the sample size n (number of experiments in the microarray data context).
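The following base R sketch illustrates why the sample size matters. It uses simulated data and assumed sizes (n = 10, p = 20, a clique of 4 variables), none of which come from the package: with p > n the full sample covariance matrix is rank deficient, while the block corresponding to a set of variables smaller than n can still be inverted.

```r
set.seed(1)
n <- 10   # number of observations (assumed for illustration)
p <- 20   # number of variables, with p > n

X <- matrix(rnorm(n * p), nrow = n, ncol = p)
S <- cov(X)   # p x p sample covariance matrix

# The rank of S is at most n - 1, which is smaller than p,
# so the full matrix S cannot be inverted.
qr(S)$rank

# A set of variables smaller than n (e.g. a clique of size 4) yields a
# covariance block that can be inverted.
clique <- 1:4
K.clique <- solve(S[clique, clique])
dim(K.clique)
```

This is only a sketch of the intuition behind the requirement w(G) < n; it does not reproduce how the package itself carries out the estimation.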
In the context of this package, the undirected graph should correspond to a qp-graph (see function qpGraph) that we have selected by thresholding on the (average) non-rejection rate calculated from this same data set using the functions qpNrr or qpAvgNrr. If the resulting graph is sparse enough, we may have a chance of meeting the requirement w(G) < n, and the function qpClique can be useful to investigate this. In the context of transcriptional regulatory networks, we may consider removing the edges between non-transcription factor genes, which will substantially increase the sparseness of the network.
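Removing the edges between non-transcription factor genes can be sketched in base R on a logical adjacency matrix. The gene names and the set of transcription factors below are made up for illustration, and the matrix is built by hand rather than obtained from qpGraph.

```r
# Toy example: 6 hypothetical genes, of which g1 and g2 are assumed to be
# transcription factors (TFs).
genes <- paste0("g", 1:6)
tfs   <- c("g1", "g2")

# Start from a complete undirected graph as a symmetric logical matrix.
A <- matrix(TRUE, nrow = 6, ncol = 6, dimnames = list(genes, genes))
diag(A) <- FALSE

# Drop every edge whose two endpoints are both non-TF genes.
nonTF <- setdiff(genes, tfs)
A[nonTF, nonTF] <- FALSE

# Number of remaining edges: 1 TF-TF edge plus 2 * 4 TF-nonTF edges
sum(A[upper.tri(A)])   # 9
```

In this toy graph the largest clique shrinks from all 6 genes to 3 (the two TFs plus any single non-TF gene), which shows how this kind of pruning helps to bring the maximum clique size below the sample size.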
The PAC estimation is done by first obtaining a maximum likelihood estimate of the sample covariance matrix of the input data set using the qpIPF function. The P-values are then calculated from the estimated standard errors of the edges, following the procedure by Roverato and Whittaker (1996).
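For reference, the standard relationship between a concentration matrix K (the inverse of a covariance matrix) and the partial correlations is rho_ij = -k_ij / sqrt(k_ii * k_jj). The base R sketch below applies it to the plain sample covariance of simulated data with n > p, that is, to the saturated model with no graph constraints; it is not the qpIPF-based estimate computed by qpPAC and it does not compute P-values.

```r
set.seed(42)
n <- 100   # observations (assumed for illustration)
p <- 4     # variables, with n > p so that the covariance is invertible

X <- matrix(rnorm(n * p), nrow = n, ncol = p)
S <- cov(X)     # sample covariance matrix
K <- solve(S)   # concentration matrix

# cov2cor() scales K to k_ij / sqrt(k_ii * k_jj); changing the sign of the
# result gives the partial correlations off the diagonal.
R <- -cov2cor(K)
diag(R) <- 1

round(R, 3)
```

Each off-diagonal entry of R is the correlation between two variables after adjusting for all the remaining ones.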
A list with two matrices, one with the estimates of the PACs and the other with their P-values.
R. Castelo and A. Roverato
Castelo, R. and Roverato, A. A robust procedure for Gaussian graphical model search from microarray data with p larger than n. J. Mach. Learn. Res., 7:2621-2650, 2006.
Castelo, R. and Roverato, A. Reverse engineering molecular regulatory networks from microarray data with qp-graphs. J. Comp. Biol., accepted, 2008.
Roverato, A. and Whittaker, J. Standard errors for the parameters of graphical Gaussian models. Stat. Comput., 6:297-302, 1996.
qpGraph
qpCliqueNumber
qpClique
qpGetCliques
qpIPF
nVar <- 50   # number of variables
maxCon <- 5  # maximum connectivity per variable
nObs <- 30   # number of observations to simulate

I <- qpRndGraph(n.vtx=nVar, n.bd=maxCon)
K <- qpI2K(I)
X <- qpSampleMvnorm(K, nObs)

nrr.estimates <- qpNrr(X, verbose=FALSE)
g <- qpGraph(nrr.estimates, 0.5)
pac.estimates <- qpPAC(X, g=g, verbose=FALSE)

# estimated partial correlation coefficients of the present edges
summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & I]))

# estimated partial correlation coefficients of the missing edges
summary(abs(pac.estimates$R[upper.tri(pac.estimates$R) & !I]))