| mergeComplexes {apComplex} | R Documentation |
Repeatedly applies the function LCdelta to combine columns of the affiliation matrix that represents the protein complex membership graph (PCMG) for AP-MS data.
mergeComplexes(bhmax, adjMat, VBs = NULL, VPs = NULL, simMat = NULL, sensitivity = .75, specificity = .995, Beta = 0, commonFrac = 2/3, wsVal = 2e7)
bhmax |
Initial complex estimates returned by bhmaxSubgraph. |
adjMat |
Adjacency matrix of bait-hit data from an AP-MS experiment. Rows correspond to baits and columns to hits. |
VBs |
An optional vector of viable baits. |
VPs |
An optional vector of viable prey. |
simMat |
An optional square matrix with entries between 0 and 1. Rows and columns correspond to the proteins in the experiment, and should be reported in the same order as the columns of adjMat. Higher values in this matrix are interpreted to mean higher similarity for protein pairs. |
sensitivity |
Believed sensitivity of AP-MS technology. |
specificity |
Believed specificity of AP-MS technology. |
Beta |
Optional additional parameter giving the weight of the data in simMat in the logistic regression model. |
commonFrac |
The fraction of baits that must overlap for a complex combination to be considered. |
wsVal |
A numeric value passed as the workspace argument in the call to fisher.test. |
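The wsVal argument corresponds to the workspace argument of base R's fisher.test, which sizes the workspace of the network algorithm used for tables larger than 2 by 2. The following is a minimal sketch of that argument only. The 3 by 3 table is made up for illustration, and the sketch does not reproduce how mergeComplexes builds its tables internally.

```r
# Made-up 3 x 3 contingency table (illustrative counts, not AP-MS data).
tab <- matrix(c(3, 1, 0,
                1, 3, 1,
                0, 1, 3),
              nrow = 3,
              dimnames = list(rowGroup = c("a", "b", "c"),
                              colGroup = c("x", "y", "z")))

# workspace is given in units of 4 bytes; 2e7 mirrors the default of wsVal.
res <- fisher.test(tab, workspace = 2e7)

# Exact p-value for independence of rows and columns.
res$p.value
```

If fisher.test stops with an error asking for a larger workspace on a big table, raising this value is the usual remedy, at the cost of more memory.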
The local modeling algorithm for AP-MS data described by Scholtens and
Gentleman (2004) and Scholtens, Vidal, and Gentleman (2005) uses a
two-component measure of protein complex estimate quality, namely P=LxC.
Columns in cMat, the affiliation matrix, represent individual complex
estimates. The algorithm starts with a maximal BH-complete subgraph
estimate of cMat and then improves the estimate by combining complexes
such that P=LxC increases.
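To make the idea of combining columns concrete, here is a toy sketch in base R. It shows only the union step on a made-up affiliation matrix. The helper combineColumns is hypothetical and is not part of apComplex, and the decision whether a combination actually increases P=LxC is left to LCdelta and is not reproduced here.

```r
# Toy affiliation matrix: rows are proteins, columns are complex estimates.
cMat <- matrix(c(1, 1, 1, 0, 0, 0,   # C1 = P1, P2, P3
                 0, 1, 1, 1, 0, 0,   # C2 = P2, P3, P4
                 0, 0, 0, 0, 1, 1),  # C3 = P5, P6
               nrow = 6,
               dimnames = list(paste0("P", 1:6), c("C1", "C2", "C3")))

# Hypothetical helper (an assumption for illustration): replace columns i and j
# with a single column holding the union of their members.
combineColumns <- function(cMat, i, j) {
  unionCol <- as.numeric(cMat[, i] | cMat[, j])
  out <- cbind(cMat[, -c(i, j), drop = FALSE], unionCol)
  colnames(out)[ncol(out)] <- paste(colnames(cMat)[c(i, j)], collapse = "+")
  out
}

# Combine C1 and C2; C3 is carried over unchanged.
merged <- combineColumns(cMat, 1, 2)
merged
```

In this sketch the combined matrix has one column fewer than the input, which is the sense in which repeated combinations shrink the set of complex estimates.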
By default commonFrac is set relatively high at 2/3. This means
that some potentially reasonable complex combinations could be missed. For
smaller data sets, users may consider decreasing the fraction. For larger
data sets, decreasing the fraction may cause a large increase in computation time.
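The effect of the commonFrac threshold can be illustrated with a small base R sketch. The helper baitOverlap is hypothetical, and its denominator (the number of baits in the complex with fewer baits) is an assumption made for illustration. The exact overlap definition used inside mergeComplexes is not stated here and may differ.

```r
# Hypothetical helper: share of baits common to two complexes, relative to the
# complex with fewer baits. The choice of denominator is an assumption.
baitOverlap <- function(complexA, complexB, baits) {
  baitsA <- intersect(complexA, baits)
  baitsB <- intersect(complexB, baits)
  denom <- min(length(baitsA), length(baitsB))
  if (denom == 0) return(0)
  length(intersect(baitsA, baitsB)) / denom
}

# Made-up protein names for illustration.
baits    <- c("P1", "P2", "P3", "P5")
complexA <- c("P1", "P2", "P3", "P4")
complexB <- c("P2", "P5", "P6")

frac <- baitOverlap(complexA, complexB, baits)

# Under this definition the pair is skipped at the default threshold of 2/3
# but would be considered if the threshold were lowered to 0.5.
frac >= 2/3
frac >= 0.5
```

Lowering the threshold admits more candidate pairs like this one, which is why computation time can grow quickly on larger data sets.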
A list of character vectors containing the names of the proteins in the estimated complexes.
Denise Scholtens
Scholtens D and Gentleman R. Making sense of high-throughput protein-protein interaction data. Statistical Applications in Genetics and Molecular Biology 3, Article 39 (2004).
Scholtens D, Vidal M, and Gentleman R. Local modeling of global interactome networks. Bioinformatics 21, 3548-3557 (2005).
data(apEX)
PCMG0 <- bhmaxSubgraph(apEX)
PCMG1 <- mergeComplexes(PCMG0, apEX, sensitivity=.7, specificity=.75)