crlmm {crlmm} | R Documentation |
This is a faster and more efficient implementation of the CRLMM algorithm, designed specifically for Affymetrix SNP 5 and 6 arrays (soon to be extended to other platforms).
crlmm(filenames, row.names=TRUE, col.names=TRUE,
      probs=c(1/3, 1/3, 1/3), DF=6, SNRMin=5, gender=NULL,
      save.it=FALSE, load.it=FALSE, intensityFile,
      mixtureSampleSize=10^5, eps=0.1, verbose=TRUE,
      cdfName, sns, recallMin=10, recallRegMin=1000,
      returnParams=FALSE, badSNP=0.7)
filenames |
'character' vector with CEL files to be genotyped. |
row.names |
'logical'. Use rownames - SNP names? |
col.names |
'logical'. Use colnames - Sample names? |
probs |
'numeric' vector with priors for AA, AB and BB. |
DF |
'integer' with number of degrees of freedom to use with t-distribution. |
SNRMin |
'numeric' scalar defining the minimum SNR used to filter out samples. |
gender |
'integer' vector, with the same length as 'filenames', defining sex (1 - male; 2 - female). |
save.it |
'logical'. Save preprocessed data? |
load.it |
'logical'. Load preprocessed data to speed up analysis? |
intensityFile |
'character' with filename to be saved/loaded - preprocessed data. |
mixtureSampleSize |
Number of SNPs to be used with the mixture model. |
eps |
Minimum change (convergence tolerance) for the mixture model. |
verbose |
'logical'. Print progress messages? |
cdfName |
'character' defining the CDF name to use ('GenomeWideSnp5', 'GenomeWideSnp6') |
sns |
'character' vector with sample names to be used. |
recallMin |
Minimum number of samples for recalibration. |
recallRegMin |
Minimum number of SNPs for regression. |
returnParams |
'logical'. Return recalibrated parameters. |
badSNP |
'numeric'. Threshold to flag as bad SNP (affects batchQC) |
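The constraints stated in the argument descriptions above can be checked before starting a long genotyping run. The following is a minimal sketch in base R, not part of crlmm: the helper name check_crlmm_args is hypothetical, the file names are placeholders, and the requirement that 'probs' sums to 1 is an assumption based on its description as a vector of priors.

```r
# Hypothetical helper (not part of crlmm): sanity-check a few arguments
# against their documented descriptions before calling crlmm().
check_crlmm_args <- function(filenames, probs = c(1/3, 1/3, 1/3), gender = NULL) {
  # filenames: non-empty character vector of CEL files.
  stopifnot(is.character(filenames), length(filenames) > 0)
  # probs: priors for AA, AB and BB; assumed here to be three
  # non-negative values that sum to 1.
  stopifnot(is.numeric(probs), length(probs) == 3, all(probs >= 0),
            isTRUE(all.equal(sum(probs), 1)))
  # gender: when supplied, one entry per file, coded 1 (male) or 2 (female).
  if (!is.null(gender)) {
    stopifnot(length(gender) == length(filenames), all(gender %in% c(1, 2)))
  }
  invisible(TRUE)
}

cels <- c("sample1.CEL", "sample2.CEL")  # placeholder file names
check_crlmm_args(cels, gender = c(1L, 2L))
```

A failed check stops with an error naming the violated condition, so a mismatch such as a 'gender' vector shorter than 'filenames' is caught before any CEL file is read.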
A SnpSet object.
calls |
Genotype calls (1 - AA, 2 - AB, 3 - BB) |
confs |
Confidence scores 'round(-1000*log2(1-p))' |
SNPQC |
SNP Quality Scores |
batchQC |
Batch Quality Score |
params |
Recalibrated parameters |
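The confidence encoding 'round(-1000*log2(1-p))' can be inverted to recover an approximate probability from a stored score. The sketch below uses base R only; the function names prob_to_conf and conf_to_prob are hypothetical, and p is assumed to be the probability attached to the genotype call.

```r
# Encode a probability p as a confidence score, using the documented formula.
prob_to_conf <- function(p) round(-1000 * log2(1 - p))

# Invert the encoding. The result is exact only up to the rounding
# applied when the score was created.
conf_to_prob <- function(conf) 1 - 2^(-conf / 1000)

p <- c(0.5, 0.75, 0.99)
scores <- prob_to_conf(p)   # 0.5 maps to 1000 and 0.75 maps to 2000
scores
conf_to_prob(scores)        # close to the original p
```

Each additional 1000 points halves the remaining error probability 1-p, and p = 1 maps to an infinite score under this formula.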
Carvalho B, Bengtsson H, Speed TP, Irizarry RA. Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics. 2007 Apr;8(2):485-99. Epub 2006 Dec 22. PMID: 17189563.
Carvalho B, Louis TA, Irizarry RA. Describing Uncertainty in Genome-wide Genotype Calling. (in prep)
## this can be slow
if (require(genomewidesnp5Crlmm) && require(hapmapsnp5)){
  path <- system.file("celFiles", package="hapmapsnp5")
  ## the filenames with full path...
  ## very useful when genotyping samples not in the working directory
  cels <- list.celfiles(path, full.names=TRUE)
  (crlmmOutput <- crlmm(cels))
}