crlmm {crlmm}    R Documentation

Genotype oligonucleotide arrays with CRLMM

Description

This is a faster and more efficient implementation of the CRLMM algorithm, designed specifically for Affymetrix SNP 5 and 6 arrays (support for other platforms is planned).

Usage

crlmm(filenames, row.names=TRUE, col.names=TRUE,
      probs=c(1/3, 1/3, 1/3), DF=6, SNRMin=5,
      gender=NULL, save.it=FALSE, load.it=FALSE,
      intensityFile, mixtureSampleSize=10^5,
      eps=0.1, verbose=TRUE, cdfName, sns, recallMin=10,
      recallRegMin=1000, returnParams=FALSE, badSNP=0.7)

Arguments

filenames 'character' vector with CEL files to be genotyped.
row.names 'logical'. Use SNP names as row names?
col.names 'logical'. Use sample names as column names?
probs 'numeric' vector with priors for AA, AB and BB.
DF 'integer' with the number of degrees of freedom to use with the t-distribution.
SNRMin 'numeric' scalar defining the minimum SNR used to filter out samples.
gender 'integer' vector, with the same length as 'filenames', defining sex (1 - male; 2 - female).
save.it 'logical'. Save preprocessed data?
load.it 'logical'. Load preprocessed data to speed up analysis?
intensityFile 'character' with the name of the file to which preprocessed data is saved or from which it is loaded.
mixtureSampleSize Number of SNPs to be used with the mixture model.
eps Minimum change for mixture model.
verbose 'logical'.
cdfName 'character' defining the CDF name to use ('GenomeWideSnp5' or 'GenomeWideSnp6').
sns 'character' vector with sample names to be used.
recallMin Minimum number of samples for recalibration.
recallRegMin Minimum number of SNPs for regression.
returnParams 'logical'. Return recalibrated parameters.
badSNP 'numeric'. Threshold used to flag a SNP as bad (affects batchQC).
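
The 'gender' and 'probs' arguments are easy to get wrong, so it may help to prepare and check them before genotyping. The sketch below is illustrative only: 'sex_labels' is a hypothetical annotation vector, not something crlmm provides, and the check on 'probs' assumes the priors should be non-negative and sum to 1, which this page does not state explicitly.

```r
# Hypothetical sample annotation, one entry per CEL file (an assumption
# for illustration; in practice this comes from your own metadata).
sex_labels <- c("male", "female", "female", "male")

# Map the labels to the integer codes described above (1 - male; 2 - female).
# match() returns an integer vector, as 'gender' requires.
gender <- match(sex_labels, c("male", "female"))

# Priors for AA, AB and BB; check that they look like valid probabilities.
probs <- c(1/3, 1/3, 1/3)
stopifnot(length(probs) == 3,
          all(probs >= 0),
          isTRUE(all.equal(sum(probs), 1)))

# The prepared values could then be passed on, for example:
# crlmm(cels, gender = gender, probs = probs)
```

Unrecognized labels become NA under match(), so checking 'anyNA(gender)' before calling crlmm is a cheap safeguard.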

Value

A SnpSet object.

calls Genotype calls (1 - AA, 2 - AB, 3 - BB)
confs Confidence scores 'round(-1000*log2(1-p))'
SNPQC SNP Quality Scores
batchQC Batch Quality Score
params Recalibrated parameters
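
The confidence scores are stored on a transformed integer scale, so it can be convenient to map them back to probabilities. The sketch below simply inverts the formula given for 'confs'. The helper names 'prob_to_conf' and 'conf_to_prob' are hypothetical and not part of crlmm, and reading 'p' as the probability attached to a call is an assumption, since this page does not define it.

```r
# Forward transform, exactly as stated for 'confs' above.
prob_to_conf <- function(p) round(-1000 * log2(1 - p))

# Inverse transform: solve conf = -1000 * log2(1 - p) for p.
# Rounding in the forward step makes this approximate.
conf_to_prob <- function(conf) 1 - 2^(-conf / 1000)

prob_to_conf(0.5)    # 1000
conf_to_prob(1000)   # 0.5

# A round trip recovers the probability up to the rounding error.
p <- c(0.9, 0.99, 0.999)
conf_to_prob(prob_to_conf(p))
```

Because of the rounding, two probabilities that differ by a very small amount can map to the same score.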

References

Carvalho B, Bengtsson H, Speed TP, Irizarry RA. Exploration, normalization, and genotype calls of high-density oligonucleotide SNP array data. Biostatistics. 2007 Apr;8(2):485-99. Epub 2006 Dec 22. PMID: 17189563.

Carvalho B, Louis TA, Irizarry RA. Describing Uncertainty in Genome-wide Genotype Calling. (in prep)

Examples

## this can be slow
if (require(genomewidesnp5Crlmm) && require(hapmapsnp5)){
  path <- system.file("celFiles", package="hapmapsnp5")

  ## the filenames with full path...
  ## very useful when genotyping samples not in the working directory
  cels <- list.celfiles(path, full.names=TRUE)
  (crlmmOutput <- crlmm(cels))
}

[Package crlmm version 1.2.4 Index]