getGenotype {SeqVarTools}    R Documentation

Get genotype data

Description

Get matrix of genotype values from a GDS object

Usage

## S4 method for signature 'SeqVarGDSClass'
getGenotype(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass'
getGenotypeAlleles(gdsobj, use.names=TRUE, sort=FALSE)
## S4 method for signature 'SeqVarGDSClass'
refDosage(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass'
altDosage(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass'
expandedAltDosage(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass,numeric'
alleleDosage(gdsobj, n=0, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass,list'
alleleDosage(gdsobj, n, use.names=TRUE)

Arguments

gdsobj

A SeqVarGDSClass object with VCF data.

use.names

A logical indicating whether to assign sample and variant IDs as dimnames of the resulting matrix.

sort

A logical indicating whether to sort alleles lexicographically ("G/T" instead of "T/G").

n

An integer, vector, or list indicating which allele(s) to return dosage for. n=0 is the reference allele, n=1 is the first alternate allele, and so on.

Details

In getGenotype, genotypes are coded as in the VCF file, where "0/0" is homozygous reference, "0/1" is heterozygous for the first alternate allele, "0/2" is heterozygous for the second alternate allele, etc.

Separators are "/" for unphased and "|" for phased. If sort=TRUE, all returned genotypes will be unphased. Missing genotypes are coded as NA.
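
The effect of sorting on the separators can be illustrated with a minimal base R sketch. The helper sortGenotype is hypothetical and is not part of SeqVarTools; it only mimics the documented behaviour on plain genotype strings.

```r
# Hypothetical helper (not a SeqVarTools function): sort the two alleles of
# each genotype string and rejoin them with the unphased separator "/".
sortGenotype <- function(gt) {
  vapply(gt, function(g) {
    if (is.na(g)) return(NA_character_)          # missing genotypes stay NA
    alleles <- strsplit(g, "[/|]")[[1]]          # split on either separator
    paste(sort(alleles, method = "radix"), collapse = "/")
  }, character(1), USE.NAMES = FALSE)
}

sortGenotype(c("T/G", "G|T", "A/A", NA))
# "G/T" "G/T" "A/A" NA
```

Because the order of the two alleles carries the phase information, a sorted genotype can no longer be reported as phased, which is why the sketch always rejoins with "/".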

Only diploid genotypes (the first two alleles at a given site) are returned.

If the argument n to alleleDosage is a single integer, the same allele is counted for all variants. If n is a vector whose length equals the number of variants in the current filter, a different allele is counted for each variant. If n is a list, more than one allele can be counted for each variant. For example, if n[[1]]=c(1,3), genotypes "0/1" and "0/3" will each have a dosage of 1 and genotype "1/3" will have a dosage of 2.
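
The counting rule for a list element such as c(1,3) can be reproduced in base R. This is only a sketch of the arithmetic on VCF-style genotype strings; countAlleles is a hypothetical helper, not the package's implementation.

```r
# Hypothetical helper (not a SeqVarTools function): for each genotype string,
# count how many of its alleles are among the allele indices in `n`.
countAlleles <- function(gt, n) {
  vapply(gt, function(g) {
    if (is.na(g)) return(NA_integer_)            # missing genotype
    alleles <- as.integer(strsplit(g, "[/|]")[[1]])
    sum(alleles %in% n)                          # copies of any counted allele
  }, integer(1), USE.NAMES = FALSE)
}

countAlleles(c("0/1", "0/3", "1/3", "0/0"), n = c(1, 3))
# 1 1 2 0
```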

Value

getGenotype and getGenotypeAlleles return a character matrix with dimensions [sample,variant] containing diploid genotypes.

getGenotype returns alleles as "0", "1", "2", etc. indicating reference and alternate alleles.

getGenotypeAlleles returns alleles as "A", "C", "G", "T". sort=TRUE sorts lexicographically, which may be useful for comparing genotypes with data generated using a different reference sequence.
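
The relationship between the two encodings can be sketched as an index lookup. The helper toAlleles is hypothetical, and the comma-separated allele string ("A,G", reference allele first) is an assumption made for this illustration.

```r
# Hypothetical helper (not a SeqVarTools function): translate one index-coded
# genotype into nucleotide alleles, keeping the original separator.
# `alleles` is assumed to be a comma-separated string, reference allele first.
toAlleles <- function(gt, alleles) {
  lookup <- strsplit(alleles, ",")[[1]]
  sep <- if (grepl("|", gt, fixed = TRUE)) "|" else "/"
  idx <- as.integer(strsplit(gt, "[/|]")[[1]])
  paste(lookup[idx + 1L], collapse = sep)        # index 0 is the reference
}

toAlleles("0/1", "A,G")      # "A/G"
toAlleles("2|0", "C,T,G")    # "G|C"
```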

refDosage returns an integer matrix with the dosage of the reference allele: 2 for two copies of the reference allele ("0/0"), 1 for one copy of the reference allele, and 0 for two alternate alleles.

altDosage returns an integer matrix with the dosage of any alternate allele: 2 for two alternate alleles ("1/1", "1/2", etc.), 1 for one alternate allele, and 0 for no alternate allele (homozygous reference).
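
For diploid calls the two dosages are complementary: they sum to 2 wherever the genotype is not missing. The base R sketch below illustrates this on a small hand-made genotype matrix; the matrix and the dosage helper are invented for the example and do not come from the package.

```r
# Toy [sample, variant] genotype matrix; values are invented for illustration.
gt <- matrix(c("0/0", "0/1", "1/2", NA, "0|2", "0/0"), nrow = 2,
             dimnames = list(c("s1", "s2"), c("v1", "v2", "v3")))

# Hypothetical helper: count alleles satisfying `isCounted` in each genotype.
dosage <- function(gt, isCounted) {
  out <- matrix(NA_integer_, nrow(gt), ncol(gt), dimnames = dimnames(gt))
  for (i in seq_along(gt)) {
    if (!is.na(gt[i])) {
      alleles <- as.integer(strsplit(gt[i], "[/|]")[[1]])
      out[i] <- sum(isCounted(alleles))
    }
  }
  out
}

ref <- dosage(gt, function(a) a == 0L)   # copies of the reference allele
alt <- dosage(gt, function(a) a > 0L)    # copies of any alternate allele
ref + alt                                # 2 everywhere except the NA cell
```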

expandedAltDosage returns an integer matrix with the dosage of each alternate allele as a separate column. A variant with 2 possible alternate alleles will have 2 columns of output, etc.
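
The expanded layout can be sketched in base R as one column per alternate allele, concatenated across variants. The genotype matrix, the nAlt vector, and the countAllele helper are all assumptions made for this illustration, not the package's implementation.

```r
# Toy data: v1 has two alternate alleles, v2 has one (invented values).
gt <- matrix(c("0/1", "1/2", "1/1", "0/0"), nrow = 2,
             dimnames = list(c("s1", "s2"), c("v1", "v2")))
nAlt <- c(v1 = 2L, v2 = 1L)

# Hypothetical helper: copies of allele index `a` in each genotype string.
countAllele <- function(g, a) {
  vapply(strsplit(g, "[/|]"), function(x) sum(as.integer(x) == a), integer(1))
}

# One block of columns per variant, one column per alternate allele.
blocks <- lapply(seq_len(ncol(gt)), function(j) {
  vapply(seq_len(nAlt[j]), function(a) countAllele(gt[, j], a),
         integer(nrow(gt)))
})
expanded <- do.call(cbind, blocks)
ncol(expanded)   # 3: two columns for v1 plus one for v2
```

In this toy case sample s2 carries "1/2" at v1, so it has a dosage of 1 in both of the v1 columns.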

alleleDosage with an integer argument returns an integer matrix with the dosage of the specified allele only: 2 for two copies of the allele ("0/0" if n=0, "1/1" if n=1, etc.), 1 for one copy of the specified allele, and 0 for no copies of the allele.

alleleDosage with a list argument returns a list of sample x allele matrices with the dosage of each specified allele for each variant.

Author(s)

Stephanie Gogarten

See Also

SeqVarGDSClass, applyMethod, seqGetData, seqSetFilter, alleleFrequency

Examples

gds <- seqOpen(seqExampleFileName("gds"))
seqSetFilter(gds, variant.sel=1323:1327, sample.sel=1:10)
nAlleles(gds)
getGenotype(gds)
getGenotypeAlleles(gds)
refDosage(gds)
altDosage(gds)
expandedAltDosage(gds)
alleleDosage(gds, n=0)
alleleDosage(gds, n=1)
alleleDosage(gds, n=c(0,1,0,1,0))
alleleDosage(gds, n=list(0,c(0,1),0,c(0,1),1))
seqClose(gds)

[Package SeqVarTools version 1.16.1 Index]