getGenotype {SeqVarTools} | R Documentation
Get matrix of genotype values from a GDS object
## S4 method for signature 'SeqVarGDSClass'
getGenotype(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass'
getGenotypeAlleles(gdsobj, use.names=TRUE, sort=FALSE)
## S4 method for signature 'SeqVarGDSClass'
refDosage(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass'
altDosage(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass'
expandedAltDosage(gdsobj, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass,numeric'
alleleDosage(gdsobj, n=0, use.names=TRUE)
## S4 method for signature 'SeqVarGDSClass,list'
alleleDosage(gdsobj, n, use.names=TRUE)
gdsobj | A SeqVarGDSClass object. |
use.names | A logical indicating whether to assign sample and variant IDs as dimnames of the resulting matrix. |
sort | A logical indicating whether to sort alleles lexicographically ("G/T" instead of "T/G"). |
n | An integer, vector, or list indicating which allele(s) to return dosage for. |
In getGenotype, genotypes are coded as in the VCF file, where "0/0" is homozygous reference, "0/1" is heterozygous for the first alternate allele, "0/2" is heterozygous for the second alternate allele, etc.
Separators are "/" for unphased and "|" for phased. If sort=TRUE, all returned genotypes will be unphased.
Missing genotypes are coded as NA.
Only diploid genotypes (the first two alleles at a given site) are returned.
If the argument n to alleleDosage is a single integer, the same allele is counted for all variants. If n is a vector with length equal to the number of variants in the current filter, a different allele is counted for each variant. If n is a list, more than one allele can be counted for each variant. For example, if n[[1]]=c(1,3), genotypes "0/1" and "0/3" will each have a dosage of 1 and genotype "1/3" will have a dosage of 2.
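The counting rule for a list element of n can be illustrated without a GDS file. The following is a minimal base R sketch, not the SeqVarTools implementation: count_alleles is a hypothetical helper, and returning NA for a missing genotype is an assumption made for the illustration.

```r
# Hypothetical helper, not part of SeqVarTools: count how many of the
# alleles in a VCF-style genotype string belong to the allele set `n`.
count_alleles <- function(genotype, n) {
  # Assumption for this sketch: a missing genotype gives a missing dosage
  if (is.na(genotype)) return(NA_integer_)
  # Split on either separator ("/" unphased, "|" phased)
  alleles <- as.integer(strsplit(genotype, "[/|]")[[1]])
  sum(alleles %in% n)
}

genotypes <- c("0/1", "0/3", "1/3", "0/0", NA)
dosage <- vapply(genotypes, count_alleles, integer(1),
                 n = c(1, 3), USE.NAMES = FALSE)
dosage
```

With n = c(1, 3), "0/1" and "0/3" each contribute one counted allele and "1/3" contributes two, matching the example in the text.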
getGenotype and getGenotypeAlleles return a character matrix with dimensions [sample,variant] containing diploid genotypes.
getGenotype returns alleles as "0", "1", "2", etc. indicating reference and alternate alleles.
getGenotypeAlleles returns alleles as "A", "C", "G", "T". sort=TRUE sorts lexicographically, which may be useful for comparing genotypes with data generated using a different reference sequence.
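To show why sorting helps when comparing call sets, here is a small base R sketch of sorting the alleles within one genotype string. The function sort_genotype is a hypothetical name introduced for illustration and is not the code used by getGenotypeAlleles; the "/" separator in the result follows the statement that sorted genotypes are returned unphased.

```r
# Hypothetical helper: sort the alleles of one genotype string and
# return it with the unphased separator "/".
sort_genotype <- function(g) {
  if (is.na(g)) return(NA_character_)
  alleles <- strsplit(g, "[/|]")[[1]]
  # method = "radix" sorts in the C locale, independent of the session locale
  paste(sort(alleles, method = "radix"), collapse = "/")
}

# The same heterozygote written in either order, phased or not,
# maps to a single string.
calls <- c("T/G", "G/T", "T|G")
sorted <- vapply(calls, sort_genotype, character(1), USE.NAMES = FALSE)
length(unique(sorted)) == 1
```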
refDosage returns an integer matrix with the dosage of the reference allele: 2 for two copies of the reference allele ("0/0"), 1 for one copy of the reference allele, and 0 for two alternate alleles.
altDosage returns an integer matrix with the dosage of any alternate allele: 2 for two alternate alleles ("1/1", "1/2", etc.), 1 for one alternate allele, and 0 for no alternate allele (homozygous reference).
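The two codings are complementary for diploid genotypes: the reference and alternate dosages of a non-missing genotype sum to 2. A base R sketch on a toy [sample,variant] matrix makes this concrete. The helper dosage_by and the toy data are assumptions for illustration, not SeqVarTools code.

```r
# Toy genotype matrix laid out as [sample, variant], coded as in the VCF file
geno <- matrix(c("0/0", "0/1", "1/2", NA), nrow = 2,
               dimnames = list(c("s1", "s2"), c("v1", "v2")))

# Hypothetical helper: per genotype, count the alleles for which
# `is_counted` returns TRUE; missing genotypes stay NA.
dosage_by <- function(g, is_counted) {
  out <- vapply(g, function(x) {
    if (is.na(x)) return(NA_integer_)
    sum(is_counted(as.integer(strsplit(x, "[/|]")[[1]])))
  }, integer(1), USE.NAMES = FALSE)
  matrix(out, nrow = nrow(g), dimnames = dimnames(g))
}

ref <- dosage_by(geno, function(a) a == 0L)  # copies of the reference allele
alt <- dosage_by(geno, function(a) a > 0L)   # copies of any alternate allele
ref + alt
```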
expandedAltDosage returns an integer matrix with the dosage of each alternate allele as a separate column. A variant with 2 possible alternate alleles will have 2 columns of output, etc.
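As a rough illustration of the expanded layout, the base R sketch below builds one column per alternate allele for a single variant with two alternate alleles. The sample names and genotypes are invented, and the column naming of the real function's output is not reproduced here.

```r
# One variant with two possible alternate alleles, three toy samples
genotypes <- c(s1 = "0/1", s2 = "1/2", s3 = "2/2")
n_alt <- 2L

# Split each genotype into its integer allele codes
alleles <- lapply(strsplit(genotypes, "[/|]"), as.integer)

# One column per alternate allele: column a holds the number of
# copies of alternate allele a carried by each sample.
expanded <- sapply(seq_len(n_alt), function(a) {
  vapply(alleles, function(al) sum(al == a), integer(1))
})
expanded
```

Summing across the columns of a row recovers the any-alternate dosage for that sample at this variant.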
alleleDosage with an integer argument returns an integer matrix with the dosage of the specified allele only: 2 for two copies of the allele ("0/0" if n=0, "1/1" if n=1, etc.), 1 for one copy of the specified allele, and 0 for no copies of the allele.
alleleDosage with a list argument returns a list of sample x allele matrices with the dosage of each specified allele for each variant.
Stephanie Gogarten
SeqVarGDSClass, applyMethod, seqGetData, seqSetFilter, alleleFrequency
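# Open the example GDS file and restrict to 5 variants and 10 samples
gds <- seqOpen(seqExampleFileName("gds"))
seqSetFilter(gds, variant.sel=1323:1327, sample.sel=1:10)
nAlleles(gds)

# Genotypes as allele indices and as nucleotides
getGenotype(gds)
getGenotypeAlleles(gds)

# Dosage matrices
refDosage(gds)
altDosage(gds)
expandedAltDosage(gds)

# Same allele for all variants
alleleDosage(gds, n=0)
alleleDosage(gds, n=1)
# A different allele for each of the 5 selected variants
alleleDosage(gds, n=c(0,1,0,1,0))
# One or more alleles per variant
alleleDosage(gds, n=list(0,c(0,1),0,c(0,1),1))

seqClose(gds)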