regression {SeqVarTools} | R Documentation |
Run linear or logistic regression on variants
## S4 method for signature 'SeqVarData'
regression(gdsobj, outcome, covar=NULL,
    model.type=c("linear", "logistic", "firth"))
gdsobj |
A |
outcome |
A character string with the name of the column in |
covar |
A character vector with the name of the column(s) in |
model.type |
the type of model to be run. "linear" uses |
a data.frame with the following columns (if applicable):
variant.id | variant identifier
n | number of samples with non-missing data
n0 | number of controls (outcome=0) with non-missing data
n1 | number of cases (outcome=1) with non-missing data
freq | reference allele frequency
freq0 | reference allele frequency in controls
freq1 | reference allele frequency in cases
Est | beta estimate for genotype
SE | standard error of beta estimate for the genotype
Wald.Stat | chi-squared test statistic for association
Wald.pval | p-value for association
PPL.Stat | firth only: profile penalized likelihood test statistic for association
PPL.pval | firth only: p-value for association
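The following is a minimal base-R sketch of how the Est, SE, Wald.Stat and Wald.pval columns relate to one another for a single variant under a linear model. It does not call SeqVarTools. It assumes that the genotype is coded as the number of reference alleles per sample and that the Wald statistic is the squared ratio of the estimate to its standard error, compared against a chi-squared distribution with 1 degree of freedom. The data and the variable names (dat, genotype, age, phen) are simulated and hypothetical.

```r
set.seed(1)
n.samp <- 200

## hypothetical data: reference allele count (0, 1 or 2) per sample
genotype <- rbinom(n.samp, size=2, prob=0.7)
age <- sample(18:70, n.samp, replace=TRUE)
phen <- 0.3 * genotype + 0.01 * age + rnorm(n.samp)
dat <- data.frame(phen, age, genotype)

## keep only samples with non-missing data
dat <- dat[complete.cases(dat), ]

## linear model with the genotype as one of the predictors
fit <- lm(phen ~ age + genotype, data=dat)
coefs <- summary(fit)$coefficients

Est <- coefs["genotype", "Estimate"]
SE <- coefs["genotype", "Std. Error"]

## assumed definition: Wald chi-squared statistic with 1 degree of freedom
Wald.Stat <- (Est / SE)^2
Wald.pval <- pchisq(Wald.Stat, df=1, lower.tail=FALSE)

## one row laid out like the returned data.frame
res.row <- data.frame(n=nrow(dat),
                      freq=sum(dat$genotype) / (2 * nrow(dat)),
                      Est, SE, Wald.Stat, Wald.pval)
res.row
```

Under these assumptions Wald.Stat equals the square of the t value that summary(fit) reports for the genotype term.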
Stephanie Gogarten
SeqVarData, seqSetFilter, lm, glm, logistf
gds <- seqOpen(seqExampleFileName("gds"))

## create some phenotype data
library(Biobase)
sample.id <- seqGetData(gds, "sample.id")
n <- length(sample.id)
df <- data.frame(sample.id,
   sex=sample(c("M", "F"), n, replace=TRUE),
   age=sample(18:70, n, replace=TRUE),
   phen=rnorm(n),
   stringsAsFactors=FALSE)
meta <- data.frame(labelDescription=c("sample identifier", "sex", "age", "phenotype"),
   row.names=names(df))
sample.data <- AnnotatedDataFrame(df, meta)
seqData <- SeqVarData(gds, sample.data)

## select samples and variants
seqSetFilter(gds, sample.id=sample.id[1:50], variant.id=1:10)

res <- regression(seqData, outcome="phen", covar=c("sex", "age"))
res

seqClose(gds)
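The example above fits the default linear model. A case/control analysis might look like the sketch below, which passes model.type="logistic" and a binary outcome coded 0/1, matching the n0 (outcome=0) and n1 (outcome=1) columns in the return value. The column name status and the simulated values are hypothetical, and the sketch assumes SeqVarTools and Biobase are installed.

```r
library(SeqVarTools)
library(Biobase)

gds <- seqOpen(seqExampleFileName("gds"))
sample.id <- seqGetData(gds, "sample.id")
n <- length(sample.id)

## hypothetical phenotype data with a simulated 0/1 case/control outcome
set.seed(42)
df <- data.frame(sample.id,
   age=sample(18:70, n, replace=TRUE),
   status=rbinom(n, 1, 0.5),
   stringsAsFactors=FALSE)
meta <- data.frame(labelDescription=c("sample identifier", "age", "case/control status"),
   row.names=names(df))
seqData <- SeqVarData(gds, AnnotatedDataFrame(df, meta))

## select variants
seqSetFilter(gds, variant.id=1:10)

## logistic regression of status on genotype, adjusting for age
res.logistic <- regression(seqData, outcome="status", covar="age",
   model.type="logistic")
res.logistic

seqClose(gds)
```

Because the outcome here is simulated at random, the estimates carry no biological meaning; the point is only the shape of the call and of the returned data.frame.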