skipTrack

Lifecycle: experimental

Welcome to the SkipTrack Package!

SkipTrack is a Bayesian hierarchical model for self-reported menstrual cycle length data collected through mobile health apps. The model extends the hierarchical model presented in Li et al. (2022), which focuses on predicting an individual’s next menstrual cycle start date while accounting for cycle length inaccuracies introduced by non-adherence in user self-tracked data. See the ‘Getting Started’ vignette for an overview of the SkipTrack model.

Installation

# Install the released version from CRAN
install.packages("skipTrack")

# Install the development version from GitHub (requires the devtools package)
devtools::install_github("LukeDuttweiler/skipTrack")

Package Usage

The SkipTrack package provides functions for fitting the SkipTrack model, evaluating model run diagnostics, retrieving and visualizing model results, and simulating related data. We begin our tutorial by examining some simulated data.

library(skipTrack)

First, we simulate data on 100 individuals from the SkipTrack model where each observed \(y_{ij}\) value has a 75% probability of being a true cycle, a 20% probability of being two true cycles recorded as one, and a 5% probability of being three true cycles recorded as one.

# Simulate data for 100 individuals; skipProb gives P(1, 2, 3 true cycles per recorded cycle)
dat <- skipTrack.simulate(n = 100, model = 'skipTrack', skipProb = c(.75, .2, .05))
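
To make the role of skipProb concrete, here is a minimal base-R sketch of the mixing idea described above. It is not the package's simulation code: the names skip_prob, n_obs, n_cycles, and recorded are hypothetical, and the log-normal cycle lengths (median 30 days, sdlog 0.1) are assumptions chosen only for illustration.

```r
set.seed(1)

# Probabilities that a recorded cycle contains 1, 2, or 3 true cycles
skip_prob <- c(0.75, 0.20, 0.05)
n_obs <- 1000

# Number of true cycles hidden inside each recorded cycle
n_cycles <- sample(1:3, size = n_obs, replace = TRUE, prob = skip_prob)

# Assumption: each true cycle length is log-normal with a median of 30 days.
# A recorded length is the sum of the true cycles it contains.
recorded <- vapply(
  n_cycles,
  function(k) sum(rlnorm(k, meanlog = log(30), sdlog = 0.1)),
  numeric(1)
)

# Observed share of each cycle count, and mean recorded length by count
prop.table(table(n_cycles))
tapply(recorded, n_cycles, mean)
```

Recorded lengths that contain two or three true cycles cluster near multiples of the typical length, which is the pattern the model is designed to detect.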

Fitting the SkipTrack model to this simulated data requires a call to the function skipTrack.fit. Because this is a Bayesian model fit with an MCMC algorithm, fitting can take some time with large datasets, many MCMC iterations, or many chains.

In this code we ask for 4 chains, each with 1000 iterations, run sequentially. Note that we recommend allowing the sampler to run longer than this (usually at least 5000 iterations per chain), but we use a short run here to save time.

If useParallel = TRUE, the MCMC chains will be evaluated in parallel, which helps with longer runs.

ft <- skipTrack.fit(Y = dat$Y, cluster = dat$cluster,
                    reps = 1000, chains = 4, useParallel = FALSE)

Once we have the model results we are able to examine model diagnostics, visualize results from the model, and view a model summary.

Diagnostics

Multivariate, multichain MCMC diagnostics, including traceplots, Gelman-Rubin diagnostics, and effective sample size, are available for various parameters from the model fit. These are supplied by the genMCMCDiag package; see that package’s documentation for details.

Here we show the diagnostics for the \(c_{ij}\) parameters. The output suggests that, at least for the \(c_{ij}\) values, the algorithm is mixing effectively (or will be, once it runs a little longer).

skipTrack.diagnostics(ft, param = 'cijs')

#> ----------------------------------------------------
#> Generalized MCMC Diagnostics using lanfear Method 
#> ----------------------------------------------------
#> 
#> |Effective Sample Size: 
#> |---------------------------
#> | Chain 1| Chain 2| Chain 3| Chain 4|     Sum|
#> |-------:|-------:|-------:|-------:|-------:|
#> | 105.807|   46.84| 104.824|  92.098| 349.569|
#> 
#> |Gelman-Rubin Diagnostic: 
#> |---------------------------
#> | Point est.| Upper C.I.|
#> |----------:|----------:|
#> |      1.024|      1.027|
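
For readers unfamiliar with the Gelman-Rubin diagnostic reported above, the following base-R sketch computes the classical potential scale reduction factor for a single scalar quantity across chains. It is an illustration only, not the genMCMCDiag implementation; the function gelman_rubin and the simulated matrices mixed and stuck are hypothetical.

```r
# Classical Gelman-Rubin statistic for one scalar parameter.
# chains: numeric matrix with one row per iteration and one column per chain.
gelman_rubin <- function(chains) {
  n <- nrow(chains)
  B <- n * var(colMeans(chains))       # between-chain variance
  W <- mean(apply(chains, 2, var))     # average within-chain variance
  var_hat <- (n - 1) / n * W + B / n   # pooled estimate of the posterior variance
  sqrt(var_hat / W)
}

set.seed(42)

# Four chains that all sample the same distribution
mixed <- matrix(rnorm(4000), ncol = 4)

# Same draws, but the fourth chain is shifted to a different region
stuck <- mixed + rep(c(0, 0, 0, 3), each = 1000)

gelman_rubin(mixed)   # close to 1 when chains agree
gelman_rubin(stuck)   # well above 1 when one chain disagrees
```

Values close to 1, like the point estimate in the output above, indicate that the chains agree with one another.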

Visualization

To see the key plots for a SkipTrack model fit, call plot(ft). The individual plots are directly accessible using skipTrack.visualize(ft).

plot(ft)

Summary

A summary of the SkipTrack model fit is available with summary(ft), with more detailed results accessible through skipTrack.results(ft). By default, these results discard the first 750 draws of each chain as burn-in. This can be changed using the burnIn parameter of either function.

summary(ft)
#> ----------------------------------------------------
#> Summary of skipTrack.fit using skipTrack model
#> ----------------------------------------------------
#> Mean Coefficients: 
#> 
#>             Estimate       95% CI Lower 95% CI Upper
#> (Intercept)    3.411              3.383         3.44
#> 
#> ----------------------------------------------------
#> Precision Coefficients: 
#> 
#>             Estimate       95% CI Lower 95% CI Upper
#> (Intercept)    5.573              5.352        5.749
#> 
#> ----------------------------------------------------
#> Diagnostics: 
#> 
#>        Effective Sample Size       Gelman-Rubin
#> Betas                3616.43               1.00
#> Gammas                 22.24               1.00
#> cijs                  343.01               1.02
#> 
#> ----------------------------------------------------

summary(ft, burnIn = 500)
#> ----------------------------------------------------
#> Summary of skipTrack.fit using skipTrack model
#> ----------------------------------------------------
#> Mean Coefficients: 
#> 
#>             Estimate       95% CI Lower 95% CI Upper
#> (Intercept)    3.411              3.384         3.44
#> 
#> ----------------------------------------------------
#> Precision Coefficients: 
#> 
#>             Estimate       95% CI Lower 95% CI Upper
#> (Intercept)    5.554              5.325        5.755
#> 
#> ----------------------------------------------------
#> Diagnostics: 
#> 
#>        Effective Sample Size       Gelman-Rubin
#> Betas                3719.83               1.00
#> Gammas                 21.79               1.00
#> cijs                  342.89               1.02
#> 
#> ----------------------------------------------------
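
The effect of burn-in can be illustrated with a small base-R sketch. It does not use the package's internals: the chain below is a hypothetical trace that starts far from its stationary value (3.4, chosen arbitrarily) and settles down, and the names reps, burn_in, chain, and kept are assumptions made for this example.

```r
set.seed(7)

reps <- 1000     # draws in the chain
burn_in <- 750   # draws discarded from the start of the chain

# Hypothetical chain: decays from a poor starting point toward 3.4
chain <- 3.4 + 2 * exp(-(1:reps) / 50) + rnorm(reps, sd = 0.02)

# Keep only the draws after burn-in
kept <- chain[(burn_in + 1):reps]

# Posterior mean and 95% credible interval from the retained draws
c(estimate = mean(kept), quantile(kept, c(0.025, 0.975)))

# For comparison: the mean over all draws is pulled upward by the early transient
mean(chain)
```

A smaller burn-in keeps more draws but risks including the early transient, which is why summaries can shift slightly when burnIn changes.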

This introduction provides enough information to start fitting the SkipTrack model. For further information regarding different methods of simulating data, additional model fitting, and tuning parameters for fitting the model, please see the help pages and the ‘Getting Started’ vignette. Additional vignettes are forthcoming.

Bibliography

Li, Kathy, Iñigo Urteaga, Amanda Shea, Virginia J Vitzthum, Chris H Wiggins, and Noémie Elhadad. 2022. “A Predictive Model for Next Cycle Start Date That Accounts for Adherence in Menstrual Self-Tracking.” Journal of the American Medical Informatics Association 29 (1): 3–11.