Bayesian SITAR model - An introduction

SITAR growth curve model - an overview

The superimposition by translation and rotation (SITAR) model is a shape-invariant nonlinear mixed-effects growth curve model that fits a population average (i.e., mean) curve to the data and aligns each individual’s growth trajectory to the population average curve using a set of three random effects (Cole et al., 2010): size relative to the mean growth curve (vertical shift), timing of the adolescent growth spurt relative to the average age at peak growth velocity (horizontal shift), and the intensity of the growth spurt compared to the mean growth intensity (horizontal stretch). The concept of the shape-invariant model (SIM) was first described by Lindstrom (1995) and later used by Beath (2007) for modeling infant growth data (birth to 2 years). The current version of the SITAR model that we describe below was developed by Cole et al. (2010). The SITAR model is particularly useful for modeling human physical growth during adolescence. Recent studies have used SITAR to analyze height and weight data (Cole & Mori, 2018; Mansukoski et al., 2019; Nembidzane et al., 2020; Riddell et al., 2017), as well as to study jaw growth during adolescence (Sandhu, 2020). All of these studies estimated the SITAR model within the frequentist framework, as implemented in the R package ‘sitar’ (Cole, 2022).

Consider a dataset consisting of \(J\) individuals \((j = 1, \ldots, J)\), where individual \(j\) provides \(n_j\) measurements \((i = 1, \ldots, n_j)\) of height (\(y_{ij}\)) recorded at age \(x_{ij}\). The SITAR model is then written as:

\[\begin{equation} \label{eq:1} y_{ij}=\alpha_0+\alpha_j+\sum_{r=1}^{p-1}{\beta_r\mathbf{Spl}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0+\zeta_j\right)}{e^{-\left(\gamma_0+\gamma_j\right)}}\right)}+e_{ij} \end{equation}\]

where Spl(.) is the natural cubic spline function that generates the spline design matrix, and \(\beta_1, \dots, \beta_{p-1}\) are the spline regression coefficients for the mean curve, with \(\alpha_0\), \(\zeta_0\), and \(\gamma_0\) representing the population average size, timing, and intensity parameters. By default, the predictor, age (\(x_{ij}\)), is mean-centered by subtracting the mean age (\(\bar{x}\)), where \(\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_{i}\) is the average over all \(n\) age measurements in the data. The individual-specific random effects for size (\(\alpha_j\)), timing (\(\zeta_j\)), and intensity (\(\gamma_j\)) describe how an individual’s growth trajectory differs from the mean growth curve. The residuals \(e_{ij}\) are assumed to be normally distributed with zero mean and residual variance parameter \(\sigma^2\), and are independent of the random effects. The random effects are assumed to be multivariate normally distributed with zero means and an unstructured variance-covariance matrix (i.e., distinct variances and covariances between random effects) as shown below:

\[\begin{equation} \label{eq:2} \begin{pmatrix}\alpha_j\\\zeta_j\\\gamma_j\end{pmatrix} \sim \operatorname{MVN}\left(\begin{pmatrix}0\\0\\0\end{pmatrix},\ \begin{pmatrix}\sigma_{\alpha_j}^2 & \rho_{\alpha_j\zeta_j}\sigma_{\alpha_j}\sigma_{\zeta_j} & \rho_{\alpha_j\gamma_j}\sigma_{\alpha_j}\sigma_{\gamma_j}\\ \rho_{\zeta_j\alpha_j}\sigma_{\zeta_j}\sigma_{\alpha_j} & \sigma_{\zeta_j}^2 & \rho_{\zeta_j\gamma_j}\sigma_{\zeta_j}\sigma_{\gamma_j}\\ \rho_{\gamma_j\alpha_j}\sigma_{\gamma_j}\sigma_{\alpha_j} & \rho_{\gamma_j\zeta_j}\sigma_{\gamma_j}\sigma_{\zeta_j} & \sigma_{\gamma_j}^2\end{pmatrix}\right), \quad \text{for individual } j = 1, \ldots, J \end{equation}\]
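
To make the roles of the three random effects in Equation \(\ref{eq:1}\) concrete, the following Python sketch evaluates the model mean for one individual. It is only an illustration under stated assumptions: the function `mean_curve` is a hypothetical logistic curve standing in for the fitted spline term, and all parameter values are made up rather than estimated.

```python
import math


def mean_curve(t):
    """Hypothetical smooth curve standing in for the spline term (assumption)."""
    return 30.0 / (1.0 + math.exp(-t))


def sitar_height(age, mean_age, alpha0, zeta0, gamma0,
                 alpha_j=0.0, zeta_j=0.0, gamma_j=0.0):
    """Expected height for one individual, following the structure of Equation 1."""
    # Center age and apply the timing (horizontal) shift.
    shifted = age - mean_age - (zeta0 + zeta_j)
    # Dividing by exp(-(gamma0 + gamma_j)) stretches or shrinks the age scale.
    scaled = shifted / math.exp(-(gamma0 + gamma_j))
    # Size is a vertical shift added to the curve.
    return alpha0 + alpha_j + mean_curve(scaled)


# Illustrative values only: mean age 14 years, population size 140 cm.
average = sitar_height(13.0, 14.0, alpha0=140.0, zeta0=0.0, gamma0=0.0)
taller = sitar_height(13.0, 14.0, 140.0, 0.0, 0.0, alpha_j=5.0)
later = sitar_height(14.0, 14.0, 140.0, 0.0, 0.0, zeta_j=1.0)
faster = sitar_height(13.5, 14.0, 140.0, 0.0, 0.0, gamma_j=math.log(2.0))

print(average, taller, later, faster)
```

In this sketch, a size effect of 5 raises the curve by 5 units, a timing effect of 1 year makes the individual at age 14 resemble the average individual at age 13, and an intensity effect of \(\log 2\) makes the individual traverse the age scale twice as fast.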

Bayesian SITAR model - Univariate formulation

Here we describe the Bayesian model specification for a two-level SITAR model along with the default priors specified for each parameter. To better understand the data-generative mechanism and to simplify the presentation of prior specifications for each individual parameter, we re-express the model in Equation \(\ref{eq:1}\) as follows:

\[\begin{equation} \label{eq:3} \begin{aligned} y_{ij} & \sim \operatorname{Normal}(\mu_{ij}, \sigma) \\ \\ \mu_{ij} & = \alpha_0+ \alpha_j+\sum_{r=1}^{p-1}{\beta_r\mathbf{Spl}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0+\zeta_j\right)}{e^{-\left(\gamma_0+\gamma_j\right)}}\right)} \\ \sigma & = \sigma_\epsilon \\ \\ \begin{bmatrix} \alpha_{j} \\ \zeta_{j} \\ \gamma_{j} \end{bmatrix} & \sim {\operatorname{MVNormal}\begin{pmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix},\ \mathbf \Sigma_{ID} \end{pmatrix}} \\ \\ \mathbf {\Sigma_{ID}} & = \mathbf S \mathbf R \mathbf S \\ \\ \mathbf S & = \begin{bmatrix} \sigma_{\alpha_j} & 0 & 0 \\ 0 &\sigma_{\zeta_j} & 0 \\ 0 & 0 & \sigma_{\gamma_j} \end{bmatrix} \\ \\ \mathbf R & = \begin{bmatrix} 1 & \rho_{\alpha_j\zeta_j} & \rho_{\alpha_j\gamma_j} \\\rho_{\zeta_j\alpha_j} & 1 & \rho_{\zeta_j\gamma_j} \\\rho_{\gamma_j\alpha_j} & \rho_{\gamma_j\zeta_j} & 1 \end{bmatrix} \\ \\ \alpha_0 & \sim \operatorname{normal}(\ y_{mean},\ {y_{sd}}) \\ \zeta_0 & \sim \operatorname{normal}(\ 0,\ 2.0) \\ \gamma_0 & \sim \operatorname{normal}(\ 0,\ 1.0) \\ \beta_1 \text{,..,} \beta_{p-1} & \sim \operatorname{normal}(\ \mathbf{\beta_{lm}}, \ \mathbf {X_{Spl}}) \\ \sigma_{\alpha_j} & \sim \operatorname{normal_{Half}}(\ {0},\ {y_{sd}}) \\ \sigma_{\zeta_j} & \sim \operatorname{normal_{Half}}(\ {0},\ {2.0}) \\ \sigma_{\gamma_j} & \sim \operatorname{normal_{Half}}(\ {0},\ {1.0}) \\ \sigma_\epsilon & \sim \operatorname{normal_{Half}}(\ {0},\ {y_{sd}}) \\ \mathbf R & \sim \operatorname{LKJ}(1) \end{aligned} \end{equation}\]

The first line in the equation above represents the likelihood, which states that the outcome is normally distributed with mean \(\mu_{ij}\) and standard deviation \(\sigma\) (i.e., \(\sigma_\epsilon\)). The mean \(\mu_{ij}\) is a function of the growth curve parameters, as described earlier (see Equation \(\ref{eq:1}\)). The unstructured variance-covariance matrix \(\mathbf {\Sigma_{ID}}\) is the same as shown earlier (Equation \(\ref{eq:2}\)) and is constructed using the separation strategy. In this strategy, which is followed in Stan, the variance-covariance matrix \(\mathbf {\Sigma_{ID}}\) is decomposed into a diagonal matrix \(\mathbf S\) (composed of the vector of standard deviations) and a correlation matrix \(\mathbf R\). The residuals are assumed to be independent and normally distributed with a mean of 0 and an \(n_j \times n_j\) covariance matrix \(\mathbf{I}\sigma_\epsilon^2\), where \(\mathbf I\) is the identity matrix (a diagonal matrix of 1s) and \(\sigma_\epsilon\) is the residual standard deviation. In other words, the residual (i.e., within-individual) covariance matrix is governed by a single parameter, \(\sigma_\epsilon\), whose square forms the diagonal of the matrix. The assumption of homoscedasticity (constant variance) of the residuals can be relaxed. A detailed description of the prior distributions for each parameter is provided in the Priors section below.
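
The separation strategy can be sketched in a few lines of Python. This is a minimal illustration, not code from Stan or any package: the helper name `covariance_from_sd_and_corr` and the numeric values are assumptions chosen for the example.

```python
def covariance_from_sd_and_corr(sds, corr):
    """Return Sigma = S R S, where S = diag(sds) and R is a correlation matrix."""
    k = len(sds)
    # Because S is diagonal, element (a, b) of S R S is sd_a * rho_ab * sd_b.
    return [[sds[a] * corr[a][b] * sds[b] for b in range(k)] for a in range(k)]


# Illustrative values only (not estimates from any data set).
sds = [6.0, 1.0, 0.1]  # standard deviations for size, timing, intensity
corr = [
    [1.0, 0.3, -0.2],
    [0.3, 1.0, 0.5],
    [-0.2, 0.5, 1.0],
]

sigma_id = covariance_from_sd_and_corr(sds, corr)
for row in sigma_id:
    print(row)
```

The diagonal of the result holds the variances (squared standard deviations) and each off-diagonal element is a correlation multiplied by the two corresponding standard deviations, which is why priors can be placed separately on \(\mathbf S\) and \(\mathbf R\).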

Priors

We follow the recommendations made in the popular packages ‘rstanarm’ and ‘brms’ for prior specification. Technically, the priors used in the ‘rstanarm’ and ‘brms’ packages are data-dependent, and hence weakly informative. This is because the priors are scaled based on the distribution (i.e., standard deviation) of the outcome and predictor(s). However, the amount of information used is weak and serves mainly to regularize the estimates, helping to stabilize the computation. An important feature of this approach is that the default priors are reasonable for many models. Like ‘rstanarm’ and ‘brms’, the ‘bsitar’ package offers full flexibility in setting a wide range of priors, which encourages users to specify priors that reflect their prior knowledge about human growth processes.

Similar to the ‘brms’ and ‘rstanarm’ packages, the ‘bsitar’ package allows the user to control the scale parameter for location-scale based priors such as the normal, student_t, and cauchy distributions via the autoscale option. Here again we adopt an amalgamation of the best strategies offered by the ‘brms’ and ‘rstanarm’ packages. Earlier versions of ‘rstanarm’ set autoscale = TRUE, which transformed the prior by multiplying the scale parameter by a fixed value of \(2.5\) (the authors have recently changed this default to FALSE), whereas the ‘brms’ package sets the scale factor to either \(1.0\) or \(2.5\) depending on the median absolute deviation (MAD) of the outcome. If the MAD is less than \(2.5\), it scales the prior by a factor of \(2.5\); otherwise the scale factor is \(1.0\) (i.e., no auto scaling). The ‘bsitar’ package, on the other hand, offers full flexibility in choosing the scale factor via a built-in option, autoscale. Setting autoscale = TRUE scales the prior by a factor of \(2.5\) (as earlier used in ‘rstanarm’). However, the scaling factor can also be set to any real number, such as \(1.5\) (e.g., autoscale = 1.5). The default setting for autoscale in the ‘bsitar’ package is FALSE, i.e., the scale factor is set to \(1\). We strongly recommend reading the documentation on priors included in the ‘brms’ and ‘rstanarm’ packages.
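
The autoscale rule described above amounts to multiplying the prior scale by a factor. The Python sketch below restates that rule only; `resolve_scale` is a hypothetical helper written for this illustration and is not part of the ‘bsitar’, ‘brms’, or ‘rstanarm’ API.

```python
def resolve_scale(scale, autoscale=False):
    """Return the prior scale after applying the autoscale rule.

    Hypothetical helper for illustration; not a function from any package.
    """
    if autoscale is True:
        factor = 2.5           # TRUE: multiply the scale by 2.5
    elif autoscale is False:
        factor = 1.0           # FALSE (default): leave the scale unchanged
    else:
        factor = float(autoscale)  # any real number is used as the factor
    return scale * factor


y_sd = 7.0  # made-up standard deviation of the outcome
print(resolve_scale(y_sd))         # default, no scaling
print(resolve_scale(y_sd, True))   # scaled by 2.5
print(resolve_scale(y_sd, 1.5))    # scaled by a user-chosen factor
```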

Below we describe the default priors used for the regression coefficients, as well as for the standard deviations of the random effects for the a (size), b (timing), and c (intensity) parameters and their correlations. The default distribution for each parameter (regression coefficients, standard deviations of group-level random effects, and residuals) is the normal distribution.

The population regression parameter a is assigned a ‘normal’ prior centered at \(y_{\text{mean}}\) (i.e., the mean of the outcome) with the scale defined as the standard deviation of the outcome, \(y_{\text{sd}}\), multiplied by the default scaling factor \(1.0\), i.e., normal(ymean, ysd, autoscale = FALSE). The prior on the standard deviation of the random effect parameter a is identical to that of the regression parameter, except that it is centered at 0, i.e., normal(0, ysd, autoscale = FALSE).
For the population regression parameter b, the default prior follows a ‘normal’ distribution with mean 0 and a scale of 2.0, i.e., normal(0, 2.0, autoscale = FALSE). Note that the autoscale option is set to FALSE. Since the predictor age is typically mean-centered when fitting the SITAR model, this prior implies that about 95% of the prior mass for the timing parameter (the mean \(\pm\) 2 standard deviations) covers the range between 10 and 18 years when the predictor age is centered at 14 years. Depending on the mean age and whether the data correspond to males or females, the scale can be adjusted accordingly. The prior for the standard deviation of the b parameter is also normal, with a mean of 0 and a standard deviation of 2.0 (normal(0, 2.0, autoscale = FALSE)). Because a standard deviation cannot be negative, this acts as a half-normal prior that places about 95% of its mass on values below roughly 4 years, thereby allowing individual timing to vary by several years around the population average parameter b.
The default prior for the population average intensity regression parameter c is ‘normal’ with a mean of 0 and a scale of 1.0, i.e., normal(0, 1.0, autoscale = FALSE). Note that the intensity parameter enters the model through an exponential (see Equation \(\ref{eq:1}\)), and is therefore interpreted as a percentage change relative to the mean growth intensity. For the standard deviation of the c parameter, the prior assigned is normal(0, 1.0, autoscale = FALSE). Because c acts on the exponential scale, the user should consider reducing the scale of the prior from 1.0 to 0.5 or even 0.25, particularly for the standard deviation of the random effect.
The prior for the correlations between random effect parameters follows the Lewandowski-Kurowicka-Joe (LKJ) distribution. The LKJ prior is specified via a single parameter eta. If eta = 1 (the default), all correlation matrices are equally likely a priori. If eta > 1, extreme correlations become less likely, whereas if 0 < eta < 1, higher probabilities are assigned to extreme correlations. See brms for more details.
For the spline coefficients, we assign a normal prior with the location set to \(\beta_{\text{lm}}\), the vector of spline coefficients obtained from a linear model fit to the data. The scale is defined as the standard deviation of the spline design matrix \(X_{\text{Spl}}\) multiplied by the default scaling factor \(1.0\) i.e., normal(lm, lm, autoscale = FALSE).
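
A quick way to judge whether these default scales are sensible is to translate them into implied ranges. The Python sketch below does this with the standard library only; the mean age of 14 years is the example value used above, and the variable names are assumptions made for this illustration.

```python
import math
from statistics import NormalDist

# Two-sided 95% quantile of the standard normal (about 1.96).
z95 = NormalDist().inv_cdf(0.975)

# Timing (b): normal(0, 2.0) on centered age, assuming a mean age of 14 years.
mean_age = 14.0
timing_low = mean_age - z95 * 2.0
timing_high = mean_age + z95 * 2.0

# SD of timing: normal(0, 2.0) restricted to positive values (half-normal).
# Its 95th percentile equals the 97.5th percentile of the unrestricted normal.
timing_sd_upper = NormalDist(0.0, 2.0).inv_cdf(0.975)

# Intensity (c) enters through exp(), so read the prior as a multiplier.
intensity_multipliers = {}
for scale in (1.0, 0.5, 0.25):
    intensity_multipliers[scale] = (math.exp(-z95 * scale), math.exp(z95 * scale))

print(round(timing_low, 2), round(timing_high, 2))
print(round(timing_sd_upper, 2))
for scale, (low, high) in intensity_multipliers.items():
    print(scale, round(low, 2), round(high, 2))
```

With a scale of 1.0, the central 95% of the intensity prior spans multipliers from roughly one seventh to about seven, which illustrates why a smaller scale such as 0.5 or 0.25 may be more realistic for this parameter.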

Bayesian SITAR model - Multivariate formulation

The univariate model described earlier can be easily extended to analyze two or more outcomes simultaneously. Consider two outcomes \(Y1\) and \(Y2\) (e.g., height and weight) measured repeatedly on \(J\) individuals \((j = 1, \ldots, J)\), where individual \(j\) provides \(n_j\) measurements \((i = 1, \ldots, n_j)\) of \(y_{ij}^\text{Y1}\) (outcome \(Y1\)) and \(y_{ij}^\text{Y2}\) (outcome \(Y2\)) recorded at age \(x_{ij}\). A multivariate model is then written as follows:

\[\begin{equation} \label{eq:4} \begin{aligned} \begin{bmatrix} \text{Y1}_{ij} \\ \text{Y2}_{ij} \end{bmatrix} & \sim \operatorname{MVNormal}\begin{pmatrix} \begin{bmatrix} \mu_{ij}^\text{Y1} \\ \mu_{ij}^\text{Y2} \end{bmatrix}, \mathbf {\Sigma_{Residual}} \end{pmatrix} \\ \\ \mu_{ij}^{\text{Y1}} & = \alpha_0^\text{Y1}+ \alpha_j^\text{Y1}+\sum_{r^\text{Y1}=1}^{p^\text{Y1}-1}{\beta_r^\text{Y1}\mathbf{Spl}^\text{Y1}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0^\text{Y1}+\zeta_j^\text{Y1}\right)}{e^{-\left(\gamma_0^\text{Y1}+\gamma_j^\text{Y1}\right)}}\right)} \\ \\ \mu_{ij}^{\text{Y2}} & = \alpha_0^\text{Y2}+ \alpha_j^\text{Y2}+\sum_{r^\text{Y2}=1}^{p^\text{Y2}-1}{\beta_r^\text{Y2}\mathbf{Spl}^\text{Y2}\left(\frac{x_{ij}-\bar{x}-\left(\zeta_0^\text{Y2}+\zeta_j^\text{Y2}\right)}{e^{-\left(\gamma_0^\text{Y2}+\gamma_j^\text{Y2}\right)}}\right)} \\ \\ \mathbf {\Sigma_{Residual}} & = \mathbf S_W \mathbf R_W \mathbf S_W \\ \\ \mathbf S_W & = \begin{bmatrix} \sigma_{ij}^\text{Y1} & 0 \\ 0 & \sigma_{ij}^\text{Y2} \\ \end{bmatrix} \\ \\ \mathbf R_W & = \begin{bmatrix} 1 & \rho_{\sigma_{ij}^\text{Y1}\sigma_{ij}^\text{Y2}} \\ \rho_{\sigma_{ij}^\text{Y2}\sigma_{ij}^\text{Y1}} & 1 \end{bmatrix} \\ \\ \begin{bmatrix} \alpha_{j}^\text{Y1} \\ \zeta_{j}^\text{Y1} \\ \gamma_{j}^\text{Y1} \\ \alpha_{j}^\text{Y2} \\ \zeta_{j}^\text{Y2} \\ \gamma_{j}^\text{Y2} \end{bmatrix} & \sim {\operatorname{MVNormal}\begin{pmatrix} \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix},\ \mathbf {\Sigma_{ID}} \end{pmatrix}} \\ \\ \\ \mathbf {\Sigma_{ID}} & = \mathbf S_{ID} \mathbf R_{ID} \mathbf S_{ID} \\ \\ \mathbf S_{ID} & = \begin{bmatrix} \sigma_{\alpha_{j}^\text{Y1}} & 0 & 0 & 0 & 0 & 0 \\ 0 & \sigma_{\zeta_{j}^\text{Y1}} & 0 & 0 & 0 & 0 \\ 0 & 0 & \sigma_{\gamma_{j}^\text{Y1}} & 0 & 0 & 0 \\ 0 & 0 & 0 & \sigma_{\alpha_{j}^\text{Y2}} & 0 & 0 \\ 0 & 0 & 0 & 0 & \sigma_{\zeta_{j}^\text{Y2}} & 0 \\ 0 & 0 & 0 & 0 & 0 & \sigma_{\gamma_{j}^\text{Y2}} \\ \end{bmatrix} \\ \\ \mathbf R_{ID} & = \begin{bmatrix} 1 & 
\rho_{\alpha_{j}^\text{Y1}\zeta_{j}^\text{Y1}} & \rho_{\alpha_{j}^\text{Y1}\gamma_{j}^\text{Y1}} & \rho_{\alpha_{j}^\text{Y1}\alpha_{j}^\text{Y2}} & \rho_{\alpha_{j}^\text{Y1}\zeta_{j}^\text{Y2}} & \rho_{\alpha_{j}^\text{Y1}\gamma_{j}^\text{Y2}} \\ \rho_{\zeta_{j}^\text{Y1}\alpha_{j}^\text{Y1}} & 1 & \rho_{\zeta_{j}^\text{Y1}\gamma_{j}^\text{Y1}} & \rho_{\zeta_{j}^\text{Y1}\alpha_{j}^\text{Y2}} & \rho_{\zeta_{j}^\text{Y1}\zeta_{j}^\text{Y2}} & \rho_{\zeta_{j}^\text{Y1}\gamma_{j}^\text{Y2}} \\ \rho_{\gamma_{j}^\text{Y1}\alpha_{j}^\text{Y1}} & \rho_{\gamma_{j}^\text{Y1}\zeta_{j}^\text{Y1}} & 1 & \rho_{\gamma_{j}^\text{Y1}\alpha_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y1}\zeta_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y1}\gamma_{j}^\text{Y2}} \\ \rho_{\alpha_{j}^\text{Y1}\alpha_{j}^\text{Y2}} & \rho_{\zeta_{j}^\text{Y1}\alpha_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y1}\alpha_{j}^\text{Y2}} & 1 & \rho_{\alpha_{j}^\text{Y2}\zeta_{j}^\text{Y2}} & \rho_{\alpha_{j}^\text{Y2}\gamma_{j}^\text{Y2}} \\ \rho_{\alpha_{j}^\text{Y1}\zeta_{j}^\text{Y2}} & \rho_{\zeta_{j}^\text{Y1}\zeta_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y1}\zeta_{j}^\text{Y2}} & \rho_{\zeta_{j}^\text{Y2}\alpha_{j}^\text{Y2}} & 1 & \rho_{\zeta_{j}^\text{Y2}\gamma_{j}^\text{Y2}} \\ \rho_{\alpha_{j}^\text{Y1}\gamma_{j}^\text{Y2}} & \rho_{\zeta_{j}^\text{Y1}\gamma_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y1}\gamma_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y2}\alpha_{j}^\text{Y2}} & \rho_{\gamma_{j}^\text{Y2}\zeta_{j}^\text{Y2}} & 1 \end{bmatrix} \\ \\ \alpha_0^\text{Y1} & \sim \operatorname{normal}(\ y^\text{Y1}_{mean},\ {y^\text{Y1}_{sd}}) \\ \alpha_0^\text{Y2} & \sim \operatorname{normal}(\ y^\text{Y2}_{mean},\ {y^\text{Y2}_{sd}}) \\ \zeta_0^\text{Y1} & \sim \operatorname{normal}(\ 0, 2.0) \\ \zeta_0^\text{Y2} & \sim \operatorname{normal}(\ 0, 2.0) \\ \gamma_0^\text{Y1} & \sim \operatorname{normal}(\ 0, 1.0) \\ \gamma_0^\text{Y2} & \sim \operatorname{normal}(\ 0, 1.0) \\ \beta_1^\text{Y1} \text{,..,} 
\beta_{p^\text{Y1}-1}^\text{Y1} & \sim \operatorname{normal}(\ \mathbf {\beta^\text{Y1}_{lm}}, \ \mathbf {X^\text{Y1}_{Spl}}) \\ \beta_1^\text{Y2} \text{,..,} \beta_{p^\text{Y2}-1}^\text{Y2} & \sim \operatorname{normal}(\ \mathbf {\beta^\text{Y2}_{lm}}, \ \mathbf {X^\text{Y2}_{Spl}}) \\ \sigma_{\alpha_j^\text{Y1}} & \sim \operatorname{normal_{Half}}(\ {0},\ {y^\text{Y1}_{sd}}) \\ \sigma_{\alpha_j^\text{Y2}} & \sim \operatorname{normal_{Half}}(\ {0},\ {y^\text{Y2}_{sd}}) \\ \sigma_{\zeta_j^\text{Y1}} & \sim \operatorname{normal_{Half}}(\ {0},\ {2.0}) \\ \sigma_{\zeta_j^\text{Y2}} & \sim \operatorname{normal_{Half}}(\ {0},\ {2.0}) \\ \sigma_{\gamma_j^\text{Y1}} & \sim \operatorname{normal_{Half}}(\ {0},\ {1.0}) \\ \sigma_{\gamma_j^\text{Y2}} & \sim \operatorname{normal_{Half}}(\ {0},\ {1.0}) \\ \sigma_{ij}^\text{Y1} & \sim \operatorname{normal_{Half}}(\ {0},\ {y^\text{Y1}_{sd}}) \\ \sigma_{ij}^\text{Y2} & \sim \operatorname{normal_{Half}}(\ {0},\ {y^\text{Y2}_{sd}}) \\ \mathbf R_W & \sim \operatorname{LKJ}(1) \\ \mathbf R_{ID} & \sim \operatorname{LKJ}(1) \end{aligned} \end{equation}\]

where the superscripts \(^\text{Y1}\) and \(^\text{Y2}\) indicate the outcome to which each parameter belongs. This is a straightforward multivariate generalization of the previous model (see Equation \(\ref{eq:3}\)). At the individual level, we have six parameters varying across individuals, resulting in a \(6 \times 6\) \(\mathbf{S}_{ID}\) matrix and a \(6 \times 6\) \(\mathbf{R}_{ID}\) matrix. The within-individual variability is captured by the residual parameters, which include a \(2 \times 2\) \(\mathbf{S}_{W}\) matrix and a \(2 \times 2\) \(\mathbf{R}_{W}\) matrix. The priors described above for the univariate model specification are applied to each outcome, and the default setting applies the same priors to all outcomes. The prior on the residual correlation between outcomes is the LKJ prior, as described earlier for the correlation between group-level random effects. A detailed description of the prior distributions for each parameter is provided in the Priors section.

Model estimation - frequentist vs. Bayesian

There are two competing philosophies of model estimation (Bland & Altman, 1998; Schoot et al., 2014): the Bayesian (based on Bayes’ theorem) and the frequentist (e.g., maximum likelihood estimation). While the frequentist approach was predominant in earlier years, the advent of powerful computers has given new impetus to Bayesian analysis (Bland & Altman, 1998; Hamra et al., 2013; Schoot et al., 2014). As a result, Bayesian statistical methods are becoming increasingly popular in applied, clinical, and fundamental research (Schoot et al., 2014). The key difference between Bayesian statistical inference and frequentist statistical methods concerns the nature of the unknown parameters. In the frequentist framework, a parameter of interest is assumed to be unknown, but fixed. That is, it is assumed that in the population there is only one true population parameter, for example, one true mean or one true regression coefficient. In the Bayesian view of subjective probability, all unknown parameters are treated as uncertain and, therefore, should be described by a probability distribution (Schoot et al., 2014). A particularly attractive feature of Bayesian modeling is its ability to handle otherwise complex model specifications, such as hierarchical models (i.e., multilevel/mixed-effects models) that involve nested data structures (e.g., repeated height measurements in individuals) (Hamra et al., 2013).

There are three essential components underlying Bayesian statistics (Bayes, 1763; Stigler, 1986): the prior distribution, the likelihood function, and the posterior distribution. The prior distribution refers to all knowledge available before seeing the data, whereas the likelihood function expresses the information in the data given the parameters defined in the model. The third component, the posterior distribution, is obtained by combining the first two components via Bayes’ theorem, and the results are summarized by the so-called posterior inference. The posterior distribution, therefore, reflects one’s updated knowledge, balancing prior knowledge with observed data.
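
The interplay of the three components can be illustrated with the simplest case that has a closed-form answer: a normal prior on a mean combined with normally distributed data whose residual standard deviation is treated as known. The Python sketch below is an illustration under those assumptions only; the function name and the height values are made up and the SITAR model itself has no such closed form.

```python
def normal_posterior(prior_mean, prior_sd, data, sigma):
    """Posterior of a normal mean, given a normal prior and known residual SD."""
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2   # information carried by the prior
    data_precision = n / sigma ** 2         # information carried by the likelihood
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(data) / n
    # The posterior mean is a precision-weighted average of prior mean and data mean.
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, post_var ** 0.5


heights = [150.2, 151.0, 149.5, 150.8]  # made-up measurements in cm
post_mean, post_sd = normal_posterior(prior_mean=145.0, prior_sd=10.0,
                                      data=heights, sigma=2.0)
print(round(post_mean, 3), round(post_sd, 3))
```

Because the prior here is weak relative to the data, the posterior mean lies close to the sample mean, and the posterior standard deviation is smaller than either source of information would give alone.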

The task of combining these components can lead to a complex model in which the exact distribution of one or more variables is unknown. Estimators that rely on assumptions of normality may perform poorly in such cases. This limitation is often mitigated by estimating Bayesian models using Markov Chain Monte Carlo (MCMC). Unlike deterministic maximum-likelihood algorithms, MCMC is a stochastic procedure that repeatedly generates random samples to characterize the distribution of parameters of interest (Hamra et al., 2013). The popular software platforms for Bayesian estimation include the BUGS family, such as WinBUGS (Lunn et al., 2000), OpenBUGS (Spiegelhalter et al., 2007), and JAGS (Plummer, 2003). More recently, the software Stan has been developed to achieve higher computational and algorithmic efficiency by using the No-U-Turn Sampler (NUTS), an adaptive variant of Hamiltonian Monte Carlo (HMC) (Gelman et al., 2015; Hoffman & Gelman, 2011; Neal, 2011).
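
To show what "repeatedly generating random samples" means in practice, the Python sketch below implements a random-walk Metropolis sampler, one of the simplest MCMC algorithms. It is deliberately not NUTS or HMC as used by Stan, and the target density, step size, and seed are assumptions chosen for this illustration.

```python
import math
import random
from statistics import mean, stdev


def metropolis(log_density, start, n_draws, step_sd, seed=1):
    """Random-walk Metropolis sampler for a one-dimensional target."""
    rng = random.Random(seed)
    current = start
    current_lp = log_density(current)
    draws = []
    for _ in range(n_draws):
        # Propose a move by adding normal noise to the current value.
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_density(proposal)
        # Accept with probability min(1, target(proposal) / target(current)).
        accept_prob = math.exp(min(0.0, proposal_lp - current_lp))
        if rng.random() < accept_prob:
            current, current_lp = proposal, proposal_lp
        draws.append(current)
    return draws


def log_target(x):
    """Unnormalized log density of a normal(150, 1) distribution (toy target)."""
    return -0.5 * (x - 150.0) ** 2


draws = metropolis(log_target, start=150.0, n_draws=20000, step_sd=2.0)
kept = draws[2000:]  # discard early draws as warm-up
print(round(mean(kept), 2), round(stdev(kept), 2))
```

The mean and standard deviation of the retained draws approximate those of the target distribution; samplers such as NUTS pursue the same goal but use gradient information to explore the posterior more efficiently.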

References

Bayes, T. (1763). LII. An essay towards solving a problem in the doctrine of chances. By the late Rev. Mr. Bayes, F. R. S. communicated by Mr. Price, in a letter to John Canton, A. M. F. R. S. Philosophical Transactions of the Royal Society of London, 53, 370–418.

Beath, K. J. (2007). Infant growth modelling using a shape invariant model with random effects. Statistics in Medicine, 26(12), 2547–2564. https://doi.org/10.1002/sim.2718

Bland, J. M., & Altman, D. G. (1998). Statistics notes: Bayesians and frequentists. BMJ : British Medical Journal, 317(7166), 1151. https://doi.org/10.1136/bmj.317.7166.1151

Bogin, B. (2010). Evolution of human growth (M. P. Muehlenbein, Ed.; pp. 379–395). Cambridge University Press. https://doi.org/10.1017/CBO9780511781193.028

Busscher, I., Kingma, I., Bruin, R. de, Wapstra, F. H., Verkerke, G. J., & Veldhuizen, A. G. (2012). Predicting the peak growth velocity in the individual child: Validation of a new growth model. European Spine Journal : Official Publication of the European Spine Society, the European Spinal Deformity Society, and the European Section of the Cervical Spine Research Society, 21(1), 71–76. https://doi.org/10.1007/s00586-011-1845-z

Cameron, N., & Bogin, B. (2012). Human growth and development (2nd ed.). Academic Press. https://books.google.co.in/books?id=V8v3yD9--x8C

Cole, T. (2022). Sitar: Super imposition by translation and rotation growth curve analysis. https://CRAN.R-project.org/package=sitar

Cole, T. J., Donaldson, M. D. C., & Ben-Shlomo, Y. (2010). SITAR—a useful instrument for growth curve analysis. International Journal of Epidemiology, 39(6), 1558–1566. https://doi.org/10.1093/ije/dyq115

Cole, T. J., & Mori, H. (2018). Fifty years of child height and weight in Japan and South Korea: Contrasting secular trend patterns analyzed by SITAR. American Journal of Human Biology, 30(1), e23054. https://doi.org/10.1002/ajhb.23054

Dean, J. A., Jones, J. E., & Vinson, L. A. W. (2016). McDonald and Avery’s dentistry for the child and adolescent (10th ed.). Elsevier Health Sciences. https://books.google.co.uk/books?id=HqtcCgAAQBAJ

Gelman, A., Lee, D., & Guo, J. (2015). Stan: A probabilistic programming language for Bayesian inference and optimization. Journal of Educational and Behavioral Statistics, 40(5), 530–543. https://doi.org/10.3102/1076998615606113

Hamra, G., MacLehose, R., & Richardson, D. (2013). Markov chain Monte Carlo: An introduction for epidemiologists. International Journal of Epidemiology, 42(2), 627–634. https://doi.org/10.1093/ije/dyt043

Hauspie, R. C., Cameron, N., & Molinari, L. (2004). Methods in human growth research. Cambridge University Press.

Hoffman, M. D., & Gelman, A. (2011). The No-U-Turn sampler: Adaptively setting path lengths in Hamiltonian Monte Carlo. https://doi.org/10.48550/ARXIV.1111.4246

Lindstrom, M. J. (1995). Self-modelling with random shift and scale parameters and a free-knot spline shape function. Statistics in Medicine, 14(18), 2009–2021. https://doi.org/10.1002/sim.4780141807

Lunn, D. J., Thomas, A., Best, N., & Spiegelhalter, D. (2000). WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing, 10(4), 325–337. https://doi.org/10.1023/A:1008929526011

Mansukoski, L., Johnson, W., Brooke‐Wavell, K., Galvez‐Sobral, J. A., Furlan, L., Cole, T. J., & Bogin, B. (2019). Life course associations of height, weight, fatness, grip strength, and all‐cause mortality for high socioeconomic status Guatemalans. American Journal of Human Biology, 31(4). https://doi.org/10.1002/ajhb.23253

McArdle, J. J. (2015). Growth curve analysis (J. Wright, Ed.; pp. 441–446). Elsevier. https://doi.org/10.1016/B978-0-08-097086-8.44030-4

Neal, R. M. (2011). MCMC Using Hamiltonian Dynamics. Routledge Handbooks Online. https://doi.org/10.1201/b10905-7

Nembidzane, C., Lesaoana, M., Monyeki, K. D., Boateng, A., & Makgae, P. J. (2020). Using the SITAR Method to Estimate Age at Peak Height Velocity of Children in Rural South Africa: Ellisras Longitudinal Study. Children, 7(3), 17. https://doi.org/10.3390/children7030017

Plummer, M. (2003). JAGS: A program for analysis of Bayesian graphical models using Gibbs sampling.

Proffit, W. R. (2014). Concepts of growth and development (W. R. Proffit, H. W. Fields, & D. M. Sarver, Eds.; 5th ed., pp. 20–65). Elsevier Health Sciences. https://books.google.co.in/books?id=gn8xBwAAQBAJ

Riddell, C. A., Platt, R. W., Bodnar, L. M., & Hutcheon, J. A. (2017). Classifying Gestational Weight Gain Trajectories Using the SITAR Growth Model. Paediatric and Perinatal Epidemiology, 31(2), 116–125. https://doi.org/10.1111/ppe.12336

Sanders, J. O., Browne, R. H., McConnell, S. J., Margraf, S. A., Cooney, T. E., & Finegold, D. N. (2007). Maturity assessment and curve progression in girls with idiopathic scoliosis. The Journal of Bone and Joint Surgery. American Volume, 89(1), 64–73. https://doi.org/10.2106/JBJS.F.00067

Sandhu, S. S. (2020). Analysis of longitudinal jaw growth data to study sex differences in timing and intensity of the adolescent growth spurt for normal growth and skeletal discrepancies [Thesis].

Schoot, R. van de, Kaplan, D., Denissen, J., Asendorpf, J. B., Neyer, F. J., & Aken, M. A. G. van. (2014). A gentle introduction to Bayesian analysis: Applications to developmental research. Child Development, 85(3), 842–860. https://doi.org/10.1111/cdev.12169

Spiegelhalter, D., Thomas, A., Best, N., & Lunn, D. (2007). OpenBUGS user manual. Version, 3(2), 2007.

Stigler, S. M. (1986). Laplace’s 1774 memoir on inverse probability. Statistical Science, 1(3), 359–363.

Stulp, G., & Barrett, L. (2016). Evolutionary perspectives on human height variation. Biological Reviews, 91(1), 206–234. https://doi.org/10.1111/brv.12165