This vignette provides a brief introduction to using the package Spower
for prospective/post-hoc, a priori, sensitivity, criterion, and compromise power analyses. For a more detailed description of the package refer to associated publication (Chalmers, in press) as well as the documentation found within the functions, particularly Spower()
, SpowerBatch()
, and SpowerCurve()
.
There are generally two-ways to go about using Spower
; either by utilizing a handful of build-in simulation experiments with an associated analysis, or by constructing the simulation experiment yourself. In either case, the goal is to manipulate R code to perform a single simulation experiment and have the function return:
sig.level
(\(\alpha\)) indicates whether a sample was (\(P(D|H_0)<\alpha\)) or was not (\(P(D|H_0)\ge\alpha\)) significant,sig.level
to indicate that a sample supported (\(P(H_1|D)>\alpha\)) or did not support (\(P(H_1|D)\le\alpha\)) the hypothesis of interest (requires setting Spower(..., sig.direction = 'above')
), orTRUE
if rejecting the null; TRUE
if supporting the alternative; some more complicated combination involving multiple hypothesis or regions or practical significance, etc)The package largely focuses on the first point as these are more common applications of power analysis, though nothing precludes the package from more complex power analysis tests and experiments. See the vignette “Bayesian power analyses and logical vector returns” for further examples involving CIs, ROPEs, equivalence tests, precision tests, and Bayes Factors.
Spower
ships with several common experiment structures, such as those involving linear regression models, mediation, ANOVAs, \(t\)-tests, and so on. The simulation experiments are organized with the prefix p_
, followed by the name of the analysis method. For instance,
## [1] 0.004935905
performs a single simulation experiment reflecting the null hypothesis \(H_0:\, R^2=0\) for a linear regression model with \(k=3\) predictor variables and a sample size of \(N=50\). Translating this into a power-analysis simply requires passing this experiment in some form to Spower()
(the details of which are discussed below), where in this case the estimate of power (\(1-\beta\)) is returned.
##
## Design conditions:
##
## # A tibble: 1 × 5
## n R2 k sig.level power
## <dbl> <dbl> <dbl> <dbl> <lgl>
## 1 50 0.3 3 0.05 NA
##
## Estimate of power: 0.971
## 95% Confidence Interval: [0.968, 0.974]
Each of these functions return a \(p\)-value (\(P(D|H_0\)) as this is the general information required to evaluate statistical power with Spower()
. Alternatively, users may define their own simulation functions if the desired experiment has not been defined within the package.
As a very simple example, suppose one were interested in the power to reject the null hypothesis \(H_0:\, \mu = \mu_0\) in a one-sample \(t\)-test scenario, where \(P(D|H_0\)) is the probability of the data given the null hypothesis. While the package already supports this (see help(p_t.test)
), users may write their own version of this experiment, one instance of which may look like the following.
Using the defaults of this defined function evaluates whether a sample of data drawn from a Gaussian distribution with some specific mean
(\(\mu\)) differs from the value of \(mu = 0\) (\(\mu_0\)). Hence, the \(p\)-value returned from this experiments reflects the null hypothesis \(H_0:\, \mu=\mu_0\), or more specifically \(H_0:\, \mu=0\), for a single generated dataset.
## [1] 0.06328707
For power analyses there are typically four parameters that are/can be manipulated or solved in a given experiment: the \(\alpha\) level (Type I error, often reflexively set to \(\alpha=.05\)), power (the complement of the Type II error, \(1-\beta\)), an effect size of interest, and the sample size. Given three of these values, the fourth can always be solved. Note that this description reflects a rule-of-thumb as there may be multiple effect sizes of interest, multiple sample sizes (e.g., in the form of cluster sizes in multi-level models), and so on. In Spower()
switching between these types of power analysis criteria is done simply by explicitly specifying which parameter is missing (NA
), as demonstrated below.
The canonical setup for Spower()
will evaluate prospective or post-hoc power, thereby obtaining the estimate \(1-\hat{\beta}\). Default in Spower()
uses \(10,000\) independent simulation replications
. The following provides an estimate of power given the null hypothesis \(H_0:\, \mu=0.3\).
##
## Design conditions:
##
## # A tibble: 1 × 5
## n mean mu sig.level power
## <dbl> <dbl> <dbl> <dbl> <lgl>
## 1 100 0.5 0.3 0.05 NA
##
## Estimate of power: 0.515
## 95% Confidence Interval: [0.505, 0.525]
Compromise power analysis involves manipulating the \(\alpha\) level until some sufficient balance between the Type I and Type II error rates are met, expressed in terms of the ratio \(q=\frac{\beta}{\alpha}\).
In Spower
, there are two ways to approach this criterion. The first way, which focuses on the beta_alpha
ratio at the outset, simply requires passing the target ratio to Spower()
using the same setup as the previous prospective power analysis.
##
## Design conditions:
##
## # A tibble: 1 × 6
## n mean mu sig.level power beta_alpha
## <dbl> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 100 0.5 0.3 NA NA 4
##
## Estimate of Type I error rate (alpha/sig.level): 0.093
## 95% Confidence Interval: [0.087, 0.098]
##
## Estimate of power (1-beta): 0.629
## 95% Confidence Interval: [0.620, 0.638]
This returns the estimated sig.level
(\(\hat{\alpha}\)) and resulting \(\hat{\beta}\) that satisfies the target \(q\) ratio.
## [1] 4
The second way to perform a compromise analysis is to re-use a previous evaluation of a prospective power analysis as this contains all the necessary information.
##
## Design conditions:
##
## # A tibble: 1 × 6
## n mean mu sig.level power beta_alpha
## <dbl> <dbl> <dbl> <dbl> <lgl> <dbl>
## 1 100 0.5 0.3 NA NA 4
##
## Estimate of Type I error rate (alpha/sig.level): 0.093
## 95% Confidence Interval: [0.088, 0.099]
##
## Estimate of power (1-beta): 0.627
## 95% Confidence Interval: [0.617, 0.636]
In either case, the use of update()
can be beneficial as the stored result can be reused with alternative beta_alpha
values without having to generate and analyse new experimental data.
The goal of a priori power analysis is generally to obtain the sample size (\(N\)) associated with a specific power rate of interest (e.g., \(1-\beta=.90\)), which is useful in the context of future data collection.
To estimate the sample size using Monte Carlo simulation experiments, Spower()
performs stochastic root solving using the ProBABLI approach from the SimDesign
package, and requires a specific search interval
to search within. The width of the interval should reflect a plausible range where the researcher believes the solution to lie, however this may be quite large as well as ProBABLI is less influenced by the range of the interval (Chalmers, 2024).
Below the sample size n
is solved to achieve a target power of \(1-\beta=.80\), where the solution was suspected to lie somewhere between the search interval = c(20, 200)
.
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <dbl>
## 1 NA 0.5 0.05 0.8
##
## Estimate of n: 32.8
## 95% Prediction Interval: [32.4, 33.1]
Similar to a priori power analysis, however the goal is now to find some specific standardized or unstandardized effect size associated with a specific power rate. This pertains to the question of how large an effect size must be in order to reliably detect the effect of interest.
Below the sample size is fixed at \(N=100\), while the search interval for the standardized effect size \(d\) is searched between the interval = c(.1, 3)
. Note that the use of decimals in the interval tells Spower()
to use a continuous rather than discrete search space (cf. with a priori, which uses an integer search space for the simulation replicates).
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <dbl>
## 1 100 NA 0.05 0.8
##
## Estimate of mean: 0.281
## 95% Prediction Interval: [0.280, 0.283]
Finally, in criterion power analysis the goal is to located the associated \(\alpha\) level (sig.level
) require to achieve a target power output holding constant the other modeling information. This is done in Spower()
by setting the sig.level
input to NA
while providing values for the other parameters. Note that no search interval is require in this context as \(\alpha\) necessarily lies between the interval \([0,1]\).
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <dbl>
## 1 50 0.5 NA 0.8
##
## Estimate of sig.level: 0.010
## 95% Prediction Interval: [0.009, 0.010]
The following functions, SpowerBatch()
and SpowerCurve()
, can be used to evaluate and visualize power analysis results across a range on inputs rather than for a single set of fixed inputs. This section demonstrate their general usage as the specifications slightly differ from that of Spower()
, despite the fact that Spower()
is the underlying estimation engine.
To begin, suppose that there was interest in evaluating the p_single.t()
function across multiple \(n\) inputs to obtain estimates of \(1-\beta\). While this could be performed using independent calls to Spower()
, the function SpowerBatch()
can instead be used, where the variable inputs can be specified in vector format. For instance, given the effect size \(\mu=.5\), what would be the power to reject the null hypothesis \(H_0:\, \mu=0\) across three different sample sizes.
## $CONDITION_1
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <lgl>
## 1 30 0.5 0.05 NA
##
## Estimate of power: 0.752
## 95% Confidence Interval: [0.743, 0.760]
##
## $CONDITION_2
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <lgl>
## 1 60 0.5 0.05 NA
##
## Estimate of power: 0.967
## 95% Confidence Interval: [0.963, 0.970]
##
## $CONDITION_3
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <lgl>
## 1 90 0.5 0.05 NA
##
## Estimate of power: 0.996
## 95% Confidence Interval: [0.995, 0.997]
This can further be coerced to a data.frame
object if there is reason to do so (e.g., for plotting purpose).
## n mean sig.level power
## 1 30 0.5 0.05 0.7516
## 2 60 0.5 0.05 0.9669
## 3 90 0.5 0.05 0.9963
Similarly, if the were related to a priori analyses for sample size planning then the inputs to SpowerBatch()
would be modified to set n
to the missing quantity.
apriori.batch <- p_single.t(mean=.5, n=NA) |>
SpowerBatch(power=c(.7, .8, .9), interval=c(20, 200))
apriori.batch
## $CONDITION_1
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <dbl>
## 1 NA 0.5 0.05 0.7
##
## Estimate of n: 26.7
## 95% Prediction Interval: [26.5, 27.0]
##
## $CONDITION_2
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <dbl>
## 1 NA 0.5 0.05 0.8
##
## Estimate of n: 33.3
## 95% Prediction Interval: [32.9, 33.7]
##
## $CONDITION_3
##
## Design conditions:
##
## # A tibble: 1 × 4
## n mean sig.level power
## <dbl> <dbl> <dbl> <dbl>
## 1 NA 0.5 0.05 0.9
##
## Estimate of n: 44.4
## 95% Prediction Interval: [43.8, 44.8]
## n mean sig.level power
## 1 26.71726 0.5 0.05 0.7
## 2 33.31082 0.5 0.05 0.8
## 3 44.36998 0.5 0.05 0.9
Often times researchers wish to visualize the results of power analyses in the form of graphical representations. Spower
supports various types of visualizations through the function SpowerCurve()
, which creates power curve plots of previously obtained results (e.g., via SpowerBatch()
) or for to-be-explored inputs. Importantly, each graphic contains estimates of the Monte Carlo sampling uncertainty to deter over-interpretation of any resulting point-estimates.
To demonstrate, suppose one were interested in visualizing the power for the running single-sample \(t\) test across four different sample sizes, \(N=[30,60,90,120]\). To do this requires passing the simulation experiment and varying information to the function SpowerCurve()
, which fills in the variable information to the supplied simulation experiment and plots the resulting output.
Alternatively, were the above information already evaluated using SpowerBatch()
then this batch
object could be passed directly to SpowerCurve()
, thereby avoiding the need to re-evaluate the simulation experiments.
SpowerCurve()
will accept as many arguments as exists in the supplied simulation experiment definition, however it will only plot the first three variable input specifications as anything past this becomes more difficult to display automatically. Below is an example that varies both the n
input as well as the input mean
, where the first input appears on the \(x\)-axis while the second is mapped to the default colour aesthetic in ggplot2
.