missRanger 2.6.0

Major bug fix

Fixes a major bug that caused response variables to be used as covariates in the random forests. Thanks to @flystar233 for reporting it, see #78. Expect different and better imputations.

Major feature

Out-of-sample application is now possible! Thanks to @jeandigitale for pushing the idea in #58.

This means you can run imp <- missRanger(..., keep_forests = TRUE) and then apply its models to new data via predict(imp, newdata). The “missRanger” object can be saved and loaded as a binary file, e.g., via saveRDS()/readRDS(), for later use.
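A minimal sketch of this workflow, assuming the iris data with artificially generated missing values and a temporary file as storage location (both chosen only for illustration):

```r
library(missRanger)

# Training data with artificial missing values (illustrative choice)
train <- generateNA(iris, p = 0.1, seed = 1)

# Fit and keep the forests; the result is a "missRanger" object
imp <- missRanger(train, num.trees = 50, keep_forests = TRUE, seed = 1, verbose = 0)
head(imp$data)  # imputed training data

# Store the object and load it again later (hypothetical file location)
path <- tempfile(fileext = ".rds")
saveRDS(imp, path)
imp2 <- readRDS(path)

# New rows with one missing value each
newdata <- iris[c(1, 51, 101), ]
newdata$Sepal.Length[1] <- NA
newdata$Species[2] <- NA
newdata$Petal.Width[3] <- NA

# Apply the stored forests to the new rows
filled <- predict(imp2, newdata)
filled
```

Keeping the forests can use a lot of memory, so a small num.trees as in this sketch may be a reasonable starting point when the object is meant to be stored.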

Note that out-of-sample imputation works best for rows in newdata with only one missing value (counting only missings in variables used as covariates in random forests). We call this the “easy case”. In the “hard case”, even multiple iterations (set by iter) can lead to unsatisfactory results.
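To see which rows of a new dataset fall into which case, one could count the missing values per row. The following base R sketch uses hypothetical data and assumes that columns a, b, and c are the covariates of the forests, while id is not used:

```r
# Hypothetical new data; NA marks missing values
newdata <- data.frame(
  a = c(1, NA, NA, 4),
  b = c(NA, 2, NA, 4),
  c = c(1, 2, 3, NA),
  id = c(NA, "r2", "r3", "r4")  # assumed not to be a covariate, so its NA is ignored
)

# Assumption: these columns are used as covariates in the random forests
covariates <- c("a", "b", "c")

# Count missing values per row, restricted to the covariate columns
n_missing <- rowSums(is.na(newdata[covariates]))

# Rows with at most one such missing value are "easy", all others "hard"
case <- ifelse(n_missing <= 1, "easy", "hard")
data.frame(n_missing, case)
```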

The out-of-sample algorithm works as follows:

  1. Impute univariately all relevant columns by randomly drawing values from the original unimputed data. This step will only impact “hard case” rows.
  2. Replace univariate imputations by predictions of random forests. This is done sequentially over variables, where the variables are sorted to minimize the impact of univariate imputations. Optionally, this is followed by predictive mean matching (PMM).
  3. Repeat Step 2 for “hard case” rows multiple times.
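The three steps can be illustrated with a toy sketch in base R. It is not the package's implementation: lm() stands in for the stored random forests, PMM is omitted, the variable order (fewest missing values first) and the number of iterations are assumptions, and all object names are hypothetical.

```r
set.seed(1)

# Toy training data; lm() fits stand in for the stored random forests
train <- data.frame(x = rnorm(100))
train$y <- 2 * train$x + rnorm(100, sd = 0.1)
train$z <- train$x - train$y + rnorm(100, sd = 0.1)
models <- list(
  x = lm(x ~ y + z, data = train),
  y = lm(y ~ x + z, data = train),
  z = lm(z ~ x + y, data = train)
)

# New data: rows 1 and 2 are "easy" (one NA), row 3 is "hard" (two NAs)
newdata <- data.frame(x = c(NA, 0.5, NA), y = c(1, NA, NA), z = c(0.2, -0.4, 0.3))
to_fill <- is.na(newdata)         # which cells need a value
hard <- rowSums(to_fill) > 1

imputed <- newdata

# Step 1: univariate imputation by random draws from observed training values
for (v in names(models)) {
  n_miss <- sum(to_fill[, v])
  if (n_miss > 0) {
    observed <- train[[v]][!is.na(train[[v]])]
    imputed[to_fill[, v], v] <- sample(observed, n_miss, replace = TRUE)
  }
}

# Assumed order: variables with fewer missing values first
n_by_var <- colSums(to_fill)
vars <- names(sort(n_by_var[n_by_var > 0]))

# Steps 2 and 3: replace the draws by model predictions, one variable at a time;
# after the first pass, only "hard" rows are updated again
n_iter <- 4  # illustrative number of iterations
for (i in seq_len(n_iter)) {
  rows <- if (i == 1) rep(TRUE, nrow(newdata)) else hard
  for (v in vars) {
    idx <- to_fill[, v] & rows
    if (any(idx)) {
      imputed[idx, v] <- predict(models[[v]], newdata = imputed[idx, , drop = FALSE])
    }
  }
}

imputed
```

In the "easy" rows, every prediction relies on observed values only, which is why a single pass suffices there. In the "hard" row, the first prediction depends on a random draw from Step 1, so repeated passes are used to reduce its influence.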

Possibly breaking changes

Minor changes in output object

Other changes

missRanger 2.5.0

Bug fixes

Enhancements

missRanger 2.4.0

Future Output API

Enhancements

Bug fixes

missRanger 2.3.0

Major improvements

Other changes

missRanger 2.2.1

missRanger 2.2.0

Fewer dependencies

Maintenance

missRanger 2.1.5 (not on CRAN)

Maintenance release.

missRanger 2.1.4 (not on CRAN)

Minor changes

missRanger 2.1.2 and 2.1.3

Maintenance update

missRanger 2.1.1

Minor changes

Documentation

Other

missRanger 2.1.0

This is a summary of all changes since version 1.x.x.

Major changes

Minor changes

Minor bug fix