Fixed lifecycle::deprecated() function calls that were causing
“could not find function deprecated” errors for users. All function
parameters now properly use lifecycle::deprecated() instead
of deprecated().

train_spectra() now correctly populates
RMSEcv and R2cv for rf,
svmLinear, and svmRadial model methods.
Previously these were always NA for non-PLS models.

Updated plot_spectra(), aggregate_spectra(),
train_spectra(), and the ikeogu.2017 example
to resolve deprecation warnings.

Corrected the trainControl() configuration in
train_spectra(): changed method from
"repeatedcv" (with no repeats set) to "cv" and
removed a malformed seeds argument that was being silently
ignored. Reproducibility is maintained via the existing
set.seed() call.

The final random forest model in train_spectra()
is now trained with 500 trees (the randomForest default) instead of
tune.length (≤5). The tuned mtry from training
iterations is now correctly applied to the final model.

train_spectra() no longer incorrectly uses the
unique.id column as a predictor. Training data is now
subset to reference and spectral columns only, consistent with iterative
model training.

The final PLS model in train_spectra() is now fit using the modal best
ncomp from training iterations rather than
tune.length. predict_spectra() now reads
ncomp directly from the model object instead of the stats
file.

train_spectra() now correctly subsets
partitioned data using column names rather than pre-computed indices,
preventing an “undefined columns selected” crash when
cv.scheme is used (the genotype column is removed by
format_cv() before subsetting).

train_spectra() now correctly determines the
training data for the final model when cv.scheme is used.
It calls format_cv() one final time to obtain the training
set defined by the chosen scheme: CV0/CV00 train on trial2 + trial3; CV1
trains on t1.b + t2.b; CV2 trains on t1.b + trial2. In all schemes the
test set is t1.a (held-out genotypes from trial1).

test_spectra() now correctly forwards
best.model.metric and seed to
train_spectra(). Previously these arguments were accepted
but silently ignored.

train_spectra() now calls
set.seed() before createDataPartition(),
ensuring that stratified train/test splits are fully reproducible.
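
A minimal sketch of why the ordering matters, using base R's sample.int() as a stand-in for createDataPartition(). The helper names split_seed_first() and split_seed_last() are hypothetical and are not part of waves:

```r
# Hypothetical helpers for illustration only; not waves functions.
# sample.int() stands in for caret::createDataPartition().

# Correct order: seed first, then draw the training rows.
split_seed_first <- function(n, proportion.train = 0.7, seed = 1) {
  set.seed(seed)
  sort(sample.int(n, size = round(proportion.train * n)))
}

# Old order: draw first, seed afterwards. The draw depends on whatever
# the random state was when the function was called.
split_seed_last <- function(n, proportion.train = 0.7, seed = 1) {
  rows <- sort(sample.int(n, size = round(proportion.train * n)))
  set.seed(seed)
  rows
}

set.seed(10)  # simulate an arbitrary session state
first <- split_seed_first(100, seed = 42)
set.seed(20)  # simulate a different session state
second <- split_seed_first(100, seed = 42)
same_when_seeded_first <- identical(first, second)  # TRUE

set.seed(10)
old_a <- split_seed_last(100, seed = 42)
set.seed(20)
old_b <- split_seed_last(100, seed = 42)
# Not guaranteed to match: the rows were drawn before the seed was applied.
same_when_seeded_last <- identical(old_a, old_b)
```

With the seed set first, repeated calls return the same rows regardless of the session's random state beforehand.
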
Previously, set.seed() was called after
createDataPartition(), so the partition depended on
whatever the random state happened to be at call time.

Note on
reproducibility: users who called test_spectra()
with a non-default seed were previously receiving results
generated with seed = 1 regardless of the value they set.
To reproduce old results, set seed = 1 explicitly. This
will restore previous iterative performance statistics (RMSEp, R2p,
etc.), but the returned $model object will still differ due
to the RF, SVM, and PLS final model fixes also introduced in this
version.

Removed ntree and mtry
arguments from predict.randomForest() calls in
train_individual_model() and
predict_spectra(). These arguments are not accepted by
predict.randomForest() and were silently ignored.

Fixed importance output when running
test_spectra() with multiple pretreatments and an SVM
model. Previously, cbind() on a NULL
importance element produced a garbage 1×1 matrix instead of
NULL.

plot_spectra() no longer returns an error when
detect.outliers is set to FALSE and no
alternative title is provided via the alternate.title
parameter (#29).

Addressed an issue with the spectacles package (v0.5.5) that was causing data frame
construction errors in model performance calculations.

Issue with spectacles resolved (#31).
spectacles is now restored on CRAN. waves is fully compatible with the restored
version.

Optimized train_spectra()
using vectorized indexing and preallocated result structures.

Optimized train_spectra() for
faster column selection and reduced memory overhead.

train_spectra() and test_spectra().

When return.distances = TRUE, the h.distance column is
now located between metadata and spectra in the returned
data.frame (#28).

Refactored train_spectra() and
test_spectra() to reduce cyclomatic complexity (#26).

train_spectra() complexity reduced from 32+ to 25

test_spectra() complexity reduced to 11

Added helper functions handle_deprecations(), validate_inputs(),
partition_data(), train_individual_model(),
calculate_performance(), and
create_cv_control().

predict_spectra() no longer returns an error when
running example code (#25).

When cv.scheme is set to “CV2” or “CV0” and there are
no overlapping genotypes between “trial1” and “trial2”,
format_cv() now returns NULL. Previously,
results would be returned even if no overlap was present, resulting in
incorrect CV scheme specification.

The format_cv() parameter cv.method is now the
boolean parameter stratified.sampling for consistency with
other waves functions.

plot_spectra() no longer requires a column named
“unique.id”.

save_model() output now works correctly with
predict_spectra().

train_spectra() no longer returns an error
when stratified.sampling = FALSE.

In train_spectra(), stratified random sampling of
training and test sets now allows the user to provide a seed value for
set.seed(). For random (non-stratified) sampling of
training and test sets, seed is set to the current iteration
number.

model.method = "svmLinear" and
model.method = "svmRadial" no longer return an error when
used in train_spectra() or
test_spectra().

test_spectra() now returns the trained model
correctly when only one pretreatment is specified.

The default title in plot_spectra() is now
NULL (no title) if detect.outliers is set to
FALSE.

Names in $summary.model.performance from test_spectra()
now include underscores rather than periods for easier parsing.

vignette("waves")

AggregateSpectra() -> aggregate_spectra()
DoPreprocessing() -> pretreat_spectra()
FilterSpectra() -> filter_spectra()
FormatCV() -> format_cv()
PlotSpectra() -> plot_spectra()
SaveModel() -> save_model()
TestModelPerformance() -> test_spectra()
TrainSpectralModel() -> train_spectra()

preprocessing is now pretreatment.

tune.length must be set to 5 when
model.algorithm == "rf".

Added options to plot_spectra(), including
color and title customization and the option to forgo filtering
(#5).

train_spectra() and test_spectra().

save_model() now automatically selects the best model
if provided with multiple pretreatments.

wavelengths is no longer a required argument for any of
the waves functions.

The training proportion can now be set with proportion.train. Previously,
this proportion was fixed at 0.7 (#13).

aggregate_spectra() now allows for aggregation
by a single grouping column (#14).

The argument save.model in the function
save_model() has been renamed to write.model
for clarity.

TrainSpectralModel().

TrainSpectralModel() or when
preprocessing = TRUE in TestModelPerformance()
(#7).

PlotSpectra() now allows for missing data in
non-spectral columns of the input data frame.

Initial package release