This guide provides comprehensive parameter documentation for all E2E functions.
E2E includes example datasets for both diagnostic and prognostic modeling:
Trains base classification models for diagnostic tasks. Parameters:
- data (required): Data frame with sample names (column 1), outcomes 0/1 (column 2), features (columns 3+)
- model (required): Character vector of model names, or "all_dia" for all models
- tune: Logical (default FALSE). Whether to perform hyperparameter tuning
- threshold_choices: Threshold selection method
- seed: Integer (default 123). Random seed for reproducibility
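The expected layout of `data` can be illustrated with a small base-R sketch. The toy data frame, its column names, and the `check_dia_layout()` helper are hypothetical and not part of E2E; only the column order (sample names, 0/1 outcome, then features) comes from the parameter description above.

```r
# Hypothetical toy data following the documented layout:
# column 1 = sample names, column 2 = 0/1 outcome, columns 3+ = features.
set.seed(123)
n <- 20
toy_dia <- data.frame(
  sample  = paste0("S", seq_len(n)),
  outcome = rep(c(0, 1), length.out = n),
  feat_a  = rnorm(n),
  feat_b  = rnorm(n),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not an E2E function): sanity-check the layout
# before passing the data frame to a training function.
check_dia_layout <- function(df) {
  stopifnot(is.data.frame(df), ncol(df) >= 3)
  ids_ok     <- !anyDuplicated(df[[1]])      # sample names are unique
  outcome_ok <- all(df[[2]] %in% c(0, 1))    # outcome is coded 0/1
  ids_ok && outcome_ok
}

layout_ok <- check_dia_layout(toy_dia)
```

A check like this catches a misplaced outcome column early, before any model is trained.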
Bootstrap aggregating ensemble method. Parameters:
- data (required): Training data frame
- base_model_name (required): Base model name (e.g., "xb", "rf")
- n_estimators: Integer (default 50). Number of base models
- subset_fraction: Numeric (default 0.632). Bootstrap sampling fraction
- tune_base_model: Logical (default FALSE). Tune base models
- threshold_choices: Same as models_dia()
- seed: Integer (default 123). Random seed
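To show how `n_estimators` and `subset_fraction` interact, here is a minimal base-R sketch of drawing the bootstrap subsets. It assumes each subset holds `floor(subset_fraction * n)` rows drawn with replacement; the source does not state how E2E samples internally, so treat this as an illustration of the general bagging idea rather than the package's implementation.

```r
set.seed(123)

n_samples       <- 100    # hypothetical number of training rows
n_estimators    <- 50     # documented default
subset_fraction <- 0.632  # documented default

# Assumption: subset size is the fraction of rows, rounded down.
subset_size <- floor(subset_fraction * n_samples)

# One vector of row indices per base model, drawn with replacement.
boot_idx <- lapply(seq_len(n_estimators), function(i) {
  sample.int(n_samples, size = subset_size, replace = TRUE)
})
```

Each element of `boot_idx` would be used to subset the training data for one base model; the ensemble prediction is then typically an average over the base models.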
Voting ensemble combining multiple models. Parameters:
- results_all_models (required): Output from models_dia()
- data (required): Training data
- type: Voting type
- weight_metric: String (default "AUROC"). Metric for soft voting weights
- top: Integer (default 5). Number of top models to use
- threshold_choices: Same as models_dia()
- seed: Integer (default 123). Random seed
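The role of `weight_metric` in soft voting can be sketched in base R. The probabilities and AUROC values below are made up, and normalising the metric to sum to one is an assumption about how weights might be derived; the hard-voting line is included only for contrast.

```r
# Hypothetical predicted probabilities: rows = samples, columns = models.
probs <- matrix(
  c(0.9, 0.2, 0.6,   # model "rf"
    0.8, 0.4, 0.4,   # model "xb"
    0.7, 0.1, 0.7),  # model "lasso"
  nrow = 3,
  dimnames = list(c("S1", "S2", "S3"), c("rf", "xb", "lasso"))
)

# Hypothetical AUROC per model, used as the weighting metric.
auroc <- c(rf = 0.90, xb = 0.85, lasso = 0.75)

# Assumption: weights are the metric values normalised to sum to 1.
weights <- auroc / sum(auroc)

# Soft voting: weighted average of probabilities per sample.
soft_vote <- as.numeric(probs %*% weights[colnames(probs)])

# Hard voting for contrast: majority of per-model 0/1 calls at 0.5.
hard_vote <- as.integer(rowMeans(probs >= 0.5) > 0.5)
```

Soft voting keeps the information in each model's confidence, while hard voting only counts class calls.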
Stacking ensemble with a meta-model. Parameters:
- results_all_models (required): Output from models_dia()
- data (required): Training data
- meta_model_name (required): Meta-model name (e.g., "lasso", "gbm")
- top: Integer (default 5). Number of top base models
- tune_meta: Logical (default FALSE). Tune the meta-model
- threshold_choices: Same as models_dia()
- seed: Integer (default 123). Random seed
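The stacking idea, where a meta-model learns from the base models' predictions, can be sketched with base R alone. The base-model probabilities here are simulated, and plain logistic regression via `glm()` is a stand-in for the meta-model; it is not one of the meta-models named above and is not meant to mirror E2E's internals.

```r
set.seed(123)
n <- 200
y <- rbinom(n, 1, 0.5)  # hypothetical 0/1 outcome

# Simulated probabilities standing in for the top base models' outputs.
base_preds <- data.frame(
  m1 = plogis(2.0 * y - 1.00 + rnorm(n)),
  m2 = plogis(1.5 * y - 0.75 + rnorm(n)),
  m3 = plogis(1.0 * y - 0.50 + rnorm(n))
)

# Stand-in meta-model: logistic regression on the base predictions.
meta_data <- data.frame(y = y, base_preds)
meta_fit  <- glm(y ~ ., data = meta_data, family = binomial())

# Stacked probability for each training sample.
stacked_prob <- predict(meta_fit, type = "response")
```

In practice the base predictions fed to a meta-model are usually out-of-fold, to avoid the meta-model rewarding overfitted base models.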
Handles imbalanced datasets using an EasyEnsemble-like algorithm. Parameters:
- data (required): Imbalanced training data
- base_model_name (required): Base model for balanced subsets
- n_estimators: Integer (default 10). Number of balanced subsets
- tune_base_model: Logical (default FALSE). Tune base models
- threshold_choices: Same as models_dia()
- seed: Integer (default 123). Random seed
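A minimal sketch of how balanced subsets are typically built in an EasyEnsemble-style approach: every subset keeps all minority-class rows and adds an equally sized random draw from the majority class. The outcome vector is made up, and treating class 1 as the minority is an assumption for this example only.

```r
set.seed(123)

# Hypothetical imbalanced outcome: 10 positives, 90 negatives.
outcome      <- c(rep(1, 10), rep(0, 90))
n_estimators <- 10  # documented default

minority_idx <- which(outcome == 1)
majority_idx <- which(outcome == 0)

# Each subset: all minority rows plus a same-sized majority sample.
balanced_subsets <- lapply(seq_len(n_estimators), function(i) {
  c(minority_idx, sample(majority_idx, length(minority_idx)))
})
```

One base model would then be trained per subset and the predictions combined across subsets.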
Applies a trained model to new data. Parameters:
- trained_model_object (required): Trained model object from E2E functions
- new_data (required): New data for prediction (sample IDs in column 1)
- label_col_name: String (default NULL). Name of the true label column, if available
Evaluates model predictions. Parameters:
- prediction_df (required): Prediction data frame from apply_dia()
- threshold_choices: Same as models_dia()
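To make concrete what a threshold does during evaluation, the sketch below computes sensitivity, specificity, and accuracy at a fixed cutoff. The column names `label` and `score` and the cutoff of 0.5 are hypothetical; the actual structure of the prediction data frame and the threshold selection methods are not described in this section.

```r
# Hypothetical prediction data frame (column names are assumptions).
prediction_df <- data.frame(
  sample = paste0("S", 1:6),
  label  = c(1, 1, 1, 0, 0, 0),
  score  = c(0.9, 0.7, 0.4, 0.6, 0.2, 0.1)
)

threshold <- 0.5  # illustrative fixed cutoff
pred <- as.integer(prediction_df$score >= threshold)

# Confusion-matrix counts.
tp <- sum(pred == 1 & prediction_df$label == 1)
tn <- sum(pred == 0 & prediction_df$label == 0)
fp <- sum(pred == 1 & prediction_df$label == 0)
fn <- sum(pred == 0 & prediction_df$label == 1)

sensitivity <- tp / (tp + fn)
specificity <- tn / (tn + fp)
accuracy    <- (tp + tn) / nrow(prediction_df)
```

Moving the threshold trades sensitivity against specificity, which is why the selection method matters for reported metrics.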
Trains base survival models. Parameters:
- data (required): Data frame with sample ID, survival status, time, features
- model (required): Model names, or "all_pro" for all models
- tune: Logical (default FALSE). Hyperparameter tuning
- time_unit: String (default "day"). Time unit ("day", "month", "year")
- years_to_evaluate: Numeric vector (default c(1,3,5)). Time points for time-dependent AUROC
- seed: Integer (default 789). Random seed
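The relationship between `time_unit` and `years_to_evaluate` can be illustrated by converting the evaluation years into the unit of the time column. The toy data frame follows the column order listed above (sample ID, status, time, features); the conversion factors, in particular 365.25 days per year, are assumptions for this sketch and may differ from what E2E uses.

```r
set.seed(789)
n <- 10

# Hypothetical survival data: ID, status (1 = event), time in days, features.
toy_pro <- data.frame(
  sample = paste0("P", seq_len(n)),
  status = rbinom(n, 1, 0.6),
  time   = round(runif(n, 30, 2500)),
  feat_a = rnorm(n)
)

years_to_evaluate <- c(1, 3, 5)  # documented default
time_unit         <- "day"       # documented default

# Assumed conversion factors from years to each supported unit.
units_per_year <- c(day = 365.25, month = 12, year = 1)

# Evaluation time points expressed in the unit of the `time` column.
eval_times <- years_to_evaluate * units_per_year[[time_unit]]
```

If `time` were recorded in months, setting `time_unit <- "month"` would yield evaluation points of 12, 36, and 60.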
Stacking ensemble for survival analysis. Parameters:
- results_all_models (required): Output from models_pro()
- data (required): Training data
- meta_model_name (required): Meta-model name
- top: Integer (default 3). Number of top base models
- tune_meta: Logical (default FALSE). Tune the meta-model
- time_unit: String (default "day"). Time unit
- years_to_evaluate: Numeric vector (default c(1,3,5)). Evaluation time points
- seed: Integer (default 789). Random seed
Bootstrap aggregating for survival analysis. Parameters:
- data (required): Training data
- base_model_name (required): Base model name
- n_estimators: Integer (default 10). Number of base models
- subset_fraction: Numeric (default 0.632). Bootstrap sampling fraction
- tune_base_model: Logical (default FALSE). Tune base models
- time_unit: String (default "day"). Time unit
- years_to_evaluate: Numeric vector (default c(1,3,5)). Evaluation time points
- seed: Integer (default 456). Random seed
Applies a trained survival model to new data. Parameters:
- trained_model_object (required): Trained model object
- new_data (required): New data with the same structure as the training data
- time_unit: String (default "day"). Time unit
Evaluates survival model predictions. Parameters:
- prediction_df (required): Prediction data frame from apply_pro()
- years_to_evaluate: Numeric vector (default c(1,3,5)). Evaluation time points
Creates diagnostic model evaluation plots. Parameters:
- type (required): Plot type
- data (required): Model results object
Creates prognostic model evaluation plots. Parameters:
- type (required): Plot type
- data (required): Model results object
- time_unit: String (default "days"). Time unit for axis labels
Creates SHAP interpretation plots. Parameters:
- data (required): Model results with sample_score data frame
- raw_data (required): Original feature data
- target_type (required): Data type
Registers custom algorithms.
Usage:
1. Define a custom function following E2E conventions.
2. Register it with register_model_dia("model_name", custom_function).
3. Use the registered model in E2E workflows.