Build a modeling pipeline that creates tuning parameters, search spaces, workflows, and 10-fold cross-validation samples; finds the best model for each of lasso, random forest, and XGBoost; and fits the best model from each algorithm to the data.

build_pipeline(
  input_data,
  category,
  rec,
  prop_ratio = 0.8,
  metric_choice = "accuracy"
)

Arguments

input_data

The data used to train and test the models.

category

The target binary category.

rec

The recipe (preprocessing steps) that will be applied to the training and test data.

prop_ratio

The ratio used to split the data into training and test sets. The default value is 0.8.

metric_choice

The metric used for model evaluation: one of accuracy, balanced accuracy (bal_accuracy), F-score (f_means), or area under the ROC curve (roc_auc). The default value is accuracy.

Value

A list containing the best result for each of lasso, random forest, and XGBoost.
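
Examples

The sketch below shows one way the inputs described above might be prepared and passed in. It is illustrative only: the toy data set (two classes of the built-in iris data), the recipe steps, and the choice of passing category as a bare column name are assumptions, not documented behavior of this function. The call is guarded so the snippet still runs when build_pipeline is not loaded.

```r
library(recipes)

# Toy binary-classification data: keep two of the three iris species
# and drop the unused factor level so the outcome has exactly two classes.
input_data <- droplevels(subset(iris, Species != "setosa"))

# Preprocessing recipe: normalize all numeric predictors.
rec <- step_normalize(
  recipe(Species ~ ., data = input_data),
  all_numeric_predictors()
)

# Assumed call shape, following the usage section. Whether `category`
# should be a bare column name or a string is not stated on this page,
# so the form used here is a guess.
if (exists("build_pipeline")) {
  best_models <- build_pipeline(
    input_data    = input_data,
    category      = Species,
    rec           = rec,
    prop_ratio    = 0.8,
    metric_choice = "roc_auc"
  )
  # Inspect the top level of the returned list.
  str(best_models, max.level = 1)
}
```

If the call succeeds, the returned list should hold one entry per algorithm, as described under Value.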