R/classify_text.r
build_pipeline.Rd
Build a modeling pipeline for lasso, random forest, and XGBoost. The pipeline creates the tuning parameters, search spaces, workflows, and 10-fold cross-validation samples, finds the best model for each algorithm, and fits each of those best models to the data.
build_pipeline(
input_data,
category,
rec,
prop_ratio = 0.8,
metric_choice = "accuracy"
)
input_data: The data to be split into training and test sets.
category: The target binary category.
rec: The recipe (preprocessing steps) that will be applied to the training and test data.
prop_ratio: The proportion of the data assigned to the training set when splitting. The default value is 0.8.
metric_choice: The metric used for model evaluation: accuracy, balanced accuracy (bal_accuracy), F-score (f_means), or area under the ROC curve (roc_auc). The default value is "accuracy".
A list containing the best result for each of the three algorithms: lasso, random forest, and XGBoost.
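To illustrate the split ratio and the 10-fold cross-validation samples described above, here is a minimal sketch. It assumes the split and the folds are created with the rsample package, which this page does not confirm. The data frame `toy_data` and every object name below are hypothetical.

```r
library(rsample)

set.seed(1234)

# Hypothetical toy data: 200 rows, one numeric feature, a binary outcome.
toy_data <- data.frame(
  x = rnorm(200),
  category = factor(rep(c("yes", "no"), each = 100))
)

# Assumption: prop_ratio = 0.8 corresponds to an 80/20 train/test split.
data_split <- initial_split(toy_data, prop = 0.8)
train_data <- training(data_split)
test_data <- testing(data_split)

# 10-fold cross-validation samples drawn from the training rows only.
folds <- vfold_cv(train_data, v = 10)

# Every row ends up in exactly one of the two sets.
nrow(train_data) + nrow(test_data) == nrow(toy_data)
```

In this sketch the test rows are held out before the folds are created, so they play no part in tuning. Whether build_pipeline orders these steps the same way is an assumption.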