Hands-On Gradient Boosting with XGBoost and scikit-learn
"In our final model, we had XGBoost as an ensemble model, which included 20 XGBoost models, 5 random forests, 6 randomized decision tree models, 3 regularized greedy forests, 3 logistic regression models, 5 ANN models, 3 elastic net models and 1 SVM model."
– Song, Kaggle Winner
(https://hunch243.rssing.com/chan-68612493/all_p1.html)
The winning models of Kaggle competitions are rarely individual models; they are almost always ensembles. By ensembles, I do not mean boosting or bagging models such as random forests or XGBoost, but pure ensembles that combine any distinct models, including XGBoost, random forests, and others.
In this section, we will combine machine learning models into non-correlated ensembles to gain accuracy and reduce overfitting.
The Wisconsin Breast Cancer dataset, used to predict whether a patient has breast cancer, has 569 rows and 30 columns, and can be viewed at https://scikit...
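The following is a minimal sketch of the idea, assuming XGBoost is installed alongside scikit-learn; the model choices, hyperparameters, and random seed are illustrative, not a fixed recipe. It scores three distinct model families on the Wisconsin Breast Cancer dataset, then combines them with scikit-learn's VotingClassifier:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from xgboost import XGBClassifier

# Load the Wisconsin Breast Cancer dataset (569 rows, 30 columns).
X, y = load_breast_cancer(return_X_y=True)

# Three distinct model families: boosted trees, bagged trees, and a
# linear model. Their errors tend to be weakly correlated, which is
# what makes the ensemble worthwhile.
xgb = XGBClassifier(n_estimators=100, random_state=2)
rf = RandomForestClassifier(n_estimators=100, random_state=2)
lr = LogisticRegression(max_iter=10000, random_state=2)

# Score each model on its own for a baseline.
for name, model in [('XGBoost', xgb), ('Random forest', rf),
                    ('Logistic regression', lr)]:
    scores = cross_val_score(model, X, y, cv=5)
    print(f'{name}: {scores.mean():.3f}')

# Combine the models with soft voting, which averages their
# predicted probabilities rather than their hard class labels.
ensemble = VotingClassifier(
    estimators=[('xgb', xgb), ('rf', rf), ('lr', lr)], voting='soft')
scores = cross_val_score(ensemble, X, y, cv=5)
print(f'Voting ensemble: {scores.mean():.3f}')
```

Because the three families tend to make mistakes on different rows, averaging their probabilities often edges out the best single model, which is the accuracy and overfitting benefit this section is after.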