Hands-On Gradient Boosting with XGBoost and scikit-learn

"Almost always I can find open source code for what I want to do, and my time is much better spent doing research and feature engineering."
– Owen Zhang, Kaggle Winner
Many Kagglers and data scientists have confessed to spending considerable time on research and feature engineering. In this section, we will use pandas to engineer new columns of data.
Machine learning models are only as good as the data they train on. When data is insufficient, building a robust machine learning model is impossible.
A more revealing question is whether the data can be improved. When new data is extracted from other columns, these new columns of data are said to be engineered.
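As a minimal sketch of what extracting new columns from existing ones can look like in pandas, consider the small DataFrame below. The data, the column names (pickup, distance_km, duration_min), and the choice of engineered columns are all hypothetical and chosen only for illustration; they do not come from the book's dataset.

```python
import pandas as pd

# Hypothetical data: column names and values are illustrative assumptions.
df = pd.DataFrame({
    "pickup": ["2021-03-05 08:15:00", "2021-03-06 23:40:00", "2021-03-08 17:05:00"],
    "distance_km": [3.2, 10.5, 6.0],
    "duration_min": [12.0, 25.0, 30.0],
})

# Convert the text column to datetime so that its parts can be extracted.
df["pickup"] = pd.to_datetime(df["pickup"])

# Engineered columns derived from a single timestamp column.
df["hour"] = df["pickup"].dt.hour
df["day_of_week"] = df["pickup"].dt.dayofweek  # Monday=0, Sunday=6
df["is_weekend"] = df["day_of_week"] >= 5

# Engineered column that combines two numeric columns.
df["speed_kmh"] = df["distance_km"] / (df["duration_min"] / 60)

print(df)
```

Each new column holds information that was already present in the original columns but in a form that a model could not use directly, such as the hour of day hidden inside a timestamp string.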
Feature engineering is the process of developing new columns of data from the original columns. The question is not whether you should...