Hands-On Gradient Boosting with XGBoost and scikit-learn

To get a better sense of how random forests work, let's build one using scikit-learn.
We will use a random forest classifier to predict whether a user makes more or less than USD 50,000 using the census dataset we cleaned and scored in Chapter 1, Machine Learning Landscape, and revisited in Chapter 2, Decision Trees in Depth. We are going to use cross_val_score to ensure that our test results generalize well.
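Before working with the census data, it may help to see what cross_val_score returns for a random forest. The following is only a minimal sketch: it uses synthetic data from scikit-learn's make_classification as a stand-in for the census dataset, and the names X_demo, y_demo, and rf_demo, along with the parameter values, are illustrative assumptions rather than the settings used in this chapter.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic stand-in data (an assumption; this is NOT the census dataset)
X_demo, y_demo = make_classification(n_samples=500, n_features=10,
                                     random_state=2)

# A small forest; n_estimators and random_state are illustrative choices
rf_demo = RandomForestClassifier(n_estimators=10, random_state=2)

# Fit and score the model on 5 different train/test folds.
# For classifiers, the default score is accuracy.
scores = cross_val_score(rf_demo, X_demo, y_demo, cv=5)

print('Accuracy per fold:', scores.round(3))
print('Mean accuracy: %0.3f' % scores.mean())
```

Because each fold is scored on data the model did not see during fitting, the mean of the five scores is usually a steadier estimate of performance than a single train/test split.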
The following steps build and score a random forest classifier using the census dataset:
Import pandas, numpy, RandomForestClassifier, and cross_val_score before silencing warnings:
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
import warnings
warnings.filterwarnings('ignore')
Load the dataset census_cleaned.csv and split it into X (the predictor columns) and y (the target column):
df_census = pd.read_csv...