
Extending Excel with Python and R

In this section, we create a simple ML model in Python. Python has become the go-to language for ML work (with R as the obvious alternative), and the number of packages implementing ML algorithms is hard to overstate. That said, sklearn remains the most widely used, so we will use it in this section. As in the R part of the chapter, we will use the xgboost model because it offers a good balance between performance and explainability.
We will use the data loaded in the previous section.
The first step in the modeling phase is to prepare the data. Fortunately, sklearn comes with built-in preprocessing functionality.
Let’s review the steps involved in data preprocessing:
sklearn provides methods for imputing missing values or removing rows...
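As a minimal sketch of the two options just mentioned, the snippet below fills missing values with sklearn's SimpleImputer and, alternatively, drops incomplete rows. The toy array X is a hypothetical stand-in for the data loaded in the previous section, and the choice of the column mean as the fill value is an assumption made only for illustration.

```python
import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical toy feature matrix (not the chapter's data);
# np.nan marks the missing entries.
X = np.array([
    [1.0, 10.0],
    [2.0, np.nan],
    [np.nan, 30.0],
    [4.0, 40.0],
])

# Option 1: replace each missing value with the mean of its column.
# The "mean" strategy is an illustrative choice, not a recommendation.
imputer = SimpleImputer(strategy="mean")
X_imputed = imputer.fit_transform(X)

# Option 2: remove every row that contains at least one missing value.
X_dropped = X[~np.isnan(X).any(axis=1)]

print(X_imputed)
print(X_dropped)
```

Imputation keeps all four rows at the cost of introducing estimated values, while dropping rows keeps only observed values but shrinks the dataset. Which trade-off is acceptable depends on how much data is missing.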