-
Book Overview & Buying
-
Table Of Contents
-
Feedback & Rating

The Data Science Workshop
By :

A model is said to overfit the training data when it generates a hypothesis that accounts for every example. What this means is that it correctly predicts the outcome of every example. The problem with this scenario is that the model equation becomes extremely complex, and such models have been observed to be incapable of correctly predicting new observations.
Overfitting occurs when a model has been over-engineered. Two of the ways in which this could occur are:
We'll discuss each of these two points in the following sections.
When a model trains on too many features, the hypothesis becomes extremely complicated. Consider a case in which you have one column of features and you need to generate a hypothesis. This would be a simple linear equation, as shown here:
Figure 7.1: Equation for a hypothesis for a line...