Machine Learning with Amazon SageMaker Cookbook

In this recipe, we will generate a synthetic dataset similar to what is shown in the following screenshot. Here, we will emulate a sample scenario where a school is planning to build a binary classifier model that automatically approves or rejects candidates for a scholarship based on certain attributes, such as their scores for the math, science, and technology exams:
Figure 7.2 – Synthetic dataset
This dataset will have four primary predictor columns called sex, math, science, and technology, along with two columns called random1 and random2 that contain random values. These columns will help us verify whether the feature importance report generated by SageMaker Clarify in the Enabling ML explainability with SageMaker Clarify recipe is working or not. In addition to these, the generated dataset will have two additional columns, called event_time and index, as...
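As a rough illustration of the dataset layout described above, the following is a minimal sketch that uses only the Python standard library. It is not the recipe's actual code. The function name generate_synthetic_rows, the label column name approved, the 0 to 100 score range, the string encoding of sex, the Unix-timestamp format of event_time, and the approval rule are all assumptions made for this example.

```python
import random
import time


def generate_synthetic_rows(n_rows=10, seed=42):
    """Return a list of dicts, one per synthetic scholarship candidate.

    Hypothetical helper: column names follow the text, everything else
    (value ranges, encodings, labeling rule) is assumed for illustration.
    """
    rng = random.Random(seed)
    now = time.time()  # assumption: event_time stored as a Unix timestamp
    rows = []
    for i in range(n_rows):
        math = rng.randint(0, 100)        # assumption: scores range from 0 to 100
        science = rng.randint(0, 100)
        technology = rng.randint(0, 100)
        # Assumed labeling rule: approve when the mean exam score is at least 60.
        approved = int((math + science + technology) / 3 >= 60)
        rows.append({
            "index": i,
            "approved": approved,                    # hypothetical label column
            "sex": rng.choice(["male", "female"]),   # assumed encoding
            "math": math,
            "science": science,
            "technology": technology,
            "random1": rng.random(),  # noise column, independent of the label
            "random2": rng.random(),  # noise column, independent of the label
            "event_time": now,
        })
    return rows


rows = generate_synthetic_rows(n_rows=5)
```

Because random1 and random2 are drawn independently of the label in this sketch, a feature importance report computed on such data would be expected to rank them well below the exam scores.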