Getting Started with Amazon SageMaker Studio

By: Michael Hsieh

Overview of this book

Amazon SageMaker Studio is the first integrated development environment (IDE) for machine learning (ML) and is designed to integrate ML workflows (data preparation, feature engineering, statistical bias detection, automated machine learning (AutoML), training, hosting, ML explainability, monitoring, and MLOps) in one environment. In this book, you'll start by exploring the features available in Amazon SageMaker Studio to analyze data, develop ML models, and productionize models to meet your goals. As you progress, you will learn how these features work together to address common challenges when building ML models in production. After that, you'll understand how to effectively scale and operationalize the ML life cycle using SageMaker Studio. By the end of this book, you'll have learned ML best practices for Amazon SageMaker Studio, and you'll be able to improve productivity in the ML development life cycle and build and deploy models easily for your ML use cases.
Table of Contents (16 chapters)

  • Part 1 – Introduction to Machine Learning on Amazon SageMaker Studio
  • Part 2 – End-to-End Machine Learning Life Cycle with SageMaker Studio
  • Part 3 – The Production and Operation of Machine Learning with SageMaker Studio

Applying transformation

SageMaker Data Wrangler provides numerous built-in transformations that you can apply out of the box without writing any code. From the analyses so far, we have identified the following issues that we need to handle next in order to build an ML dataset:

  • Missing data in some features.
  • The Churn? column is now in string format with True. and False. as values.
  • Redundant CustomerID_* columns after joins.
  • Features that do not provide predictive power, including but not limited to Phone, VMail Plan, and Int'l Plan.

We would also like to perform the following transformations for ML purposes, because we want to train an XGBoost model to predict the Churn? status afterward:

  • Encoding categorical variables, that is, State and Area Code features.
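For readers who think in code, the transformations listed above can be sketched in pandas. This is only an illustrative equivalent under stated assumptions, not what Data Wrangler runs internally and not the book's own steps: the toy rows, the `Day Mins` column, the `CustomerID_0`/`CustomerID_1` suffixes, the helper name `prepare_churn_dataset`, and the choice of median imputation for missing numeric values are all hypothetical.

```python
import pandas as pd


def prepare_churn_dataset(df: pd.DataFrame) -> pd.DataFrame:
    """Apply the transformations listed above to a joined churn table (sketch)."""
    out = df.copy()

    # Drop the redundant CustomerID_* columns left over from the joins.
    join_keys = [c for c in out.columns if c.startswith("CustomerID_")]
    out = out.drop(columns=join_keys)

    # Drop features that are not expected to provide predictive power.
    out = out.drop(columns=["Phone", "VMail Plan", "Int'l Plan"], errors="ignore")

    # Convert the string label ("True." / "False.") into 1 / 0.
    out["Churn?"] = out["Churn?"].map({"True.": 1, "False.": 0})

    # Assumption: fill missing numeric feature values with the column median.
    categorical = ["State", "Area Code"]
    numeric_cols = [
        c for c in out.select_dtypes(include="number").columns
        if c not in categorical + ["Churn?"]
    ]
    out[numeric_cols] = out[numeric_cols].fillna(out[numeric_cols].median())

    # One-hot encode the categorical features State and Area Code.
    out = pd.get_dummies(out, columns=categorical, dtype=int)
    return out


# Hypothetical toy rows standing in for the joined dataset.
raw = pd.DataFrame({
    "CustomerID_0": [1, 2, 3],
    "CustomerID_1": [1, 2, 3],
    "State": ["PA", "OH", "PA"],
    "Area Code": [415, 408, 415],
    "Phone": ["555-0100", "555-0101", "555-0102"],
    "VMail Plan": ["yes", "no", "no"],
    "Int'l Plan": ["no", "no", "yes"],
    "Day Mins": [120.5, None, 98.0],
    "Churn?": ["False.", "True.", "False."],
})

ml_ready = prepare_churn_dataset(raw)
print(ml_ready.columns.tolist())
```

Area Code is numeric in the raw data but is treated as a category here, which is why it is excluded from imputation and one-hot encoded alongside State.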

Let's get started:

  1. In the Data Flow tab, click on the plus sign next to the 2nd Join node, and select Add transform. You should...
