Chapter 4: Preparing, Processing, and Analyzing the Data | Machine Learning with Amazon SageMaker Cookbook

Machine Learning with Amazon SageMaker Cookbook

By: Joshua Arvin Lat

Overview of this book

Amazon SageMaker is a fully managed machine learning (ML) service that helps data scientists and ML practitioners manage ML experiments. In this book, you'll use the different capabilities and features of Amazon SageMaker to solve relevant data science and ML problems. This step-by-step guide features 80 proven recipes designed to give you the hands-on machine learning experience needed to contribute to real-world experiments and projects. You'll cover the algorithms and techniques that are commonly used when training and deploying NLP, time series forecasting, and computer vision models to solve ML problems. You'll explore various solutions for working with deep learning libraries and frameworks such as TensorFlow, PyTorch, and Hugging Face Transformers in Amazon SageMaker. You'll also learn how to use SageMaker Clarify, SageMaker Model Monitor, SageMaker Debugger, and SageMaker Experiments to debug, manage, and monitor multiple ML experiments and deployments. Moreover, you'll have a better understanding of how SageMaker Feature Store, Autopilot, and Pipelines can meet the specific needs of data science teams. By the end of this book, you'll be able to combine the different solutions you've learned as building blocks to solve real-world ML problems.

Preface

Who this book is for

What this book covers

To get the most out of this book

Download the example code files

Code in Action

Download the color images

Conventions used

Sections

Get in touch

Share Your Thoughts

Chapter 1: Getting Started with Machine Learning Using Amazon SageMaker

Technical requirements

Launching an Amazon SageMaker Notebook Instance

Checking the versions of the SageMaker Python SDK and the AWS CLI

Preparing the Amazon S3 bucket and the training dataset for the linear regression experiment

Visualizing and understanding your data in Python

Training your first model in Python

Loading a linear learner model with Apache MXNet in Python

Evaluating the model in Python

Deploying your first model in Python

Invoking an Amazon SageMaker model endpoint with the SageMakerRuntime client from boto3

Chapter 2: Building and Using Your Own Algorithm Container Image

Technical requirements

Launching and preparing the Cloud9 environment

Setting up the Python and R experimentation environments

Preparing and testing the train script in Python

Preparing and testing the serve script in Python

Building and testing the custom Python algorithm container image

Pushing the custom Python algorithm container image to an Amazon ECR repository

Using the custom Python algorithm container image for training and inference with Amazon SageMaker Local Mode

Preparing and testing the train script in R

Preparing and testing the serve script in R

Building and testing the custom R algorithm container image

Pushing the custom R algorithm container image to an Amazon ECR repository

Using the custom R algorithm container image for training and inference with Amazon SageMaker Local Mode

Chapter 3: Using Machine Learning and Deep Learning Frameworks with Amazon SageMaker

Technical requirements

Preparing the SageMaker notebook instance for multiple deep learning local experiments

Generating a synthetic dataset for deep learning experiments

Preparing the entrypoint TensorFlow and Keras training script

Training and deploying a TensorFlow and Keras model with the SageMaker Python SDK

Preparing the entrypoint PyTorch training script

Preparing the entrypoint PyTorch inference script

Training and deploying a PyTorch model with the SageMaker Python SDK

Preparing the entrypoint scikit-learn training script

Training and deploying a scikit-learn model with the SageMaker Python SDK

Debugging disk space issues when using local mode

Debugging container execution issues when using local mode

Chapter 4: Preparing, Processing, and Analyzing the Data

Technical requirements

Generating a synthetic dataset for anomaly detection experiments

Training and deploying an RCF model

Invoking machine learning models with Amazon Athena using SQL queries

Analyzing data with Amazon Athena in Python

Generating a synthetic dataset for analysis and transformation

Performing dimensionality reduction with the built-in PCA algorithm

Performing cluster analysis with the built-in KMeans algorithm

Converting CSV data into protobuf recordIO format

Training a KNN model using the protobuf recordIO training input type

Preparing the SageMaker Processing prerequisites using the AWS CLI

Managed data processing with SageMaker Processing in Python

Managed data processing with SageMaker Processing in R

Chapter 5: Effectively Managing Machine Learning Experiments

Technical requirements

Synthetic data generation for classification problems

Identifying issues with SageMaker Debugger

Inspecting SageMaker Debugger logs and results

Running and managing multiple experiments with SageMaker Experiments

Experiment analytics with SageMaker Experiments

Inspecting experiments, trials, and trial components with SageMaker Experiments

Chapter 6: Automated Machine Learning in Amazon SageMaker

Technical requirements

Onboarding to SageMaker Studio

Generating a synthetic dataset with additional columns containing random values

Creating and monitoring a SageMaker Autopilot experiment in SageMaker Studio (console)

Creating and monitoring a SageMaker Autopilot experiment using the SageMaker Python SDK

Inspecting the SageMaker Autopilot experiment's results and artifacts

Performing Automatic Model Tuning with the SageMaker XGBoost built-in algorithm

Analyzing the Automatic Model Tuning job results

Chapter 7: Working with SageMaker Feature Store, SageMaker Clarify, and SageMaker Model Monitor

Technical requirements

Generating a synthetic dataset and using SageMaker Feature Store for storage and management

Querying data from the offline store of SageMaker Feature Store and uploading it to Amazon S3

Detecting pre-training bias with SageMaker Clarify

Detecting post-training bias with SageMaker Clarify

Enabling ML explainability with SageMaker Clarify

Deploying an endpoint from a model and enabling data capture with SageMaker Model Monitor

Baselining and scheduled monitoring with SageMaker Model Monitor

Chapter 8: Solving NLP, Image Classification, and Time-Series Forecasting Problems with Built-in Algorithms

Technical requirements

Generating a synthetic dataset for text classification problems

Preparing the test dataset for batch transform inference jobs

Training and deploying a BlazingText model

Using Batch Transform for inference

Preparing the datasets for image classification using the Apache MXNet Vision Datasets classes

Training and deploying an image classifier using the built-in Image Classification Algorithm in SageMaker

Generating a synthetic time series dataset

Performing the train-test split on a time series dataset

Training and deploying a DeepAR model

Performing probabilistic forecasting with a deployed DeepAR model

Chapter 9: Managing Machine Learning Workflows and Deployments

Technical requirements

Working with Hugging Face models

Preparing the prerequisites of a multi-model endpoint deployment

Hosting multiple models with multi-model endpoints

Setting up A/B testing on multiple models with production variants

Preparing the Step Functions execution role

Managing ML workflows with AWS Step Functions and the Data Science SDK

Managing ML workflows with SageMaker Pipelines

Why subscribe?

Other Books You May Enjoy

Packt is searching for authors like you

Share Your Thoughts

Converting CSV data into protobuf recordIO format

Getting ready

How to do it…
