Machine Learning with Amazon SageMaker Cookbook

By: Joshua Arvin Lat

Overview of this book

Amazon SageMaker is a fully managed machine learning (ML) service that helps data scientists and ML practitioners manage ML experiments. In this book, you'll use the different capabilities and features of Amazon SageMaker to solve relevant data science and ML problems. This step-by-step guide features 80 proven recipes designed to give you the hands-on machine learning experience needed to contribute to real-world experiments and projects. You'll cover the algorithms and techniques that are commonly used when training and deploying NLP, time series forecasting, and computer vision models to solve ML problems. You'll explore various solutions for working with deep learning libraries and frameworks such as TensorFlow, PyTorch, and Hugging Face Transformers in Amazon SageMaker. You'll also learn how to use SageMaker Clarify, SageMaker Model Monitor, SageMaker Debugger, and SageMaker Experiments to debug, manage, and monitor multiple ML experiments and deployments. Moreover, you'll have a better understanding of how SageMaker Feature Store, Autopilot, and Pipelines can meet the specific needs of data science teams. By the end of this book, you'll be able to combine the different solutions you've learned as building blocks to solve real-world ML problems.
Table of Contents (11 chapters)

Converting CSV data into protobuf recordIO format

In this recipe, we will convert and serialize the synthetic data stored in CSV format into the protobuf recordIO format. Once the data is serialized into this format, we can take advantage of Pipe mode, where training jobs start sooner because they stream data directly from the S3 bucket source instead of downloading it first. In addition, the built-in SageMaker algorithms may perform better with this training file format.
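As a rough illustration of the conversion described above, the following sketch parses CSV text and serializes it with `write_numpy_to_dense_tensor` from `sagemaker.amazon.common`, assuming version 2 of the SageMaker Python SDK and NumPy are installed. The helper names, the sample data, and the assumption that the label sits in the first column of a headerless CSV are hypothetical and are not taken from the recipe itself.

```python
import csv
import io


def split_labels_and_features(csv_text, label_column=0):
    """Parse headerless numeric CSV text into (labels, features) lists.

    Assumption: the label is in `label_column` (first column by default).
    """
    labels, features = [], []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        values = [float(v) for v in row]
        labels.append(values[label_column])
        features.append(values[:label_column] + values[label_column + 1:])
    return labels, features


def to_protobuf_recordio(labels, features):
    """Serialize labels and features into an in-memory recordIO buffer."""
    # Imported lazily so the CSV helper above works without these packages.
    import numpy as np
    from sagemaker.amazon.common import write_numpy_to_dense_tensor

    buffer = io.BytesIO()
    write_numpy_to_dense_tensor(
        buffer,
        np.array(features, dtype="float32"),  # 2-D matrix of features
        np.array(labels, dtype="float32"),    # 1-D vector of labels
    )
    buffer.seek(0)  # rewind so the buffer can be read or uploaded
    return buffer


# Hypothetical two-row sample: label, feature_1, feature_2
sample_csv = "1,0.5,0.25\n0,0.75,1.0\n"
labels, features = split_labels_and_features(sample_csv)
```

Calling `to_protobuf_recordio(labels, features)` returns a buffer that could then be written to a local file or uploaded to an S3 bucket for use as a training channel.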

Getting ready

This recipe continues from Generating a synthetic dataset for analysis and transformation.

How to do it…

In the first few steps of this recipe, we will focus on scaling the values of the synthetic labeled dataset to the range 0 to 1 using MinMaxScaler from sklearn:

  1. Navigate to the my-experiments/chapter04 directory inside your SageMaker notebook instance. Feel free to create this directory if it does not exist yet.
  2. Create a new notebook using the conda_python3...
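To make the scaling step concrete, here is a minimal pure-Python sketch of the per-column arithmetic that `MinMaxScaler` applies with its default range of 0 to 1, namely `(x - min) / (max - min)`. It is a stand-in for illustration only, not the recipe's code; the function name and sample values are hypothetical.

```python
def min_max_scale(column):
    """Rescale a list of numbers to the range [0, 1].

    Mirrors the default MinMaxScaler formula for a single column:
    (x - min) / (max - min). A constant column is mapped to 0.0.
    """
    lo, hi = min(column), max(column)
    span = hi - lo
    if span == 0:
        return [0.0 for _ in column]  # avoid division by zero
    return [(value - lo) / span for value in column]


# Hypothetical sample column
scaled = min_max_scale([10.0, 20.0, 40.0])
```

In the notebook itself, the same result would typically come from calling `fit_transform` on a `MinMaxScaler` instance, which handles every column of the dataset at once.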
