Machine Learning with Amazon SageMaker Cookbook

By: Joshua Arvin Lat

Overview of this book

Amazon SageMaker is a fully managed machine learning (ML) service that helps data scientists and ML practitioners manage ML experiments. In this book, you'll use the different capabilities and features of Amazon SageMaker to solve relevant data science and ML problems. This step-by-step guide features 80 proven recipes designed to give you the hands-on machine learning experience needed to contribute to real-world experiments and projects. You'll cover the algorithms and techniques that are commonly used when training and deploying NLP, time series forecasting, and computer vision models to solve ML problems. You'll explore various solutions for working with deep learning libraries and frameworks such as TensorFlow, PyTorch, and Hugging Face Transformers in Amazon SageMaker. You'll also learn how to use SageMaker Clarify, SageMaker Model Monitor, SageMaker Debugger, and SageMaker Experiments to debug, manage, and monitor multiple ML experiments and deployments. Moreover, you'll have a better understanding of how SageMaker Feature Store, Autopilot, and Pipelines can meet the specific needs of data science teams. By the end of this book, you'll be able to combine the different solutions you've learned as building blocks to solve real-world ML problems.
Table of Contents (11 chapters)

Generating a synthetic dataset for anomaly detection experiments

In this recipe, we will generate a synthetic dataset that contains outliers, or anomalies. This will enable us to perform anomaly detection experiments using algorithms such as Random Cut Forest (RCF). Anomaly detection is the identification of outliers, that is, records that differ significantly from the rest of the records in the dataset. RCF is an unsupervised algorithm used for detecting these anomalies in a dataset.

After generating the synthetic dataset in this recipe, we will use it to train and deploy an RCF model. We will then invoke this model from within an Amazon Athena query in the "Invoking machine learning models with Amazon Athena using SQL queries" recipe. This will enable us to tag anomalies in our dataset during the data preparation and analysis phase.
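
To make the idea of a synthetic dataset with anomalies concrete, here is a minimal sketch in Python. It is not the recipe's actual code: the function name `generate_synthetic_dataset`, the distributions, and all parameter values are assumptions chosen for illustration. Most records are drawn from a narrow normal distribution, and a handful of outliers are drawn from a range far away from that cluster.

```python
import random


def generate_synthetic_dataset(n_normal=500, n_anomalies=10, seed=42):
    """Return a shuffled list of (value, is_anomaly) records.

    Hypothetical helper for illustration; the parameters and
    distributions are assumptions, not taken from the recipe.
    """
    rng = random.Random(seed)

    # Normal records: values clustered around 50 with a small spread.
    records = [(rng.gauss(50.0, 5.0), 0) for _ in range(n_normal)]

    # Anomalies: values drawn from a range far from the normal cluster.
    records += [(rng.uniform(150.0, 200.0), 1) for _ in range(n_anomalies)]

    # Shuffle so the anomalies are not grouped at the end of the dataset.
    rng.shuffle(records)
    return records


dataset = generate_synthetic_dataset()
anomaly_count = sum(label for _, label in dataset)
print(f"{len(dataset)} records, {anomaly_count} labeled as anomalies")
```

The `is_anomaly` label is kept here only so that results can be checked later. An unsupervised algorithm such as RCF would be trained on the values alone.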

Tip

Since we will show the steps on how to...
