Mastering Machine Learning on AWS

By: Dr. Saket S.R. Mengle, Maximo Gurmendez

Overview of this book

Amazon Web Services (AWS) is constantly driving new innovations that empower data scientists to explore a variety of machine learning (ML) cloud services. This book is your comprehensive reference for learning and implementing advanced ML algorithms in the AWS cloud. As you go through the chapters, you’ll gain insights into how these algorithms can be trained, tuned, and deployed in AWS using Apache Spark on Elastic MapReduce (EMR), SageMaker, and TensorFlow. While you focus on algorithms such as XGBoost, linear models, factorization machines, and deep nets, the book will also provide you with an overview of AWS as well as detailed practical applications that will help you solve real-world problems. Every application includes a series of companion notebooks with all the necessary code to run on AWS. Over the course of the chapters, you will learn to use SageMaker and EMR Notebooks to perform a range of tasks, from smart analytics and predictive modeling through to sentiment analysis. By the end of this book, you will be equipped with the skills you need to handle machine learning projects effectively and to implement and evaluate algorithms on AWS.

Preface

Who this book is for

What this book covers

To get the most out of this book

Get in touch

Section 1: Machine Learning on AWS

Getting Started with Machine Learning for AWS

How AWS empowers data scientists

Identifying candidate problems that can be solved using ML

The ML project life cycle

Deploying models

Summary

Exercises

Section 2: Implementing Machine Learning Algorithms at Scale on AWS

Classifying Twitter Feeds with Naive Bayes

Classification algorithms

Naive Bayes classifier

Classifying text with language models

Naive Bayes – pros and cons

Summary

Exercises

Predicting House Value with Regression Algorithms

Predicting the price of houses

Understanding linear regression

Evaluating regression models

Implementing linear regression through scikit-learn

Implementing linear regression through Apache Spark

Implementing linear regression through SageMaker's Linear Learner

Understanding logistic regression

Pros and cons of linear models

Summary

Predicting User Behavior with Tree-Based Methods

Understanding decision trees

Understanding random forest algorithms

Understanding gradient-boosting algorithms

Predicting clicks on log streams

Summary

Exercises

Customer Segmentation Using Clustering Algorithms

Understanding how clustering algorithms work

Clustering with Apache Spark on EMR

Summary

Exercises

Analyzing Visitor Patterns to Make Recommendations

Making theme park attraction recommendations through Flickr data

Collaborative filtering

Finding recommendations through Apache Spark's ALS

Recommending attractions through SageMaker FMs

Summary

Exercises

Section 3: Deep Learning

Implementing Deep Learning Algorithms

Understanding deep learning

Applications of deep learning

Understanding deep learning algorithms

Understanding convolutional neural networks

Summary

Exercises

Implementing Deep Learning with TensorFlow on AWS

Introducing TensorFlow

TensorFlow as a general machine learning library

Training and serving the TensorFlow model through SageMaker

Creating a custom neural net with TensorFlow

Summary

Exercises

Image Classification and Detection with SageMaker

Introducing Amazon SageMaker for image classification

Training a deep learning model using Amazon SageMaker

Classifying images using Amazon SageMaker

Summary

Exercises

Section 4: Integrating Ready-Made AWS Machine Learning Services

Working with AWS Comprehend

Introducing Amazon Comprehend

Accessing Amazon Comprehend

Named-entity recognition using Comprehend

Sentiment analysis using Comprehend

Text classification using Comprehend

Summary

Exercises

Using AWS Rekognition

Introducing Amazon Rekognition

Implementing object and scene detection

Implementing facial analysis

Summary

Exercises

Building Conversational Interfaces Using AWS Lex

Introducing Amazon Lex

Building a custom chatbot using Amazon Lex

Summary

Exercises

Section 5: Optimizing and Deploying Models through AWS

Creating Clusters on AWS

Choosing your instance types

Distributed deep learning

Summary

Optimizing Models in Spark and SageMaker

The importance of model optimization

Automatic hyperparameter tuning

Hyperparameter tuning in Apache Spark

Hyperparameter tuning in SageMaker

Summary

Exercises

Tuning Clusters for Machine Learning

Introduction to the EMR architecture

Tuning EMR for different applications

Managing data pipelines with Glue

Summary

Deploying Models Built in AWS

SageMaker model deployment

Apache Spark model deployment

Summary

Exercises

Other Books You May Enjoy

Leave a review - let other readers know what you think

Appendix: Getting Started with AWS

Understanding random forest algorithms

Decision trees have two main disadvantages. First, the algorithm chooses the attribute for each split based on a cost function, and it does so greedily: at every split, it picks the locally optimal way to divide the dataset into two subsets. It does not explore whether making a suboptimal split now would lead to a better decision tree later on. Hence, the algorithm is not guaranteed to produce a globally optimal tree. Second, decision trees tend to overfit the training data. For example, a small sample of observations in the dataset may lead to a branch that assigns a very high probability to a certain class. This leads to the decision trees being really good at generating...
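The greedy split selection described above can be illustrated with a minimal sketch in plain Python. It assumes Gini impurity as the cost function (the text does not name one), and the helper names gini and best_split, as well as the toy data, are hypothetical and chosen only for illustration. On XOR-style data, no single split lowers the cost, so a purely greedy search sees no reason to split at all, even though a two-level tree would separate the classes perfectly.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the list is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(rows, labels):
    """Greedily pick the (feature, threshold) with the lowest weighted Gini.

    Only the immediate split is scored; there is no look-ahead to deeper levels.
    Returns (feature index, threshold, cost), or (None, None, parent cost)
    when no split improves on leaving the node unsplit.
    """
    n = len(rows)
    best = (None, None, gini(labels))
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            if not left or not right:
                continue  # a split must put observations on both sides
            cost = (len(left) * gini(left) + len(right) * gini(right)) / n
            if cost < best[2]:
                best = (feature, threshold, cost)
    return best


# Hypothetical toy data: a cleanly separable case (feature 0 splits it perfectly).
easy_rows = [(1, 5), (2, 3), (8, 4), (9, 6)]
easy_labels = ["no", "no", "yes", "yes"]
easy_split = best_split(easy_rows, easy_labels)

# Hypothetical XOR-style data: every single split leaves the impurity unchanged.
xor_rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
xor_labels = [0, 1, 1, 0]
xor_split = best_split(xor_rows, xor_labels)

print(easy_split)
print(xor_split)
```

Production libraries use more elaborate split searches and stopping rules; the sketch only shows why scoring one split at a time can miss a better tree.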
