Sign In Start Free Trial
Account

Add to playlist

Create a Playlist

Modal Close icon
You need to login to use this feature.
  • Book Overview & Buying Automated Machine Learning
  • Table Of Contents Toc
  • Feedback & Rating feedback
Automated Machine Learning

Automated Machine Learning

By : Adnan Masood
4.5 (15)
close
close
Automated Machine Learning

Automated Machine Learning

4.5 (15)
By: Adnan Masood

Overview of this book

Every machine learning engineer deals with systems that have hyperparameters, and the most basic task in automated machine learning (AutoML) is to automatically set these hyperparameters to optimize performance. The latest deep neural networks have a wide range of hyperparameters for their architecture, regularization, and optimization, which can be customized effectively to save time and effort. This book reviews the underlying techniques of automated feature engineering, model and hyperparameter tuning, gradient-based approaches, and much more. You'll discover different ways of implementing these techniques in open source tools and then learn to use enterprise tools for implementing AutoML in three major cloud service providers: Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform. As you progress, you’ll explore the features of cloud AutoML platforms by building machine learning models using AutoML. The book will also show you how to develop accurate models by automating time-consuming and repetitive tasks in the machine learning development lifecycle. By the end of this machine learning book, you’ll be able to build and deploy AutoML models that are not only accurate, but also increase productivity, allow interoperability, and minimize feature engineering tasks.
Table of Contents (15 chapters)
close
close
1
Section 1: Introduction to Automated Machine Learning
5
Section 2: AutoML with Cloud Platforms
12
Section 3: Applied Automated Machine Learning

How automated ML works

ML techniques work great when it comes to finding patterns in large datasets. Today, we use these techniques for anomaly detection, customer segmentation, customer churn analysis, demand forecasting, predictive maintenance, and pricing optimization, among hundreds of other use cases.

A typical ML life cycle is comprised of data collection, data wrangling, pipeline management, model retraining, and model deployment, during which data wrangling is typically the most time-consuming task.

Extracting meaningful features out of data, and then using them to build a model while finding the right algorithm and tuning the parameters, is also a very time-consuming process. Can we automate this process using the very thing we are trying to build here (meta enough?); that is, should we automate ML? Well, that is how this all started – with someone attempting to print a 3D printer using a 3D printer.

A typical data science workflow starts with a business problem (hopefully!), and it is used to either prove a hypothesis or to discover new patterns in the existing data. It requires data; the need to clean and preprocess the data, which takes an awfully large amount of time – almost as much as 80% of your total time; and "data munging" or wrangling, which includes cleaning, de-duplication, outlier analysis and removal, transforming, mapping, structuring, and enriching. Essentially, we're taming this unwieldy vivacious raw real-world data and putting it in a tame desired format for analysis and modeling so that we can gain meaningful insights from it.

Next, we must select and engineer features, which means figuring out what features are useful, and then brainstorming and working with SMEs on the importance and validity of these features. Validating how these features would work with your model, the fitness from both a technical and business perspective, and improving these features as needed is also a critical part of the feature engineering process. The feedback loop to the SME is often very important, albeit being the least emphasized part of the feature engineering pipeline. The transparency of models stems from clear features – if features such as race or gender give higher accuracy regarding your loan repayment propensity model, this does not mean it's a good idea use them. In fact, an SME would tell you – if your conscious mind hasn't – that it's a terrible idea and that you should look for more meaningful and less sexist, racist, and xenophobic features. We will discuss this further in Chapter 10, AutoML in the Enterprise, when we discuss operationalization.

Even though the task of "selecting a model family" sounds like a reality show, that is what data scientists and ML engineers do as part of their day-to-day job. Model selection is the task of picking the right model that best describes the data at hand. This involves selecting a ML model from a set of candidate models. Automated ML can give you a helping hand with this.

Hyperparameters

You will hear about hyperparameters a lot, so let's make sure you understand what they are.

Each model has its own internal and external parameters. Internal parameters (also known as model parameters, or just parameters) are the ones intrinsic to the model, such as its weight and predictor matrix, while external parameters or hyperparameters are "outside" the model itself, such as the learning rate and its number of iterations. An intuitive example can be derived from k-means, a well-understood unsupervised clustering algorithm known for its simplicity.

The k in k-means stands for the number of clusters required, and epochs (pronounced epics, as in Doctor Who is an epic show!) are used to specify the number of passes that are done over the training data. Both of these are examples of hyperparameters – that is, the parameters that are not intrinsic to the model itself. Similarly, the learning rate for training a neural network, C and sigma for support vector machines, the k number of leaves or depth of a tree, the latent factors in a matrix factorization, and the number of hidden layers in a deep neural network are all examples of hyperparameters.

Selecting the right hyperparameters has been called tuning your instrument, which is where the magic happens. In ML tribal folklore, these elusive numbers have been brandished as "nuisance parameters", to the point where proverbial statements such as "tuning is more of an art than a science" and "tuning models is like black magic" tend to discourage newcomers in the industry. Automated ML is here to change this perception by helping you choose the right hyperparameters; more on this later. Automated ML enables citizen data scientists to build, train, and deploy ML models, thus possibly disrupting the status quo.

Important note

Some consider the term "citizen data scientists" as a euphuism for non-experts, but SME and people who are curious about analytics are some of the most important people – and don't let anyone tell you otherwise.

In conclusion, from building the correct ensembles of models to preprocessing the data, selecting the right features and model family, choosing and optimizing model hyperparameters, and evaluating the results, automated ML offers algorithmic solutions that can programmatically address these challenges.

The need for automated ML

At the time of writing, Open AI's GPT-3 model has recently been announced, and it has an incredible 175 billion parameters. Due to this ever-increasing model complexity, which includes big data and an exponentially increasing number of features, we now have a necessity to not only be able to tune these parameters, but also have sophisticated, repeatable procedures in place to tweak these proverbial knobs so that they can be adjusted. This complexity makes it less accessible for citizen data scientists, business subject matter experts, and domain experts – which might sound like job security, but it is not good for business, nor for the long-term success of the field.

Also, this isn't just about the hyperparameters, but the entire pipeline and the reproducibility of the results becoming harder as the model's complexity grows, which curtails AI democratization.

Create a Note

Modal Close icon
You need to login to use this feature.
notes
bookmark search playlist font-size

Change the font size

margin-width

Change margin width

day-mode

Change background colour

Close icon Search
Country selected

Close icon Your notes and bookmarks

Delete Bookmark

Modal Close icon
Are you sure you want to delete it?
Cancel
Yes, Delete

Delete Note

Modal Close icon
Are you sure you want to delete it?
Cancel
Yes, Delete

Edit Note

Modal Close icon
Write a note (max 255 characters)
Cancel
Update Note

Confirmation

Modal Close icon
claim successful

Buy this book with your credits?

Modal Close icon
Are you sure you want to buy this book with one of your credits?
Close
YES, BUY