12. Scalable Frameworks | Scala for Machine Learning

Scala for Machine Learning

By: R. Nicolas

Overview of this book

Are you curious about AI? All you need is a good understanding of the Scala programming language, a basic knowledge of statistics, a keen interest in Big Data processing, and this book!

Apache Spark

Apache Spark is a fast and general-purpose cluster computing system, initially developed at AMPLab/UC Berkeley as part of the Berkeley Data Analytics Stack (BDAS) (http://en.wikipedia.org/wiki/UC_Berkeley). It provides high-level APIs for the following programming languages, which make large, concurrent parallel jobs easy to write and deploy [12:11]:

Scala: http://spark.apache.org/docs/latest/api/scala/index.html
Java: http://spark.apache.org/docs/latest/api/java/index.html
Python: http://spark.apache.org/docs/latest/api/python/index.html

Note

The link to the latest information

The URLs, like any reference to Apache Spark, may change in future versions.

The core element of Spark is a resilient distributed dataset (RDD), which is a collection of elements partitioned across the nodes of a cluster and/or CPU cores of servers. An RDD can be created from a local data structure such as a list, array, or hash table, from the local filesystem or the Hadoop distributed file system (HDFS...
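As a rough illustration of creating an RDD from a local data structure, the following minimal sketch distributes a Scala sequence across partitions and runs a small computation on it. It assumes a Spark dependency on the classpath and a local master (`local[2]`); the object name `RddSketch`, the helper `sumOfSquares`, and the sample values are hypothetical and do not come from the book's source code.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

// Hypothetical example object; not part of the book's library
object RddSketch {

  // Distributes a local collection over numSlices partitions,
  // squares each element in parallel, and sums the results.
  // Assumes data is non-empty (reduce fails on an empty RDD).
  def sumOfSquares(sc: SparkContext, data: Seq[Double], numSlices: Int): Double = {
    val rdd: RDD[Double] = sc.parallelize(data, numSlices)
    rdd.map(x => x * x).reduce(_ + _)
  }

  def main(args: Array[String]): Unit = {
    // Assumption: run locally with two worker threads instead of a cluster
    val conf = new SparkConf().setAppName("RddSketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val total = sumOfSquares(sc, Seq(1.0, 2.0, 3.0, 4.0), 2)
      println(s"Sum of squares: $total") // prints "Sum of squares: 30.0"
    } finally {
      sc.stop() // release the context even if the job fails
    }
  }
}
```

An RDD backed by a file would typically be obtained with `sc.textFile(path)` instead of `sc.parallelize`, where `path` points to the local filesystem or to HDFS.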
