Hands-On Deep Learning with Apache Spark

By: Guglielmo Iozzia

Overview of this book

Deep learning is a subset of machine learning in which multi-layered neural networks are used to process complex datasets. Hands-On Deep Learning with Apache Spark addresses both the technical and analytical complexity of deep learning and the speed at which deep learning solutions can be implemented on Apache Spark. The book starts with the fundamentals of Apache Spark and deep learning. You will set up Spark for deep learning, learn the principles of distributed modeling, and understand different types of neural networks. You will then implement deep learning models, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs), and long short-term memory (LSTM) networks, on Spark. As you progress through the book, you will gain hands-on experience of what it takes to understand the complex datasets you are dealing with. Along the way, you will use popular deep learning frameworks, such as TensorFlow, Deeplearning4j, and Keras, to train your distributed models. By the end of this book, you'll have gained experience implementing your models on a variety of use cases.

Apache Spark fundamentals

This section covers the Apache Spark fundamentals. It is important to become very familiar with the concepts that are presented here before moving on to the next chapters, where we'll be exploring the available APIs.

As mentioned in the introduction to this chapter, the Spark engine processes data in distributed memory across the nodes of a cluster. The following diagram shows the logical structure of how a typical Spark job processes information:

Figure 1.1

Spark executes a job in the following way:

Figure 1.2
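
To make this flow concrete, here is a minimal Scala sketch of a Spark job (a hypothetical word count, using Spark's standard SparkSession API). The transformations only record the lineage of the computation; it is the final action that triggers the actual distributed execution:

import org.apache.spark.sql.SparkSession

object WordCount {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark locally with one worker thread per core;
    // on a real cluster this would point to the cluster manager instead.
    val spark = SparkSession.builder()
      .appName("WordCount")
      .master("local[*]")
      .getOrCreate()

    // Distribute a small in-memory dataset across the available partitions.
    val lines = spark.sparkContext.parallelize(Seq(
      "spark processes data in distributed memory",
      "spark executes jobs across the nodes of a cluster"))

    // Transformations (flatMap, map, reduceByKey) are lazy: nothing is
    // computed yet, only the execution plan is built up.
    val counts = lines
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .reduceByKey(_ + _)

    // The action (collect) is what triggers the distributed job.
    counts.collect().foreach(println)

    spark.stop()
  }
}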

The Master controls how data is partitioned and takes advantage of data locality, while keeping track of all the distributed data computation on the Slave machines. If a certain Slave machine becomes unavailable, the data on that machine is reconstructed on other available machines. In standalone mode, the Master is a single point of failure. The Cluster mode using different managers section of this chapter covers the possible running modes and explains fault tolerance in Spark.
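
As a minimal sketch of how the running mode is selected: the master URL passed when the application is created determines which manager the job is submitted to. The hostname below is a placeholder for an actual standalone Master (7077 is the standalone Master's default port):

import org.apache.spark.sql.SparkSession

// Submitting to a standalone cluster Master; "spark-master-host" is a
// hypothetical hostname. Other master URLs select other modes, for
// example "local[*]" for local mode or "yarn" for a YARN cluster.
val spark = SparkSession.builder()
  .appName("StandaloneModeExample")
  .master("spark://spark-master-host:7077")
  .getOrCreate()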

Spark comes with five major components:

Figure 1.3

These components are as follows:

  • Spark Core: The base engine for large-scale distributed data processing.
  • Spark SQL: A module for structured data processing (see the sketch after this list).
  • Spark Streaming: This extends the core Spark API. It allows live data stream processing. Its strengths include scalability, high throughput, and fault tolerance.
  • MLlib: The Spark machine learning library.
  • GraphX: Graphs and graph-parallel computation algorithms.
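
The following minimal Spark SQL sketch (the column names and rows are hypothetical) runs the same query twice, once through the DataFrame API and once as plain SQL against a temporary view:

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("SparkSqlExample")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// A small in-memory DataFrame with two columns.
val people = Seq(("Alice", 34), ("Bob", 45), ("Carol", 29))
  .toDF("name", "age")

// The query expressed through the DataFrame API...
people.filter($"age" > 30).show()

// ...and the same query expressed in SQL.
people.createOrReplaceTempView("people")
spark.sql("SELECT name, age FROM people WHERE age > 30").show()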

Spark can access data that's stored in different systems, such as HDFS, Cassandra, MongoDB, and relational databases, as well as cloud storage services such as Amazon S3 and Azure Data Lake Storage.
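
As an illustration (all paths, hostnames, and table names below are placeholders), the storage system is typically selected by the URI scheme passed to the reader, provided the corresponding connector is on the classpath:

// Reusing the SparkSession from the previous sketches.
val fromHdfs = spark.read.textFile("hdfs://namenode:8020/data/input.txt")
val fromS3 = spark.read.csv("s3a://my-bucket/data/events.csv")
val fromJdbc = spark.read
  .format("jdbc")
  .option("url", "jdbc:postgresql://db-host:5432/mydb")
  .option("dbtable", "events")
  .load()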
