Serverless Design Patterns and Best Practices

By: Zambrano

Overview of this book

Serverless applications handle many problems that developers face when running systems and servers. The serverless pay-per-invocation model can also result in drastic cost savings, contributing to its popularity. While it's simple to create a basic serverless application, it's critical to structure your software correctly to ensure it continues to succeed as it grows. Serverless Design Patterns and Best Practices presents patterns that can be adapted to run in a serverless environment. You will learn how to develop applications that are scalable, fault tolerant, and well-tested. The book begins with an introduction to the different design pattern categories available for serverless applications. You will learn the trade-offs between GraphQL and REST and how they fare regarding overall application design in a serverless ecosystem. The book will also show you how to migrate an existing API to a serverless backend using AWS API Gateway. You will learn how to build event-driven applications using queuing and streaming systems, such as AWS Simple Queue Service (SQS) and AWS Kinesis. Patterns for data-intensive serverless applications are also explained, including the lambda architecture and MapReduce. This book will equip you with the knowledge and skills you need to develop scalable and resilient serverless applications confidently.
Table of Contents (12 chapters)

Processing Enron emails with serverless MapReduce


I've based our example application on the Enron email corpus, which is publicly available on Kaggle. This data is made up of some 500,000 emails from the Enron corporation. In total, this dataset is approximately 1.5 GB. We will count the emails sent for each From-To pair. That is, for each person who sent an email, we will generate a count of the number of times they sent to each particular recipient.
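To make the From-To count concrete, here is a minimal local sketch of the map and reduce steps in Python. It is not the serverless implementation itself. It assumes each email is available as raw text with standard `From:` and `To:` headers and that multiple recipients are comma-separated; the function names `map_message` and `reduce_pairs` are hypothetical.

```python
from collections import Counter
from email.parser import HeaderParser


def map_message(raw_message):
    """Map step: emit one (sender, recipient) pair per recipient of a message.

    Assumes raw_message is plain text with From: and To: headers.
    """
    headers = HeaderParser().parsestr(raw_message)
    sender = (headers.get("From") or "").strip()
    recipients = [r.strip() for r in (headers.get("To") or "").split(",")]
    return [(sender, r) for r in recipients if sender and r]


def reduce_pairs(mapped_batches):
    """Reduce step: sum the pair counts across all mapper outputs."""
    totals = Counter()
    for batch in mapped_batches:
        totals.update(batch)
    return totals


if __name__ == "__main__":
    # Made-up messages, for illustration only.
    messages = [
        "From: a@example.com\nTo: b@example.com, c@example.com\n\nHello",
        "From: a@example.com\nTo: b@example.com\n\nHello again",
    ]
    counts = reduce_pairs(map_message(m) for m in messages)
    for (sender, recipient), n in sorted(counts.items()):
        print(sender, "->", recipient, n)
```

In a MapReduce setting, each mapper would run `map_message` over its own chunk of the data, and a reducer would merge the per-chunk results in the way `reduce_pairs` does here.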

Note

Anyone may download and work with this dataset: https://www.kaggle.com/wcukierski/enron-email-dataset. The original data from Kaggle comes as a single file in CSV format. To make this data work with this example MapReduce program, I broke the single ~1.4 GB file into roughly 100 MB chunks. Throughout this example, it's important to remember that we are starting from 14 separate files on S3.
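One way to produce chunks like these is sketched below in Python. This is not necessarily how the original files were split. It assumes the CSV has a header row, which is repeated in every chunk, and that it is UTF-8 encoded; the function name `split_csv` and the `chunk-NNN.csv` naming scheme are hypothetical. Rows are read with the `csv` module rather than split on raw newlines, since a quoted field may span several lines.

```python
import csv
import os

# Email bodies can exceed the csv module's default field size limit.
csv.field_size_limit(2**31 - 1)


def split_csv(src_path, out_dir, max_bytes=100 * 1024 * 1024):
    """Split src_path into chunk files of roughly max_bytes each.

    The header row is written at the top of every chunk. Returns the
    list of chunk paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    writer = None
    with open(src_path, newline="", encoding="utf-8") as src:
        reader = csv.reader(src)
        header = next(reader, None)
        if header is None:
            return chunk_paths  # empty input, nothing to split
        for row in reader:
            # Start a new chunk once the current one reaches the size limit.
            if out is None or out.tell() >= max_bytes:
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"chunk-{len(chunk_paths):03d}.csv")
                out = open(path, "w", newline="", encoding="utf-8")
                writer = csv.writer(out)
                writer.writerow(header)
                chunk_paths.append(path)
            writer.writerow(row)
    if out is not None:
        out.close()
    return chunk_paths
```

Because the size check happens before each row is written, a chunk can exceed `max_bytes` by up to one row, which matches the "roughly 100 MB" framing. The resulting files would then be uploaded to S3 separately.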

The data format in our dataset is a CSV with two columns, the first being the email message location (on the mail server, presumably) and the second...
