Scalable Data Streaming with Amazon Kinesis

By: Makota, Brian Maguire, Gagne, Chakrabarti
Overview of this book

Amazon Kinesis is a collection of secure, serverless, durable, and highly available purpose-built data streaming services. These services provide APIs and client SDKs that enable you to produce and consume data at scale. Scalable Data Streaming with Amazon Kinesis begins with a quick overview of the core concepts of data streams, along with the essentials of the Amazon Kinesis landscape. You'll then explore the requirements of the use case used throughout the book to help you get started, and cover the key pain points encountered in the data stream life cycle. As you advance, you'll get to grips with the architectural components of Kinesis, understand how they are configured to build data pipelines, and delve into the applications that connect to them for consumption and processing. You'll also build a Kinesis data pipeline from scratch and learn how to implement and apply practical solutions. Moving on, you'll learn how to configure Kinesis on a cloud platform. Finally, you'll learn how other AWS services can be integrated with Kinesis. These integrations include Amazon Redshift, Amazon DynamoDB, Amazon S3, and Elasticsearch, as well as third-party applications such as Splunk. By the end of this AWS book, you'll be able to build and deploy your own Kinesis data pipelines with Kinesis Data Streams (KDS), Kinesis Data Firehose (KDF), Kinesis Video Streams (KVS), and Kinesis Data Analytics (KDA).
Table of Contents (13 chapters)
Section 1: Introduction to Data Streaming and Amazon Kinesis
Section 2: Deep Dive into Kinesis
Section 3: Integrations

Understanding data format conversion in KDF

KDF can convert incoming data from JSON to either Apache Parquet (Parquet) or Apache ORC (ORC) format. Parquet and ORC are popular columnar formats, as opposed to JSON and comma-separated values (CSV), which are row formats. Columnar formats offer more efficient storage and faster querying than row formats, especially in big-data use cases. In a row format, the data for all columns in a row is stored together, so a query on a subset of columns must still read the data for every column and then filter out the columns it does not need. In a columnar format, data is stored by column, so a query can retrieve only the columns it specifies. Less data is scanned to return query results and more of the reads are sequential, which improves performance. In addition, since the values in a column tend to be similar, columnar formats compress better. This results in space saving...
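
The difference in data scanned can be illustrated with a small sketch. This is a toy model only, not how Parquet or ORC are implemented and not part of KDF: the record fields (ride_id, city, fare) and the helper names scan_rows and scan_column are hypothetical, and JSON-encoded byte strings stand in for the real file formats. It stores the same records once in a row layout and once in a column layout, then counts how many bytes each layout has to read to answer a query on a single column.

```python
import json

# Hypothetical sample records; the field names are illustrative only.
records = [
    {"ride_id": i, "city": "NYC" if i % 2 == 0 else "SEA", "fare": 10.0 + (i % 5)}
    for i in range(1000)
]

# Row layout: all fields of a record are stored together (JSON Lines).
row_bytes = "\n".join(json.dumps(r) for r in records).encode("utf-8")

# Column layout: all values of one column are stored together.
column_bytes = {
    name: json.dumps([r[name] for r in records]).encode("utf-8")
    for name in records[0]
}


def scan_rows(data, wanted):
    """Read every row in full, then keep only the wanted field."""
    lines = data.decode("utf-8").splitlines()
    values = [json.loads(line)[wanted] for line in lines]
    return values, len(data)  # the whole row store had to be read


def scan_column(store, wanted):
    """Read only the bytes that belong to the wanted column."""
    data = store[wanted]
    return json.loads(data.decode("utf-8")), len(data)


row_values, row_scanned = scan_rows(row_bytes, "fare")
col_values, col_scanned = scan_column(column_bytes, "fare")

print("bytes scanned, row layout:   ", row_scanned)
print("bytes scanned, column layout:", col_scanned)
```

Both scans return the same fare values, but the column layout reads only the bytes of the one column it was asked for. Compression is deliberately left out of this sketch.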
