Big Data Forensics: Learning Hadoop Investigations

By: Joe Sremack

Overview of this book

Big Data forensics is an important type of digital investigation that involves the identification, collection, and analysis of large-scale Big Data systems. Hadoop is one of the most popular Big Data solutions, and forensically investigating a Hadoop cluster requires specialized tools and techniques. With the explosion of Big Data, forensic investigators need to be prepared to analyze the petabytes of data stored in Hadoop clusters. Understanding Hadoop’s operational structure and performing forensic analysis with court-accepted tools and best practices will help you conduct a successful investigation. Discover how to perform a complete forensic investigation of large-scale Hadoop clusters using the same tools and techniques employed by forensic experts. This book begins by taking you through the process of forensic investigation and the pitfalls to avoid. It will walk you through Hadoop's internals and architecture, and you will discover what types of information Hadoop stores and how to access that data. You will learn to identify Big Data evidence using techniques to survey a live system and interview witnesses. After setting up your own Hadoop system, you will collect evidence using techniques such as forensic imaging and application-based extractions. You will analyze Hadoop evidence using advanced tools and techniques to uncover events and statistical information. Finally, data visualization and evidence presentation techniques are covered to help you properly communicate your findings to any audience.

Validating application collections


Collecting application data requires a different form of information for validation. Validating a collection involves proving the following:

  • The collection was performed correctly and completely

  • The collected data is a replica of the source system's data

Unlike file-based collections, record-based collections compiled through an application are not typically validated with hash values. Hash values are useful for proving that data was not modified and that the collection was performed correctly; supplemental information (for example, collection logs) is used to prove that the collection was performed completely. Hash values, however, are not always appropriate for application collections, for several reasons:

  • Calculating hash values is unnecessary because no metadata or other artifacts are collected alongside the records

  • Large data volumes make computing hash values infeasible...
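The excerpt ends before describing what is used in place of hash values. As an illustration only, the following sketch assumes one common record-level check: comparing a record count and a control total (the sum of a numeric field) reported by the source application against the same figures computed from the collected export. The function names, the `amount` field, and the sample data are hypothetical and do not come from the book.

```python
import csv
import io
from decimal import Decimal


def control_totals(rows, amount_field):
    """Return (record_count, sum of amount_field) for an iterable of dict rows."""
    count = 0
    total = Decimal("0")
    for row in rows:
        count += 1
        # Decimal avoids floating-point drift when summing currency-like values.
        total += Decimal(row[amount_field])
    return count, total


def validate_collection(source_summary, collected_rows, amount_field):
    """Compare figures reported by the source with those of the collected data.

    source_summary is a (record_count, control_total) pair obtained
    independently from the source application (hypothetical here).
    """
    collected = control_totals(collected_rows, amount_field)
    return {
        "source": tuple(source_summary),
        "collected": collected,
        "match": collected == tuple(source_summary),
    }


# Hypothetical export produced by an application-based collection.
collected_csv = "id,amount\n1,10.50\n2,20.25\n3,5.00\n"
rows = list(csv.DictReader(io.StringIO(collected_csv)))

# Hypothetical figures reported by the source system for the same query.
result = validate_collection((3, Decimal("35.75")), rows, "amount")
print(result["match"])
```

A matching count and control total support the claim that the collection is complete and that the collected records replicate the source. They do not detect offsetting changes within the summed field, so a check like this would normally sit alongside collection logs rather than replace them.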
