Essential Statistics for Non-STEM Data Analysts

By: Li

Overview of this book

Statistics remain the backbone of modern analysis tasks, helping you to interpret the results produced by data science pipelines. This book is a detailed guide covering the math and various statistical methods required for undertaking data science tasks. The book starts by showing you how to preprocess data and inspect distributions and correlations from a statistical perspective. You’ll then get to grips with the fundamentals of statistical analysis and apply its concepts to real-world datasets. As you advance, you’ll find out how statistical concepts emerge from different stages of data science pipelines, understand the summary of datasets in the language of statistics, and use it to build a solid foundation for robust data products such as explanatory models and predictive models. Once you’ve uncovered the working mechanism of data science algorithms, you’ll cover essential concepts for efficient data collection, cleaning, mining, visualization, and analysis. Finally, you’ll implement statistical methods in key machine learning tasks such as classification, regression, tree-based methods, and ensemble learning. By the end of this Essential Statistics for Non-STEM Data Analysts book, you’ll have learned how to build and present a self-contained, statistics-backed data product to meet your business goals.
Table of Contents (19 chapters)

  • Section 1: Getting Started with Statistics for Data Science
  • Section 2: Essentials of Statistical Analysis
  • Section 3: Statistics for Machine Learning
  • Section 4: Appendix

Basic examples with the Python Matplotlib package

In this chapter, we will start with the most basic functionality of the Matplotlib package. Let's first look at the elements that make up a good statistical graph.

Elements of a statistical graph

Before we dive into Python code, I will give you an overview of how to decompose a statistical graph into its components. I personally think the philosophy embedded in the R ggplot2 package is very concise and clear.

Note

R is another popular programming language for data science and statistical analysis, with many successful packages of its own. The R counterpart of Matplotlib is the ggplot2 package mentioned previously.

ggplot2 is a very successful visualization tool developed by Hadley Wickham. It decomposes a statistical plot into the following three components:

  • Data: The data must have the information to display; otherwise, the plotting becomes totally misleading. The data can be transformed, such as with categorization...
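As a rough illustration of the Data component, the sketch below categorizes a continuous variable before plotting it with Matplotlib. The ages, the group boundaries, and the output filename are all made up for this example and do not come from the book; the non-interactive "Agg" backend is chosen only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: ages of 12 survey respondents (illustrative values only)
ages = np.array([18, 22, 25, 31, 34, 38, 41, 45, 52, 58, 63, 70])

# Transform the data: categorize continuous ages into three groups
bin_edges = [30, 50]  # hypothetical boundaries between the groups
labels = ["under 30", "30 to 49", "50 and over"]
group_index = np.digitize(ages, bin_edges)  # 0, 1, or 2 for each age
counts = np.bincount(group_index, minlength=len(labels))

# Plot the categorized data as a bar chart
fig, ax = plt.subplots()
ax.bar(labels, counts)
ax.set_xlabel("Age group")
ax.set_ylabel("Number of respondents")
ax.set_title("Respondents by age group")
fig.savefig("age_groups.png")  # hypothetical output filename
```

The same raw ages could also be drawn directly as a histogram; categorizing first is just one possible transformation, and the choice of boundaries changes what the graph communicates.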
