For machine learning models to produce accurate predictions, the incoming data should be presented in an ideal format. Ideally, no values are missing from the dataset, numerical features contain numerical data rather than categories or labels, and the features follow a normal (Gaussian) distribution. In the real world, however, these assumptions often do not hold. For this reason, once basic transformations such as joining or merging data are done, we should carry out a statistical exploration that reveals the real format of the data. The results of that exploration show us the difference between the ideal and the real format of the incoming data. This section describes techniques used to transform data from its real format into an ideal, comparable, and meaningful format.
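
To make the gap between the ideal and the real format concrete, the following is a minimal sketch in Python rather than T-SQL, using only the standard library. The function name `profile_column` and the sample columns are made up for illustration and are not taken from the book. For each column it reports how many values are missing, whether every present value is numeric, and a simple population skewness as a rough indicator of how far the distribution is from a symmetric, Gaussian-like shape.

```python
import statistics


def profile_column(values):
    """Summarize how far one column is from the 'ideal' format."""
    # Missing values are represented as None in this sketch.
    present = [v for v in values if v is not None]
    missing = len(values) - len(present)

    # A column counts as numeric only if every present value is int or float.
    numeric = [
        v for v in present
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    is_numeric = len(present) > 0 and len(numeric) == len(present)

    # Population skewness: about 0 for symmetric data, large for long tails.
    skewness = None
    if is_numeric and len(numeric) >= 3:
        mean = statistics.fmean(numeric)
        sd = statistics.pstdev(numeric)
        if sd > 0:
            skewness = sum(((x - mean) / sd) ** 3 for x in numeric) / len(numeric)

    return {"missing": missing, "is_numeric": is_numeric, "skewness": skewness}


# Hypothetical sample data: one column per feature.
columns = {
    "age": [23, 35, None, 41, 29],
    "city": ["Prague", "Brno", "Prague", None, "Ostrava"],
    "income": [10, 12, 11, 13, 90],
}

report = {name: profile_column(col) for name, col in columns.items()}
for name, summary in report.items():
    print(name, summary)
```

In this sample, `age` has a missing value, `city` holds labels instead of numbers, and `income` is strongly right-skewed because of one large value. Each of these findings points to a different kind of transformation.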

Hands-On Data Science with SQL Server 2017