Machine Learning with LightGBM and Python

By: Andrich van Wyk

Overview of this book

Machine Learning with LightGBM and Python is a comprehensive guide to learning the basics of machine learning and progressing to building scalable machine learning systems that are ready for release. This book will get you acquainted with the high-performance gradient-boosting LightGBM framework and show you how it can be used to solve various machine learning problems to produce highly accurate, robust, and predictive solutions. Starting with simple machine learning models in scikit-learn, you’ll explore the intricacies of gradient boosting machines and LightGBM. You’ll be guided through various case studies to better understand the data science processes and learn how to practically apply your skills to real-world problems. As you progress, you’ll elevate your software engineering skills by learning how to build and integrate scalable machine learning pipelines to process data, train models, and deploy them to serve secure APIs using Python tools such as FastAPI. By the end of this book, you’ll be well equipped to use various state-of-the-art tools that will help you build production-ready systems, including FLAML for AutoML, PostgresML for operating ML pipelines using Postgres, high-performance distributed training and serving via Dask, and creating and running models in the cloud with AWS SageMaker.
Table of Contents (17 chapters)
Part 1: Gradient Boosting and LightGBM Fundamentals
Part 2: Practical Machine Learning with LightGBM
Part 3: Production-ready Machine Learning with LightGBM

To get the most out of this book

This book is written assuming that you have some knowledge of Python programming. None of the Python code is very complex, so even understanding the basics of Python should be enough to get you through most of the code examples.

Jupyter notebooks are used for the practical examples in all the chapters. Jupyter Notebook is an open-source tool that allows you to create code notebooks that contain live code, visualizations, and Markdown text. Tutorials for getting started with Jupyter Notebook are available at https://realpython.com/jupyter-notebook-introduction/ and at https://plotly.com/python/ipython-notebook-tutorial/.

| Software/hardware covered in the book | Operating system requirements |
| --- | --- |
| Python 3.10 | Windows, macOS, or Linux |
| Anaconda 3 | Windows, macOS, or Linux |
| scikit-learn 1.2.1 | Windows, macOS, or Linux |
| LightGBM 3.3.5 | Windows, macOS, or Linux |
| XGBoost 1.7.4 | Windows, macOS, or Linux |
| Optuna 3.1.1 | Windows, macOS, or Linux |
| FLAML 1.2.3 | Windows, macOS, or Linux |
| FastAPI 0.103.1 | Windows, macOS, or Linux |
| Amazon SageMaker | |
| Docker 23.0.1 | Windows, macOS, or Linux |
| PostgresML 2.7.0 | Windows, macOS, or Linux |
| Dask 2023.7.1 | Windows, macOS, or Linux |

We recommend using Anaconda for Python environment management when setting up your own environment. Anaconda also bundles many data science packages, so you don’t have to install them individually. Anaconda can be downloaded from https://www.anaconda.com/download. Notably, the book is accompanied by a GitHub repository, which includes an Anaconda environment file, to create the environment required to run the code examples in this book.
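Once an environment is in place, it can be useful to confirm that the installed Python packages match the versions listed above. The following is a minimal sketch, not part of the book's repository: the `check_versions` helper and the `EXPECTED` dictionary are hypothetical names introduced here, and the dictionary keys are assumed to be the PyPI distribution names of the listed packages. It relies only on the standard library's `importlib.metadata`.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions taken from the software list above.
# Keys are ASSUMED PyPI distribution names; adjust them if your
# environment installs a package under a different name.
EXPECTED = {
    "scikit-learn": "1.2.1",
    "lightgbm": "3.3.5",
    "xgboost": "1.7.4",
    "optuna": "3.1.1",
    "flaml": "1.2.3",
    "fastapi": "0.103.1",
    "dask": "2023.7.1",
}


def check_versions(expected, lookup=version):
    """Return {package: (expected, installed_or_None)} for every mismatch.

    `lookup` defaults to importlib.metadata.version; it is a parameter
    only so the function can be exercised without real packages.
    """
    mismatches = {}
    for name, wanted in expected.items():
        try:
            found = lookup(name)
        except PackageNotFoundError:
            found = None  # package is not installed at all
        if found != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    problems = check_versions(EXPECTED)
    for name, (wanted, found) in problems.items():
        print(f"{name}: expected {wanted}, found {found or 'not installed'}")
    if not problems:
        print("All listed packages match the expected versions.")
```

A mismatch does not necessarily mean the examples will fail; newer releases may work, but the listed versions are the ones the book states it covers.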

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.