Python for Geeks

By: Muhammad Asif

Overview of this book

Python is a multipurpose language that can be used for multiple use cases. Python for Geeks will teach you how to advance in your career with the help of expert tips and tricks. You'll start by exploring the different ways of using Python optimally, both from the design and implementation point of view. Next, you'll understand the life cycle of a large-scale Python project. As you advance, you'll focus on different ways of creating an elegant design by modularizing a Python project and learn best practices and design patterns for using Python. You'll also discover how to scale out Python beyond a single thread and how to implement multiprocessing and multithreading in Python. In addition to this, you'll understand how you can not only use Python to deploy on a single machine but also use clusters in private as well as in public cloud computing environments. You'll then explore data processing techniques, focus on reusable, scalable data pipelines, and learn how to use these advanced techniques for network automation, serverless functions, and machine learning. Finally, you'll focus on strategizing web development design using the techniques and best practices covered in the book. By the end of this Python book, you'll be able to do some serious Python programming for large-scale complex projects.
Table of Contents (20 chapters)
Section 1: Python, beyond the Basics
Section 2: Advanced Programming Concepts
Section 3: Scaling beyond a Single Thread
Section 4: Using Python for Web, Cloud, and Network Use Cases

Case studies of using Apache Spark and PySpark

In previous sections, we covered the fundamental concepts and architecture of Apache Spark and PySpark. In this section, we will work through two case studies, each implementing an interesting and popular application of Apache Spark.

Case study 1 – Pi (π) calculator on Apache Spark

We will calculate Pi (π) using the Apache Spark cluster that is running on our local machine. Pi equals the area of a circle whose radius is 1. Before discussing the algorithm and the driver program for this application, it is important to introduce the Apache Spark setup used for this case study.
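As a rough illustration of the area definition of Pi, here is a minimal single-machine sketch that does not use Spark at all. It assumes a Monte Carlo approach, which may or may not match the algorithm this case study uses: random points are drawn in the unit square, and the fraction that lands inside the quarter circle of radius 1 approximates π/4. The function name `estimate_pi` and its parameters are hypothetical, chosen only for this example.

```python
import random


def estimate_pi(num_samples: int, seed: int = 42) -> float:
    """Estimate Pi by sampling random points in the unit square.

    Assumption: a Monte Carlo method is used here for illustration only.
    A point (x, y) lies inside the quarter circle of radius 1 when
    x^2 + y^2 <= 1, and that quarter circle has area Pi / 4.
    """
    rng = random.Random(seed)  # seeded so repeated runs give the same result
    inside = 0
    for _ in range(num_samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            inside += 1
    # The inside fraction approximates Pi / 4, so scale it by 4.
    return 4.0 * inside / num_samples


if __name__ == "__main__":
    print(estimate_pi(1_000_000))
```

Each sample is independent of the others, which is why this kind of computation splits naturally across the workers of a cluster: every worker can count its own hits, and the counts are summed at the end.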

Setting up the Apache Spark cluster

In all previous code examples, we used PySpark locally installed on our machine without a cluster. For this case study, we will set up an Apache Spark cluster by using multiple virtual machines. There are many virtualization software tools available, such as VirtualBox, and any of these software tools will...