
Python Parallel Programming Cookbook - Second Edition

By : Giancarlo Zaccone

Overview of this book

Nowadays, it has become extremely important for programmers to understand the link between their software and the parallel nature of their hardware so that their programs run efficiently on modern computer architectures. Applications based on parallel programming are fast, robust, and easily scalable.

This updated edition features cutting-edge techniques for building effective concurrent applications in Python 3.7. The book introduces parallel programming architectures and covers the fundamental recipes for thread-based and process-based parallelism. You'll learn about mutexes, semaphores, locks, and queues, exploiting the threading and multiprocessing modules, all of which are basic tools for building parallel applications. Recipes on MPI programming will help you synchronize processes using the fundamental message-passing techniques with mpi4py. Furthermore, you'll get to grips with asynchronous programming and how to use the power of the GPU with the PyCUDA and PyOpenCL frameworks. Finally, you'll explore how to design distributed computing systems with Celery and architect Python apps on the cloud using PythonAnywhere, Docker, and serverless applications.

By the end of this book, you will be confident in building concurrent and high-performing applications in Python.

Introducing Python parallel programming

Python provides many libraries and frameworks that facilitate high-performance computation. However, parallel programming with Python can be quite tricky due to the Global Interpreter Lock (GIL).

In fact, the most widespread Python interpreter, CPython, is written in the C programming language, and it needs the GIL to make its internal operations thread-safe. In practice, the GIL is a global lock that a thread must acquire before it can execute Python bytecode or access the C API, and only one thread at a time can hold it.
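The GIL serializes the execution of individual bytecodes, but a compound operation such as counter += 1 spans several bytecodes, so thread switches can still interleave it and explicit synchronization remains necessary. The following is a minimal sketch, not from the book, with illustrative names:

```python
import threading

counter = 0
lock = threading.Lock()

def safe_increment(n):
    # The GIL alone does not make "counter += 1" atomic:
    # it is a read-modify-write that spans several bytecodes.
    global counter
    for _ in range(n):
        with lock:
            counter += 1

workers = [threading.Thread(target=safe_increment, args=(100000,))
           for _ in range(2)]
for w in workers:
    w.start()
for w in workers:
    w.join()
print(counter)  # 200000
```

Without the lock, some increments could be lost whenever the interpreter switches threads in the middle of the read-modify-write.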

Fortunately, things are not so serious because, outside the realm of the GIL, we can use parallelism freely. This category includes all the topics that we will discuss in the following chapters, including multiprocessing, distributed computing, and GPU computing.

So, Python threads cannot truly run in parallel. But what is a thread? What is a process? In the following sections, we will introduce these two fundamental concepts and see how they are addressed by the Python programming language.

Processes and threads

Threads can be compared to lightweight processes, in the sense that they offer advantages similar to those of processes without requiring the typical communication techniques of processes. Threads allow you to divide the main control flow of a program into multiple concurrently running flows of control. Processes, by contrast, have their own address space and their own resources. It follows that parts of code running in different processes can communicate only through appropriate management mechanisms, including pipes, FIFOs, mailboxes, shared memory areas, and message passing. Threads, on the other hand, allow the creation of concurrent parts of the program in which each part can access the same address space, variables, and constants.
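To illustrate one of these inter-process mechanisms, the following sketch (the function name is illustrative, not from the book) passes a message between two processes through a pipe:

```python
import multiprocessing

def send_greeting(conn):
    # Runs in the child process: writes one message into the pipe.
    conn.send("hello from the child process")
    conn.close()

if __name__ == "__main__":
    parent_conn, child_conn = multiprocessing.Pipe()
    p = multiprocessing.Process(target=send_greeting, args=(child_conn,))
    p.start()
    print(parent_conn.recv())  # hello from the child process
    p.join()
```

Because the two processes do not share memory, the message is serialized into the pipe by the child and deserialized by the parent.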

The following table summarizes the main differences between threads and processes:

Threads                                                    | Processes
-----------------------------------------------------------|------------------------------------------------------------
Share memory.                                              | Do not share memory.
Are computationally cheap to start and to switch between.  | Are computationally expensive to start and to switch between.
Require fewer resources (lightweight processes).           | Require more computational resources.
Need synchronization mechanisms to handle data correctly.  | Do not require memory synchronization mechanisms.

After this brief introduction, we can finally show how processes and threads operate.
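Before timing anything, the first row of the table can be demonstrated directly: all threads of a process share one address space, so a worker thread can mutate a list object created in the main thread. A minimal sketch, with illustrative names, not from the book:

```python
import threading

def append_items(shared, count):
    # The worker thread writes directly into the list owned by the caller.
    for i in range(count):
        shared.append(i)

data = []  # created in the main thread; all threads see the same object
worker = threading.Thread(target=append_items, args=(data, 5))
worker.start()
worker.join()
print(data)  # [0, 1, 2, 3, 4]
```

A separate process, by contrast, would operate on its own copy of the list, as we will see at the end of this section.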

In particular, we want to compare the serial, multithreaded, and multiprocess execution times of the following function, do_something, which performs some basic work: building a list of randomly selected floating-point numbers (the do_something.py file):

import random

def do_something(count, out_list):
    for i in range(count):
        out_list.append(random.random())

Next, there is the serial (serial_test.py) implementation. Let's start with the relevant imports:

from do_something import *
import time

Note the import of the time module, which will be used to evaluate the execution time of the serial implementation of the do_something function. The size of the list to build is equal to 10000000, while the do_something function will be executed 10 times:

if __name__ == "__main__":
    start_time = time.time()
    size = 10000000
    n_exec = 10
    for i in range(0, n_exec):
        out_list = list()
        do_something(size, out_list)

    print("List processing complete.")
    end_time = time.time()
    print("serial time=", end_time - start_time)

Next, we have the multithreaded implementation (multithreading_test.py).

Import the relevant libraries:

from do_something import *
import time
import threading

Note the import of the threading module in order to operate with the multithreading capabilities of Python.

Here is the multithreaded execution of the do_something function. We will not comment in depth on the instructions in the following code, as they will be discussed in more detail in Chapter 2, Thread-Based Parallelism.

Note that, in this case too, the length of the list is the same as in the serial case (size = 10000000), while the number of threads is 10 (threads = 10), which is also the number of times the do_something function is executed:

if __name__ == "__main__":
    start_time = time.time()
    size = 10000000
    threads = 10
    jobs = []
    for i in range(0, threads):

Note also the construction of each thread through the threading.Thread constructor:

        out_list = list()
        thread = threading.Thread(target=do_something, args=(size, out_list))
        jobs.append(thread)

The sequence of loops in which we start the threads and then wait for them to finish is as follows:

    for j in jobs:
        j.start()
    for j in jobs:
        j.join()

    print("List processing complete.")
    end_time = time.time()
    print("multithreading time=", end_time - start_time)

Finally, there is the multiprocessing implementation (multiprocessing_test.py).

We start by importing the necessary modules and, in particular, the multiprocessing library, whose features will be explained in-depth in Chapter 3, Process-Based Parallelism:

from do_something import *
import time
import multiprocessing

As in the previous cases, the length of the list to build (size) and the number of times the do_something function is executed (procs = 10) remain the same:

if __name__ == "__main__":
    start_time = time.time()
    size = 10000000
    procs = 10
    jobs = []
    for i in range(0, procs):
        out_list = list()

Here, a single process is created through the multiprocessing.Process constructor as follows:

        process = multiprocessing.Process(
            target=do_something, args=(size, out_list))
        jobs.append(process)

Next, the sequence of loops in which we start the processes and then wait for them to finish is executed as follows:

    for j in jobs:
        j.start()

    for j in jobs:
        j.join()

    print("List processing complete.")
    end_time = time.time()
    print("multiprocesses time=", end_time - start_time)

Then, we open a command shell, go to the folder where the three scripts have been copied, and run each of them in turn:

> python serial_test.py

The result, obtained on a machine with an Intel i7 CPU and 8 GB of RAM, is as follows:

List processing complete.
serial time= 25.428767204284668

In the case of the multithreading implementation, we have the following:

> python multithreading_test.py

The output is as follows:

List processing complete.
multithreading time= 26.168917179107666

Finally, there is the multiprocessing implementation:

> python multiprocessing_test.py

Its result is as follows:

List processing complete.
multiprocesses time= 18.929869890213013

As can be seen, the serial implementation (serial_test.py) and the multithreaded implementation (multithreading_test.py) yield similar times: because of the GIL, the threads essentially run one after another, each yielding to the next until all have finished. The multiprocessing implementation (multiprocessing_test.py), by contrast, delivers a clear improvement in execution time.
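One caveat worth noting (an observation, not from the book): in the multiprocessing version, each child process appends to its own copy of out_list, so the parent process never sees the generated data. This does not affect the timing comparison, but if the results were actually needed, a shared structure such as a list proxy from multiprocessing.Manager could be used instead. A minimal sketch, with illustrative names:

```python
import multiprocessing

def fill(count, out_list):
    # Runs in the child process and appends to out_list.
    for i in range(count):
        out_list.append(i)

if __name__ == "__main__":
    # A plain list: the child mutates its own copy, the parent sees nothing.
    plain = []
    p = multiprocessing.Process(target=fill, args=(5, plain))
    p.start()
    p.join()
    print(len(plain))  # 0

    # A managed list: appends are forwarded to the manager process,
    # so the parent observes the child's changes.
    manager = multiprocessing.Manager()
    shared = manager.list()
    p = multiprocessing.Process(target=fill, args=(5, shared))
    p.start()
    p.join()
    print(len(shared))  # 5
```

Note that the managed list adds inter-process communication overhead, which is one reason the book's timing scripts simply let each process build a private list.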