Multiprocessing in Python

Prem Vishnoi(cloudvala)
Sep 17, 2023

Let’s discuss what multiprocessing is in Python and how it works.


Overview

The multiprocessing module was added to Python in version 2.6. It was originally defined in PEP 371 by Jesse Noller and Richard Oudkerk. The multiprocessing module allows you to spawn processes in much the same manner as you can spawn threads with the threading module. Because each process runs its own Python interpreter, you can avoid the Global Interpreter Lock (GIL) and take full advantage of multiple processors on a machine.

The multiprocessing package also includes some APIs that are not in the threading module at all. For example, there is a neat Pool class that you can use to parallelize executing a function across multiple inputs. We will be looking at Pool later. We will start with the multiprocessing module’s Process class.
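As a quick preview of what Pool looks like in practice, here is a minimal sketch. The doubler and double_all names and the worker count of 3 are illustrative choices for this example, not something the multiprocessing module prescribes.

```python
from multiprocessing import Pool


def doubler(number):
    """Return twice the given number."""
    return number * 2


def double_all(numbers, workers=3):
    """Double every number using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        # map() hands the inputs out to the workers and returns the
        # results in the same order as the inputs
        return pool.map(doubler, numbers)


if __name__ == '__main__':
    print(double_all([5, 10, 15]))  # prints [10, 20, 30]
```

Note that the worker function returns its result instead of printing it, so the parent process can collect the values.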

Getting started with multiprocessing

The Process class is very similar to the threading module’s Thread class. Let’s try creating a series of processes that call the same function and see how that works:




from multiprocessing import Process


# Define a function that will run in a separate process
def my_function():
    print("This is running in a separate process")


if __name__ == "__main__":
    # Create a Process object
    my_process = Process(target=my_function)

    # Start the process
    my_process.start()

    # Wait for the process to complete (optional)
    my_process.join()

    print("Main process continues to run")

That is the simplest case: a single process. The next example starts a series of processes that all call the same doubler function:

import os
from multiprocessing import Process


def doubler(number):
    """
    A doubling function that can be used by a process
    """
    result = number * 2
    proc = os.getpid()  # ID of the process running this call
    print('{0} doubled to {1} by process id: {2}'.format(
        number, result, proc))


if __name__ == '__main__':
    numbers = [5, 10, 15, 20, 25]
    procs = []

    # Start one process per number
    for number in numbers:
        proc = Process(target=doubler, args=(number,))
        procs.append(proc)
        proc.start()

    # Wait for every process to finish
    for proc in procs:
        proc.join()

For this example, we import Process and create a doubler function. Inside the function, we double the number that was passed in. We also use Python’s os module to get the current process’s ID (pid) by calling os.getpid(). This tells us which process is calling the function. Then, in the block of code at the bottom, we create a series of Process objects and start them. The very last loop calls the join() method on each process, which tells Python to wait for that process to terminate. If you need to stop a process, you can call its terminate() method.
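To make the join() and terminate() calls concrete, here is a small sketch that stops a process that runs too long. The slow_task and run_with_timeout helpers are hypothetical names invented for this illustration, and the 60-second sleep simply stands in for real work.

```python
import time
from multiprocessing import Process


def slow_task():
    """Stand-in for long-running work; sleeps far longer than we wait."""
    time.sleep(60)


def run_with_timeout(target, timeout):
    """Run target in a process and stop it if it outlives the timeout.

    Returns True if the process finished on its own, False if it had
    to be terminated.
    """
    proc = Process(target=target)
    proc.start()
    proc.join(timeout)        # wait at most `timeout` seconds
    if proc.is_alive():
        proc.terminate()      # ask the operating system to stop it
        proc.join()           # wait for the shutdown to complete
        return False
    return True


if __name__ == '__main__':
    finished = run_with_timeout(slow_task, timeout=1)
    print('finished on its own:', finished)  # prints: finished on its own: False
```

Calling join() again after terminate() is a deliberate choice here: it makes sure the process has fully stopped before the function returns.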

Simple example using Process class

Sometimes it’s nicer to have a more human-readable name for your process. Fortunately, the Process class does allow you to access the name of your process. Let’s take a look:

from multiprocessing import Process, current_process


def doubler(number):
    """
    A doubling function that can be used by a process
    """
    result = number * 2
    proc_name = current_process().name  # name of the process running this call
    print('{0} doubled to {1} by: {2}'.format(
        number, result, proc_name))


if __name__ == '__main__':
    numbers = [5, 10, 15, 20, 25]
    procs = []

    # These processes get default names assigned by multiprocessing
    for number in numbers:
        proc = Process(target=doubler, args=(number,))
        procs.append(proc)
        proc.start()

    # This process gets an explicit name
    proc = Process(target=doubler, name='Test', args=(2,))
    proc.start()
    procs.append(proc)

    # Wait for every process to finish
    for proc in procs:
        proc.join()

This time around, we import something extra: current_process. The current_process function is basically the same thing as the threading module’s current_thread. We use it to grab the name of the process that is calling our function.

The output demonstrates that the multiprocessing module assigns a number to each process as a part of its name by default. Of course, when we specify a name, a number isn’t going to get added to it.
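The naming behavior can also be checked from the parent process without starting anything, because a Process object receives its name when it is created. The sketch below assumes a placeholder target called noop, which is a hypothetical helper used only for this illustration.

```python
from multiprocessing import Process


def noop():
    """Placeholder target; the processes below are never started."""


if __name__ == '__main__':
    unnamed = Process(target=noop)
    named = Process(target=noop, name='Test')

    print(unnamed.name)  # a default, numbered name such as Process-1
    print(named.name)    # prints Test
```

Reading the name attribute in the parent like this can be handy for logging which process was launched for which input.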
