How to Perform Parallel Processing in Python with Multiprocessing?

The multiprocessing module in Python lets you create and manage multiple processes that run
tasks in parallel. Because each process has its own interpreter and its own global interpreter lock (GIL),
this can substantially speed up CPU-bound tasks on multi-core CPUs.

Here is a step-by-step tutorial on using multiprocessing for parallel processing in Python:

Import the multiprocessing module:

Start by importing the multiprocessing module at the beginning of your Python script:

import multiprocessing

Define the function to be parallelized:

Write a function that represents the operation you want to parallelize. It should accept an input, perform some computation, and return the result. Define it at the top level of a module so that it can be pickled and sent to the worker processes. For instance:

def process_data(data):
    # Your processing logic here
    result = data * 2
    return result

Create a Pool of worker processes:

The multiprocessing.Pool class makes it easy to manage a pool of worker processes. You can specify how many processes to use; for CPU-bound tasks this is often the number of CPU cores:

# Use all available CPU cores
num_processes = multiprocessing.cpu_count()
pool = multiprocessing.Pool(processes=num_processes)
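
As a rough sketch of how the worker count might be chosen, the helper below caps the number of processes at the number of tasks, since extra workers would sit idle. The name choose_worker_count is hypothetical and not part of the multiprocessing module. Because os.cpu_count() can return None when the count cannot be determined, the sketch falls back to 1.

```python
import os


def choose_worker_count(num_tasks):
    """Return a worker count no larger than the CPU count or the task count."""
    cpu_count = os.cpu_count() or 1  # os.cpu_count() may return None
    return max(1, min(cpu_count, num_tasks))


if __name__ == "__main__":
    # With 10 tasks, this prints at most 10 and at most the CPU count.
    print(choose_worker_count(10))
```

The returned value can be passed as the processes argument when creating the pool.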

Split the data:

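If you have a large dataset to process, it has to be divided so that each process works on a portion of it. The map method of the pool does this for you: it splits the iterable into chunks and submits them to the workers, so you only need to provide the data as a list or another iterable.

data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # Example data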
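
To make the chunking visible, here is a minimal sketch of splitting a list by hand. The helper names split_into_chunks and process_chunk are hypothetical, chosen only for illustration. Manual splitting like this is mainly useful when each task should receive a whole batch; otherwise the pool's own chunking is usually enough.

```python
import multiprocessing


def split_into_chunks(items, chunk_size):
    """Split a list into consecutive slices of at most chunk_size items."""
    return [items[i:i + chunk_size] for i in range(0, len(items), chunk_size)]


def process_chunk(chunk):
    """Double every value in one chunk (stand-in for real batch work)."""
    return [value * 2 for value in chunk]


if __name__ == "__main__":
    data = list(range(1, 11))
    chunks = split_into_chunks(data, 4)  # [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10]]

    with multiprocessing.Pool(processes=2) as pool:
        chunk_results = pool.map(process_chunk, chunks)

    # Flatten the per-chunk lists back into one list, preserving order.
    results = [value for chunk in chunk_results for value in chunk]
    print(results)
```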

Map the function to the data:

Use the map method of the Pool object to distribute the data across the worker processes and run the function in parallel. The call blocks until all results are ready and returns them in the same order as the inputs:

results = pool.map(process_data, data)
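
The sketch below illustrates, for a trivial workload, that map returns results in input order even though the items may be handled by different workers. The wrapper name run_in_parallel is hypothetical, and the chunksize value of 3 is arbitrary; it only controls roughly how many items are sent to a worker at a time.

```python
import multiprocessing


def process_data(data):
    return data * 2


def run_in_parallel(values, num_processes=2, chunksize=3):
    """Apply process_data to every value using a pool of worker processes."""
    pool = multiprocessing.Pool(processes=num_processes)
    try:
        return pool.map(process_data, values, chunksize=chunksize)
    finally:
        # Release the workers even if map() raised an exception.
        pool.close()
        pool.join()


if __name__ == "__main__":
    # The output order matches the input order: [10, 2, 8, 4, 6]
    print(run_in_parallel([5, 1, 4, 2, 3]))
```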

Close and join the pool:

After the processing is complete, close and join the pool so that all worker processes are cleaned up properly. close() stops the pool from accepting new tasks, and join() waits for the workers to exit:

pool.close()
pool.join()
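
As an alternative sketch, a Pool can also be used as a context manager. Leaving the with block calls terminate() on the pool rather than close() and join(), so this pattern assumes all results have been collected inside the block, which is the case for the blocking map call. The function name double_all is hypothetical.

```python
import multiprocessing


def process_data(data):
    return data * 2


def double_all(values):
    # map() blocks until every result is ready, so no work is still
    # running when the with block exits and terminates the workers.
    with multiprocessing.Pool() as pool:
        return pool.map(process_data, values)


if __name__ == "__main__":
    print(double_all([1, 2, 3]))
```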

Process the results:

You can now work with the results obtained from the parallel processing, which are stored in the results list.

Complete Example:

The following script combines all of the steps above:

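import multiprocessing


def process_data(data):
    # Stand-in for real CPU-bound work
    result = data * 2
    return result


if __name__ == "__main__":
    # One worker process per CPU core
    num_processes = multiprocessing.cpu_count()
    pool = multiprocessing.Pool(processes=num_processes)

    data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    # Blocks until all workers finish; results keep the input order
    results = pool.map(process_data, data)

    # Stop accepting tasks and wait for the workers to exit
    pool.close()
    pool.join()

    print("Results:", results)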

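Make sure to include the if __name__ == "__main__": guard. On platforms that start worker processes by
spawning a fresh interpreter (the default on Windows and macOS), each worker re-imports the main module,
and without the guard the pool-creation code would be executed again in every worker.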
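
To see which start method a given platform uses, and hence why the guard matters, a small sketch like the following may help. It only reports the start method and runs one pool call; the function name report_start_method is hypothetical.

```python
import multiprocessing


def process_data(data):
    return data * 2


def report_start_method():
    """Return the start method multiprocessing uses on this platform."""
    return multiprocessing.get_start_method()


if __name__ == "__main__":
    # Only the process that runs this file as a script enters this block.
    # Workers started with "spawn" import the file under a different
    # module name, so they skip it instead of creating pools of their own.
    print("Start method:", report_start_method())
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(process_data, [1, 2, 3]))
```
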
That is it! You have successfully performed parallel processing in Python using the multiprocessing
module. This approach can help you leverage the full processing power of your CPU and speed up
computationally intensive tasks.
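
To check whether parallelism actually pays off for a given workload, one option is to time a serial run against a pooled run. The sketch below assumes a hypothetical CPU-bound function cpu_heavy and arbitrary input sizes. Real timings depend on the machine, and for very small tasks the cost of starting processes and transferring data can outweigh the gain.

```python
import multiprocessing
import time


def cpu_heavy(n):
    """Sum of squares below n, a stand-in for CPU-bound work."""
    return sum(i * i for i in range(n))


def time_call(func, *args):
    """Return (result, elapsed_seconds) for a single call."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def run_serial(inputs):
    return [cpu_heavy(n) for n in inputs]


def run_parallel(inputs):
    with multiprocessing.Pool() as pool:
        return pool.map(cpu_heavy, inputs)


if __name__ == "__main__":
    inputs = [1_000_000] * 8  # arbitrary workload size
    serial, serial_time = time_call(run_serial, inputs)
    parallel, parallel_time = time_call(run_parallel, inputs)
    assert serial == parallel  # both approaches compute the same values
    print(f"serial:   {serial_time:.2f} s")
    print(f"parallel: {parallel_time:.2f} s")
```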