How to Share Large Readonly Data Efficiently in Python Multiprocessing?

Linda Hamilton
Release: 2024-10-24 18:45:50

Maintaining Shared Readonly Data in Multiprocessing

Question:

In a Python multiprocessing environment, how can you ensure that a sizeable read-only array (e.g., 3 GB) is shared among multiple processes without creating copies?

Answer:

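The shared memory facilities of the multiprocessing module, combined with NumPy, allow data to be shared efficiently between processes. The following listing allocates a 10x10 array of doubles in shared memory and wraps it in a NumPy array without copying it: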

<code class="python">import multiprocessing
import ctypes
import numpy as np

# Allocate 10*10 doubles in shared memory (zero-initialized).
shared_array_base = multiprocessing.Array(ctypes.c_double, 10*10)
# Wrap the shared buffer in a NumPy array without copying it.
shared_array = np.ctypeslib.as_array(shared_array_base.get_obj())
shared_array = shared_array.reshape(10, 10)</code>

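Because the buffer is allocated by multiprocessing.Array, writes made by worker processes are visible to the parent. For data that is only read, an explicit shared array is not strictly necessary on Linux: fork() uses copy-on-write semantics, so memory pages inherited by child processes are duplicated only when they are modified. An ordinary NumPy array created before the workers are forked is therefore effectively shared as long as no process writes to it. This holds only when processes are started with fork; with the spawn start method (the default on Windows and macOS), each worker re-imports the module and builds its own copy.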
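The copy-on-write case for purely read-only data can be sketched as follows. This is a minimal illustration, not part of the original answer: the names `DATA`, `_row_block_sum`, and `parallel_sum` are hypothetical, a small array stands in for the multi-gigabyte one, and the fork start method is assumed to be available (POSIX systems).

```python
import multiprocessing as mp

import numpy as np

# Hypothetical stand-in for the large read-only array. It is created before
# any worker is forked, so children inherit it instead of receiving a copy.
DATA = np.arange(1_000_000, dtype=np.float64).reshape(1000, 1000)
DATA.setflags(write=False)  # guard against accidental writes


def _row_block_sum(start, stop, out_queue):
    # Only reads DATA, so its pages never have to be duplicated.
    out_queue.put(float(DATA[start:stop].sum()))


def parallel_sum(n_workers=4):
    """Sum DATA in row blocks, one forked worker per block."""
    ctx = mp.get_context("fork")  # assumption: fork is available (POSIX)
    queue = ctx.Queue()
    bounds = np.linspace(0, DATA.shape[0], n_workers + 1, dtype=int)
    workers = [
        ctx.Process(target=_row_block_sum, args=(int(a), int(b), queue))
        for a, b in zip(bounds[:-1], bounds[1:])
    ]
    for w in workers:
        w.start()
    partials = [queue.get() for _ in workers]  # collect results before joining
    for w in workers:
        w.join()
    return sum(partials)


if __name__ == "__main__":
    print(parallel_sum() == float(DATA.sum()))
```

Only the block boundaries and the partial sums travel between processes; the array itself is never pickled. When workers need to write results back into the array, an explicitly shared buffer such as multiprocessing.Array is still required.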

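<code class="python"># Parallel processing (continues from the listing that defines shared_array)
def my_func(i, def_param=shared_array):
    # Each task fills row i; the write lands in the shared buffer.
    shared_array[i,:] = i

if __name__ == '__main__':
    # Assumes the fork start method, so workers inherit shared_array.
    pool = multiprocessing.get_context('fork').Pool(processes=4)
    pool.map(my_func, range(10))
    pool.close()
    pool.join()

    print(shared_array)</code>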

Each worker writes one row of the shared array, and the parent process then prints it. The output shows that the workers' writes went to the same memory the parent sees:

[[ 0.  0.  0.  0.  0.  0.  0.  0.  0.  0.]
 [ 1.  1.  1.  1.  1.  1.  1.  1.  1.  1.]
 [ 2.  2.  2.  2.  2.  2.  2.  2.  2.  2.]
 [ 3.  3.  3.  3.  3.  3.  3.  3.  3.  3.]
 [ 4.  4.  4.  4.  4.  4.  4.  4.  4.  4.]
 [ 5.  5.  5.  5.  5.  5.  5.  5.  5.  5.]
 [ 6.  6.  6.  6.  6.  6.  6.  6.  6.  6.]
 [ 7.  7.  7.  7.  7.  7.  7.  7.  7.  7.]
 [ 8.  8.  8.  8.  8.  8.  8.  8.  8.  8.]
 [ 9.  9.  9.  9.  9.  9.  9.  9.  9.  9.]]

In short, multiprocessing.Array covers data that workers must modify, while copy-on-write under fork() covers data they only read. Either way, a large array can be shared between processes without being duplicated.
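Where inheriting memory through fork() is not an option, the standard library's multiprocessing.shared_memory module (Python 3.8 and later) offers a named block that workers attach to explicitly. The sketch below is an illustration under assumptions rather than the original answer's method: `SHAPE`, `DTYPE`, `_column_sum`, and `shared_column_sum` are hypothetical names, and a tiny array stands in for the large one. The worker is forked here for brevity, but since only the block's name (a string) is passed to it, the same worker function should also be usable with other start methods.

```python
import multiprocessing as mp
from multiprocessing import shared_memory

import numpy as np

# Hypothetical small stand-in for the large array's layout.
SHAPE, DTYPE = (4, 5), np.float64


def _column_sum(shm_name, column, out_queue):
    # Attach to the existing block by name; no array data is pickled or copied.
    shm = shared_memory.SharedMemory(name=shm_name)
    view = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
    out_queue.put(float(view[:, column].sum()))
    del view  # drop the NumPy view before closing the mapping
    shm.close()


def shared_column_sum(column=2):
    """Publish an array in a named block and let one worker read it."""
    nbytes = int(np.prod(SHAPE)) * np.dtype(DTYPE).itemsize
    shm = shared_memory.SharedMemory(create=True, size=nbytes)
    data = np.ndarray(SHAPE, dtype=DTYPE, buffer=shm.buf)
    data[:] = np.arange(20, dtype=DTYPE).reshape(SHAPE)

    ctx = mp.get_context("fork")  # assumption: fork is available (POSIX)
    queue = ctx.Queue()
    worker = ctx.Process(target=_column_sum, args=(shm.name, column, queue))
    worker.start()
    result = queue.get()
    worker.join()

    del data      # release the view before closing
    shm.close()
    shm.unlink()  # free the block once every process is done with it
    return result


if __name__ == "__main__":
    print(shared_column_sum())
```

The creating process is responsible for calling unlink() exactly once; every process that attaches calls close() on its own handle.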
