Handling Python Multiprocessing PicklingError with .NET Objects
When working with large datasets for plotting, performance can become a bottleneck. To speed up the process, I used Python’s multiprocessing module to parallelize data processing and retrieval. However, I encountered an error:
Python multiprocessing PicklingError: Can’t pickle XXX object
This error occurred because the object I was passing to the multiprocessing function contained a .NET object generated by the hardware instrument. Python’s multiprocessing uses pickling to serialize the objects it sends to worker processes, but .NET objects cannot be pickled. In this post, I’ll explain the issue and share the workaround that allowed me to use multiprocessing successfully.
Python’s multiprocessing module creates separate processes to execute functions in parallel. These processes do not share memory, so any object passed to a worker or returned from one must be serialized (pickled) to cross the process boundary. However, certain objects, including those from .NET, cannot be pickled.
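To see the limitation in isolation, here is a minimal sketch that probes whether an object survives pickling. No .NET runtime is involved: the hypothetical `FakeInstrumentHandle` class holds a `threading.Lock`, which pickle also rejects, as a stand-in for an object that wraps a native resource. The helper name `is_picklable` is likewise an assumption made for illustration.

```python
import pickle
import threading


def is_picklable(obj):
    """Return True if pickle can serialize obj, False otherwise."""
    try:
        pickle.dumps(obj)
    except Exception:
        # pickle raises PicklingError, TypeError, or AttributeError
        # depending on the kind of object it cannot handle.
        return False
    return True


class FakeInstrumentHandle:
    """Hypothetical stand-in for an object that wraps a native resource."""

    def __init__(self):
        self.lock = threading.Lock()  # locks cannot be pickled


print(is_picklable({"samples": [1, 2, 3]}))   # plain data: True
print(is_picklable(FakeInstrumentHandle()))   # holds a lock: False
```

Running a check like this in the parent process before submitting work can help locate which attribute of a larger object is responsible for the error.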
Since the .NET objects are not picklable, an alternative approach is to serialize the object manually, save it to a temporary file, and pass the file path instead. The worker process can then load the object from the file.
import json
import multiprocessing
import pickle
import tempfile

from visualization.plot_compute import pool_initializer, process_plot_data_task

# Create or obtain the object (assumed to be already defined)
obj = ...  # An ExperimentObject that contains a .NET object
# plot_type and average_bool are likewise assumed to be defined earlier

# Step 1: Serialize the object by storing it in a temporary file
# delete=False keeps the file on disk after it is closed so workers can open it
with tempfile.NamedTemporaryFile(delete=False, suffix=".pkl") as temp:
    pickle.dump(obj, temp)  # Save the object to the file using pickle

# Step 2: Initialize a multiprocessing pool with a custom initializer
with multiprocessing.Pool(
    processes=multiprocessing.cpu_count(),  # Use all available CPU cores
    initializer=pool_initializer,  # Initialize worker processes
) as pool:
    # Step 3: Submit a task to the worker process
    # Instead of passing the object, we pass the file path to avoid pickling issues
    result = pool.apply_async(process_plot_data_task, (temp.name, plot_type, average_bool))
    # Step 4: Wait for the result while the pool is still alive
    out_json = result.get()

# Step 5: Deserialize the JSON output into a Python dictionary for further use
fig_dict = json.loads(out_json)
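The worker side is not shown above, so here is a rough sketch of what a task receiving the file path might look like. It is not the actual `process_plot_data_task`: the function name `load_and_summarize`, the `"values"` key, and the plain dictionary standing in for the ExperimentObject are all assumptions made for illustration.

```python
import json
import os
import pickle
import tempfile


def load_and_summarize(path, plot_type, average):
    """Hypothetical worker task: load the pickled object and return JSON."""
    # Each worker reads the object from disk instead of receiving it
    # through the pool's argument pickling.
    with open(path, "rb") as f:
        data = pickle.load(f)

    values = data["values"]  # assumed layout of the stand-in object
    if average:
        values = [sum(values) / len(values)]

    # Return a JSON string so that only plain text crosses the
    # process boundary on the way back to the parent.
    return json.dumps({"type": plot_type, "y": values})


if __name__ == "__main__":
    # Write a stand-in object to a temporary file, then call the task directly.
    with tempfile.NamedTemporaryFile(delete=False, suffix=".pkl") as tmp:
        pickle.dump({"values": [1.0, 2.0, 3.0]}, tmp)
    print(load_and_summarize(tmp.name, "line", True))
    os.remove(tmp.name)
```

Returning JSON rather than a rich object means the result is a plain string, which always pickles cleanly on its way back from the worker.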
This approach successfully bypasses the pickling limitation when working with .NET objects in Python’s multiprocessing. If you ever encounter a PicklingError with non-picklable objects, consider saving them to a temporary file and passing the file path instead. While this introduces a small I/O overhead, it allows parallel processing to work without major modifications to your code.
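One housekeeping detail: a temporary file created with `delete=False` stays on disk until something removes it. A possible way to handle that, sketched here with a hypothetical helper named `pickled_tempfile`, is a small context manager that writes the file, hands out its path, and deletes it afterwards.

```python
import contextlib
import os
import pickle
import tempfile


@contextlib.contextmanager
def pickled_tempfile(obj, suffix=".pkl"):
    """Pickle obj to a temporary file, yield its path, then delete it."""
    tmp = tempfile.NamedTemporaryFile(delete=False, suffix=suffix)
    try:
        with tmp:
            pickle.dump(obj, tmp)  # the file is closed when this block exits
        yield tmp.name
    finally:
        os.remove(tmp.name)  # runs even if pickling or the caller raises


if __name__ == "__main__":
    with pickled_tempfile({"values": [1, 2, 3]}) as path:
        with open(path, "rb") as f:
            print(pickle.load(f))
    print(os.path.exists(path))  # the file is gone after the block
```

When combining this with a pool, the workers need to finish reading the file (for example, by calling `result.get()`) before the `with` block exits.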