I have been using pool.map in the multiprocessing package for simple parallel jobs because of its simplicity and ease of use. However, the simplicity comes at a cost that the computation function f(x) to be parallelized can only take one argument as input. If f(x, D) needs auxiliary data D, there are a few workarounds:
1. combine the main argument and auxiliary data together as a tuple (x, D), and use this tuple as a single argument, i.e., f((x, D)).
2. use the partial function to generate a wrap-up function of f with the auxiliary data g=partial(f, D=D).
3. just ignore D in the argument list and let python find D in the memory.
It turns out that #3 is the most efficient way. I had been using #2 and didn’t not realize the difference until one day my f needs big auxiliary data D. In both #1 and #2, python will pickle the arguments and send them to the workers. When D is large, the pickling process takes a lot of time and the cost on data transfer is huge.
Lesson learnt: sometimes the naive approach might be the best approach.