Why doesn't using multiprocessing in Python speed things up?

Posted 2023-05-21 21:04:56 · Author: burlingame

Reposted from: "Why doesn't Python speed up even though it uses multiprocessing?" - SegmentFault 思否

import logging
import multiprocessing

def calcutype(dataframe, model, xiangguandict):
    '''Main function'''
    typelist = {}
    xiangguan = {}

    res = {}
    pool = multiprocessing.Pool(40)

    for index, row in dataframe.iterrows():
        scorearr = []
        name = row['data_name1'].split(',')        # data_name
        descrip = row['data_descrip1'].split(',')  # data_descrip

        # compute the classification and its relevance asynchronously
        res[index] = pool.apply_async(calcumodelnum, (name, descrip, model, xiangguandict))

    pool.close()
    pool.join()

    for i in res:
        # classification and relevance for this row are done
        typelist[i] = res[i].get()[0]
        xiangguan[i] = res[i].get()[1]

    return typelist, xiangguan

def calcumodelnum(*** omitted ***):
    logging.info(multiprocessing.current_process().name + ' finished')
    return result  # (omitted)

def main():
    data1 = *** omitted ***
    model = *** omitted ***
    dict1 = *** omitted ***
    calcutype(data1, model, dict1)
    
if __name__ == '__main__':
    main()

You can see my code above. Since it runs on a 40-core machine, I started 40 processes, but look at the CPU usage:

[screenshot: CPU usage]

There are plenty of tasks, yet the CPUs are nowhere near saturated; many cores sit idle.

[screenshot]

At times like these they just don't move at all.

[screenshot]

The log shows that 40 processes were indeed started, but those processes seem to just stop doing anything after starting up.

The run log looks like this:

[screenshot: run log]

I also tried starting just 10 processes, and in the end the program took almost exactly as long as with 40!!!

What is going on? Any guidance would be appreciated. (The for loop is very long, and the computation is heavy enough.)

7 answers
起风了
✓ Accepted answer

multiprocessing.Pool only starts a number of processes; it does not start one process per core. In other words, the Python interpreter itself does not do any load balancing across cores or processors. That is decided by the operating system. If your workload is genuinely compute-intensive, the OS will indeed give it more cores, but that is not something Python or your code can control or specify.

The num in multiprocessing.Pool(num) can be small or large; for I/O-bound work, for example, it can easily be larger than the number of CPUs.

How hardware resources get allocated is decided by the operating system; if you want every core to be busy, you have to approach it more from the OS side~
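For example, here is a minimal sketch (the workload and numbers are made up purely for illustration) of a genuinely CPU-bound task: the pool size only controls how many worker processes exist, while the operating system decides which cores they run on, so with enough heavy work per task the cores do fill up.

import time
from multiprocessing import Pool, cpu_count

def burn(n):
    # Pure-Python arithmetic loop: CPU-bound, no I/O, so the OS scheduler
    # keeps roughly one core busy per worker process.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    jobs = [2_000_000] * 200
    for workers in (1, cpu_count()):
        start = time.perf_counter()
        with Pool(workers) as pool:
            pool.map(burn, jobs)
        print(f'{workers:>2} workers: {time.perf_counter() - start:.2f}s')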

 
 
suparek

Can't it be forced onto multiple cores?

2017-09-12
起风了

@suparek That can only be done at the operating-system level; it's not something the Python interpreter can decide (see the Linux-only sketch after this thread).

2017-09-12
suparek

So it's caused by processes fighting over time slices?

2017-09-12
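If you really do want to pin worker processes to specific cores, that is indeed an OS-level operation. Here is a Linux-only sketch (os.sched_setaffinity is not available on Windows or macOS, and the core-picking scheme below is just illustrative):

import os
from multiprocessing import Pool, current_process

def pin_to_core():
    # Pool workers are named 'ForkPoolWorker-1', 'ForkPoolWorker-2', ...;
    # use the trailing number to pick a core for this worker.
    worker_id = int(current_process().name.split('-')[-1])
    os.sched_setaffinity(0, {(worker_id - 1) % os.cpu_count()})  # pid 0 = this process

def work(x):
    return sum(i * i for i in range(x))

if __name__ == '__main__':
    with Pool(os.cpu_count(), initializer=pin_to_core) as pool:
        pool.map(work, [1_000_000] * 100)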
同意并接受
 

The computation just isn't heavy enough: the first few processes finish all the work, and the rest can only sit idle~

 
 
suparek

That's impossible; there's more work than they can get through.

2017-09-11
suparek

It's not a case of too many workers and too little work.

2017-09-11
bad_cat
 

I think this explains it fairly clearly: https://stackoverflow.com/que...

 
 
 
 
ferstar
 

Modify the code a bit and it should be able to max out the CPU:

import logging
from multiprocessing import Pool, cpu_count, current_process

data1 = '...'  # dataframe (omitted)
model = '...'  # (omitted)
dict1 = '...'  # (omitted)


def calcutype(row):
    index, data = row
    name = data['data_name1'].split(',')        # data_name
    descrip = data['data_descrip1'].split(',')  # data_descrip
    result = calcumodelnum(name, descrip)
    logging.info(current_process().name + ' finished')
    # Return the result rather than writing into a global dict: each worker
    # process has its own copy of module-level globals, so writes made there
    # would never be visible back in the parent process.
    return index, result


def calcumodelnum(name, des):
    """Your computation logic"""
    pass


def main():
    typelist = {}
    xiangguan = {}
    pool = Pool(cpu_count())
    # pool.map hands rows to the workers and collects the returned
    # (index, result) pairs back in the parent process.
    results = pool.map(calcutype, data1.iterrows())
    pool.close()
    pool.join()
    for index, result in results:
        # classification and relevance for this row are done
        typelist[index] = result[0]
        xiangguan[index] = result[1]
    return typelist, xiangguan


if __name__ == '__main__':
    main()

Give it a try.

 
 
suparek

So using map is enough?

2017-09-12
ferstar

@suparek That's how I usually run it; give it a try.

2017-09-12
 
New user, please bear with me

OP, did you ever solve this problem? I'd like to ask you about it; it's urgent. Thank you!

 
 
suparek

Have a look at the other answers below; I accepted one of them.

2021-04-07
 
 
 
Appended StackOverflow Q&A (question score: 10):

I am working on multiprocessing in Python. For example, consider the example given in the Python multiprocessing documentation (I have changed 100 to 1000000 in the example, just to consume more time). When I run this, I do see that Pool() is using all 4 processes, but I don't see each CPU going up to 100%. How do I achieve 100% usage of each CPU?

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    result = pool.map(f, range(10000000))
    print(result)
  • That's because your program is just performing print most of the time, so it is what is called "I/O bound".   Jan 25, 2014 at 9:32
  • I don't think the print starts until the map finishes. Synchronization overhead might account for some of the underutilization.   Jan 25, 2014 at 9:34
  • At any rate, the multiplications take milliseconds, which makes the question kind of moot, really.   Jan 25, 2014 at 9:35
  • Thank you for your comment. But this was just an example. I have tried other examples which take more time for computation, and still the processes are like P1: 10%, P2: 30%, P3: 20%, P4: 15%.   Jan 25, 2014 at 9:35
  • Then provide a program which we can use to reproduce your problem.   Jan 25, 2014 at 9:35

2 Answers

Answer (score 5):
 

It is because multiprocessing requires interprocess communication between the main process and the worker processes behind the scenes, and the communication overhead took more (wall-clock) time than the "actual" computation (x * x) in your case.

Try "heavier" computation kernel instead, like

def f(x):
  return reduce(lambda a, b: math.log(a+b), xrange(10**5), x)

Update (clarification)

I pointed out that the low CPU usage observed by the OP was due to the IPC overhead inherent in multiprocessing, but the OP didn't need to worry about it too much because the original computation kernel was way too "light" to be used as a benchmark. In other words, multiprocessing works the worst with such a way too "light" kernel. If the OP implements a real-world logic (which, I'm sure, will be somewhat "heavier" than x * x) on top of multiprocessing, the OP will achieve a decent efficiency, I assure. My argument is backed up by an experiment with the "heavy" kernel I presented.

@FilipMalczak, I hope my clarification makes sense to you.

By the way, there are some ways to improve the efficiency of x * x while using multiprocessing. For example, we can combine 1,000 jobs into one before we submit them to Pool, unless we are required to solve each job in real time (i.e. if you implement a REST API server, we shouldn't do it this way).
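A rough sketch of that batching idea (not the exact code from the answer): Pool.map and Pool.imap accept a chunksize argument, so each message sent to a worker carries many small jobs instead of one. Submitting items one at a time, as the original question does with apply_async, is the pattern that suffers most from per-job IPC overhead.

from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # chunksize=1 sends one job per message (maximum IPC overhead);
        # a large chunksize amortises the communication cost across many jobs.
        slow = list(pool.imap(f, range(100000), chunksize=1))
        fast = list(pool.imap(f, range(100000), chunksize=1000))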

  • It looks like the right answer, and it will behave that way, but you missed the whole point of multiprocessing.   Jan 25, 2014 at 11:06
 
Answer (score 4):

You're asking the wrong kind of question. multiprocessing.Process represents a process as understood by your operating system. multiprocessing.Pool is just a simple way to run several processes to do your work. The Python environment has nothing to do with balancing load across cores/processors.

If you want to control how processor time is given to processes, you should try tweaking your OS, not the Python interpreter.

Of course, "heavier" computations will be recognised by the system, and it may look like they do just what you want, but in fact you have almost no control over process handling.

"Heavier" functions will just look heavier to your OS, and his usual reaction will be assigning more processor time to your processes, but that doesn't mean you did what you wanted to - and that's good, because that the whole point of languages with VM - you specify logic, and VM takes care of mapping this logic onto operating system.

  • Thank you. It helps in better understanding of how multiprocessing works :)   Jan 25, 2014 at