Why doesn't using multiprocessing in Python speed things up?

Posted 2023-05-21 21:04:56 · Author: burlingame

Reposted from: "Why doesn't Python speed up even though it uses multiprocessing?" - SegmentFault 思否

import logging
import multiprocessing

def calcutype(dataframe, model, xiangguandict):
    '''Main function'''
    typelist = {}
    xiangguan = {}

    res = {}
    pool = multiprocessing.Pool(40)

    for index, row in dataframe.iterrows():
        scorearr = []
        name = row['data_name1'].split(',')        # data_name
        descrip = row['data_descrip1'].split(',')  # data_descrip

        # compute the classification and its relevance asynchronously
        res[index] = pool.apply_async(calcumodelnum, (name, descrip, model, xiangguandict))

    pool.close()
    pool.join()

    for i in res:
        # classification and relevance for this row are done
        typelist[i] = res[i].get()[0]
        xiangguan[i] = res[i].get()[1]

    return typelist, xiangguan

def calcumodelnum(*** omitted ***):
    logging.info(multiprocessing.current_process().name + ' finished')
    return result  # (omitted)

def main():
    data1 = *** omitted ***
    model = *** omitted ***
    dict1 = *** omitted ***
    calcutype(data1, model, dict1)
    
if __name__ == '__main__':
    main()

You can see my code above. Since it runs on a 40-core machine, I started 40 processes, but look at the CPU usage:

[screenshot: CPU usage]

There are plenty of tasks, yet the CPUs are nowhere near saturated; many cores sit idle.

[screenshot]

At times like these they just don't move at all.

[screenshot]

The log shows that 40 processes were indeed started, but those processes seem to just stop doing anything after starting up.

The run log looks like this:

[screenshot: run log]

I also tried starting just 10 processes, and in the end the program took almost exactly as long as with 40!!!

What is going on? Any guidance would be appreciated. (The for loop is very long, and the computation is heavy enough.)

7 answers
起风了
✓ Accepted answer

multiprocessing.Pool only starts a number of processes; it does not start one process per core. In other words, the Python interpreter itself does not do any load balancing across cores or processors. That is decided by the operating system. If your workload is genuinely compute-intensive, the OS will indeed give it more cores, but that is not something Python or your code can control or specify.

The num in multiprocessing.Pool(num) can be small or large; for I/O-bound work, for example, it can easily be larger than the number of CPUs.

How hardware resources get allocated is decided by the operating system; if you want every core to be busy, you have to approach it more from the OS side~
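For example, here is a minimal sketch (the workload and numbers are made up purely for illustration) of a genuinely CPU-bound task: the pool size only controls how many worker processes exist, while the operating system decides which cores they run on, so with enough heavy work per task the cores do fill up.

import time
from multiprocessing import Pool, cpu_count

def burn(n):
    # Pure-Python arithmetic loop: CPU-bound, no I/O, so the OS scheduler
    # keeps roughly one core busy per worker process.
    total = 0
    for i in range(n):
        total += i * i
    return total

if __name__ == '__main__':
    jobs = [2_000_000] * 200
    for workers in (1, cpu_count()):
        start = time.perf_counter()
        with Pool(workers) as pool:
            pool.map(burn, jobs)
        print(f'{workers:>2} workers: {time.perf_counter() - start:.2f}s')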

 
 
suparek

Can't it be forced onto multiple cores?

2017-09-12
起风了

@suparek That can only be done at the operating-system level; it's not something the Python interpreter can decide (see the Linux-only sketch after this thread).

2017-09-12
suparek

So it's caused by processes fighting over time slices?

2017-09-12
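If you really do want to pin worker processes to specific cores, that is indeed an OS-level operation. Here is a Linux-only sketch (os.sched_setaffinity is not available on Windows or macOS, and the core-picking scheme below is just illustrative):

import os
from multiprocessing import Pool, current_process

def pin_to_core():
    # Pool workers are named 'ForkPoolWorker-1', 'ForkPoolWorker-2', ...;
    # use the trailing number to pick a core for this worker.
    worker_id = int(current_process().name.split('-')[-1])
    os.sched_setaffinity(0, {(worker_id - 1) % os.cpu_count()})  # pid 0 = this process

def work(x):
    return sum(i * i for i in range(x))

if __name__ == '__main__':
    with Pool(os.cpu_count(), initializer=pin_to_core) as pool:
        pool.map(work, [1_000_000] * 100)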
同意并接受
 

The computation just isn't heavy enough: the first few processes finish all the work, and the rest can only sit idle~

 
 
suparek

That's impossible; there's more work than they can get through.

2017-09-11
suparek

It's not a case of too many workers and too little work.

2017-09-11
bad_cat
 

I think this explains it fairly clearly: https://stackoverflow.com/que...

 
 
 
 
ferstar
 

Modify the code a bit and it should be able to max out the CPU:

import logging
from multiprocessing import Pool, cpu_count, current_process

data1 = '...'  # dataframe (omitted)
model = '...'  # (omitted)
dict1 = '...'  # (omitted)


def calcutype(row):
    index, data = row
    name = data['data_name1'].split(',')        # data_name
    descrip = data['data_descrip1'].split(',')  # data_descrip
    result = calcumodelnum(name, descrip)
    logging.info(current_process().name + ' finished')
    # Return the result rather than writing into a global dict: each worker
    # process has its own copy of module-level globals, so writes made there
    # would never be visible back in the parent process.
    return index, result


def calcumodelnum(name, des):
    """Your computation logic"""
    pass


def main():
    typelist = {}
    xiangguan = {}
    pool = Pool(cpu_count())
    # pool.map hands rows to the workers and collects the returned
    # (index, result) pairs back in the parent process.
    results = pool.map(calcutype, data1.iterrows())
    pool.close()
    pool.join()
    for index, result in results:
        # classification and relevance for this row are done
        typelist[index] = result[0]
        xiangguan[index] = result[1]
    return typelist, xiangguan


if __name__ == '__main__':
    main()

Give it a try.

 
 
suparek

So using map is enough?

2017-09-12
ferstar

@suparek That's how I usually run it; give it a try.

2017-09-12
 
New user, please bear with me

OP, did you ever solve this problem? I'd like to ask you about it; it's urgent. Thank you!

 
 
suparek

Have a look at the other answers below; I accepted one of them.

2021-04-07
 
 
 
Appended StackOverflow Q&A (question score: 10):

I am working on multiprocessing in Python. For example, consider the example given in the Python multiprocessing documentation (I have changed 100 to 1000000 in the example, just to consume more time). When I run this, I do see that Pool() is using all 4 processes, but I don't see each CPU going up to 100%. How do I achieve 100% usage of each CPU?

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)
    result = pool.map(f, range(10000000))
    print(result)
  • That's because your program is just performing print most of the time, so it is what is called "I/O bound".   Jan 25, 2014 at 9:32
  • I don't think the print starts until the map finishes. Synchronization overhead might account for some of the underutilization.   Jan 25, 2014 at 9:34
  • At any rate, the multiplications take milliseconds, which makes the question kind of moot, really.   Jan 25, 2014 at 9:35
  • Thank you for your comment. But this was just an example. I have tried other examples which take more time for computation, and still the processes are like P1: 10%, P2: 30%, P3: 20%, P4: 15%.   Jan 25, 2014 at 9:35
  • Then provide a program which we can use to reproduce your problem.   Jan 25, 2014 at 9:35

2 Answers

Answer (score 5):
 

It is because multiprocessing requires interprocess communication between the main process and the worker processes behind the scenes, and the communication overhead took more (wall-clock) time than the "actual" computation (x * x) in your case.

Try "heavier" computation kernel instead, like

def f(x):
  return reduce(lambda a, b: math.log(a+b), xrange(10**5), x)

Update (clarification)

I pointed out that the low CPU usage observed by the OP was due to the IPC overhead inherent in multiprocessing, but the OP didn't need to worry about it too much because the original computation kernel was way too "light" to be used as a benchmark. In other words, multiprocessing works the worst with such a way too "light" kernel. If the OP implements a real-world logic (which, I'm sure, will be somewhat "heavier" than x * x) on top of multiprocessing, the OP will achieve a decent efficiency, I assure. My argument is backed up by an experiment with the "heavy" kernel I presented.

@FilipMalczak, I hope my clarification makes sense to you.

By the way, there are some ways to improve the efficiency of x * x while using multiprocessing. For example, we can combine 1,000 jobs into one before we submit them to Pool, unless we are required to solve each job in real time (i.e. if you implement a REST API server, we shouldn't do it this way).
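A rough sketch of that batching idea (not the exact code from the answer): Pool.map and Pool.imap accept a chunksize argument, so each message sent to a worker carries many small jobs instead of one. Submitting items one at a time, as the original question does with apply_async, is the pattern that suffers most from per-job IPC overhead.

from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == '__main__':
    with Pool(processes=4) as pool:
        # chunksize=1 sends one job per message (maximum IPC overhead);
        # a large chunksize amortises the communication cost across many jobs.
        slow = list(pool.imap(f, range(100000), chunksize=1))
        fast = list(pool.imap(f, range(100000), chunksize=1000))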

  • It looks like the right answer, and it will behave that way, but you missed the whole point of multiprocessing.   Jan 25, 2014 at 11:06
 
Answer (score 4):

You're asking the wrong kind of question. multiprocessing.Process represents a process as understood by your operating system. multiprocessing.Pool is just a simple way to run several processes to do your work. The Python environment has nothing to do with balancing load across cores/processors.

If you want to control how processor time is given to processes, you should try tweaking your OS, not the Python interpreter.

Of course, "heavier" computations will be recognised by the system, and it may look like they do just what you want, but in fact you have almost no control over process handling.

"Heavier" functions will just look heavier to your OS, and his usual reaction will be assigning more processor time to your processes, but that doesn't mean you did what you wanted to - and that's good, because that the whole point of languages with VM - you specify logic, and VM takes care of mapping this logic onto operating system.

  • Thank you. It helps in better understanding of how multiprocessing works :)   Jan 25, 2014 at