itertools.groupby
https://docs.python.org/3/library/itertools.html#itertools.groupby
Two things to keep in mind with this tool: it only groups consecutive elements, so the order of the input matters.
Make an iterator that returns consecutive keys and groups from the iterable. The key is a function computing a key value for each element. If not specified or is None, key defaults to an identity function and returns the element unchanged. Generally, the iterable needs to already be sorted on the same key function.
The operation of groupby() is similar to the uniq filter in Unix. It generates a break or new group every time the value of the key function changes (which is why it is usually necessary to have sorted the data using the same key function). That behavior differs from SQL’s GROUP BY which aggregates common elements regardless of their input order.
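The difference between uniq-style and SQL-style grouping described above can be seen in a minimal sketch. The sample list `data` is made up for illustration and is not taken from the documentation.

```python
from itertools import groupby

# Hypothetical sample data: the "a" elements are not all adjacent.
data = ["a", "a", "b", "a"]

# Without sorting, every run of equal keys becomes its own group,
# so "a" shows up twice (like Unix uniq).
unsorted_groups = [(key, list(group)) for key, group in groupby(data)]

# After sorting, equal elements are adjacent, so each key appears once
# (closer to SQL's GROUP BY).
sorted_groups = [(key, list(group)) for key, group in groupby(sorted(data))]

print(unsorted_groups)  # [('a', ['a', 'a']), ('b', ['b']), ('a', ['a'])]
print(sorted_groups)    # [('a', ['a', 'a', 'a']), ('b', ['b'])]
```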
Consecutive grouping
https://zhuanlan.zhihu.com/p/360161483
import itertools

# With no key function, runs of identical characters are grouped.
m = itertools.groupby("aaaabbbbccccaaaa")
for k, v in m:
    # v is an iterator over the current run; list() materializes it.
    print(k, len(list(v)))

Output:
a 4
b 4
c 4
a 4
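The example above calls list() on each group inside the loop. That matters because, per the itertools documentation, each group is an iterator that shares the underlying iterable with groupby(), so a group is no longer available once groupby() advances. A small sketch of the pitfall, with variable names chosen only for illustration:

```python
from itertools import groupby

# Pitfall: storing the group iterators and reading them later.
# By the time they are read, groupby() has already moved past them.
stored = [(key, group) for key, group in groupby("aabbb")]
late_lengths = [(key, len(list(group))) for key, group in stored]

# Safe: materialize each group while it is still the current one.
eager_lengths = [(key, len(list(group))) for key, group in groupby("aabbb")]

print(eager_lengths)                  # [('a', 2), ('b', 3)]
print(late_lengths == eager_lengths)  # False
```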
Grouping a list
https://www.geeksforgeeks.org/itertools-groupby-in-python/
# Python code to demonstrate
# itertools.groupby() method
import itertools

L = [("a", 1), ("a", 2), ("b", 3), ("b", 4)]

# Key function: group by the first element of each tuple
key_func = lambda x: x[0]

for key, group in itertools.groupby(L, key_func):
    print(key + " :", list(group))
Grouping in the SQL sense, using list.sort
https://www.geeksforgeeks.org/python-grouping-similar-substrings-in-list/
# Python3 code to demonstrate
# group similar substrings
# using lambda + itertools.groupby() + split()
from itertools import groupby

# initializing list
test_list = ['geek_1', 'coder_2', 'geek_4', 'coder_3', 'pro_3']

# sort list in place
# essential for grouping: puts equal prefixes next to each other
test_list.sort()

# printing the list (note: it has already been sorted at this point)
print("The original list is : " + str(test_list))

# using lambda + itertools.groupby() + split()
# group similar substrings by the part before the underscore
res = [list(group) for _, group in groupby(test_list, lambda a: a.split('_')[0])]

# printing result
print("The grouped list is : " + str(res))
Grouping in the SQL sense, using sorted
https://devtut.github.io/python/groupby.html
from itertools import groupby

things = [("animal", "bear"), ("animal", "duck"), ("plant", "cactus"),
          ("vehicle", "harley"), ("vehicle", "speed boat"),
          ("vehicle", "school bus")]

dic = {}
# Sort and group with the same key function (first element of each tuple)
f = lambda x: x[0]
for key, group in groupby(sorted(things, key=f), f):
    dic[key] = list(group)

print(dic)
{'animal': [('animal', 'bear'), ('animal', 'duck')], 'plant': [('plant', 'cactus')], 'vehicle': [('vehicle', 'harley'), ('vehicle', 'speed boat'), ('vehicle', 'school bus')]}
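The sorted + groupby pattern above can be wrapped in a small reusable function. This is only a sketch: the helper name `group_sql_style` and the sample data `pairs` are hypothetical and do not come from the linked sources, and it assumes the key values can be compared with each other so that sorted() works.

```python
from itertools import groupby

def group_sql_style(items, key):
    """Group items by key regardless of input order (hypothetical helper)."""
    # Sort and group with the same key function so equal keys are adjacent.
    return {k: list(g) for k, g in groupby(sorted(items, key=key), key=key)}

# Unsorted sample input (illustrative data only).
pairs = [("b", 3), ("a", 1), ("b", 4), ("a", 2)]
grouped = group_sql_style(pairs, key=lambda pair: pair[0])

print(grouped)  # {'a': [('a', 1), ('a', 2)], 'b': [('b', 3), ('b', 4)]}
```

Because sorted() is stable, items within each group keep their original relative order.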