jieba Word Segmentation - 526互联

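import jieba

path = "all.txt"    # path of the text file to read
with open(path, "r", encoding="utf-8") as file:    # the file is closed automatically
    text = file.read()
words = jieba.lcut(text)    # segment the text into a list of words with jieba
counts = {}    # maps each word to its frequency
for word in words:
    if len(word) == 1:   # skip single-character tokens
        continue
    counts[word] = counts.get(word, 0) + 1    # update the word's count
items = list(counts.items())    # sort the (word, count) pairs by count, descending
items.sort(key=lambda x: x[1], reverse=True)

for word, count in items[:20]:    # print the 20 most frequent words (fewer if the text has fewer)
    print(f"{word:<10}{count:>5}")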
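
The counting and sorting steps in the script above could also be written with `collections.Counter` from the standard library. The following is only a minimal sketch, not the author's method: the helper name `top_words`, its parameters `n` and `min_len`, and the sample token list are all hypothetical and chosen for illustration. It works on an already-segmented list, so it runs without jieba installed; in the script above the list would come from `jieba.lcut(text)`.

```python
from collections import Counter


def top_words(tokens, n=20, min_len=2):
    """Return up to n (word, count) pairs, most frequent first.

    Tokens shorter than min_len characters are ignored, which mirrors
    the single-character filter in the original loop.
    """
    counts = Counter(t for t in tokens if len(t) >= min_len)
    return counts.most_common(n)


# Hypothetical, already-segmented sample (illustration only).
sample = ["宝玉", "的", "黛玉", "宝玉", "了", "贾母", "宝玉", "黛玉"]

for word, count in top_words(sample, n=3):
    print(f"{word:<10}{count:>5}")
```

`Counter.most_common(n)` returns at most `n` pairs, so a text with fewer than 20 distinct words does not raise an `IndexError`.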
Student ID: 2022310143049
Class: 22信计1班 (Information and Computing Science, Class 1, 2022 intake)
Name: 赵国龙
