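Learning Python Automation: re Regular Expressions

Published: 2024-01-05 19:04:01  Author: 芒果93
re (regular expressions)
1. What regular expressions do
A regular expression matches strings that follow a specified rule.
2. Common re functions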
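  • findall(pattern, string, flags=0): finds every non-overlapping substring that matches the regular expression and returns them all as a list

    pattern: the regular expression
    string: the string to search
    flags=0: optional modifiers, for example re.IGNORECASE for case-insensitive matching

import re
string = "to1212ken132435testr"
re_demo = r"\D"   # raw string, so the backslash reaches the regex engine unchanged
res4 = re.findall(re_demo, string)
print(res4)      # Output: ['t', 'o', 'k', 'e', 'n', 't', 'e', 's', 't', 'r']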
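  • match(pattern, string, flags=0): matches only at the start of the string and returns a Match object; call group() on it to get the matched text. If the start of the string does not match, it returns None, and calling group() on None raises AttributeError
string = "to1212ken132435testr"
re_demo = r"\D"
res4 = re.match(re_demo, string).group()
print(res4)          # Output: t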
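  • search(pattern, string, flags=0): scans the string and returns a Match object for the first position where the pattern matches (None if nothing matches); call group() on it to get the matched text
string = "111212ken132435testr"
re_demo = r"\D"
res5 = re.search(re_demo, string).group()
print(res5)        # Output: k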
  • finditer(pattern, string, flags=0): finds every matching substring and returns an iterator of Match objects; loop over it with for and call group() on each item to get the matched text
string = "to1212ken132435testr"
re_demo = r"\D"
res5 = re.finditer(re_demo, string)
print(res5)            # Output: <callable_iterator object at 0x0000016CAD67DD60> (the address varies)
for i in res5:
    print(i.group())      # Output: t o k e n t e s t r, one character per line
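
The flags parameter and the None return value described above can be sketched as follows. This is a minimal illustration, not part of the original notes: the sample text and the helper name first_match are made up for the example.

```python
import re

text = "Token TOKEN token"

# flags=re.IGNORECASE makes the pattern match regardless of letter case.
case_sensitive = re.findall(r"token", text)
case_insensitive = re.findall(r"token", text, flags=re.IGNORECASE)
print(case_sensitive)    # ['token']
print(case_insensitive)  # ['Token', 'TOKEN', 'token']


def first_match(pattern, string):
    """Hypothetical helper: return the first matched text, or None if nothing matches."""
    m = re.search(pattern, string)
    # re.search returns None when there is no match, so check before calling group().
    return m.group() if m is not None else None


print(first_match(r"\d+", "to1212ken"))  # 1212
print(first_match(r"\d+", "token"))      # None
```

Checking for None first avoids the AttributeError that a chained .group() call raises when match() or search() finds nothing.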

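3. Single-character matching

  • . : matches any single character except \n. Each dot consumes exactly one character, and findall returns all matches as a list. A \n is simply not matched (no error is raised) unless the re.DOTALL flag is set
string = "token test"
re_demo = r"t."
res1 = re.findall(re_demo, string)
print(res1)      # Output: ['to', 'te']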
  • []: matches any one of the characters listed inside the brackets
string = "token testr"
re_demo = r"[tos]"
res2 = re.findall(re_demo, string)
print(res2)      # Output: ['t', 'o', 't', 's', 't']
  • \d: matches a digit, i.e. 0-9 (for str patterns, other Unicode decimal digits as well unless re.ASCII is set)
string = "to1212ken132435testr"
re_demo = r"\d"
res3 = re.findall(re_demo, string)
print(res3)      # Output: ['1', '2', '1', '2', '1', '3', '2', '4', '3', '5']
  • \D: matches any character that is not a digit
string = "to1212ken132435testr"
re_demo = r"\D"
res4 = re.findall(re_demo, string)
print(res4)      # Output: ['t', 'o', 'k', 'e', 'n', 't', 'e', 's', 't', 'r']
  • \s: matches whitespace, such as a space, a tab, or a newline
string = " tok en testr "
re_demo = r"\s"
res2 = re.findall(re_demo, string)
print(res2)      # Output: [' ', ' ', ' ', ' ']
  • \S: matches any character that is not whitespace
string = " tok en testr "
re_demo = r"\S"
res2 = re.findall(re_demo, string)
print(res2)      # Output: ['t', 'o', 'k', 'e', 'n', 't', 'e', 's', 't', 'r']
  • \w: matches word characters, i.e. a-z, A-Z, 0-9, _, and (for str patterns) Unicode letters such as Chinese characters
string = "'hello好好学Python3_-&^%$#@"
re_demo = r"\w"
res2 = re.findall(re_demo, string)
print(res2)      # Output: ['h', 'e', 'l', 'l', 'o', '好', '好', '学', 'P', 'y', 't', 'h', 'o', 'n', '3', '_']
  • \W: matches special characters, i.e. anything that is not a letter, digit, Chinese character, or underscore
string = "'hello好好学Python3_-&^%$#@"
re_demo = r"\W"
res2 = re.findall(re_demo, string)
print(res2)      # Output: ["'", '-', '&', '^', '%', '$', '#', '@']
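
Two of the behaviours above depend on flags, which the following sketch illustrates: how . treats \n with and without re.DOTALL, and how re.ASCII narrows \w. The sample strings are made up for illustration.

```python
import re

text = "t\nto"

# By default "." does not match a newline; the attempt just fails, no exception.
default_dot = re.findall(r"t.", text)
# With re.DOTALL, "." also matches "\n".
dotall_dot = re.findall(r"t.", text, flags=re.DOTALL)
print(default_dot)  # ['to']
print(dotall_dot)   # ['t\n', 'to']

mixed = "py3_好"

# For str patterns \w is Unicode-aware by default; re.ASCII limits it to [a-zA-Z0-9_].
unicode_word = re.findall(r"\w", mixed)
ascii_word = re.findall(r"\w", mixed, flags=re.ASCII)
print(unicode_word)  # ['p', 'y', '3', '_', '好']
print(ascii_word)    # ['p', 'y', '3', '_']
```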

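4. Multi-character matching

  • *: matches the preceding character zero or more times, so the character is optional. Because zero occurrences also count as a match, findall returns an empty string at every position where the character is absent, plus one more at the end of the string
string = "token test ktv"
re_demo = r"k*"
res6 = re.findall(re_demo, string)
# 13 empty strings (12 non-k characters plus the end of the string) and 2 'k' matches
print(res6)      # Output: ['', '', 'k', '', '', '', '', '', '', '', '', 'k', '', '', '']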
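  • +: matches the preceding character one or more times, i.e. at least once. It is greedy, so it consumes as many characters as it can
string = "token test ktn"
re_demo = r"k.+n"
res6 = re.findall(re_demo, string)
print(res6)      # Output: ['ken test ktn']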
  • ?: matches the preceding character zero times or once, i.e. either once or not at all
string = "token test ktn"
re_demo = r"k.?n"
res6 = re.findall(re_demo, string)
print(res6)      # Output: ['ken', 'ktn']
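  • +?: the lazy (non-greedy) form of +. It matches the preceding character one or more times, but as few times as possible
string = "token test ktn"
re_demo = r"k.+?n"
res7 = re.findall(re_demo, string)
print(res7)      # Output: ['ken', 'ktn']
string = "tokn test ktn"
re_demo = r"k.+?n"
res7 = re.findall(re_demo, string)
# .+? must consume at least one character, so it takes the n right after k and runs to the next n
print(res7)      # Output: ['kn test ktn']
string = "token test kn"
re_demo = r"k.+?n"
res7 = re.findall(re_demo, string)
# the trailing "kn" does not match: .+? consumes the n and no further n follows
print(res7)      # Output: ['ken']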
  • {n}: matches the preceding character exactly n times in a row
string = "tokken test kkktn"
re_demo = r"k{2}"
res8 = re.findall(re_demo, string)
print(res8)      # Output: ['kk', 'kk']
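  • {m,n}: matches the preceding character between m and n times in a row (at least m, at most n). It is greedy, so it takes as many repetitions as allowed
string = "tokken test kkktn"
re_demo = r"k{1,3}"
res9 = re.findall(re_demo, string)
print(res9)      # Output: ['kk', 'kkk']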
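
The difference between greedy and lazy quantifiers is easiest to see side by side. The sketch below uses made-up sample strings; appending ? to a quantifier (+?, {m,n}?) switches it to lazy matching.

```python
import re

html = "<a><b>"

# Greedy: .+ grabs as much as possible, then backtracks to the last ">".
greedy = re.findall(r"<.+>", html)
# Lazy: .+? stops at the first ">" it can reach.
lazy = re.findall(r"<.+?>", html)
print(greedy)  # ['<a><b>']
print(lazy)    # ['<a>', '<b>']

ks = "kkkk"

# {1,3} takes up to three k's per match; {1,3}? takes the minimum of one.
greedy_range = re.findall(r"k{1,3}", ks)
lazy_range = re.findall(r"k{1,3}?", ks)
print(greedy_range)  # ['kkk', 'k']
print(lazy_range)    # ['k', 'k', 'k', 'k']
```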

5. Logical operators

  • |: combines two patterns with a logical OR; a match succeeds if either alternative matches
string = "tokken test kkktn"
re_demo = r"to|te|kk"
res10 = re.findall(re_demo, string)
print(res10)      # Output: ['to', 'kk', 'te', 'kk']
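
Two details of | are worth a quick sketch: alternatives are tried from left to right, and | has the lowest precedence, so a group is needed to limit its scope. The sample strings below are made up for illustration, and (?:...) is a non-capturing group.

```python
import re

# Alternatives are tried left to right; the first one that matches wins.
short_first = re.findall(r"to|token", "token")
long_first = re.findall(r"token|to", "token")
print(short_first)  # ['to']
print(long_first)   # ['token']

colors = "gray grey groy"

# Without a group, the pattern means "gra" OR "ey".
ungrouped = re.findall(r"gra|ey", colors)
# The non-capturing group limits the OR to the single letter a or e.
grouped = re.findall(r"gr(?:a|e)y", colors)
print(ungrouped)  # ['gra', 'ey']
print(grouped)    # ['gray', 'grey']
```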

6. Boundaries

  • ^: matches at the start of the input string
string = "tokken test kkktn"
re_demo = r"^to"
res11 = re.findall(re_demo, string)
print(res11)      # Output: ['to']
  • $: matches at the end of the input string
string = "tokken test kkktn"
re_demo = r"tn$"
res12 = re.findall(re_demo, string)
print(res12)      # Output: ['tn']
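
By default ^ and $ anchor to the whole string. For multi-line input, the re.MULTILINE flag makes them match at the start and end of every line as well. A small sketch, with a made-up three-line string:

```python
import re

text = "token\ntoday\nktn"

# Default: ^ matches only at the very start of the string.
start_default = re.findall(r"^to", text)
# re.MULTILINE: ^ also matches right after each newline.
start_multiline = re.findall(r"^to", text, flags=re.MULTILINE)
print(start_default)    # ['to']
print(start_multiline)  # ['to', 'to']

# Default: $ matches only at the end of the string.
end_default = re.findall(r"n$", text)
# re.MULTILINE: $ also matches right before each newline.
end_multiline = re.findall(r"n$", text, flags=re.MULTILINE)
print(end_default)    # ['n']
print(end_multiline)  # ['n', 'n']
```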

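7. Group matching

  • (): captures only the part of the match inside the parentheses; when the pattern contains a group, findall returns the group's content instead of the whole match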
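
The original gives no example for groups, so here is a minimal sketch that reuses the sample string from the earlier sections. The variable names are chosen for illustration.

```python
import re

string = "token test ktn"

# No group: findall returns the whole match.
whole = re.findall(r"k.n", string)
# One group: findall returns only what the parentheses captured.
inner = re.findall(r"k(.)n", string)
# Several groups: findall returns one tuple per match.
pairs = re.findall(r"(k)(.)n", string)
print(whole)  # ['ken', 'ktn']
print(inner)  # ['e', 't']
print(pairs)  # [('k', 'e'), ('k', 't')]

# On a Match object, group() is the whole match and group(1) is the first group.
m = re.search(r"k(.)n", string)
print(m.group())   # ken
print(m.group(1))  # e
```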