Python | Pandas Series.str.contains(): filtering the rows of a pandas DataFrame that contain a specific string

Published: 2023-06-20 09:26:04 Author: Oops!#

Example #1: Use the Series.str.contains() function to find whether a pattern is present in the strings of the given Series object.
 

  • Python3
 
# importing pandas as pd
import pandas as pd
 
# importing re for regular expressions
import re
 
# Creating the Series
sr = pd.Series(['New_York', 'Lisbon', 'Tokyo', 'Paris', 'Munich'])
 
# Creating the index
idx = ['City 1', 'City 2', 'City 3', 'City 4', 'City 5']
 
# set the index
sr.index = idx
 
# Print the series
print(sr)

Output :

City 1    New_York
City 2      Lisbon
City 3       Tokyo
City 4       Paris
City 5      Munich
dtype: object

Now we will use the Series.str.contains() function to check whether each string in the Series contains the substring 'is'.
 

  • Python3
 
 
# find if 'is' substring is present
result = sr.str.contains(pat = 'is')
 
# print the result
print(result)

Output :

City 1    False
City 2     True
City 3    False
City 4     True
City 5    False
dtype: bool

As the output shows, Series.str.contains() returns a Series of boolean values: True where the pattern is found in the string ('Lisbon' and 'Paris' here) and False otherwise.
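
Because the result is a boolean Series, it can be used directly as a mask to keep only the matching entries. The following is a minimal sketch that reuses the city data from Example #1; `case` is a standard parameter of Series.str.contains(), while the variable names `mask`, `matches` and `york` are only illustrative.

```python
import pandas as pd

sr = pd.Series(['New_York', 'Lisbon', 'Tokyo', 'Paris', 'Munich'],
               index=['City 1', 'City 2', 'City 3', 'City 4', 'City 5'])

# boolean mask: True where the string contains 'is'
mask = sr.str.contains('is')

# keep only the entries for which the mask is True
matches = sr[mask]
print(matches)

# case=False ignores letter case, so 'YORK' matches 'New_York'
york = sr[sr.str.contains('YORK', case=False)]
print(york)
```

The same indexing pattern works on a DataFrame column, which is how rows are usually filtered by substring.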

Example #2: Use the Series.str.contains() function with a regular expression to find whether a pattern is present in the strings of the given Series object.
 

  • Python3
 
# importing pandas as pd
import pandas as pd
 
# importing re for regular expressions
import re
 
# Creating the Series
sr = pd.Series(['Mike', 'Alessa', 'Nick', 'Kim', 'Britney'])
 
# Creating the index
idx = ['Name 1', 'Name 2', 'Name 3', 'Name 4', 'Name 5']
 
# set the index
sr.index = idx
 
# Print the series
print(sr)

Output :

Name 1       Mike
Name 2     Alessa
Name 3       Nick
Name 4        Kim
Name 5    Britney
dtype: object

Now we will use the Series.str.contains() function with a regular expression to check whether each string in the Series matches the pattern.
 

  • Python3
 
# find whether the string contains the letter 'i'
# followed by any lowercase letter
result = sr.str.contains(pat = 'i[a-z]', regex = True)
 
# print the result
print(result)

Output :

Name 1     True
Name 2    False
Name 3     True
Name 4     True
Name 5     True
dtype: bool

As the output shows, Series.str.contains() again returns a Series of boolean values: True where the regular expression matches somewhere in the string and False otherwise. Here only 'Alessa' is False, because it contains no 'i'.
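
The pattern is interpreted as a regular expression by default, so characters such as '.' carry a special meaning unless regex=False is passed. The sketch below illustrates the difference on a few made-up file names (the data and variable names are assumptions for illustration). It also uses na=False so that a missing value counts as "no match" instead of producing a missing result.

```python
import pandas as pd

# made-up sample data; the last entry is a missing value
files = pd.Series(['a.txt', 'abtxt', 'notes.md', None])

# regex=True (the default): '.' matches any single character,
# so both 'a.txt' and 'abtxt' match
as_regex = files.str.contains('a.t', na=False)

# regex=False: 'a.t' is searched for as a literal substring,
# so only 'a.txt' matches
as_literal = files.str.contains('a.t', regex=False, na=False)

print(as_regex.tolist())
print(as_literal.tolist())
```

When the search text comes from data such as user IDs or file names, regex=False is usually the safer choice.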

 

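The same method also filters the rows of a DataFrame that contain a specific string. The script below reads a comma-separated log file and, for each user ID, keeps the rows whose column 52 contains that ID, strips the double quotes from every column, and writes the result to a separate file.

import csv

import pandas as pd

# user IDs to look for
aliUid = ["123", "124", "125", "126"]
file = './123.log'

# QUOTE_NONE keeps the double quotes as ordinary characters;
# dtype=str reads every column as strings so the .str accessor works
data = pd.read_csv(file, delimiter=',', quoting=csv.QUOTE_NONE, header=None, dtype=str)

for uid in aliUid:
    # rows whose column 52 contains uid as a literal substring;
    # na=False treats missing values as "no match";
    # .copy() avoids modifying a slice of data
    df = data.loc[data[52].str.contains(uid, regex=False, na=False)].copy()

    # strip the double quotes from every column
    for column in df:
        df[column] = df[column].str.replace('"', '', regex=False)

    print(df)
    new_file = f"./{uid}.log"
    df.to_csv(new_file, quoting=csv.QUOTE_NONE, index=False, header=False)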
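
The script above depends on a log file that is not shown, so here is a small self-contained sketch of the same row filter on an in-memory DataFrame. The column names and values are invented for illustration. Joining the IDs with '|' into one pattern is an alternative to looping over them, not the approach used above.

```python
import re

import pandas as pd

# invented sample data that mimics quoted fields read with QUOTE_NONE
df = pd.DataFrame({
    'uid': ['"123"', '"999"', '"125"', None],
    'msg': ['"login"', '"logout"', '"login"', '"error"'],
})

ids = ['123', '125']

# rows whose 'uid' column contains one ID as a literal substring
only_123 = df.loc[df['uid'].str.contains('123', regex=False, na=False)]

# all IDs at once: join the escaped IDs into a single regex alternation
pattern = '|'.join(re.escape(i) for i in ids)
any_id = df.loc[df['uid'].str.contains(pattern, na=False)].copy()

# strip the double quotes from every column
for column in any_id:
    any_id[column] = any_id[column].str.replace('"', '', regex=False)

print(any_id)
```

Note that str.contains() matches substrings, so the ID '123' would also select a row whose value is '1234'. If exact matches are needed, comparing with == or using Series.isin() after stripping the quotes may be a better fit.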