DataFrame: random row selection and vertical concatenation

Published: 2023-07-25 15:58:06 Author: Bonne_chance

Selecting random rows from a DataFrame

(1) Example DataFrame:

import pandas as pd

city_data = {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing'],
             'location': ['north', 'south', 'northwest', 'northeast', 'west', 'southwest'],
             'level': ['first', 'first', 'third', 'second', 'second', 'second'],
             'if-to-sea':['no', 'yes', 'no','yes','no','no']}

city_df = pd.DataFrame(city_data)

The DataFrame looks like this:

         city   location   level if-to-sea
0    beijing      north   first        no
1   shanghai      south   first       yes
2     xining  northwest   third        no
3     dalian  northeast  second       yes
4       xian       west  second        no
5  chongqing  southwest  second        no

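(2) Random row selection, method 1: sample a fraction of the rows
city_df_2_3 = city_df.sample(frac=0.6)  # sample 60% of the rows at random; round(0.6 * 6) = 4 rows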

         city   location   level if-to-sea
4       xian       west  second        no
0    beijing      north   first        no
5  chongqing  southwest  second        no
3     dalian  northeast  second       yes
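
Because sample draws rows at random, the rows shown above change on every run. The following is a minimal sketch, assuming reproducible output is wanted: it passes random_state, a parameter of DataFrame.sample. The seed value 42 is arbitrary, and the two-column DataFrame is a shortened stand-in for city_df.

```python
import pandas as pd

# Shortened stand-in for city_df (only two of the four columns)
city_df = pd.DataFrame({
    'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing'],
    'level': ['first', 'first', 'third', 'second', 'second', 'second'],
})

# frac=0.6 on 6 rows yields round(3.6) = 4 rows
sample_a = city_df.sample(frac=0.6, random_state=42)
sample_b = city_df.sample(frac=0.6, random_state=42)

print(len(sample_a))              # 4
print(sample_a.equals(sample_b))  # True: same seed, same rows in the same order
```

Without random_state, two calls will usually return different rows.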

(3) Random row selection, method 2: sample a fixed number of rows
city_df_2_3_1 = city_df.sample(n=4)

      city   location   level if-to-sea
0  beijing      north   first        no
2   xining  northwest   third        no
4     xian       west  second        no
3   dalian  northeast  second       yes
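
By default sample draws without replacement, so n cannot exceed the number of rows. The sketch below illustrates this behaviour on a one-column stand-in for city_df; the value n=10 is chosen only to trigger the error.

```python
import pandas as pd

# One-column stand-in for city_df
city_df = pd.DataFrame(
    {'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing']}
)

four_rows = city_df.sample(n=4)   # 4 distinct rows in random order
print(four_rows.index.is_unique)  # True

try:
    city_df.sample(n=10)          # more rows than exist, no replacement
except ValueError as err:
    print('ValueError:', err)

# With replace=True the same row may be drawn more than once
with_repeats = city_df.sample(n=10, replace=True)
print(len(with_repeats))          # 10
```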

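Getting the remaining rows after random selection

Continuing the example above: after sampling 2/3 of the original DataFrame, we want the remaining 1/3. The sampled rows keep their original index labels, so the remainder is every row whose index is not in the sample.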

city_df_2_3_index = city_df_2_3.index.to_list()
city_df_1_3 = city_df[~city_df.index.isin(city_df_2_3_index)]

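Result (for the city_df_2_3 sample shown in method 1, which holds rows 4, 0, 5 and 3; sample is random, so the rows differ from run to run):

       city   location  level if-to-sea
1  shanghai      south  first       yes
2    xining  northwest  third        no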
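
As an alternative to the boolean mask, the sampled index labels can be dropped directly. This is a sketch rather than the author's method; it assumes the index labels are unique (true for the default index used here), and random_state=0 is an arbitrary seed added only to make the run repeatable.

```python
import pandas as pd

# Shortened stand-in for city_df (only two of the four columns)
city_df = pd.DataFrame({
    'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing'],
    'level': ['first', 'first', 'third', 'second', 'second', 'second'],
})

city_df_2_3 = city_df.sample(frac=0.6, random_state=0)

# Approach from the text: keep rows whose label is NOT in the sample
rest_by_mask = city_df[~city_df.index.isin(city_df_2_3.index)]

# Alternative: drop the sampled labels
rest_by_drop = city_df.drop(city_df_2_3.index)

print(len(rest_by_drop))                  # 2
print(rest_by_mask.equals(rest_by_drop))  # True
```

Both approaches keep the remaining rows in their original order.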

Concatenating the two DataFrames vertically

city_df_concat = pd.concat([city_df_1_3, city_df_2_3])

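Result (for the same sample, rows 4, 0, 5 and 3; the remaining rows 1 and 2 come first, and every row keeps its original index label):

        city   location   level if-to-sea
1   shanghai      south   first       yes
2     xining  northwest   third        no
4       xian       west  second        no
0    beijing      north   first        no
5  chongqing  southwest  second        no
3     dalian  northeast  second       yes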
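
pd.concat keeps the original index labels, which is why the index above is out of order. The sketch below shows two common follow-ups: renumbering with ignore_index=True, or restoring the original order with sort_index. The names part_a and part_b and the seed are illustrative, not from the original post.

```python
import pandas as pd

# Shortened stand-in for city_df (only two of the four columns)
city_df = pd.DataFrame({
    'city': ['beijing', 'shanghai', 'xining', 'dalian', 'xian', 'chongqing'],
    'level': ['first', 'first', 'third', 'second', 'second', 'second'],
})

part_a = city_df.sample(frac=0.6, random_state=0)  # sampled 2/3
part_b = city_df.drop(part_a.index)                # remaining 1/3

# Default: original labels are kept, so the index is out of order
concat_keep = pd.concat([part_b, part_a])

# ignore_index=True renumbers the rows 0..n-1
concat_reset = pd.concat([part_b, part_a], ignore_index=True)
print(list(concat_reset.index))  # [0, 1, 2, 3, 4, 5]

# Sorting by the kept labels restores the original row order
restored = concat_keep.sort_index()
print(list(restored['city']) == list(city_df['city']))  # True
```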