hive

发布时间 2023-09-25 15:33:44作者: Men!

hive测试步骤

1.hive中建表,一个表存放原始数据,一个表存放清洗后的数据

create table sales_initial (

    day_id varchar(30),

    sale_nbr varchar(30),

    buy_nbr varchar(30),

    cnt varchar(30),

    round varchar(30)

)

row format delimited

fields terminated by ',';

 

2.csv文件拖入虚拟机,然后执行下边的语句,将csv文件数据导入hive建的表中

load data local inpath '/opt/module/hadoop-3.1.3/sales_initial.csv' into table sales_initial;

 

3.清洗日期

insert overwrite table sales_clean
select date_add('2023-8-31', cast(day_id AS INT)) as day_id,
       sale_nbr,
       buy_nbr,
       cnt,
       round
from sales_initial ;

 

4.根据题目要求输出的字段,建立对应的表

 

5.题目实现

题目3_1统计每天各个机场的销售数量和销售金额

insert into sales_3_1 (day_id, sale_nbr, cnt, round)
select day_id, sale_nbr, sum(cnt), sum(round)
from sales_clean
where sale_nbr like 'C%'
group by day_id, sale_nbr;

 

 

 

题目3_2统计每天各个代理商的销售数量和销售金额

insert into sales_3_2 (day_id, sale_nbr, cnt, round)

select day_id, sale_nbr, sum(cnt), sum(round)

from sales_clean

where sale_nbr like 'O%'

group by day_id, sale_nbr;

 

题目3_3统计每天各个代理商的销售活跃度

insert into sales_3_3(day_id, sale_nbr, sale_number)
select day_id, sale_nbr, count(sale_nbr)
from sales_clean
where sale_nbr like 'O%'
group by day_id, sale_nbr;

 

题目3_4汇总统计9月1日到9月15日之间各个代理商的销售利润

create table buy as select day_id, buy_nbr, sum(cnt) as cnt, sum(round) as round from sales_initial where buy_nbr like 'O%' and day_id >= '2023-09-01' AND day_id <= '2023-09-15' group by day_id, buy_nbr

create table sale as select day_id, sale_nbr, sum(cnt) as cnt , sum(round) as round from sales_initial where sale_nbr like 'O%' and day_id >= '2023-09-01' AND day_id <= '2021-09-15' group by day_id, sale_nbr

create table earned as select a.day_id as day_id, b.sale_nbr as nbr, a.cnt as cnt_buy, a.round as round_buy, b.cnt as cnt_sale, b.round as round_sale, b.round-a.round as money from buy a join sale b on a.buy_nbr = b.sale_nbr and a.day_id = b.day_id

 

6.将以上统计结果保存到本机的mysql

实例:

bin/sqoop export \

--connect jdbc:mysql://localhost:3306/test \

--username root \

--password ******** \

--table sales_3_1 \

--export-dir /user/hive/warehouse/sales_3_1 \

--input-fields-terminated-by '\t'

 

 

数据表部分展示

 

 

 

7.echarts数据可视化实现

下面是echarts的官网的一个新手指导,我以前没了解过,因此需要学一下

快速上手 - Handbook - Apache ECharts

 

 

这里提供了大量的指导,因此剩下的难点就是如何将mysql表中的数据传入表中

 

 

 

 

 

这里我使用了一种简单粗暴的方法,但有的数据好像还有些问题

 

目前只能达到这个水平.