Installing, Configuring, and Using Flume

Published: 2023-08-30 12:03:16 Author: whiteY

1. Download location for the installation package

https://mirrors.tuna.tsinghua.edu.cn/apache/

2. Installation environment

192.168.11.128
192.168.11.129
192.168.11.130

3. Extract the gz archive on all three nodes

mkdir /usr/local/flume
tar -zxvf apache-flume-1.9.0-bin.tar.gz -C /usr/local/flume
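
Because the same archive has to be unpacked on each of the three nodes, a small loop can save some typing. The sketch below only prints the commands it would run. It assumes root SSH access to every node and uses /tmp as a staging directory; neither is stated in the steps above, and the function name is made up for illustration.

```shell
#!/bin/sh
# Sketch only: prints the copy/extract commands for each node instead of running them.
# Assumptions (not from the original steps): root SSH access, /tmp as staging directory.
NODES="192.168.11.128 192.168.11.129 192.168.11.130"
ARCHIVE="apache-flume-1.9.0-bin.tar.gz"

print_install_cmds() {
  for node in $NODES; do
    # Copy the archive to the node.
    echo "scp $ARCHIVE root@$node:/tmp/"
    # Create the target directory and unpack the archive there.
    echo "ssh root@$node 'mkdir -p /usr/local/flume && tar -zxf /tmp/$ARCHIVE -C /usr/local/flume'"
  done
}

print_install_cmds
```

After reviewing the printed lines, they can be executed with `print_install_cmds | sh`.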

4. Configure flume-env.sh

Go to Flume's conf directory, copy flume-env.sh.template to flume-env.sh, and edit the copy

cp flume-env.sh.template flume-env.sh

Add the JAVA_HOME environment variable
export JAVA_HOME=/usr/local/jdk/jdk1.8.0_261
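
The copy-and-edit step can also be scripted so that it is safe to run more than once. The following is a minimal sketch; the helper name `set_java_home` is hypothetical, and the paths in the usage comment are the ones used in this article.

```shell
#!/bin/sh
# Sketch: create flume-env.sh from the template and add JAVA_HOME exactly once.
# set_java_home is a hypothetical helper, not part of Flume.
set_java_home() {
  conf_dir=$1
  java_home=$2
  env_file="$conf_dir/flume-env.sh"
  # Copy the template only if flume-env.sh does not exist yet.
  [ -f "$env_file" ] || cp "$conf_dir/flume-env.sh.template" "$env_file"
  # Append the export only if no uncommented JAVA_HOME export is present.
  if ! grep -q '^export JAVA_HOME=' "$env_file"; then
    printf 'export JAVA_HOME=%s\n' "$java_home" >> "$env_file"
  fi
}

# Usage with the paths from this article:
#   set_java_home /usr/local/flume/apache-flume-1.9.0-bin/conf /usr/local/jdk/jdk1.8.0_261
```

Running the helper a second time leaves the file unchanged, because the `grep` check finds the existing export line.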

5. Verify the configuration

[root@hadoop01 bin]# ./flume-ng version
Flume 1.9.0
Source code repository: https://git-wip-us.apache.org/repos/asf/flume.git
Revision: d4fcab4f501d41597bc616921329a4339f73585e
Compiled by fszabo on Mon Dec 17 20:45:25 CET 2018
From source with checksum 35db629a3bda49d23e9b3690c80737f9

6. Test Flume

6.1 Create a flume-conf.properties file
[root@hadoop01 conf]# vi flume-conf.properties
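# Names of the components in this agent
## Source component of the agent
a1.sources = r1
## Sink component
a1.sinks = k1
## Channel inside the agent, which carries events from the source to the sink
a1.channels = c1

# Describe and configure the source: r1
## The netcat source listens on a port
a1.sources.r1.type = netcat
## Bind address; this machine's hostname is master, so master can also be used here
a1.sources.r1.bind = 192.168.11.128
## Bind port
a1.sources.r1.port = 44444

# Describe and configure the sink: k1
a1.sinks.k1.type = logger

# Describe and configure the channel; an in-memory buffer is used here
# memory buffers events in RAM; to buffer them on disk, use the file channel type instead
a1.channels.c1.type = memory
# Maximum number of events the channel can hold
a1.channels.c1.capacity = 1000
# Maximum number of events per transaction
a1.channels.c1.transactionCapacity = 100

# Wire the source, channel, and sink together. Both the source and the sink rely on a channel to pass data, so each must name the channel it uses.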
a1.sources.r1.channels = c1
a1.sinks.k1.channel = c1

Note: a1.sources.r1.bind = master (either a hostname or an IP address works)
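
A common mistake in this kind of file is wiring a source or sink to a channel name that was never declared. Flume checks the configuration itself at startup, but a quick pre-check can catch typos earlier. The sketch below is a hypothetical helper (`check_channels` is not a Flume command) that only understands the simple one-property-per-line layout shown above.

```shell
#!/bin/sh
# Sketch: verify that every channel referenced by a source or sink is declared.
# check_channels is a hypothetical helper; it handles only simple "key = value" lines.
check_channels() {
  conf=$1
  agent=$2
  # Channels declared on the "<agent>.channels = ..." line.
  declared=$(sed -n "s/^$agent\.channels[[:space:]]*=[[:space:]]*//p" "$conf")
  # Channels referenced by "<agent>.sources.<name>.channels" and "<agent>.sinks.<name>.channel".
  used=$(sed -n \
    -e "s/^$agent\.sources\.[^.]*\.channels[[:space:]]*=[[:space:]]*//p" \
    -e "s/^$agent\.sinks\.[^.]*\.channel[[:space:]]*=[[:space:]]*//p" "$conf")
  status=0
  for ch in $used; do
    case " $declared " in
      *" $ch "*) ;;  # referenced channel is declared
      *) echo "undeclared channel: $ch"; status=1 ;;
    esac
  done
  return $status
}

# Usage:
#   check_channels conf/flume-conf.properties a1 && echo "channel wiring looks consistent"
```

The function returns a non-zero status and names the offending channel when a reference does not match the declared list.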


6.2 Start Flume data collection

[root@hadoop01 apache-flume-1.9.0-bin]# bin/flume-ng agent -c conf -f conf/flume-conf.properties -n a1 -Dflume.root.logger=INFO,console
Info: Sourcing environment configuration script /usr/local/flume/apache-flume-1.9.0-bin/conf/flume-env.sh
Info: Including Hadoop libraries found via (/bin/hadoop) for HDFS access
Info: Including HBASE libraries found via (/bin/hbase) for HBASE access
Info: Including Hive libraries found via () for Hive access
2023-08-29 21:34:02,056 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:151)] Creating channels
2023-08-29 21:34:02,079 (conf-file-poller-0) [INFO - org.apache.flume.channel.DefaultChannelFactory.create(DefaultChannelFactory.java:42)] Creating instance of channel c1 type memory
2023-08-29 21:34:02,092 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.loadChannels(AbstractConfigurationProvider.java:205)] Created channel c1
2023-08-29 21:34:02,092 (conf-file-poller-0) [INFO - org.apache.flume.source.DefaultSourceFactory.create(DefaultSourceFactory.java:41)] Creating instance of source r1, type netcat
2023-08-29 21:34:02,109 (conf-file-poller-0) [INFO - org.apache.flume.sink.DefaultSinkFactory.create(DefaultSinkFactory.java:42)] Creating instance of sink: k1, type: logger
2023-08-29 21:34:02,117 (conf-file-poller-0) [INFO - org.apache.flume.node.AbstractConfigurationProvider.getConfiguration(AbstractConfigurationProvider.java:120)] Channel c1 connected to [r1, k1]
2023-08-29 21:34:02,136 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:162)] Starting new configuration:{ sourceRunners:{r1=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:r1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@52f8d298 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }
2023-08-29 21:34:02,150 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:169)] Starting Channel c1
2023-08-29 21:34:02,152 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:184)] Waiting for channel: c1 to start. Sleeping for 500 ms
2023-08-29 21:34:02,421 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.register(MonitoredCounterGroup.java:119)] Monitored counter group for type: CHANNEL, name: c1: Successfully registered new MBean.
2023-08-29 21:34:02,421 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.instrumentation.MonitoredCounterGroup.start(MonitoredCounterGroup.java:95)] Component type: CHANNEL, name: c1 started
2023-08-29 21:34:02,652 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:196)] Starting Sink k1
2023-08-29 21:34:02,655 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:207)] Starting Source r1
2023-08-29 21:34:02,658 (lifecycleSupervisor-1-4) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:155)] Source starting
2023-08-29 21:34:02,695 (lifecycleSupervisor-1-4) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:166)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/192.168.11.128:44444]
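
The last line of the log above, "Created serverSocket", indicates that the netcat source is bound and ready for connections. If the agent's console output is saved to a file, that line can be checked from a script. The sketch below assumes the output was redirected to a file such as flume.log (for example by appending `> flume.log 2>&1 &` to the command); both the file name and the helper name are assumptions for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given log shows the netcat source bound to ADDR.
# Assumes the agent's console output was saved to a file beforehand.
source_is_bound() {
  log=$1
  addr=$2
  # Look for the NetcatSource startup line that names the bound address and port.
  grep -F "Created serverSocket" "$log" | grep -qF "[/$addr]"
}

# Usage:
#   source_is_bound flume.log 192.168.11.128:44444 && echo "source is listening"
```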

6.3 Install Telnet on node 192.168.11.129 and use it to connect to Flume on 192.168.11.128
yum install -y telnet

[root@hadoop02 xinetd.d]# telnet 192.168.11.128 44444
Trying 192.168.11.128...
Connected to 192.168.11.128.
Escape character is '^]'.
hello world
OK
hello flume
OK

6.4 Go back to node 192.168.11.128 and check the output
2023-08-29 21:50:17,574 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 65 6C 6C 6F 20 77 6F 72 6C 64 0D             hello world. }
2023-08-29 21:50:35,395 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:95)] Event: { headers:{} body: 68 65 6C 6C 6F 20 66 6C 75 6D 65 0D             hello flume. }
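
The logger sink prints each event body as hex bytes followed by a printable rendering. The trailing 0D is a carriage return, most likely left over from the CR LF line ending that Telnet sends, and it is shown as a dot in the rendering. To read a body by hand, the hex bytes can be decoded as in the sketch below; `hex_to_text` is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch: decode a space-separated list of hex bytes (as printed by the logger sink) to text.
hex_to_text() {
  for byte in $1; do
    # Convert the hex byte to a 3-digit octal escape and let printf emit that character.
    printf "\\$(printf '%03o' "0x$byte")"
  done
}

# Decode the first event body from the log; tr removes the trailing carriage return (0D).
hex_to_text "68 65 6C 6C 6F 20 77 6F 72 6C 64 0D" | tr -d '\r'
echo
```

Run as written, this prints `hello world`, matching the text typed in the Telnet session.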