Edit atlas-application.properties
Add the following:
atlas.hook.hive.synchronous=false
atlas.hook.hive.numRetries=3
atlas.hook.hive.queueSize=10000
atlas.cluster.name=primary
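If Atlas was installed in embedded mode, replace localhost with an IP address or hostname; otherwise Kafka cannot be reached from other hosts
atlas.kafka.zookeeper.connect=172.31.6.205:9026
atlas.kafka.bootstrap.servers=172.31.6.205:9027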
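A leftover localhost entry is easy to miss, so a small shell check before deploying the file can help. This is only a sketch: the function name check_atlas_kafka_hosts is made up for illustration and is not part of Atlas, and it only inspects the two keys shown above.

```shell
# Hypothetical helper (not part of Atlas): fail if either Kafka-related
# setting shown above still points at localhost.
check_atlas_kafka_hosts() {
  props_file="$1"
  if grep -Eq '^atlas\.kafka\.(zookeeper\.connect|bootstrap\.servers)=.*localhost' "$props_file"; then
    echo "localhost still configured in $props_file" >&2
    return 1
  fi
  echo "no localhost entries in $props_file"
}
```

Usage, run from the conf directory: check_atlas_kafka_hosts atlas-application.properties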
Increase the Kafka and ZooKeeper timeouts as appropriate
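The step above does not name the keys to raise. As far as I know, the stock atlas-application.properties carries ZooKeeper timeouts under keys such as atlas.kafka.zookeeper.session.timeout.ms and atlas.kafka.zookeeper.connection.timeout.ms; treat those names as an assumption and check them against your own file. The sketch below simply prints whatever timeout settings the file currently contains, so you can decide which ones to raise. The function name is hypothetical.

```shell
# Hypothetical helper: list every atlas.kafka.* timeout setting in the file.
# The key pattern is an assumption; adjust it if your file uses other names.
show_atlas_timeouts() {
  props_file="$1"
  grep -E '^atlas\.kafka\..*timeout\.ms=' "$props_file" \
    || echo "no timeout settings found in $props_file"
}
```

Usage: show_atlas_timeouts atlas-application.properties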
Add the configuration file to the plugin jar (zip takes the archive first, then the file to add)
cd /opt/atlas/apache-atlas-sources-2.1.0/distro/target/apache-atlas-2.1.0-bin/apache-atlas-2.1.0/conf
zip -u /opt/atlas/apache-atlas-sources-2.1.0/distro/target/apache-atlas-2.1.0-hive-hook/apache-atlas-hive-hook-2.1.0/hook/hive/atlas-plugin-classloader-2.1.0.jar atlas-application.properties
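To confirm that the properties file actually landed in the jar, the archive can be listed afterwards. A minimal sketch, assuming unzip is installed on the build host; the function name jar_has_atlas_props is made up for illustration.

```shell
# Hypothetical helper: succeed only if the given jar contains
# an entry named atlas-application.properties.
jar_has_atlas_props() {
  jar_file="$1"
  unzip -l "$jar_file" | grep -q 'atlas-application.properties'
}
```

Usage: jar_has_atlas_props /opt/atlas/apache-atlas-sources-2.1.0/distro/target/apache-atlas-2.1.0-hive-hook/apache-atlas-hive-hook-2.1.0/hook/hive/atlas-plugin-classloader-2.1.0.jar && echo "config is in the jar"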
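Copy atlas-application.properties and the /opt/atlas/apache-atlas-sources-2.1.0/distro/target/apache-atlas-2.1.0-hive-hook/apache-atlas-hive-hook-2.1.0 directory to the node where Hive is installed (the steps below assume the directory ends up under /opt/module)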
Set the environment variable in hive-env.sh
vim hive-env.sh
export HIVE_AUX_JARS_PATH=/opt/module/apache-atlas-hive-hook-2.1.0/hook/hive
Add the following configuration to hive-site.xml
<property>
  <name>hive.exec.post.hooks</name>
  <value>org.apache.atlas.hive.hook.HiveHook,org.apache.hadoop.hive.ql.hooks.LineageLogger</value>
</property>
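Restart Hive
With the configuration above, newly created Hive metadata is synced to Atlas in real time. Metadata that already exists has to be synced manually once, which is done by running the following script: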
/opt/module/apache-atlas-hive-hook-2.1.0/hook-bin/import-hive.sh