记录一次hudi 编译过程遇到过的问题

发布时间 2023-08-15 20:21:57作者: .狂飙的蜗牛

准备工作

pom中初始依赖组件版本配置如下

<java.version>1.8</java.version>
<hadoop.version>3.1.1.3.1.0.0-78</hadoop.version>
<hive.version>3.1.0.3.1.0.0-78</hive.version>
<kafka.version>2.0.0</kafka.version>
起始命令
mvn clean package -U -DskipTests -Dcheckstyle.skip -Dmaven.javadoc.skip=true -Pspark3.1 -Pflink1.14 -am

1.hudi-hadoop-mr 模块和hive版本不兼容问题

报错内容如下

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project hudi-hadoop-mr: Compilation failure: Compilation failure: 
[ERROR] /root/bigdata/hudi/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HiveAvroSerializer.java:[302,93] incompatible types: org.apache.hadoop.hive.common.type.Date cannot be converted to java.sql.Date
[ERROR] /root/bigdata/hudi/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HiveAvroSerializer.java:[305,72] incompatible types: org.apache.hadoop.hive.common.type.Timestamp cannot be converted to java.sql.Timestamp
[ERROR] /root/bigdata/hudi/hudi-hadoop-mr/src/main/java/org/apache/hudi/hadoop/utils/HiveAvroSerializer.java:[309,98] incompatible types: org.apache.hadoop.hive.common.type.Date cannot be converted to java.sql.Date

1.1 修改代码

      case DATE:
        return DateWritable.dateToDays(((DateObjectInspector)fieldOI).getPrimitiveJavaObject(structFieldData));
      case TIMESTAMP:
        Timestamp timestamp =
            ((TimestampObjectInspector) fieldOI).getPrimitiveJavaObject(structFieldData);
        return timestamp.getTime();
      case INT:
        if (schema.getLogicalType() != null && schema.getLogicalType().getName().equals("date")) {
          return DateWritable.dateToDays(new WritableDateObjectInspector().getPrimitiveJavaObject(structFieldData));
        }

修改为

      case DATE:
        return ((DateObjectInspector)fieldOI).getPrimitiveJavaObject(structFieldData).toEpochDay();
      case TIMESTAMP:
        Timestamp timestamp =
                ((TimestampObjectInspector) fieldOI).getPrimitiveJavaObject(structFieldData).toSqlTimestamp();
        return timestamp.getTime();
      case INT:
        if (schema.getLogicalType() != null && schema.getLogicalType().getName().equals("date")) {
          return new WritableDateObjectInspector().getPrimitiveJavaObject(structFieldData).toEpochDay();
        }

2.hudi-hadoop-mr 模块代码风格检查问题

2.1 报错如下:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-checkstyle-plugin:3.0.0:check (default) on project hudi-hadoop-mr: You have 1 Checkstyle violation. -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException

2.2 编译时指定跳过代码风格检查

mvn clean package -U -DskipTests -Dcheckstyle.skip -Dmaven.javadoc.skip=true -Pspark3.1 -Pflink1.14 -am -rf :hudi-hadoop-mr

3.hudi-hive-sync模块和zookeeper 版本不兼容问题

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:testCompile (default-testCompile) on project hudi-hive-sync: Compilation failure
[ERROR] /data/HDP_Codebase/hudi-0.13.1/hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/testutils/HiveTestUtil.java:[188,15] method shutdown in class org.apache.zookeeper.server.ZooKeeperServer cannot be applied to given types;
[ERROR]   required: no arguments
[ERROR]   found: boolean
[ERROR]   reason: actual and formal argument lists differ in length
[ERROR] 
[ERROR] -> [Help 1]

3.1 调整pom 里zookeeper 的版本依赖为

3.5.7
调整为
3.4.6.3.1.0.0-78

3.2 修改HiveTestUtil源码

查看 wesure-hive-release 源码中都是 zkServer.shutdown();
修改代码 /HDP_Codebase/hudi-0.13.1/hudi-sync/hudi-hive-sync/src/test/java/org/apache/hudi/hive/testutils/HiveTestUtil.java

    if (zkServer != null) {
      zkServer.shutdown(true);
    }

修改为

    if (zkServer != null) {
      zkServer.shutdown();
    }

4.hudi-timeline-service模块参数不兼容问题

4.1 报错如下

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.8.0:compile (default-compile) on project hudi-timeline-service: Compilation failure
[ERROR] /data/HDP_Codebase/hudi-0.13.1/hudi-timeline-service/src/main/java/org/apache/hudi/timeline/service/TimelineService.java:[348,115] incompatible types: int cannot be converted to java.lang.ClassLoader
[ERROR] 
[ERROR] -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoFailureException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :hudi-timeline-service

4.2 代码传参异常,ScheduledExecutorScheduler 第三个参数为ClassLoader

    final Server server = new Server(pool);
    ScheduledExecutorScheduler scheduler = new ScheduledExecutorScheduler("TimelineService-JettyScheduler", true, 8);
    server.addBean(scheduler);

支持2个参数,去掉第三个参数调整为
final Server server = new Server(pool);
ScheduledExecutorScheduler scheduler = new ScheduledExecutorScheduler("TimelineService-JettyScheduler", true);
server.addBean(scheduler);

5.hudi-utilities 模块io.confluent 依赖无法解析

5.1 报错内容如下

[ERROR] Failed to execute goal on project hudi-utilities_2.11: Could not resolve dependencies for project org.apache.hudi:hudi-utilities_2.11:jar:0.13.1: The following artifacts could not be resolved: io.confluent:kafka-avro-serializer:jar:5.3.4, io.confluent:common-config:jar:5.3.4, io.confluent:common-utils:jar:5.3.4, io.confluent:kafka-schema-registry-client:jar:5.3.4: Could not find artifact io.confluent:kafka-avro-serializer:jar:5.3.4 in HDPReleases (http://repo.hortonworks.com/content/repositories/releases) -> [Help 1]
[ERROR] 
[ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch.
[ERROR] Re-run Maven using the -X switch to enable full debug logging.
[ERROR] 
[ERROR] For more information about the errors and possible solutions, please read the following articles:
[ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/DependencyResolutionException
[ERROR] 
[ERROR] After correcting the problems, you can resume the build with the command
[ERROR]   mvn <args> -rf :hudi-utilities_2.11

5.2 调整 mvn 中的confluent仓库检索顺序

<mirrorOf>*,!HDPReleases</mirrorOf>

调整为
<mirrorOf>*,!HDPReleases,!confluent</mirrorOf>

6.编译成功