Bug encountered in the 尚硅谷 Hadoop WordCount hands-on exercise

Published: 2023-05-26 14:49:56  Author: 辰南以北

The error log and exception are as follows:

SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/D:/Environment/RepMaven/org/slf4j/slf4j-reload4j/1.7.36/slf4j-reload4j-1.7.36.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/D:/Environment/RepMaven/org/slf4j/slf4j-log4j12/1.7.30/slf4j-log4j12-1.7.30.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Reload4jLoggerFactory]
2023-05-26 13:55:26,083 WARN [org.apache.hadoop.util.NativeCodeLoader] - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2023-05-26 13:55:26,580 WARN [org.apache.hadoop.metrics2.impl.MetricsConfig] - Cannot locate configuration: tried hadoop-metrics2-jobtracker.properties,hadoop-metrics2.properties
2023-05-26 13:55:26,783 INFO [org.apache.hadoop.metrics2.impl.MetricsSystemImpl] - Scheduled Metric snapshot period at 10 second(s).
2023-05-26 13:55:26,783 INFO [org.apache.hadoop.metrics2.impl.MetricsSystemImpl] - JobTracker metrics system started
2023-05-26 13:55:27,228 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2023-05-26 13:55:27,351 WARN [org.apache.hadoop.mapreduce.JobResourceUploader] - No job jar file set.  User classes may not be found. See Job or Job#setJar(String).
2023-05-26 13:55:27,383 INFO [org.apache.hadoop.mapreduce.JobSubmitter] - Cleaning up the staging area file:/tmp/hadoop/mapred/staging/yeqiu523558444/.staging/job_local523558444_0001
Exception in thread "main" java.lang.UnsatisfiedLinkError: 'boolean org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(java.lang.String, int)'
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access0(Native Method)
	at org.apache.hadoop.io.nativeio.NativeIO$Windows.access(NativeIO.java:793)
	at org.apache.hadoop.fs.FileUtil.canRead(FileUtil.java:1249)
	at org.apache.hadoop.fs.FileUtil.list(FileUtil.java:1454)
	at org.apache.hadoop.fs.RawLocalFileSystem.listStatus(RawLocalFileSystem.java:601)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1972)
	at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:2014)
	at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:2180)
	at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:2179)
	at org.apache.hadoop.fs.ChecksumFileSystem.listLocatedStatus(ChecksumFileSystem.java:783)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:320)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.listStatus(FileInputFormat.java:279)
	at org.apache.hadoop.mapreduce.lib.input.FileInputFormat.getSplits(FileInputFormat.java:404)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeNewSplits(JobSubmitter.java:310)
	at org.apache.hadoop.mapreduce.JobSubmitter.writeSplits(JobSubmitter.java:327)
	at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:200)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1571)
	at org.apache.hadoop.mapreduce.Job$11.run(Job.java:1568)
	at java.base/java.security.AccessController.doPrivileged(AccessController.java:712)
	at java.base/javax.security.auth.Subject.doAs(Subject.java:439)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1878)
	at org.apache.hadoop.mapreduce.Job.submit(Job.java:1568)
	at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:1589)
	at com.atgui.mapreduce.wordcount.WordCountDriver.main(WordCountDriver.java:47)

The pom.xml dependencies are as follows:
<dependencies>
    <dependency>
        <groupId>org.apache.hadoop</groupId>
        <artifactId>hadoop-client</artifactId>
        <version>3.3.4</version>
    </dependency>
    <dependency>
        <groupId>junit</groupId>
        <artifactId>junit</artifactId>
        <version>4.12</version>
    </dependency>
    <dependency>
        <groupId>org.slf4j</groupId>
        <artifactId>slf4j-log4j12</artifactId>
        <version>1.7.30</version>
    </dependency>
</dependencies>
WordCountMapper.java
package com.atgui.mapreduce.wordcount;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;
public class WordCountMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
    private Text outK = new Text();
    private IntWritable outV = new IntWritable(1);

    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // 1. Get one line, e.g. "atguigu atguigu"
        String line = value.toString();

        // 2. Split it into words:
        // atguigu
        // atguigu
        String[] words = line.split(" ");

        // 3. Write out each word
        for (String word : words) {
            // Set the output key to the current word
            outK.set(word);
            // Write the (word, 1) pair
            context.write(outK, outV);
        }
    }
}
WordCountReducer.java
package com.atgui.mapreduce.wordcount;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.io.Text;

import java.io.IOException;
public class WordCountReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
    private IntWritable outV = new IntWritable();
    @Override
    protected void reduce(Text key, Iterable<IntWritable> values, Reducer<Text, IntWritable, Text, IntWritable>.Context context) throws IOException, InterruptedException {
        int sum = 0;

        // e.g. key = "atguigu", values = (1, 1)
        for (IntWritable value : values) {
            sum += value.get();
        }

        outV.set(sum);

        // Write out the total count for this word
        context.write(key, outV);
    }
}

WordCountDriver.java
package com.atgui.mapreduce.wordcount;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

import java.io.IOException;

public class WordCountDriver {
//    static {
//        try {
//            System.load("D:\\Environment\\hadoop-3.1.0\\bin\\hadoop.dll");
//        } catch (UnsatisfiedLinkError e) {
//            System.err.println("Native code library failed to load.\n" + e);
//            System.exit(1);
//        }
//    }
    public static void main(String[] args) throws IOException, InterruptedException, ClassNotFoundException {
        // 1. Get the job
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        // 2. Set the jar path
        job.setJarByClass(WordCountDriver.class);

        // 3. Wire up the mapper and reducer
        job.setMapperClass(WordCountMapper.class);
        job.setReducerClass(WordCountReducer.class);

        // 4. Set the map output key/value types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(IntWritable.class);

        // 5. Set the final output key/value types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);

        // 6. Set the input and output paths
        FileInputFormat.setInputPaths(job, new Path("D:\\Environment\\input\\inputword"));
        FileOutputFormat.setOutputPath(job, new Path("D:\\Environment\\hadoop\\output1"));

        // 7. Submit the job
        boolean result = job.waitForCompletion(true);

        System.exit(result ? 0 : 1);
    }
}

----------------------------------------------------------------------------------------------

I asked Claude, and its answer was as follows:

This error is an UnsatisfiedLinkError thrown because the WordCount program, when run on Windows, tried to call the access0 method of Hadoop's NativeIO class but could not find the correct JNI library.
The NativeIO class is what Hadoop uses to perform certain local file operations, and it relies on JNI to call the Windows API. To run a Hadoop program on Windows, you need to copy the matching winutils.exe and hadoop.dll files out of a Hadoop distribution and set the corresponding environment variables; only then can the Hadoop program load these two files and perform local file operations.
So the possible causes of this error are:

1. winutils.exe and hadoop.dll were never obtained from a Hadoop distribution.

2. The two files were obtained, but the environment variables are not set correctly, so Hadoop cannot load them (a quick check of both points from inside the JVM is sketched after this list).
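
A minimal diagnostic sketch, not from the original answer: assuming it is dropped into the driver's main method before the Job is created, it prints whether the JVM can see HADOOP_HOME, what native library search path it uses, and whether hadoop.dll is actually where it is expected.

String hadoopHome = System.getenv("HADOOP_HOME");
System.out.println("HADOOP_HOME = " + hadoopHome);
// System.loadLibrary searches java.library.path, which on Windows includes the PATH directories
System.out.println("java.library.path = " + System.getProperty("java.library.path"));
if (hadoopHome != null) {
    // the standard Windows setup keeps hadoop.dll and winutils.exe under %HADOOP_HOME%\bin
    System.out.println("hadoop.dll present: " + new java.io.File(hadoopHome, "bin\\hadoop.dll").exists());
}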

Solution: add the code below to your driver class; if the program then runs, it confirms that the problem is with hadoop.dll.

    static {
        try {
            System.load("D:\\Environment\\hadoop-3.1.0\\bin\\hadoop.dll");
        } catch (UnsatisfiedLinkError e) {
            System.err.println("Native code library failed to load.\n" + e);
            System.exit(1);
        }
    }
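
If the job runs once this static block is in place, the native library itself is fine and what remains is the environment setup described above: point HADOOP_HOME at the directory whose bin folder holds winutils.exe and hadoop.dll, add %HADOOP_HOME%\bin to the PATH (the IDE usually needs a restart to pick up new environment variables), and the hard-coded System.load call can then be removed.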
    }