[hadoop] 3002: MapReduce word-count example

2023-11-29 04:32


I. Create the text files wordcount.txt and wordcount1.txt and upload them to the /wc directory on HDFS
[hadoop@cloud01 HDFSdemo]$ hadoop fs -cat /wc/wordcount.txt
hello world
hello China
hello wenjie
hello USA
hello China
hello China
hello Japan

[hadoop@cloud01 HDFSdemo]$ hadoop fs -cat /wc/wordcount1.txt
hello USA


Expected result:
<hello,8>, <world,1>, <China,3>, <wenjie,1>, <USA,2>, <Japan,1>

II. Count the words with an MR program
1. Write the map program, the reduce program and the Main driver program in Eclipse

package mapreduce;

import java.io.IOException;

import org.apache.commons.lang.StringUtils;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

/**
* Mapper
*
* @author shenfl
*
*/
public class WCMapper extends Mapper<LongWritable, Text, Text, LongWritable> {

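     /**
      * @param key     byte offset of the line within the input file
      * @param value   the text of one input line
      * @param context Hadoop context used to emit the output pairs
      */
     @Override
     protected void map(LongWritable key, Text value, Context context) throws IOException,
          InterruptedException {

          // Split the line on spaces; adjacent spaces count as a single separator.
          String[] values = StringUtils.split(value.toString(), " ");

          // Emit <word, 1> for every token on the line.
          for (String v : values) {
               context.write(new Text(v), new LongWritable(1));
          }
     }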
}
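
The mapper depends on StringUtils.split from commons-lang, which treats a run of spaces as one separator and never returns empty tokens. As a minimal sketch, assuming you would rather not pull in that dependency, the JDK's StringTokenizer behaves the same way for this kind of input, whereas String.split(" ") keeps empty tokens. The class name TokenizeSketch is hypothetical and is not part of the original program.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringTokenizer;

// Hypothetical helper, shown only to compare tokenizing behaviour.
final class TokenizeSketch {

    // Splits on spaces and skips empty tokens, like StringUtils.split(line, " ").
    static List<String> tokens(String line) {
        List<String> out = new ArrayList<String>();
        StringTokenizer st = new StringTokenizer(line, " ");
        while (st.hasMoreTokens()) {
            out.add(st.nextToken());
        }
        return out;
    }

    public static void main(String[] args) {
        // Two spaces between the words: still exactly two tokens.
        System.out.println(tokens("hello  China"));
        // A blank line yields no tokens, so the mapper would emit nothing for it.
        System.out.println(tokens(""));
        // String.split keeps the empty token between the two spaces.
        System.out.println("hello  China".split(" ").length);
    }
}
```

The difference matters for blank lines in the input: with String.split(" ") an empty line would produce one empty "word", while the tokenizer approach produces none.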



package mapreduce;

import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class WCReducer extends Reducer<Text, LongWritable, Text, LongWritable> {

     @Override
     protected void reduce(Text key, Iterable<LongWritable> values, Context context) throws IOException,
               InterruptedException {

          // Sum the 1s that the mappers emitted for this word.
          long count = 0;
          for (LongWritable v : values) {
               count += v.get();
          }
          context.write(key, new LongWritable(count));
     }
}
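
To see how the mapper and reducer fit together, here is a plain-Java simulation of the map, shuffle and reduce phases on the sample input. It is only a sketch: no Hadoop classes are involved, the class and method names are hypothetical, and a TreeMap stands in for the framework's sort-and-group step.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for the framework; illustrates the data flow only.
final class WordCountSimulation {

    static final List<String> SAMPLE = Arrays.asList(
            "hello world", "hello China", "hello wenjie", "hello USA",
            "hello China", "hello China", "hello Japan", // wordcount.txt
            "hello USA");                                 // wordcount1.txt

    // Map phase: one (word, 1) pair per token, as WCMapper does.
    static List<Map.Entry<String, Long>> mapPhase(List<String> lines) {
        List<Map.Entry<String, Long>> pairs = new ArrayList<Map.Entry<String, Long>>();
        for (String line : lines) {
            StringTokenizer st = new StringTokenizer(line, " ");
            while (st.hasMoreTokens()) {
                pairs.add(new SimpleEntry<String, Long>(st.nextToken(), 1L));
            }
        }
        return pairs;
    }

    // Shuffle phase: group all values by key, with keys kept in sorted order.
    static Map<String, List<Long>> shufflePhase(List<Map.Entry<String, Long>> pairs) {
        Map<String, List<Long>> grouped = new TreeMap<String, List<Long>>();
        for (Map.Entry<String, Long> p : pairs) {
            List<Long> values = grouped.get(p.getKey());
            if (values == null) {
                values = new ArrayList<Long>();
                grouped.put(p.getKey(), values);
            }
            values.add(p.getValue());
        }
        return grouped;
    }

    // Reduce phase: sum the values of each key, as WCReducer does.
    static Map<String, Long> reducePhase(Map<String, List<Long>> grouped) {
        Map<String, Long> counts = new TreeMap<String, Long>();
        for (Map.Entry<String, List<Long>> e : grouped.entrySet()) {
            long count = 0;
            for (long v : e.getValue()) {
                count += v;
            }
            counts.put(e.getKey(), count);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Prints {China=3, Japan=1, USA=2, hello=8, wenjie=1, world=1}
        System.out.println(reducePhase(shufflePhase(mapPhase(SAMPLE))));
    }
}
```

The grouped values are what the real reducer receives as its Iterable<LongWritable> argument: one call per distinct word.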

2. Write the Main driver program and run it to view the results

package mapreduce;

import java.net.URI;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;

/**
* <p>
* Test program for Hadoop 2.4.1.
* </p>
*
* @author shenfl
*
*/
public class WordCount {

     private static final String HDFS_PATH = "hdfs://cloud01:9000";

     public static void main(String[] args) {

          Configuration conf = new Configuration();
          try {
               // conf.set("", "");
               Job job = Job.getInstance(conf);

               // Locate the job's jar file from a class that it contains.
               job.setJarByClass(WordCount.class);
               // Note: setJar overrides the jar located by setJarByClass above.
               job.setJar("wc.jar");
              
               job.setInputFormatClass(TextInputFormat.class);
               job.setOutputFormatClass(TextOutputFormat.class);

               job.setMapperClass(WCMapper.class);
               job.setReducerClass(WCReducer.class);

               job.setMapOutputKeyClass(Text.class);
               job.setMapOutputValueClass(LongWritable.class);

               // Final (reducer) output types: key is the word, value is its count.
               job.setOutputKeyClass(Text.class);
               job.setOutputValueClass(LongWritable.class);

               Path inputPath = new Path(HDFS_PATH + "/wc");
               Path outputDir = new Path(HDFS_PATH + "/tmp");
               // Set the input Paths (directories or files) of the map-reduce job.
               FileInputFormat.setInputPaths(job, inputPath);
               // Set the Path of the output directory of the map-reduce job.
               FileOutputFormat.setOutputPath(job, outputDir);

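               // Remove any previous output directory; the job fails if it already exists.
               // Caution: this recursively deletes everything under /tmp on HDFS.
               FileSystem fs = FileSystem.get(new URI(HDFS_PATH), conf);
               if (fs.exists(outputDir)) {
                    fs.delete(outputDir, true);
               }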
               System.exit(job.waitForCompletion(true) ? 0 : 1);
          } catch (Exception e) {
               e.printStackTrace();
          }
     }

}

3. After the MR job has run, check the result on HDFS

[hadoop@cloud01 HDFSdemo]$ hadoop fs -cat /tmp/part-r-00000
China     3
Japan     1
USA     2
hello     8
wenjie     1
world     1
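
The capitalized words come first in this output because the keys reach the reducer sorted, and Text keys are compared by their raw UTF-8 bytes, so 'C', 'J' and 'U' sort before 'h' and 'w'. A minimal sketch of that ordering in plain Java follows; the class name KeyOrderSketch is hypothetical and the comparator is written by hand rather than taken from Hadoop.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical illustration of byte-wise key ordering.
final class KeyOrderSketch {

    // Compares two strings by their UTF-8 bytes, treating each byte as unsigned.
    static final Comparator<String> UTF8_BYTE_ORDER = new Comparator<String>() {
        @Override
        public int compare(String a, String b) {
            byte[] x = a.getBytes(StandardCharsets.UTF_8);
            byte[] y = b.getBytes(StandardCharsets.UTF_8);
            int n = Math.min(x.length, y.length);
            for (int i = 0; i < n; i++) {
                int diff = (x[i] & 0xff) - (y[i] & 0xff);
                if (diff != 0) {
                    return diff;
                }
            }
            // All shared bytes are equal: the shorter key sorts first.
            return x.length - y.length;
        }
    };

    // Returns a sorted copy and leaves the input list untouched.
    static List<String> sorted(List<String> keys) {
        List<String> copy = new ArrayList<String>(keys);
        Collections.sort(copy, UTF8_BYTE_ORDER);
        return copy;
    }

    public static void main(String[] args) {
        // Prints [China, Japan, USA, hello, wenjie, world]
        System.out.println(sorted(Arrays.asList(
                "hello", "world", "China", "wenjie", "USA", "Japan")));
    }
}
```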


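4. Analysis of the MR execution flow
FileInputFormat (lists the input paths to process) -> JobSubmitter (computes the number of splits) -> load configuration -> run the job -> map tasks -> reduce task (shuffle, merge, reduce) -> commit output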
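
The "number of splits" step can be approximated in plain Java. The sketch below is a simplified model of how FileInputFormat sizes its splits in Hadoop 2.x: each file is split separately, the split size is max(minSize, min(maxSize, blockSize)), and a tail smaller than 10% of a split is folded into the last split. The 128 MB block size is an assumption (the article does not state it), zero-length files are not modelled, and the class name SplitCountSketch is hypothetical.

```java
// Hypothetical, simplified model of split counting; not Hadoop's own code.
final class SplitCountSketch {

    // A remainder of up to 1.1 x splitSize is kept as a single split.
    static final double SPLIT_SLOP = 1.1;

    static long computeSplitSize(long blockSize, long minSize, long maxSize) {
        return Math.max(minSize, Math.min(maxSize, blockSize));
    }

    // Number of splits for one non-empty, splittable file.
    static int splitsForFile(long length, long splitSize) {
        int splits = 0;
        long remaining = length;
        while ((double) remaining / splitSize > SPLIT_SLOP) {
            splits++;
            remaining -= splitSize;
        }
        if (remaining != 0) {
            splits++;
        }
        return splits;
    }

    // Files are never merged into one split, so the counts simply add up.
    static int totalSplits(long[] lengths, long splitSize) {
        int total = 0;
        for (long length : lengths) {
            total += splitsForFile(length, splitSize);
        }
        return total;
    }

    public static void main(String[] args) {
        long blockSize = 128L * 1024 * 1024; // assumed HDFS block size
        long splitSize = computeSplitSize(blockSize, 1L, Long.MAX_VALUE);
        // wordcount.txt is 84 bytes and wordcount1.txt is 11 bytes in the log.
        System.out.println(totalSplits(new long[] {84L, 11L}, splitSize)); // prints 2
    }
}
```

Under these assumptions, two tiny files give two splits and therefore two map tasks, which is consistent with "number of splits:2" in the log in section 5.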

5. Analysis of the MR execution log under Eclipse
2015-02-24 22:37:09,812 WARN  [main] util.NativeCodeLoader (  NativeCodeLoader.java:<clinit>(62)  ) - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2015-02-24 22:37:10,673 INFO  [main] Configuration.deprecation ( Configuration.java:warnOnceIfDeprecated(1009)  ) - session.id is deprecated. Instead, use dfs.metrics.session-id
2015-02-24 22:37:10,674 INFO  [main] jvm.JvmMetrics (  JvmMetrics.java:init(76) ) - Initializing JVM Metrics with processName=JobTracker, sessionId=
2015-02-24 22:37:10,958 WARN  [main] mapreduce.JobSubmitter ( JobSubmitter.java:copyAndConfigureFiles(150)  ) - Hadoop command-line option parsing not performed. Implement the Tool interface and execute your application with ToolRunner to remedy this.
2015-02-24 22:37:14,867 INFO  [main] input.FileInputFormat (  FileInputFormat.java:listStatus(280)  ) - Total input paths to process : 2
2015-02-24 22:37:14,987 INFO  [main] mapreduce.JobSubmitter (  JobSubmitter.java:submitJobInternal(396)  ) - number of splits:2
2015-02-24 22:37:15,165 INFO  [main] mapreduce.JobSubmitter (  JobSubmitter.java:printTokens(479)  ) - Submitting tokens for job: job_local1376201216_0001
2015-02-24 22:37:15,193 WARN  [main] conf.Configuration (  Configuration.java:loadProperty(2358)  ) - file:/tmp/hadoop-hadoop/mapred/staging/hadoop1376201216/.staging/job_local1376201216_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2015-02-24 22:37:15,194 WARN  [main] conf.Configuration (  Configuration.java:loadProperty(2358)  ) - file:/tmp/hadoop-hadoop/mapred/staging/hadoop1376201216/.staging/job_local1376201216_0001/job.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2015-02-24 22:37:15,386 WARN  [main] conf.Configuration (  Configuration.java:loadProperty(2358)  ) - file:/tmp/hadoop-hadoop/mapred/local/localRunner/hadoop/job_local1376201216_0001/job_local1376201216_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval;  Ignoring.
2015-02-24 22:37:15,387 WARN  [main] conf.Configuration (  Configuration.java:loadProperty(2358)  ) - file:/tmp/hadoop-hadoop/mapred/local/localRunner/hadoop/job_local1376201216_0001/job_local1376201216_0001.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts;  Ignoring.
2015-02-24 22:37:15,401 INFO  [main] mapreduce.Job (  Job.java:submit(1289) ) - The url to track the job: http://localhost:8080/
2015-02-24 22:37:15,401 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1334)  ) - Running job: job_local1376201216_0001
2015-02-24 22:37:15,431 INFO  [Thread-14] mapred.LocalJobRunner ( LocalJobRunner.java:createOutputCommitter(471)  ) - OutputCommitter set in config null
2015-02-24 22:37:15,453 INFO  [Thread-14] mapred.LocalJobRunner ( LocalJobRunner.java:createOutputCommitter(489)  ) - OutputCommitter is org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
2015-02-24 22:37:15,583 INFO  [Thread-14] mapred.LocalJobRunner (  LocalJobRunner.java:runTasks(448)  ) - Waiting for map tasks
2015-02-24 22:37:15,588 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:run(224)  ) - Starting task: attempt_local1376201216_0001_m_000000_0
2015-02-24 22:37:15,723 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task ( Task.java:initialize(581)  ) -  Using ResourceCalculatorProcessTree : [ ]
2015-02-24 22:37:15,726 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:runNewMapper(733)  ) - Processing split: hdfs://cloud01:9000/wc/wordcount.txt:0+84
2015-02-24 22:37:15,754 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:createSortingCollector(388)  ) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-02-24 22:37:15,876 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:setEquator(1182)  ) - (EQUATOR) 0 kvi 26214396(104857584)
2015-02-24 22:37:15,877 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(975)  ) - mapreduce.task.io.sort.mb: 100
2015-02-24 22:37:15,877 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(976)  ) - soft limit at 83886080
2015-02-24 22:37:15,877 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(977)  ) - bufstart = 0; bufvoid = 104857600
2015-02-24 22:37:15,877 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(978)  ) - kvstart = 26214396; length = 6553600
2015-02-24 22:37:16,435 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1355)  ) - Job job_local1376201216_0001 running in uber mode : false
2015-02-24 22:37:16,552 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1362)  ) -  map 0% reduce 0%
2015-02-24 22:37:17,104 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) -
2015-02-24 22:37:17,106 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1437)  ) - Starting flush of map output
2015-02-24 22:37:17,106 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1455)  ) - Spilling map output
2015-02-24 22:37:17,106 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1456)  ) - bufstart = 0; bufend = 195; bufvoid = 104857600
2015-02-24 22:37:17,106 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1458)  ) - kvstart = 26214396(104857584); kvend = 26214344(104857376); length = 53/6553600
2015-02-24 22:37:17,113 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:sortAndSpill(1641)  ) - Finished spill 0
2015-02-24 22:37:17,116 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task (  Task.java:done(995) ) - Task:attempt_local1376201216_0001_m_000000_0 is done. And is in the process of committing
2015-02-24 22:37:17,123 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) - map
2015-02-24 22:37:17,123 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task ( Task.java:sendDone(1115)  ) - Task 'attempt_local1376201216_0001_m_000000_0' done.
2015-02-24 22:37:17,123 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:run(249)  ) - Finishing task: attempt_local1376201216_0001_m_000000_0
2015-02-24 22:37:17,123 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:run(224)  ) - Starting task: attempt_local1376201216_0001_m_000001_0
2015-02-24 22:37:17,124 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task ( Task.java:initialize(581)  ) -  Using ResourceCalculatorProcessTree : [ ]
2015-02-24 22:37:17,125 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:runNewMapper(733)  ) - Processing split: hdfs://cloud01:9000/wc/wordcount1.txt:0+11
2015-02-24 22:37:17,125 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:createSortingCollector(388)  ) - Map output collector class = org.apache.hadoop.mapred.MapTask$MapOutputBuffer
2015-02-24 22:37:17,172 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:setEquator(1182)  ) - (EQUATOR) 0 kvi 26214396(104857584)
2015-02-24 22:37:17,172 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(975)  ) - mapreduce.task.io.sort.mb: 100
2015-02-24 22:37:17,172 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(976)  ) - soft limit at 83886080
2015-02-24 22:37:17,173 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(977)  ) - bufstart = 0; bufvoid = 104857600
2015-02-24 22:37:17,173 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:init(978)  ) - kvstart = 26214396; length = 6553600
2015-02-24 22:37:17,180 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) -
2015-02-24 22:37:17,180 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1437)  ) - Starting flush of map output
2015-02-24 22:37:17,180 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1455)  ) - Spilling map output
2015-02-24 22:37:17,180 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1456)  ) - bufstart = 0; bufend = 26; bufvoid = 104857600
2015-02-24 22:37:17,180 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:flush(1458)  ) - kvstart = 26214396(104857584); kvend = 26214392(104857568); length = 5/6553600
2015-02-24 22:37:17,182 INFO  [LocalJobRunner Map Task Executor #0] mapred.MapTask ( MapTask.java:sortAndSpill(1641)  ) - Finished spill 0
2015-02-24 22:37:17,183 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task (  Task.java:done(995) ) - Task:attempt_local1376201216_0001_m_000001_0 is done. And is in the process of committing
2015-02-24 22:37:17,187 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) - map
2015-02-24 22:37:17,188 INFO  [LocalJobRunner Map Task Executor #0] mapred.Task ( Task.java:sendDone(1115)  ) - Task 'attempt_local1376201216_0001_m_000001_0' done.
2015-02-24 22:37:17,188 INFO  [LocalJobRunner Map Task Executor #0] mapred.LocalJobRunner ( LocalJobRunner.java:run(249)  ) - Finishing task: attempt_local1376201216_0001_m_000001_0
2015-02-24 22:37:17,188 INFO  [Thread-14] mapred.LocalJobRunner (  LocalJobRunner.java:runTasks(456)  ) - map task executor complete.
2015-02-24 22:37:17,201 INFO  [Thread-14] mapred.LocalJobRunner (  LocalJobRunner.java:runTasks(448)  ) - Waiting for reduce tasks
2015-02-24 22:37:17,201 INFO  [pool-6-thread-1] mapred.LocalJobRunner (  LocalJobRunner.java:run(302)  ) - Starting task: attempt_local1376201216_0001_r_000000_0
2015-02-24 22:37:17,212 INFO  [pool-6-thread-1] mapred.Task (  Task.java:initialize(581) ) -  Using ResourceCalculatorProcessTree : [ ]
2015-02-24 22:37:17,215 INFO  [pool-6-thread-1] mapred.ReduceTask (  ReduceTask.java:run(362) ) - Using ShuffleConsumerPlugin: org.apache.hadoop.mapreduce.task.reduce.Shuffle@b76539
2015-02-24 22:37:17,248 INFO  [pool-6-thread-1] reduce.MergeManagerImpl ( MergeManagerImpl.java:<init>(193)  ) - MergerManager: memoryLimit=178821520, maxSingleShuffleLimit=44705380, mergeThreshold=118022208, ioSortFactor=10, memToMemMergeOutputsThreshold=10
2015-02-24 22:37:17,256 INFO  [EventFetcher for fetching Map Completion Events] reduce.EventFetcher ( EventFetcher.java:run(61)  ) - attempt_local1376201216_0001_r_000000_0 Thread started: EventFetcher for fetching Map Completion Events
2015-02-24 22:37:17,374 INFO  [localfetcher#1] reduce.LocalFetcher ( LocalFetcher.java:copyMapOutput(140)  ) - localfetcher#1 about to shuffle output of map attempt_local1376201216_0001_m_000000_0 decomp: 225 len: 229 to MEMORY
2015-02-24 22:37:17,412 INFO  [localfetcher#1] reduce.InMemoryMapOutput ( InMemoryMapOutput.java:shuffle(100)  ) - Read 225 bytes from map-output for attempt_local1376201216_0001_m_000000_0
2015-02-24 22:37:17,421 INFO  [localfetcher#1] reduce.MergeManagerImpl ( MergeManagerImpl.java:closeInMemoryFile(307)  ) - closeInMemoryFile -> map-output of size: 225, inMemoryMapOutputs.size() -> 1, commitMemory -> 0, usedMemory ->225
2015-02-24 22:37:17,462 INFO  [localfetcher#1] reduce.LocalFetcher ( LocalFetcher.java:copyMapOutput(140)  ) - localfetcher#1 about to shuffle output of map attempt_local1376201216_0001_m_000001_0 decomp: 32 len: 36 to MEMORY
2015-02-24 22:37:17,468 INFO  [localfetcher#1] reduce.InMemoryMapOutput ( InMemoryMapOutput.java:shuffle(100)  ) - Read 32 bytes from map-output for attempt_local1376201216_0001_m_000001_0
2015-02-24 22:37:17,469 INFO  [localfetcher#1] reduce.MergeManagerImpl ( MergeManagerImpl.java:closeInMemoryFile(307)  ) - closeInMemoryFile -> map-output of size: 32, inMemoryMapOutputs.size() -> 2, commitMemory -> 225, usedMemory ->257
2015-02-24 22:37:17,477 INFO  [EventFetcher for fetching Map Completion Events] reduce.EventFetcher ( EventFetcher.java:run(76)  ) - EventFetcher is interrupted.. Returning
2015-02-24 22:37:17,480 INFO  [pool-6-thread-1] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) - 2 / 2 copied.
2015-02-24 22:37:17,482 INFO  [pool-6-thread-1] reduce.MergeManagerImpl ( MergeManagerImpl.java:finalMerge(667)  ) - finalMerge called with 2 in-memory map-outputs and 0 on-disk map-outputs
2015-02-24 22:37:17,546 INFO  [pool-6-thread-1] mapred.Merger (  Merger.java:merge(591) ) - Merging 2 sorted segments
2015-02-24 22:37:17,547 INFO  [pool-6-thread-1] mapred.Merger (  Merger.java:merge(690) ) - Down to the last merge-pass, with 2 segments left of total size: 243 bytes
2015-02-24 22:37:17,555 INFO  [pool-6-thread-1] reduce.MergeManagerImpl ( MergeManagerImpl.java:finalMerge(742)  ) - Merged 2 segments, 257 bytes to disk to satisfy reduce memory limit
2015-02-24 22:37:17,556 INFO  [pool-6-thread-1] reduce.MergeManagerImpl ( MergeManagerImpl.java:finalMerge(772)  ) - Merging 1 files, 259 bytes from disk
2015-02-24 22:37:17,563 INFO  [pool-6-thread-1] reduce.MergeManagerImpl ( MergeManagerImpl.java:finalMerge(787)  ) - Merging 0 segments, 0 bytes from memory into reduce
2015-02-24 22:37:17,563 INFO  [pool-6-thread-1] mapred.Merger (  Merger.java:merge(591) ) - Merging 1 sorted segments
2015-02-24 22:37:17,566 INFO  [pool-6-thread-1] mapred.Merger (  Merger.java:merge(690) ) - Down to the last merge-pass, with 1 segments left of total size: 247 bytes
2015-02-24 22:37:17,568 INFO  [pool-6-thread-1] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) - 2 / 2 copied.
2015-02-24 22:37:17,576 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1362)  ) -  map 100% reduce 0%
2015-02-24 22:37:17,820 INFO  [pool-6-thread-1] Configuration.deprecation ( Configuration.java:warnOnceIfDeprecated(1009)  ) - mapred.skip.on is deprecated. Instead, use mapreduce.job.skiprecords
2015-02-24 22:37:18,045 INFO  [pool-6-thread-1] mapred.Task (  Task.java:done(995) ) - Task:attempt_local1376201216_0001_r_000000_0 is done. And is in the process of committing
2015-02-24 22:37:18,049 INFO  [pool-6-thread-1] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) - 2 / 2 copied.
2015-02-24 22:37:18,049 INFO  [pool-6-thread-1] mapred.Task (  Task.java:commit(1156) ) - Task attempt_local1376201216_0001_r_000000_0 is allowed to commit now
2015-02-24 22:37:18,074 INFO  [pool-6-thread-1] output.FileOutputCommitter ( FileOutputCommitter.java:commitTask(439)  ) - Saved output of task 'attempt_local1376201216_0001_r_000000_0' to hdfs://cloud01:9000/tmp/_temporary/0/task_local1376201216_0001_r_000000
2015-02-24 22:37:18,075 INFO  [pool-6-thread-1] mapred.LocalJobRunner ( LocalJobRunner.java:statusUpdate(591)  ) - reduce > reduce
2015-02-24 22:37:18,075 INFO  [pool-6-thread-1] mapred.Task (  Task.java:sendDone(1115) ) - Task 'attempt_local1376201216_0001_r_000000_0' done.
2015-02-24 22:37:18,076 INFO  [pool-6-thread-1] mapred.LocalJobRunner (  LocalJobRunner.java:run(325)  ) - Finishing task: attempt_local1376201216_0001_r_000000_0
2015-02-24 22:37:18,076 INFO  [Thread-14] mapred.LocalJobRunner (  LocalJobRunner.java:runTasks(456)  ) - reduce task executor complete.
2015-02-24 22:37:18,578 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1362)  ) -  map 100% reduce 100%
2015-02-24 22:37:18,578 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1373)  ) - Job job_local1376201216_0001 completed successfully
2015-02-24 22:37:18,590 INFO  [main] mapreduce.Job (  Job.java:monitorAndPrintJob(1380)  ) - Counters: 38
                File System Counters
                                FILE: Number of bytes read=429505752
                                FILE: Number of bytes written=433529504
                                FILE: Number of read operations=0
                                FILE: Number of large read operations=0
                                FILE: Number of write operations=0
                                HDFS: Number of bytes read=274
                                HDFS: Number of bytes written=47
                                HDFS: Number of read operations=28
                                HDFS: Number of large read operations=0
                                HDFS: Number of write operations=8
                Map-Reduce Framework
                                Map input records=9
                                Map output records=16
                                Map output bytes=221
                                Map output materialized bytes=265
                                Input split bytes=203
                                Combine input records=0
                                Combine output records=0
                                Reduce input groups=6
                                Reduce shuffle bytes=265
                                Reduce input records=16
                                Reduce output records=6
                                Spilled Records=32
                                Shuffled Maps =2
                                Failed Shuffles=0
                                Merged Map outputs=2
                                GC time elapsed (ms)=70
                                CPU time spent (ms)=0
                                Physical memory (bytes) snapshot=0
                                Virtual memory (bytes) snapshot=0
                                Total committed heap usage (bytes)=457125888
                Shuffle Errors
                                BAD_ID=0
                                CONNECTION=0
                                IO_ERROR=0
                                WRONG_LENGTH=0
                                WRONG_MAP=0
                                WRONG_REDUCE=0
                File Input Format Counters
                                Bytes Read=95
                File Output Format Counters
                                Bytes Written=47
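
The counters above show Combine input records=0 and Reduce input records=16: no combiner ran, so every map output record was shuffled to the reducer. Because addition is associative and commutative, the same summing logic can be applied to each map task's output first. In Hadoop this is normally switched on with job.setCombinerClass(WCReducer.class), a call the program above does not make. The plain-Java sketch below only illustrates the effect on record counts; the class name CombinerSketch is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical illustration of per-split pre-aggregation.
final class CombinerSketch {

    // What one map task would hand to the shuffle if a combiner ran.
    static Map<String, Long> combine(List<String> splitLines) {
        Map<String, Long> partial = new TreeMap<String, Long>();
        for (String line : splitLines) {
            StringTokenizer st = new StringTokenizer(line, " ");
            while (st.hasMoreTokens()) {
                String word = st.nextToken();
                Long old = partial.get(word);
                partial.put(word, old == null ? 1L : old + 1L);
            }
        }
        return partial;
    }

    // Reduce side: add up the partial sums coming from every split.
    static Map<String, Long> merge(List<Map<String, Long>> partials) {
        Map<String, Long> total = new TreeMap<String, Long>();
        for (Map<String, Long> partial : partials) {
            for (Map.Entry<String, Long> e : partial.entrySet()) {
                Long old = total.get(e.getKey());
                total.put(e.getKey(), old == null ? e.getValue() : old + e.getValue());
            }
        }
        return total;
    }

    public static void main(String[] args) {
        Map<String, Long> split1 = combine(Arrays.asList(
                "hello world", "hello China", "hello wenjie", "hello USA",
                "hello China", "hello China", "hello Japan"));
        Map<String, Long> split2 = combine(Arrays.asList("hello USA"));
        // 6 + 2 = 8 records would be shuffled instead of 16.
        System.out.println(split1.size() + split2.size());
        System.out.println(merge(Arrays.asList(split1, split2)));
    }
}
```

The final counts are unchanged; only the amount of data crossing the shuffle shrinks, which matters far more on real inputs than on this 95-byte example.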








http://www.chinasem.cn/article/431456
