Posted to common-user@hadoop.apache.org by Tri Doan <tr...@k-state.edu> on 2010/10/16 17:43:41 UTC

Please clarify on this

Saturday
Hi Harsh J

Since I use the map function to emit a pair of (file id, content), which the reduce function then uses to combine all text content belonging to the same file before extracting only the content between <title> </title> and <text> </text>, I thought I could replace OutputCollector<Text, IntWritable> with OutputCollector<Text, Text> to achieve this goal.

Can I emit a Text object? Or what should I do to emit an object such as Text keyed by file id, so that reduce can combine the values for further processing?

Best regards,

Tri Doan
1429 Laramie Apt 3, Manhattan
KS 66502
USA

----- Original Message -----
From: "Harsh J" <qw...@gmail.com>
To: common-user@hadoop.apache.org
Sent: Saturday, October 16, 2010 8:23:08 AM
Subject: Re: how to fix this error

Your mapper must emit an IntWritable as the value type if you want to use that in your reducer. Right now you are emitting a Text object instead.
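
For reference, the old mapred API requires the reducer's declared input types to match the mapper's declared output types. A minimal sketch of a matching pair (the class names here are illustrative, not from this thread's code):

import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;

public class TypeContract {
  // The mapper's 3rd/4th type parameters (Text, Text) ...
  public static class M extends MapReduceBase
      implements Mapper<LongWritable, Text, Text, Text> {
    public void map(LongWritable key, Text value,
        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      output.collect(new Text("file-id"), value); // emits Text values
    }
  }

  // ... must match the reducer's 1st/2nd type parameters, or the framework
  // throws a ClassCastException at runtime when it feeds the reducer.
  public static class R extends MapReduceBase
      implements Reducer<Text, Text, Text, Text> {
    public void reduce(Text key, Iterator<Text> values,
        OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      while (values.hasNext()) {
        output.collect(key, values.next());
      }
    }
  }
}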

On Oct 16, 2010 8:27 PM, "Tri Doan" <tr...@k-state.edu> wrote:

Saturday

I would like to modify the simple word count program so that I can produce text files from the given HTML files (by extracting only the text content between <title> and </title> and between <text> and </text>). When I try to modify the map and reduce tasks, it seems that I cannot overwrite IntWritable. The error is:
10/10/16 09:07:18 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/10/16 09:07:18 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/10/16 09:07:18 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/10/16 09:07:18 INFO mapred.FileInputFormat: Total input paths to process : 20
10/10/16 09:07:19 INFO mapred.JobClient: Running job: job_local_0001
10/10/16 09:07:19 INFO mapred.FileInputFormat: Total input paths to process : 20
10/10/16 09:07:19 INFO mapred.MapTask: numReduceTasks: 1
10/10/16 09:07:19 INFO mapred.MapTask: io.sort.mb = 100
10/10/16 09:07:19 INFO mapred.MapTask: data buffer = 79691776/99614720
10/10/16 09:07:19 INFO mapred.MapTask: record buffer = 262144/327680
10/10/16 09:07:19 INFO mapred.MapTask: Starting flush of map output
10/10/16 09:07:20 INFO mapred.JobClient:  map 0% reduce 0%
10/10/16 09:07:21 WARN mapred.LocalJobRunner: job_local_0001
java.lang.ClassCastException: org.apache.hadoop.io.Text cannot be cast to org.apache.hadoop.io.IntWritable   <<<--------------------
       at WordProcess$Reduce.reduce(WordProcess.java:44)
       at WordProcess$Reduce.reduce(WordProcess.java:1)
       at org.apache.hadoop.mapred.Task$OldCombinerRunner.combine(Task.java:1151)
       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1265)
       at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1129)
       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:359)
       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
       at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
10/10/16 09:07:22 INFO mapred.JobClient: Job complete: job_local_0001
10/10/16 09:07:22 INFO mapred.JobClient: Counters: 0
Exception in thread "main" java.io.IOException: Job failed!   <--------------------------------------
       at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
       at WordProcess.main(WordProcess.java:88)


My code is:


import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;


public class WordProcess {

  public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
    private Text id = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      // Key each line by the name of the file it came from, so the reducer
      // receives all lines of one file together.
      FileSplit fileSplit = (FileSplit) reporter.getInputSplit();
      id.set(fileSplit.getPath().getName());
      output.collect(id, value); // "value" already holds the line's text
    }
  }

  // BUG (the ClassCastException above): values are declared as IntWritable
  // here, but the mapper emits Text.
  public static class Reduce extends MapReduceBase implements Reducer<Text, IntWritable, Text, Text> {

    public void reduce(Text key, Iterator<IntWritable> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      String str = "";
      String substr1, substr2;
      Text text = new Text();
      // Concatenate all values for this file into one long string.
      while (values.hasNext()) {
        str = str.concat(values.next().toString());
      }
      // Locate the tags and extract only the content between them.
      int x1 = str.indexOf("<TITLE>");
      int y1 = str.indexOf("</TITLE>");
      substr1 = str.substring(x1 + 7, y1);   // "<TITLE>" is 7 characters

      int x2 = str.indexOf("<TEXT>");
      int y2 = str.indexOf("</TEXT>");
      substr2 = str.substring(x2 + 6, y2);   // "<TEXT>" is 6 characters

      str = substr1 + " " + substr2;

      text.set(str);
      output.collect(key, text);
      System.out.println(key + "," + text);
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordProcess.class);
    conf.setJobName("wordprocess");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);

    conf.setMapperClass(Map.class);

    // NOTE: the reducer also runs as a combiner on the map output during
    // the spill, which is where the cast fails.
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // Delete the output directory if it already exists.
    FileSystem.get(conf).delete(new Path(args[1]), true);

    JobClient.runJob(conf);
  }

}




If anyone has experience with this problem, please tell me how to fix it. Thanks in advance.

Best regards,
Tri Doan
1429 Laramie Apt 3, Manhattan
KS 66502
USA

For the job fail error

Posted by Tri Doan <tr...@k-state.edu>.
Hi Harsh J

Thanks, it works. I see that it scans through all 20 data text files (the exact number of files), except that at the last step it gives a "Job failed!" error, which I mark below with <<<---------------------

Just before that there is a java.lang.StringIndexOutOfBoundsException: String index out of range: -7. How do I fix it? I attach the modified code again; a guarded version of the tag extraction follows the listing.


10/10/16 11:25:53 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
10/10/16 11:25:53 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
10/10/16 11:25:53 WARN mapred.JobClient: No job jar file set.  User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
10/10/16 11:25:53 INFO mapred.FileInputFormat: Total input paths to process : 20
10/10/16 11:25:54 INFO mapred.JobClient: Running job: job_local_0001
10/10/16 11:25:54 INFO mapred.FileInputFormat: Total input paths to process : 20
10/10/16 11:25:54 INFO mapred.MapTask: numReduceTasks: 1
10/10/16 11:25:54 INFO mapred.MapTask: io.sort.mb = 100
10/10/16 11:25:55 INFO mapred.MapTask: data buffer = 79691776/99614720
10/10/16 11:25:55 INFO mapred.MapTask: record buffer = 262144/327680
10/10/16 11:25:55 INFO mapred.MapTask: Starting flush of map output
10/10/16 11:25:55 INFO mapred.JobClient:  map 0% reduce 0%
10/10/16 11:25:57 INFO mapred.MapTask: Finished spill 0
....
riations inboth the hypersonic similarity parameter k = m(d1) and the ratioof specific heats y .  reviewer believes this generalization should be of interest tothose engaged in development of hypersonic hardware as well astheory .
10/10/16 11:26:00 INFO mapred.MapTask: Finished spill 0
10/10/16 11:26:00 INFO mapred.TaskRunner: Task:attempt_local_0001_m_000019_0 is done. And is in the process of commiting
10/10/16 11:26:00 INFO mapred.LocalJobRunner: file:/F:/work-java/WordCount/input/cranfield0020:0+1245
10/10/16 11:26:00 INFO mapred.TaskRunner: Task 'attempt_local_0001_m_000019_0' done.
10/10/16 11:26:01 INFO mapred.LocalJobRunner: 
10/10/16 11:26:01 INFO mapred.Merger: Merging 20 sorted segments
10/10/16 11:26:01 INFO mapred.Merger: Merging 2 intermediate segments out of a total of 20
10/10/16 11:26:01 INFO mapred.Merger: Merging 10 intermediate segments out of a total of 19
10/10/16 11:26:01 INFO mapred.Merger: Down to the last merge-pass, with 10 segments left of total size: 18674 bytes
10/10/16 11:26:01 INFO mapred.LocalJobRunner: 
10/10/16 11:26:01 WARN mapred.LocalJobRunner: job_local_0001
java.lang.StringIndexOutOfBoundsException: String index out of range: -7   <<------------------ does this cause the job fail error?
	at java.lang.String.substring(Unknown Source)
	at WordProcess$Reduce.reduce(WordProcess.java:50)
	at WordProcess$Reduce.reduce(WordProcess.java:1)
	at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
	at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
10/10/16 11:26:01 INFO mapred.JobClient: Job complete: job_local_0001
10/10/16 11:26:01 INFO mapred.JobClient: Counters: 13
10/10/16 11:26:01 INFO mapred.JobClient:   FileSystemCounters
10/10/16 11:26:01 INFO mapred.JobClient:     FILE_BYTES_READ=484226
10/10/16 11:26:01 INFO mapred.JobClient:     FILE_BYTES_WRITTEN=801554
10/10/16 11:26:01 INFO mapred.JobClient:   Map-Reduce Framework
10/10/16 11:26:01 INFO mapred.JobClient:     Reduce input groups=0
10/10/16 11:26:01 INFO mapred.JobClient:     Combine output records=20
10/10/16 11:26:01 INFO mapred.JobClient:     Map input records=629
10/10/16 11:26:01 INFO mapred.JobClient:     Reduce shuffle bytes=0
10/10/16 11:26:01 INFO mapred.JobClient:     Reduce output records=0
10/10/16 11:26:01 INFO mapred.JobClient:     Spilled Records=33
10/10/16 11:26:01 INFO mapred.JobClient:     Map output bytes=30457
10/10/16 11:26:01 INFO mapred.JobClient:     Map input bytes=21651
10/10/16 11:26:01 INFO mapred.JobClient:     Combine input records=629
10/10/16 11:26:01 INFO mapred.JobClient:     Map output records=629
10/10/16 11:26:01 INFO mapred.JobClient:     Reduce input records=0
Exception in thread "main" java.io.IOException: Job failed!                      <<---------------------------- error
	at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1252)
	at WordProcess.main(WordProcess.java:87)


------- newly modified code


import java.io.IOException;
import java.util.*;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.conf.*;
import org.apache.hadoop.io.*;
import org.apache.hadoop.mapred.*;
import org.apache.hadoop.util.*;

public class WordProcess {

  public static class Map extends MapReduceBase implements Mapper<LongWritable, Text, Text, Text> {
    private Text id = new Text();

    public void map(LongWritable key, Text value, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      // Key each line by the name of the file it came from.
      FileSplit fileSplit = (FileSplit) reporter.getInputSplit();
      id.set(fileSplit.getPath().getName());
      output.collect(id, value);
    }
  }

  public static class Reduce extends MapReduceBase implements Reducer<Text, Text, Text, Text> {

    public void reduce(Text key, Iterator<Text> values, OutputCollector<Text, Text> output, Reporter reporter) throws IOException {
      String str = "";
      String substr1, substr2;
      Text text = new Text();
      // Convert all values for this file into one long string to process.
      while (values.hasNext()) {
        str = str.concat(values.next().toString());
      }
      // Locate the tags and extract the content between them.
      // CAUTION: if a tag is missing, indexOf returns -1 and substring
      // throws StringIndexOutOfBoundsException (e.g. a start of -1 + 7 = 6
      // with an end of -1 gives "index out of range: -7").
      int x1 = str.indexOf("<TITLE>");
      int y1 = str.indexOf("</TITLE>");
      substr1 = str.substring(x1 + 7, y1);

      int x2 = str.indexOf("<TEXT>");
      int y2 = str.indexOf("</TEXT>");
      substr2 = str.substring(x2 + 6, y2);

      str = substr1 + " " + substr2;

      text.set(str);
      output.collect(key, text);
      System.out.println(key + "," + text);
    }
  }

  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordProcess.class);
    conf.setJobName("wordprocess");

    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);

    conf.setMapperClass(Map.class);

    // Likely culprit for the exception above: the combiner runs this same
    // reduce() on the map output during the spill and strips the
    // <TITLE>/<TEXT> tags there, so the real reducer's indexOf calls
    // return -1.
    conf.setCombinerClass(Reduce.class);
    conf.setReducerClass(Reduce.class);

    conf.setInputFormat(TextInputFormat.class);
    conf.setOutputFormat(TextOutputFormat.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // Delete the output directory if it already exists.
    FileSystem.get(conf).delete(new Path(args[1]), true);

    JobClient.runJob(conf);
  }

}
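
A minimal guard for the tag extraction, assuming either tag may be absent from the reducer's input (the helper name "between" is illustrative, not part of the original code):

    // Returns the text between the open and close tags, or "" when either
    // tag is missing or malformed, instead of letting substring() throw.
    private static String between(String str, String open, String close) {
        int start = str.indexOf(open);
        int end = str.indexOf(close);
        if (start < 0 || end < start + open.length()) {
            return "";
        }
        return str.substring(start + open.length(), end);
    }

With that, the reduce body can use substr1 = between(str, "<TITLE>", "</TITLE>") and substr2 = between(str, "<TEXT>", "</TEXT>"); removing the conf.setCombinerClass(Reduce.class) line then keeps the tags intact until the real reduce runs.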



Tri Doan
1429 Laramie Apt 3, Manhattan
KS 66502
USA


Re: Please clarify on this

Posted by Harsh J <qw...@gmail.com>.
You can emit whichever Writable you like, but as per your given code your Reducer class (the class definition line, specifically) is looking for an IntWritable in the value's iterator. Change that to Text and it should do what you expect :)

For reference, look into JobConf.setMapOutputKeyClass(...) and JobConf.setMapOutputValueClass(...) when it comes to explicitly configuring the intermediate key and value types (which become the reducer's input).
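
For instance, a minimal sketch of setting the intermediate types on this thread's JobConf (the values shown are illustrative; explicit calls are only needed when the map output types differ from the final output types):

    // Types of the (key, value) pairs the mapper emits; without these calls,
    // Hadoop assumes they equal setOutputKeyClass()/setOutputValueClass().
    conf.setMapOutputKeyClass(Text.class);
    conf.setMapOutputValueClass(Text.class);
    // Types of the (key, value) pairs the reducer finally writes.
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(Text.class);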
