Posted to mapreduce-user@hadoop.apache.org by Fatih Haltas <fa...@nyu.edu> on 2013/02/27 08:33:16 UTC
java.lang.NumberFormatException and Thanks to Hemanth and Harsh
Hi all,
First, I would like to thank you all, especially Hemanth and Harsh.
I solved my problem: it was indeed a Java version and Hadoop version
incompatibility, and I can now run my compiled and jarred MapReduce program.
I have a different question now. I wrote a job that counts each IP's
packets within a given time interval in NetFlow data.
However, I am getting a java.lang.NumberFormatException.
*************************************************************************
1. Here is my code in Java:
*************************************************************************
package org.myorg;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class MapReduce extends Configured implements Tool
{
    public int run(String[] args) throws Exception
    {
        System.out.println("Debug1");
        if (args.length != 2)
        {
            System.err.println("Usage: MapReduce <input path> <output path>");
            ToolRunner.printGenericCommandUsage(System.err);
            return -1;
        }
        Job job = new Job();
        job.setJarByClass(MapReduce.class);
        System.out.println("Debug2");
        job.setJobName("MaximumPacketFlowIP");
        System.out.println("Debug3");
        FileInputFormat.addInputPath(job, new Path(args[0]));
        System.out.println("Debug8");
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.out.println("Debug9");
        job.setMapperClass(FlowPortMapper.class);
        System.out.println("Debug6");
        job.setReducerClass(FlowPortReducer.class);
        System.out.println("Debug7");
        job.setOutputKeyClass(Text.class);
        System.out.println("Debug4");
        job.setOutputValueClass(IntWritable.class);
        System.out.println("Debug5");
        //System.exit(job.waitForCompletion(true) ? 0 : 1);
        return job.waitForCompletion(true) ? 0 : 1;
    }

    /* ---------------------- main ---------------------- */
    public static void main(String[] args) throws Exception
    {
        int exitCode = ToolRunner.run(new MapReduce(), args);
        System.exit(exitCode);
    }

    /* ---------------------- Mapper ---------------------- */
    static class FlowPortMapper extends Mapper<LongWritable, Text, Text, IntWritable>
    {
        public void map(LongWritable key, Text value, Context context)
                throws IOException, InterruptedException
        {
            String flow = value.toString();
            long starttime = 0;
            long endtime = 0;
            long time1 = 1357289339;
            long time2 = 1357289342;
            StringTokenizer line = new StringTokenizer(flow);
            String internalip = "i";

            // Get the internal IP from the flow
            if (line.hasMoreTokens())
                internalip = line.nextToken();

            // Get the start time and end time from the flow
            for (int i = 0; i < 9; i++)
                if (line.hasMoreTokens())
                    starttime = Long.parseLong(line.nextToken());
            if (line.hasMoreTokens())
                endtime = Long.parseLong(line.nextToken());

            // If the time is in the given interval, emit 1
            if (starttime >= time1 && endtime <= time2)
                context.write(new Text(internalip), new IntWritable(1));
        }
    }

    /* ---------------------- Reducer ---------------------- */
    static class FlowPortReducer extends Reducer<Text, IntWritable, Text, IntWritable>
    {
        public void reduce(Text key, Iterable<IntWritable> values, Context context)
                throws IOException, InterruptedException
        {
            int numberOfOccurence = 0;
            for (IntWritable value : values)
                numberOfOccurence += value.get();
            context.write(key, new IntWritable(numberOfOccurence));
        }
    }
}
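
Note where the parse can fail: the mapper calls Long.parseLong() on every one of the
nine tokens following the IP, so a single non-numeric field anywhere in that range
(for example, a header line in the export) throws. The following standalone sketch
(no Hadoop dependency; the field layout — IP first, start time as the 9th token after
it, end time as the 10th — is my assumption from the mapper above) shows a defensive
variant that parses only the two fields it needs and returns null for malformed lines:

```java
import java.util.StringTokenizer;

// Standalone sketch: extract (starttime, endtime) from a flow line,
// returning null when the expected numeric fields are missing or malformed.
public class FlowLineParser {
    static long[] parseTimes(String flow) {
        StringTokenizer line = new StringTokenizer(flow);
        if (!line.hasMoreTokens()) return null;
        line.nextToken();                          // skip the IP field
        String start = null, end = null;
        for (int i = 0; i < 9 && line.hasMoreTokens(); i++)
            start = line.nextToken();              // 9th token after the IP
        if (line.hasMoreTokens())
            end = line.nextToken();                // 10th token after the IP
        if (start == null || end == null) return null;
        try {
            return new long[] { Long.parseLong(start), Long.parseLong(end) };
        } catch (NumberFormatException e) {
            return null;                           // e.g. a header line containing "Data"
        }
    }

    public static void main(String[] args) {
        // Hypothetical flow line and header line, just for illustration
        String good = "10.0.0.1 a b c d e f g h 1357289340 1357289341";
        String header = "NetFlow Data dump";
        System.out.println(parseTimes(good) != null);   // prints: true
        System.out.println(parseTimes(header) == null); // prints: true
    }
}
```

A mapper built this way would simply not emit anything for bad records instead
of killing the task attempt.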
******************************************************************************
2. Here is the error I am getting:
******************************************************************************
[hadoop@ADUAE042-LAP-V flowtimeclasses2602]$ hadoop jar flowtime2602_1841.jar org.myorg.MapReduce /user/hadoop/NetFlow2 test1106.out
Warning: $HADOOP_HOME is deprecated.
Debug1
Debug2
Debug3
Debug8
Debug9
Debug6
Debug7
Debug4
Debug5
13/02/27 10:56:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/27 10:56:00 INFO input.FileInputFormat: Total input paths to process : 1
13/02/27 10:56:00 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/02/27 10:56:00 WARN snappy.LoadSnappy: Snappy native library not loaded
13/02/27 10:56:00 INFO mapred.JobClient: Running job: job_201302261146_0014
13/02/27 10:56:01 INFO mapred.JobClient:  map 0% reduce 0%
13/02/27 10:56:13 INFO mapred.JobClient: Task Id : attempt_201302261146_0014_m_000000_0, Status : FAILED
java.lang.NumberFormatException: For input string: "Data"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:410)
    at java.lang.Long.parseLong(Long.java:468)
    at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:106)
    at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:86)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/02/27 10:56:20 INFO mapred.JobClient: Task Id : attempt_201302261146_0014_m_000000_1, Status : FAILED
java.lang.NumberFormatException: For input string: "Data"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:410)
    at java.lang.Long.parseLong(Long.java:468)
    at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:106)
    at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:86)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/02/27 10:56:28 INFO mapred.JobClient: Task Id : attempt_201302261146_0014_m_000000_2, Status : FAILED
java.lang.NumberFormatException: For input string: "Data"
    at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
    at java.lang.Long.parseLong(Long.java:410)
    at java.lang.Long.parseLong(Long.java:468)
    at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:106)
    at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:86)
    at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
    at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
    at org.apache.hadoop.mapred.Child.main(Child.java:249)
13/02/27 10:56:40 INFO mapred.JobClient: Job complete: job_201302261146_0014
13/02/27 10:56:40 INFO mapred.JobClient: Counters: 8
13/02/27 10:56:40 INFO mapred.JobClient:   Job Counters
13/02/27 10:56:40 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33174
13/02/27 10:56:40 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/02/27 10:56:40 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/02/27 10:56:40 INFO mapred.JobClient:     Rack-local map tasks=1
13/02/27 10:56:40 INFO mapred.JobClient:     Launched map tasks=4
13/02/27 10:56:40 INFO mapred.JobClient:     Data-local map tasks=3
13/02/27 10:56:40 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/02/27 10:56:40 INFO mapred.JobClient:     Failed map tasks=1
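
To read the stack trace: the top application frame at MapReduce.java:106 is one of
the Long.parseLong() calls in the mapper, and the message means parseLong() was
handed the literal token "Data" from some input line, which is not a valid long.
The failure is easy to reproduce in isolation:

```java
// Minimal reproduction of the exception shown in the task logs:
// Long.parseLong() rejects any token that is not a decimal number.
public class ParseDemo {
    public static void main(String[] args) {
        try {
            Long.parseLong("Data");
        } catch (NumberFormatException e) {
            System.out.println(e.getMessage()); // prints: For input string: "Data"
        }
    }
}
```

So the likely cause is that the input file contains at least one line (a header, a
banner, or a short line) whose fields are not all numeric where the mapper expects
numbers; inspecting the first lines of /user/hadoop/NetFlow2 (e.g. with hadoop dfs
-cat piped through head) should reveal it.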
**********************************************************************************
3. As you can see from the code, I added some debug print statements after certain lines.
**********************************************************************************
4. My data is NetFlow data. It was originally a .txt file, which I uploaded
to DFS; its info is as follows:
********************************************************
[hadoop@ADUAE042-LAP-V flowtimeclasses2602]$ hadoop dfs -lsr /user/hadoop/NetFlow2
Warning: $HADOOP_HOME is deprecated.
-rw-r--r--   2 hadoop supergroup   14187484 2013-02-26 12:03 /user/hadoop/NetFlow2
*********************************************************
What may be the reason for this error, and how can I fix it?