Posted to mapreduce-user@hadoop.apache.org by Fatih Haltas <fa...@nyu.edu> on 2013/02/27 08:33:16 UTC

java.lang.NumberFormatException and Thanks to Hemanth and Harsh

Hi all,

First, I would like to thank you all, especially Hemanth and Harsh.

I solved my problem; it was indeed caused by an incompatibility between the Java
version and the Hadoop version. Now I can run my compiled and jarred MapReduce
program.

I have a different question now. I wrote a program that counts each IP's number
of packets within a given time interval in NetFlow data.

However, I am getting a java.lang.NumberFormatException.
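
Just so I am sure I read the error right: as far as I understand, this is the
exception Long.parseLong throws whenever the token it gets is not a plain
number, for example the word "Data". A tiny standalone illustration (not part
of the job):

public class ParseCheck
{
        public static void main(String[] args)
        {
                // Throws java.lang.NumberFormatException: For input string: "Data"
                System.out.println(Long.parseLong("Data"));
        }
}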
*************************************************************************
1. Here is my code in Java:
*************************************************************************
package org.myorg;

import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.conf.*;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.*;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;



public class MapReduce extends Configured implements Tool
{

        public int run(String[] args) throws Exception
        {
                System.out.println("Debug1");

                if (args.length != 2)
                {
                        System.err.println("Usage: MapReduce <input path> <output path>");
                        ToolRunner.printGenericCommandUsage(System.err);
                }

                Job job = new Job();

                job.setJarByClass(MapReduce.class);
                System.out.println("Debug2");
                job.setJobName("MaximumPacketFlowIP");
                System.out.println("Debug3");

                FileInputFormat.addInputPath(job, new Path(args[0]));
                System.out.println("Debug8");
                FileOutputFormat.setOutputPath(job, new Path(args[1]));
                System.out.println("Debug9");

                job.setMapperClass(FlowPortMapper.class);
                System.out.println("Debug6");
                job.setReducerClass(FlowPortReducer.class);
                System.out.println("Debug7");

                job.setOutputKeyClass(Text.class);
                System.out.println("Debug4");
                job.setOutputValueClass(IntWritable.class);
                System.out.println("Debug5");

                //System.exit(job.waitForCompletion(true) ? 0 : 1);
                return job.waitForCompletion(true) ? 0 : 1;
        }

        /* ----------------------main--------------------- */
        public static void main(String[] args) throws Exception
        {
                int exitCode = ToolRunner.run(new MapReduce(), args);
                System.exit(exitCode);
        }

        /* --------------------------------Mapper-------------------------------- */
        static class FlowPortMapper extends Mapper<LongWritable, Text, Text, IntWritable>
        {
                public void map(LongWritable key, Text value, Context context)
                                throws IOException, InterruptedException
                {
                        String flow = value.toString();
                        long starttime = 0;
                        long endtime = 0;
                        long time1 = 1357289339;
                        long time2 = 1357289342;
                        StringTokenizer line = new StringTokenizer(flow);
                        String internalip = "i";

                        // Getting the internalip from flow
                        if (line.hasMoreTokens())
                                internalip = line.nextToken();

                        // Getting the starttime and endtime from flow
                        for (int i = 0; i < 9; i++)
                                if (line.hasMoreTokens())
                                        starttime = Long.parseLong(line.nextToken());
                        if (line.hasMoreTokens())
                                endtime = Long.parseLong(line.nextToken());

                        // If the time is in the given interval then emit 1
                        if (starttime >= time1 && endtime <= time2)
                                context.write(new Text(internalip), new IntWritable(1));
                }
        }

        /* --------------------Reducer-------------------- */
        static class FlowPortReducer extends Reducer<Text, IntWritable, Text, IntWritable>
        {
                public void reduce(Text key, Iterable<IntWritable> values, Context context)
                                throws IOException, InterruptedException
                {
                        int numberOfOccurence = 0;
                        for (IntWritable value : values)
                                numberOfOccurence += value.get();
                        context.write(key, new IntWritable(numberOfOccurence));
                }
        }
}

******************************************************************************
2. Here is the error I am getting:
******************************************************************************
[hadoop@ADUAE042-LAP-V flowtimeclasses2602]$ hadoop jar flowtime2602_1841.jar org.myorg.MapReduce /user/hadoop/NetFlow2 test1106.out
Warning: $HADOOP_HOME is deprecated.

Debug1
Debug2
Debug3
Debug8
Debug9
Debug6
Debug7
Debug4
Debug5
13/02/27 10:56:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
13/02/27 10:56:00 INFO input.FileInputFormat: Total input paths to process : 1
13/02/27 10:56:00 INFO util.NativeCodeLoader: Loaded the native-hadoop library
13/02/27 10:56:00 WARN snappy.LoadSnappy: Snappy native library not loaded
13/02/27 10:56:00 INFO mapred.JobClient: Running job: job_201302261146_0014
13/02/27 10:56:01 INFO mapred.JobClient:  map 0% reduce 0%
13/02/27 10:56:13 INFO mapred.JobClient: Task Id : attempt_201302261146_0014_m_000000_0, Status : FAILED
java.lang.NumberFormatException: For input string: "Data"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Long.parseLong(Long.java:410)
        at java.lang.Long.parseLong(Long.java:468)
        at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:106)
        at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:86)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/02/27 10:56:20 INFO mapred.JobClient: Task Id : attempt_201302261146_0014_m_000000_1, Status : FAILED
java.lang.NumberFormatException: For input string: "Data"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Long.parseLong(Long.java:410)
        at java.lang.Long.parseLong(Long.java:468)
        at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:106)
        at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:86)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/02/27 10:56:28 INFO mapred.JobClient: Task Id : attempt_201302261146_0014_m_000000_2, Status : FAILED
java.lang.NumberFormatException: For input string: "Data"
        at java.lang.NumberFormatException.forInputString(NumberFormatException.java:48)
        at java.lang.Long.parseLong(Long.java:410)
        at java.lang.Long.parseLong(Long.java:468)
        at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:106)
        at org.myorg.MapReduce$FlowPortMapper.map(MapReduce.java:86)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:764)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:370)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1121)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)

13/02/27 10:56:40 INFO mapred.JobClient: Job complete: job_201302261146_0014
13/02/27 10:56:40 INFO mapred.JobClient: Counters: 8
13/02/27 10:56:40 INFO mapred.JobClient:   Job Counters
13/02/27 10:56:40 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33174
13/02/27 10:56:40 INFO mapred.JobClient:     Total time spent by all reduces waiting after reserving slots (ms)=0
13/02/27 10:56:40 INFO mapred.JobClient:     Total time spent by all maps waiting after reserving slots (ms)=0
13/02/27 10:56:40 INFO mapred.JobClient:     Rack-local map tasks=1
13/02/27 10:56:40 INFO mapred.JobClient:     Launched map tasks=4
13/02/27 10:56:40 INFO mapred.JobClient:     Data-local map tasks=3
13/02/27 10:56:40 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=0
13/02/27 10:56:40 INFO mapred.JobClient:     Failed map tasks=1
**********************************************************************************
3. As you can see from the code, I added some debug print statements after certain lines.
**********************************************************************************
4. My data is NetFlow data; it was originally a .txt file, which I put into
DFS. Its info is as follows (right after the listing I also sketch how I might
check its first line):
********************************************************
[hadoop@ADUAE042-LAP-V flowtimeclasses2602]$ hadoop dfs -lsr /user/hadoop/NetFlow2
Warning: $HADOOP_HOME is deprecated.

-rw-r--r--   2 hadoop supergroup   14187484 2013-02-26 12:03 /user/hadoop/NetFlow2
*********************************************************
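
By the way, I have not checked whether the first line of this file is a header
or contains something like "Data". I suppose I could peek at it straight from
DFS with a small program along these lines (only a sketch, same path as above):

import java.io.BufferedReader;
import java.io.InputStreamReader;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HeadCheck
{
        public static void main(String[] args) throws Exception
        {
                // Print the first line of the file I put into DFS, to see
                // whether it is a real flow record or some kind of header.
                FileSystem fs = FileSystem.get(new Configuration());
                BufferedReader reader = new BufferedReader(
                                new InputStreamReader(fs.open(new Path("/user/hadoop/NetFlow2"))));
                System.out.println(reader.readLine());
                reader.close();
        }
}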

What might be the reason for this?

How can I fix it?
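
In case it helps the discussion, here is roughly how I am thinking of guarding
the parsing so that non-numeric tokens (for example a header line) are skipped
instead of killing the task. This is only an untested sketch of mine; I am
assuming, from the way my loop above walks the tokens, that field 0 is the
internal IP, field 9 the start time and field 10 the end time:

        // Sketch only (same imports as the code above): ignore any line whose
        // time fields are not plain numbers, instead of letting parseLong throw.
        static class FlowPortMapper extends Mapper<LongWritable, Text, Text, IntWritable>
        {
                public void map(LongWritable key, Text value, Context context)
                                throws IOException, InterruptedException
                {
                        long time1 = 1357289339;
                        long time2 = 1357289342;

                        String[] fields = value.toString().trim().split("\\s+");

                        // My assumption: fields[0] = internal IP,
                        // fields[9] = start time, fields[10] = end time.
                        if (fields.length < 11
                                        || !fields[9].matches("\\d+")
                                        || !fields[10].matches("\\d+"))
                                return; // header or malformed line: skip it

                        long starttime = Long.parseLong(fields[9]);
                        long endtime = Long.parseLong(fields[10]);

                        if (starttime >= time1 && endtime <= time2)
                                context.write(new Text(fields[0]), new IntWritable(1));
                }
        }

Does skipping bad lines like this make sense, or would it be better to strip
the header out of the file before putting it into DFS?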