Posted to common-user@hadoop.apache.org by Pedro Vivancos <pe...@vocali.net> on 2009/01/16 19:41:43 UTC

How to debug a MapReduce application

Dear friends,

I am new to Hadoop and to MapReduce techniques. I've developed my first
MapReduce application using Hadoop, but I can't get it to work. I get the
following error at the very beginning of the execution:

java.lang.NullPointerException
    at
org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
    at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:504)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:295)
    at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
16-ene-2009 18:29:30 es.vocali.intro.tools.memo.MemoAnnotationMerging main
GRAVE: Se ha producido un error
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
    at
es.vocali.intro.tools.memo.MemoAnnotationMerging.main(MemoAnnotationMerging.java:160)
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
    at
es.vocali.intro.tools.memo.MemoAnnotationMerging.main(MemoAnnotationMerging.java:160)

Sorry I can't give you more information, but I don't know where to start
looking for the error. My app is quite simple: it just gets some rows from a
PostgreSQL database and tries to work out which ones can be deleted.

Here is the configuration I am using:

MemoAnnotationMerging memo = new MemoAnnotationMerging();

Map<String, String> parametros = memo.checkParams(args);

memo.initDataStore(parametros.get(DATASTORE_URL));

JobConf conf = new JobConf(MemoAnnotationMerging.class);
conf.setJobName("memo - annotation merging");

conf.setMapperClass(MemoAnnotationMapper.class);
conf.setCombinerClass(MemoAnnotationReducer.class);
conf.setReducerClass(MemoAnnotationReducer.class);

DBConfiguration.configureDB(conf, DRIVER_CLASS, parametros.get(DATASTORE_URL));

// ???
//conf.setInputFormat(DBInputFormat.class);
//conf.setOutputFormat(TextOutputFormat.class);

conf.setMapOutputKeyClass(LongWritable.class);
conf.setMapOutputValueClass(Annotation.class);

//conf.setOutputKeyClass(Annotation.class);
//conf.setOutputValueClass(BooleanWritable.class);

DBInputFormat.setInput(conf, MemoAnnotationDBWritable.class,
        GET_ANNOTATIONS_QUERY, COUNT_ANNOTATIONS_QUERY);

FileOutputFormat.setOutputPath(conf, new Path("eliminar.txt"));

// run the map-reduce job to merge the annotations
try {
    JobClient.runJob(conf);
} catch (IOException e) {
    e.printStackTrace();
    System.exit(-1);
}
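[Editor's note: the stack trace points at the map output buffer failing to obtain a serializer, which usually implicates the map output key/value configuration. Below is a sketch of how the commented-out section might be completed. It assumes `Annotation` implements `Writable` and that `DBInputFormat`/`TextOutputFormat` are the intended formats — both assumptions, mirroring the commented lines above — so treat it as an illustration, not a verified fix.]

```java
// Sketch only: assumes Annotation implements Writable.
conf.setInputFormat(DBInputFormat.class);
conf.setOutputFormat(TextOutputFormat.class);

conf.setMapOutputKeyClass(LongWritable.class);
conf.setMapOutputValueClass(Annotation.class);   // needs a registered serialization (e.g. Writable)

conf.setOutputKeyClass(Annotation.class);        // the final output types must be serializable too
conf.setOutputValueClass(BooleanWritable.class);

// FileOutputFormat expects an output directory, not a file name:
FileOutputFormat.setOutputPath(conf, new Path("eliminar"));
```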

Thanks in advance.

 Pedro Vivancos Vicente
Vócali Sistemas Inteligentes S.L. <http://www.vocali.net>
Edificio CEEIM, Campus de Espinardo
30100, Espinardo, Murcia, Spain
Tel. +34 902 929 644  <http://www.vocali.net>

Re: How to debug a MapReduce application

Posted by Pedro Vivancos <pe...@vocali.net>.
I am terribly sorry. I made a mistake. This is the output I get:

09/01/19 07:59:45 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
09/01/19 07:59:45 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
09/01/19 07:59:45 INFO mapred.JobClient: Running job: job_local_0001
09/01/19 07:59:45 INFO mapred.MapTask: numReduceTasks: 1
09/01/19 07:59:45 INFO mapred.MapTask: io.sort.mb = 100
09/01/19 07:59:46 INFO mapred.MapTask: data buffer = 79691776/99614720
09/01/19 07:59:46 INFO mapred.MapTask: record buffer = 262144/327680
09/01/19 07:59:46 WARN mapred.LocalJobRunner: job_local_0001
java.lang.NullPointerException
    at
org.apache.hadoop.io.serializer.SerializationFactory.getSerializer(SerializationFactory.java:73)
    at
org.apache.hadoop.mapred.MapTask$MapOutputBuffer.<init>(MapTask.java:504)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:295)
    at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
09/01/19 07:59:46 ERROR memo.MemoAnnotationMerging: Se ha producido un error
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
    at
es.vocali.intro.tools.memo.MemoAnnotationMerging.main(MemoAnnotationMerging.java:160)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1217)
    at
es.vocali.intro.tools.memo.MemoAnnotationMerging.main(MemoAnnotationMerging.java:160)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
    at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
    at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
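[Editor's note: for anyone hitting the same trace — in this Hadoop version, `SerializationFactory` appears to return the first registered serialization that accepts the class, and a class with no matching serialization surfaces as a bare `NullPointerException` rather than a clearer error. The following self-contained sketch (not Hadoop's actual code; all names are illustrative) shows the lookup pattern and why that happens:]

```java
import java.util.ArrayList;
import java.util.List;

// Simplified stand-in for Hadoop's serializer lookup (illustration only, not Hadoop code).
public class SerializerLookupSketch {

    interface Serialization {
        boolean accept(Class<?> c);
    }

    // Pretend only classes whose simple name ends in "Writable" are serializable.
    static class WritableLikeSerialization implements Serialization {
        public boolean accept(Class<?> c) {
            return c.getSimpleName().endsWith("Writable");
        }
    }

    // The first serialization that accepts the class wins; if none accept,
    // the caller receives null and dereferencing it throws the NPE.
    static Serialization getSerialization(List<Serialization> registered, Class<?> c) {
        for (Serialization s : registered) {
            if (s.accept(c)) {
                return s;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<Serialization> registered = new ArrayList<Serialization>();
        registered.add(new WritableLikeSerialization());

        class LongWritable {}   // stands in for a serializable key class
        class Annotation {}     // stands in for a value class with no serialization

        System.out.println("LongWritable accepted: "
                + (getSerialization(registered, LongWritable.class) != null));
        System.out.println("Annotation accepted: "
                + (getSerialization(registered, Annotation.class) != null));
    }
}
```

If the map output value class here plays the role of `Annotation` in the sketch, the lookup comes back null and the buffer constructor fails exactly as in the trace above.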



On Mon, Jan 19, 2009 at 8:47 AM, Pedro Vivancos
<pe...@vocali.net> wrote:

> Thank you very much, but actually I would like to run my application
> standalone.
>
> Anyway, I tried to execute it in pseudo-distributed mode with that setup,
> and this is what I got:
>
> 09/01/19 07:45:24 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 0 time(s).
> 09/01/19 07:45:25 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 1 time(s).
> 09/01/19 07:45:26 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 2 time(s).
> 09/01/19 07:45:27 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 3 time(s).
> 09/01/19 07:45:28 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 4 time(s).
> 09/01/19 07:45:29 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 5 time(s).
> 09/01/19 07:45:30 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 6 time(s).
> 09/01/19 07:45:31 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 7 time(s).
> 09/01/19 07:45:32 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 8 time(s).
> 09/01/19 07:45:33 INFO ipc.Client: Retrying connect to server: localhost/
> 127.0.0.1:9000. Already tried 9 time(s).
> java.lang.RuntimeException: java.io.IOException: Call to localhost/
> 127.0.0.1:9000 failed on local exception: Connection refused
>     at
> org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:323)
>     at
> org.apache.hadoop.mapred.FileOutputFormat.setOutputPath(FileOutputFormat.java:118)
>     at
> es.vocali.intro.tools.memo.MemoAnnotationMerging.main(MemoAnnotationMerging.java:156)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.RunJar.main(RunJar.java:165)
>     at org.apache.hadoop.mapred.JobShell.run(JobShell.java:54)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>     at org.apache.hadoop.mapred.JobShell.main(JobShell.java:68)
> Caused by: java.io.IOException: Call to localhost/127.0.0.1:9000 failed on
> local exception: Connection refused
>     at org.apache.hadoop.ipc.Client.call(Client.java:699)
>     at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>     at $Proxy0.getProtocolVersion(Unknown Source)
>     at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:319)
>     at
> org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:104)
>     at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:177)
>     at
> org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:74)
>     at
> org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1367)
>     at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:56)
>     at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1379)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:215)
>     at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:120)
>     at
> org.apache.hadoop.mapred.JobConf.getWorkingDirectory(JobConf.java:319)
>     ... 11 more
> Caused by: java.net.ConnectException: Connection refused
>     at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
>     at
> sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:574)
>     at sun.nio.ch.SocketAdaptor.connect(SocketAdaptor.java:100)
>     at
> org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:299)
>     at org.apache.hadoop.ipc.Client$Connection.access$1700(Client.java:176)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:772)
>     at org.apache.hadoop.ipc.Client.call(Client.java:685)
>     ... 23 more
>
> Thanks again.
>
> Best regards,
> Pedro Vivancos
>
>
>
> On Mon, Jan 19, 2009 at 5:05 AM, Amareshwari Sriramadasu
> <amarsri@yahoo-inc.com> wrote:
>
>> From the exception you pasted, it looks like io.serializations is not
>> configured properly for the SerializationFactory. Do you see any log
>> messages on your console about serialization classes being added?
>> Can you try running your app in pseudo-distributed mode instead of with
>> the LocalJobRunner?
>> You can find the pseudo-distributed setup guide at
>> http://hadoop.apache.org/core/docs/r0.19.0/quickstart.html#PseudoDistributed
>>
>> Thanks
>> Amareshwari
>>
>> Pedro Vivancos wrote:
>>
>>> [...]
>>>
>>
>>
>

Re: How to debug a MapReduce application

Posted by Amareshwari Sriramadasu <am...@yahoo-inc.com>.
From the exception you pasted, it looks like io.serializations is not
configured properly for the SerializationFactory. Do you see any log
messages on your console about serialization classes being added?
Can you try running your app in pseudo-distributed mode instead of with
the LocalJobRunner?
You can find the pseudo-distributed setup guide at
http://hadoop.apache.org/core/docs/r0.19.0/quickstart.html#PseudoDistributed
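[Editor's note: the usual first thing to check here is that every map output class implements `Writable`, since the default serialization only accepts Writables. The contract is just a write/readFields pair over `DataOutput`/`DataInput`. A minimal self-contained illustration of the round trip, using plain JDK streams and a hypothetical `AnnotationLike` class (no Hadoop dependency):]

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Minimal illustration of the Writable-style contract: a write/readFields pair
// that round-trips a value through DataOutput/DataInput (plain JDK, no Hadoop).
public class WritableRoundTrip {

    static class AnnotationLike {
        long id;
        String text;

        void write(DataOutput out) throws IOException {
            out.writeLong(id);
            out.writeUTF(text);
        }

        void readFields(DataInput in) throws IOException {
            id = in.readLong();
            text = in.readUTF();
        }
    }

    public static void main(String[] args) throws IOException {
        AnnotationLike a = new AnnotationLike();
        a.id = 42L;
        a.text = "merge me";

        // Serialize, then deserialize into a fresh instance.
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        a.write(new DataOutputStream(bytes));

        AnnotationLike b = new AnnotationLike();
        b.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));

        System.out.println(b.id + " " + b.text);
    }
}
```

Note that a class implementing only `DBWritable` can be read from the database yet still have no serializer registered for the shuffle, which would match the NullPointerException in the trace.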

Thanks
Amareshwari

Pedro Vivancos wrote:
> [...]


Re: How to debug a MapReduce application

Posted by Yi-Kai Tsai <yi...@yahoo-inc.com>.
Hi,

Maybe you can watch this first: http://www.vimeo.com/2085477

> [...]


-- 
Yi-Kai Tsai (cuma) <yi...@yahoo-inc.com>, Asia Regional Search Engineering.