Posted to common-user@hadoop.apache.org by Rajarshi Guha <rg...@indiana.edu> on 2009/05/04 23:59:13 UTC

specifying command line args, but getting an NPE

Hi, I have a Hadoop program in which main() reads in some command line  
args:

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
        if (otherArgs.length != 3) {
            System.err.println("Usage: subsearch <in> <out> <pattern>");
            System.exit(2);
        }

        FileInputFormat.addInputPath(job, new Path(otherArgs[0]));
        FileOutputFormat.setOutputPath(job, new Path(otherArgs[1]));
        pattern = otherArgs[2];
        ....
    }

Here pattern is declared as a static String class variable.

When I run the program using the local tracker, it runs fine and uses  
the value of pattern. However, if I run the code in distributed mode,  
I get a NullPointerException - as far as I can tell, pattern is  
turning out to be null in this case.

If I hard-code the value of pattern into the code that uses it, the
program runs fine.

So my question is: if I need to use an argument, specified on the  
command line, do I need to do anything special to the variable holding  
it? In other words, the simple assignment

	pattern = otherArgs[2];

seems to lead to an NPE when run in distributed mode.

Any pointers would be appreciated.

Thanks,


-------------------------------------------------------------------
Rajarshi Guha  <rg...@indiana.edu>
GPG Fingerprint: D070 5427 CC5B 7938 929C  DD13 66A1 922C 51E7 9E84
-------------------------------------------------------------------
Q:  What's polite and works for the phone company?
A:  A deferential operator.



Re: specifying command line args, but getting an NPE

Posted by Rajarshi Guha <rg...@indiana.edu>.
On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:

> The issue here is that your mapper and reducer classes are being
> instantiated in a different JVM from your main() function. In order to pass
> data to them, you need to use the Configuration object.
>
> Since you have a simple String here, this should be pretty simple. Something
> like:
>
> conf.set("com.example.tool.pattern", otherArgs[2]);
>
> then in the configure() function of your Mapper/Reducer, simply retrieve it
> using conf.get("com.example.tool.pattern");


Thanks for the pointer. I'm using Hadoop 0.20.0 and my mapper which  
extends Mapper<Object, Text, Text, IntWritable> doesn't seem to have a  
configure() method.

Looking at the API I see the superclass has a setup method. Thus in my  
class I do:

    public static class MoleculeMapper extends Mapper<Object, Text, Text, IntWritable> {

        private Text matches = new Text();
        private String pattern;

        public void setup(Context context) {
            pattern = context.getConfiguration().get("net.rguha.dc.data.pattern");
            System.out.println("pattern = " + pattern);
        }
        ....
    }

In my main method I have

    Configuration conf = new Configuration();
    String[] otherArgs = new GenericOptionsParser(conf, args).getRemainingArgs();
    conf.set("net.rguha.dc.data.pattern", otherArgs[2]);

However, even with this, pattern turns out to be null when printed in  
setup().

I just started on Hadoop a day or two ago, and my understanding is  
that 0.20.0 had some pretty major refactoring. As a result a lot of  
examples I come across on the Net don't seem to work. Could the lack  
of the configure() method be due to the refactoring?




Re: specifying command line args, but getting an NPE

Posted by Rajarshi Guha <rg...@indiana.edu>.
On May 4, 2009, at 6:07 PM, Todd Lipcon wrote:

> Since you have a simple String here, this should be pretty simple. Something
> like:
>
> conf.set("com.example.tool.pattern", otherArgs[2]);
>
> then in the configure() function of your Mapper/Reducer, simply retrieve it
> using conf.get("com.example.tool.pattern");


Trial and error solved the problem. It turns out I need to set the  
value in the Configuration object before I create the Job object.  
Thus, the following works and makes the value of  
net.rguha.dc.data.pattern available to the mappers.

Configuration conf = new Configuration();
conf.set("net.rguha.dc.data.pattern", otherArgs[2]);
Job job = new Job(conf, "id 1");

But if conf.set(...) is called after instantiating job, it doesn't.

Is this intended?
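
On the ordering question above: the behaviour is consistent with the Job constructor taking its own copy of the Configuration it is handed, so that later conf.set(...) calls on the original object never reach the copy that is shipped to the tasks. The following is a minimal sketch of that copy-on-construct behaviour. ToyConf and ToyJob are hypothetical stand-ins written for illustration, not Hadoop classes, and the pattern value is made up.

```java
import java.util.HashMap;
import java.util.Map;

public class ConfCopyDemo {

    /** Toy stand-in for a configuration: a plain string-to-string map. */
    public static class ToyConf {
        private final Map<String, String> props = new HashMap<>();

        public ToyConf() {}

        /** Copy constructor: snapshots the other configuration's entries. */
        public ToyConf(ToyConf other) {
            props.putAll(other.props);
        }

        public void set(String key, String value) {
            props.put(key, value);
        }

        public String get(String key) {
            return props.get(key);
        }
    }

    /** Toy stand-in for a job that copies the configuration when constructed. */
    public static class ToyJob {
        private final ToyConf conf;

        public ToyJob(ToyConf conf) {
            // Assumption being illustrated: the job keeps a private copy.
            this.conf = new ToyConf(conf);
        }

        public ToyConf getConfiguration() {
            return conf;
        }
    }

    public static void main(String[] args) {
        ToyConf conf = new ToyConf();

        // Set before the job is created: the job's copy includes it.
        conf.set("net.rguha.dc.data.pattern", "c1ccccc1");
        ToyJob job = new ToyJob(conf);

        // Set after the job is created: only the original object changes.
        conf.set("set.too.late", "value");

        System.out.println(job.getConfiguration().get("net.rguha.dc.data.pattern"));
        System.out.println(job.getConfiguration().get("set.too.late"));

        // Writing to the job's own copy is visible to the job.
        job.getConfiguration().set("set.too.late", "value");
        System.out.println(job.getConfiguration().get("set.too.late"));
    }
}
```

If that is what is happening in Hadoop, then setting the value through job.getConfiguration().set(...) after creating the Job should also work, since that object is the job's own copy rather than the one passed in.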




Re: specifying command line args, but getting an NPE

Posted by Todd Lipcon <to...@cloudera.com>.
On Mon, May 4, 2009 at 2:59 PM, Rajarshi Guha <rg...@indiana.edu> wrote:

> So my question is: if I need to use an argument, specified on the command
> line, do I need to do anything special to the variable holding it? In other
> words, the simple assignment
>
>        pattern = otherArgs[2];
>
> seems to lead to an NPE when run in distributed mode.
>

Hi Rajarshi,

The issue here is that your mapper and reducer classes are being
instantiated in a different JVM from your main() function. In order to pass
data to them, you need to use the Configuration object.

Since you have a simple String here, this should be pretty simple. Something
like:

conf.set("com.example.tool.pattern", otherArgs[2]);

then in the configure() function of your Mapper/Reducer, simply retrieve it
using conf.get("com.example.tool.pattern");

Hope that helps,
-Todd