Posted to mapreduce-user@hadoop.apache.org by Manikandan Saravanan <ma...@thesocialpeople.net> on 2014/01/06 14:08:12 UTC

Hadoop permissions issue

I’m trying to run Nutch 2.2.1 on a Hadoop 1.2.1 cluster. The fetch phase runs fine, but the next job fails with this error:

java.lang.NullPointerException
	at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
	at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
	at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
	at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
	at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
	at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
	at org.apache.hadoop.mapred.Child.main(Child.java:249)

I’m running three nodes, named nutch1, nutch2, and nutch3. The first is listed in the masters file, and all three are listed in the slaves file. The /etc/hosts file on each machine lists all three hosts with their IP addresses. Can someone help me?
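For reference, the standard Hadoop 1.x topology files for three such nodes would look like this (hostnames taken from the post; the conf/ path assumes a default $HADOOP_HOME layout):

```
# $HADOOP_HOME/conf/masters
nutch1

# $HADOOP_HOME/conf/slaves
nutch1
nutch2
nutch3
```

Note that in Hadoop 1.x the masters file designates the host(s) for the secondary NameNode, not the NameNode/JobTracker themselves, which is a common point of confusion when setting these files up.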

-- 
Manikandan Saravanan
Architect - Technology
TheSocialPeople

Re: Hadoop permissions issue

Posted by Manikandan Saravanan <ma...@thesocialpeople.net>.
I’m running Nutch 2.2.1 on a Hadoop cluster, crawling 5000 links from the DMOZ Open Directory Project. The reduce job always stops at exactly 33% and throws this exception. According to the nutch mailing list, my job seems to be stumbling on a repUrl value that is null.
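The repUrl diagnosis is consistent with the trace: Avro's Utf8(String) constructor dereferences its argument, so passing it a null string produces exactly this NullPointerException. The sketch below reproduces the failure mode and a defensive guard in plain Java (no Avro on the classpath; utf8Length mimics the constructor's behavior, and the repUrl variable name is taken from the thread, not from real Nutch code):

```java
import java.nio.charset.StandardCharsets;

public class NullGuardDemo {
    // Mimics Avro's Utf8(String) constructor: it dereferences its
    // argument, so a null input throws NullPointerException.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    // Defensive guard: substitute an empty string for a null value.
    static String orEmpty(String s) {
        return s == null ? "" : s;
    }

    public static void main(String[] args) {
        String repUrl = null; // the suspect field from the thread
        // Unguarded, utf8Length(repUrl) would throw NullPointerException,
        // matching the failure at Utf8.<init> in the stack trace.
        int len = utf8Length(orEmpty(repUrl));
        System.out.println(len); // prints 0
    }
}
```

Whether an empty string, a skip, or a counter increment is the right recovery depends on the job; the point is that the guard has to run before the Utf8 object is constructed.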
-- 
Manikandan Saravanan
Architect - Technology
TheSocialPeople

On 6 January 2014 at 7:14:41 pm, Devin Suiter RDX (dsuiter@rdx.com) wrote:

Based on the Exception type, it looks like something in your job is looking for a valid value, and not finding it.

You will probably need to share the job code for people to help with this - to my eyes, this doesn't appear to be a Hadoop configuration issue, or any kind of problem with how the system is working.

Are you using Avro inputs and outputs? If your reduce is trying to parse an Avro record, it may be that the field type is not correct, or maybe there is a reference to an outside schema object that is not available...

If you provide more information about the context of the error (use case, program goal, code block, something like that) then it is easier to help you.



Devin Suiter
Jr. Data Solutions Software Engineer

100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com


On Mon, Jan 6, 2014 at 8:08 AM, Manikandan Saravanan <ma...@thesocialpeople.net> wrote:
I’m trying to run Nutch 2.2.1 on a Hadoop 1.2.1 cluster. The fetch phase runs fine. But in the next job, this error comes up

java.lang.NullPointerException
at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
at org.apache.hadoop.mapred.Child.main(Child.java:249)

I’m running three nodes namely nutch1,2,3. The first one’s in the masters file and all are listed in the slaves file. The /etc/hosts file lists all machines along with their IP addresses. Can someone help me?

-- 
Manikandan Saravanan
Architect - Technology
TheSocialPeople


Re: Hadoop permissions issue

Posted by Devin Suiter RDX <ds...@rdx.com>.
Based on the Exception type, it looks like something in your job is looking
for a valid value, and not finding it.

You will probably need to share the job code for people to help with this -
to my eyes, this doesn't appear to be a Hadoop configuration issue, or any
kind of problem with how the system is working.

Are you using Avro inputs and outputs? If your reduce is trying to parse an
Avro record, it may be that the field type is not correct, or maybe there
is a reference to an outside schema object that is not available...

If you provide more information about the context of the error (use case,
program goal, code block, something like that) then it is easier to help
you.
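The missing-field possibility mentioned above can be illustrated without Avro on the classpath. Here a plain Map stands in for an Avro GenericRecord (whose get() likewise returns null for an unset field), and the lookup is guarded with a fallback; the field names are illustrative, not taken from the actual Nutch schema:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class FieldGuardDemo {
    // Stand-in for record.get("repUrl") on an Avro GenericRecord:
    // a missing or never-set field comes back as null.
    static Optional<String> getField(Map<String, String> record, String name) {
        return Optional.ofNullable(record.get(name));
    }

    public static void main(String[] args) {
        Map<String, String> record = new HashMap<>();
        record.put("url", "http://example.org/");
        // "repUrl" was never set, mirroring the suspected null in the thread.
        String repUrl = getField(record, "repUrl").orElse(record.get("url"));
        System.out.println(repUrl); // prints http://example.org/
    }
}
```

The same Optional-with-fallback shape works for any lookup that can legitimately come back null, which is safer than letting the null propagate into a constructor that dereferences it.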



Devin Suiter
Jr. Data Solutions Software Engineer
100 Sandusky Street | 2nd Floor | Pittsburgh, PA 15212
Google Voice: 412-256-8556 | www.rdx.com


On Mon, Jan 6, 2014 at 8:08 AM, Manikandan Saravanan <
manikandan@thesocialpeople.net> wrote:

> I’m trying to run Nutch 2.2.1 on a Hadoop 1.2.1 cluster. The fetch phase
> runs fine. But in the next job, this error comes up
>
> java.lang.NullPointerException
>
> at org.apache.avro.util.Utf8.<init>(Utf8.java:37)
>
> at org.apache.nutch.crawl.GeneratorReducer.setup(GeneratorReducer.java:100)
>
> at org.apache.hadoop.mapreduce.Reducer.run(Reducer.java:174)
>
> at org.apache.hadoop.mapred.ReduceTask.runNewReducer(ReduceTask.java:649)
>
> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:418)
>
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at javax.security.auth.Subject.doAs(Subject.java:415)
>
> at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
>
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
>
> I’m running three nodes namely nutch1,2,3. The first one’s in the masters
> file and all are listed in the slaves file. The /etc/hosts file lists all
> machines along with their IP addresses. Can someone help me?
>
> --
> Manikandan Saravanan
> Architect - Technology
> TheSocialPeople <http://thesocialpeople.net>
>
