Posted to user@giraph.apache.org by David Garcia <dg...@potomacfusion.com> on 2012/02/08 02:46:38 UTC

running job with giraph dependency anomaly

I am running into a weird error that I haven't seen yet (I suppose I've
been lucky).  I see the following in the logging:

org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
library for your platform... using builtin-java classes where applicable


In the job definition, the property "mapreduce.map.class" is not even
defined.  For Giraph, this is usually set to
"mapreduce.map.class=org.apache.giraph.graph.GraphMapper"

I'm building my project with hadoop 0.20.204.

When I build the Giraph project myself (and run my own tests with the
project's dependencies), I have no problems.  The main difference is that
I'm using a Giraph dependency in my work project.  All input is welcome.
Thx!!

-David


Re: running job with giraph dependency anomaly

Posted by Jakob Homan <jg...@gmail.com>.
Nothing that Giraph does should be influenced by 32-bit vs. 64-bit
(basically; very rare caveats apply, etc.).  I'm still not clear on what error
you're encountering.  Your custom mapper sets everything GraphMapper
does, but then doesn't run?

On Tue, Feb 7, 2012 at 6:18 PM, David Garcia <dg...@potomacfusion.com> wrote:
> Yeah.  I haven't changed anything with the standard Giraph stuff.  I just
> made my own vertex and VertexInputFormat.  We are in a 64-bit
> environment... is it possible that building a jar with 32-bit tools would
> be a problem?  I wouldn't think so, since addressing
> native-dependency issues was sort of the *point* of Java... but this
> seems really odd to me.  Are there some dependency restrictions that I
> should know about?  We have to use Jackson 1.6 (because we use the Cloudera
> distribution of Hadoop), and there are other libraries we use.  Thx again
> for the feedback.
>
> -David

Re: running job with giraph dependency anomaly

Posted by David Garcia <dg...@potomacfusion.com>.
Yeah.  I haven't changed anything with the standard Giraph stuff.  I just
made my own vertex and VertexInputFormat.  We are in a 64-bit
environment... is it possible that building a jar with 32-bit tools would
be a problem?  I wouldn't think so, since addressing
native-dependency issues was sort of the *point* of Java... but this
seems really odd to me.  Are there some dependency restrictions that I
should know about?  We have to use Jackson 1.6 (because we use the Cloudera
distribution of Hadoop), and there are other libraries we use.  Thx again
for the feedback.

-David

On 2/7/12 8:08 PM, "Avery Ching" <ac...@apache.org> wrote:

>If you're using GiraphJob, the mapper class should be set for you.
>That's weird.
>
>Avery


Re: running job with giraph dependency anomaly

Posted by Avery Ching <ac...@apache.org>.
If you're using GiraphJob, the mapper class should be set for you.  
That's weird.

Avery

On 2/7/12 5:58 PM, David Garcia wrote:
> That's interesting.  Yes, I don't need native libraries.  The problem I'm
> having is that after I run job.waitForCompletion(..), the job runs a
> mapper that is something other than GraphMapper.  It doesn't complain
> that a Mapper isn't defined or anything.  It runs something else.  As I
> mentioned below, the map-class doesn't appear to be defined.


Re: running job with giraph dependency anomaly

Posted by David Garcia <dg...@potomacfusion.com>.
That's interesting.  Yes, I don't need native libraries.  The problem I'm
having is that after I run job.waitForCompletion(..), the job runs a
mapper that is something other than GraphMapper.  It doesn't complain
that a Mapper isn't defined or anything.  It runs something else.  As I
mentioned below, the map-class doesn't appear to be defined.


On 2/7/12 7:50 PM, "Jakob Homan" <jg...@gmail.com> wrote:

>That's not necessarily a bad thing.  Hadoop (not Giraph) has a native
>code library it can use for improved performance.  You'll see this
>message when running on a cluster that's not been deployed to use the
>native libraries.  If I follow what you wrote, most likely your work
>project cluster is so configured.  Unless you actively expect to have
>the native libraries loaded, I wouldn't be concerned.


Re: running job with giraph dependency anomaly

Posted by Jakob Homan <jg...@gmail.com>.
That's not necessarily a bad thing.  Hadoop (not Giraph) has a native
code library it can use for improved performance.  You'll see this
message when running on a cluster that's not been deployed to use the
native libraries.  If I follow what you wrote, most likely your work
project cluster is so configured.  Unless you actively expect to have
the native libraries loaded, I wouldn't be concerned.


On Tue, Feb 7, 2012 at 5:46 PM, David Garcia <dg...@potomacfusion.com> wrote:
> I am running into a weird error that I haven't seen yet (I suppose I've
> been lucky).  I see the following in the logging:
>
> org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
> library for your platform... using builtin-java classes where applicable
>
>
> In the job definition, the property "mapreduce.map.class" is not even
> defined.  For Giraph, this is usually set to
> "mapreduce.map.class=org.apache.giraph.graph.GraphMapper"
>
> I'm building my project with hadoop 0.20.204.
>
> When I build the Giraph project myself (and run my own tests with the
> project's dependencies), I have no problems.  The main difference is that
> I'm using a Giraph dependency in my work project.  All input is welcome.
> Thx!!
>
> -David
>

Re: Giraph Architecture bug in

Posted by Avery Ching <ac...@apache.org>.
AFAIK we don't have any SOP for opening issues.  Maybe I'll take a crack
at this one tonight if I find some time, unless you were planning to
work on it, David.

Avery

On 2/8/12 5:46 PM, David Garcia wrote:
> I opened up
>
> * GIRAPH-144<https://issues.apache.org/jira/browse/GIRAPH-144>
>
>
> I apologize if I didn't do it up according to project SOPs.  I haven't
> had time to read it thoroughly.
>
> -David


Re: Giraph Architecture bug in

Posted by David Garcia <dg...@potomacfusion.com>.
I opened up 

* GIRAPH-144 <https://issues.apache.org/jira/browse/GIRAPH-144>


I apologize if I didn't do it up according to project SOPs.  I haven't
had time to read it thoroughly.

-David


On 2/8/12 7:29 PM, "David Garcia" <dg...@potomacfusion.com> wrote:

>Yeah, I'll write something up.


Re: Giraph Architecture bug in

Posted by David Garcia <dg...@potomacfusion.com>.
Yeah, I'll write something up.


On 2/8/12 7:26 PM, "Avery Ching" <ac...@apache.org> wrote:

>Since we call waitForCompletion() (which calls submit() internally) in
>GiraphJob#run(), we cannot override those methods.  A better fix would
>probably be to use composition rather than inheritance, i.e.
>
>public class GiraphJob {
>     Job internalJob;
>}
>
>and expose the methods we would like as necessary.  There are other
>methods we don't want the user to call (e.g. setMapperClass()).
>David, can you please open an issue for this?
>
>Avery


Re: Giraph Architecture bug in

Posted by Avery Ching <ac...@apache.org>.
Since we call waitForCompletion() (which calls submit() internally) in
GiraphJob#run(), we cannot override those methods.  A better fix would
probably be to use composition rather than inheritance, i.e.

public class GiraphJob {
     Job internalJob;
}

and expose the methods we would like as necessary.  There are other
methods we don't want the user to call (e.g. setMapperClass()).
David, can you please open an issue for this?

Avery
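
A minimal sketch of the composition idea, using plain Java stand-ins rather
than the real Hadoop or Giraph classes.  InnerJob, ComposedGraphJob, and the
string mapper names are hypothetical and only illustrate the shape: the
wrapper owns the job, exposes the few methods it wants callers to use, and
keeps setMapperClass() and waitForCompletion() out of reach.

```java
// Hypothetical stand-in for the framework job class (not the real Hadoop API).
class InnerJob {
    private String mapperClass;          // null means "nothing configured"
    private String jobName = "unnamed";

    void setMapperClass(String name) { this.mapperClass = name; }
    void setJobName(String name) { this.jobName = name; }
    String getJobName() { return jobName; }

    // Returns the name of the mapper that would actually run.
    String waitForCompletion() {
        return mapperClass != null ? mapperClass : "IdentityMapper";
    }
}

// Hypothetical wrapper that owns a job instead of extending it.
final class ComposedGraphJob {
    private final InnerJob internalJob = new InnerJob();

    // Exposed on purpose: harmless for callers to change.
    void setJobName(String name) { internalJob.setJobName(name); }
    String getJobName() { return internalJob.getJobName(); }

    // The only way to launch the job, so the setup always happens first.
    String run() {
        internalJob.setMapperClass("GraphMapper");
        return internalJob.waitForCompletion();
    }
    // setMapperClass() and waitForCompletion() are not exposed,
    // so callers cannot bypass run().
}
```

With this shape, a caller can only write something like
new ComposedGraphJob().run(); a direct waitForCompletion() call does not
compile, instead of silently running the default mapper.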

On 2/8/12 5:17 PM, David Garcia wrote:
> This is a very subtle bug.  GiraphJob inherits from
> org.apache.hadoop.mapreduce.Job.  However, the methods submit() and
> waitForCompletion() are not overridden.  I assumed that GiraphJob
> implemented them, so when I called either one of these methods, the
> framework started up identity mappers/reducers instead of GraphMapper.
> A simple fix is to throw unsupported operation exceptions or to
> implement these methods.  Perhaps this has been done already?
>
> -David


Giraph Architecture bug in

Posted by David Garcia <dg...@potomacfusion.com>.
This is a very subtle bug.  GiraphJob inherits from
org.apache.hadoop.mapreduce.Job.  However, the methods submit() and
waitForCompletion() are not overridden.  I assumed that GiraphJob
implemented them, so when I called either one of these methods, the
framework started up identity mappers/reducers instead of GraphMapper.
A simple fix is to throw unsupported operation exceptions or to
implement these methods.  Perhaps this has been done already?

-David
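
The pitfall described above can be reproduced with a small sketch.  The
classes below are hypothetical stand-ins written in plain Java, not the real
Hadoop or Giraph code; the assumption is only that the subclass does its
setup inside run() and inherits waitForCompletion() unchanged.

```java
// Hypothetical stand-in for a Hadoop-style job base class (not the real API).
class BaseJob {
    // null means "nothing configured", so the framework uses its default.
    protected String mapperClass;

    void setMapperClass(String name) { this.mapperClass = name; }

    // Returns the name of the mapper that would actually run.
    String waitForCompletion() {
        return mapperClass != null ? mapperClass : "IdentityMapper";
    }
}

// Hypothetical stand-in for a subclass that does its setup only in run().
class GraphJob extends BaseJob {
    String run() {
        setMapperClass("GraphMapper");   // setup the subclass depends on
        return waitForCompletion();
    }
}

class InheritancePitfallDemo {
    public static void main(String[] args) {
        // Intended entry point: setup runs first.
        System.out.println(new GraphJob().run());               // prints GraphMapper
        // Inherited entry point: compiles fine, but skips the setup.
        System.out.println(new GraphJob().waitForCompletion()); // prints IdentityMapper
    }
}
```

Nothing fails in the second call; the job just runs with the default
mapper, which matches the symptom of the map class never being set.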

On 2/7/12 7:46 PM, "David Garcia" <dg...@potomacfusion.com> wrote:

>I am running into a weird error that I haven't seen yet (I suppose I've
>been lucky).  I see the following in the logging:
>
>org.apache.hadoop.util.NativeCodeLoader: Unable to load native-hadoop
>library for your platform... using builtin-java classes where applicable
>
>
>In the job definition, the property "mapreduce.map.class" is not even
>defined.  For Giraph, this is usually set to
>"mapreduce.map.class=org.apache.giraph.graph.GraphMapper"
>
>I'm building my project with hadoop 0.20.204.
>
>When I build the Giraph project myself (and run my own tests with the
>project's dependencies), I have no problems.  The main difference is that
>I'm using a Giraph dependency in my work project.  All input is welcome.
>Thx!!
>
>-David
>