Posted to user@nutch.apache.org by Ian Truslove <ia...@nsidc.org> on 2012/09/26 21:26:54 UTC

"gora.properties not found" when running in Hadoop

I've begun trying to get Nutch 2.1 running in Hadoop.  I have a single-node
"cluster" running, and I have built Nutch 2.1 from source
(http://svn.apache.org/repos/asf/nutch/branches/2.1, r1388524).  When I
issue the command to run Nutch thus:

    ubuntu:~/apache-nutch-svn-2.1$ ~/hadoop-1.0.3/bin/hadoop jar \
        build/apache-nutch-2.1.job org.apache.nutch.crawl.Crawler urls \
        -dir urls -depth 3 -topN 5

then I get the following stack trace:

12/09/26 13:18:03 WARN store.DataStoreFactory: gora.properties not found, properties will be empty.
12/09/26 13:18:03 WARN store.DataStoreFactory: gora.properties not found, properties will be empty.
Exception in thread "main" org.apache.gora.util.GoraException: java.io.IOException: java.lang.NullPointerException
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
    at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:601)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
Caused by: java.io.IOException: java.lang.NullPointerException
    at org.apache.gora.sql.store.SqlStore.getConnection(SqlStore.java:747)
    at org.apache.gora.sql.store.SqlStore.initialize(SqlStore.java:160)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    ... 13 more
Caused by: java.lang.NullPointerException
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:186)
    at org.apache.gora.sql.store.SqlStore.getConnection(SqlStore.java:735)
    ... 16 more

Unzipping the .job file shows gora.properties exists in the root directory
of the archive.  
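
One possible reading of the trace, offered as a sketch rather than a diagnosis: if the properties really are empty, looking up the JDBC driver class name returns null, and Class.forName(null) throws the NullPointerException shown at the bottom of the trace. The class below is hypothetical and does not reproduce Gora's code; the key name gora.sqlstore.jdbc.driver is an assumption and should be checked against your own gora.properties.

```java
import java.util.Properties;

// Hypothetical stand-in for the failing lookup; this is NOT Gora's SqlStore code.
final class EmptyPropertiesDemo {
    // Assumed key for the JDBC driver class name; verify against your gora.properties.
    static final String DRIVER_KEY = "gora.sqlstore.jdbc.driver";

    /** Tries to load the configured driver class and reports what happened. */
    static String tryLoadDriver(Properties props) {
        // getProperty returns null when the key is absent, e.g. for empty properties.
        String driverClass = props.getProperty(DRIVER_KEY);
        try {
            Class.forName(driverClass);
            return "ok";
        } catch (NullPointerException e) {
            // Class.forName(null) fails here, matching the innermost "Caused by".
            return "NullPointerException";
        } catch (ClassNotFoundException e) {
            return "ClassNotFoundException";
        }
    }

    public static void main(String[] args) {
        // Empty properties, as the WARN lines suggest.
        System.out.println(tryLoadDriver(new Properties()));

        // A configured key pointing at a class that is certainly loadable.
        Properties configured = new Properties();
        configured.setProperty(DRIVER_KEY, "java.lang.String");
        System.out.println(tryLoadDriver(configured));
    }
}
```

If this reading is right, the NullPointerException is a symptom of the "gora.properties not found" warning rather than a separate problem.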

I'm sure I'm just making a noob error - can anyone point me in the right
direction?

Thanks,

-Ian.



Re: "gora.properties not found" when running in Hadoop

Posted by Lewis John Mcgibbney <le...@gmail.com>.
@ Ian,

Apologies, this one slipped through the net.

On Wed, Sep 26, 2012 at 8:26 PM, Ian Truslove <ia...@nsidc.org> wrote:

>     ubuntu:~/apache-nutch-svn-2.1$ ~/hadoop-1.0.3/bin/hadoop jar
> build/apache-nutch-2.1.job org.apache.nutch.crawl.Crawler urls -dir urls
> -depth 3 -topN 5

The above is not required. Once you've ssh'd into your JobTracker, all
you need to do is head to your deploy directory and use the ./nutch
script from there. It will save you a power of time compared with
typing out the full class names at the console. You should not
experience any problems running on a locally deployed Hadoop cluster;
however, we are aware of problems with CDH4.

>
> Unzipping the .job file shows gora.properties exists in the root directory
> of the archive.

Is it the SqlStore that you wish to use to store your webdb? If so,
then your gora.properties config seems to be OK; if not, you'll need
to edit it and rebuild the .job using 'ant job'.
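
To see which store the packaged gora.properties actually selects, one option is to read it straight out of the .job archive, which is a zip-format file. The sketch below is a hypothetical helper, not part of Nutch or Gora; the key gora.datastore.default and the sample value written by writeSampleArchive() are assumptions used only for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper for inspecting a .job archive; not a Nutch or Gora class.
final class JobFileProperties {
    /** Loads gora.properties from the archive root; returns empty properties if absent. */
    static Properties load(Path archive) {
        Properties props = new Properties();
        try (ZipFile zip = new ZipFile(archive.toFile())) {
            ZipEntry entry = zip.getEntry("gora.properties");
            if (entry != null) {
                try (InputStream in = zip.getInputStream(entry)) {
                    props.load(in);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    /** Writes a throwaway archive with an assumed single-key gora.properties. */
    static Path writeSampleArchive() {
        try {
            Path archive = Files.createTempFile("sample", ".job");
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(archive))) {
                out.putNextEntry(new ZipEntry("gora.properties"));
                // Assumed key; the value is the store class named in the stack trace.
                out.write("gora.datastore.default=org.apache.gora.sql.store.SqlStore\n"
                        .getBytes(StandardCharsets.ISO_8859_1));
                out.closeEntry();
            }
            return archive;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Pass the path to a real .job file, or run without arguments for the sample.
        Path archive = args.length > 0 ? Paths.get(args[0]) : writeSampleArchive();
        Properties props = load(archive);
        props.stringPropertyNames().stream().sorted()
             .forEach(key -> System.out.println(key + "=" + props.getProperty(key)));
    }
}
```

Note that a real gora.properties may contain a database password, so take care where the output ends up.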

Please let us know how you get on.

Lewis

-- 
Lewis

Re: "gora.properties not found" when running in Hadoop

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Chris,

Thank you for the suggestion, hopefully that works for Ian too!

Cheers,
Chris

On Oct 2, 2012, at 3:16 PM, Chris Gerken wrote:

> Howdy Ian, Chris,
> 
> I found that I had to move the gora.properties file and the mappings file into the java root directory (effectively the default package) when building a Gora/Hadoop application.
> 
> Chris Gerken
> 
> 
> 
> On Oct 2, 2012, at 8:16 AM, Mattmann, Chris A (388J) wrote:
> 
>> Hi Ian,
>> 
>> I'm not sure if anyone got back to you on this, so I'm CC'ing the Gora user list as well.
>> Did you build the Nutch job jar from source, or did you download it from somewhere?
>> 
>> Thanks!
>> 
>> Cheers,
>> Chris
> 


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++


Re: "gora.properties not found" when running in Hadoop

Posted by Chris Gerken <ch...@mindspring.com>.
Howdy Ian, Chris,

I found that I had to move the gora.properties file and the mappings file into the java root directory (effectively the default package) when building a Gora/Hadoop application.
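
The reason placement matters can be sketched with plain Java resource lookup: a class loader finds a resource only under the exact name it is asked for, so a file sitting in a subdirectory is not visible as "gora.properties". The class below is a hypothetical illustration, and the conf/ subdirectory is made up for the example; that Gora looks the file up by its bare name is an assumption based on the warning in the log.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of classpath lookup; not Gora or Nutch code.
final class ClasspathRootCheck {
    /** True if the loader can see a resource under exactly this name. */
    static boolean visible(ClassLoader loader, String name) {
        return loader.getResource(name) != null;
    }

    /** Builds a throwaway classpath with the file under conf/ and probes two names. */
    static boolean[] demo() {
        try {
            Path root = Files.createTempDirectory("classpath-root");
            Files.createDirectories(root.resolve("conf"));
            Files.write(root.resolve("conf").resolve("gora.properties"), new byte[0]);
            // Parent is null so that only the temporary directory is searched.
            try (URLClassLoader loader =
                     new URLClassLoader(new URL[] { root.toUri().toURL() }, null)) {
                return new boolean[] {
                    visible(loader, "gora.properties"),      // bare name, file is not at the root
                    visible(loader, "conf/gora.properties")  // full path inside the classpath
                };
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean[] seen = demo();
        System.out.println("gora.properties visible: " + seen[0]);
        System.out.println("conf/gora.properties visible: " + seen[1]);
    }
}
```

Calling visible(Thread.currentThread().getContextClassLoader(), "gora.properties") from inside the failing application is one way to check what the running job can actually see.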

Chris Gerken



On Oct 2, 2012, at 8:16 AM, Mattmann, Chris A (388J) wrote:

> Hi Ian,
> 
> I'm not sure if anyone got back to you on this, so I'm CC'ing the Gora user list as well.
> Did you build the Nutch job jar from source, or did you download it from somewhere?
> 
> Thanks!
> 
> Cheers,
> Chris


Re: "gora.properties not found" when running in Hadoop

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Hi Ian,

I'm not sure if anyone got back to you on this, so I'm CC'ing the Gora user list as well.
Did you build the Nutch job jar from source, or did you download it from somewhere?

Thanks!

Cheers,
Chris



++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: chris.a.mattmann@nasa.gov
WWW:   http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

