Posted to user@nutch.apache.org by Adamantios Corais <ad...@gmail.com> on 2014/04/03 00:06:14 UTC

InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException...

Hi all,

I have followed all the steps to set up Nutch (2.2.1) with HBase (0.90.4) 
and Solr (4.7.1) as described in the book "Web Crawling and Data Mining 
with Apache Nutch". However, I am getting the following error:

> InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException: Not a host:port pair: �27204@eualin-T430eualin-T430,37745,1396453102781
> at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
> at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
> at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
> at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:221)
> at org.apache.nutch.crawl.InjectorJob.inject(InjectorJob.java:251)
> at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:273)
> at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> at org.apache.nutch.crawl.InjectorJob.main(InjectorJob.java:282)
> Caused by: java.lang.RuntimeException: java.lang.IllegalArgumentException: Not a host:port pair: �27204@eualin-T430eualin-T430,37745,1396453102781
> at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:127)
> at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
> at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
> ... 7 more
> Caused by: java.lang.IllegalArgumentException: Not a host:port pair: �27204@eualin-T430eualin-T430,37745,1396453102781
> at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:60)
> at org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:63)
> at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:354)
> at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
> at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:109)
> ... 9 more

I have searched extensively but could not find a solution. Any ideas?

Best,
Adam.

Re: InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException...

Posted by d_k <ma...@gmail.com>.
Is there a chance you have another version of HBase (> 0.90.x) still running
from previous attempts?
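
One way to check for a leftover HBase from an earlier attempt is to look for its daemon classes in the process table. The sketch below is only an illustration: the function name find_hbase_procs is made up for this example, and the class names it searches for (HMaster, HRegionServer, HQuorumPeer) are the usual HBase daemon entry points rather than anything confirmed in this thread.

```shell
#!/bin/sh
# Filter "pid args" lines on stdin, keeping those that look like HBase daemons.
# The [H] bracket form keeps the grep process itself from matching when the
# input comes from a live `ps` listing.
find_hbase_procs() {
    grep -E '[H]Master|[H]RegionServer|[H]QuorumPeer' || true
}

# Typical use against the live process table:
#   ps -eo pid,args | find_hbase_procs
```

If this prints a process whose command line points at a directory other than ~/Downloads/hbase-0.90.4, that instance is worth stopping before retrying the inject step.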



Re: InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException...

Posted by Adamantios Corais <ad...@gmail.com>.
Hi Talat,

Here are my installation steps. Let me know if anything is unclear!

Best,
Adam

> cd ~/Downloads
> wget http://mirror.softaculous.com/apache/nutch/2.2.1/apache-nutch-2.2.1-src.tar.gz
> tar -zxvf apache-nutch-2.2.1-src.tar.gz
>
>
>
>
> cd ~/Downloads
> wget http://archive.apache.org/dist/hbase/hbase-0.90.4/hbase-0.90.4.tar.gz
> tar -zxvf hbase-0.90.4.tar.gz
>
>
>
>
> cd ~/Downloads
> wget http://archive.apache.org/dist/lucene/solr/4.7.1/solr-4.7.1.zip
> unzip solr-4.7.1.zip -d ~/Downloads
>
>
>
>
> mkdir ~/Downloads/hbase_rootdir
> mkdir ~/Downloads/hbase_zookeeper
>
>
>
>
> gedit ~/Downloads/hbase-0.90.4/conf/hbase-site.xml
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
> <property>
> <name>hbase.rootdir</name>
> <value>~/Downloads/hbase_rootdir</value>
> </property>
> <property>
> <name>hbase.zookeeper.property.dataDir</name>
> <value>~/Downloads/hbase_zookeeper</value>
> </property>
> </configuration>
>
>
>
>
> gedit ~/Downloads/apache-nutch-2.2.1/conf/nutch-site.xml
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
> <property>
> <name>storage.data.store.class</name>
> <value>org.apache.gora.hbase.store.HBaseStore</value>
> <description>Default class for storing data</description>
> </property>
> </configuration>
>
>
>
>
> gedit ~/Downloads/apache-nutch-2.2.1/ivy/ivy.xml
>
> <!-- Uncomment this to use HBase as Gora backend -->
> <dependency org="org.apache.gora" name="gora-hbase" rev="0.3" conf="*->default" />
>
>
>
>
> gedit ~/Downloads/apache-nutch-2.2.1/conf/gora.properties
>
> # Add this to use HBase as Gora backend
> gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
>
>
>
>
> cd ~/Downloads/apache-nutch-2.2.1/
> ant runtime
>
>
>
>
> cd ~/Downloads/hbase-0.90.4/
> export JAVA_HOME=/usr/lib/jvm/java-7-oracle/
> ./bin/hbase shell
> exit
>
>
>
>
> cd ~/Downloads/apache-nutch-2.2.1/runtime/local
> bin/nutch
>
>
>
>
> gedit ~/Downloads/apache-nutch-2.2.1/runtime/local/conf/nutch-site.xml
>
> <?xml version="1.0"?>
> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
> <configuration>
> <property>
> <name>storage.data.store.class</name>
> <value>org.apache.gora.hbase.store.HBaseStore</value>
> <description>Default class for storing data</description>
> </property>
> <property>
> <name>http.agent.name</name>
> <value>My Nutch Spider</value>
> </property>
> </configuration>
>
>
>
>
> cd ~/Downloads/apache-nutch-2.2.1/runtime/local
> mkdir -p urls
> cd urls
> gedit seed.txt
>
> http://nutch.apache.org/
>
>
>
>
> gedit ~/Downloads/apache-nutch-2.2.1/conf/regex-urlfilter.txt
>
> # accept anything else
> +^http://([a-z0-9]*\.)*nutch.apache.org/
>
>
>
>
> #Set SOLR home
> export SOLR_HOME=~/Downloads/solr-4.7.1/solr/example/solr
>
>
>
>
> cd ~/Downloads/solr-4.7.1/example
> java -jar start.jar
>
> http://localhost:8983/solr/admin/
>
> CTRL + C
>
>
>
>
> mv ~/Downloads/solr-4.7.1/solr/example/solr/collection1/conf/schema.xml ~/Downloads/solr-4.7.1/solr/example/solr/collection1/conf/schema.xml.bnk
> cp ~/Downloads/apache-nutch-2.2.1/conf/schema.xml ~/Downloads/solr-4.7.1/solr/example/solr/collection1/conf/schema.xml
>
>
>
>
> cd ~/Downloads/solr-4.7.1/example
> java -jar start.jar
>
> http://localhost:8983/solr/admin/
>
> CTRL + SHIFT + T
>
>
>
>
> cd ~/Downloads/hbase-0.90.4/
> export JAVA_HOME=/usr/lib/jvm/java-7-oracle/
> ./bin/start-hbase.sh
>
> CTRL + SHIFT + T
>
>
>
>
> cd ~/Downloads/apache-nutch-2.2.1/runtime/local
> export JAVA_HOME=/usr/lib/jvm/java-7-oracle/
> ./bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/ 2
>
>
>
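
A side note on the hbase-site.xml step above: the "~" shorthand is expanded by the shell, not by XML configuration files, so the two directory values are probably not resolving to the directories created with mkdir. This may or may not be related to the error in this thread. The sketch below writes the same two properties with absolute paths instead. The function name write_hbase_site and its arguments are hypothetical, and the file:// prefix for hbase.rootdir follows the standalone-mode convention rather than anything stated in the steps.

```shell
#!/bin/sh
# Write a standalone hbase-site.xml whose directories are absolute paths.
# $1 = output file
# $2 = absolute base directory (assumption: e.g. "$HOME/Downloads")
write_hbase_site() {
    cat > "$1" <<EOF
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>file://$2/hbase_rootdir</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.dataDir</name>
    <value>$2/hbase_zookeeper</value>
  </property>
</configuration>
EOF
}

# Typical use, letting the shell expand $HOME before the file is written:
#   write_hbase_site ~/Downloads/hbase-0.90.4/conf/hbase-site.xml "$HOME/Downloads"
```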






Re: InjectorJob: org.apache.gora.util.GoraException: java.lang.RuntimeException: java.lang.IllegalArgumentException...

Posted by Talat Uyarer <ta...@uyarer.com>.
Hi Adamantios,

I don't know the steps in the book. Can you share what you did? Two
different situations can cause this error: either your HBase client
version differs from the version of the HBase server (Gora uses the
0.90.4 HBase client), or your ZooKeeper is misconfigured.
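
The first possibility can be checked by comparing the version in the HBase jar that Nutch bundles with the one in the server install. The following is a rough sketch, not a definitive procedure: the helper names hbase_jar_version and check_hbase_versions are invented for this example, and it assumes the jars follow the hbase-<version>.jar naming pattern.

```shell
#!/bin/sh
# Print the version embedded in an HBase jar file name, e.g.
# ".../lib/hbase-0.90.4.jar" prints "0.90.4". Prints nothing for other names.
hbase_jar_version() {
    basename "$1" | sed -n 's/^hbase-\([0-9][0-9.]*[0-9]\)\.jar$/\1/p'
}

# Compare the client jar used by Nutch/Gora with the server's jar.
# $1 = client jar path, $2 = server jar path (both are assumptions about
# where the jars live, based on the directory layout described in this thread).
check_hbase_versions() {
    client=$(hbase_jar_version "$1")
    server=$(hbase_jar_version "$2")
    if [ -n "$client" ] && [ "$client" = "$server" ]; then
        echo "match: $client"
    else
        echo "mismatch: client=${client:-unknown} server=${server:-unknown}"
        return 1
    fi
}

# Typical use (paths are guesses, adjust to the actual install):
#   check_hbase_versions \
#     ~/Downloads/apache-nutch-2.2.1/runtime/local/lib/hbase-0.90.4.jar \
#     ~/Downloads/hbase-0.90.4/hbase-0.90.4.jar
```

A match only shows that the files on disk agree. It does not rule out a different HBase instance already running on the machine.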

I'll wait for your installation steps :)

Talat





-- 
Talat UYARER
Website: http://talat.uyarer.com
Twitter: http://twitter.com/talatuyarer
Linkedin: http://tr.linkedin.com/pub/talat-uyarer/10/142/304