Posted to user@nutch.apache.org by Martha Perez Arriaga <ma...@gmail.com> on 2014/03/13 04:32:16 UTC

error java[12580:1003] Unable to load realm info from SCDynamicStore

Hello,

I installed the following with no issues:

apache-nutch-2.2.1-src.tar

apache-solr-3.6.2.tar

hbase-0.96.1.1-hadoop2-bin.tar



However, when I try to crawl, this error shows: java[12580:1003] Unable to
load realm info from SCDynamicStore



Please see below for more details. I am new to web crawling, so any help
is appreciated.

Martha



$ bin/crawl urls/seed.txt TestCrawl http://localhost:8983/solr/2 -depth 3 -topN 5

InjectorJob: starting at 2014-03-12 21:12:25

InjectorJob: Injecting urlDir: urls/seed.txt

2014-03-12 21:12:25.824 java[12580:1003] Unable to load realm info from SCDynamicStore

InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.

InjectorJob: total number of urls rejected by filters: 0

InjectorJob: total number of urls injected after normalization and filtering: 1

Injector: finished at 2014-03-12 21:12:28, elapsed: 00:00:02



$ bin/nutch crawl urls -solr http://localhost:8983/ -depth 4 -topN 5 -threads 4

2014-03-12 21:12:56.972 java[12587:1003] Unable to load realm info from SCDynamicStore

InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.

InjectorJob: total number of urls rejected by filters: 0

InjectorJob: total number of urls injected after normalization and filtering: 1

Exception in thread "main" java.lang.RuntimeException: job failed: name=generate: null, jobid=job_local338944173_0002

    at org.apache.nutch.util.NutchJob.waitForCompletion(NutchJob.java:54)

    at org.apache.nutch.crawl.GeneratorJob.run(GeneratorJob.java:199)

    at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)

    at org.apache.nutch.crawl.Crawler.run(Crawler.java:152)

    at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)

    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)

    at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)



$ bin/nutch inject urls/seed.txt

InjectorJob: starting at 2014-03-12 21:13:29

InjectorJob: Injecting urlDir: urls/seed.txt

2014-03-12 21:13:29.658 java[12599:1003] Unable to load realm info from SCDynamicStore

InjectorJob: Using class org.apache.gora.memory.store.MemStore as the Gora storage class.

InjectorJob: total number of urls rejected by filters: 0

InjectorJob: total number of urls injected after normalization and filtering: 1

Injector: finished at 2014-03-12 21:13:31, elapsed: 00:00:01



$ bin/nutch solrindex http://127.0.0.1:8983/solr/ crawl/crawldb -linkdb crawl/linkdb crawl/segments/*

SolrIndexerJob: starting

2014-03-12 21:14:12.474 java[12606:1003] Unable to load realm info from SCDynamicStore

SolrIndexerJob: done.

Re: error java[12580:1003] Unable to load realm info from SCDynamicStore

Posted by feng lu <am...@gmail.com>.

Do you have HBaseStore set as the default store in the gora.properties config
file? Your log shows MemStore being used as the Gora storage class. You can
check this tutorial for Nutch 2.x:

https://wiki.apache.org/nutch/Nutch2Tutorial
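
For reference, the setting in question is a single line in
conf/gora.properties. The following is a minimal sketch, assuming the stock
conf/gora.properties shipped with the Nutch 2.2.1 source and an HBase backend;
compare it against the tutorial before relying on it.

```
# conf/gora.properties
# Use HBase as the default Gora datastore instead of the in-memory MemStore
# (MemStore is the class reported in the InjectorJob log output).
gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
```

The tutorial also sets storage.data.store.class to the same class in
conf/nutch-site.xml and enables the gora-hbase dependency in ivy/ivy.xml
before rebuilding with "ant runtime". The exact steps may differ between
versions, so follow the tutorial for yours.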





-- 
Don't Grow Old, Grow Up... :-)

Re: error java[12580:1003] Unable to load realm info from SCDynamicStore

Posted by Talat Uyarer <ta...@uyarer.com>.

Hi Martha,

Nutch 2.2.1 does not support HBase versions greater than 0.90.x. You should
switch your store to HBase 0.90.x.

Talat