Posted to user@nutch.apache.org by cervenkovab <ce...@gmail.com> on 2014/01/23 14:19:51 UTC

WrongRegionException after updatedb

I ran generate, fetch and parse, and after the update step I got this exception.

2014-01-23 13:40:56,905 ERROR store.HBaseStore - Failed 747 actions: WrongRegionException: 747 times, servers with issues: server.eu:43556,
2014-01-23 13:40:56,905 ERROR store.HBaseStore - [Ljava.lang.StackTraceElement;@12101d00

Can you please help me understand where the problem could be?

Versions used: Nutch 2.2.1, HBase 0.90.6




--
View this message in context: http://lucene.472066.n3.nabble.com/WrongRegionException-after-updatedb-tp4112982.html
Sent from the Nutch - User mailing list archive at Nabble.com.

Re: WrongRegionException after updatedb

Posted by cervenkovab <ce...@gmail.com>.
Hi,
Well, the issue appears again after running update, and it possibly causes
pages to no longer be fetched.

Hadoop log:
2014-03-10 23:24:13,240 ERROR store.HBaseStore - Failed 1418 actions: WrongRegionException: 1418 times, servers with issues: localhost:33227,
2014-03-10 23:24:13,241 ERROR store.HBaseStore - [Ljava.lang.StackTraceElement;@6ce27643


HBase log:
2014-03-10 23:26:33,793 INFO org.apache.zookeeper.server.NIOServerCnxn: Accepted socket connection from /127.0.0.1:42526
2014-03-10 23:26:33,794 INFO org.apache.zookeeper.server.NIOServerCnxn: Refusing session request for client /127.0.0.1:42526 as it has seen zxid 0x1602 our last zxid is 0xff1 client must try another server
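
For context, the "Refusing session request" line means the client reports having seen a newer transaction id (zxid) than the server has processed. A minimal sketch of that comparison, using the two zxids from the log above; the class and method names are hypothetical and are not part of the ZooKeeper API:

```java
public class ZxidCheck {
    // Hypothetical helper: true when the client has seen a newer zxid
    // than the server's last processed zxid.
    public static boolean clientIsAhead(long clientLastZxidSeen, long serverLastZxid) {
        return clientLastZxidSeen > serverLastZxid;
    }

    public static void main(String[] args) {
        long clientZxid = Long.parseLong("1602", 16); // "it has seen zxid 0x1602"
        long serverZxid = Long.parseLong("ff1", 16);  // "our last zxid is 0xff1"
        System.out.println("client=" + clientZxid + " server=" + serverZxid
                + " refused=" + clientIsAhead(clientZxid, serverZxid));
    }
}
```

With these values the client is ahead (5634 vs 4081), which matches the refusal in the log.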


My stats:
2014-03-10 23:05:11,393 INFO  crawl.WebTableReader - TOTAL urls:	510594
2014-03-10 23:05:11,393 INFO  crawl.WebTableReader - status 4 (status_redir_temp):	13947
2014-03-10 23:05:11,393 INFO  crawl.WebTableReader - status 5 (status_redir_perm):	6303
2014-03-10 23:05:11,393 INFO  crawl.WebTableReader - max score:	863.04
2014-03-10 23:05:11,393 INFO  crawl.WebTableReader - retry 64:	1
2014-03-10 23:05:11,393 INFO  crawl.WebTableReader - status 34 (status_retry):	692
2014-03-10 23:05:11,394 INFO  crawl.WebTableReader - status 3 (status_gone):	22738
2014-03-10 23:05:11,394 INFO  crawl.WebTableReader - *status 1 (status_unfetched):	255963*
2014-03-10 23:05:11,394 INFO  crawl.WebTableReader - status 0 (null):	1753
2014-03-10 23:05:11,394 INFO  crawl.WebTableReader - retry 156:	1
2014-03-10 23:05:11,394 INFO  crawl.WebTableReader - avg score:	0.40301135
2014-03-10 23:05:11,394 INFO  crawl.WebTableReader - WebTable statistics: done


I found a thread where they had the same exception, but they resolved it
by repairing the client:
http://zookeeper-user.578899.n2.nabble.com/Session-refused-zxid-too-high-td7577866.html

Any idea how to make it fetch and update properly again?

Thanks in advance





Re: WrongRegionException after updatedb

Posted by cervenkovab <ce...@gmail.com>.
Hi, thanks for the hint.
We repaired the HBase regions with hbase hbck -fixMeta -fixAssignments.
We don't know what caused this problem, but it happened while we were trying
to stop the following message from being logged:

INFO  store.HBaseStore - Keyclass and nameclass match but mismatching table names  mappingfile schema is 'webpage' vs actual schema 'webpage_webpage' , assuming they are the same.

This is in the class HBaseStore:

if (!tableName.equals(tableNameFromMapping)) {
  LOG.info("Keyclass and nameclass match but mismatching table names "
      + " mappingfile schema is '" + tableNameFromMapping
      + "' vs actual schema '" + tableName + "' , assuming they are the same.");
  if (tableNameFromMapping != null) {
    mappingBuilder.renameTable(tableNameFromMapping, tableName);
  }
}

We tried all possible configurations (-crawlID on the command line, nutch-site,
gora-hbase-mapping.xml), but we still get this strange INFO message. When we
change the table name in gora-hbase-mapping.xml to webpage_webpage, it creates
a table named webpage_webpage_webpage.

Is there a guide on how to configure this?

For example, I want to run a crawl in a table named "first_webpage" and then
(for testing purposes) in "second_webpage". Is that possible? I assumed it
could be done with the schema.prefix property, but it does not work for me
(maybe I am wrong).
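
For what it's worth, the names seen in this thread (mapping name 'webpage' giving 'webpage_webpage', and 'webpage_webpage' giving 'webpage_webpage_webpage') are consistent with a prefix plus an underscore being prepended to the name from the mapping file. A minimal sketch of that rule follows; the class and method are hypothetical, and the rule is inferred from the observed behaviour rather than taken from the Nutch or Gora source:

```java
public class TableNameSketch {
    // Hypothetical rule: prepend "<prefix>_" when a prefix (e.g. a crawl id) is set,
    // otherwise use the mapping-file name unchanged.
    public static String resolveTableName(String prefix, String mappingName) {
        if (prefix == null || prefix.isEmpty()) {
            return mappingName;
        }
        return prefix + "_" + mappingName;
    }

    public static void main(String[] args) {
        // The two cases reported in this thread, assuming the prefix is "webpage".
        System.out.println(resolveTableName("webpage", "webpage"));
        System.out.println(resolveTableName("webpage", "webpage_webpage"));
        // The names asked about, assuming the prefix can be set per crawl.
        System.out.println(resolveTableName("first", "webpage"));
        System.out.println(resolveTableName("second", "webpage"));
    }
}
```

If that assumption holds, keeping 'webpage' in the mapping file and varying only the prefix would be the way to get first_webpage and second_webpage.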
Thanks in advance




Re: WrongRegionException after updatedb

Posted by Tejas Patil <te...@gmail.com>.
This is tied to HBase rather than Nutch. It would help if you got a
complete stack trace and posted it to the HBase user list as well.
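
The "[Ljava.lang.StackTraceElement;@12101d00" in the log is what Java prints when a StackTraceElement[] array is converted to a string directly, so the frames themselves are missing. A minimal sketch of the difference, using a made-up exception; the class and method names are hypothetical:

```java
public class StackTraceDemo {
    // Joins the frames one per line, similar to how a full stack trace lists them.
    public static String formatFrames(StackTraceElement[] frames) {
        StringBuilder sb = new StringBuilder();
        for (StackTraceElement frame : frames) {
            sb.append("\tat ").append(frame).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        Exception e = new IllegalStateException("example failure");
        StackTraceElement[] frames = e.getStackTrace();
        // Default array toString: a type tag plus an identity hash, no frames.
        System.out.println("default:  " + frames.toString());
        // Readable form: one "at ..." line per frame.
        System.out.print("readable:\n" + formatFrames(frames));
    }
}
```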

~tejas

