Posted to dev@hbase.apache.org by il...@rediff.co.in on 2009/08/23 09:09:20 UTC

How to use Hbase with Nutch

Hello,

I am trying to run the NutchBase code on Hadoop/HBase in local mode.
I have set up the environment, and it is working fine.

I was able to create the table using the HBase shell as well. However, I am
not clear on how to use the InjectorHbase program to inject a set of seed
URLs into my webtable. Please tell me the steps I should follow.
I created an HBase table: create 'webtable', '__tmp_inject_key__'
Then I ran the command: bin/nutch org.apache.nutchbase.crawl.InjectorHbase webtable urls/
It throws the following exception:
org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException:
org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column
family on metadata:__tmp_inject_key__ does not exist in region
webtable2,,1250941483088 in table {NAME => 'webtable2', FAMILIES => [{NAME
=> '__tmp_inject_key__', COMPRESSION => 'NONE', VERSIONS => '3', LENGTH =>
'2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}


Please tell me how I should create the table. Am I doing something wrong?




Re: How to use Hbase with Nutch

Posted by Doğacan Güney <do...@gmail.com>.
Hey,

On Sun, Aug 23, 2009 at 10:09, <il...@rediff.co.in> wrote:

> Hello,
>
> I am trying to run NutchBase code on Hadoop/Hbase in local mode.
> I have setup the environment and everything, its working fine.
>
> I could able to create the table using Hbase shell as well. But, am not
> clear how to use the InjectorHbase program for injecting set of seed urls
> into my webtable. Please tell me the steps I should be following:
> I created an Hbase table : create 'webtable', '__tmp_inject_key__'
> Then ran the command: bin/nutch org.apache.nutchbase.crawl.InjectorHbase
> webtable urls/
> It throws the following exception:
> org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException:
> org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column
> family on metadata:__tmp_inject_key__ does not exist in region
> webtable2,,1250941483088 in table {NAME => 'webtable2', FAMILIES => [{NAME
> => '__tmp_inject_key__', COMPRESSION => 'NONE', VERSIONS => '3', LENGTH =>
> '2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}
>
>
> Please tell me the way I should create the table, is there something wrong
> being done?
>
>
>
>
I have already committed the nutchbase code to Nutch's svn repository
(branch: nutchbase).

Try the tutorial here:

http://issues.apache.org/jira/browse/NUTCH-650?focusedCommentId=12743919&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12743919
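
As a side note on reading the exception quoted above: HBase addresses a column as family:qualifier, so a write to metadata:__tmp_inject_key__ needs a column family named "metadata", while the table descriptor in the message lists only a family named '__tmp_inject_key__'. The short Python sketch below pulls both pieces out of the message text to make the mismatch visible. The function names are hypothetical helpers, not part of HBase or Nutch, and the sketch only identifies the one family this particular write needed. Any other families the nutchbase code expects are not stated in this thread, so the tutorial remains the reference for the full table layout.

```python
import re
from typing import List, Optional

# Exception text copied from the post, joined onto logical lines.
MESSAGE = (
    "org.apache.hadoop.hbase.regionserver.NoSuchColumnFamilyException: Column "
    "family on metadata:__tmp_inject_key__ does not exist in region "
    "webtable2,,1250941483088 in table {NAME => 'webtable2', FAMILIES => [{NAME "
    "=> '__tmp_inject_key__', COMPRESSION => 'NONE', VERSIONS => '3', LENGTH => "
    "'2147483647', TTL => '-1', IN_MEMORY => 'false', BLOCKCACHE => 'false'}]}"
)


def missing_family(message: str) -> Optional[str]:
    """Return the family part of the column named in the exception, if any."""
    match = re.search(r"Column\s+family\s+on\s+(\S+)\s+does\s+not\s+exist", message)
    if match is None:
        return None
    # An HBase column name is "family:qualifier"; keep the part before the colon.
    return match.group(1).split(":", 1)[0]


def declared_families(message: str) -> List[str]:
    """Return the family names listed after FAMILIES in the table descriptor."""
    parts = message.split("FAMILIES", 1)
    if len(parts) < 2:
        return []
    return re.findall(r"\{NAME\s*=>\s*'([^']+)'", parts[1])


if __name__ == "__main__":
    print("family the write needed:", missing_family(MESSAGE))
    print("families the table has: ", declared_families(MESSAGE))
```

Going by the message alone, a table created with a "metadata" family would presumably get past this particular error. That is an inference from the exception text, not something confirmed in this thread.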

PS: Please do not crosspost questions. Also, please do not post the same
question under a different title (if you do not receive an answer for a
while, it is fine to send a new mail though).


-- 
Doğacan Güney