Posted to user@nutch.apache.org by "Yves S. Garret" <yo...@gmail.com> on 2013/05/29 22:42:26 UTC

How to setup HBase as backend

Hi, I'm trying to run Nutch this time around with HBase in the background,
as opposed to MySQL.

In the past, I followed this tutorial:
http://nlp.solutions.asia/?p=180

This was all well and good, but now that I have my HBase, I'd like to use that.
I left the configuration of Nutch as it was and proceeded to crawl
nutch.apache.org.  I got this error:
http://bin.cakephp.org/view/1301117746

What am I doing wrong?

At the moment, I'm reading through this, trying to get my stack to work,
will write back if I make any progress:
http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html

Re: How to setup HBase as backend

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Glad to hear you got it working.
I suspected it would be something like this ... we've all been there.


On Tue, Jun 4, 2013 at 2:14 PM, Yves S. Garret
<yo...@gmail.com>wrote:

> Ok, this is where I come back and feel like a dunce.  The issue was with
> where HBASE_HOME was pointing, which was to a 0.94.X install, not 0.90.6.
>
> Now all I need to do is figure out how to "drop" a table in HBase :) .



-- 
*Lewis*

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Ok, this is where I come back and feel like a dunce.  The issue was with
where HBASE_HOME was pointing, which was to a 0.94.X install, not 0.90.6.
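
A quick way to catch this kind of mismatch is to look at which HBase jar sits under HBASE_HOME. The sketch below is only an illustration: it assumes the install keeps a jar named hbase-<version>.jar directly under HBASE_HOME, and the function names are made up for this example.

```python
import os
import re

# Assumption: the HBase install keeps its main jar, named hbase-<version>.jar,
# directly under HBASE_HOME (e.g. hbase-0.90.6.jar).
JAR_PATTERN = re.compile(r"^hbase-(\d+\.\d+\.\d+)\.jar$")


def hbase_version_from_listing(filenames):
    """Return the version embedded in the first hbase-<version>.jar name, or None."""
    for name in sorted(filenames):
        match = JAR_PATTERN.match(name)
        if match:
            return match.group(1)
    return None


def is_090_line(version):
    """True when the version belongs to the 0.90.x line discussed in this thread."""
    return version is not None and version.startswith("0.90.")


if __name__ == "__main__":
    home = os.environ.get("HBASE_HOME")
    if home and os.path.isdir(home):
        version = hbase_version_from_listing(os.listdir(home))
        print("HBASE_HOME=%s version=%s 0.90.x=%s" % (home, version, is_090_line(version)))
    else:
        print("HBASE_HOME is not set or is not a directory")
```

Running it in the same shell that launches Nutch shows which install that shell actually points at.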

Now all I need to do is figure out how to "drop" a table in HBase :) .
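
On the table question: in the HBase shell a table has to be disabled before it can be dropped, i.e. disable 'webpage' followed by drop 'webpage'. A minimal sketch that builds those two commands and pipes them into `hbase shell` follows. It assumes `hbase` is on the PATH and that the shell accepts commands on standard input; the helper names are hypothetical.

```python
import subprocess


def drop_table_script(table):
    """Build the HBase shell commands that disable and then drop a table."""
    if "'" in table:
        raise ValueError("table name must not contain a single quote")
    return "disable '%s'\ndrop '%s'\nexit\n" % (table, table)


def drop_table(table, hbase_cmd="hbase"):
    """Pipe the script into `hbase shell` and return the shell's exit code.

    Assumption: `hbase_cmd` resolves to the hbase launcher of the install
    that Nutch is actually talking to.
    """
    result = subprocess.run(
        [hbase_cmd, "shell"],
        input=drop_table_script(table),
        text=True,
    )
    return result.returncode
```

Calling drop_table("webpage") would remove the table Nutch creates, so it is only sensible on a test crawl.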


On Tue, Jun 4, 2013 at 4:04 AM, Ferdy Galema <fe...@kalooga.com>wrote:

> I would say no it doesn't matter what version of Hadoop you have, at least
> not for the error you are having now. I have only seen the error you are
> having when a client of HBase tries to connect to a different version of an
> HBase server. (Incompatible versions that is). 0.90.X should work though.
>
> You don't need any special configuration to get Nutch2/HBase working,
> besides setting the HBaseStore in gora-properties and pointing to the right
> zookeeper in hbase.zookeeper.quorum. (Although that is not really necessary
> when you just have a local zookeeper).
>
> Perhaps you are confusing zookeeper and HBase master in your config. Both
> are needed with HBase (and then there is the regionserver). The HBase master
> usually listens at 60000 (its web UI at 60010), the zookeeper at 2181. When
> you start HBase in stand-alone mode then you get all 3 services in one go
> (HBase master, zookeeper and a regionserver). Finally, like Lewis said, I
> wouldn't bother with a stand-alone Hadoop cluster until you are able to run
> Nutch jobs locally.

Re: How to setup HBase as backend

Posted by Ferdy Galema <fe...@kalooga.com>.
I would say no, it doesn't matter what version of Hadoop you have, at least
not for the error you are having now. I have only seen that error when an
HBase client tries to connect to an HBase server of a different, incompatible
version. 0.90.X should work though.

You don't need any special configuration to get Nutch2/HBase working,
besides setting the HBaseStore in gora.properties and pointing to the right
zookeeper in hbase.zookeeper.quorum (although that is not really necessary
when you just have a local zookeeper).
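
The gora.properties setting in question is the gora.datastore.default line quoted further down the thread. As a rough illustration, the sketch below checks a properties file's text for that exact value. The parser is a simplified stand-in for the Java properties format (no line continuations or escapes), and the function names are invented here.

```python
# Value taken from the gora.properties line quoted in this thread.
HBASE_STORE = "org.apache.gora.hbase.store.HBaseStore"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # or ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def uses_hbase_store(text):
    """True when gora.datastore.default is set to the HBase store class."""
    return parse_properties(text).get("gora.datastore.default") == HBASE_STORE
```

A misspelled class name or a commented-out line both make the check fail, which is the kind of slip that is easy to miss by eye.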

Perhaps you are confusing zookeeper and HBase master in your config. Both
are needed with HBase (and then there is the regionserver). The HBase master
usually listens at 60000 (its web UI at 60010), the zookeeper at 2181. When
you start HBase in stand-alone mode then you get all 3 services in one go
(HBase master, zookeeper and a regionserver). Finally, like Lewis said, I
wouldn't bother with a stand-alone Hadoop cluster until you are able to run
Nutch jobs locally.
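
To see whether the services are actually up, one option is to try a plain TCP connection to each port. This is a sketch under assumptions: it uses localhost, the zookeeper port 2181 mentioned above, and 60010, which as far as I know is the default port of the HBase master web UI in these versions; adjust both if your hbase-site.xml overrides them. An open port only shows that something is listening, not that the versions are compatible.

```python
import socket

# Assumed defaults; change them if your configuration uses other ports.
SERVICES = {"zookeeper": 2181, "hbase master web UI": 60010}


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(host="localhost", services=SERVICES):
    """Map each service name to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in services.items()}


if __name__ == "__main__":
    for name, is_up in probe().items():
        print("%s: %s" % (name, "listening" if is_up else "not reachable"))
```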


On Tue, Jun 4, 2013 at 3:25 AM, Yves S. Garret
<yo...@gmail.com>wrote:

> One more question, would it matter what version of Hadoop that I have?
>
>
> On Thu, May 30, 2013 at 6:57 PM, Lewis John Mcgibbney <
> lewis.mcgibbney@gmail.com> wrote:
>
> > In all honesty I would make sure that you have a local and up-to-date
> > nutch-$version.job file generated and try it out in runtime/local before
> > using the job in /runtime/deploy on your cluster.
> > You will know if it is good to go or not.
> > When you are ready to deploy it to your cluster (e.g. once you're satisfied
> > that it works on a test/subset crawl setup), then just make it available to
> > your Hadoop JobTracker classpath.
> >
> >
> > On Thu, May 30, 2013 at 3:48 PM, Yves S. Garret
> > <yo...@gmail.com>wrote:
> >
> > > I have $HADOOP_INSTALL in my path, would this be enough Lewis?  Or
> > > would I need to copy around some jar files?
> > >
> > >
> > > On Thu, May 30, 2013 at 6:35 PM, Lewis John Mcgibbney <
> > > lewis.mcgibbney@gmail.com> wrote:
> > >
> > > > Make sure that everything is compiled and you are running from runtime or
> > > > with the Jar in hadoop
> > > >
> > > >
> > > > On Thu, May 30, 2013 at 3:00 PM, Yves S. Garret
> > > > <yo...@gmail.com>wrote:
> > > >
> > > > > Here is my hbase-site.xml:
> > > > > http://bin.cakephp.org/view/2054577438
> > > > >
> > > > > I've set this property as well.
> > > > >
> > > > >
> > > > > On Thu, May 30, 2013 at 5:57 PM, Shah, Nishant <nishans@amazon.com>
> > > > > wrote:
> > > > > >
> > > > > > What about your storage.data.store.class property in nutch-site.xml?
> > > > > > I think you have to change the value to use HBase. For me it is
> > > > > > org.apache.gora.hbase.store.HBaseStore.
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Yves S. Garret [mailto:yoursurrogategod@gmail.com]
> > > > > > Sent: Thursday, May 30, 2013 2:52 PM
> > > > > > To: user@nutch.apache.org
> > > > > > Subject: Re: How to setup HBase as backend
> > > > > >
> > > > > > Yes.  For the moment, for simplicity's sake, I have everything going to
> > > > > > /tmp.
> > > > > >
> > > > > > hbase(main):004:0> scan 'test'
> > > > > > ROW
> > > > > > COLUMN+CELL
> > > > > >
> > > > > > 0 row(s) in 0.2370 seconds
> > > > > >
> > > > > > I _should_ have a table "webpage" being created when I run Nutch.
> > > > > >
> > > > > >
> > > > > > On Thu, May 30, 2013 at 5:23 PM, Shah, Nishant <nishans@amazon.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Is your hbase running ?
> > > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Yves S. Garret [mailto:yoursurrogategod@gmail.com]
> > > > > > > Sent: Thursday, May 30, 2013 2:18 PM
> > > > > > > To: user@nutch.apache.org
> > > > > > > Subject: Re: How to setup HBase as backend
> > > > > > >
> > > > > > > Even when I do bin/nutch generate, this is what I get:
> > > > > > > http://bin.cakephp.org/view/1815127825
> > > > > > >
> > > > > > >
> > > > > > > On Thu, May 30, 2013 at 5:14 PM, Yves S. Garret
> > > > > > > <yo...@gmail.com>wrote:
> > > > > > >
> > > > > > > > Ok, similar issue:
> > > > > > > > http://bin.cakephp.org/view/180499048
> > > > > > > >
> > > > > > > > I've left the config defaults as they were, except for this line in
> > > > > > > > gora.properties in Apache Nutch:
> > > > > > > > gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, May 29, 2013 at 7:40 PM, Lewis John Mcgibbney <
> > > > > > > > lewis.mcgibbney@gmail.com> wrote:
> > > > > > > >
> > > > > > > >> Yes, as Tejas mentioned, he runs fine with 0.90.6. API changes make
> > > > > > > >> more recent HBase versions incompatible.
> > > > > > > >> We will be upgrading HBase API usage in Gora within the current
> > > > > > > >> development drive.
> > > > > > > >> Lewis
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> On Wed, May 29, 2013 at 4:36 PM, Yves S. Garret
> > > > > > > >> <yo...@gmail.com>wrote:
> > > > > > > >>
> > > > > > > >> > Would HBase 0.90.X and Nutch 2.1 work?
> > > > > > > >> >
> > > > > > > >> >
> > > > > > > >> > On Wed, May 29, 2013 at 5:05 PM, Lewis John Mcgibbney <
> > > > > > > >> > lewis.mcgibbney@gmail.com> wrote:
> > > > > > > >> >
> > > > > > > >> > > This is incompatible.
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > On Wed, May 29, 2013 at 1:59 PM, Yves S. Garret
> > > > > > > >> > > <yo...@gmail.com>wrote:
> > > > > > > >> > >
> > > > > > > >> > > > Hi all, I'm using HBase 0.94.7 and Nutch 2.1.
> > > > > > > >> > > >
> > > > > > > >> > > >
> > > > > > > >> > > > On Wed, May 29, 2013 at 4:55 PM, Adriana Farina
> > > > > > > >> > > > <ad...@gmail.com>wrote:
> > > > > > > >> > > >
> > > > > > > >> > > > > Hi Yves,
> > > > > > > >> > > > >
> > > > > > > >> > > > > as Tejas said, your issue is almost certainly due to a
> > > > > > > >> > > > > compatibility problem between the version of Nutch and the
> > > > > > > >> > > > > one of HBase.
> > > > > > > >> > > > >
> > > > > > > >> > > > > I had the same problem and in my case it was due to the
> > > > > > > >> > > > > HBase version.
> > > > > > > >> > > > >
> > > > > > > >> > > > > I use Nutch 2.1 with HBase 0.90.4 and it works fine.
> > > > > > > >> > > > > --
> > > > > > > >> > > > > Adriana Farina
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > --
> > > > > > > >> > > *Lewis*
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> *Lewis*
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Lewis*
> > > >
> > >
> >
> >
> >
> > --
> > *Lewis*
> >
>



-- 
*Ferdy Galema*
Kalooga Development

-- 

*Kalooga* | Visual Relevance
Check out our Visual Gallery Layer now!
<http://www.independent.co.uk/arts-entertainment/music/news/david-cameron-gets-teenage-kicks-starring-in-one-direction-music-video-8499282.html#!kalooga-10369/%22One%20Direction%22>

Kalooga
Helperpark 288
9723 ZA Groningen
The Netherlands
+31 50 2103400

www.kalooga.com
info@kalooga.com

Kalooga EMEA
53 Davies Street
W1K 5JH London
United Kingdom
+44 20 7129 1430

Kalooga Spain and LatAM
Maria de Sevilla Diago No 3
28022 Madrid - Madrid
Spain
+34 670 580 872

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
One more question: does it matter what version of Hadoop I have?


On Thu, May 30, 2013 at 6:57 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> In all honesty I would make sure that you have a local and up-to-date
> nutch-$version.job file generated and try it out in runtime/local before
> using the job in /runtime/deploy on your cluster.
> You will know if it is good to go or not.
> When you are ready to deploy it to your cluster (e.g. once you're satisfied
> that it works on a test/subset crawl setup), then just make it available to
> your Hadoop JobTracker classpath.
>
>
> On Thu, May 30, 2013 at 3:48 PM, Yves S. Garret
> <yo...@gmail.com>wrote:
>
> > I have $HADOOP_INSTALL in my path; would this be enough, Lewis?  Or
> > would I need to copy around some jar files?
> >
> >
> > On Thu, May 30, 2013 at 6:35 PM, Lewis John Mcgibbney <
> > lewis.mcgibbney@gmail.com> wrote:
> >
> > > Make sure that everything is compiled and you are running from runtime
> > > or with the Jar in hadoop
> > >
> > >
> > > On Thu, May 30, 2013 at 3:00 PM, Yves S. Garret
> > > <yo...@gmail.com>wrote:
> > >
> > > > Here is my hbase-site.xml:
> > > > http://bin.cakephp.org/view/2054577438
> > > >
> > > > I've set this property as well.
> > > >
> > > >
> > > > On Thu, May 30, 2013 at 5:57 PM, Shah, Nishant <ni...@amazon.com>
> > > > wrote:
> > > >
> > > > > What about your storage.data.store.class property in nutch-site.xml?
> > > > > I think you have to change the value to use HBase. For me it is
> > > > > org.apache.gora.hbase.store.HBaseStore.
> > > > >
> > > > > -----Original Message-----
> > > > > From: Yves S. Garret [mailto:yoursurrogategod@gmail.com]
> > > > > Sent: Thursday, May 30, 2013 2:52 PM
> > > > > To: user@nutch.apache.org
> > > > > Subject: Re: How to setup HBase as backend
> > > > >
> > > > > > Yes.  For the moment, for simplicity's sake, I have everything going
> > > > > > to /tmp.
> > > > >
> > > > > hbase(main):004:0> scan 'test'
> > > > > ROW
> > > > > COLUMN+CELL
> > > > >
> > > > > 0 row(s) in 0.2370 seconds
> > > > >
> > > > > > I _should_ have a table "webpage" being created when I run Nutch.
> > > > >
> > > > >
> > > > > > On Thu, May 30, 2013 at 5:23 PM, Shah, Nishant <nishans@amazon.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Is your HBase running?
> > > > > >
> > > > > > -----Original Message-----
> > > > > > From: Yves S. Garret [mailto:yoursurrogategod@gmail.com]
> > > > > > Sent: Thursday, May 30, 2013 2:18 PM
> > > > > > To: user@nutch.apache.org
> > > > > > Subject: Re: How to setup HBase as backend
> > > > > >
> > > > > > Even when I do bin/nutch generate, this is what I get:
> > > > > > http://bin.cakephp.org/view/1815127825
> > > > > >
> > > > > >
> > > > > > On Thu, May 30, 2013 at 5:14 PM, Yves S. Garret
> > > > > > <yo...@gmail.com>wrote:
> > > > > >
> > > > > > > Ok, similar issue:
> > > > > > > http://bin.cakephp.org/view/180499048
> > > > > > >
> > > > > > > > I've left the defaults for config as they were, except this is in
> > > > > > > > gora.properties in Apache Nutch:
> > > > > > > > gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
> > > > > > >
> > > > > > >
> > > > > > > On Wed, May 29, 2013 at 7:40 PM, Lewis John Mcgibbney <
> > > > > > > lewis.mcgibbney@gmail.com> wrote:
> > > > > > >
> > > > > > > >> Yes, as Tejas mentioned, he runs fine with 0.90.6. API changes make
> > > > > > > >> more recent HBase versions incompatible.
> > > > > > > >> We will be upgrading HBase API usage in Gora within the current
> > > > > > > >> development drive.
> > > > > > >> Lewis
> > > > > > >>
> > > > > > >>
> > > > > > >> On Wed, May 29, 2013 at 4:36 PM, Yves S. Garret
> > > > > > >> <yo...@gmail.com>wrote:
> > > > > > >>
> > > > > > >> > Would HBase 0.90.X and Nutch 2.1 work?
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > On Wed, May 29, 2013 at 5:05 PM, Lewis John Mcgibbney <
> > > > > > >> > lewis.mcgibbney@gmail.com> wrote:
> > > > > > >> >
> > > > > > >> > > This is incompatible.
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > On Wed, May 29, 2013 at 1:59 PM, Yves S. Garret
> > > > > > >> > > <yo...@gmail.com>wrote:
> > > > > > >> > >
> > > > > > >> > > > Hi all, I'm using HBase 0.94.7 and Nutch 2.1.
> > > > > > >> > > >
> > > > > > >> > > >
> > > > > > >> > > > On Wed, May 29, 2013 at 4:55 PM, Adriana Farina
> > > > > > >> > > > <ad...@gmail.com>wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Hi Yves,
> > > > > > >> > > > >
> > > > > > > >> > > > > as Tejas said, your issue is almost certainly due to a
> > > > > > > >> > > > > compatibility problem between the version of Nutch and the
> > > > > > > >> > > > > one of HBase.
> > > > > > > >> > > > >
> > > > > > > >> > > > > I had the same problem and in my case it was due to the
> > > > > > > >> > > > > HBase version.
> > > > > > > >> > > > >
> > > > > > > >> > > > > I use Nutch 2.1 with HBase 0.90.4 and it works fine.
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > > 2013/5/29 Yves S. Garret <yo...@gmail.com>
> > > > > > >> > > > >
> > > > > > > >> > > > > > Hi, I'm trying to run Nutch this time around with HBase in the
> > > > > > > >> > > > > > background, as opposed to having MySQL instead.
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > In the past, I followed this tutorial:
> > > > > > > >> > > > > > http://nlp.solutions.asia/?p=180
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > This was all in good, but now that I have my HBase, I'd like to
> > > > > > > >> > > > > > use that.
> > > > > > > >> > > > > > I left the configuration of Nutch as it was and proceeded to crawl
> > > > > > > >> > > > > > nutch.apache.org.  I got this error:
> > > > > > > >> > > > > > http://bin.cakephp.org/view/1301117746
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > What am I doing wrong?
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > At the moment, I'm reading through this, trying to get my stack to
> > > > > > > >> > > > > > work, will write back if I make any progress:
> > > > > > > >> > > > > > http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html
> > > > > > >> > > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> > > > > --
> > > > > > >> > > > > Adriana Farina
> > > > > > >> > > > >
> > > > > > >> > > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > --
> > > > > > >> > > *Lewis*
> > > > > > >> > >
> > > > > > >> >
> > > > > > >>
> > > > > > >>
> > > > > > >>
> > > > > > >> --
> > > > > > >> *Lewis*
> > > > > > >>
> > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > *Lewis*
> > >
> >
>
>
>
> --
> *Lewis*
>

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
One more question, if I get an error like this, what would be the first
thing to look at in terms of configuration files?

$ bin/nutch crawl urls -depth 3 -topN 5
Exception in thread "main" org.apache.gora.util.GoraException: java.lang.IllegalArgumentException: Not a host:port pair: �14856@ysg.connectlocalhost,34547,1370285799327
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
    at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.lang.IllegalArgumentException: Not a host:port pair: �14856@ysg.connectlocalhost,34547,1370285799327
    at org.apache.hadoop.hbase.HServerAddress.<init>(HServerAddress.java:60)
    at org.apache.hadoop.hbase.MasterAddressTracker.getMasterAddress(MasterAddressTracker.java:63)
    at org.apache.hadoop.hbase.client.HConnectionManager$HConnectionImplementation.getMaster(HConnectionManager.java:354)
    at org.apache.hadoop.hbase.client.HBaseAdmin.<init>(HBaseAdmin.java:94)
    at org.apache.gora.hbase.store.HBaseStore.initialize(HBaseStore.java:108)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    ... 8 more
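
On the configuration side, the one setting this thread keeps coming back to is gora.datastore.default in gora.properties. Below is a minimal shell sketch for checking it. The NUTCH_HOME variable, the runtime/local/conf location and the helper names get_property and uses_hbase_store are assumptions made for illustration, not something stated in the thread.

```shell
#!/bin/sh
# Print the value of KEY from a Java-style properties FILE.
# Only plain "key=value" lines are handled; commented-out lines are skipped
# and, if the key appears more than once, the last occurrence wins.
get_property() {
    file=$1
    # Escape dots so "gora.datastore.default" is matched literally.
    key_re=$(printf '%s' "$2" | sed 's/\./\\./g')
    sed -n "s/^[[:space:]]*${key_re}[[:space:]]*=[[:space:]]*//p" "$file" | tail -n 1
}

# Succeed only if FILE selects the HBase store class quoted in this thread.
uses_hbase_store() {
    [ "$(get_property "$1" gora.datastore.default)" = "org.apache.gora.hbase.store.HBaseStore" ]
}

# Assumed location of the file that the local runtime reads; adjust as needed.
conf="${NUTCH_HOME:-.}/runtime/local/conf/gora.properties"
if [ -f "$conf" ]; then
    if uses_hbase_store "$conf"; then
        echo "$conf selects HBaseStore"
    else
        echo "$conf selects: $(get_property "$conf" gora.datastore.default)"
    fi
fi
```

If this reports HBaseStore and the error persists, the configuration files are probably not the culprit and the HBase version on the client and server sides is the next thing to compare.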



On Mon, Jun 3, 2013 at 3:31 PM, Tejas Patil <te...@gmail.com>wrote:

> HBase 0.90.6 is fine. I use that and didn't face any problems.

Re: How to setup HBase as backend

Posted by Tejas Patil <te...@gmail.com>.
HBase 0.90.6 is fine. I use that and didn't face any problems.


On Mon, Jun 3, 2013 at 11:57 AM, Yves S. Garret
<yo...@gmail.com>wrote:

> Positive, I have HBase 0.90.6 running at the moment.  Or would I need
> to revert to an earlier build?

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Positive, I have HBase 0.90.6 running at the moment.  Or would I need
to revert to an earlier build?
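
Since the "Not a host:port pair" error has so far been tied to a client/server version mismatch in this thread, it may be worth confirming which HBase jars are actually being picked up, rather than which install is assumed to be running. A rough sketch follows, assuming jar names of the form hbase-<version>.jar and that NUTCH_HOME and HBASE_HOME point at the respective install roots; both variables, the directory layout and the helper names are assumptions.

```shell
#!/bin/sh
# Print the version embedded in an HBase jar file name,
# e.g. "/opt/hbase/hbase-0.90.6.jar" -> "0.90.6".
hbase_jar_version() {
    name=$(basename "$1" .jar)
    printf '%s\n' "${name#hbase-}"
}

# Succeed only for the 0.90.x line, which is what the thread reports as
# working with Nutch 2.1.
is_supported_hbase() {
    case "$1" in
        0.90.*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Assumed locations: the client jar bundled with the Nutch local runtime and
# the jar in the root of the HBase install that HBASE_HOME points to.
for jar in "${NUTCH_HOME:-.}"/runtime/local/lib/hbase-[0-9]*.jar \
           "${HBASE_HOME:-.}"/hbase-[0-9]*.jar; do
    [ -e "$jar" ] || continue
    version=$(hbase_jar_version "$jar")
    if is_supported_hbase "$version"; then
        echo "$jar: $version (0.90.x)"
    else
        echo "$jar: $version (not 0.90.x, likely incompatible)"
    fi
done
```

If the two locations report different release lines, the environment is probably pointing at a different install than the one believed to be running.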


On Mon, Jun 3, 2013 at 3:47 AM, Ferdy Galema <fe...@kalooga.com>wrote:

> Hi,
>
> The following line still looks like you're trying to connect to a newer
> version of HBase, instead of the supported 0.90.X. Are you absolutely sure
> you are running on 0.90? And not 0.92, 0.94, 0.95?
>
> GeneratorJob: org.apache.gora.util.GoraException:
> java.lang.IllegalArgumentException: Not a host:port pair:
> � 5279@ysg.connectlocalhost,51982,1369874616660
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > This was all in good, but now that I have my
> HBase,
> > > I'd
> > > > > > > >> > > > > > like to
> > > > > > > >> use
> > > > > > > >> > > > that.
> > > > > > > >> > > > > > I left the configuration of Nutch as it was and
> > > > proceeded
> > > > > > > >> > > > > > to
> > > > > > > >> crawl
> > > > > > > >> > > > > > nutch.apache.org.  I got this error:
> > > > > > > >> > > > > > http://bin.cakephp.org/view/1301117746
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > What am I doing wrong?
> > > > > > > >> > > > > >
> > > > > > > >> > > > > > At the moment, I'm reading through this, trying to
> > get
> > > > my
> > > > > > > >> > > > > > stack
> > > > > > > >> to
> > > > > > > >> > > > work,
> > > > > > > >> > > > > > will write back if I make any progress:
> > > > > > > >> > > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-stora
> > > > > > > >> ge
> > > > > > > >> .html
> > > > > > > >> > > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > >
> > > > > > > >> > > > > --
> > > > > > > >> > > > > Adriana Farina
> > > > > > > >> > > > >
> > > > > > > >> > > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > >
> > > > > > > >> > > --
> > > > > > > >> > > *Lewis*
> > > > > > > >> > >
> > > > > > > >> >
> > > > > > > >>
> > > > > > > >>
> > > > > > > >>
> > > > > > > >> --
> > > > > > > >> *Lewis*
> > > > > > > >>
> > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Lewis*
> > > >
> > >
> >
> >
> >
> > --
> > *Lewis*
> >
>
>
>
> --
> *Ferdy Galema*
> Kalooga Development
>
>

Re: How to setup HBase as backend

Posted by Ferdy Galema <fe...@kalooga.com>.
Hi,

The following line still looks like you're trying to connect to a newer
version of HBase instead of the supported 0.90.X. Are you absolutely sure
you are running on 0.90, and not 0.92, 0.94 or 0.95?

GeneratorJob: org.apache.gora.util.GoraException:
java.lang.IllegalArgumentException: Not a host:port pair:
� 5279@ysg.connectlocalhost,51982,1369874616660
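
To illustrate why that string trips a strict parser, here is a minimal
Python sketch. It is not the Gora or HBase code. It assumes (not confirmed
in this thread) that the client expects a plain "host:port" string, while
the value it received ends in a "host,port,startcode" style name. The
function names and both regular expressions are hypothetical.

```python
import re

# Assumed shape of what the client expects: a plain "host:port" string.
HOST_PORT = re.compile(r"^(?P<host>[^:,\s]+):(?P<port>\d+)$")

# Assumed shape of the "host,port,startcode" name seen at the end of the error.
SERVER_NAME = re.compile(
    r"(?P<host>[A-Za-z0-9.\-]+),(?P<port>\d+),(?P<startcode>\d+)$"
)


def parse_host_port(address):
    """Mimic a strict host:port parser; raise ValueError for anything else."""
    match = HOST_PORT.match(address)
    if match is None:
        raise ValueError("Not a host:port pair: " + address)
    return match.group("host"), int(match.group("port"))


def looks_like_server_name(address):
    """Return True if the string ends in host,port,startcode."""
    return SERVER_NAME.search(address) is not None


if __name__ == "__main__":
    # The printable part of the value from the error above.
    reported = "5279@ysg.connectlocalhost,51982,1369874616660"
    try:
        parse_host_port(reported)
    except ValueError as err:
        print(err)
    print(looks_like_server_name(reported))
    print(parse_host_port("localhost:60000"))
```

If the reported value matches the second pattern rather than the first, that
is a hint (not proof) that the server is newer than the client library.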






On Fri, May 31, 2013 at 12:57 AM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> In all honesty, I would make sure that you have a local, up-to-date
> nutch-$version.job file generated and try it out in runtime/local before
> using the job in runtime/deploy on your cluster.
> You will know whether it is good to go or not.
> When you are ready to deploy it to your cluster (e.g. once you're satisfied
> that it works on a test/subset crawl), just make it available on
> your Hadoop JobTracker classpath.
>
> --
> *Lewis*
>



-- 
*Ferdy Galema*
Kalooga Development


Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Hi Lewis, at the moment, I'm running everything (HBase and Nutch)
locally.  I just want it to work on my laptop before I go further.


On Thu, May 30, 2013 at 6:57 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> In all honesty, I would make sure that you have a local, up-to-date
> nutch-$version.job file generated and try it out in runtime/local before
> using the job in runtime/deploy on your cluster.
> You will know whether it is good to go or not.
> When you are ready to deploy it to your cluster (e.g. once you're satisfied
> that it works on a test/subset crawl), just make it available on
> your Hadoop JobTracker classpath.
>
> --
> *Lewis*
>

Re: How to setup HBase as backend

Posted by Lewis John Mcgibbney <le...@gmail.com>.
In all honesty, I would make sure that you have a local, up-to-date
nutch-$version.job file generated and try it out in runtime/local before
using the job in runtime/deploy on your cluster.
You will know whether it is good to go or not.
When you are ready to deploy it to your cluster (e.g. once you're satisfied
that it works on a test/subset crawl), just make it available on
your Hadoop JobTracker classpath.


On Thu, May 30, 2013 at 3:48 PM, Yves S. Garret
<yo...@gmail.com> wrote:

> I have $HADOOP_INSTALL in my path, would this be enough Lewis?  Or
> would I need to copy around some jar files?
>



-- 
*Lewis*

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
I have $HADOOP_INSTALL in my path. Would this be enough, Lewis?  Or
would I need to copy around some jar files?


On Thu, May 30, 2013 at 6:35 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Make sure that everything is compiled and you are running from runtime or
> with the Jar in hadoop
>
> --
> *Lewis*
>

Re: How to setup HBase as backend

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Make sure that everything is compiled and that you are running from runtime,
or with the jar in Hadoop.
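
One way to double-check which HBase client library a build actually picked
up is to look at the jar names in its lib directory. The Python sketch below
is only an illustration. It assumes the jars are named like
hbase-X.Y.Z.jar and that the directory is runtime/local/lib; both are
assumptions, and NUTCH_LIB_DIR is a made-up environment variable, so adjust
them to your checkout (or point the script at the directory under
$HBASE_HOME).

```python
import os
import re

# Matches names like "hbase-0.90.4.jar"; the naming pattern is an assumption.
HBASE_JAR = re.compile(r"^hbase-(\d+)\.(\d+)\.(\d+)\.jar$")


def find_hbase_versions(directory):
    """Return sorted (major, minor, patch) tuples for hbase-X.Y.Z.jar files."""
    versions = []
    for name in os.listdir(directory):
        match = HBASE_JAR.match(name)
        if match:
            versions.append(tuple(int(part) for part in match.groups()))
    return sorted(versions)


def is_supported(version):
    """True for 0.90.x, the line this thread reports as working with Nutch 2.1."""
    return version[:2] == (0, 90)


if __name__ == "__main__":
    # Hypothetical default location; override with NUTCH_LIB_DIR if needed.
    lib_dir = os.environ.get("NUTCH_LIB_DIR", "runtime/local/lib")
    if not os.path.isdir(lib_dir):
        print("directory not found: " + lib_dir)
    else:
        for version in find_hbase_versions(lib_dir):
            label = "ok" if is_supported(version) else "likely incompatible"
            print(".".join(str(part) for part in version), label)
```

Running the same check against the HBase install the server was started from
shows whether client and server are on the same release line.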


On Thu, May 30, 2013 at 3:00 PM, Yves S. Garret
<yo...@gmail.com> wrote:

> Here is my hbase-site.xml:
> http://bin.cakephp.org/view/2054577438
>
> I've set this property as well.
>
>



-- 
*Lewis*

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Here is my hbase-site.xml:
http://bin.cakephp.org/view/2054577438

I've set this property as well.


On Thu, May 30, 2013 at 5:57 PM, Shah, Nishant <ni...@amazon.com> wrote:

> What about your storage.data.store.class property in nutch-site.xml? I
> think you have to change the value to use HBase. For me it is
> org.apache.gora.hbase.store.HBaseStore.
>

RE: How to setup HBase as backend

Posted by "Shah, Nishant" <ni...@amazon.com>.
What about your storage.data.store.class property in nutch-site.xml? I think you have to change the value to use HBase. For me it is org.apache.gora.hbase.store.HBaseStore.
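
For anyone who wants to check this setting without opening the file by hand, here is a minimal sketch. It assumes nutch-site.xml uses the usual Hadoop-style configuration/property/name/value layout. The helper name get_property and the sample XML are made up for illustration, and the class name is the one spelled out in gora.properties elsewhere in this thread.

```python
import xml.etree.ElementTree as ET

# Class name as it appears in gora.properties elsewhere in this thread.
HBASE_STORE = "org.apache.gora.hbase.store.HBaseStore"

# Hypothetical sample; a real nutch-site.xml will contain other properties too.
SAMPLE_NUTCH_SITE = """
<configuration>
  <property>
    <name>storage.data.store.class</name>
    <value>org.apache.gora.hbase.store.HBaseStore</value>
  </property>
</configuration>
""".strip()


def get_property(xml_text, name):
    """Return the value of the named property, or None if it is not set."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            return (prop.findtext("value") or "").strip()
    return None


if __name__ == "__main__":
    value = get_property(SAMPLE_NUTCH_SITE, "storage.data.store.class")
    print("storage.data.store.class =", value)
    print("uses HBaseStore:", value == HBASE_STORE)
```

To run it against a real install, read the contents of conf/nutch-site.xml into a string and pass that in place of the sample.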

-----Original Message-----
From: Yves S. Garret [mailto:yoursurrogategod@gmail.com] 
Sent: Thursday, May 30, 2013 2:52 PM
To: user@nutch.apache.org
Subject: Re: How to setup HBase as backend

Yes.  For the moment, for simplicity's sake, I have everything going to /tmp.

hbase(main):004:0> scan 'test'
ROW                COLUMN+CELL
0 row(s) in 0.2370 seconds

I _should_ have a table "webpage" being created when I run Nutch.



Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Yes.  For the moment, for simplicity's sake, I have everything going to /tmp.

hbase(main):004:0> scan 'test'
ROW                COLUMN+CELL
0 row(s) in 0.2370 seconds

I _should_ have a table "webpage" being created when I run Nutch.


On Thu, May 30, 2013 at 5:23 PM, Shah, Nishant <ni...@amazon.com> wrote:

> Is your HBase running?
>

RE: How to setup HBase as backend

Posted by "Shah, Nishant" <ni...@amazon.com>.
Is your HBase running?

-----Original Message-----
From: Yves S. Garret [mailto:yoursurrogategod@gmail.com] 
Sent: Thursday, May 30, 2013 2:18 PM
To: user@nutch.apache.org
Subject: Re: How to setup HBase as backend

Even when I do bin/nutch generate, this is what I get:
http://bin.cakephp.org/view/1815127825



Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Even when I do bin/nutch generate, this is what I get:
http://bin.cakephp.org/view/1815127825


On Thu, May 30, 2013 at 5:14 PM, Yves S. Garret
<yo...@gmail.com>wrote:

> Ok, similar issue:
> http://bin.cakephp.org/view/180499048
>
> I've left the config defaults as they were, except for this line in
> gora.properties in Apache Nutch:
> gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
>
>

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Ok, similar issue:
http://bin.cakephp.org/view/180499048

I've left the config defaults as they were, except for this line in
gora.properties in Apache Nutch:
gora.datastore.default=org.apache.gora.hbase.store.HBaseStore
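
As a quick sanity check on that line, here is a minimal sketch that parses gora.properties-style text and confirms the default datastore is the HBase one. The function names are hypothetical, and the parser only handles plain key=value lines and # or ! comments (no line continuations or escapes), which is an assumption about how the file is written.

```python
# Class name taken from the gora.properties line quoted in this thread.
EXPECTED_STORE = "org.apache.gora.hbase.store.HBaseStore"


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # / ! comments."""
    props = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#!":
            continue
        key, sep, value = line.partition("=")
        if sep:  # ignore lines that have no '=' at all
            props[key.strip()] = value.strip()
    return props


def uses_hbase_store(text):
    """True if gora.datastore.default is set to the HBase store class."""
    return parse_properties(text).get("gora.datastore.default") == EXPECTED_STORE


if __name__ == "__main__":
    sample = "gora.datastore.default=org.apache.gora.hbase.store.HBaseStore\n"
    print("HBase store configured:", uses_hbase_store(sample))
```

A commented-out line does not count as set, which is an easy thing to miss when the shipped file has several alternative stores listed.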


On Wed, May 29, 2013 at 7:40 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> Yes, as Tejas mentioned, he runs fine with 0.90.6.
> API changes make more recent HBase versions incompatible.
> We will be upgrading HBase API usage in Gora within the current development
> drive.
> Lewis
>
>

Re: How to setup HBase as backend

Posted by Lewis John Mcgibbney <le...@gmail.com>.
Yes, as Tejas mentioned, he runs fine with 0.90.6.
API changes make more recent HBase versions incompatible.
We will be upgrading HBase API usage in Gora within the current development
drive.
Lewis


On Wed, May 29, 2013 at 4:36 PM, Yves S. Garret
<yo...@gmail.com>wrote:

> Would HBase 0.90.X and Nutch 2.1 work?
>
>



-- 
*Lewis*

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Would HBase 0.90.X and Nutch 2.1 work?


On Wed, May 29, 2013 at 5:05 PM, Lewis John Mcgibbney <
lewis.mcgibbney@gmail.com> wrote:

> This is incompatible.
>
>

Re: How to setup HBase as backend

Posted by Lewis John Mcgibbney <le...@gmail.com>.
This is incompatible.


On Wed, May 29, 2013 at 1:59 PM, Yves S. Garret
<yo...@gmail.com>wrote:

> Hi all, I'm using HBase 0.94.7 and Nutch 2.1.
>
>



-- 
*Lewis*

Re: How to setup HBase as backend

Posted by "Yves S. Garret" <yo...@gmail.com>.
Hi all, I'm using HBase 0.94.7 and Nutch 2.1.


On Wed, May 29, 2013 at 4:55 PM, Adriana Farina
<ad...@gmail.com>wrote:

> Hi Yves,
>
> as Tejas said, your issue is almost certainly due to a compatibility
> problem between the version of Nutch and the one of HBase.
>
> I had the same problem and in my case it was due to the HBase version.
>
> I use Nutch 2.1 with HBase 0.90.4 and it works fine.
>
>
>
> 2013/5/29 Yves S. Garret <yo...@gmail.com>
>
> > Hi, I'm trying to run Nutch this time around with HBase in the
> background,
> > as
> > opposed to having MySQL instead.
> >
> > In the past, I followed this tutorial:
> > http://nlp.solutions.asia/?p=180
> >
> > This was all in good, but now that I have my HBase, I'd like to use that.
> > I left the configuration of Nutch as it was and proceeded to crawl
> > nutch.apache.org.  I got this error:
> > http://bin.cakephp.org/view/1301117746
> >
> > What am I doing wrong?
> >
> > At the moment, I'm reading through this, trying to get my stack to work,
> > will write back if I make any progress:
> > http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html
> >
>
>
>
> --
> Adriana Farina
>

Re: How to setup HBase as backend

Posted by Adriana Farina <ad...@gmail.com>.
Hi Yves,

as Tejas said, your issue is almost certainly caused by a version
incompatibility between Nutch and HBase.

I had the same problem and in my case it was due to the HBase version.

I use Nutch 2.1 with HBase 0.90.4 and it works fine.



2013/5/29 Yves S. Garret <yo...@gmail.com>

> Hi, I'm trying to run Nutch this time around with HBase in the background,
> as opposed to having MySQL instead.
>
> In the past, I followed this tutorial:
> http://nlp.solutions.asia/?p=180
>
> This was all in good, but now that I have my HBase, I'd like to use that.
> I left the configuration of Nutch as it was and proceeded to crawl
> nutch.apache.org.  I got this error:
> http://bin.cakephp.org/view/1301117746
>
> What am I doing wrong?
>
> At the moment, I'm reading through this, trying to get my stack to work,
> will write back if I make any progress:
> http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html
>



-- 
Adriana Farina

Re: How to setup HBase as backend

Posted by Tejas Patil <te...@gmail.com>.
What version of HBase are you using? See this thread [0].

The official Nutch wiki [1] says:
"Gora 0.2 uses HBase 0.90.4, however the setup is known to work with more
recent versions of the HBase 0.90.x branch"

I use hbase-0.90.6 with Nutch 2.x and it works fine.

[0] : http://mail-archives.apache.org/mod_mbox/nutch-user/201304.mbox/%3CCAGaRif0mv4DD+xTRyhLfwyrC+KOoa1mR_ndNrJUBYXuKPYPONQ@mail.gmail.com%3E
[1] : https://wiki.apache.org/nutch/Nutch2Tutorial
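
As a rough illustration of the version constraint quoted above, here is a
minimal Python sketch that checks whether a version string falls on the
HBase 0.90.x branch. The function name is_on_090_branch is made up for this
example and is not part of Nutch, Gora, or HBase; the version string is
assumed to be whatever your own HBase install reports.

```python
import re


def is_on_090_branch(version: str) -> bool:
    """Return True if a version string such as '0.90.6' is on the 0.90.x branch.

    Hypothetical helper for illustration only; it encodes the wiki's statement
    that Gora 0.2 is known to work with the HBase 0.90.x branch.
    """
    # Pull out the leading major.minor numbers; a patch level is optional.
    match = re.match(r"^(\d+)\.(\d+)(?:\.\d+)?", version.strip())
    if match is None:
        return False
    major, minor = int(match.group(1)), int(match.group(2))
    return (major, minor) == (0, 90)


if __name__ == "__main__":
    # Versions mentioned in this thread.
    for v in ("0.90.4", "0.90.6", "0.94.7"):
        status = "on 0.90.x" if is_on_090_branch(v) else "not on 0.90.x"
        print(v, status)
```

A check like this only tells you which branch a version string belongs to;
it says nothing about which install the client is actually picking up at
runtime.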

On Wed, May 29, 2013 at 1:42 PM, Yves S. Garret
<yo...@gmail.com> wrote:

> Hi, I'm trying to run Nutch this time around with HBase in the background,
> as opposed to having MySQL instead.
>
> In the past, I followed this tutorial:
> http://nlp.solutions.asia/?p=180
>
> This was all in good, but now that I have my HBase, I'd like to use that.
> I left the configuration of Nutch as it was and proceeded to crawl
> nutch.apache.org.  I got this error:
> http://bin.cakephp.org/view/1301117746
>
> What am I doing wrong?
>
> At the moment, I'm reading through this, trying to get my stack to work,
> will write back if I make any progress:
> http://sujitpal.blogspot.com/2011/01/exploring-nutch-20-hbase-storage.html
>