Posted to user@nutch.apache.org by Alex McLintock <al...@gmail.com> on 2010/06/10 19:17:28 UTC
HBase and RC 1.1 and plugins
I'm not exactly new to Nutch, but haven't used it for a year or so.
I'm a bit out of touch with current "state of the art".
I see there is some HBase code in the form of some patches. I don't
know whether this is more than "proof of concept" stuff.
I also see that there is a 1.1 release candidate in the works.
However, I can see no mention of HBase in the release candidate. Is it
there at all?
If I use Nutch I am going to have to develop several plugins of my own
and perhaps change the way that URLs are found for second and
subsequent crawls. I think that HBase would significantly help with
this.
References:
http://www.gossamer-threads.com/lists/lucene/general/99072
([VOTE] Apache Nutch 1.1 Release Candidate #2)
and
http://people.apache.org/~mattmann/apache-nutch-1.1/rc2/CHANGES-1.1.txt
and
https://issues.apache.org/jira/browse/NUTCH-650
Re: HBase and RC 1.1 and plugins
Posted by Kula <ku...@gmail.com>.
I am also interested in using HBase with Nutch.
2010/6/11 Doğacan Güney <do...@gmail.com>
> Hi,
>
> On Thu, Jun 10, 2010 at 20:17, Alex McLintock <al...@gmail.com> wrote:
>
> > I'm not exactly new to Nutch, but haven't used it for a year or so.
> > I'm a bit out of touch with current "state of the art".
> >
> > I see there is some HBase code in the form of some patches. I don't
> > know whether this is more than "proof of concept" stuff.
> >
> > I also see that there is a 1.1 release candidate in the works.
> >
> > However, I can see no mention of HBase in the release candidate. Is it
> > there at all?
> >
> > If I use Nutch I am going to have to develop several plugins of my own
> > and perhaps change the way that URLs are found for second and
> > subsequent crawls. I think that HBase would significantly help with
> > this.
> >
> >
> > References:
> > http://www.gossamer-threads.com/lists/lucene/general/99072 [VOTE]
> > Apache Nutch 1.1 Release Candidate #2
> > and
> > http://people.apache.org/~mattmann/apache-nutch-1.1/rc2/CHANGES-1.1.txt
> > and
> > https://issues.apache.org/jira/browse/NUTCH-650
> >
>
> Nutch-HBase integration is still on track, but development slowed down a lot
> for a while. It is currently picking up speed again, and early next week I
> will send an email explaining the current situation; we can then discuss
> next steps from there. FWIW, my goal is to finish it for Nutch 2.0.
>
> --
> Doğacan Güney
>
Re: HBase and RC 1.1 and plugins
Posted by Doğacan Güney <do...@gmail.com>.
Hi,
On Thu, Jun 10, 2010 at 20:17, Alex McLintock <al...@gmail.com> wrote:
> I'm not exactly new to Nutch, but haven't used it for a year or so.
> I'm a bit out of touch with current "state of the art".
>
> I see there is some HBase code in the form of some patches. I don't
> know whether this is more than "proof of concept" stuff.
>
> I also see that there is a 1.1 release candidate in the works.
>
> However, I can see no mention of HBase in the release candidate. Is it
> there at all?
>
> If I use Nutch I am going to have to develop several plugins of my own
> and perhaps change the way that URLs are found for second and
> subsequent crawls. I think that HBase would significantly help with
> this.
>
>
> References:
> http://www.gossamer-threads.com/lists/lucene/general/99072 [VOTE]
> Apache Nutch 1.1 Release Candidate #2
> and
> http://people.apache.org/~mattmann/apache-nutch-1.1/rc2/CHANGES-1.1.txt
> and
> https://issues.apache.org/jira/browse/NUTCH-650
>
Nutch-HBase integration is still on track, but development slowed down a lot
for a while. It is currently picking up speed again, and early next week I
will send an email explaining the current situation; we can then discuss
next steps from there. FWIW, my goal is to finish it for Nutch 2.0.
--
Doğacan Güney