Posted to user@hbase.apache.org by Geoff Hendrey <gh...@decarta.com> on 2011/01/19 19:12:19 UTC

IndexOutputFormat?

Hi -

I have downloaded 0.89 because I need to use
HFileOutputFormat.configureIncrementalLoad. I downloaded it from here:
http://mirror.olnevhost.net/pub/apache/hbase/hbase-0.89.20100924/

However, I don't see these two classes:

import org.apache.hadoop.hbase.mapreduce.IndexOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LuceneDocumentWrapper;

 

They were present in the 0.20.6 jar. Where are these classes now located?
(And yes, I will include a doc patch once I have worked out all the
kinks around using incremental load.)

Thanks,

geoff


RE: IndexOutputFormat?

Posted by Geoff Hendrey <gh...@decarta.com>.
"The latter is usually a minor issue (I haven't tried it; I'm just
speaking from experience converting MR jobs to use APIs from new
package).  Are you finding it otherwise Geoff?"

We've been using the new mapreduce APIs. We moved off of mapred a long
time ago. It was a straightforward move, and we liked it because we were
moving in the "right direction" (i.e. to the new APIs). The version of
IndexOutputFormat that was part of 0.20.6 was built on the mapreduce APIs.
So the hbasene package seems stale, in that it has moved backwards to the
"mapred" APIs. Having already moved all our jobs from mapred to
mapreduce, going backward seems weird.

"Why do you have to touch HBase at all Geoff?  Can you not just make a
mapreduce job of adjusted IndexOutputFormat bundling lucene and have
it run against HBase APIs?"

IndexOutputFormat isn't part of Lucene, is it? As far as I know it exists
in two packages:
1) the "original":
http://hbase.apache.org/docs/r0.20.6/api/org/apache/hadoop/hbase/mapreduce/IndexOutputFormat.html

2) and in this hbasene project (flaky? stale?):
org.hbasene.index.create.mapred.IndexOutputFormat

If IndexOutputFormat is available in Lucene, I'd be thrilled to use it!
Is it available in Lucene? I can appreciate the need to jettison cruft.
I just wish that IndexOutputFormat existed somewhere in a discrete jar,
so that my existing code required no changes. Rather, I'd just
put a new jar in my lib.

-g

-----Original Message-----
From: saint.ack@gmail.com [mailto:saint.ack@gmail.com] On Behalf Of
Stack
Sent: Wednesday, January 19, 2011 3:13 PM
To: user@hbase.apache.org
Subject: Re: IndexOutputFormat?


Re: IndexOutputFormat?

Posted by Stack <st...@duboce.net>.
On Wed, Jan 19, 2011 at 3:00 PM, Geoff Hendrey <gh...@decarta.com> wrote:
> I investigated hbasene. The source download relies on the ".mapred" API, not
> ".mapreduce". Its Maven pom doesn't build without a lot of hacking and
> fixing unresolved dependencies, and even when I was able to build the
> source, I am still out of luck because of the mapred vs mapreduce issue.
>

The latter is usually a minor issue (I haven't tried it; I'm just
speaking from experience converting MR jobs to use APIs from the new
package).  Are you finding it otherwise, Geoff?
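
One reason the mapred-versus-mapreduce gap is often small in practice is
that an old-style writer can usually be wrapped rather than rewritten. The
sketch below shows that wrapping idea in plain Java. It is only an
illustration under stated assumptions: OldStyleWriter, NewStyleWriter,
Adapter, and RecordingWriter are hypothetical stand-ins that mimic the
general shape of the two API styles. They are not the real Hadoop or
hbasene classes, and a real port would also have to deal with the output
format and job configuration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/**
 * Sketch only. The nested types are hypothetical stand-ins, NOT the real
 * org.apache.hadoop.mapred / org.apache.hadoop.mapreduce classes.
 */
public class WriterAdapterSketch {

    /** Stand-in for an old-style ("mapred"-like) writer interface. */
    interface OldStyleWriter<K, V> {
        void write(K key, V value) throws IOException;
        // Assumption: the real old API passes a Reporter here; Object is a placeholder.
        void close(Object reporter) throws IOException;
    }

    /** Stand-in for a new-style ("mapreduce"-like) abstract writer class. */
    abstract static class NewStyleWriter<K, V> {
        abstract void write(K key, V value) throws IOException, InterruptedException;
        // Assumption: the real new API passes a task context here; Object is a placeholder.
        abstract void close(Object context) throws IOException, InterruptedException;
    }

    /** Adapter: exposes an old-style writer through the new-style shape. */
    static final class Adapter<K, V> extends NewStyleWriter<K, V> {
        private final OldStyleWriter<K, V> delegate;

        Adapter(OldStyleWriter<K, V> delegate) {
            this.delegate = delegate;
        }

        @Override
        void write(K key, V value) throws IOException {
            delegate.write(key, value); // forward unchanged
        }

        @Override
        void close(Object context) throws IOException {
            delegate.close(null); // this sketch has no reporter to pass along
        }
    }

    /** Toy old-style writer that records what it was asked to write. */
    static final class RecordingWriter implements OldStyleWriter<String, String> {
        final List<String> written = new ArrayList<>();
        boolean closed = false;

        @Override
        public void write(String key, String value) {
            written.add(key + "=" + value);
        }

        @Override
        public void close(Object reporter) {
            closed = true;
        }
    }

    public static void main(String[] args) throws Exception {
        RecordingWriter legacy = new RecordingWriter();
        NewStyleWriter<String, String> writer = new Adapter<>(legacy);
        writer.write("row1", "doc1");
        writer.close(null);
        System.out.println(legacy.written + " closed=" + legacy.closed);
    }
}
```

The calling code only ever sees the new-style type, so the old writer's
logic stays untouched. Whether this is enough for hbasene's
IndexOutputFormat has not been verified here.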


> Is my only recourse to make a custom build of 0.89 which mixes source
> from the 0.20.6 release of HBase in with the 0.89 release? I would have
> thought IndexOutputFormat was an important feature to move forward in
> the trunk.
>

Why do you have to touch HBase at all, Geoff?  Can you not just make a
mapreduce job from an adjusted IndexOutputFormat, bundling Lucene, and
have it run against the HBase APIs?

Regarding it being an important feature for core HBase: for sure it's a
nice-to-have, but we've been trying to jettison all but core from
HBase and have add-ons live elsewhere.  We found that carrying along
all contribs and additions, with their different rates of development
(and with flux in developer interest in keeping up the add-on), proved
a drag on core development.

St.Ack

RE: IndexOutputFormat?

Posted by Geoff Hendrey <gh...@decarta.com>.
I investigated hbasene. The source download relies on the ".mapred" API, not
".mapreduce". Its Maven pom doesn't build without a lot of hacking and
fixing unresolved dependencies, and even when I was able to build the
source, I am still out of luck because of the mapred vs mapreduce issue.

Is my only recourse to make a custom build of 0.89 which mixes source
from the 0.20.6 release of HBase in with the 0.89 release? I would have
thought IndexOutputFormat was an important feature to move forward in
the trunk.

Thoughts? Suggestions?

-geoff

-----Original Message-----
From: Gary Helmling [mailto:ghelmling@gmail.com] 
Sent: Wednesday, January 19, 2011 11:42 AM
To: user@hbase.apache.org
Subject: Re: IndexOutputFormat?


Re: IndexOutputFormat?

Posted by Gary Helmling <gh...@gmail.com>.
Hi Geoff,

The Lucene index building was moved out into a separate project early in the
0.90 dev cycle; see here:
https://issues.apache.org/jira/browse/HBASE-2212

The current URL seems to be: https://github.com/akkumar/hbasene

I don't know if this is still active, however.

--gh


