Posted to general@hadoop.apache.org by Chandan Tamrakar <ch...@nepasoft.com> on 2009/09/25 10:13:41 UTC

Integrating Lucene in Hadoop

I was doing some R&D on integrating Hadoop and Lucene. I was able to create
Lucene indexes in HDFS using the index contrib provided with Hadoop.

Is this a proper way to create a Lucene index, and how does it differ from
the Katta project?

Please suggest what would be a better approach to creating distributed
Lucene indexes.

Thanks in advance


Re: Integrating Lucene in Hadoop

Posted by Jason Venner <ja...@gmail.com>.
So, SOLR-1301 builds Solr Lucene indexes; it is overkill if you are just
using Lucene. The contrib index build code that ships with Hadoop is
sufficient.

Moving the indexes to the local file system is the simplest strategy. As an
alternative, you could try using FUSE to mount HDFS as a local file system
for access.
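
As a rough illustration of the "move the indexes to the local file system"
step, here is a minimal sketch that wraps the standard HDFS shell command
`hadoop fs -copyToLocal`. It assumes the `hadoop` command is on the PATH; the
function names are made up for this example and are not part of Hadoop.

```python
import subprocess
from typing import List


def build_copy_command(hdfs_index_dir: str, local_dir: str) -> List[str]:
    """Build the argument list for copying an index directory out of HDFS."""
    # `hadoop fs -copyToLocal <src> <localdst>` copies from HDFS to local disk.
    return ["hadoop", "fs", "-copyToLocal", hdfs_index_dir, local_dir]


def copy_index_to_local(hdfs_index_dir: str, local_dir: str) -> None:
    """Run the copy; raises CalledProcessError if the hadoop command fails."""
    # Assumption: the `hadoop` executable is installed and on the PATH.
    subprocess.run(build_copy_command(hdfs_index_dir, local_dir), check=True)
```

Calling `copy_index_to_local("/user/example/index", "/var/search/index")`
(both paths are hypothetical) would leave a copy of the index directory on
local disk, where a search front end could open it.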

On Sun, Oct 4, 2009 at 3:19 AM, Chandan Tamrakar <
chandan.tamrakar@nepasoft.com> wrote:

> Thanks, Jason.
>
> In my case the Lucene indexes need to be accessed through .NET code (simple
> search interfaces), so should I use the SOLR-1301 patch to index the
> documents and then move the indexes from HDFS to the local file system so
> that the .NET interface can access and query the indexes directly?
>
> Would this be a better approach? Alternatively, we might need some sort of
> web service that could read/query the index in HDFS and would be called by
> the .NET front-end UI.
>
> I would very much appreciate any suggestions.
> Thanks



-- 
Pro Hadoop, a book to guide you from beginner to hadoop mastery,
http://www.amazon.com/dp/1430219424?tag=jewlerymall
www.prohadoopbook.com a community for Hadoop Professionals

Re: Integrating Lucene in Hadoop

Posted by Chandan Tamrakar <ch...@nepasoft.com>.
Thanks, Jason.

In my case the Lucene indexes need to be accessed through .NET code (simple
search interfaces), so should I use the SOLR-1301 patch to index the
documents and then move the indexes from HDFS to the local file system so
that the .NET interface can access and query the indexes directly?

Would this be a better approach? Alternatively, we might need some sort of
web service that could read/query the index in HDFS and would be called by
the .NET front-end UI.

I would very much appreciate any suggestions.
Thanks


On Mon, Sep 28, 2009 at 10:15 AM, Jason Venner <ja...@gmail.com> wrote:

> Katta is a tool for managing distributed search, which happens by default
> to use Lucene as its search engine.
> Katta is able to read indexes from HDFS or S3, which Katta then deploys
> onto the local disk of the Katta nodes.
>
> The contrib directory of the Hadoop installation contains a tool for
> building Lucene indexes. Patch SOLR-1395 provides an integration of Solr
> with Katta, and patch SOLR-1301 provides a simple way of building Solr
> indexes in Hadoop (as a parallel to the contrib Lucene index build tool).



-- 
Chandan Tamrakar

Re: Integrating Lucene in Hadoop

Posted by Jason Venner <ja...@gmail.com>.
Katta is a tool for managing distributed search, which happens by default to
use Lucene as its search engine.
Katta is able to read indexes from HDFS or S3, which Katta then deploys onto
the local disk of the Katta nodes.

The contrib directory of the Hadoop installation contains a tool for
building Lucene indexes. Patch SOLR-1395 provides an integration of Solr
with Katta, and patch SOLR-1301 provides a simple way of building Solr
indexes in Hadoop (as a parallel to the contrib Lucene index build tool).



On Fri, Sep 25, 2009 at 1:13 AM, Chandan Tamrakar <
chandan.tamrakar@nepasoft.com> wrote:

> I was doing some R&D on integrating Hadoop and Lucene. I was able to create
> Lucene indexes in HDFS using the index contrib provided with Hadoop.
>
> Is this a proper way to create a Lucene index, and how does it differ from
> the Katta project?
>
> Please suggest what would be a better approach to creating distributed
> Lucene indexes.
>
> Thanks in advance
>
>

