Posted to java-user@lucene.apache.org by Airway Wong <ai...@gmail.com> on 2013/08/24 09:27:55 UTC

Lucene index customization

Hi,

To customize the inverted list for a different format, it seems we have to
override many different classes and functions. We are only interested in a
simple inverted index without position/posting information.

Is it possible to customize an inverted list format that only supports a
simple inverted index (keyword -> list of (doc, an integer) pairs), without
overriding all of those classes and functions?

Thanks.
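
For readers following the thread, the structure being asked about (keyword -> list of (doc, an integer) pairs) can be sketched in plain Java as below. This is only an illustration of the data shape, not Lucene code: the class SimpleInvertedIndex, the nested Posting type, and the add/lookup methods are hypothetical names invented for this sketch, and the "integer" is treated as an opaque per-document value (it could be a term frequency, for example).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not part of Lucene. */
final class SimpleInvertedIndex {

    /** One posting: a document ID plus an arbitrary integer (for example a term frequency). */
    static final class Posting {
        final int docId;
        final int value;

        Posting(int docId, int value) {
            this.docId = docId;
            this.value = value;
        }
    }

    // keyword -> postings, kept in the order in which they were added
    private final Map<String, List<Posting>> postings = new HashMap<>();

    /** Appends a (docId, value) pair to the posting list of the given keyword. */
    void add(String keyword, int docId, int value) {
        postings.computeIfAbsent(keyword, k -> new ArrayList<>()).add(new Posting(docId, value));
    }

    /** Returns a read-only view of the posting list, or an empty list for an unknown keyword. */
    List<Posting> lookup(String keyword) {
        List<Posting> list = postings.get(keyword);
        return list == null ? Collections.<Posting>emptyList() : Collections.unmodifiableList(list);
    }

    public static void main(String[] args) {
        SimpleInvertedIndex index = new SimpleInvertedIndex();
        index.add("lucene", 0, 2);
        index.add("lucene", 3, 1);
        index.add("index", 3, 5);
        for (Posting p : index.lookup("lucene")) {
            System.out.println("doc=" + p.docId + " value=" + p.value);
        }
    }
}
```

An in-memory map like this has none of the on-disk encoding, merging, or concurrency handling that a production index needs, which is the part the question hopes to reuse from Lucene.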

Re: Lucene index customization

Posted by Erick Erickson <er...@gmail.com>.
Have you looked at the flexible indexing functionality? Here are
a couple of places to start:
http://www.opensourceconnections.com/2013/06/05/build-your-own-lucene-codec/
http://www.slideshare.net/LucidImagination/flexible-indexing-in-lucene-40

I'm still not quite sure why you want to do this, but have you looked
at FieldInfo? Its constructor lets you control a lot of things, such as
whether to store term vectors. See:
http://lucene.apache.org/core/4_0_0/core/org/apache/lucene/index/FieldInfo.html

If that doesn't answer your question, perhaps you could explain a bit more
about _why_ you want to do this; it could be an XY problem.

Best
Erick



Re: Lucene index customization

Posted by Airway Wong <ai...@gmail.com>.
Thanks for the suggestion.

We plan to build an inverted list for a production system, so reliability
and performance matter a great deal.

Lucene is a highly sophisticated IR library with a lot of features. It is
usually much easier to trim features down, and Lucene has already started to
support customized inverted lists. That's why I am curious whether it can be
customized to support a simple inverted list.

Even for testing purposes, it is useful to limit the functionality and
add features incrementally. My guess is that Lucene probably already has a
way to support a simple inverted index if needed.

Could someone give more insight into this?

Thanks.


Re: Lucene index customization

Posted by Ivan Krišto <iv...@gmail.com>.
On 08/24/2013 09:27 AM, Airway Wong wrote:
> To customize the inverted list for different format, it seems we have to
> overload many different classes and functions. We are only interested in
> simple inverted index without position/posting information.
>
> Is it possible to customize an inverted list format that only support
> simple inverted index (keyword -> list of (doc, an integer) pairs), without
> overloading all classes and functions?

Hello!

Needing to reduce or heavily change index functionality suggests
that Lucene is a bad choice for you.
I would suggest trying alternatives, especially http://terrier.org/
(a flexible IR system aimed mainly at academic use).


  Regards,
    Ivan Krišto


Re: Lucene index customization

Posted by Robert Muir <rc...@gmail.com>.
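// Lucene 4.x: IndexOptions here is org.apache.lucene.index.FieldInfo.IndexOptions
// Start from a tokenized, non-stored text field type.
FieldType myType = new FieldType(TextField.TYPE_NOT_STORED);
// Index document IDs only: no term frequencies, positions, or offsets.
myType.setIndexOptions(IndexOptions.DOCS_ONLY);
document.add(new Field("title", "some title", myType));
document.add(new Field("body", "some contents", myType));
...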

