Posted to java-user@lucene.apache.org by Cheng <zh...@gmail.com> on 2012/02/19 14:27:28 UTC

How to separate one index into multiple?

Hi,

I have one index that mixes documents from multiple categories. How can I
split it by category? I would like to save each category into a different
folder.

A code example would be great.

Thanks

RE: How to separate one index into multiple?

Posted by Uwe Schindler <uw...@thetaphi.de>.
There is also MultiPassIndexSplitter, and PKIndexSplitter in contrib/misc.
PKIndexSplitter is very easy to use: one of its ctors accepts a Filter. All
documents matched by the filter land in the first index, and all documents
not matched land in the second. This splitter is far more efficient because
it does not first copy the whole index; it just "merges" a subset of all
documents to another directory using a FilterIndexReader. For your case, the
filter could be new QueryWrapperFilter(new TermQuery(new Term("category",
categoryToFilter))).

See http://goo.gl/vUNTd

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
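
The PKIndexSplitter approach described in this reply might look roughly like
the sketch below. It is not taken from the thread: it assumes the Lucene 3.6
constructor PKIndexSplitter(Version, Directory, Directory, Directory, Filter)
(earlier 3.x releases may use a signature without the Version argument, so
check the javadoc for your release). The class name, the paths, and the
category value are hypothetical. It also assumes the "category" field was
indexed as a single untokenized term, so that a TermQuery matches it exactly.

```java
import java.io.File;
import java.io.IOException;

import org.apache.lucene.index.PKIndexSplitter;
import org.apache.lucene.index.Term;
import org.apache.lucene.search.Filter;
import org.apache.lucene.search.QueryWrapperFilter;
import org.apache.lucene.search.TermQuery;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.util.Version;

// Hypothetical example class; requires lucene-core and the contrib/misc jar.
public class SplitByCategory {
    public static void main(String[] args) throws IOException {
        // Hypothetical paths and category value; replace with your own.
        Directory input = FSDirectory.open(new File("/path/to/mixed-index"));
        Directory matched = FSDirectory.open(new File("/path/to/category1-index"));
        Directory rest = FSDirectory.open(new File("/path/to/remaining-index"));
        String categoryToFilter = "category1";

        // Documents matched by this filter go to the first output index,
        // everything else goes to the second.
        Filter filter = new QueryWrapperFilter(
                new TermQuery(new Term("category", categoryToFilter)));

        try {
            // Assumption: Lucene 3.6 signature with a leading Version argument.
            new PKIndexSplitter(Version.LUCENE_36, input, matched, rest, filter).split();
        } finally {
            input.close();
            matched.close();
            rest.close();
        }
    }
}
```

Each run splits one index in two. With more than two categories, one option
is to run the splitter again on the "remaining" index with the next category
until nothing is left to separate.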






Re: How to separate one index into multiple?

Posted by Cheng <zh...@gmail.com>.
great idea!

On Sun, Feb 19, 2012 at 9:43 PM, Li Li <fa...@gmail.com> wrote:

> you can delete by query like -category:category1

Re: How to separate one index into multiple?

Posted by Li Li <fa...@gmail.com>.
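You can also delete by query: in each copy, delete every document outside
the category you want to keep, e.g. *:* -category:category1. (A purely
negative query such as -category:category1 matches no documents on its own
in Lucene, so pair it with a match-all clause.)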

On Sun, Feb 19, 2012 at 9:41 PM, Li Li <fa...@gmail.com> wrote:

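> I think you could do it as follows, taking a split into 3 indexes as an
> example.
> You can copy the index 3 times, then delete from each copy the documents
> that belong to the other two parts.
> for copy 1 (keeps docs 0, 3, 6, ...)
>   for(int i=0;i<reader1.maxDoc();i++){
>       if(i%3!=0) reader1.deleteDocument(i);
>   }
> for copy 2 (keeps docs 1, 4, 7, ...)
>   for(int i=0;i<reader2.maxDoc();i++){
>       if(i%3!=1) reader2.deleteDocument(i);
>   }
> ....
> and then optimize these 3 indexes.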
>

Re: How to separate one index into multiple?

Posted by Li Li <fa...@gmail.com>.
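I think you could do it as follows, taking a split into 3 indexes as an
example.
You can copy the index 3 times, then delete from each copy the documents
that belong to the other two parts.
for copy 1 (keeps docs 0, 3, 6, ...)
  for(int i=0;i<reader1.maxDoc();i++){
      if(i%3!=0) reader1.deleteDocument(i);
  }
for copy 2 (keeps docs 1, 4, 7, ...)
  for(int i=0;i<reader2.maxDoc();i++){
      if(i%3!=1) reader2.deleteDocument(i);
  }
....
and then optimize these 3 indexes.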
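
To make the copy-and-delete idea concrete, here is a small sketch in plain
Java, without Lucene, that only computes which document numbers each copy
would have to delete so that the copies end up disjoint. RoundRobinSplit and
docsToDelete are hypothetical names used for illustration, not Lucene APIs.
Note that this splits by document number, not by category; for a split by
category the delete-by-query suggestion elsewhere in this thread is the
better fit.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Lucene.
final class RoundRobinSplit {

    /**
     * Returns the doc numbers in [0, maxDoc) that copy number `part`
     * (0-based) out of `parts` copies should delete. The copy keeps only
     * the docs whose number is congruent to `part` modulo `parts`.
     */
    static List<Integer> docsToDelete(int maxDoc, int parts, int part) {
        List<Integer> toDelete = new ArrayList<Integer>();
        for (int i = 0; i < maxDoc; i++) {
            if (i % parts != part) {
                toDelete.add(i);
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        int maxDoc = 7; // pretend the index holds docs 0..6
        for (int part = 0; part < 3; part++) {
            System.out.println("copy " + (part + 1) + " deletes "
                    + docsToDelete(maxDoc, 3, part));
        }
    }
}
```

With 7 documents and 3 copies, copy 1 keeps docs 0, 3, 6, copy 2 keeps 1, 4,
and copy 3 keeps 2, 5, so every document survives in exactly one copy.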