Posted to java-user@lucene.apache.org by Dror Matalon <dr...@zapatec.com> on 2003/10/29 07:58:49 UTC

Multiple writers

Hi folks,

We're in the process of adding search to our online RSS aggregator. You
can see it in action at www.fastbuzz.com.

Currently we have more than five million items in the system, and it's
growing at the rate of more than 100,00 a day, so we need to take into
account that the index is constantly growing.

One of the things we want to build into the system is the ability to
rebuild the index on the fly while still inserting the items that are
coming in. 

We've looked at having things go into different directories and then
merge them, but it seems complicated and we'd need to worry about race
conditions and locking issues.

Anyone's done this before? Any suggestions?


Regards,

Dror

-- 
Dror Matalon
Zapatec Inc 
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Multiple writers

Posted by Otis Gospodnetic <ot...@yahoo.com>.
--- Dror Matalon <dr...@zapatec.com> wrote:
> On Wed, Oct 29, 2003 at 07:56:53AM -0500, Scott Ganyo wrote:
> > Offhand, I would say that using 2 directories and merging them is
> > exactly what you want.  It really shouldn't be all that complicated,
> > and Lucene should handle the synchronization for you...
>
> Will it do that if we build the indexes in two different JVMs and let
> one of them handle the merge? I realize that Lucene is thread safe and
> handles the locking, but I thought that's only true if you do all the
> work in a single JVM.

Lucene uses file-based locks, so running applications that use the
same index in multiple JVMs should be the same as running them in a
single JVM.

Otis
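
To illustrate why a file-based lock works across JVMs, here is a minimal
sketch of the general lock-file idea using only the Java standard library.
It is not Lucene's actual locking code; the class name LockFileSketch, the
method names, and the file name write.lock are assumptions made up for this
example. The point is that the lock lives in the filesystem rather than in
JVM memory, so any process that can see the directory sees the same lock.

```java
import java.io.IOException;
import java.nio.file.FileAlreadyExistsException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical illustration of a lock file; not taken from Lucene.
public class LockFileSketch {

    // Tries to take the lock by creating the lock file.
    // Files.createFile checks for existence and creates the file as one
    // atomic operation, so only one caller (in any JVM) can succeed.
    public static boolean tryObtain(Path lockFile) throws IOException {
        try {
            Files.createFile(lockFile);
            return true;
        } catch (FileAlreadyExistsException e) {
            return false; // someone else already holds the lock
        }
    }

    // Releases the lock by deleting the lock file.
    public static void release(Path lockFile) throws IOException {
        Files.deleteIfExists(lockFile);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("index-sketch");
        Path lock = dir.resolve("write.lock"); // assumed name, for illustration

        System.out.println(tryObtain(lock)); // first writer: true
        System.out.println(tryObtain(lock)); // second writer: false
        release(lock);
        System.out.println(tryObtain(lock)); // after release: true

        release(lock);
        Files.delete(dir);
    }
}
```

Two caveats apply to any scheme like this: a process that crashes while
holding the lock leaves a stale lock file behind, and atomic file creation
may not be dependable on some network filesystems, so it is worth testing on
the storage you actually plan to use.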




Re: Multiple writers

Posted by Dror Matalon <dr...@zapatec.com>.
On Wed, Oct 29, 2003 at 07:56:53AM -0500, Scott Ganyo wrote:
> Offhand, I would say that using 2 directories and merging them is
> exactly what you want.  It really shouldn't be all that complicated,
> and Lucene should handle the synchronization for you...

Will it do that if we build the indexes in two different JVMs and let
one of them handle the merge? I realize that Lucene is thread safe and
handles the locking, but I thought that's only true if you do all the
work in a single JVM.

Regards,

Dror





Re: Multiple writers

Posted by Scott Ganyo <sc...@etapestry.com>.
Offhand, I would say that using 2 directories and merging them is
exactly what you want.  It really shouldn't be all that complicated,
and Lucene should handle the synchronization for you...

Scott


-- 
...there is nothing more difficult to execute, nor more dubious of success, nor more dangerous to administer than to introduce a new order to things; for he who introduces it has all those who profit from the old order as his enemies; and he has only lukewarm allies in all those who might profit from the new. - Niccolo Machiavelli


