Posted to java-user@lucene.apache.org by Ruslan Sivak <rs...@istandfor.com> on 2008/02/07 21:14:37 UTC

Distributed Indexes

I'm wondering if this is a problem that Lucene users have already
tackled.  I have four copies of the application using a Lucene index.
They are located on two physical servers, with two copies on each server,
accessing two copies of the Lucene index.  I use Windows FRS (File
Replication Service) to replicate the index between the two servers.

Things work well most of the time, but sometimes, I believe under load, 
the index doesn't get a chance to propagate before another write takes 
place and it gets corrupted. 

What would you recommend I use to keep the index in sync between the 
four copies of the app?

Russ

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Distributed Indexes

Posted by Ruslan Sivak <rs...@istandfor.com>.
No, FRS copies the whole directory.  It's fairly fast, but if there is a 
modification on both servers at the same time, there will be issues. 

Russ

Michael McCandless wrote:
>
> If you're able to tell Windows FRS which specific files to copy, then 
> SnapshotDeletionPolicy (in 2.3) should work for this.
>
> It basically protects a consistent snapshot of your index, ensuring 
> those files will not be deleted, while not blocking further updates to 
> the index.
>
> Mike


Re: Distributed Indexes

Posted by Ruslan Sivak <rs...@istandfor.com>.
Basically, the index is big in terms of document count: there is a large
number of documents, but each individual document is very small.  There
is also a lot of redundancy, which I believe is also why the index size
is fairly small.

Basically I am using the index to store the user's profile information, 
and then using the similarity search to find similar profiles.  The 
update rate can be fairly high, depending on how many users decide to 
create/update their profiles at the same time. 

The problem arises when many people decide to update their profiles, and
the Windows File Replication Service doesn't get a chance to synchronize
the folders between updates.  (It's possible for two people to make
updates at the same time on different servers.)
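
A toy model of the failure described above, in plain Java.  It is not
Lucene or FRS code: the class, the method names, and the file names
"_1.cfs" and "segments_2" are illustrative only, and the per-file
"newest timestamp wins" merge is an assumption about how file-level
replication resolves conflicts.  It shows how two interleaved writes on
different servers can leave a replicated folder holding files from two
different indexes.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model: a replica is a map from file name to the version last written. */
class ReplicaConflictDemo {

    /** One file version: which server wrote it and when. */
    static final class FileVersion {
        final String writer;
        final long timestamp;

        FileVersion(String writer, long timestamp) {
            this.writer = writer;
            this.timestamp = timestamp;
        }
    }

    /** Simulates a local index update: a new segment file plus a new commit file. */
    static void localWrite(Map<String, FileVersion> replica, String writer,
                           long segmentTime, long commitTime) {
        replica.put("_1.cfs", new FileVersion(writer, segmentTime));
        replica.put("segments_2", new FileVersion(writer, commitTime));
    }

    /** Per-file last-writer-wins merge of two replicas (assumed replication rule). */
    static Map<String, FileVersion> replicate(Map<String, FileVersion> a,
                                              Map<String, FileVersion> b) {
        Map<String, FileVersion> merged = new HashMap<>(a);
        for (Map.Entry<String, FileVersion> e : b.entrySet()) {
            FileVersion existing = merged.get(e.getKey());
            if (existing == null || e.getValue().timestamp > existing.timestamp) {
                merged.put(e.getKey(), e.getValue());
            }
        }
        return merged;
    }

    /** True if the commit file and the segment file come from the same writer. */
    static boolean isConsistent(Map<String, FileVersion> replica) {
        return replica.get("segments_2").writer.equals(replica.get("_1.cfs").writer);
    }

    public static void main(String[] args) {
        Map<String, FileVersion> serverA = new HashMap<>();
        Map<String, FileVersion> serverB = new HashMap<>();
        // Both servers update before replication catches up; the writes interleave.
        localWrite(serverA, "A", 100, 103);
        localWrite(serverB, "B", 101, 102);
        Map<String, FileVersion> merged = replicate(serverA, serverB);
        System.out.println("segments_2 from " + merged.get("segments_2").writer
                + ", _1.cfs from " + merged.get("_1.cfs").writer
                + ", consistent=" + isConsistent(merged));
    }
}
```

In this run the merged folder ends up with the commit file from server A
and the segment file from server B, so neither server's update survives
intact.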

I'm thinking of maybe subclassing Directory or FSDirectory and
overriding it to read from and write to a database.  Would this work, or
is there a better solution?

Russ



Grant Ingersoll wrote:
> Solr has a strategy using rsync that makes it relatively easy to copy 
> an index around to other servers.  It uses rsync to just copy the 
> diffs, so you could easily mirror this in your application.
>
> There is no SQL backend for Lucene, but at 4mb you could certainly 
> serialize it as a blob to a SQL db, but I don't see how that would 
> make it faster.
>
> Also, you said it takes a long time to create a 4mb index.  Does this 
> mean you are doing something really, really complex during analysis?  
> I guess what I am missing, and I think others have hinted at, is the 
> big picture isn't quite clear in our minds, because the size of the 
> index seems almost trivially small in Lucene terms, so we would think 
> that a) It would be really fast to create the index (seconds, not 
> minutes) and b) such a small index could be easily held entirely in 
> memory and should easily handle a very, very, high query rate given 
> reasonable hardware, which it sounds like you have.  The other piece 
> that doesn't fit in my mind is it sounds like you have a fairly high 
> update rate since you are getting writes before your 4mb index can be 
> copied on a local network, right?  This implies that you must also 
> have a lot of deletes otherwise your index would be growing 
> significantly.
>
> Thus, more details on what you are doing, how you are creating your 
> index, how your CF app talks to Lucene, etc. would be good.
>
> -Grant


Re: Distributed Indexes

Posted by Grant Ingersoll <gs...@apache.org>.
Solr has a strategy using rsync that makes it relatively easy to copy  
an index around to other servers.  It uses rsync to just copy the  
diffs, so you could easily mirror this in your application.
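
A minimal sketch of "copy only the diffs" in plain Java, for anyone who
cannot run rsync.  This is not Solr's script; `IndexMirror` and its
methods are hypothetical names.  It assumes index files are effectively
write-once, so a file with the same name and size on the target is
treated as already copied.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class IndexMirror {

    /** Copies files that are missing or differ in size; returns the names copied, sorted. */
    static List<String> copyChanged(Path source, Path target) {
        List<String> copied = new ArrayList<>();
        try {
            Files.createDirectories(target);
            try (DirectoryStream<Path> files = Files.newDirectoryStream(source)) {
                for (Path src : files) {
                    if (!Files.isRegularFile(src)) {
                        continue;
                    }
                    String name = src.getFileName().toString();
                    Path dst = target.resolve(name);
                    // Assumption: same name and same size means same content.
                    boolean same = Files.exists(dst) && Files.size(dst) == Files.size(src);
                    if (!same) {
                        Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
                        copied.add(name);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Collections.sort(copied);
        return copied;
    }

    /** Deletes regular files in the target that no longer exist in the source. */
    static void deleteStale(Path source, Path target) {
        List<Path> stale = new ArrayList<>();
        try {
            try (DirectoryStream<Path> files = Files.newDirectoryStream(target)) {
                for (Path dst : files) {
                    Path src = source.resolve(dst.getFileName().toString());
                    if (Files.isRegularFile(dst) && !Files.exists(src)) {
                        stale.add(dst);
                    }
                }
            }
            for (Path dst : stale) {
                Files.delete(dst);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.err.println("usage: IndexMirror <sourceDir> <targetDir>");
            return;
        }
        Path source = Paths.get(args[0]);
        Path target = Paths.get(args[1]);
        System.out.println("copied: " + copyChanged(source, target));
        deleteStale(source, target);
    }
}
```

The sketch does nothing to stop the source index from changing while it
is being copied, so it is only safe against a snapshot that is known to
be stable.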

There is no SQL backend for Lucene.  At 4 MB you could certainly
serialize the index as a blob to a SQL db, but I don't see how that
would make it faster.
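
A rough sketch of the "serialize it as a blob" idea, using only the JDK.
`IndexBlob` is a hypothetical helper, and the index is represented here
as a plain map from file name to file contents rather than a Lucene
Directory.  It packs the files into one byte array and unpacks them
again; writing that array to a database (for example with
`PreparedStatement.setBytes`) is left out because the table and column
would be pure assumptions.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

class IndexBlob {

    /** Packs all index files into a single zip-formatted byte array. */
    static byte[] pack(Map<String, byte[]> files) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            for (Map.Entry<String, byte[]> file : files.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // The zip stream is closed at this point, so the archive is complete.
        return bytes.toByteArray();
    }

    /** Restores the file map from a byte array produced by pack(). */
    static Map<String, byte[]> unpack(byte[] blob) {
        Map<String, byte[]> files = new TreeMap<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(blob))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int n;
                while ((n = zip.read(buffer)) != -1) {
                    content.write(buffer, 0, n);
                }
                files.put(entry.getName(), content.toByteArray());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    public static void main(String[] args) {
        Map<String, byte[]> files = new TreeMap<>();
        files.put("segments_1", new byte[] {1});
        files.put("_0.cfs", new byte[] {2, 3, 4});
        byte[] blob = pack(files);
        System.out.println("blob bytes: " + blob.length
                + ", files restored: " + unpack(blob).keySet());
    }
}
```

Because the whole index travels as one value, a reader sees either the
old blob or the new one, never a mix of files; the cost is rewriting
the full blob on every update.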

Also, you said it takes a long time to create a 4 MB index.  Does this
mean you are doing something really, really complex during analysis?
I guess what I am missing, and I think others have hinted at this, is
the big picture.  The size of the index seems almost trivially small in
Lucene terms, so we would think that a) it would be really fast to
create the index (seconds, not minutes) and b) such a small index could
easily be held entirely in memory and should handle a very, very high
query rate given reasonable hardware, which it sounds like you have.
The other piece that doesn't fit in my mind is that it sounds like you
have a fairly high update rate, since you are getting writes before your
4 MB index can be copied on a local network, right?  This implies that
you must also have a lot of deletes; otherwise your index would be
growing significantly.

Thus, more details on what you are doing, how you are creating your  
index, how your CF app talks to Lucene, etc. would be good.

-Grant


On Feb 10, 2008, at 12:55 PM, Ruslan Sivak wrote:

> So nobody's run into anything like this before?  The need to share  
> the index between many copies of the app possibly running on  
> multiple servers?
>
> Russ

--------------------------
Grant Ingersoll
http://lucene.grantingersoll.com
http://www.lucenebootcamp.com

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ







Re: Distributed Indexes

Posted by Ruslan Sivak <rs...@istandfor.com>.
So nobody's run into anything like this before?  The need to share the 
index between many copies of the app possibly running on multiple servers?

Russ



Re: Distributed Indexes

Posted by Ruslan Sivak <rs...@istandfor.com>.
Cedric Ho wrote:
>
> Can't you just store the index on a shared network file system, so all
> the copies of your app can access it?
> And load the index into a RAMDirectory when you need to search it?
>
>
> Cedric
>
>   
One of the reasons that there are two servers is for redundancy.  There
is no good shared storage that we can use to store the indexes.
Although we could theoretically store the files on the db server's file
system, it's a dependency I'd rather avoid.

Russ



Re: Distributed Indexes

Posted by Cedric Ho <ce...@gmail.com>.
On Feb 9, 2008 12:07 AM, Ruslan Sivak <rs...@istandfor.com> wrote:
> The app does other things than search the index.  I'm basically using
> ColdFusion for the website and have four instances running on two
> servers for load balancing.  Each app does the searches, and the search
> times are small, the index is small, but it takes a long time to fully
> create the index (several minutes), and I would like the index to always
> be up to date (which is why I replicate the changes).
>
> I basically cache the index for several minutes in a RAMDirectory, which
> works quite well for performance.  If I could store the index in a SQL
> table or something, I would have a single place where the index lives,
> with atomic updates.
>
> Is there a SQL backend for the index, or should I just take the
> RAMDirectory, serialize it and store it as a BLOB?

Can't you just store the index on a shared network file system, so all
the copies of your app can access it?
And load the index into a RAMDirectory when you need to search it?


Cedric



Re: Distributed Indexes

Posted by Ruslan Sivak <rs...@istandfor.com>.
The app does other things than search the index.  I'm basically using
ColdFusion for the website and have four instances running on two
servers for load balancing.  Each app does the searches; the search
times are small and the index is small, but it takes a long time
(several minutes) to fully create the index, and I would like the index
to always be up to date (which is why I replicate the changes).

I basically cache the index for several minutes in a RAMDirectory, which
works quite well for performance.  If I could store the index in a SQL
table or something, I would have a single place where the index lives,
with atomic updates.
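
The "cache for several minutes" pattern described above could be
sketched generically as follows.  `TimedCache` is a hypothetical
helper, not part of Lucene or ColdFusion; the loader stands in for
whatever rebuilds the in-memory copy of the index, and the clock is
injectable only so that the expiry logic can be exercised without
waiting.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Holds a value and reloads it once it is older than a maximum age. */
class TimedCache<T> {
    private final Supplier<T> loader;
    private final long maxAgeMillis;
    private final LongSupplier clock;
    private T value;
    private long loadedAt;
    private boolean loaded = false;

    TimedCache(Supplier<T> loader, long maxAgeMillis, LongSupplier clock) {
        this.loader = loader;
        this.maxAgeMillis = maxAgeMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading it first if it is missing or expired. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= maxAgeMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }

    public static void main(String[] args) {
        // Stand-in loader; a real one would copy the on-disk index into memory.
        TimedCache<String> cache = new TimedCache<String>(
                () -> "index loaded at " + System.currentTimeMillis(),
                5 * 60 * 1000L,
                System::currentTimeMillis);
        System.out.println(cache.get());
        System.out.println(cache.get()); // same value, served from the cache
    }
}
```

With this shape, each app instance can serve searches from memory and
only picks up changes when its copy expires, so the maximum age is also
the maximum staleness.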

Is there a SQL backend for the index, or should I just take the
RAMDirectory, serialize it and store it as a BLOB?

Russ

Erick Erickson wrote:
> With an index that small, I wonder why you bother with so many copies?
> What kind of load are you hitting it with and how complex are the queries?
>
> Because unless you have a *very* high query rate, I'd look at why my
> queries were taking so long before complexifying things this way.
>
> Best
> Erick


Re: Distributed Indexes

Posted by Erick Erickson <er...@gmail.com>.
With an index that small, I wonder why you bother with so many copies?
What kind of load are you hitting it with and how complex are the queries?

Because unless you have a *very* high query rate, I'd look at why my
queries were taking so long before complexifying things this way.

Best
Erick

On Feb 7, 2008 4:52 PM, Ruslan Sivak <rs...@istandfor.com> wrote:

> My index is only 4 MB.  Is there a SQL backend for Lucene?
>
> Russ

Re: Distributed Indexes

Posted by Ruslan Sivak <rs...@istandfor.com>.
My index is only 4 MB.  Is there a SQL backend for Lucene?

Russ

Michael McCandless wrote:
>
> If you're able to tell Windows FRS which specific files to copy, then 
> SnapshotDeletionPolicy (in 2.3) should work for this.
>
> It basically protects a consistent snapshot of your index, ensuring 
> those files will not be deleted, while not blocking further updates to 
> the index.
>
> Mike


Re: Distributed Indexes

Posted by Michael McCandless <lu...@mikemccandless.com>.
If you're able to tell Windows FRS which specific files to copy, then  
SnapshotDeletionPolicy (in 2.3) should work for this.

It basically protects a consistent snapshot of your index, ensuring  
those files will not be deleted, while not blocking further updates  
to the index.
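
To illustrate the idea only, here is a toy model in plain Java.  It is
deliberately not the Lucene API: `SnapshotModel`, its methods, and the
file names are hypothetical.  It shows the files of a snapshot staying
on disk while newer commits continue, and being cleaned up only after
the snapshot is released.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SnapshotModel {
    private final Set<String> filesOnDisk = new TreeSet<>();
    private Set<String> currentCommit = new HashSet<>();
    private Set<String> snapshot = null; // files pinned for copying, or null

    /** Records a new commit; files that are neither current nor pinned are deleted. */
    synchronized void commit(Set<String> files) {
        filesOnDisk.addAll(files);
        currentCommit = new HashSet<>(files);
        deleteUnreferenced();
    }

    /** Pins the files of the latest commit and returns their names, sorted. */
    synchronized List<String> snapshot() {
        if (snapshot != null) {
            throw new IllegalStateException("a snapshot is already held");
        }
        snapshot = new HashSet<>(currentCommit);
        return new ArrayList<>(new TreeSet<>(snapshot));
    }

    /** Unpins the snapshot so its files can be deleted if nothing else needs them. */
    synchronized void release() {
        snapshot = null;
        deleteUnreferenced();
    }

    synchronized Set<String> filesOnDisk() {
        return new TreeSet<>(filesOnDisk);
    }

    private void deleteUnreferenced() {
        filesOnDisk.removeIf(f -> !currentCommit.contains(f)
                && (snapshot == null || !snapshot.contains(f)));
    }

    public static void main(String[] args) {
        SnapshotModel index = new SnapshotModel();
        index.commit(new HashSet<>(Arrays.asList("segments_1", "_0.cfs")));
        List<String> toCopy = index.snapshot();
        // Updates continue while the pinned files are being copied elsewhere.
        index.commit(new HashSet<>(Arrays.asList("segments_2", "_1.cfs")));
        System.out.println("copying " + toCopy + " while disk has " + index.filesOnDisk());
        index.release();
        System.out.println("after release: " + index.filesOnDisk());
    }
}
```

The list returned by `snapshot()` is the set of files a replication
tool would need to copy, which is why being able to tell the tool which
specific files to copy matters.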

Mike

Ruslan Sivak wrote:

> I'm wondering if this is a problem that lucene users have already  
> tackled.  I have four copies of the application using a lucene  
> index.  They are located on two physical servers with two copies on  
> each server accessing two copies of the lucene index.  I use  
> Windows FRS (File Replication Service) to replicate the index  
> between the two servers.
> Things work well most of the time, but sometimes, I believe under  
> load, the index doesn't get a chance to propagate before another  
> write takes place and it gets corrupted.
> What would you recommend I use to keep the index in sync between  
> the four copies of the app?
>
> Russ