Posted to java-user@lucene.apache.org by Zach Bailey <za...@hannonhill.com> on 2007/08/02 16:50:05 UTC

Re: Clustered Indexing on common network filesystem

Hi,

It's been a couple of days now and I haven't heard anything on this 
topic, while there has been substantial list traffic otherwise.

Am I asking in the wrong place? Was I unclear?

I know there are people out there who have used or are using Lucene in a 
clustered environment. I am just looking for any sort of feedback 
(general or specific) about clustering Lucene, as well as filesystem 
compatibility (Windows shares, NFS, etc.).

Thanks again,
-Zach

Zach Bailey wrote:
> Hello all,
> 
> First a little background - we are developing a clustered application 
> that will in part leverage Lucene to provide index and search 
> capabilities. We have already spent time investigating various index 
> storage implementations (database vs. filesystem) and we've decided for 
> performance reasons to go with a filesystem index storage scheme.
> 
> That said, I have read back through the archives a bit and noticed that 
> the support for index storage on NFS is still experimental (i.e. the 
> latest bugfixes have not made it out to an official, stable release). I 
> realize most of the issues related to using a shared file system revolve 
> around locking, and I haven't seen much about the maturity of locking 
> for other network filesystems.
> 
> I was wondering if anyone has tried any other networked filesystems or 
> had any recommendations. We have clients who would be doing this on both 
> Windows and Unix/Linux so any insight there would be appreciated as well 
> - it can be assumed that across any cluster the operating system in use 
> would be homogeneous (i.e. all nodes are on Windows and would use 
> Windows shares, or all nodes are on Linux and would use xyz filesystem).
> 
> Thanks in advance,
> -Zach Bailey
> 



Re: Clustered Indexing on common network filesystem

Posted by Michael McCandless <lu...@mikemccandless.com>.
"Zach Bailey" <za...@hannonhill.com> wrote:

> Unfortunately, I am not sure the leader of the project would feel good 
> about running code from trunk without an explicit endorsement from a 
> majority of the developers or contributors for that particular code (do 
> those people keep up with this list, anyway?). Is there any word on the 
> possible timeframe in which the code required to work with NFS might be 
> released?

This person does keep up with the list :)

On the timeframe ... there are tentative discussions now on the dev
list about releasing 2.3 in a few months' time, but by no means is
this a hard schedule.  I'll make sure LUCENE-948 is included in 2.3.

Mike



Re: Clustered Indexing on common network filesystem

Posted by Michael McCandless <lu...@mikemccandless.com>.
I have been meaning to write up a Wiki page on this general topic but
have not quite made time yet ...

Sharing an index via a shared filesystem will work; however, there are
some caveats:

  * This is somewhat uncharted territory: only fairly recent fixes to
    Lucene have enabled the things below to work, and it's not a
    heavily tested area.  Please share your experience so we all can
    learn...

  * If the filesystem does not protect against deletion of open files
    (notably NFS does not, however SMB/CIFS does) then you will need
    to create a custom DeletionPolicy based on your app logic so
    writer & readers "agree" on when it's safe to delete prior commit
    points (a rough sketch follows after this list).

    This can be something simple like "readers always refresh at least
    once per hour, so any commit point older than 1 hour may be safely
    deleted".

  * Locking: if your app can ensure only one writer is active at a
    time, you can disable locking in Lucene entirely.  Otherwise, it's
    best to use NativeFSLockFactory, if you can (see the
    directory-opening sketch further below).

  * If you are using a filesystem that does not have coherent caching
    of directory listings (NFS clients often do not), and different
    nodes can "become" the writer (vs. a single dedicated writer node),
    then there is one known open issue that you'll hit once you make
    your own DeletionPolicy, which I still have to port to trunk:

      http://issues.apache.org/jira/browse/LUCENE-948
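
Here is a minimal sketch of the kind of age-based DeletionPolicy
described above, written against the Lucene 2.2/2.3-era API.  The class
name and the use of Directory.fileModified() to age commit points are
illustrative assumptions rather than an official recipe, so check the
exact signatures against the release you run:

  import java.io.IOException;
  import java.util.List;

  import org.apache.lucene.index.IndexCommitPoint;
  import org.apache.lucene.index.IndexDeletionPolicy;
  import org.apache.lucene.store.Directory;

  /**
   * Keeps every commit point younger than a configurable age so that
   * readers on other nodes have time to refresh before the files of an
   * old commit are deleted.  The most recent commit is always kept.
   */
  public class ExpirationTimeDeletionPolicy implements IndexDeletionPolicy {

    private final Directory dir;
    private final long expirationMillis;

    public ExpirationTimeDeletionPolicy(Directory dir, long expirationMillis) {
      this.dir = dir;
      this.expirationMillis = expirationMillis;
    }

    public void onInit(List commits) throws IOException {
      onCommit(commits);
    }

    public void onCommit(List commits) throws IOException {
      long cutoff = System.currentTimeMillis() - expirationMillis;
      // Never delete the most recent commit; delete older ones once they expire.
      for (int i = 0; i < commits.size() - 1; i++) {
        IndexCommitPoint commit = (IndexCommitPoint) commits.get(i);
        if (dir.fileModified(commit.getSegmentsFileName()) < cutoff) {
          commit.delete();
        }
      }
    }
  }

The policy is handed to the IndexWriter constructor on whichever node
writes (the exact constructor overloads vary by release), and the
expiration must be comfortably longer than the longest interval between
reader refreshes.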

But as Mark said, performance is likely to be quite poor, so you may
want to take an approach like Solr's (or just use Solr), whereby a
single writer makes changes to the index.  These changes are then
efficiently propagated to multiple hosts (hard link & rsync is one way,
but not the only way), and each host searches its private copy via its
local filesystem.
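
To tie the locking caveat above to this single-writer setup, here is a
small sketch of how the shared directory might be opened.  The helper
class and the single-writer flag are hypothetical, and
FSDirectory.getDirectory(), NoLockFactory and NativeFSLockFactory are
used as they exist in the 2.2-era API, so verify them against your
release:

  import java.io.File;
  import java.io.IOException;

  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.store.LockFactory;
  import org.apache.lucene.store.NativeFSLockFactory;
  import org.apache.lucene.store.NoLockFactory;

  /** Hypothetical helper for opening the shared index directory. */
  public class SharedIndexDirectory {

    /**
     * If the application itself guarantees a single writer (e.g. one
     * dedicated writer node), Lucene's locking can be disabled entirely;
     * otherwise use native OS locks, which behave better than the default
     * lock files on many network filesystems.
     */
    public static FSDirectory open(File indexDir, boolean singleWriterGuaranteed)
        throws IOException {
      LockFactory lockFactory;
      if (singleWriterGuaranteed) {
        lockFactory = NoLockFactory.getNoLockFactory();
      } else {
        // Native locks are java.nio FileLocks kept alongside the index files.
        lockFactory = new NativeFSLockFactory(indexDir);
      }
      return FSDirectory.getDirectory(indexDir, lockFactory);
    }
  }

Reader-only nodes can open the directory the same way; only the writer
acquires the write lock.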

Mike

"Zach Bailey" <za...@hannonhill.com> wrote:
> Mark,
> 
> Thanks so much for your response.
> 
> Unfortunately, I am not sure the leader of the project would feel good 
> about running code from trunk without an explicit endorsement from a 
> majority of the developers or contributors for that particular code (do 
> those people keep up with this list, anyway?). Is there any word on the 
> possible timeframe in which the code required to work with NFS might be 
> released?
> 
> Thanks for your other insight about hardlinks and rsync. I will look 
> into that; unfortunately it does not cover our userbase who may be 
> clustering in a Windows Server environment. I still have not heard/seen 
> any evidence (anecdotal or otherwise) about how well lucene might work 
> sharing indexes over a mounted Windows share.
> 
> -Zach
> 
> Mark Miller wrote:
> > Some quick info:
> > 
> > NFS should work, but I think you'll want to be working off the trunk. 
> > Also, sharing an index over NFS is reported to be slow. The standard 
> > approach so far, if you are not partitioning the index, is to use a 
> > Unix/Linux filesystem and hard links + rsync to efficiently share index 
> > changes across nodes (hard links for an instant copy, rsync to transfer 
> > only the changed index files; search the mailing list). If you look at 
> > Solr you can see scripts that give an example of this; I don't think 
> > the scripts rely on Solr. This kind of setup should be quick and simple 
> > to implement. Same with NFS. An RMI solution that allowed for index 
> > partitioning would probably take the longest to do.
> > 
> > -Mark
> > 
> > 
> > 
> > Zach Bailey wrote:
> >> Thanks for your response --
> >>
> >> Based on my understanding, Hadoop and Nutch are closely related 
> >> (Hadoop was originally factored out of Nutch), and both are primarily 
> >> intended to be standalone applications.
> >>
> >> We are not looking for a standalone application, rather we must use a 
> >> framework to implement search inside our current content management 
> >> application. Currently the application search functionality is 
> >> designed and built around Lucene, so migrating frameworks at this 
> >> point is not feasible.
> >>
> >> We are currently re-working our back-end to support clustering (in 
> >> tomcat) and we are looking for information on the migration of Lucene 
> >> from a single node filesystem index (which is what we use now and hope 
> >> to continue to use for clients with a single-node deployment) to a 
> >> shared filesystem index on a mounted network share.
> >>
> >> We prefer to use this strategy because it means we do not have to have 
> >> two disparate methods of managing indexes for clients who run in a 
> >> single-node, non-clustered environment versus clients who run in a 
> >> multiple-node, clustered environment.
> >>
> >> So, hopefully here are some easy questions someone could shed some 
> >> light on:
> >>
> >> Is this not a recommended method of managing indexes across multiple 
> >> nodes?
> >>
> >> At this point would people recommend storing an individual index on 
> >> each node and propagating index updates via a JMS framework rather 
> >> than attempting to handle it transparently with a single shared index?
> >>
> >> Is the Lucene index code so intimately tied to filesystem semantics 
> >> that using a shared/networked file system is infeasible at this point 
> >> in time?
> >>
> >> What would be the quickest time-to-implementation of these strategies 
> >> (JMS vs. shared FS)? The most robust/least error-prone?
> >>
> >> I really appreciate any insight or response anyone can provide, even 
> >> if it is a short answer to any of the related topics, "i.e. we 
> >> implemented clustered search using per-node indexing with JMS update 
> >> propagation and it works great", or even something as simple as "don't 
> >> use a shared filesystem at this point".
> >>
> >> Cheers,
> >> -Zach
> >>
> >> testn wrote:
> >>> Why don't you check out Hadoop and Nutch? It should provide what you are
> >>> looking for.
> >>


Re: Clustered Indexing on common network filesystem

Posted by Zach Bailey <za...@hannonhill.com>.
Mark,

Thanks so much for your response.

Unfortunately, I am not sure the leader of the project would feel good 
about running code from trunk without an explicit endorsement from a 
majority of the developers or contributors for that particular code (do 
those people keep up with this list, anyway?). Is there any word on the 
possible timeframe in which the code required to work with NFS might be 
released?

Thanks for your other insight about hard links and rsync. I will look 
into that; unfortunately it does not cover our userbase who may be 
clustering in a Windows Server environment. I still have not heard/seen 
any evidence (anecdotal or otherwise) about how well Lucene might work 
sharing indexes over a mounted Windows share.

-Zach

Mark Miller wrote:
> Some quick info:
> 
> NFS should work, but I think you'll want to be working off the trunk. 
> Also, sharing an index over NFS is reported to be slow. The standard 
> approach so far, if you are not partitioning the index, is to use a 
> Unix/Linux filesystem and hard links + rsync to efficiently share index 
> changes across nodes (hard links for an instant copy, rsync to transfer 
> only the changed index files; search the mailing list). If you look at 
> Solr you can see scripts that give an example of this; I don't think 
> the scripts rely on Solr. This kind of setup should be quick and simple 
> to implement. Same with NFS. An RMI solution that allowed for index 
> partitioning would probably take the longest to do.
> 
> -Mark
> 
> 
> 
> Zach Bailey wrote:
>> Thanks for your response --
>>
>> Based on my understanding, Hadoop and Nutch are closely related 
>> (Hadoop was originally factored out of Nutch), and both are primarily 
>> intended to be standalone applications.
>>
>> We are not looking for a standalone application, rather we must use a 
>> framework to implement search inside our current content management 
>> application. Currently the application search functionality is 
>> designed and built around Lucene, so migrating frameworks at this 
>> point is not feasible.
>>
>> We are currently re-working our back-end to support clustering (in 
>> tomcat) and we are looking for information on the migration of Lucene 
>> from a single node filesystem index (which is what we use now and hope 
>> to continue to use for clients with a single-node deployment) to a 
>> shared filesystem index on a mounted network share.
>>
>> We prefer to use this strategy because it means we do not have to have 
>> two disparate methods of managing indexes for clients who run in a 
>> single-node, non-clustered environment versus clients who run in a 
>> multiple-node, clustered environment.
>>
>> So, hopefully here are some easy questions someone could shed some 
>> light on:
>>
>> Is this not a recommended method of managing indexes across multiple 
>> nodes?
>>
>> At this point would people recommend storing an individual index on 
>> each node and propagating index updates via a JMS framework rather 
>> than attempting to handle it transparently with a single shared index?
>>
>> Is the Lucene index code so intimately tied to filesystem semantics 
>> that using a shared/networked file system is infeasible at this point 
>> in time?
>>
>> What would be the quickest time-to-implementation of these strategies 
>> (JMS vs. shared FS)? The most robust/least error-prone?
>>
>> I really appreciate any insight or response anyone can provide, even 
>> if it is a short answer to any of the related topics, "i.e. we 
>> implemented clustered search using per-node indexing with JMS update 
>> propagation and it works great", or even something as simple as "don't 
>> use a shared filesystem at this point".
>>
>> Cheers,
>> -Zach
>>
>> testn wrote:
>>> Why don't you check out Hadoop and Nutch? It should provide what you are
>>> looking for.
>>


Re: Clustered Indexing on common network filesystem

Posted by Mark Miller <ma...@gmail.com>.
Some quick info:

NFS should work, but I think you'll want to be working off the trunk. 
Also, sharing an index over NFS is reported to be slow. The standard 
approach so far, if you are not partitioning the index, is to use a 
Unix/Linux filesystem and hard links + rsync to efficiently share index 
changes across nodes (hard links for an instant copy, rsync to transfer 
only the changed index files; search the mailing list). If you look at 
Solr you can see scripts that give an example of this; I don't think 
the scripts rely on Solr. This kind of setup should be quick and simple 
to implement. Same with NFS. An RMI solution that allowed for index 
partitioning would probably take the longest to do.
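
As a purely illustrative aside, the hard-link snapshot step (which the
Solr scripts do from the shell with hard-link copies followed by rsync)
could look roughly like this in Java; the class name and directory
layout are hypothetical, and a snapshot should only be taken once the
writer has committed:

  import java.io.IOException;
  import java.nio.file.DirectoryStream;
  import java.nio.file.Files;
  import java.nio.file.Path;
  import java.nio.file.Paths;

  /**
   * Snapshots a Lucene index directory by hard-linking every file into a
   * timestamped snapshot directory, which rsync can then ship to the
   * search nodes.  Hard links are created instantly and share storage
   * with the live index, but they cannot cross filesystems.
   */
  public class IndexSnapshotter {

    public static Path snapshot(Path indexDir, Path snapshotRoot) throws IOException {
      Path snapshotDir = snapshotRoot.resolve("snapshot." + System.currentTimeMillis());
      Files.createDirectories(snapshotDir);
      try (DirectoryStream<Path> files = Files.newDirectoryStream(indexDir)) {
        for (Path file : files) {
          if (Files.isRegularFile(file)) {
            Files.createLink(snapshotDir.resolve(file.getFileName()), file);
          }
        }
      }
      return snapshotDir;
    }

    public static void main(String[] args) throws IOException {
      // e.g. java IndexSnapshotter /var/lucene/index /var/lucene/snapshots
      Path snapshot = snapshot(Paths.get(args[0]), Paths.get(args[1]));
      System.out.println("Snapshot at " + snapshot + "; rsync it to the search nodes.");
    }
  }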

-Mark



Zach Bailey wrote:
> Thanks for your response --
>
> Based on my understanding, Hadoop and Nutch are closely related 
> (Hadoop was originally factored out of Nutch), and both are primarily 
> intended to be standalone applications.
>
> We are not looking for a standalone application, rather we must use a 
> framework to implement search inside our current content management 
> application. Currently the application search functionality is 
> designed and built around Lucene, so migrating frameworks at this 
> point is not feasible.
>
> We are currently re-working our back-end to support clustering (in 
> tomcat) and we are looking for information on the migration of Lucene 
> from a single node filesystem index (which is what we use now and hope 
> to continue to use for clients with a single-node deployment) to a 
> shared filesystem index on a mounted network share.
>
> We prefer to use this strategy because it means we do not have to have 
> two disparate methods of managing indexes for clients who run in a 
> single-node, non-clustered environment versus clients who run in a 
> multiple-node, clustered environment.
>
> So, hopefully here are some easy questions someone could shed some 
> light on:
>
> Is this not a recommended method of managing indexes across multiple 
> nodes?
>
> At this point would people recommend storing an individual index on 
> each node and propagating index updates via a JMS framework rather 
> than attempting to handle it transparently with a single shared index?
>
> Is the Lucene index code so intimately tied to filesystem semantics 
> that using a shared/networked file system is infeasible at this point 
> in time?
>
> What would be the quickest time-to-implementation of these strategies 
> (JMS vs. shared FS)? The most robust/least error-prone?
>
> I really appreciate any insight or response anyone can provide, even 
> if it is a short answer to any of the related topics, "i.e. we 
> implemented clustered search using per-node indexing with JMS update 
> propagation and it works great", or even something as simple as "don't 
> use a shared filesystem at this point".
>
> Cheers,
> -Zach
>
> testn wrote:
>> Why don't you check out Hadoop and Nutch? It should provide what you are
>> looking for.
>


Re: Clustered Indexing on common network filesystem

Posted by Zach Bailey <za...@hannonhill.com>.
Rajesh,

I forgot to mention this, but we did investigate this option as well and 
even prototyped it for an internal project. It ended up being too slow 
for us.

It added a lot of overhead even to small updates, IIRC, mainly because 
the index was essentially stored as a filesystem inside the database. As 
you can probably imagine, using a database as a filesystem does not 
perform very well.

Rajesh parab wrote:
> One more alternative, though I am not sure if anyone
> is using it.
> 
> Compass has added a plug-in to allow storing
> Lucene index files inside the database. This should
> work in a clustered environment, as all nodes will
> share the same database instance.
> 
> I am not sure what impact it will have on performance.
> 
> Is anyone using DB for index storage? Any drawbacks of
> this approach?
> 
> Regards,
> Rajesh
> 
> --- Zach Bailey <za...@hannonhill.com> wrote:
> 
>> Thanks for your response --
>>
>> Based on my understanding, Hadoop and Nutch are
>> closely related (Hadoop was originally factored
>> out of Nutch), and both are primarily intended
>> to be standalone applications.
>>
>> We are not looking for a standalone application,
>> rather we must use a 
>> framework to implement search inside our current
>> content management 
>> application. Currently the application search
>> functionality is designed 
>> and built around Lucene, so migrating frameworks at
>> this point is not 
>> feasible.
>>
>> We are currently re-working our back-end to support
>> clustering (in 
>> tomcat) and we are looking for information on the
>> migration of Lucene 
>> from a single node filesystem index (which is what
>> we use now and hope 
>> to continue to use for clients with a single-node
>> deployment) to a 
>> shared filesystem index on a mounted network share.
>>
>> We prefer to use this strategy because it means we
>> do not have to have 
>> two disparate methods of managing indexes for
>> clients who run in a 
>> single-node, non-clustered environment versus
>> clients who run in a 
>> multiple-node, clustered environment.
>>
>> So, hopefully here are some easy questions someone
>> could shed some light on:
>>
>> Is this not a recommended method of managing indexes
>> across multiple nodes?
>>
>> At this point would people recommend storing an
>> individual index on each 
>> node and propagating index updates via a JMS
>> framework rather than 
>> attempting to handle it transparently with a single
>> shared index?
>>
>> Is the Lucene index code so intimately tied to
>> filesystem semantics that 
>> using a shared/networked file system is infeasible
>> at this point in time?
>>
>> What would be the quickest time-to-implementation of
>> these strategies 
>> (JMS vs. shared FS)? The most robust/least
>> error-prone?
>>
>> I really appreciate any insight or response anyone
>> can provide, even if 
>> it is a short answer to any of the related topics,
>> "i.e. we implemented 
>> clustered search using per-node indexing with JMS
>> update propagation and 
>> it works great", or even something as simple as
>> "don't use a shared 
>> filesystem at this point".
>>
>> Cheers,
>> -Zach
>>
>> testn wrote:
>>> Why don't you check out Hadoop and Nutch? It
>> should provide what you are
>>> looking for.
>>


Re: Clustered Indexing on common network filesystem

Posted by Rajesh parab <ra...@yahoo.com>.
One more alternative, though I am not sure if anyone
is using it.

Compass has added a plug-in to allow storing
Lucene index files inside the database. This should
work in a clustered environment, as all nodes will
share the same database instance.

I am not sure what impact it will have on performance.

Is anyone using DB for index storage? Any drawbacks of
this approach?

Regards,
Rajesh

--- Zach Bailey <za...@hannonhill.com> wrote:

> Thanks for your response --
> 
> Based on my understanding, Hadoop and Nutch are
> closely related (Hadoop was originally factored
> out of Nutch), and both are primarily intended
> to be standalone applications.
> 
> We are not looking for a standalone application,
> rather we must use a 
> framework to implement search inside our current
> content management 
> application. Currently the application search
> functionality is designed 
> and built around Lucene, so migrating frameworks at
> this point is not 
> feasible.
> 
> We are currently re-working our back-end to support
> clustering (in 
> tomcat) and we are looking for information on the
> migration of Lucene 
> from a single node filesystem index (which is what
> we use now and hope 
> to continue to use for clients with a single-node
> deployment) to a 
> shared filesystem index on a mounted network share.
> 
> We prefer to use this strategy because it means we
> do not have to have 
> two disparate methods of managing indexes for
> clients who run in a 
> single-node, non-clustered environment versus
> clients who run in a 
> multiple-node, clustered environment.
> 
> So, hopefully here are some easy questions someone
> could shed some light on:
> 
> Is this not a recommended method of managing indexes
> across multiple nodes?
> 
> At this point would people recommend storing an
> individual index on each 
> node and propagating index updates via a JMS
> framework rather than 
> attempting to handle it transparently with a single
> shared index?
> 
> Is the Lucene index code so intimately tied to
> filesystem semantics that 
> using a shared/networked file system is infeasible
> at this point in time?
> 
> What would be the quickest time-to-implementation of
> these strategies 
> (JMS vs. shared FS)? The most robust/least
> error-prone?
> 
> I really appreciate any insight or response anyone
> can provide, even if 
> it is a short answer to any of the related topics,
> "i.e. we implemented 
> clustered search using per-node indexing with JMS
> update propagation and 
> it works great", or even something as simple as
> "don't use a shared 
> filesystem at this point".
> 
> Cheers,
> -Zach
> 
> testn wrote:
> > Why don't you check out Hadoop and Nutch? It
> should provide what you are
> > looking for.
> 
>


Re: Clustered Indexing on common network filesystem

Posted by Zach Bailey <za...@hannonhill.com>.
Thanks for your response --

Based on my understanding, Hadoop and Nutch are closely related (Hadoop 
was originally factored out of Nutch), and both are primarily intended 
to be standalone applications.

We are not looking for a standalone application; rather, we must use a 
framework to implement search inside our current content management 
application. The application's search functionality is currently 
designed and built around Lucene, so migrating frameworks at this point 
is not feasible.

We are currently re-working our back-end to support clustering (in 
Tomcat) and we are looking for information on migrating Lucene 
from a single-node filesystem index (which is what we use now and hope 
to continue to use for clients with a single-node deployment) to a 
shared filesystem index on a mounted network share.

We prefer to use this strategy because it means we do not have to have 
two disparate methods of managing indexes for clients who run in a 
single-node, non-clustered environment versus clients who run in a 
multiple-node, clustered environment.

So, here are some hopefully easy questions someone could shed some light on:

Is this not a recommended method of managing indexes across multiple nodes?

At this point would people recommend storing an individual index on each 
node and propagating index updates via a JMS framework rather than 
attempting to handle it transparently with a single shared index?

Is the Lucene index code so intimately tied to filesystem semantics that 
using a shared/networked file system is infeasible at this point in time?

What would be the quickest time-to-implementation of these strategies 
(JMS vs. shared FS)? The most robust/least error-prone?

I really appreciate any insight or response anyone can provide, even if 
it is a short answer to any of the related topics, e.g. "we implemented 
clustered search using per-node indexing with JMS update propagation and 
it works great", or even something as simple as "don't use a shared 
filesystem at this point".

Cheers,
-Zach

testn wrote:
> Why don't you check out Hadoop and Nutch? It should provide what you are
> looking for.



Re: Clustered Indexing on common network filesystem

Posted by testn <te...@doramail.com>.
Why don't you check out Hadoop and Nutch? It should provide what you are
looking for.


Zach Bailey wrote:
> 
> Hi,
> 
> It's been a couple of days now and I haven't heard anything on this 
> topic, while there has been substantial list traffic otherwise.
> 
> Am I asking in the wrong place? Was I unclear?
> 
> I know there are people out there that have used/are using Lucene in a 
> clustered environment. I am just looking for any sort of feedback 
> (general or specific) about clustering lucene as well as filesystem 
> compatibility (windows shares, NFS, etc.).
> 
> Thanks again,
> -Zach
> 
> Zach Bailey wrote:
>> Hello all,
>> 
>> First a little background - we are developing a clustered application 
>> that will in part leverage Lucene to provide index and search 
>> capabilities. We have already spent time investigating various index 
>> storage implementations (database vs. filesystem) and we've decided for 
>> performance reasons to go with a filesystem index storage scheme.
>> 
>> That said, I have read back through the archives a bit and noticed that 
>> the support for index storage on NFS is still experimental (e.g. the 
>> latest bugfixes have not made it out to an official, stable release). I 
>> realize most of the issues related to using a shared file system revolve 
>> around locking, and I haven't seen much about the maturity of locking 
>> for other network filesystems.
>> 
>> I was wondering if anyone has tried any other networked filesystems or 
>> had any recommendations. We have clients who would be doing this on both 
>> Windows and Unix/Linux so any insight there would be appreciated as well 
>> - it can be assumed that across any cluster the operating system use 
>> would be homogeneous (i.e. all nodes are on windows and would use 
>> windows shares, or all nodes are on linux and would use xyz filesystem).
>> 
>> Thanks in advance,
>> -Zach Bailey
>> 
> 