Posted to solr-user@lucene.apache.org by Gili Nachum <gi...@gmail.com> on 2014/11/04 22:57:38 UTC

A bad idea to store core data directory over NAS?

My data center is out of SAN and local disk storage. Is it a big no-no to
store the Solr core data folder on NAS?
That means (1) the Lucene index and (2) the transaction log.

The NAS mount would be accessed by a single machine. I do care about
performance.

If I do go with NAS, should I expect index corruption and other oddities?

Re: A bad idea to store core data directory over NAS?

Posted by Jack Krupansky <ja...@basetechnology.com>.
Think of Solr/SolrCloud itself as a SAN: smart networked machines that
intensely manage local storage. Having two levels of "SAN" is
counterproductive.

-- Jack Krupansky



Re: A bad idea to store core data directory over NAS?

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Wed, 2014-11-05 at 23:04 +0100, Gili Nachum wrote:
> Update: Turns out that someone did find 50TB of SAN storage lying around the
> data center for me to use, so I won't find out for myself what life with
> NFS/NAS is like in the near future.

There seem to be issues, especially with NFS, that you need to consider.
The thread "Effectiveness MMapDirectory on NFS Mounted indexes" on the
lucene-users mailing list is about that.

- Toke Eskildsen, State and University Library, Denmark



Re: A bad idea to store core data directory over NAS?

Posted by Gili Nachum <gi...@gmail.com>.
So NFS is doable, and performance will vary with the grade of storage I'm
getting and the volume of other activity on the NAS. Good to know it isn't
known to cause index corruption in Lucene (failures to sync to disk and
such).

Update: Turns out that someone did find 50TB of SAN storage lying around the
data center for me to use, so I won't find out for myself what life with
NFS/NAS is like in the near future.

Cheers!


Re: A bad idea to store core data directory over NAS?

Posted by Charlie Hull <ch...@flax.co.uk>.
In our experience yes, it's a bad idea.

Charlie


Re: A bad idea to store core data directory over NAS?

Posted by andrey prokopenko <an...@gmail.com>.
A SolrCloud cluster depends heavily on data locality and high I/O throughput,
so any NFS setup that reaches the disk array over the network is many times
slower than direct I/O and should be avoided. A classical JBOD (just a bunch
of disks) config plus memory-mapped files ensures high performance.
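
To make "memory mapped files" concrete, here is a minimal sketch in Python.
It is an analogy only, not Lucene's MMapDirectory code, and the function name
is made up for illustration. The file is mapped into the process address
space and bytes are read by slicing; the OS pages data in on demand. On a
local disk a page miss costs one disk read, while on an NFS mount the same
miss has to be fetched over the network.

```python
import mmap


def read_slice_mmap(path, offset, length):
    """Read `length` bytes at `offset` through a read-only memory mapping.

    Hypothetical helper for illustration; the file must not be empty,
    because an empty file cannot be mapped.
    """
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ makes the mapping read-only.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            # Slicing copies the requested bytes out of the mapping.
            return mm[offset:offset + length]
```

Pages that are already in the OS cache are served from memory, which is why
this access pattern rewards machines with enough free RAM to cache the hot
part of the index.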


Re: A bad idea to store core data directory over NAS?

Posted by Walter Underwood <wu...@wunderwood.org>.
My experience was with Solr 1.2 and regular old NFS, so that was probably worst case. I was very surprised that it was that bad, though.

So benchmark it before you assume it is fast enough. 
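
A rough way to start such a benchmark is to time small random reads on each
candidate mount. The sketch below is an assumption-laden starting point, not
a Solr benchmark: the function name, file size, block size and read count are
all made up for illustration, and a freshly written file may be served from
the OS page cache, so the numbers are optimistic.

```python
import os
import random
import tempfile
import time


def random_read_latency(directory, file_size=16 * 1024 * 1024,
                        block_size=4096, reads=2000, seed=0):
    """Return mean seconds per random block read from a scratch file
    created in `directory`. The scratch file is deleted afterwards."""
    rng = random.Random(seed)
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        # Fill the scratch file with random bytes and force it to storage.
        with os.fdopen(fd, "wb") as f:
            f.write(os.urandom(file_size))
            f.flush()
            os.fsync(f.fileno())
        blocks = file_size // block_size
        with open(path, "rb") as f:
            start = time.perf_counter()
            for _ in range(reads):
                # Jump to a random block boundary and read one block.
                f.seek(rng.randrange(blocks) * block_size)
                f.read(block_size)
            elapsed = time.perf_counter() - start
        return elapsed / reads
    finally:
        os.remove(path)
```

Calling it once with a directory on the NAS mount and once with a directory
on local disk gives a first comparison. The measurement that really matters
is still a run of representative queries and updates against Solr itself on
each kind of storage.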

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/



Re: A bad idea to store core data directory over NAS?

Posted by Toke Eskildsen <te...@statsbiblioteket.dk>.
On Tue, 2014-11-04 at 22:57 +0100, Gili Nachum wrote:
> My data center is out of SAN or local disk storage - is it a big no-no to
> store Solr core data folder over NAS?

It depends on your NAS speed. Both Walter and David are right: It can
perform really badly or quite satisfactorily. We briefly experimented with
using 400GB of Isilon ( http://www.emc.com/isilon ) SSD cache as backend
for a searcher. As far as I remember, speed was surprisingly fine; about
3 times slower than with similar local storage. As we needed 20TB+ of
index, it would be too expensive for us to use the enterprise NAS system
though (long story).

> The NAS mount would be accessed by a single machine. I do care about
> performance.

I have a vision of an off-the-shelf 4-drive box Gorilla-taped to the side
of a server rack :-)

Or in other words: If the NAS is only to be used by a single machine,
this will be more of a kludge than a solid solution. Is it not possible
to upgrade local storage to hold the data? How large an index are we
talking about?

> If I do go with NAS, should I expect index corruption and other oddities?

Not that I know of. As the NAS is dedicated, you won't compete for
performance there. Do check if your network is fast enough though.
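
A back-of-the-envelope check of the network side can look like the sketch
below. The function name and the default efficiency factor are assumptions
for illustration; real NFS throughput depends on protocol overhead, latency
and competing traffic.

```python
def min_read_seconds(index_bytes, link_bits_per_second, efficiency=0.9):
    """Lower bound on the time needed to pull `index_bytes` over the link.

    `efficiency` is an assumed fraction of the nominal link rate that is
    actually usable for payload data.
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    usable_bytes_per_second = link_bits_per_second / 8 * efficiency
    return index_bytes / usable_bytes_per_second
```

For example, reading a 2 TB index once over a 1 Gbit/s link takes at least
16,000 seconds even at full nominal rate (efficiency=1.0), so warming a cold
index of that size would take hours on such a link.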


- Toke Eskildsen, State and University Library, Denmark
I highly recommend Gorilla Tape for semi-permanent kludges.


Re: A bad idea to store core data directory over NAS?

Posted by David Santamauro <da...@gmail.com>.
Interestingly enough, one of our installations has a 16-node cluster 
using 4 NAS devices (Xen as virtualization backbone). The data drive for
the individual node that holds the index is a stripe of 2x 500GB disks. 
Each disk of the stripe is on a different NAS device (scattered 
pattern). With a total index size (not including replicas) of over 2TB, 
performance is pretty snappy.
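
For readers unfamiliar with striping, the sketch below shows how a stripe
spreads consecutive chunks of a volume across its member disks, which is why
reads can be served by both NAS devices in parallel. The round-robin layout
and the stripe unit size are assumptions for illustration; the setup above
does not say how the stripe was configured.

```python
def locate(offset, stripe_unit, disks):
    """Map a logical byte offset to (disk index, byte offset on that disk)
    for a simple round-robin stripe (RAID-0 style layout, assumed)."""
    stripe_index, within = divmod(offset, stripe_unit)
    disk = stripe_index % disks
    # Each disk holds every `disks`-th stripe unit, packed back to back.
    disk_offset = (stripe_index // disks) * stripe_unit + within
    return disk, disk_offset
```

With two disks and a 64 KiB stripe unit, the first 64 KiB of the volume lands
on disk 0, the next 64 KiB on disk 1, and the third chunk goes back to disk 0.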

Indexing, of course, is resource intensive (disk I/O on the NAS as well 
as network bandwidth). Also, other activity on each NAS by other NFS 
clients could severely impact search and indexing performance, so one
needs to be aware of competing activity.

David



Re: A bad idea to store core data directory over NAS?

Posted by Walter Underwood <wu...@wunderwood.org>.
I did that once by accident. It was 100X slower.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/
