Posted to solr-user@lucene.apache.org by Imran Rajjad <ra...@gmail.com> on 2018/09/05 12:55:38 UTC

Solr 6.4.1: : SolrException:nfs no locks available

Hello,

I am using SolrCloud 6.4.1. After a hard restart the Solr nodes are
constantly shown in the DOWN state and will not go into recovery. I
have also deleted the write.lock files from all the replica folders, but
the problem will not go away. The error displayed in the web console is: no
locks available

My replica folders reside on an NFS mount, and I am using RHEL 6 /
CentOS 6.8. Has anyone ever faced this issue?

regards,
Imran

-- 
I.R

Re: Solr 6.4.1: : SolrException:nfs no locks available

Posted by Erick Erickson <er...@gmail.com>.
Here's what I copied from an explanation by Uwe Schindler, 'cause I
believe most anything he has to say on this subject:

It is quite simple: Lucene locking and commits do not work correctly on
NFS file systems because they are not fully POSIX conformant. Because
of this you may also produce corrupt indexes, as commits don't work
and can corrupt concurrently open files. You may also see JVM crashes
(SIGSEGV) if memory-mapped files are unmapped because of network
failures.

If you want to use Lucene on NFS mounts, you have two possibilities:
- Change to CIFS/Samba mounts (CIFS conforms to POSIX standards like
delete on last close and also supports correct locking with
NativeFSLockFactory) -- or move to local disks!
- Use a special deletion policy (https://lucene.apache.org/) to make
the commits not corrupt your open IndexSearchers because of suddenly
disappearing files (Lucene deletes files while they are open, as POSIX
has delete-on-last-close) and use SimpleFSLockFactory. But
SimpleFSLockFactory may hit stale lock file issues on killed JVMs.
Also, don't use MMapDirectory for file storage, as this will likely
crash your JVM on network problems!
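
At the raw Lucene level, the second option might look roughly like the
sketch below -- a minimal, illustrative sketch only, assuming Lucene 6.x;
the index path is made up, and in Solr the same choice would normally be
made via lockType in solrconfig.xml rather than hand-written code:

    import java.nio.file.Path;
    import java.nio.file.Paths;

    import org.apache.lucene.analysis.standard.StandardAnalyzer;
    import org.apache.lucene.index.IndexCommit;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.index.IndexWriterConfig;
    import org.apache.lucene.index.KeepOnlyLastCommitDeletionPolicy;
    import org.apache.lucene.index.SnapshotDeletionPolicy;
    import org.apache.lucene.store.Directory;
    import org.apache.lucene.store.NIOFSDirectory;
    import org.apache.lucene.store.SimpleFSLockFactory;

    public class NfsFriendlyIndexSketch {
      public static void main(String[] args) throws Exception {
        // Hypothetical index path on an NFS mount.
        Path indexPath = Paths.get("/mnt/nfs/solr/index");

        // NIOFSDirectory avoids memory mapping; SimpleFSLockFactory avoids
        // the native POSIX locks that NFS handles poorly.
        Directory dir = new NIOFSDirectory(indexPath, SimpleFSLockFactory.INSTANCE);

        // Wrap the default deletion policy so commits can be pinned
        // (snapshotted) and their files are not deleted while readers
        // still have them open.
        SnapshotDeletionPolicy snapshotter =
            new SnapshotDeletionPolicy(new KeepOnlyLastCommitDeletionPolicy());

        IndexWriterConfig iwc = new IndexWriterConfig(new StandardAnalyzer())
            .setIndexDeletionPolicy(snapshotter);

        try (IndexWriter writer = new IndexWriter(dir, iwc)) {
          writer.commit();                             // make sure a commit exists
          IndexCommit commit = snapshotter.snapshot(); // pin that commit
          try {
            // ... open readers against "commit" and do work here ...
          } finally {
            snapshotter.release(commit);  // release once all readers are closed
            writer.deleteUnusedFiles();   // let the pinned files be cleaned up
          }
        }
      }
    }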

Some background: The original and recommended lock system works
correctly with killed VMs, as the existence of the lock file has nothing
to do with the "state" of being locked. The lock file is just a
placeholder to actually have a file instance to do the locking. There
is no solution for mixed NFS and non-NFS directories. So either get
rid of them or add your own logic to choose the right lock and deletion
policy depending on the file system. You may use Java 7+'s Path/Files
API to get all mount points. Memory mapping is risky with NFS, as a
no-longer reachable file may suddenly unmap the cache buffers from
process space and the next access will segmentation fault your JVM.
The snapshot deletion policy keeps the last commits available on disk,
so the "delete-on-last-close" behaviour of POSIX is not required. But
you have to take care to delete snapshots when you have closed all
readers.
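
The "choose the right lock and deletion policy depending on the file
system" part can be sketched with the java.nio.file API roughly as
below; the path and the printed hints are only illustrative, and the
strings returned by FileStore.type() depend on the OS:

    import java.nio.file.FileStore;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;

    public class IndexFileSystemCheck {
      public static void main(String[] args) throws Exception {
        // Hypothetical index directory; pass the real one as the first argument.
        Path indexDir = Paths.get(args.length > 0 ? args[0] : "/var/solr/data");

        // FileStore describes the mount the path lives on; type() reports the
        // file system name as seen by the OS (e.g. "nfs", "nfs4", "ext4", "xfs").
        FileStore store = Files.getFileStore(indexDir);
        String fsType = store.type().toLowerCase();

        if (fsType.startsWith("nfs")) {
          System.out.println(indexDir + " is on NFS (" + fsType
              + "): prefer SimpleFSLockFactory and avoid MMapDirectory.");
        } else {
          System.out.println(indexDir + " is on " + fsType
              + ": the default NativeFSLockFactory should be fine.");
        }
      }
    }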

On Wed, Sep 5, 2018 at 6:59 AM Shawn Heisey <ap...@elyograg.org> wrote:
>
> On 9/5/2018 6:55 AM, Imran Rajjad wrote:
> > I am using SolrCloud 6.4.1. After a hard restart the Solr nodes are
> > constantly shown in the DOWN state and will not go into recovery. I
> > have also deleted the write.lock files from all the replica folders, but
> > the problem will not go away. The error displayed in the web console is: no
> > locks available
> >
> > My replica folders reside on an NFS mount, and I am using RHEL 6 /
> > CentOS 6.8. Has anyone ever faced this issue?
>
> Lucene-based software (including Solr) does NOT work well on NFS.  NFS
> does not provide all the locking functionality that Lucene tries to use
> by default.
>
> You're probably going to need to change the lock factory, and might even
> need to completely disable locking.  If you do disable locking, you have
> to be VERY careful to never allow more than one core or more than one
> Solr instance to try and open a core directory.  Doing so will likely
> corrupt the index.
>
> I strongly recommend NOT using NFS storage for Solr.  In addition to
> locking problems, it also tends to be extremely slow.  Use a local filesystem.
>
> Thanks,
> Shawn
>

Re: Solr 6.4.1: : SolrException:nfs no locks available

Posted by Shawn Heisey <ap...@elyograg.org>.
On 9/5/2018 6:55 AM, Imran Rajjad wrote:
> I am using SolrCloud 6.4.1. After a hard restart the Solr nodes are
> constantly shown in the DOWN state and will not go into recovery. I
> have also deleted the write.lock files from all the replica folders, but
> the problem will not go away. The error displayed in the web console is: no
> locks available
>
> My replica folders reside on an NFS mount, and I am using RHEL 6 /
> CentOS 6.8. Has anyone ever faced this issue?

Lucene-based software (including Solr) does NOT work well on NFS.  NFS 
does not provide all the locking functionality that Lucene tries to use 
by default.

You're probably going to need to change the lock factory, and might even 
need to completely disable locking.  If you do disable locking, you have 
to be VERY careful to never allow more than one core or more than one 
Solr instance to try and open a core directory.  Doing so will likely 
corrupt the index.
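
For reference, the lock factory is chosen per core in the <indexConfig>
section of solrconfig.xml. A hedged sketch only -- the value shown is just
an example, and the exact set of supported lockType values should be
checked against the Reference Guide for your Solr version:

    <indexConfig>
      <!-- "native" (the default) uses OS-level locks that NFS often cannot
           provide; "simple" uses a plain lock file; "single" assumes only one
           process ever opens the index; "none" disables locking entirely,
           which is dangerous for the reasons above. -->
      <lockType>${solr.lock.type:simple}</lockType>
    </indexConfig>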

I strongly recommend NOT using NFS storage for Solr.  In addition to 
locking problems, it also tends to be extremely slow.  Use a local filesystem.

Thanks,
Shawn