Posted to dev@lucene.apache.org by Raveendra Yerraguntla <ra...@gmail.com> on 2016/03/23 04:28:51 UTC

Solr cloud replication factor

All,

I am using Solr 5.4 in SolrCloud mode on an 8-node cluster. I created the
index with a replication factor of 1, then switched to a replication
factor > 1 for redundancy. With the replication factor > 1, I tried to do
incremental indexing. When the incremental indexing runs, I get a stack
trace whose root cause points to write.lock not being available. Further
analysis found that there is only one write.lock across all shards
(leaders and replicas).

But with a replication factor of 1, I could see write.lock on all nodes.

Is this the expected behavior (one write.lock) in SolrCloud with a
replication factor > 1? If so, how can indexing be done (even if it is
slow) with distributed and redundant shards?

OR

Is there a config param I am missing that would create write.lock across
all shards with a replication factor > 1?

Appreciate your insights.


Thanks
Ravi

Re: Fwd: Solr cloud replication factor

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/25/2016 7:50 AM, Shawn Heisey wrote:
> If you're trying to use NFS to share an index directory between Solr
> nodes, don't do that. Each Solr node needs its own copy of all index
> data. Getting *this* to work *might* be possible, but even when it
> works, it's not very stable.

Followup on this part:  Because you're running in Cloud mode, trying to
share index data between replicas *will* have problems.  There's no way
to work around those problems.  It would completely explain the
"write.lock" issues.  Cloud mode ABSOLUTELY requires that each node has
its own copy of the data, and prefers to have full locking support from
the OS.

Many NFS implementations cannot provide full locking support, which is
why running Solr on NFS is not advised.

Thanks,
Shawn


Re: Fwd: Solr cloud replication factor

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/25/2016 7:29 AM, Raveendra Yerraguntla wrote:
> I received both replies. We most likely used some of the NFS options. I
> will try them early next week.

Running on NFS is not advised.  You can make it work, but Solr doesn't
like it.

If you're trying to use NFS to share an index directory between Solr
nodes, don't do that.  Each Solr node needs its own copy of all index
data.  Getting *this* to work *might* be possible, but even when it
works, it's not very stable.

Thanks,
Shawn


Re: Fwd: Solr cloud replication factor

Posted by Raveendra Yerraguntla <ra...@gmail.com>.
Thanks Shawn.

I received both replies. We most likely used some of the NFS options. I
will try them early next week.

Thanks
Ravi

On Wed, Mar 23, 2016 at 9:50 AM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 3/23/2016 6:00 AM, Raveendra Yerraguntla wrote:
> > I am using Solr 5.4 in SolrCloud mode on an 8-node cluster. I created
> > the index with a replication factor of 1, then switched to a
> > replication factor > 1 for redundancy. With the replication factor > 1,
> > I tried to do incremental indexing. When the incremental indexing
> > runs, I get a stack trace whose root cause points to write.lock not
> > being available. Further analysis found that there is only one
> > write.lock across all shards (leaders and replicas).
> >
> > But with a replication factor of 1, I could see write.lock on all
> > nodes.
>
> Did you see my reply on the dev list, sent before I told you that your
> question belonged on this list?
>
> In that reply I told you that replicationFactor will have no effect
> after you create your collection.  It also cannot cause problems with
> the write.lock file.
>
> I also said this:
>
> There are three major reasons for a problem with write.lock.
> 1) Solr is crashing and leaving the write.lock file behind.
> 2) You are trying to share an index directory between more than one core
> or Solr instance.
> 3) You are trying to run with your index data on a network filesystem
> like NFS.
>
> Thanks,
> Shawn
>
>

Re: Fwd: Solr cloud replication factor

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/23/2016 6:00 AM, Raveendra Yerraguntla wrote:
> I am using Solr 5.4 in SolrCloud mode on an 8-node cluster. I created the
> index with a replication factor of 1, then switched to a replication
> factor > 1 for redundancy. With the replication factor > 1, I tried to do
> incremental indexing. When the incremental indexing runs, I get a stack
> trace whose root cause points to write.lock not being available. Further
> analysis found that there is only one write.lock across all shards
> (leaders and replicas).
>
> But with a replication factor of 1, I could see write.lock on all nodes.

Did you see my reply on the dev list, sent before I told you that your
question belonged on this list? 

In that reply I told you that replicationFactor will have no effect
after you create your collection.  It also cannot cause problems with
the write.lock file.

I also said this:

There are three major reasons for a problem with write.lock.
1) Solr is crashing and leaving the write.lock file behind.
2) You are trying to share an index directory between more than one core
or Solr instance.
3) You are trying to run with your index data on a network filesystem
like NFS.

Thanks,
Shawn


Fwd: Solr cloud replication factor

Posted by Raveendra Yerraguntla <ra...@gmail.com>.
All,

I am using Solr 5.4 in SolrCloud mode on an 8-node cluster. I created the
index with a replication factor of 1, then switched to a replication
factor > 1 for redundancy. With the replication factor > 1, I tried to do
incremental indexing. When the incremental indexing runs, I get a stack
trace whose root cause points to write.lock not being available. Further
analysis found that there is only one write.lock across all shards
(leaders and replicas).

But with a replication factor of 1, I could see write.lock on all nodes.

Is this the expected behavior (one write.lock) in SolrCloud with a
replication factor > 1? If so, how can indexing be done (even if it is
slow) with distributed and redundant shards?

OR

Is there a config param I am missing that would create write.lock across
all shards with a replication factor > 1?

Appreciate your insights.


Thanks
Ravi

Re: Solr cloud replication factor

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/22/2016 9:28 PM, Raveendra Yerraguntla wrote:
>> I am using Solr 5.4 in SolrCloud mode on an 8-node cluster. I created the

I didn't notice when I replied before that the message was on the dev
list.  This mailing list is for discussions about development of Lucene
and Solr.

For user questions about Solr, please post on the solr-user mailing list.

http://lucene.apache.org/solr/resources.html#mailing-lists

Thanks,
Shawn



Re: Solr cloud replication factor

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/22/2016 9:28 PM, Raveendra Yerraguntla wrote:
> I am using Solr 5.4 in SolrCloud mode on an 8-node cluster. I created the
> index with a replication factor of 1, then switched to a replication
> factor > 1 for redundancy. With the replication factor > 1, I tried to do
> incremental indexing. When the incremental indexing runs, I get a stack
> trace whose root cause points to write.lock not being available. Further
> analysis found that there is only one write.lock across all shards
> (leaders and replicas).

Unless you use the HDFS Directory implementation in Solr, the *only*
time replicationFactor has *any* effect is when you first create your
collection.  After that, it has *zero* effect -- unless you are using
HDFS and have configured it in a particular way.

To achieve redundancy with the normal Directory implementation (usually
NRTCachingDirectoryFactory), you will need to either create the collection
with a replicationFactor higher than 1 or use the ADDREPLICA action on the
Collections API to create more replicas of your shards.
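
For reference, here is a minimal sketch of both approaches against the
Collections API. The hostnames, collection name, shard name, and config
set name are placeholders, not values taken from this thread.

# Sketch only -- requires the "requests" package; hostnames, collection,
# shard, and config set names below are placeholders.
import requests

SOLR = "http://solr-node1:8983/solr"

# Option 1: create the collection with redundancy from the start.
resp = requests.get(SOLR + "/admin/collections", params={
    "action": "CREATE",
    "name": "mycollection",
    "numShards": 4,
    "replicationFactor": 2,
    "maxShardsPerNode": 1,
    "collection.configName": "myconf",
    "wt": "json",
})
print(resp.json())

# Option 2: add a replica to an existing shard with ADDREPLICA.
resp = requests.get(SOLR + "/admin/collections", params={
    "action": "ADDREPLICA",
    "collection": "mycollection",
    "shard": "shard1",
    "node": "solr-node5:8983_solr",  # optional: pin the new replica to a node
    "wt": "json",
})
print(resp.json())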

> But with a replication factor of 1, I could see write.lock on all
> nodes.
>
> Is this the expected behavior (one write.lock) in SolrCloud with a
> replication factor > 1? If so, how can indexing be done (even if it is
> slow) with distributed and redundant shards?

There are three major reasons for a problem with write.lock.
1) Solr is crashing and leaving the write.lock file behind.
2) You are trying to share an index directory between more than one core
or Solr instance.
3) You are trying to run with your index data on a network filesystem
like NFS.
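
Regarding the first reason, a stale write.lock left behind by a crash can
be found by scanning each core's data directory while that Solr instance
is stopped. A rough sketch (the data path is a placeholder; only remove a
lock file when you are certain no Solr process still owns it):

# Sketch: list leftover write.lock files under a Solr data directory.
# Run only while the Solr instance that owns these cores is stopped.
# /var/solr/data is a placeholder for your actual Solr home.
import os

SOLR_DATA = "/var/solr/data"

for dirpath, dirnames, filenames in os.walk(SOLR_DATA):
    if "write.lock" in filenames:
        print("write.lock found in", dirpath)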

Thanks,
Shawn

