Posted to solr-user@lucene.apache.org by allrightname <al...@gmail.com> on 2013/08/22 21:23:16 UTC

Re: updating docs in solr cloud hangs

Erick,

I've read over SOLR-4816 after finding your comment about the server-side
stack traces showing threads locked up over semaphores and I'm curious how
that issue cures the problem on the server-side as the patch only includes
client-side changes. Do the servers get so tied up shuffling documents
around when they're not sent to the master that they get blocked as
described? If they do get blocked due to shuffling documents around is a
client-side fix for this not more of a workaround than a true fix?

I'm entirely willing to apply this patch to all of the code I've got that
talks to my Solr servers and try it out, but I'm reluctant to because this
looks like a client-side fix to a server-side issue.

Thanks,
Greg




RE: updating docs in solr cloud hangs

Posted by Greg Walters <gw...@sherpaanalytics.com>.

Thanks, Erick, that's exactly the clarification/confirmation I was looking for!

Greg

Re: updating docs in solr cloud hangs

Posted by Erick Erickson <er...@gmail.com>.

Right, it's a little arcane. But the lockup happens because the
various leaders send documents to each other and wait
for the responses. If there are a _lot_ of incoming update requests
to various leaders, this can produce a distributed deadlock.
So the shuffling you refer to is the root of the issue.

If the leaders only receive documents for the shard they're
a leader of, then they won't have to send updates to other
leaders and shouldn't hit this condition.

But you're right, this situation was first encountered
by SolrJ clients sending lots and lots of parallel requests.
I don't remember whether it was just one client with lots of
threads or many clients. If you're not using SolrJ, then
the patch won't do you much good since it's client-side only.

As far as being a true fix or not, you can look at it as
kicking the can down the road. This patch has several
advantages:
1> It should pave the way for, and move towards,
    linear scalability when scaling up to many,
    many nodes and indexing from SolrJ.
2> It should improve throughput in the normal case as well.
3> Along the way it _should_ significantly lower (perhaps
    remove entirely) the chance that this deadlock will occur,
    again when indexing from SolrJ.

If you had a bunch of clients, say, posting CSV files
to SolrCloud, I'd guess you'd find this happening again.

So it's an improvement, not a perfect cure. But if you think
it'd help....

Best,
Erick
