Posted to solr-user@lucene.apache.org by Markus Jelsma <ma...@openindex.io> on 2012/11/02 14:45:33 UTC
SolrCloud indexing blocks if node is recovering
Hi,
We just tested indexing several million docs from Hadoop to a 10-node, 2-replica SolrCloud cluster running this week's trunk. One of the nodes hit an OOM, but indexing continued without interruption. When I restarted the node, indexing stopped completely while the node tried, unsuccessfully, to recover. I restarted the node again, but that didn't help either. Finally I decided to stop the node completely and see what would happen: indexing resumed.
Why would the other nodes stop accepting incoming documents when one node misbehaves? The dying node wasn't the node we were sending documents to, and we are not using CloudSolrServer yet (see other thread). Is this known behavior? Is it a bug?
Thanks,
Markus
RE: SolrCloud indexing blocks if node is recovering
Posted by Markus Jelsma <ma...@openindex.io>.
https://issues.apache.org/jira/browse/SOLR-4038
Still trying to gather the logs.
-----Original message-----
> From:Mark Miller <ma...@gmail.com>
> Sent: Sat 03-Nov-2012 14:17
> To: Markus Jelsma <ma...@openindex.io>
> Cc: solr-user@lucene.apache.org
> Subject: Re: SolrCloud indexing blocks if node is recovering
>
> The OOM machine and any surrounding nodes if possible (especially the leader of the shard).
>
> Not sure what I'm looking for yet, so the more info the better.
>
> - Mark
Re: SolrCloud indexing blocks if node is recovering
Posted by Mark Miller <ma...@gmail.com>.
The OOM machine and any surrounding nodes if possible (especially the leader of the shard).
Not sure what I'm looking for yet, so the more info the better.
- Mark
On Nov 3, 2012, at 5:23 AM, Markus Jelsma <ma...@openindex.io> wrote:
> Hi - yes, I should be able to make sense of them next Monday. I assume you're not so much interested in the OOM machine as in the surrounding nodes that blocked?
RE: SolrCloud indexing blocks if node is recovering
Posted by Markus Jelsma <ma...@openindex.io>.
Hi - yes, I should be able to make sense of them next Monday. I assume you're not so much interested in the OOM machine as in the surrounding nodes that blocked?
-----Original message-----
> From:Mark Miller <ma...@gmail.com>
> Sent: Sat 03-Nov-2012 03:14
> To: solr-user@lucene.apache.org
> Subject: Re: SolrCloud indexing blocks if node is recovering
>
> Doesn't sound right. Still have the logs?
>
> - Mark
Re: SolrCloud indexing blocks if node is recovering
Posted by Mark Miller <ma...@gmail.com>.
Doesn't sound right. Still have the logs?
- Mark
On Fri, Nov 2, 2012 at 9:45 AM, Markus Jelsma
<ma...@openindex.io> wrote:
> Hi,
>
> We just tested indexing several million docs from Hadoop to a 10-node, 2-replica SolrCloud cluster running this week's trunk. One of the nodes hit an OOM, but indexing continued without interruption. When I restarted the node, indexing stopped completely while the node tried, unsuccessfully, to recover. I restarted the node again, but that didn't help either. Finally I decided to stop the node completely and see what would happen: indexing resumed.
>
> Why would the other nodes stop accepting incoming documents when one node misbehaves? The dying node wasn't the node we were sending documents to, and we are not using CloudSolrServer yet (see other thread). Is this known behavior? Is it a bug?
>
> Thanks,
> Markus
--
- Mark