Posted to solr-user@lucene.apache.org by Jordan Drake <jo...@exterro.com> on 2016/06/02 18:37:28 UTC

Zookeeper hanging after a commit

Hi all,

We are in the process of streamlining our indexing pipeline and trying to
improve performance. We came across an issue where ZooKeeper seems to
hang for 10+ minutes (we've seen it as high as 40 min) after committing.
See the portion of the logs below.

Our indexing is being done using the MapReduceIndexerTool with the go-live
option to merge into our live Solr.
The creation of the segments in mapreduce is fairly quick, and the merge is
usually fast. It's just that we occasionally see this issue in one of our
environments.

I'm not sure whether this is a Zookeeper or Solr issue or if this is just
expected behavior. Any ideas on where to look for debugging?



16/06/02 09:03:06 INFO hadoop.MapReduceIndexerTool: Indexing 1 files
using 1 real mappers into 1 reducers
16/06/02 09:04:08 INFO hadoop.MapReduceIndexerTool: Done. Indexing 1
files using 1 real mappers into 1 reducers took 2.06103613E10 secs
16/06/02 09:04:08 INFO hadoop.GoLive: Live merging of output shards
into Solr cluster...
16/06/02 09:04:08 INFO hadoop.GoLive: Live merge
hdfs://192.168.5.228:8020/indexed/tmp/e2e/2223/results/part-00000 into
http://192.168.5.227:8983/solr
16/06/02 09:04:22 INFO hadoop.GoLive: Committing live merge...
16/06/02 09:04:22 INFO zookeeper.ZooKeeper: Initiating client
connection, connectString=192.168.5.227:9983 sessionTimeout=10000
watcher=org.apache.solr.common.cloud.ConnectionManager@1deca477
16/06/02 09:04:22 INFO cloud.ConnectionManager: Waiting for client to
connect to ZooKeeper
16/06/02 09:04:22 INFO zookeeper.ClientCnxn: Opening socket connection
to server 192.168.5.227/192.168.5.227:9983. Will not attempt to
authenticate using SASL (unknown error)
16/06/02 09:04:22 INFO zookeeper.ClientCnxn: Socket connection
established to 192.168.5.227/192.168.5.227:9983, initiating session
16/06/02 09:04:22 INFO zookeeper.ClientCnxn: Session establishment
complete on server 192.168.5.227/192.168.5.227:9983, sessionid =
0x154e9ea749c028f, negotiated timeout = 10000
16/06/02 09:04:22 INFO cloud.ConnectionManager: Watcher
org.apache.solr.common.cloud.ConnectionManager@1deca477
name:ZooKeeperConnection Watcher:192.168.5.227:9983 got event
WatchedEvent state:SyncConnected type:None path:null path:null
type:None
16/06/02 09:04:22 INFO cloud.ConnectionManager: Client is connected to
ZooKeeper
*16/06/02 09:04:22 INFO cloud.ZkStateReader: Updating cluster
state from ZooKeeper...
16/06/02 09:18:17 INFO zookeeper.ZooKeeper: Session: 0x154e9ea749c028f closed*
16/06/02 09:18:17 INFO zookeeper.ClientCnxn: EventThread shut down
16/06/02 09:18:17 INFO hadoop.GoLive: Done committing live merge
16/06/02 09:18:17 INFO hadoop.GoLive: Live merging of index shards
into Solr cluster took 2.83196359E11 secs
16/06/02 09:18:17 INFO hadoop.GoLive: Live merging completed successfully
16/06/02 09:18:17 INFO hadoop.MapReduceIndexerTool: Succeeded with
job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper,
jobId: job_1464681461364_0604
16/06/02 09:18:17 INFO hadoop.MapReduceIndexerTool: Success. Done.
Program took 3.04902275E11 secs. Goodbye.



Thanks,
Jordan Drake
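
To pin down where the time goes in a log like the one above, it can help to compute the gap between consecutive timestamped lines. The following is only a rough sketch: it assumes the timestamps are in yy/MM/dd HH:mm:ss form (as they appear above), the function names are made up for illustration, and the sample lines are copied from the log with the archive's line wrapping joined back together.

```python
from datetime import datetime

# A few lines from the log above, with wrapped lines rejoined.
LOG = """\
16/06/02 09:04:22 INFO hadoop.GoLive: Committing live merge...
16/06/02 09:04:22 INFO cloud.ZkStateReader: Updating cluster state from ZooKeeper...
16/06/02 09:18:17 INFO zookeeper.ZooKeeper: Session: 0x154e9ea749c028f closed
16/06/02 09:18:17 INFO hadoop.GoLive: Done committing live merge
"""


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if the line has none."""
    try:
        # Assumed format: two-digit year, month, day, then time of day.
        return datetime.strptime(line[:17], "%y/%m/%d %H:%M:%S")
    except ValueError:
        return None  # continuation line or unrelated text


def largest_gap(text):
    """Return (seconds, line_before, line_after) for the longest pause in the log."""
    stamped = [(parse_ts(line), line) for line in text.splitlines()]
    stamped = [(ts, line) for ts, line in stamped if ts is not None]
    best = (0.0, None, None)
    for (t0, l0), (t1, l1) in zip(stamped, stamped[1:]):
        gap = (t1 - t0).total_seconds()
        if gap > best[0]:
            best = (gap, l0, l1)
    return best


seconds, before, after = largest_gap(LOG)
print(f"{seconds:.0f}s pause between:\n  {before}\n  {after}")
```

On the sample lines this points at the interval between "Updating cluster state from ZooKeeper..." and the session being closed. It only shows where the client was waiting, not which component it was waiting on.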

Re: Zookeeper hanging after a commit

Posted by Erick Erickson <er...@gmail.com>.
Zookeeper hanging? If it were truly unresponsive, I would
think your entire SolrCloud would be down. I guess you
could test this by, say, creating a new collection and
seeing if it goes live; if Zookeeper is truly unresponsive,
that would fail.

Are you sure it's not just the merging that's going
on as part of MRIT?

Best,
Erick
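
A lighter-weight check than creating a collection (and not the test suggested above, just a related one) is to send ZooKeeper its "ruok" four-letter command while the job appears stuck; a healthy server replies "imok". The sketch below assumes the four-letter commands are enabled on the server (ZooKeeper 3.5 and later require them to be whitelisted via 4lw.commands.whitelist), and the function names are hypothetical.

```python
import socket


def zk_four_letter(host, port, cmd=b"ruok", timeout=5.0):
    """Send a ZooKeeper four-letter command and return the raw reply as text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(cmd)
        chunks = []
        while True:
            # The server closes the connection after replying, which ends the loop.
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def zk_is_ok(host, port, timeout=5.0):
    """Return True if the server answers 'imok'; False on any other reply or socket error."""
    try:
        return zk_four_letter(host, port, timeout=timeout).strip() == "imok"
    except OSError:  # refused connection, timeout, unreachable host, ...
        return False
```

For the setup in the log, that would be zk_is_ok("192.168.5.227", 9983), run from the indexing host during the pause. A True result during the pause would suggest ZooKeeper itself is answering and that the time is being spent elsewhere, for example in the merge or the commit on the Solr side.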
