Posted to user@manifoldcf.apache.org by Olivier Tavard <ol...@francelabs.com> on 2017/10/23 15:19:25 UTC

ZK based synchronization questions

Hello,

We configured MCF to use ZooKeeper-based synchronization instead of file-based synchronization. We noticed a huge improvement in the stability of MCF jobs in every case, especially on large crawls (15 million files using the Windows Share repository connector). Before that change, we had random errors while the job was running; since the change, we have not noticed any errors on the job.
However, after deploying that configuration on several servers, we saw errors and I would like to know what you suggest.
We installed MCF on servers that already run Solr 6.6.x. I saw on other threads on the mailing list that it was OK to use an existing ZK installation rather than a new ZK instance dedicated to MCF, so we use the same ZK ensemble for both Solr and MCF.
After starting MCF and Solr, we noticed errors in the MCF log on a few servers: Session 0x0 for server localhost/127.0.0.1:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
Then, after checking the ZK log, we saw this message: "WARN 2017-10-23 08:53:35,431 (NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181) - ZooKeeper|ZooKeeper|zookeeper.server.NIOServerCnxnFactory|Too many connections from /127.0.0.1 - max is 60"
Therefore we changed the maxClientCnxns parameter in our ZK configuration from 60 to 1000, as in the default MCF zookeeper.cfg file, and there has been no problem since.
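For reference, the change described above amounts to something like the following (paths and host are hypothetical and must be adjusted to your installation; the connection check requires a running ensemble):

```shell
# Hypothetical paths/host, for illustration only.
# 1) Raise the per-IP connection cap in the shared ensemble's config
#    (e.g. conf/zoo.cfg) to match the default shipped in MCF's zookeeper.cfg:
#      maxClientCnxns=1000
grep maxClientCnxns /opt/zookeeper/conf/zoo.cfg

# 2) After restarting ZooKeeper, count the connections currently held by
#    localhost ('cons' is one of ZooKeeper's four-letter admin commands):
echo cons | nc localhost 2181 | grep -c '/127.0.0.1:'
```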
I would just like to know why this parameter needs to be so high for MCF, and whether other people also share a ZK cluster between Solr and MCF without any problems. One last question: MCF ships with ZooKeeper 3.4.8, while Solr 6.6+ uses ZK 3.4.10. As written above, our ZK cluster runs 3.4.10 and we run MCF against it; is that OK, or would it be better to use a separate ZK installation with version 3.4.8 just for MCF? So far we have not seen any problems.

Thanks,
Best regards, 

Olivier TAVARD

Re: ZK based synchronization questions

Posted by Olivier Tavard <ol...@francelabs.com>.
Hi Karl,

OK, that makes sense, thanks for the explanation!

Best regards,

Olivier TAVARD


> On 31 Oct 2017, at 17:53, Karl Wright <da...@gmail.com> wrote:
> 
> Hi Olivier,
> 
> Zookeeper connections are pooled: ManifoldCF pulls them from the pool as needed and returns them when the lock operation is done. This means the total number of outstanding Zookeeper handles you need is on the same order as the number of operating threads in your ManifoldCF cluster. We do have some users, though, who run hundreds of worker threads under the mistaken assumption that more threads make the system faster, and when people do that, we often get tickets because the number of Zookeeper handles runs out. That is why the default is so large.
> 
> Thanks,
> Karl
> 


Re: ZK based synchronization questions

Posted by Karl Wright <da...@gmail.com>.
Hi Olivier,

Zookeeper connections are pooled: ManifoldCF pulls them from the pool as
needed and returns them when the lock operation is done. This means the
total number of outstanding Zookeeper handles you need is on the same
order as the number of operating threads in your ManifoldCF cluster. We
do have some users, though, who run hundreds of worker threads under the
mistaken assumption that more threads make the system faster, and when
people do that, we often get tickets because the number of Zookeeper
handles runs out. That is why the default is so large.
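The pooling behavior described above can be sketched in a few lines. This is an illustrative model only, not ManifoldCF's actual code; the HandlePool class and the pool size of 8 are hypothetical:

```python
# Illustrative sketch -- not ManifoldCF's implementation. A bounded pool of
# "handles": each worker checks one out for the duration of a lock operation
# and returns it, so outstanding handles track active threads, not total
# operations performed.
import queue
import threading

class HandlePool:
    def __init__(self, create_handle, size):
        self._pool = queue.Queue()
        for _ in range(size):
            self._pool.put(create_handle())

    def acquire(self):
        # Blocks when all handles are checked out, capping concurrency.
        return self._pool.get()

    def release(self, handle):
        self._pool.put(handle)

pool = HandlePool(create_handle=object, size=8)  # arbitrary demo size

in_use = 0
peak = 0
counter_lock = threading.Lock()

def worker():
    global in_use, peak
    for _ in range(100):
        h = pool.acquire()
        with counter_lock:
            in_use += 1
            peak = max(peak, in_use)
        # ... a ZK-backed lock operation would happen here ...
        with counter_lock:
            in_use -= 1
        pool.release(h)

threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# 800 lock operations ran, but at most 8 handles were ever outstanding.
print(peak)
```

The same logic is why sizing the ZK connection limit to the thread count (with headroom) is enough: a larger limit only matters when someone configures far more worker threads than intended.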

Thanks,
Karl


On Tue, Oct 31, 2017 at 12:23 PM, Olivier Tavard <
olivier.tavard@francelabs.com> wrote:

> Hi all,
> Just to clarify my concern on ZK:
> To my knowledge, best practice for ZK connections is not to go beyond
> 60. Is there any rationale for setting it at 1000 for MCF?
> Could this have side effects on our ZK cluster shared by MCF and
> SolrCloud?
>
> Thanks,
>
> Olivier
>

Re: ZK based synchronization questions

Posted by Olivier Tavard <ol...@francelabs.com>.
Hi all,
Just to clarify my concern on ZK:
To my knowledge, best practice for ZK connections is not to go beyond 60. Is there any rationale for setting it at 1000 for MCF?
Could this have side effects on our ZK cluster shared by MCF and SolrCloud?

Thanks,

Olivier

