Posted to solr-user@lucene.apache.org by Luis Cappa Banda <lu...@gmail.com> on 2013/03/13 18:16:31 UTC

SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Hello, guys!

I've been experiencing some annoying behavior in my current production
setup. Here is a snapshot:


   - SolrCloud: 2 shards
   - Zookeeper ensemble: 3 nodes on *different machines* (most tutorials
   install all 3 Zookeeper nodes on the same machine).
   - This is the zoo.cfg used on every node:

tickTime=2000  // I´ve also tried with 60000

initLimit=10

syncLimit=5

dataDir=/var/lib/zookeeper

clientPort=9000

server.1=zoohost1:2888:3888

server.2=zoohost1:2888:3888

server.3=zoohost1:2888:3888



   - I've developed a Java application with a REST API (let's call it the
   *engine*) that dispatches queries to SolrCloud. It's a wrapper around
   CloudSolrServer, so it also has to be given some Zookeeper connection
   parameters. They are loaded dynamically when the application is deployed
   in a Tomcat server; the values I'm currently using are:

cloudSolrServer.setZkConnectTimeout(60000)

cloudSolrServer.setZkClientTimeout(60000)


*THE PROBLEM*

Everything goes OK, but after roughly two days (yes, I've checked that
this happens more or less periodically) the *engine blocks* and
cannot dispatch any query to SolrCloud.

   - The *engine* log outputs "updating Zookeeper..." one last time,
   but the update never completes.
   - I´ve checked SolrCloud via Solr Admin interface and it´s OK:
   everything is green, and I cant execute queries directly into Solr.
   - Solr therefore appears to be OK, so the next step is to restart the
   *engine*, but "updating Zookeeper..." appears again. Unfortunately,
   switching it off and on again doesn't work here :-(
   - I've also checked the Zookeeper logs: some connection log-out
   messages appear, but the ensemble appears to be OK too.
   - *The end:* if I restart the Zookeeper nodes one by one, restart
   SolrCloud, and restart the engine, the problem is solved. I'm using
   Amazon AWS for hosting, so I've ruled out connection problems between
   instances.


Has anyone experienced something similar? Can anybody shed some light on
this problem?

Thank you very much.


Regards,


- Luis Cappa
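
A note on the zoo.cfg as posted above: ZooKeeper reads zoo.cfg as a Java properties file, so a trailing "// ..." remark on the tickTime line becomes part of the value rather than a comment, and all three server.N entries name zoohost1 (which may simply be a side effect of anonymizing the host names for the list). The sketch below is a rough pre-flight check along those lines, assuming the file is parsed with java.util.Properties. The class name ZooCfgCheck and its check method are made up for illustration and are not part of ZooKeeper.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.TreeSet;

public class ZooCfgCheck {

    /** Returns human-readable warnings for a zoo.cfg passed in as text. */
    public static List<String> check(String cfgText) {
        Properties cfg = new Properties();
        try {
            cfg.load(new StringReader(cfgText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        List<String> warnings = new ArrayList<>();
        Map<String, String> seenAddresses = new HashMap<>();

        // Sort the keys so the warnings come out in a stable order.
        for (String key : new TreeSet<>(cfg.stringPropertyNames())) {
            String value = cfg.getProperty(key).trim();

            if (key.equals("tickTime") || key.equals("initLimit")
                    || key.equals("syncLimit") || key.equals("clientPort")) {
                // Numeric settings must be plain integers; inline remarks break them.
                try {
                    Integer.parseInt(value);
                } catch (NumberFormatException e) {
                    warnings.add(key + " is not a plain integer: '" + value + "'");
                }
            } else if (key.startsWith("server.")) {
                // Each ensemble member should have its own host:port:port address.
                String previous = seenAddresses.put(value, key);
                if (previous != null) {
                    warnings.add(key + " has the same address as " + previous + ": " + value);
                }
            }
        }
        return warnings;
    }

    public static void main(String[] args) {
        String posted = String.join("\n",
                "tickTime=2000  // also tried 60000",
                "initLimit=10",
                "syncLimit=5",
                "dataDir=/var/lib/zookeeper",
                "clientPort=9000",
                "server.1=zoohost1:2888:3888",
                "server.2=zoohost1:2888:3888",
                "server.3=zoohost1:2888:3888");
        for (String warning : check(posted)) {
            System.out.println(warning);
        }
    }
}
```

In a real zoo.cfg, remarks go on their own line starting with "#". None of this is claimed to be the cause of the hang described above; it only rules out two easy mistakes before looking further.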

SolrCloud with Zookeeper ensemble : fail to restart master server

Posted by Patrick Mi <pa...@touchpointgroup.com>.

Hi there,

I have experienced some problems starting the master server.

Solr 4.2 under Tomcat 7 on CentOS 6.

Configuration:
3 Solr instances running on different machines, one shard, 3 cores, 2
replicas, using the Zookeeper that comes with Solr.

The master server A has the following run options: -Dbootstrap_conf=true
-DzkRun -DnumShards=1
The slave servers B and C have: -DzkHost=masterServerIP:2181

Add/update/delete etc. work well after I start up the master and slave
servers in order.

While master A is up, stopping and starting slaves B and C works fine.

While slaves B and C are running, I can't restart master A. Only after I
shut down B and C can I start master A.

Is this a feature, a bug, or something I haven't configured properly?

Thanks in advance for your help.

Regards,
Patrick

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

Me neither. Please, Mark, can you tell us how?

2013/3/15 Jack Park <ja...@topicquests.org>

> Is there a document that tells how to create multiple threads? Search
> returns many hits which orbit this idea, but I haven't spotted one
> which tells how.
>
> Thanks
> Jack
>
> On Fri, Mar 15, 2013 at 1:01 PM, Mark Miller <ma...@gmail.com> wrote:
> > You def have to use multiple threads with it for it to be fast, but 3 or
> > 4 docs a second still sounds absurdly slow.
> >
> > - Mark
> >
> > On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda <lu...@gmail.com> wrote:
> >
> >> And up! :-)
> >>
> >> I've been wondering if using CloudSolrServer has something to do with
> >> this. Does it perform badly when a CloudSolrServer singleton receives
> >> multiple queries? Is it recommended to keep a list of CloudSolrServer
> >> instances and select one of them with a round-robin policy?
> >>
> >>
> >>
> >> 2013/3/14 Luis Cappa Banda <lu...@gmail.com>
> >>
> >>> Hello!
> >>>
> >>> Thanks a lot, Erick! I've attached some stack traces during a normal
> >>> 'engine' running.
> >>>
> >>> Cheers,
> >>>
> >>> - Luis Cappa
> >>>
> >>>
> >>> 2013/3/13 Erick Erickson <er...@gmail.com>
> >>>
> >>>> Stack traces..
> >>>>
> >>>> First,
> >>>> jps -l
> >>>>
> >>>> that will give you the process IDs of your running Java processes.
> >>>> Then:
> >>>>
> >>>> jstack <pid from above>
> >>>>
> >>>> Usually I pipe the output from jstack into a text file...
> >>>>
> >>>> Best
> >>>> Erick
> >>>>
> >>>>
> >>>> On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda <lu...@gmail.com> wrote:
> >>>>
> >>>>> Uhm, how can I do that... 'cleanly'? I know that with JConsole it's
> >>>>> possible to output these traces, but with a .war application built on
> >>>>> top of Spring I don't know how to do that. In any case, here is my
> >>>>> CloudSolrServer wrapper that is used by other classes. It contains no
> >>>>> synchronized method or block:
> >>>>>
> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >>>>>
> >>>>> public class BinaryLBHttpSolrServer extends LBHttpSolrServer {
> >>>>>
> >>>>>     private static final long serialVersionUID = 3905956120804659445L;
> >>>>>
> >>>>>     public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
> >>>>>         super(endpoints);
> >>>>>     }
> >>>>>
> >>>>>     // Send requests to every backing server in the binary (javabin) format.
> >>>>>     @Override
> >>>>>     protected HttpSolrServer makeServer(String server) throws MalformedURLException {
> >>>>>         HttpSolrServer solrServer = super.makeServer(server);
> >>>>>         solrServer.setRequestWriter(new BinaryRequestWriter());
> >>>>>         return solrServer;
> >>>>>     }
> >>>>> }
> >>>>>
> >>>>> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> >>>>>
> >>>>> public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {
> >>>>>
> >>>>>     private CloudSolrServer cloudSolrServer;
> >>>>>
> >>>>>     private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
> >>>>>
> >>>>>     public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
> >>>>>             int clientTimeout, int connectTimeout, String cloudCollection) {
> >>>>>         try {
> >>>>>             BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
> >>>>>             this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
> >>>>>             this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
> >>>>>             this.cloudSolrServer.setZkClientTimeout(clientTimeout);
> >>>>>             this.cloudSolrServer.setDefaultCollection(cloudCollection);
> >>>>>         } catch (MalformedURLException e) {
> >>>>>             log.error(e);
> >>>>>         }
> >>>>>     }
> >>>>>
> >>>>>     @Override
> >>>>>     public QueryResponse search(SolrQuery query) throws SolrServerException {
> >>>>>         return cloudSolrServer.query(query, METHOD.POST);
> >>>>>     }
> >>>>>
> >>>>>     // Tries to index the bean up to 4 times before giving up.
> >>>>>     @Override
> >>>>>     public boolean index(DocumentBean user) {
> >>>>>         boolean indexed = false;
> >>>>>         int retries = 0;
> >>>>>         do {
> >>>>>             indexed = addBean(user);
> >>>>>             retries++;
> >>>>>         } while (!indexed && retries < 4);
> >>>>>         return indexed;
> >>>>>     }
> >>>>>
> >>>>>     // Tries to send the update up to 4 times before giving up.
> >>>>>     @Override
> >>>>>     public boolean update(SolrInputDocument updateDoc) {
> >>>>>         boolean update = false;
> >>>>>         int retries = 0;
> >>>>>         do {
> >>>>>             update = addSolrInputDocument(updateDoc);
> >>>>>             retries++;
> >>>>>         } while (!update && retries < 4);
> >>>>>         return update;
> >>>>>     }
> >>>>>
> >>>>>     @Override
> >>>>>     public void commit() {
> >>>>>         try {
> >>>>>             cloudSolrServer.commit();
> >>>>>         } catch (SolrServerException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (IOException e) {
> >>>>>             log.error(e);
> >>>>>         }
> >>>>>     }
> >>>>>
> >>>>>     @Override
> >>>>>     public boolean delete(String... ids) {
> >>>>>         boolean deleted = false;
> >>>>>         List<String> idList = Arrays.asList(ids);
> >>>>>         try {
> >>>>>             this.cloudSolrServer.deleteById(idList);
> >>>>>             this.cloudSolrServer.commit(true, true);
> >>>>>             deleted = true;
> >>>>>         } catch (SolrServerException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (IOException e) {
> >>>>>             log.error(e);
> >>>>>         }
> >>>>>         return deleted;
> >>>>>     }
> >>>>>
> >>>>>     @Override
> >>>>>     public void optimize() {
> >>>>>         try {
> >>>>>             this.cloudSolrServer.optimize();
> >>>>>         } catch (SolrServerException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (IOException e) {
> >>>>>             log.error(e);
> >>>>>         }
> >>>>>     }
> >>>>>
> >>>>>     /*
> >>>>>      * ********************
> >>>>>      *  Getters & setters *
> >>>>>      * ********************
> >>>>>      */
> >>>>>     public CloudSolrServer getSolrServer() {
> >>>>>         return cloudSolrServer;
> >>>>>     }
> >>>>>
> >>>>>     public void setSolrServer(CloudSolrServer solrServer) {
> >>>>>         this.cloudSolrServer = solrServer;
> >>>>>     }
> >>>>>
> >>>>>     // Adds the bean with a commitWithin of 100 ms, then commits explicitly.
> >>>>>     private boolean addBean(DocumentBean user) {
> >>>>>         boolean added = false;
> >>>>>         try {
> >>>>>             this.cloudSolrServer.addBean(user, 100);
> >>>>>             this.commit();
> >>>>>             added = true;
> >>>>>         } catch (IOException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (SolrServerException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (SolrException e) {
> >>>>>             log.error(e);
> >>>>>         }
> >>>>>         return added;
> >>>>>     }
> >>>>>
> >>>>>     // Adds the document with a commitWithin of 100 ms, then commits explicitly.
> >>>>>     private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
> >>>>>         boolean added = false;
> >>>>>         try {
> >>>>>             this.cloudSolrServer.add(updateDoc, 100);
> >>>>>             this.commit();
> >>>>>             added = true;
> >>>>>         } catch (IOException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (SolrServerException e) {
> >>>>>             log.error(e);
> >>>>>         } catch (SolrException e) {
> >>>>>             log.error(e);
> >>>>>         }
> >>>>>         return added;
> >>>>>     }
> >>>>> }
> >>>>>
> >>>>> Thank you very much, Mark.
> >>>>>
> >>>>>
> >>>>> -  Luis Cappa
> >>>>>
> >>>>>
> >>>>>
> >>>>> 2013/3/13 Mark Miller <ma...@gmail.com>
> >>>>>
> >>>>>>
> >>>>>> Could you capture some thread stack traces in the 'engine' and see if
> >>>>>> there are any blocking methods?
> >>>>>>
> >>>>>> - Mark
> >>>>>>
> >>>>>> On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda <lu...@gmail.com> wrote:
> >>>>>>
> >>>>>>> Just one correction:
> >>>>>>>
> >>>>>>> When I said:
> >>>>>>>
> >>>>>>>  - I've checked SolrCloud via Solr Admin interface and it's OK:
> >>>>>>>  everything is green, and I cant execute queries directly into Solr.
> >>>>>>>
> >>>>>>> I mean:
> >>>>>>>
> >>>>>>>
> >>>>>>>  - I've checked SolrCloud via Solr Admin interface and it's OK:
> >>>>>>>  everything is green, and *I can* execute queries directly into Solr.
> >>>>>>>
> >>>>>>>
> >>>>>>> Thanks!
> >>>>>>>
> >>>>>>>
> >>>>>>> - Luis Cappa
> >>>>>>>
> >>>>>>>
>
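
On the question quoted above about capturing stack traces 'cleanly' from inside a .war: besides running jstack against the Tomcat process, the JVM can produce a similar dump in-process through the standard java.lang.management API. The following is a minimal sketch, assuming it is acceptable to call it from a debug endpoint or a scheduled task inside the engine. The class name ThreadDumper is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

public class ThreadDumper {

    /** Builds a jstack-like text dump of all live threads in this JVM. */
    public static String dump() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        // false/false: skip locked-monitor and synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The object this thread is blocked or waiting on, if any.
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            // ThreadInfo.toString() truncates deep stacks, so print each frame.
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

Taking two or three dumps a few seconds apart while the engine is stuck on "updating Zookeeper..." and comparing them should show whether the same threads stay in BLOCKED or WAITING at the same frames.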

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by "Rollin.R.Ma (lab.sh04.Newegg) 41099" <Ro...@newegg.com>.

Thx!

-----Original Message-----
From: Michael Della Bitta [mailto:michael.della.bitta@appinions.com]
Sent: 2013/03/20 20:42
To: solr-user@lucene.apache.org
Subject: Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud
> is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
> instances of CloudSolrServer and balance them correctly with round robin
> or something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked
> Mark Miller about it some months ago when I started dealing with Solr 4.0-BETA.

I was told otherwise during Solr Boot Camp.

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Wed, Mar 20, 2013 at 5:14 AM, Luis Cappa Banda <lu...@gmail.com> wrote:
> Thank you for answering. Some notes:
>
> 1. The Java engine I've developed, which wraps SolrJ 4.1 with some
> business logic, only executes search queries, not index/update
> operations, so the problem is not related to concurrent updates or
> anything similar.
>
> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud
> is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
> instances of CloudSolrServer and balance them correctly with round robin
> or something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked
> Mark Miller about it some months ago when I started dealing with Solr 4.0-BETA.
>
> 3. I'm almost convinced that the problem is related to one of:
>
> - The Zookeeper ensemble configuration.
> - The Zookeeper version (3.4.5) not being compatible with the one Solr 4.1 expects.
> - The SolrJ Zookeeper driver.
>
> In short, my whole architecture works perfectly for search operations.
> I also have another NRT indexer module that uses CloudSolrServer and
> works perfectly. But after two or three days, something happens to the
> Zookeeper - CloudSolrServer connection, and it tries to update the
> cluster status forever with no success. The problem is only solved after
> restarting Zookeeper plus the SolrCloud leader and replica shards.
>
>
> 2013/3/19 Michael Della Bitta <mi...@appinions.com>
>
>> Don't use CloudSolrServer for writes. Instead, use 
>> ConcurrentUpdateSolrServer, something like:
>>
>> SolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, 100, 4);
>>
>> The 100 corresponds to how many docs to send in a batch. The higher 
>> this is, the better performance is (to a point, don't set that to 50k 
>> or anything).
>>
>> The 4 corresponds to the number of threads that will be sending batches.
>>
>> Note that this class doesn't report errors, so if you want to see
>> exceptions when bad things happen, you'll have to override the
>> handleError(Throwable ex) method.
>>
>> Here's the javadoc for the class:
>>
>> http://lucene.apache.org/solr/4_2_0/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
>>
>> It'd be best if you can use a load balancer in front of your Solr 
>> Cloud and use that as the solrUrl parameter.
>>
>> ***Either way, though, Mark is right in that you need to diagnose why 
>> you're only able to do a few documents per second first.*** Adding 
>> more threads at this point is probably not going to help.
>>
>> Michael Della Bitta
>>
>> ------------------------------------------------
>> Appinions
>> 18 East 41st Street, 2nd Floor
>> New York, NY 10017-6271
>>
>> www.appinions.com
>>
>> Where Influence Isn’t a Game
>>
>>
>> On Tue, Mar 19, 2013 at 3:57 PM, Luis Cappa Banda <lu...@gmail.com> wrote:
>> > Can anyone help me? Each response may save a little kitten from a
>> > horrible and dramatic death somewhere in the world :-P
>> >
>> > On 15/03/2013 21:06, "Jack Park" <ja...@topicquests.org> wrote:
>> >
>> >> Is there a document that tells how to create multiple threads?
>> >> Search returns many hits which orbit this idea, but I haven't
>> >> spotted one which tells how.
>> >>
>> >> Thanks
>> >> Jack
>> >>
>
>
>
> --
> Luis Cappa Banda
>
> *Phone*: (0034) 686 200 375
> *Skype*: luiscappabanda

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Michael Della Bitta <mi...@appinions.com>.

> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud
> is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
> instances of CloudSolrServer and balance them correctly with round robin
> or something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked
> Mark Miller about it some months ago when I started dealing with Solr 4.0-BETA.

I was told otherwise during Solr Boot Camp.

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game


On Wed, Mar 20, 2013 at 5:14 AM, Luis Cappa Banda <lu...@gmail.com> wrote:
> Thank you for answering. Some notes:
>
> 1. The Java engine I've developed, which wraps SolrJ 4.1 with some
> business logic, only executes search queries, not index/update
> operations, so the problem is not related to concurrent updates or
> anything similar.
>
> 2. As far as I know, the better SolrJ interface for indexing with SolrCloud
> is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
> instances of CloudSolrServer and balance them correctly with round robin
> or something similar, you'll get better performance in SolrCloud scenarios.
> At least that is what I've read in the documentation, and I also asked
> Mark Miller about it some months ago when I started dealing with Solr 4.0-BETA.
>
> 3. I'm almost convinced that the problem is related to one of:
>
> - The Zookeeper ensemble configuration.
> - The Zookeeper version (3.4.5) not being compatible with the one Solr 4.1 expects.
> - The SolrJ Zookeeper driver.
>
> In short, my whole architecture works perfectly for search operations.
> I also have another NRT indexer module that uses CloudSolrServer and
> works perfectly. But after two or three days, something happens to the
> Zookeeper - CloudSolrServer connection, and it tries to update the
> cluster status forever with no success. The problem is only solved after
> restarting Zookeeper plus the SolrCloud leader and replica shards.

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

Thank you for answering. Some notes:

1. The Java engine I've developed, which wraps SolrJ 4.1 with some
business logic, only executes search queries, not index/update operations,
so the problem is not related to concurrent updates or anything similar.

2. As far as I know, the better SolrJ interface for indexing with SolrCloud
is CloudSolrServer, not ConcurrentUpdateSolrServer. If you have many
instances of CloudSolrServer and balance them correctly with round robin
or something similar, you'll get better performance in SolrCloud scenarios.
At least that is what I've read in the documentation, and I also asked Mark
Miller about it some months ago when I started dealing with Solr 4.0-BETA.

3. I'm almost convinced that the problem is related to one of these:

- The Zookeeper ensemble configuration.
- The Zookeeper version (3.4.5) not being compatible with the one Solr 4.1
  expects.
- The SolrJ Zookeeper driver.
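
To make point 2 concrete, rotating over several client instances could look
roughly like the sketch below. It is only an illustration: RoundRobinPool is
a made-up name, the class is generic so it assumes nothing about the SolrJ
API, and whether several CloudSolrServer instances actually perform better
than a single shared one is exactly what is disputed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper, not part of SolrJ: hands out pooled clients in rotation.
final class RoundRobinPool<T> {
    private final List<T> clients;
    private final AtomicInteger counter = new AtomicInteger();

    RoundRobinPool(List<T> clients) {
        if (clients.isEmpty()) {
            throw new IllegalArgumentException("pool must not be empty");
        }
        this.clients = new ArrayList<T>(clients); // defensive copy
    }

    // Safe to call from many threads: each call returns the next client,
    // wrapping around to the first one after the last.
    T next() {
        int i = counter.getAndIncrement();
        // floorMod keeps the index non-negative once the counter overflows
        return clients.get(Math.floorMod(i, clients.size()));
    }
}
```

Each request would call next() on a pool built once at startup, for example
a RoundRobinPool<CloudSolrServer> filled with already configured instances.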

In short, my whole architecture works perfectly for search operations. I
also have another NRT indexer module that uses CloudSolrServer and works
perfectly. But after two or three days, something happens to the
Zookeeper - CloudSolrServer connection, and the engine keeps trying to
update the cluster status forever with no success. The problem is only
solved after restarting Zookeeper and the SolrCloud leader and replica
shards.
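
Since the hang only shows up after a couple of days, it may help to be able
to take a thread dump from inside the engine at that moment, in addition to
running jstack against the Tomcat process. The following is a minimal
sketch that uses only the standard java.lang.management API; the class name
ThreadDump is made up, and how it gets triggered (a REST endpoint, a JMX
operation, a timer) is left open.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: builds a jstack-like text dump of the current JVM.
final class ThreadDump {
    static String dumpAll() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false, false: skip monitor and synchronizer details to keep it cheap
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState()).append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Threads that stay BLOCKED or WAITING in the same Zookeeper or SolrJ frames
across two dumps taken a few seconds apart would be the ones to look at
first.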


2013/3/19 Michael Della Bitta <mi...@appinions.com>

> Don't use CloudSolrServer for writes. Instead, use
> ConcurrentUpdateSolrServer, something like:
>
> SolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, 100, 4);
>
> The 100 corresponds to how many docs to send in a batch. The higher
> this is, the better performance is (to a point, don't set that to 50k
> or anything).
>
> The 4 corresponds to the number of threads that will be sending batches.
>
> Note that this class doesn't report errors, so if you want to see
> exceptions when bad things happen, you'll have to override
> handleError(Throwable ex) method.
>
> Here's the javadoc for the class:
>
> http://lucene.apache.org/solr/4_2_0/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html
>
> It'd be best if you can use a load balancer in front of your Solr
> Cloud and use that as the solrUrl parameter.
>
> ***Either way, though, Mark is right in that you need to diagnose why
> you're only able to do a few documents per second first.*** Adding
> more threads at this point is probably not going to help.
>
> Michael Della Bitta
>
> ------------------------------------------------
> Appinions
> 18 East 41st Street, 2nd Floor
> New York, NY 10017-6271
>
> www.appinions.com
>
> Where Influence Isn’t a Game
>



-- 
Luis Cappa Banda

*Phone*: (0034) 686 200 375
*Skype*: luiscappabanda

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Michael Della Bitta <mi...@appinions.com>.

Don't use CloudSolrServer for writes. Instead, use
ConcurrentUpdateSolrServer, something like:

SolrServer solrServer = new ConcurrentUpdateSolrServer(solrUrl, 100, 4);

The 100 is the queue size: how many docs are buffered before they are
sent to the server. The higher this is, the better performance is (to a
point, don't set that to 50k or anything).

The 4 corresponds to the number of threads that will be sending batches.

Note that this class doesn't report errors back to the caller, so if you
want to see exceptions when bad things happen, you'll have to override
the handleError(Throwable ex) method.

Here's the javadoc for the class:
http://lucene.apache.org/solr/4_2_0/solr-solrj/org/apache/solr/client/solrj/impl/ConcurrentUpdateSolrServer.html

It'd be best if you can use a load balancer in front of your Solr
Cloud and use that as the solrUrl parameter.

***Either way, though, Mark is right in that you need to diagnose why
you're only able to do a few documents per second first.*** Adding
more threads at this point is probably not going to help.
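
To show why the error hook matters, here is a toy stand-in for the general
pattern: a bounded queue that a fixed number of background threads empty.
It is written only with java.util.concurrent and is not SolrJ's actual
implementation; QueueingSender and the two callbacks (send, onError) are
made-up names. The point is that add() returns before anything is sent, so
a failure can only surface through the callback.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical illustration, NOT SolrJ code.
final class QueueingSender<D> {
    private final BlockingQueue<D> queue;
    private final List<Thread> workers = new ArrayList<>();
    private final Consumer<D> send;
    private final Consumer<Throwable> onError;
    private volatile boolean closed = false;

    QueueingSender(int queueSize, int threadCount,
                   Consumer<D> send, Consumer<Throwable> onError) {
        this.queue = new ArrayBlockingQueue<>(queueSize);
        this.send = send;
        this.onError = onError;
        for (int i = 0; i < threadCount; i++) {
            Thread worker = new Thread(this::drain, "sender-" + i);
            worker.start();
            workers.add(worker);
        }
    }

    // Blocks only while the queue is full; returns before the doc is sent.
    void add(D doc) throws InterruptedException {
        queue.put(doc);
    }

    // Call after the last add(): workers empty the queue, then we wait for them.
    void close() throws InterruptedException {
        closed = true;
        for (Thread worker : workers) {
            worker.join();
        }
    }

    private void drain() {
        try {
            while (true) {
                boolean closing = closed; // read the flag before polling
                D doc = queue.poll(50, TimeUnit.MILLISECONDS);
                if (doc == null) {
                    if (closing) return;  // queue was empty after close()
                    continue;
                }
                try {
                    send.accept(doc);
                } catch (RuntimeException e) {
                    onError.accept(e);    // the caller of add() never sees this
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

With a no-op onError, a rejected document would vanish silently, which is
the situation the handleError override is meant to avoid.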

Michael Della Bitta

------------------------------------------------
Appinions
18 East 41st Street, 2nd Floor
New York, NY 10017-6271

www.appinions.com

Where Influence Isn’t a Game



Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

Can anyone help me? Each response may save a little kitten from a horrible
and dramatic death somewhere in the world :-P

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Jack Park <ja...@topicquests.org>.

Is there a document that tells how to create multiple threads? Search
returns many hits which orbit this idea, but I haven't spotted one
which tells how.

Thanks
Jack
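
For the multiple-threads question, plain java.util.concurrent is usually
enough. Below is a minimal sketch, assuming documents are sent in batches
from a fixed thread pool. BatchSender and indexInParallel are hypothetical
names used only for illustration, and send() is a stand-in for whatever
SolrJ call the application really makes (for example, adding a list of
documents); documents are plain strings here to keep it self-contained.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelIndexSketch {

    // Hypothetical stand-in for the real indexing call (not a SolrJ type).
    interface BatchSender {
        void send(List<String> batch) throws Exception;
    }

    // Splits docs into batches of batchSize and sends them from a fixed pool
    // of worker threads. Returns how many batches completed without throwing.
    static int indexInParallel(List<String> docs, int batchSize, int threads,
                               BatchSender sender) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        List<Future<?>> pending = new ArrayList<>();
        try {
            for (int start = 0; start < docs.size(); start += batchSize) {
                int end = Math.min(start + batchSize, docs.size());
                // Copy the slice so each task owns its own list.
                List<String> batch = new ArrayList<>(docs.subList(start, end));
                pending.add(pool.submit(() -> {
                    sender.send(batch);
                    return null;
                }));
            }
            int succeeded = 0;
            for (Future<?> task : pending) {
                try {
                    task.get(); // wait for this batch to finish
                    succeeded++;
                } catch (ExecutionException e) {
                    // A failed batch is reported but does not stop the others.
                    System.err.println("batch failed: " + e.getCause());
                }
            }
            return succeeded;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<String> docs = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            docs.add("doc-" + i);
        }
        AtomicInteger sent = new AtomicInteger();
        // The lambda only counts documents; a real sender would call Solr.
        int batches = indexInParallel(docs, 100, 4, batch -> sent.addAndGet(batch.size()));
        System.out.println(batches + " batches, " + sent.get() + " docs");
    }
}
```

The pool size and batch size correspond loosely to the 4 and 100 in the
ConcurrentUpdateSolrServer suggestion elsewhere in this thread; that class
does its own queueing and threading, so a hand-rolled pool like this only
makes sense with a client that does not.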


Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Mark Miller <ma...@gmail.com>.

You definitely have to use multiple threads with it for it to be fast, but 3 or 4 docs a second still sounds absurdly slow.

- Mark
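
A rough sketch of the multi-threaded approach mentioned above, using only the JDK. Everything here is an assumption made for illustration: `DocSink` is a hypothetical stand-in for a shared, thread-safe indexing client, documents are plain strings, and the thread count and batch size are arbitrary. It is not taken from SolrJ or from the engine discussed in this thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class ParallelIndexerSketch {

    /** Hypothetical stand-in for a shared, thread-safe indexing client. */
    interface DocSink {
        void addBatch(List<String> docs) throws Exception;
    }

    /**
     * Splits docs into batches and sends them from a fixed pool of worker
     * threads. Returns how many documents were sent without an exception
     * (or the count reached so far if the one-minute wait times out).
     */
    static int indexAll(List<String> docs, DocSink sink, int threads, int batchSize)
            throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        AtomicInteger sent = new AtomicInteger();
        for (int start = 0; start < docs.size(); start += batchSize) {
            int end = Math.min(start + batchSize, docs.size());
            List<String> batch = new ArrayList<>(docs.subList(start, end));
            pool.execute(() -> {
                try {
                    sink.addBatch(batch);          // one request per batch, not per document
                    sent.addAndGet(batch.size());
                } catch (Exception e) {
                    System.err.println("batch failed: " + e);
                }
            });
        }
        pool.shutdown();                            // accept no new work, let queued batches finish
        pool.awaitTermination(1, TimeUnit.MINUTES);
        return sent.get();
    }
}
```

The point of the sketch is that one shared client is fed by several threads and documents travel in batches, so a slow round trip for one request does not stall the others.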

On Mar 15, 2013, at 2:58 PM, Luis Cappa Banda <lu...@gmail.com> wrote:

> And up! :-)
> 
> I've been wondering whether using CloudSolrServer has something to do with this.
> Does it perform badly when a CloudSolrServer singleton receives multiple
> queries? Is it recommended to keep a list of CloudSolrServer instances and
> select one of them with a round-robin policy?

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

And up! :-)

I've been wondering whether using CloudSolrServer has something to do with this.
Does it perform badly when a CloudSolrServer singleton receives multiple
queries? Is it recommended to keep a list of CloudSolrServer instances and
select one of them with a round-robin policy?
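
For reference, the round-robin selection asked about here could look like the sketch below. It only illustrates the selection logic with the JDK; `RoundRobin` is a hypothetical helper, not part of SolrJ, and nothing in this thread establishes that a pool of client instances is actually needed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical helper: hands out the given items in rotation, safely across threads. */
class RoundRobin<T> {

    private final List<T> items;
    private final AtomicInteger next = new AtomicInteger();

    RoundRobin(List<T> items) {
        if (items.isEmpty()) {
            throw new IllegalArgumentException("at least one item is required");
        }
        this.items = new ArrayList<>(items); // defensive copy
    }

    /** Returns the next item; floorMod keeps the index valid even after int overflow. */
    T pick() {
        int index = Math.floorMod(next.getAndIncrement(), items.size());
        return items.get(index);
    }
}
```

With three items "a", "b" and "c", successive calls to `pick()` return a, b, c and then a again.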



2013/3/14 Luis Cappa Banda <lu...@gmail.com>

> Hello!
>
> Thanks a lot, Erick! I've attached some stack traces captured while the
> 'engine' was running normally.
>
> Cheers,
>
> - Luis Cappa

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

Hello!

Thanks a lot, Erick! I've attached some stack traces captured while the
'engine' was running normally.

Cheers,

- Luis Cappa


2013/3/13 Erick Erickson <er...@gmail.com>

> Stack traces..
>
> First,
> jps -l
>
> That will give you the process IDs of your running Java processes. Then:
>
> jstack <pid from above>
>
> Usually I pipe the output from jstack into a text file...
>
> Best
> Erick

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Erick Erickson <er...@gmail.com>.

Stack traces..

First,
jps -l

That will give you the process IDs of your running Java processes. Then:

jstack <pid from above>

Usually I pipe the output from jstack into a text file...

Best
Erick
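
For a .war running inside Tomcat, much the same information can also be collected from inside the JVM. The sketch below is a minimal illustration that assumes only the standard java.lang.management API; the class name `ThreadDumpSketch` and the output layout are made up for this example and are not what jstack prints verbatim.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

class ThreadDumpSketch {

    /** Returns name, state, awaited lock (if any) and stack frames of every live thread. */
    static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // false/false: skip locked-monitor and locked-synchronizer details,
        // which not every JVM supports.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Writing the returned string to the application log at the moment the engine stops responding would show which threads are BLOCKED or WAITING and on which lock.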


On Wed, Mar 13, 2013 at 1:48 PM, Luis Cappa Banda <lu...@gmail.com> wrote:

> Uhm, how can I do that... 'cleanly'? I know that with JConsole it's possible
> to output these traces, but with a .war application built on top of Spring I
> don't know how to do that. In any case, here is my CloudSolrServer
> wrapper, which is used by other classes. There is no synchronized method or
> block of code in it:
>
>  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
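> import java.net.MalformedURLException;
>
> import org.apache.solr.client.solrj.impl.BinaryRequestWriter;
> import org.apache.solr.client.solrj.impl.HttpSolrServer;
> import org.apache.solr.client.solrj.impl.LBHttpSolrServer;
>
> // Load-balanced client whose per-node servers send requests in binary format.
> public class BinaryLBHttpSolrServer extends LBHttpSolrServer {
>
>     private static final long serialVersionUID = 3905956120804659445L;
>
>     public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
>         super(endpoints);
>     }
>
>     @Override
>     protected HttpSolrServer makeServer(String server) throws MalformedURLException {
>         HttpSolrServer solrServer = super.makeServer(server);
>         solrServer.setRequestWriter(new BinaryRequestWriter());
>         return solrServer;
>     }
> }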
>
>  - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
>
> import java.io.IOException;
> import java.net.MalformedURLException;
> import java.util.Arrays;
> import java.util.List;
>
> import org.apache.log4j.Logger; // log4j 1.x assumed from Logger.getLogger(Class)
> import org.apache.solr.client.solrj.SolrQuery;
> import org.apache.solr.client.solrj.SolrRequest.METHOD;
> import org.apache.solr.client.solrj.SolrServerException;
> import org.apache.solr.client.solrj.impl.CloudSolrServer;
> import org.apache.solr.client.solrj.response.QueryResponse;
> import org.apache.solr.common.SolrException;
> import org.apache.solr.common.SolrInputDocument;
>
> public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {
>
>     private CloudSolrServer cloudSolrServer;
>
>     private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);
>
>     public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
>             int clientTimeout, int connectTimeout, String cloudCollection) {
>         try {
>             BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
>             this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
>             this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
>             this.cloudSolrServer.setZkClientTimeout(clientTimeout);
>             this.cloudSolrServer.setDefaultCollection(cloudCollection);
>         } catch (MalformedURLException e) {
>             log.error(e);
>         }
>     }
>
>     @Override
>     public QueryResponse search(SolrQuery query) throws SolrServerException {
>         return cloudSolrServer.query(query, METHOD.POST);
>     }
>
>     // Tries to index the bean, making at most 4 attempts.
>     @Override
>     public boolean index(DocumentBean user) {
>         boolean indexed = false;
>         int retries = 0;
>         do {
>             indexed = addBean(user);
>             retries++;
>         } while (!indexed && retries < 4);
>         return indexed;
>     }
>
>     // Tries to send the updated document, making at most 4 attempts.
>     @Override
>     public boolean update(SolrInputDocument updateDoc) {
>         boolean update = false;
>         int retries = 0;
>         do {
>             update = addSolrInputDocument(updateDoc);
>             retries++;
>         } while (!update && retries < 4);
>         return update;
>     }
>
>     @Override
>     public void commit() {
>         try {
>             cloudSolrServer.commit();
>         } catch (SolrServerException e) {
>             log.error(e);
>         } catch (IOException e) {
>             log.error(e);
>         }
>     }
>
>     @Override
>     public boolean delete(String... ids) {
>         boolean deleted = false;
>         List<String> idList = Arrays.asList(ids);
>         try {
>             this.cloudSolrServer.deleteById(idList);
>             this.cloudSolrServer.commit(true, true);
>             deleted = true;
>         } catch (SolrServerException e) {
>             log.error(e);
>         } catch (IOException e) {
>             log.error(e);
>         }
>         return deleted;
>     }
>
>     @Override
>     public void optimize() {
>         try {
>             this.cloudSolrServer.optimize();
>         } catch (SolrServerException e) {
>             log.error(e);
>         } catch (IOException e) {
>             log.error(e);
>         }
>     }
>
>     /*
>      * ********************
>      *  Getters & setters *
>      * ********************
>      */
>     public CloudSolrServer getSolrServer() {
>         return cloudSolrServer;
>     }
>
>     public void setSolrServer(CloudSolrServer solrServer) {
>         this.cloudSolrServer = solrServer;
>     }
>
>     // Note: every add is followed by an explicit commit, on top of commitWithin = 100 ms.
>     private boolean addBean(DocumentBean user) {
>         boolean added = false;
>         try {
>             this.cloudSolrServer.addBean(user, 100);
>             this.commit();
>             // This assignment was missing: without it index() always made all
>             // 4 attempts (adding and committing each time) and returned false.
>             added = true;
>         } catch (IOException e) {
>             log.error(e);
>         } catch (SolrServerException e) {
>             log.error(e);
>         } catch (SolrException e) {
>             log.error(e);
>         }
>         return added;
>     }
>
>     private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
>         boolean added = false;
>         try {
>             this.cloudSolrServer.add(updateDoc, 100);
>             this.commit();
>             added = true;
>         } catch (IOException e) {
>             log.error(e);
>         } catch (SolrServerException e) {
>             log.error(e);
>         } catch (SolrException e) {
>             log.error(e);
>         }
>         return added;
>     }
> }
>
> Thank you very much, Mark.
>
>
> -  Luis Cappa
>
>
>
> And
> 2013/3/13 Mark Miller <ma...@gmail.com>
>
> >
> > Could you capture some thread stack traces in the 'engine' and see if
> > there are any blocking methods?
> >
> > - Mark
> >
> > On Mar 13, 2013, at 1:34 PM, Luis Cappa Banda <lu...@gmail.com>
> wrote:
> >
> > > Just one correction:
> > >
> > > When I said:
> > >
> > >   - I´ve checked SolrCloud via Solr Admin interface and it´s OK:
> > >   everything is green, and I cant execute queries directly into Solr.
> > >
> > > I mean:
> > >
> > >
> > >   - I´ve checked SolrCloud via Solr Admin interface and it´s OK:
> > >   everything is green, and *I can* execute queries directly into Solr.
> > >
> > >
> > > Thanks!
> > >
> > >
> > > - Luis Cappa
> > >
> > >
> > > 2013/3/13 Luis Cappa Banda <lu...@gmail.com>
> > >
> > >> Hello, guys!
> > >>
> > >> I´ve been experiencing some annoying behavior with my current
> production
> > >> scenario. Here is the snapshot:
> > >>
> > >>
> > >>   - SolrCloud: 2 shards
> > >>   - Zookeeper ensemble: 3 nodes in *different machines *(most of the
> > >>   tutorials installs 3 Zookeeper nodes in the same machine).
> > >>   - This is the zoo.cfg from every
> > >>
> > >> tickTime=2000  // I´ve also tried with 60000
> > >>
> > >> initLimit=10
> > >>
> > >> syncLimit=5
> > >>
> > >> dataDir=/var/lib/zookeeper
> > >>
> > >> clientPort=9000
> > >>
> > >> server.1=zoohost1:2888:3888
> > >>
> > >> server.2=zoohost1:2888:3888
> > >>
> > >> server.3=zoohost1:2888:3888
> > >>
> > >>
> > >>
> > >>   - I´ve developed a Java Application with a REST API (let´s call it *
> > >>   engine*) that dispatches queries into SolrCloud. It´s a wrapper
> around
> > >>   CloudSolrServer, so it´s mandatory to specify some Zookeeper
> > configuration
> > >>   params too. They are loaded dynamically when the application is
> > deployed in
> > >>   a Tomcat server, but the current values that I´m using are as
> follows:
> > >>
> > >> cloudSolrServer.*setZkConnectTimeout(60000)*
> > >>
> > >> cloudSolrServer.*setZkClientTimeout(60000)*
> > >> *
> > >> *
> > >> *
> > >> *
> > >>
> > >> *THE PROBLEM*
> > >> *
> > >> *
> > >> Everything goes OK, but after two days more or less (yes, I´ve checked
> > >> that this behavior occurrs periodically, more or less) the *engine
> > blocks
> > >> * and cannot dispatch any query to SolrCloud.
> > >>
> > >>   - The *engine *log only outputs "updating Zookeeper..." one last
> time,
> > >>   but never updates.
> > >>   - I´ve checked SolrCloud via Solr Admin interface and it´s OK:
> > >>   everything is green, and I cant execute queries directly into Solr.
> > >>   - So then Solr appears to be OK, so the next step is to restart
> > *engine
> > >>   but *it again appears "updating Zookeeper...". Unfortunately switch
> > >>   off + switch on doesn´t work here, :-(
> > >>   - I´ve checked too Zookeeper logs and it appears some connection log
> > >>   outs, but the ensemble appears to be OK too.
> > >>   - *The end: *If I restart Zookeeper one by one, and I restart
> > >>   SolrCloud, plus I restart the engine, the problem is solved. I´m
> using
> > >>   Amazon AWS as hostage, so I discard connection problems between
> > instances.
> > >>
> > >>
> > >> Does anyone experienced something similar? Can anybody shed some light
> > on
> > >> this problem?
> > >>
> > >> Thank you very much.
> > >>
> > >>
> > >> Regards,
> > >>
> > >>
> > >> - Luis Cappa
> > >>
> >
> >
>

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

Uhm, how can I do that... 'cleanly'? I know that with JConsole it's possible
to output these traces, but with a .war application built on top of Spring I
don't know how to do it. In any case, here is my CloudSolrServer wrapper,
which is used by other classes. It contains no synchronized methods or
blocks:

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

public class BinaryLBHttpSolrServer extends LBHttpSolrServer {

    private static final long serialVersionUID = 3905956120804659445L;

    public BinaryLBHttpSolrServer(String[] endpoints) throws MalformedURLException {
        super(endpoints);
    }

    @Override
    protected HttpSolrServer makeServer(String server) throws MalformedURLException {
        HttpSolrServer solrServer = super.makeServer(server);
        // Send update requests in the binary (javabin) format instead of XML.
        solrServer.setRequestWriter(new BinaryRequestWriter());
        return solrServer;
    }
}

 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

public class CloudSolrHttpServerImpl implements CloudSolrHttpServer {

    private CloudSolrServer cloudSolrServer;

    private Logger log = Logger.getLogger(CloudSolrHttpServerImpl.class);

    public CloudSolrHttpServerImpl(String zookeeperEndpoints, String[] endpoints,
            int clientTimeout, int connectTimeout, String cloudCollection) {
        try {
            BinaryLBHttpSolrServer lbSolrServer = new BinaryLBHttpSolrServer(endpoints);
            this.cloudSolrServer = new CloudSolrServer(zookeeperEndpoints, lbSolrServer);
            this.cloudSolrServer.setZkConnectTimeout(connectTimeout);
            this.cloudSolrServer.setZkClientTimeout(clientTimeout);
            this.cloudSolrServer.setDefaultCollection(cloudCollection);
        } catch (MalformedURLException e) {
            log.error(e);
        }
    }

    @Override
    public QueryResponse search(SolrQuery query) throws SolrServerException {
        return cloudSolrServer.query(query, METHOD.POST);
    }

    @Override
    public boolean index(DocumentBean user) {
        boolean indexed = false;
        int retries = 0;
        // Try up to 4 times, stopping at the first success.
        do {
            indexed = addBean(user);
            retries++;
        } while (!indexed && retries < 4);
        return indexed;
    }

    @Override
    public boolean update(SolrInputDocument updateDoc) {
        boolean update = false;
        int retries = 0;
        // Try up to 4 times, stopping at the first success.
        do {
            update = addSolrInputDocument(updateDoc);
            retries++;
        } while (!update && retries < 4);
        return update;
    }

    @Override
    public void commit() {
        try {
            cloudSolrServer.commit();
        } catch (SolrServerException e) {
            log.error(e);
        } catch (IOException e) {
            log.error(e);
        }
    }

    @Override
    public boolean delete(String... ids) {
        boolean deleted = false;
        List<String> idList = Arrays.asList(ids);
        try {
            this.cloudSolrServer.deleteById(idList);
            this.cloudSolrServer.commit(true, true);
            deleted = true;
        } catch (SolrServerException e) {
            log.error(e);
        } catch (IOException e) {
            log.error(e);
        }
        return deleted;
    }

    @Override
    public void optimize() {
        try {
            this.cloudSolrServer.optimize();
        } catch (SolrServerException e) {
            log.error(e);
        } catch (IOException e) {
            log.error(e);
        }
    }

    /*
     * ********************
     *  Getters & setters *
     * ********************
     */
    public CloudSolrServer getSolrServer() {
        return cloudSolrServer;
    }

    public void setSolrServer(CloudSolrServer solrServer) {
        this.cloudSolrServer = solrServer;
    }

    private boolean addBean(DocumentBean user) {
        boolean added = false;
        try {
            this.cloudSolrServer.addBean(user, 100);
            this.commit();
            // Report success so that index() stops retrying.
            added = true;
        } catch (IOException e) {
            log.error(e);
        } catch (SolrServerException e) {
            log.error(e);
        } catch (SolrException e) {
            log.error(e);
        }
        return added;
    }

    private boolean addSolrInputDocument(SolrInputDocument updateDoc) {
        boolean added = false;
        try {
            this.cloudSolrServer.add(updateDoc, 100);
            this.commit();
            added = true;
        } catch (IOException e) {
            log.error(e);
        } catch (SolrServerException e) {
            log.error(e);
        } catch (SolrException e) {
            log.error(e);
        }
        return added;
    }
}
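
The retry loops in index and update above retry immediately, with no pause between attempts. A minimal sketch of the same bounded-retry idea with a delay that doubles after each failure, assuming a hypothetical helper class named Retry (it is not a SolrJ class, and the delay values are illustrative only):

```java
import java.util.concurrent.Callable;

/** Hypothetical helper, not part of SolrJ: bounded retries with a doubling delay. */
final class Retry {

    private Retry() {}

    /**
     * Runs the attempt until it returns true or maxAttempts is reached.
     * An exception thrown by the attempt counts as a failed attempt.
     * Returns false if the calling thread is interrupted.
     */
    static boolean withBackoff(Callable<Boolean> attempt, int maxAttempts, long initialDelayMs) {
        long delay = initialDelayMs;
        for (int i = 1; i <= maxAttempts; i++) {
            try {
                if (Boolean.TRUE.equals(attempt.call())) {
                    return true;
                }
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up.
                Thread.currentThread().interrupt();
                return false;
            } catch (Exception e) {
                // Treat any other failure as an unsuccessful attempt.
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
                delay *= 2;
            }
        }
        return false;
    }
}
```

With such a helper, index could be written as something like Retry.withBackoff(() -> addBean(user), 4, 500L), which keeps the limit of four attempts but gives a briefly unavailable node time to recover between them.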

Thank you very much, Mark.


-  Luis Cappa



2013/3/13 Mark Miller <ma...@gmail.com>

>
> Could you capture some thread stack traces in the 'engine' and see if
> there are any blocking methods?
>
> - Mark
>

Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Mark Miller <ma...@gmail.com>.

Could you capture some thread stack traces in the 'engine' and see if there are any blocking methods?

- Mark
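
For a .war running under Tomcat, stack traces can also be captured from inside the JVM, without attaching JConsole. A minimal sketch using the standard java.lang.management API, assuming a hypothetical helper class named ThreadDumper (not part of SolrJ or Spring) whose output the engine could write to its log when it stops responding:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper, not part of SolrJ: renders all JVM thread stacks as text. */
final class ThreadDumper {

    private ThreadDumper() {}

    /** Returns one text block per live thread: name, state, awaited lock, stack frames. */
    static String dumpAllThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();
        // Ask for lock details only where this JVM supports reporting them.
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(),
                mx.isSynchronizerUsageSupported());
        for (ThreadInfo info : infos) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                // The lock or condition this thread is blocked or waiting on.
                out.append(" waiting on ").append(info.getLockName());
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }
}
```

Threads shown as BLOCKED or WAITING inside ZooKeeper or SolrJ frames would be the ones to look at. The same information is available from outside the process with the JDK's jstack tool run against the Tomcat process id.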


Re: SolrCloud with Zookeeper ensemble in production environment: SEVERE problems.

Posted by Luis Cappa Banda <lu...@gmail.com>.

Just one correction:

When I said:

   - I've checked SolrCloud via Solr Admin interface and it's OK:
   everything is green, and I cant execute queries directly into Solr.

I meant:

   - I've checked SolrCloud via the Solr Admin interface and it's OK:
   everything is green, and *I can* execute queries directly into Solr.


Thanks!


- Luis Cappa



RE: SolrCloud with Zookeeper ensemble : fail to restart master server

Posted by Patrick Mi <pa...@touchpointgroup.com>.

After a number of tests I found that running the embedded ZooKeeper isn't a
good idea, especially when it is the only ZooKeeper instance. When the Solr
instance with the embedded ZooKeeper is rebooted, it gets confused about who
should be the leader, so it will not start while the others (followers) are
still running. I now use a standalone ZooKeeper instance, and that works well.

Thanks, Erick, for pointing me in the right direction; much appreciated!

Regards,
Patrick


Re: SolrCloud with Zookeeper ensemble : fail to restart master server

Posted by "Rollin.R.Ma (lab.sh04.Newegg) 41099" <Ro...@newegg.com>.

Mark very good.


Re: SolrCloud with Zookeeper ensemble : fail to restart master server

Posted by Erick Erickson <er...@gmail.com>.

First, bootstrap_conf and numShards should only be specified the
_first_ time you start up your leader. bootstrap_conf's purpose is to push
the configuration files to ZooKeeper. numShards is a one-time-only
parameter that you shouldn't specify more than once; I think it is ignored
afterwards. Once the conf files are up in ZooKeeper, they don't need to be
pushed again until they change, and you can use the command-line tools to
do that.

Terminology: we're trying to get away from master/slave and use
leader/replica in SolrCloud mode to distinguish it from the old replication
process, so just checking to be sure that you probably really mean
leader/replica, right?

 Watch your admin/SolrCloud link as you bring machines up and down. That
page will show you the state of each of your machines. Normally there's no
trouble bringing the leader up and down, _except_ it sounds like you have
your zookeeper running embedded. A quorum of ZK nodes (in this case one)
needs to be running for SolrCloud to operate. Still, that shouldn't prevent
your machine running ZK from coming back up.
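
The quorum remark can be made concrete with a little arithmetic: a ZooKeeper ensemble needs a strict majority of its configured servers to be up. A small sketch, where QuorumMath is a purely illustrative class name and not a ZooKeeper or Solr API:

```java
/** Illustrative arithmetic only; not a ZooKeeper or Solr API. */
final class QuorumMath {

    private QuorumMath() {}

    /** Smallest number of servers that forms a strict majority of the ensemble. */
    static int quorumSize(int ensembleSize) {
        if (ensembleSize < 1) {
            throw new IllegalArgumentException("ensemble needs at least one server");
        }
        return ensembleSize / 2 + 1;
    }

    /** How many servers may be down while a quorum can still be formed. */
    static int tolerableFailures(int ensembleSize) {
        return ensembleSize - quorumSize(ensembleSize);
    }
}
```

For a single embedded ZooKeeper, the quorum size is 1 and no failures can be tolerated, so SolrCloud has no coordination service whenever that one Solr instance is down. An ensemble of three tolerates one failure, and going from three to four servers does not raise that number.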

So I'm a bit puzzled, but let's straighten out the startup stuff and watch
your Solr log on your leader when you bring it up; that should generate
some more questions.

Best
Erick

On Mon, Mar 18, 2013 at 11:12 PM, Patrick Mi <patrick.mi@touchpointgroup.com
> wrote:

> Hi there,
>
> I have experienced some problems starting the master server.
>
> Solr4.2 under Tomcat 7 on Centos6.
>
> Configuration :
> 3 solr instances running on different machines, one shard, 3 cores, 2
> replicas, using Zookeeper comes with Solr
>
> The master server A has the following run option: -Dbootstrap_conf=true
> -DzkRun -DnumShards=1,
> The slave servers B and C have : -DzkHost=masterServerIP:2181
>
> It works well for add/update/delete etc after I start up master and slave
> servers in order.
>
> When the master A is up stop/start slave B and C are OK.
>
> When slave B and C are running I couldn't restart master A. Only after I
> shutdown B and C then I can start master A.
>
> Is this a feature or bug or something I haven't configure properly?
>
> Thanks advance for your help
>
> Regards,
> Patrick
>
>