Posted to solr-user@lucene.apache.org by darul <da...@gmail.com> on 2012/11/07 21:47:08 UTC

Testing Solr Cloud with ZooKeeper

Hello everyone,

Having used *Hadoop* (not in charge of deployment, just the Java code part)
and *Solr 3.6* (deployment and coding) this year, today I worked through the
SolrCloud wiki.

Well,

* I have deployed 2 ZooKeeper (not embedded) instances
* 2 Solr instances with 2 shards (pointing to the ZooKeeper nodes)
* 2 Solr replicas

... successfully. Thank you for the new administration UI, graph and co,
nice.

But I am still confused by all these amazing new features (compared to
when I was using multicore and master/slave behaviour).

Here in the cloud, I am lost (in translation too).

*A few questions:*
- Both of my ZooKeeper instances have their own data directory, as usual, but
I did not see much change inside after indexing the example docs. Is data
stored there, or is just /configuration (conf files)/ stored in the ZooKeeper
ensemble? Can you confirm whether /index data/ is also stored in the
ZooKeeper cluster or not?
- In my Solr instance directory tree, /solr/mycollection/, I sometimes have
an "index" or "index.20121107185908378" directory and a tlog directory. What
are they used for? Could you explain why the index directory sometimes looks
like a snapshot? ZooKeeper should not store the index (sorry, I repeat
myself), or is it just a snapshot? What is the tlog directory for?
- Then, playing a little bit, I tested the following command
http://localhost:8983/solr/admin/collections?action=CREATE&name=myname&numShards=2&replicationFactor=1
and saw it update the configuration in core.xml and create a "data" directory
as well, nice. But when I navigate to the admin UI and check the schema, for
instance, where does this configuration come from? I do not get any conf
directory for this core; does it take one by default?
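
For reference, the CREATE request above can be assembled programmatically. This is only a minimal sketch: the helper name `create_collection_url` is hypothetical, the parameters are exactly the ones in the URL above, and nothing is actually sent to a Solr server.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards, replication_factor):
    """Build a Collections API CREATE URL (hypothetical helper, no request is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    # urlencode keeps the insertion order of the dict and escapes the values.
    return f"{base_url}/admin/collections?{urlencode(params)}"


url = create_collection_url("http://localhost:8983/solr", "myname", 2, 1)
print(url)
```

Building the query string with `urlencode` rather than by hand avoids mistakes when a collection name contains characters that need escaping.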

I have so many questions to ask.

Thanks,

Julien




--
View this message in context: http://lucene.472066.n3.nabble.com/Testing-Solr-Cloud-with-ZooKeeper-tp4018900.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
It looks like once the timeout has expired, the first Solr instance responds.

I was not waiting long enough. Is it possible to reduce this *timeout* value?

Thanks




Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
https://issues.apache.org/jira/browse/SOLR-3993 has been resolved.

Just one question: is the fix in trunk, I mean in the main distribution
downloadable from the main Solr site?

I ask because I have downloaded it and still get the same behaviour when
starting the first instance, or the second shard.




Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
Yes ku3ia, I read your thread yesterday and it looks like we have the same
issue. I hope ApacheCon is nearly finished so the experts can resolve this.
Thanks again to the Solr community,
Jul




Re: Testing Solr Cloud with ZooKeeper

Posted by ku3ia <de...@gmail.com>.
Hi, I have nearly the same problem with the cloud state, see
http://lucene.472066.n3.nabble.com/Replicated-zookeeper-td4018984.html




Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
- Shards : 2
- ZooKeeper Cluster : 3
- One collection.

Here is how I run it, and my scenario:

In the first console, I start the first node (first shard) on port 8983:





In the second console, I start the second node (second shard) on port 8984:





At this point I have just 2 nodes running for my 2 shards.

Then I decide to add 2 replicas for the shard nodes.


and


Now everything is fine: a "robust" collection with 2 shards and 2 replicas
running.

The expected result is here:

<http://lucene.472066.n3.nabble.com/file/n4019257/Solr_Admin_192.168.1.6_.png> 

Then I decide to stop the 2 replicas running on ports 7501/7502.

The expected result is here:
<http://lucene.472066.n3.nabble.com/file/n4019257/2.png> 

Then I stop the 2 main instances running on ports 8983/8984.

I restart the first one (8983):

I get a lot of this dump in the console:


Why not, I start the second one on 8984, and get:



I do not understand why the replicas are needed at this stage. When I
started everything the first time, there was no need for replicas. Now I
would like to restart the 2 main instances, and maybe start the replicas
later.

If I start both instances 7501/7502, everything is fine, but that is not what
I expected.

Any ideas?

Thanks again,

Jul




Re: Testing Solr Cloud with ZooKeeper

Posted by Erick Erickson <er...@gmail.com>.
You have to have at least one node per shard running for SolrCloud to
function. So when you bring down all nodes and start just one, some shards
have no live nodes and SolrCloud goes into a wait state.
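
The "at least one live node per shard" rule can be illustrated with a small sketch. The dictionary layout below is hypothetical (it is not Solr's real cluster state format), the function names are made up, and the assignment of ports 7501/7502 to particular shards is an assumption for the example:

```python
from typing import Dict, List

# Hypothetical view of a cluster: shard name -> {node address: is the node live?}
ClusterView = Dict[str, Dict[str, bool]]


def shards_without_live_node(cluster: ClusterView) -> List[str]:
    """Return the shards that currently have no live node at all."""
    return sorted(
        shard for shard, nodes in cluster.items() if not any(nodes.values())
    )


def can_serve(cluster: ClusterView) -> bool:
    """The collection is usable only if every shard has at least one live node."""
    return not shards_without_live_node(cluster)


# Scenario from this thread: everything was stopped, then only 8983 was restarted.
after_restart = {
    "shard1": {"localhost:8983": True, "localhost:7501": False},
    "shard2": {"localhost:8984": False, "localhost:7502": False},
}

print(shards_without_live_node(after_restart))
print(can_serve(after_restart))
```

In this scenario shard2 has no live node, so the collection as a whole cannot serve requests until 8984 or its replica comes back.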

Best
Erick


On Thu, Nov 8, 2012 at 6:17 PM, darul <da...@gmail.com> wrote:

> Is it the same issue as the one detailed here
>
> http://lucene.472066.n3.nabble.com/SolrCloud-leader-election-on-single-node-td4015804.html

Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
Is it the same issue as the one detailed here
http://lucene.472066.n3.nabble.com/SolrCloud-leader-election-on-single-node-td4015804.html




Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
To illustrate:

<http://lucene.472066.n3.nabble.com/file/n4019103/SolrAdmin.png> 

Taking this example, 8983 and 8984 are the shard "owners", 7501/7502 are just
replicas.

If I stop all instances, then restart 8983 or 8984 first, they won't run and
ask for the replicas to be started too...





Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
Thanks Otis,

Indeed, the ZooKeeper docs
<http://zookeeper.apache.org/doc/trunk/zookeeperAdmin.html#sc_zkMulitServerSetup>
also advise choosing an odd number of ZK nodes: "To create a deployment that
can tolerate the failure of F machines, you should count on deploying 2xF+1
machines"...

Well, I still do not understand why, after adding replicas, I am not able to
restart the Solr instances if the replicas are not running. (When I start
them, it is OK.)

Do I need to erase all the ZooKeeper config every time the Solr servers are
restarted? I mean send the conf again with bootstrap; it looks like I am not
doing it the right way ;)






Re: Testing Solr Cloud with ZooKeeper

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

The number of shards is unrelated to the number of ZK nodes.  Use 3 or 5 ZK
nodes, not 2.  See http://hbase.apache.org/book/zookeeper.html for why. :)
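
The reason for 3 or 5 rather than 2 can be shown with a little arithmetic. This is a minimal sketch that assumes only that a ZooKeeper ensemble needs a strict majority of its nodes alive to keep working; the function names are made up for illustration:

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest strict majority of an ensemble with the given number of nodes."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one node")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many nodes can fail while a majority is still alive."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (1, 2, 3, 4, 5):
    print(f"{n} nodes: quorum {quorum_size(n)}, tolerates {tolerated_failures(n)} failure(s)")
```

With 2 nodes the quorum is 2, so losing either node stops the ensemble, which is no better than running a single node. 3 nodes tolerate one failure and 5 tolerate two, and adding a fourth node to a 3-node ensemble does not raise the number of failures it can survive.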

Otis
--
Search Analytics - http://sematext.com/search-analytics/index.html
Performance Monitoring - http://sematext.com/spm/index.html


On Thu, Nov 8, 2012 at 8:00 AM, darul <da...@gmail.com> wrote:

> Hello again,
>
> With the following config:
>
> - a 2-node ZooKeeper ensemble
> - 2 shards
> - 2 main Solr instances for the 2 shards
> - I added 2 or 3 replicas for fun.
>
> While everything is running, if I stop one replica, I see the admin UI
> graph update (replica disabled/inactive)... normal.
>
> But if I stop all Solr instances and restart the first main instance
> :8983, it always waits for some replicas... is that useful? Why are
> replicas needed for it to run? I cannot access the admin UI anymore.
>
> The solution is to erase the ZooKeeper data and start again. Do you have
> any solution to avoid:
>
>
>
> What if my replicas are really down in production and I restart
> everything?
>
> Another question: do 2 shards mean a 2-node ZooKeeper ensemble, and 3
> shards a 3-node ZooKeeper ensemble?
>
> Thanks,
>
> Jul

Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
Hello again,

With the following config:

- a 2-node ZooKeeper ensemble
- 2 shards
- 2 main Solr instances for the 2 shards
- I added 2 or 3 replicas for fun.

While everything is running, if I stop one replica, I see the admin UI graph
update (replica disabled/inactive)... normal.

But if I stop all Solr instances and restart the first main instance
:8983, it always waits for some replicas... is that useful? Why are
replicas needed for it to run? I cannot access the admin UI anymore.

The solution is to erase the ZooKeeper data and start again. Do you have any
solution to avoid:



What if my replicas are really down in production and I restart everything?

Another question: do 2 shards mean a 2-node ZooKeeper ensemble, and 3 shards
a 3-node ZooKeeper ensemble?

Thanks,

Jul




Re: Testing Solr Cloud with ZooKeeper

Posted by Otis Gospodnetic <ot...@gmail.com>.
You didn't ask about this, but you'll want an odd number of zookeeper
nodes. Think voting.

Otis
--
Performance Monitoring - http://sematext.com/spm
On Nov 7, 2012 4:43 PM, "darul" <da...@gmail.com> wrote:

> Yes, the instanceDir attribute points to the newly created core (with no
> conf dir), so it is strange...
>
> But it looks like I have played too much:
>
>
>
> when I start the main Solr shard. I will try everything again tomorrow and
> give you feedback.

Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
Yes, the instanceDir attribute points to the newly created core (with no conf
dir), so it is strange...

But it looks like I have played too much:



when I start the main Solr shard. I will try everything again tomorrow and
give you feedback.






Re: Testing Solr Cloud with ZooKeeper

Posted by Erick Erickson <er...@gmail.com>.
Right. Solr uses zookeeper only for configuration information. The index
resides on the machines running solr.

bq: In my solr instances directory tree,  /solr/mycollection/ sometimes I
have an "index" or "index.20121107185908378"

You can have Solr keep snapshots of indexes around under control of an index
deletion policy, which you can configure. I think what you're seeing is this
policy in action; you can check to see how it's set up in your particular
situation. This is independent of SolrCloud; it's local to the solr node.

About CREATE: I'm not entirely sure where the config comes from, sorry I
can't help there... What does the solr.xml file show? Is there an instanceDir
attribute on the newly-created core (or schema or config)?

Best
Erick


On Wed, Nov 7, 2012 at 3:52 PM, darul <da...@gmail.com> wrote:

> I reply to myself:
>
> darul wrote
> > *A few questions:*
> > - Both of my ZooKeeper instances have their own data directory, as
> > usual, but I did not see much change inside after indexing the example
> > docs. Is data stored there, or is just /configuration (conf files)/
> > stored in the ZooKeeper ensemble? Can you confirm whether /index data/
> > is also stored in the ZooKeeper cluster or not?
>
> I read it again and saw "Solr embeds and uses Zookeeper as a repository for
> cluster configuration and coordination", so that means just configuration,
> not an index repository at all?

Re: Testing Solr Cloud with ZooKeeper

Posted by darul <da...@gmail.com>.
I reply to myself:

darul wrote
> *A few questions:*
> - Both of my ZooKeeper instances have their own data directory, as usual,
> but I did not see much change inside after indexing the example docs. Is
> data stored there, or is just /configuration (conf files)/ stored in the
> ZooKeeper ensemble? Can you confirm whether /index data/ is also stored in
> the ZooKeeper cluster or not?

I read it again and saw "Solr embeds and uses Zookeeper as a repository for
cluster configuration and coordination", so that means just configuration,
not an index repository at all?


