Posted to solr-user@lucene.apache.org by tuxedomoon <da...@yahoo.com> on 2015/06/02 01:37:17 UTC

SolrCloud 5.1 startup looking for standalone config

I followed these steps and I am unable to launch in cloud mode.

1. created / started 3 external Zookeeper hosts: zk1, zk2, zk3

2. installed Solr 5.1 as a service called solrsvc on two hosts: s1, s2

3. uploaded a configset to zk1  (solr home is /volume/solr/data)
    -------------------------------------------------------------------
    /opt/solrsvc/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig \
        -zkhost zk1:2181 -confname mycollection_cloud_conf \
        -solrhome /volume/solr/data -confdir /home/ec2-user/mycollection/conf


4. on s1, added these params to solr.in.sh
    -------------------------------------------------------------------
    ZK_HOST=zk1:2181,zk2:2181,zk3:2181
    SOLR_HOST=s1
    ZK_CLIENT_TIMEOUT="15000"
    SOLR_OPTS="$SOLR_OPTS -DnumShards=2"


5. on s1 created core directory and file
    --------------------------------------------------------------------
    /volume/solr/data/mycollection/core.properties (name=mycollection)


6. repeated steps 4,5 for s2 minus the numShards param


Starting the service on s1 gives me

mycollection:
org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
Could not load conf for core mycollection: Error loading solr config from
/volume/solr/data/mycollection/conf/solrconfig.xml 

but aren't the config files supposed to be in Zookeeper?  

Tux

--
View this message in context: http://lucene.472066.n3.nabble.com/SolrCloud-5-1-startup-looking-for-standalone-config-tp4209118.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: SolrCloud 5.1 startup looking for standalone config

Posted by Erick Erickson <er...@gmail.com>.
bq: Does this remain 'fixed' in Zookeeper once established, so that restarting
nodes will not affect their shardn assignment?

How could it work otherwise? If restarting a node assigned the index
on that disk to another shard, chaos would ensue.

Best,
Erick

On Fri, Jun 5, 2015 at 6:51 AM, tuxedomoon <da...@yahoo.com> wrote:
>>> I would need to look at the code to figure out how it works, but I would
>>> imagine that the shards are shuffled randomly among the hosts so that
>>> multiple collections will be evenly distributed across the cluster.  It
>>> would take me quite a while to familiarize myself with the code before I
>>> could figure out where to look.
>
> The random assignment is ok, wherever shard3 is created will become node3
> for my system.  As long as each leader and replica pair remain partnered
>
> mycollection_shard1_replica1  <--> mycollection_shard1_replica2
> mycollection_shard2_replica1  <--> mycollection_shard2_replica2
> etc
>
> Does this remain 'fixed' in Zookeeper once established, so that restarting
> nodes will not affect their shardn assignment?

Re: SolrCloud 5.1 startup looking for standalone config

Posted by tuxedomoon <da...@yahoo.com>.
>> I would need to look at the code to figure out how it works, but I would
>> imagine that the shards are shuffled randomly among the hosts so that
>> multiple collections will be evenly distributed across the cluster.  It
>> would take me quite a while to familiarize myself with the code before I
>> could figure out where to look.

The random assignment is ok, wherever shard3 is created will become node3
for my system.  As long as each leader and replica pair remain partnered

mycollection_shard1_replica1  <--> mycollection_shard1_replica2
mycollection_shard2_replica1  <--> mycollection_shard2_replica2
etc

Does this remain 'fixed' in Zookeeper once established, so that restarting
nodes will not affect their shardn assignment?

Re: SolrCloud 5.1 startup looking for standalone config

Posted by Shawn Heisey <ap...@elyograg.org>.
On 6/3/2015 2:48 PM, tuxedomoon wrote:
> Yes adding _solr worked, thx.  But I also had to populate the SOLR_HOST param
> for each of the 4 hosts, as in
> SOLR_HOST=ec2-52-4-232-216.compute-1.amazonaws.com.   I'm in an EC2 VPN
> environment which might be the problem.
>
> This command now works (leaving off port)
>
> http://s1/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&collection.configName=mycollection_cloud_conf&createNodeSet=s1_solr,s2_solr,s3_solr
>
> The shard directories do now appear on s1,s2,s3 but the order is different
> every time I DELETE the collection and rerun the CREATE, right now it is
>
> s1: mycollection_shard2_replica1
> s2: mycollection_shard3_replica1
> s3: mycollection_shard1_replica1
>
> I'll look further at your article but any advice appreciated on controlling
> what hosts the shards land on.
>
> Also are these considered leaders?  If so I don't understand the replica1
> suffix.

A leader is merely a replica that has won an election and has a
temporary title.  It's still a replica, even if it's the ONLY replica.

I would need to look at the code to figure out how it works, but I would
imagine that the shards are shuffled randomly among the hosts so that
multiple collections will be evenly distributed across the cluster.  It
would take me quite a while to familiarize myself with the code before I
could figure out where to look.

If you want to have absolute control over shard and replica placement,
then you will probably need to follow steps similar to these:

* Create a collection with replicationFactor=1.
* Create foo_shardN_replica2 cores with CoreAdmin or ADDREPLICA where
you want them.
* Let the replication fully catch up.
* Use DELETEREPLICA on all the foo_shardN_replica1 cores.
* Manually create the foo_shardN_replica1 cores where you want them.
* Manually create any additional replicas that you desire.
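
The ADDREPLICA and DELETEREPLICA steps above could be sketched roughly as
follows. This is a dry run that only prints Collections API URLs; the base
URL, the collection name "foo", the node name and the replica name
"core_node1" are assumptions for illustration, and the helper functions are
not part of Solr. The initial CREATE is the same call already shown in this
thread with replicationFactor=1 added.

```shell
#!/bin/sh
# Dry-run sketch: print Collections API URLs, do not send them.
# BASE is an assumption; point it at any live Solr node.
BASE=${BASE:-http://s1:8983/solr/admin/collections}

# add_replica_url <collection> <shard> <node>
# <node> is a node name as listed under live_nodes, e.g. s2:8983_solr
add_replica_url() {
  printf '%s?action=ADDREPLICA&collection=%s&shard=%s&node=%s\n' "$BASE" "$1" "$2" "$3"
}

# delete_replica_url <collection> <shard> <replica>
# <replica> is the replica's name in the cluster state (assumed: core_node1)
delete_replica_url() {
  printf '%s?action=DELETEREPLICA&collection=%s&shard=%s&replica=%s\n' "$BASE" "$1" "$2" "$3"
}

add_replica_url foo shard1 s2:8983_solr
delete_replica_url foo shard1 core_node1
```

Once the printed URLs look right, each can be passed to curl inside single
quotes. Check in the Cloud view that the new replica is active before
deleting the old one.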

Thanks,
Shawn


Re: SolrCloud 5.1 startup looking for standalone config

Posted by tuxedomoon <da...@yahoo.com>.
Yes adding _solr worked, thx.  But I also had to populate the SOLR_HOST param
for each of the 4 hosts, as in
SOLR_HOST=ec2-52-4-232-216.compute-1.amazonaws.com.   I'm in an EC2 VPN
environment which might be the problem.

This command now works (leaving off port)

http://s1/solr/admin/collections?action=CREATE&name=mycollection&numShards=3&collection.configName=mycollection_cloud_conf&createNodeSet=s1_solr,s2_solr,s3_solr

The shard directories do now appear on s1,s2,s3 but the order is different
every time I DELETE the collection and rerun the CREATE, right now it is

s1: mycollection_shard2_replica1
s2: mycollection_shard3_replica1
s3: mycollection_shard1_replica1

I'll look further at your article but any advice appreciated on controlling
what hosts the shards land on.

Also are these considered leaders?  If so I don't understand the replica1
suffix.

Re: SolrCloud 5.1 startup looking for standalone config

Posted by Erick Erickson <er...@gmail.com>.
Take a closer look at the admin UI (Cloud > Tree) and you'll see that
the nodes have names like "s1:8983_solr", as does the example at the
link. Did you try it with that kind of labeling?
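
As a rough sketch of that labeling: a node name is the host, a colon, the
port, then an underscore and the webapp context. The helper names below are
made up for illustration and are not part of Solr; the default port 8983 and
context "solr" are assumptions.

```shell
#!/bin/sh
# Hypothetical helper: build a node name such as "s1:8983_solr".
node_name() {
  host=$1
  port=${2:-8983}      # assumption: default Solr port
  context=${3:-solr}   # assumption: default webapp context
  printf '%s:%s_%s\n' "$host" "$port" "$context"
}

# Join the node names of several hosts with commas, the form that
# createNodeSet expects.
node_set() {
  out=""
  for h in "$@"; do
    out="${out:+$out,}$(node_name "$h")"
  done
  printf '%s\n' "$out"
}

node_set s1 s2
```

The last line prints the value to pass as createNodeSet for hosts s1 and s2.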

Best,
Erick

On Tue, Jun 2, 2015 at 11:16 AM, tuxedomoon <da...@yahoo.com> wrote:
> I ran this command with Solr hosts s1 & s2 running.
>
> http://s1:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&collection.configName=mycollection_cloud_conf&createNodeSet=s1:8983,s2:8983
>
> I referred to  this link
> <http://heliosearch.org/solrcloud-assigning-nodes-machines/>   which looks
> like it is only passing the desired leaders to createNodeSet.
>
> But I'm getting this error
> -----------------------------------------------------------------------------------------
> Cannot create collection mycollection. Value of maxShardsPerNode is 1, and
> the number of nodes currently live or live and part of your createNodeSet is
> 0. This allows a maximum of 0 to be created. Value of numShards is 2 and
> value of replicationFactor is 1. This requires 2 shards to be created
> (higher than the allowed number)
>
> I get the same error with createNodeSet=s1:8983,s2:8983,s3:8983,s4:8983 with
> all four Solr hosts running.
>
> But the service status command shows that Zookeeper sees all my running
> nodes
> --------------------------------------------------------------------------------
> Solr process 24603 running on port 8983
> {
>   "solr_home":"/volume/solr/data/",
>   "version":"5.1.0 1672403 - timpotter - 2015-04-09 10:37:54",
>   "startTime":"2015-06-02T18:00:06.665Z",
>   "uptime":"0 days, 0 hours, 4 minutes, 35 seconds",
>   "memory":"19.6 MB (%4) of 490.7 MB",
>   "cloud":{
>     "ZooKeeper":"zk1:2181,zk2:2181,k3:2181",
>     "liveNodes":"4",
>     "collections":"0"}}
>
>
> I was expecting the absent maxShardsPerNode param to default to 1 and give
> me 2 leaders, 2 replicas.

Re: SolrCloud 5.1 startup looking for standalone config

Posted by tuxedomoon <da...@yahoo.com>.
I ran this command with Solr hosts s1 & s2 running.  

http://s1:8983/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&collection.configName=mycollection_cloud_conf&createNodeSet=s1:8983,s2:8983

I referred to  this link
<http://heliosearch.org/solrcloud-assigning-nodes-machines/>   which looks
like it is only passing the desired leaders to createNodeSet.

But I'm getting this error
-----------------------------------------------------------------------------------------
Cannot create collection mycollection. Value of maxShardsPerNode is 1, and
the number of nodes currently live or live and part of your createNodeSet is
0. This allows a maximum of 0 to be created. Value of numShards is 2 and
value of replicationFactor is 1. This requires 2 shards to be created
(higher than the allowed number)

I get the same error with createNodeSet=s1:8983,s2:8983,s3:8983,s4:8983 with
all four Solr hosts running.

But the service status command shows that Zookeeper sees all my running
nodes
--------------------------------------------------------------------------------
Solr process 24603 running on port 8983
{
  "solr_home":"/volume/solr/data/",
  "version":"5.1.0 1672403 - timpotter - 2015-04-09 10:37:54",
  "startTime":"2015-06-02T18:00:06.665Z",
  "uptime":"0 days, 0 hours, 4 minutes, 35 seconds",
  "memory":"19.6 MB (%4) of 490.7 MB",
  "cloud":{
    "ZooKeeper":"zk1:2181,zk2:2181,k3:2181",
    "liveNodes":"4",
    "collections":"0"}}


I was expecting the absent maxShardsPerNode param to default to 1 and give
me 2 leaders, 2 replicas.
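
Reading the error text literally, the check appears to compare two products:
maxShardsPerNode times the number of usable nodes, against numShards times
replicationFactor. A small sketch of that arithmetic follows. The function
and variable names are illustrative only and this is not Solr's code.

```shell
#!/bin/sh
# Sketch of the capacity arithmetic described in the error message.
# check_capacity <maxShardsPerNode> <usableNodes> <numShards> <replicationFactor>
check_capacity() {
  max_shards_per_node=$1
  usable_nodes=$2        # live nodes that also match createNodeSet
  num_shards=$3
  replication_factor=$4
  allowed=$((max_shards_per_node * usable_nodes))
  required=$((num_shards * replication_factor))
  if [ "$required" -gt "$allowed" ]; then
    echo "rejected: need $required, allowed $allowed"
    return 1
  fi
  echo "ok: need $required, allowed $allowed"
}

check_capacity 1 0 2 1 || true   # the case in the error: 0 nodes matched
check_capacity 1 2 2 1           # two matching nodes would be enough
```

With replicationFactor at 1, two shards need only two cores, so the request
fails here because zero nodes matched createNodeSet, not because of the
maxShardsPerNode default.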




Re: SolrCloud 5.1 startup looking for standalone config

Posted by tuxedomoon <da...@yahoo.com>.
ok thanks, continuing...

>> numShards in SOLR_OPTS isn't a good idea, what happens if you want to
>> create a collection with 5 shards?)
Yes, I was following my old pattern:
CATALINA_OPTS="${CATALINA_OPTS} -DnumShards=n"

>> down the nodes and nuke the directories you created by hand and bring the
>> nodes back up
yes I did this

>> create the collection via the Collections API CREATE 
 I did this but kept getting "not running in SolrCloud mode".  Added the -c
option to my service script like this

su -c "SOLR_INCLUDE=$SOLR_ENV $SOLR_INSTALL_DIR/bin/solr $SOLR_CMD -c" - $RUNAS

and it did start in cloud mode.  Is the -c necessary and is that the right
place for it?  I thought  uncommenting the ZK param in solr.in.sh would put
it in cloud mode.  

Reran the CREATE and got a shard1 and shard2 in the GUI cloud view.   

New directories are arc_search_shard1_replica1 and
arc_search_shard2_replica1.  Is this because I have only 2 Solr hosts
running?   I'm used to adding nodes one by one and having the replica
assignments start when numShards count is exceeded.

Transitioning from 4.2 to 5.1 and it's quite different!

Re: SolrCloud 5.1 startup looking for standalone config

Posted by Erick Erickson <er...@gmail.com>.
bq: but aren't the config files supposed to be in Zookeeper

Yes, but you haven't done anything to tell Solr that the nodes you've
created are part of SolrCloud!

I think you're confusing core discovery with creating collections.
Basically you were pretty much OK up until step 5 (although I'm not at
all sure that SOLR_HOST is doing you any good, and certainly setting
numShards in SOLR_OPTS isn't a good idea: what happens if you want to
create a collection with 5 shards?).

You don't need to create any directories on your Solr nodes, that'll
be done for you automatically by the collection creation command from
the Collections API. So I'd down the nodes and nuke the directories
you created by hand and bring the nodes back up. It's probably not
necessary to take the nodes down, but I tend to be paranoid about
that.

Then just create the collection via the Collections API CREATE
command, see: https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1

You can use curl or a browser to issue something like this to any
active Solr node, Solr will do the rest:
http://some_solr_node:port/solr/admin/collections?action=CREATE&name=mycollection&numShards=2&collection.configName=mycollection_cloud_conf&etc......

I believe it's _possible_ to carefully construct the core.properties
files on all the Solr instances, but unless you know _exactly_ what's
going on under the covers it'll lead to endless tail-chasing. You can
control which nodes the collection ends up on with the createNodeSet
parameter etc....
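
For the curl route, a minimal sketch of assembling such a request is below.
It only prints the curl command rather than running it. The Solr URL, the
node names and the helper function are assumptions for illustration; the
collection and config names are the ones from the original post.

```shell
#!/bin/sh
# Dry-run sketch: build a Collections API CREATE request and print the
# curl command instead of executing it.
solr=${SOLR_URL:-http://s1:8983/solr}   # assumption: any live Solr node

# create_collection_cmd <name> <numShards> <configName> <createNodeSet>
create_collection_cmd() {
  url="$solr/admin/collections?action=CREATE&name=$1&numShards=$2&collection.configName=$3&createNodeSet=$4"
  # Single-quote the URL: an unquoted '&' would send curl to the background.
  printf "curl '%s'\n" "$url"
}

create_collection_cmd mycollection 2 mycollection_cloud_conf s1:8983_solr,s2:8983_solr
```

Paste the printed line into a shell to send the request. The quoting matters
more than anything else here, since an unquoted URL is cut off at the first
ampersand.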

Best,
Erick

On Mon, Jun 1, 2015 at 4:37 PM, tuxedomoon <da...@yahoo.com> wrote:
> I followed these steps and I am unable to launch in cloud mode.
>
> 1. created / started 3 external Zookeeper hosts: zk1, zk2, zk3
>
> 2. installed Solr 5.1 as a service called solrsvc on two hosts: s1, s2
>
> 3. uploaded a configset to zk1  (solr home is /volume/solr/data)
>     -------------------------------------------------------------------
>     /opt/solrsvc/server/scripts/cloud-scripts/zkcli.sh -cmd upconfig -zkhost
> zk1:2181  -confname mycollection_cloud_conf -solrhome /volume/solr/data
> -confdir  /home/ec2-user/mycollection/conf
>
>
> 4. on s1, added these params to solr.in.sh
>     -------------------------------------------------------------------
>     ZK_HOST=zk1:2181,zk2:2181,zk3:2181
>     SOLR_HOST=s1
>     ZK_CLIENT_TIMEOUT="15000"
>     SOLR_OPTS="$SOLR_OPTS -DnumShards=2"
>
>
> 5. on s1 created core directory and file
>     --------------------------------------------------------------------
>     /volume/solr/data/mycollection/core.properties (name=mycollection)
>
>
> 6. repeated steps 4,5 for s2 minus the numShards param
>
>
> Starting the service on s1 gives me
>
> mycollection:
> org.apache.solr.common.SolrException:org.apache.solr.common.SolrException:
> Could not load conf for core mycollection: Error loading solr config from
> /volume/solr/data/mycollection/conf/solrconfig.xml
>
> but aren't the config files supposed to be in Zookeeper?
>
> Tux