Posted to solr-user@lucene.apache.org by Ethan <eh...@gmail.com> on 2014/09/08 20:20:45 UTC

Solr Sharding Help

I am trying to set up a 2-shard cluster with 2 replicas, using dedicated nodes
for the replicas. I have a 4-node SolrCloud setup that I am trying to shard
using the Collections API (like
https://wiki.apache.org/solr/SolrCloud#Example_C:_Two_shard_cluster_with_shard_replicas_and_zookeeper_ensemble
)

I ran this command -

http://serv001:5258/solr/admin/collections?action=CREATE&name=Main&numShards=2&maxShardsPerNode=1&createNodeSet=serv001:5258_solr,serv002:5258_solr

Response -

<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3932</int>
</lst>
<lst name="success">
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">2982</int>
</lst>
<str name="core">Main_shard2_replica1</str>
</lst>
<lst>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">3005</int>
</lst>
<str name="core">Main_shard1_replica1</str>
</lst>
</lst>
</response>

I want to know what *_replica1 or *_replica2 means. Are they actually
replicas and not the shards? I intended to add 2 more nodes as dedicated
replication nodes. How can I accomplish that?

Would appreciate any pointers.

-E

Re: Solr Sharding Help

Posted by Alexandre Rafalovitch <ar...@gmail.com>.
On Mon, Sep 8, 2014 at 3:11 PM, Erick Erickson <er...@gmail.com> wrote:
> I've started a one-man
> campaign to talk about "leaders" and "followers" when relevant

Well, if you write it up on the Wiki/Manual and keep pointing people
to it, maybe we will all fall in line. I, for one, do not care what
terminology is actually used as long as it is consistent and explains
the situation (Ugh core vs collection vs index vs .....).

Regards,
   Alex.
P.s. Or you could discuss that on the Solr popularizers LinkedIn group too....

Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter: http://www.solr-start.com/ and @solrstart
Solr popularizers community: https://www.linkedin.com/groups?gid=6713853

Re: Solr Sharding Help

Posted by Ethan <eh...@gmail.com>.
Thanks Jeff. I had a different idea of how replicationFactor worked. I was
able to create the setup with that command.

Now, as I import data into the cluster, how can I determine that it's being
sharded?
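
One possible way to check, sketched here under assumptions rather than taken from the thread: ask each core for its own document count with distrib=false, so the query is not fanned out to the other shard. If documents are being distributed, each shard should report only part of the total. The helper name shard_count_url and the node-to-core mapping below are hypothetical; the snippet only builds the URLs and does not contact Solr.

```python
from urllib.parse import urlencode

def shard_count_url(node, core):
    """Build a URL asking a single core for its own document count (no network I/O)."""
    # distrib=false keeps the query on this core instead of fanning out;
    # rows=0 returns only the numFound total, not the documents.
    params = {"q": "*:*", "rows": 0, "distrib": "false", "wt": "json"}
    return "http://%s/solr/%s/select?%s" % (node, core, urlencode(params, safe="*:"))

# Which core lives on which node is an assumption for illustration only.
cores = {
    "serv001:5258": "Main_shard1_replica1",
    "serv002:5258": "Main_shard2_replica1",
}
for node, core in cores.items():
    print(shard_count_url(node, core))
```

Comparing the numFound value returned by each URL against a normal (distributed) query on the collection should show whether the documents are split across the two shards.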

On Mon, Sep 8, 2014 at 1:52 PM, Jeff Wartes <jw...@whitepages.com> wrote:

>
> You need to specify a replication factor of 2 if you want two copies of
> each shard. Solr doesn't "auto fill" available capacity, contrary to the
> misleading examples on the http://wiki.apache.org/solr/SolrCloud page.
> Those examples only have that behavior because they ask you to copy the
> examples directory, which brings some on-disk configuration with it.

Re: Solr Sharding Help

Posted by Jeff Wartes <jw...@whitepages.com>.
You need to specify a replication factor of 2 if you want two copies of
each shard. Solr doesn't "auto fill" available capacity, contrary to the
misleading examples on the http://wiki.apache.org/solr/SolrCloud page.
Those examples only have that behavior because they ask you to copy the
examples directory, which brings some on-disk configuration with it.
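
As a rough sketch of the advice above, the CREATE call from this thread can be rebuilt with replicationFactor=2 and all four nodes in createNodeSet. The host names, port, and collection name come from the thread; the helper name build_create_url is hypothetical, and the snippet only builds the URL without contacting Solr.

```python
from urllib.parse import urlencode

def build_create_url(base, name, num_shards, replication_factor, nodes):
    """Build a Collections API CREATE URL (hypothetical helper, no network I/O)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        "maxShardsPerNode": 1,
        # Comma-separated node names, with no spaces between them.
        "createNodeSet": ",".join(nodes),
    }
    # Keep ':' and ',' readable; urlencode would otherwise percent-encode them.
    return base + "/admin/collections?" + urlencode(params, safe=":,")

# serv001 .. serv004, as named in the thread.
nodes = ["serv%03d:5258_solr" % i for i in range(1, 5)]
url = build_create_url("http://serv001:5258/solr", "Main", 2, 2, nodes)
print(url)
```

With 2 shards and 2 copies of each, four cores are requested in total, which fits four nodes under maxShardsPerNode=1.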



On 9/8/14, 1:33 PM, "Ethan" <eh...@gmail.com> wrote:

>Thanks Erick.  That cleared my confusion.
>
>I have a follow-up question: if I run the CREATE command with 4 nodes in
>createNodeSet, I thought 2 leaders and 2 followers would be created
>automatically. That's not the case, however.
>
>http://serv001:5258/solr/admin/collections?action=CREATE&name=Main&numShards=2&maxShardsPerNode=1&createNodeSet=serv001:5258_solr,serv002:5258_solr,serv003:5258_solr,serv004:5258_solr
>
>I still get the same response. I see 2 leaders being created, but I do not
>see the other 2 nodes show up as followers in the Cloud page of the Solr
>Admin UI. It looks like the collection was not created on those 2 nodes at all.
>
>Is there an additional step involved to add them?


Re: Solr Sharding Help

Posted by Ethan <eh...@gmail.com>.
Thanks Erick.  That cleared my confusion.

I have a follow-up question: if I run the CREATE command with 4 nodes in
createNodeSet, I thought 2 leaders and 2 followers would be created
automatically. That's not the case, however.

http://serv001:5258/solr/admin/collections?action=CREATE&name=Main&numShards=2&maxShardsPerNode=1&createNodeSet=serv001:5258_solr,serv002:5258_solr,serv003:5258_solr,serv004:5258_solr

I still get the same response. I see 2 leaders being created, but I do not
see the other 2 nodes show up as followers in the Cloud page of the Solr
Admin UI. It looks like the collection was not created on those 2 nodes at all.

Is there an additional step involved to add them?

On Mon, Sep 8, 2014 at 12:11 PM, Erick Erickson <er...@gmail.com>
wrote:

> Ahhh, this is a continual source of confusion. I've started a one-man
> campaign to talk about "leaders" and "followers" when relevant...
>
> _Every_ node is a "replica". This is because a node can be a leader or
> follower, and the role can change.
>
> So your case is entirely normal. These nodes are probably the leaders
> too, and will remain so while you add more replicas/followers.
>
> Best,
> Erick

Re: Solr Sharding Help

Posted by Erick Erickson <er...@gmail.com>.
Ahhh, this is a continual source of confusion. I've started a one-man
campaign to talk about "leaders" and "followers" when relevant...

_Every_ node is a "replica". This is because a node can be a leader or
follower, and the role can change.

So your case is entirely normal. These nodes are probably the leaders
too, and will remain so while you add more replicas/followers.

Best,
Erick
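
To make the naming concrete, here is a small sketch that splits the core names from the CREATE response into their parts. It assumes the core names follow the pattern <collection>_<shard>_replica<N>, as the two names in the response do; the function name parse_core_name is hypothetical.

```python
import re

# Assumed naming pattern, based on the core names in the response above:
# <collection>_<shard>_replica<N>
CORE_NAME = re.compile(
    r"^(?P<collection>.+)_(?P<shard>shard\d+)_replica(?P<replica>\d+)$"
)

def parse_core_name(core):
    """Split a core name into (collection, shard, replica number)."""
    m = CORE_NAME.match(core)
    if m is None:
        raise ValueError("unexpected core name: %r" % core)
    return m.group("collection"), m.group("shard"), int(m.group("replica"))

for core in ["Main_shard2_replica1", "Main_shard1_replica1"]:
    print(parse_core_name(core))
```

Read this way, Main_shard1_replica1 is the first copy of shard1 rather than a second copy of anything; under the same assumed pattern, a further copy of that shard would be expected to carry a higher replica number.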

On Mon, Sep 8, 2014 at 11:20 AM, Ethan <eh...@gmail.com> wrote:
> I am trying to set up a 2-shard cluster with 2 replicas, using dedicated nodes
> for the replicas. I have a 4-node SolrCloud setup that I am trying to shard
> using the Collections API (like
> https://wiki.apache.org/solr/SolrCloud#Example_C:_Two_shard_cluster_with_shard_replicas_and_zookeeper_ensemble
> )
>
> I ran this command -
>
> http://serv001:5258/solr/admin/collections?action=CREATE&name=Main&numShards=2&maxShardsPerNode=1&createNodeSet=serv001:5258_solr,serv002:5258_solr
>
> Response -
>
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">3932</int>
> </lst>
> <lst name="success">
> <lst>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">2982</int>
> </lst>
> <str name="core">Main_shard2_replica1</str>
> </lst>
> <lst>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">3005</int>
> </lst>
> <str name="core">Main_shard1_replica1</str>
> </lst>
> </lst>
> </response>
>
> I want to know what *_replica1 or *_replica2 means. Are they actually
> replicas and not the shards? I intended to add 2 more nodes as dedicated
> replication nodes. How can I accomplish that?
>
> Would appreciate any pointers.
>
> -E