Posted to solr-user@lucene.apache.org by Himanshu Mehrotra <hi...@gmail.com> on 2014/07/05 10:57:49 UTC

Solr and SolrCloud replication, and load balancing questions.

Hi,

I had three questions/doubts regarding Solr and SolrCloud functionality.
Can anyone help clarify these? I know they are a bit long, please bear with
me.

[A] Replication related - As I understand it, before SolrCloud, under a
classic master/slave replication setup, every 'X' minutes the slaves pull/poll
the updated index (index segments added, and deleted/merged away).  And when a
client explicitly issues a 'commit', only the master Solr closes/finalizes the
current index segment and creates a new current index segment.  As part of
this, index segment merges happen, as well as an 'fsync' ensuring the data is
also on disk.

I read the documentation regarding replication in SolrCloud, but
unfortunately it is still not very clear to me.

Say I have a SolrCloud setup of 3 Solr servers with just a single shard.
Let's call them L (the leader) and F1 and F2, the followers.

Case 1: We are not using autoCommits, and explicitly issue 'commit' via the
client.  How does replication happen now?
Does each update to leader L that goes into the tlog get replicated to
followers F1 and F2 (where they also put the update in their tlogs) before the
client sees a response from leader L?  What happens when the client issues a
'commit'?  Do the creation of a new segment, the merging of index segments if
required, and the fsync happen on all three Solrs, or does that happen only on
leader L, with followers F1 and F2 simply syncing the post-commit state of the
index?  Moreover, does leader L wait for the fsync on followers F1 and F2
before responding successfully to the client?  If yes, does it update F1 and
then F2 sequentially, or is the process concurrent/parallel via threads?

Case 2: We use autoCommit every 'X' minutes and do not issue 'commit' via the
client.  Is this setup similar to classic master/slave in terms of data/index
updates?  That is, since autoCommit happens every 'X' minutes, replication
happens after each commit, so every 'X' minutes the followers get the updated
index.  But do plain updates, the ones that go into the tlog, get replicated
immediately to the followers' tlogs?

Another thing I noticed in the Solr Admin UI is that replication is set to
afterCommit.  What are the other possible settings for this knob, and what
behaviour do we get out of them?
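
For context, I believe the stanza behind that UI field is the classic
ReplicationHandler config in solrconfig.xml, roughly like this sketch (the
values here are illustrative):

    <!-- Sketch of a classic master-side replication config (illustrative). -->
    <requestHandler name="/replication" class="solr.ReplicationHandler">
      <lst name="master">
        <!-- replicateAfter can be listed more than once; the values I have
             seen documented are commit, optimize, and startup. -->
        <str name="replicateAfter">commit</str>
        <str name="confFiles">schema.xml,stopwords.txt</str>
      </lst>
    </requestHandler>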




[B] Load balancing related - In a traditional master/slave setup we use a
load balancer to distribute search query load equally over the slaves.  In
case one of the slave Solrs is running on a 'beefier' machine (say more RAM or
CPU or both) than the others, load balancers allow distributing load by
weights, so that we can distribute load proportional to perceived machine
capacity.

With a SolrCloud setup, let's take an example: 2 shards, 3 replicas per
shard, totaling 6 running Solr servers.  Say we have servers S1L1, S1F1, S1F2
hosting replicas of shard1 and servers S2L1, S2F1, S2F2 hosting replicas of
shard2.  S1L1 and S2L1 happen to be the leaders of their respective shards.
And let's say S1F2 and S2F1 happen to be twice as big machines as the others
(twice the RAM and CPU).

Ideally, in such a case we would want S2F1 and S1F2 to handle twice the
search query load of their peers.  That is, if 100 search queries come in, we
know each shard will receive those 100 queries.  So we want S1L1 and S1F1 to
handle 25 queries each, and S1F2 to handle 50 queries.  Similarly we would
want S2L1 and S2F2 to handle 25 queries each, and S2F1 to handle 50 queries.

As far as I understand, this is not possible via the smart client provided in
SolrJ.  All Solr servers will handle an equal share of the query load (each
replica a third of its shard's queries).
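
For context, by 'smart client' I mean something like this minimal SolrJ
(4.x era) sketch; the ZooKeeper address and collection name are placeholders:

    // Minimal SolrJ "smart client" sketch (Solr 4.x era API).
    // CloudSolrServer watches cluster state in ZooKeeper and round-robins
    // requests across live replicas, with no notion of per-node weights.
    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;

    public class SmartClientSketch {
        public static void main(String[] args) throws Exception {
            CloudSolrServer client =
                new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
            client.setDefaultCollection("collection1");
            QueryResponse rsp = client.query(new SolrQuery("*:*"));
            System.out.println("hits: " + rsp.getResults().getNumFound());
            client.shutdown();
        }
    }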

The alternative is to use a dumb client and a load balancer over all servers.
But even then, I guess we won't get the correct/desired distribution of
queries.  Say we put the following weights on each server:

1 - S1L1
1 - S1F1
2 - S1F2
1 - S2L1
2 - S2F1
1 - S2F2

Now 1/4 of the total requests go to S1F2 directly, plus it now receives 1/6
( 1/2 * 1/3 ) of the requests that went to some server on shard 2.  This
totals 10/24 of the request load, not the half we would expect.

One way could be to choose weights y and x such that y/(2*(y + 2x)) + 1/6 =
1/2 .  It seems too much trouble to obtain them ( y = 4 and x = 1 ), and
every time we add/remove/upgrade servers we need to recalculate the weights.
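
( Working the equation out: the direct share y/(2*(y + 2x)) must equal
1/2 - 1/6 = 1/3, so 3y = 2y + 4x, i.e. y = 4x, and the smallest integer
solution is y = 4, x = 1. )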

A simpler alternative, it appears, would be for each Solr node to register a
'query_weight' with ZooKeeper on joining the cluster.  This 'query_weight'
could be a property, similar to 'solr.solr.home' or 'zkHost', that we specify
on the startup command line for the Solr server.  All smart clients and Solr
servers would then honour that weight when they distribute load.  Is such a
feature planned for SolrCloud?




[C] GC/Memory usage related - From the documentation and videos available
on the internet, it appears that Solr performs well if the index fits into
memory and the stored fields fit in memory.  Holding only the index in memory
degrades Solr performance somewhat; if we don't have enough memory to hold
even the index, Solr is slower still; and the moment the Java process hits
swap space, Solr slows to a crawl.

My question is: what is the 'memory' being talked about here?  Is it the Java
heap we specify via the Xmx and Xms options, or is it the free, buffered, or
cached memory as reported by the output of the free command on *nix systems?
And how do we know whether our index and stored fields will fit in memory?
For example, say the data directory for the core/collection occupies 200MB on
disk ( 150,000 live documents and 180,000 max documents per the Solr UI );
is an 8GB machine with Solr configured with Xmx 4G going to be sufficient?

Are there any guidelines for configuring the Java heap and total RAM, given
an index size and the expected query rate ( queries per minute/second )?
On a production system I observed via GC logs that minor collections happen
at a rate of 20 per minute and a full GC happens every seven to ten minutes;
are these too high or too low, given that the direct search query load on
that Solr node is about 2500 requests per minute?  What kind of GC behaviour
should I expect from a healthy and fast/optimal Solr node in a SolrCloud
setup?  Is the answer 'it depends on your specific response time and
throughput requirements', or is there some rule of thumb that can be followed
irrespective of the situation?  Or should I just see whether improvements can
be made via regular measure, tweak, measure cycles?

Thanks,
Himanshu

Re: Solr and SolrCloud replication, and load balancing questions.

Posted by Himanshu Mehrotra <hi...@snapdeal.com>.
Ok, great.  Thanks for helping out.

Thanks,
Himanshu



Re: Solr and SolrCloud replication, and load balancing questions.

Posted by Erick Erickson <er...@gmail.com>.
[C] I've rarely seen situations where the document cache has
a very high hit rate. For that to happen, the queries would
need to be returning the exact same documents, which isn't
usually the case. I wouldn't increase this very far. The
recommendation is that it be
(total simultaneous queries you expect) * (page size). The
response chain sometimes has separate components that
operate on the docs being returned (i.e. the "rows" parameter).
The docs in the cache can then be accessed by successive
components without going back out to disk. But by and
large these entries aren't reused across different queries.
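
(For example, if you expect 50 simultaneous queries at a page size of
rows=20, that rule of thumb works out to a documentCache size of about
50 * 20 = 1000 entries.)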

And, in fact, with the MMapDirectory stuff it's not even clear to
me that this cache is as useful as it was when it was conceived.

Bottom line: I wouldn't worry about this one, and I certainly
wouldn't allocate lots of memory to it until and unless I was able
to measure performance gains....

Best,
Erick


Re: Solr and SolrCloud replication, and load balancing questions.

Posted by Himanshu Mehrotra <hi...@snapdeal.com>.
Erick, first up thanks for thoroughly answering my questions.

[A]  I had read the blog mentioned, and yet failed to 'get it'.  Now I
understand the flow.
[B]  The automatic, heuristic-based approach, as you said, will be difficult
to get right; that is why I thought a 'beefiness' index configuration,
similar to a load balancer's weights, might help get the same effective
result for the most part.  I guess it is a feature most people won't need,
only the ones in the process of upgrading their machines.
[C]  I will go through the blog and do empirical analysis.  Speaking of
caches, I see that for us the filter cache hit ratio is a good 97% while the
document cache hit ratio is below 10%.  Does that mean the document cache
(size=4096) is not big enough and I should increase its size, or does it mean
we are getting queries that result in too wide a result set, and hence we
would probably be better off switching off the document cache altogether if
we could do it?

Thanks,
Himanshu




Re: Solr and SolrCloud replication, and load balancing questions.

Posted by Erick Erickson <er...@gmail.com>.
Question 1, both sub-cases.

You're off on the wrong track here; you have to forget about replication.

When documents are added to the index, they get forwarded to _all_
replicas. So the flow is like this...
1> leader gets update request
2> leader indexes docs locally, and adds to (local) transaction log
      _and_ forwards request to all followers
3> followers add docs to tlog and index locally
4> followers ack back to leader
5> leader acks back to client.
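
In SolrJ terms, the client side of that flow is roughly the following sketch
(4.x era API; the ZK address and collection name are placeholders):

    // Sketch: index one doc and issue an explicit commit (SolrJ 4.x era).
    import org.apache.solr.client.solrj.impl.CloudSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class AddThenCommit {
        public static void main(String[] args) throws Exception {
            CloudSolrServer client = new CloudSolrServer("zk1:2181");
            client.setDefaultCollection("collection1");
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            client.add(doc);     // steps 1-5 above: leader and followers
                                 // index the doc and write their tlogs
            client.commit();     // forwarded to every node in the collection
            client.shutdown();
        }
    }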

There is no replication in the old sense at all in this scenario. I'll
add parenthetically that old-style replication _is_ still used to
"catch up" a follower that is waaaaaay behind, but the follower is
in the "recovering" state if this ever occurs.

About commit. If you commit from the client, the commit is forwarded
to all followers (actually, all nodes in the collection). If you have
autocommit configured, each of the replicas will fire their commit when
the time period expires.
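
(For reference, that's the autoCommit stanza in each node's solrconfig.xml;
a rough sketch with illustrative values:

    <!-- Illustrative sketch: each replica fires these timers independently. -->
    <updateHandler class="solr.DirectUpdateHandler2">
      <autoCommit>
        <maxTime>600000</maxTime>        <!-- hard commit every 10 minutes -->
        <openSearcher>false</openSearcher>
      </autoCommit>
      <autoSoftCommit>
        <maxTime>5000</maxTime>          <!-- soft commit for search visibility -->
      </autoSoftCommit>
    </updateHandler>
)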

Here's a blog that might help:
http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/

[B] right, SolrCloud really supposes that the machines are pretty
similar so doesn't provide any way to do what you're asking. Really,
you're asking for some way to assign "beefiness" to the node in terms
of load sent to it... I don't know of a way to do that and I'm not
sure it's on the roadmap either.

What you'd really want, though, is some kind of heuristic that was
automatically applied. That would take into account transient load
problems, i.e. replica N happened to get a really nasty query to run
and is just slow for a while. I can see this being very tricky to get
right though. Would a GC pause get weighted as "slow" even though the
pause could be over already? Anyway, I don't think this is on the
roadmap at present but could well be wrong.

In your specific example, though (this works because of the convenient
2x....) you could host 2x the number of shards/replicas on the beefier
machines.

[C] Right, memory allocation is difficult. The general recommendation
is that the memory allocated to Solr in the JVM should be as small as
possible, letting the operating system use the rest for MMapDirectory.
See the excellent blog here:
http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory-on-64bit.html.
If you over-allocate memory to the JVM, your GC profile worsens...

Generally, when people throw "memory" around they're talking about JVM memory...

And don't be misled by the notion of "the index fitting into memory".
You're absolutely right that when you get into a swapping situation,
performance will suffer. But there are some very interesting tricks
played to keep JVM consumption down. For instance, only every 128th
term is stored in the JVM memory. Other terms are then read as needed.
And stored in the OS memory via MMapDirectory implementations....

Your GC stats look quite reasonable. You can get a snapshot of memory
usage by attaching, say, jConsole to the running JVM and see what
memory usage was after a forced GC. Sounds like you've already seen
this, but in case not:
http://searchhub.org/2011/03/27/garbage-collection-bootcamp-1-0/. It
was written before there was much mileage on the new G1 garbage
collector which has received mixed reviews.

Note that the stored fields kept in memory are controlled by the
documentCache in solrconfig.xml. I think of this as just something
that holds documents when assembling the return list, it really
doesn't have anything to do with searching per se, just keeping disk
seeks down during processing for a particular query. I.e. for a query
returning 10 rows, only 10 docs will be kept here not the 5M rows that
matched.
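
(For reference, a rough sketch of that cache's configuration in
solrconfig.xml; the sizes here are just illustrative:

    <!-- Illustrative documentCache sketch; note it cannot be autowarmed. -->
    <documentCache class="solr.LRUCache"
                   size="512"
                   initialSize="512"
                   autowarmCount="0"/>
)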

Whether 4G is sufficient is.... not answerable. I've doubled the
memory requirements by changing the query without changing the index.
Here's a blog outlining why we can't predict and how to get an answer
empirically:
http://searchhub.org/2012/07/23/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

Best,
Erick
