Posted to solr-user@lucene.apache.org by adfel70 <ad...@gmail.com> on 2014/02/03 08:33:10 UTC

need help in understanding solr cloud stats data

I'm sending all solr stats data to graphite.
I have some questions:
1. query_handler/select requestTime -
If I'm looking at some metric, let's say 75thPcRequestTime, I see that each
core in a single collection has different values.
Is the value for each core the time that specific core spent on a
request?
So to get an idea of total request time, should I sum the values
across all the cores?


2. update_handler/commits - does this include auto-commits? Because I'm
pretty sure I'm not doing any manual commits, and yet I see a number there.

3. update_handler/docs pending - what does this mean? Pending for what? For
flush to disk?

thanks.



--
View this message in context: http://lucene.472066.n3.nabble.com/need-help-in-understating-solr-cloud-stats-data-tp4114992.html
Sent from the Solr - User mailing list archive at Nabble.com.
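For what it's worth, if the per-core values are already flowing into Graphite, the aggregation can be done at query time with Graphite's series functions rather than in Solr. The metric paths below are hypothetical (adjust to however your feeder names them); note that counts can be summed across cores, but per-core percentiles cannot be meaningfully summed, so averaging them is at best an approximation:

```text
# Total request count across all cores of a collection (counts sum cleanly):
sumSeries(solr.myCollection.*.select.requests)

# 75th-percentile request time: averaging per-core percentiles is only a
# rough indicator, not a true collection-wide percentile:
averageSeries(solr.myCollection.*.select.75thPcRequestTime)
```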

Re: need help in understanding solr cloud stats data

Posted by David Santamauro <da...@gmail.com>.
Zabbix 2.2 has a JMX client built in, as well as a few JVM templates. I
wrote my own templates for my Solr instance, and monitoring and graphing
are wonderful.

David




Re: need help in understanding solr cloud stats data

Posted by Joel Cohen <jo...@bluefly.com>.
I had to come up with some Solr stats monitoring for my Zabbix instance. I
found that using JMX was the easiest way for us.

There is a command line jmx client that works quite well for me.
http://crawler.archive.org/cmdline-jmxclient/

I wrote a shell script to wrap around that and shove the data back to
Zabbix for ingestion and monitoring. I've listed the stats that I am
gathering, and the mbean that is called. My shell script is rather
simplistic.

#!/bin/bash
# jmxstats.sh: query one mbean attribute via cmdline-jmxclient and print the value.

cmdLineJMXJar=/usr/local/lib/cmdline-jmxclient.jar
jmxHost=$1
port=$2
query=$3
value=$4

java -jar "${cmdLineJMXJar}" user:pass "${jmxHost}:${port}" "${query}" "${value}" \
    2>&1 | awk '{print $NF}'

The script is called like so: jmxstats.sh <solr server name or IP> <jmx port>
<name of mbean> <value to query from mbean>
My collection name is productCatalog, so swap that with yours.
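To sketch what the awk filter in the wrapper is doing: cmdline-jmxclient prints the attribute and its value at the end of a log-style line (the sample line below is illustrative, not the tool's exact format), and `'{print $NF}'` keeps only the last whitespace-separated field, i.e. the value:

```shell
#!/bin/bash
# Illustrative only: the line format is an assumption; the point is that the
# metric value is the last field, which awk's $NF selects.
line='02/03/2014 13:12:01 -0500 org.archive.jmx.Client requests: 1234'
echo "$line" | awk '{print $NF}'
```

This is why the wrapper pipes stderr into the filter with `2>&1`: cmdline-jmxclient logs its output rather than writing plain values to stdout.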

*select requests*:
solr/productCatalog:id=org.apache.solr.handler.component.SearchHandler,type=/select
requests
*select errors*:
solr/productCatalog:id=org.apache.solr.handler.component.SearchHandler,type=/select
errors
*95th percentile request time*:
solr/productCatalog:id=org.apache.solr.handler.component.SearchHandler,type=/select
95thPcRequestTime
*update requests*:
solr/productCatalog:id=org.apache.solr.handler.UpdateRequestHandler,type=/update
requests
*update errors*:
solr/productCatalog:id=org.apache.solr.handler.UpdateRequestHandler,type=/update
errors
*95th percentile update time*:
solr/productCatalog:id=org.apache.solr.handler.UpdateRequestHandler,type=/update
95thPcRequestTime

*query result cache lookups*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=queryResultCache
cumulative_lookups
*query result cache inserts*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=queryResultCache
cumulative_inserts
*query result cache evictions*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=queryResultCache
cumulative_evictions
*query result cache hit ratio*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=queryResultCache
cumulative_hitratio

*document cache lookups*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=documentCache
cumulative_lookups
*document cache inserts*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=documentCache
cumulative_inserts
*document cache evictions*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=documentCache
cumulative_evictions
*document cache hit ratio*:
solr/productCatalog:id=org.apache.solr.search.LRUCache,type=documentCache
cumulative_hitratio

*filter cache lookups*:
solr/productCatalog:type=filterCache,id=org.apache.solr.search.FastLRUCache
cumulative_lookups
*filter cache inserts*:
solr/productCatalog:type=filterCache,id=org.apache.solr.search.FastLRUCache
cumulative_inserts
*filter cache evictions*:
solr/productCatalog:type=filterCache,id=org.apache.solr.search.FastLRUCache
cumulative_evictions
*filter cache hit ratio*:
solr/productCatalog:type=filterCache,id=org.apache.solr.search.FastLRUCache
cumulative_hitratio

*field value cache lookups*:
solr/productCatalog:type=fieldValueCache,id=org.apache.solr.search.FastLRUCache
cumulative_lookups
*field value cache inserts*:
solr/productCatalog:type=fieldValueCache,id=org.apache.solr.search.FastLRUCache
cumulative_inserts
*field value cache evictions*:
solr/productCatalog:type=fieldValueCache,id=org.apache.solr.search.FastLRUCache
cumulative_evictions
*field value cache hit ratio*:
solr/productCatalog:type=fieldValueCache,id=org.apache.solr.search.FastLRUCache
cumulative_hitratio

This set of stats gets me a pretty good idea of what's going on with my
SolrCloud at any time. Anyone have any thoughts or suggestions?

Joel Cohen
Senior System Engineer
Bluefly, Inc.




-- 

joel cohen, senior system engineer

e joel.cohen@bluefly.com p 212.944.8000 x276
bluefly, inc. 42 w. 39th st. new york, ny 10018
www.bluefly.com

Re: need help in understanding solr cloud stats data

Posted by Erick Erickson <er...@gmail.com>.
See:

http://wiki.apache.org/solr/HowToContribute

It outlines how to get the code, how to work with patches, how to set
up IntelliJ and Eclipse IDEs (links near the bottom?). There are
formatting files for both IntelliJ and Eclipse that'll do the right
thing in terms of indents and such.

Legal issues aside, you don't need to be very compulsive about cleaning up
the code before posting the first patch! Just let people know you
don't consider it ready to commit. You'll want to open a JIRA to
attach it to. People often put in //nocommit in places they especially
don't like, and the "precommit" ant target takes care of keeping these
from getting into the code.

People are quite happy to see hacky, first-cut patches. You'll often
get suggestions on approaches that may be easier and nobody will
complain about "bad code" when they know that _you_ don't consider it
submittable. Google for "Yonik's law of half-baked patches".

One thing that escapes people often... When attaching a patch to a
JIRA, just call it SOLR-####.patch, where #### is the JIRA number.
Successive versions of the patch should have the _same_ name; they'll
all be listed and the newest one will be "live". It's easier to know
which is the right patch that way. No big deal either way.

Best,
Erick


Re: need help in understanding solr cloud stats data

Posted by Greg Walters <gr...@answers.com>.
The code I wrote is currently a bit of an ugly hack, so I'm a bit reluctant to share it, and there are some legal concerns with open-sourcing code within my company. That being said, I wouldn't mind rewriting it on my own time. Where can I find a starter kit for contributors, with coding guidelines and the like? Spruced up some, I'd be OK with submitting a patch.

Thanks,
Greg

On Feb 3, 2014, at 10:08 AM, Mark Miller <ma...@gmail.com> wrote:

> You should contribute that and spread the dev load with others :)
> 
> We need something like that at some point, it’s just no one has done it. We currently expect you to aggregate in the monitoring layer and it’s a lot to ask IMO.
> 
> - Mark
> 
> http://about.me/markrmiller
> 
> On Feb 3, 2014, at 10:49 AM, Greg Walters <gr...@answers.com> wrote:
> 
>> I've had some issues monitoring Solr with the per-core mbeans and ended up writing a custom "request handler" that gets loaded then registers itself as an mbean. When called it polls all the per-core mbeans then adds or averages them where appropriate before returning the requested value. I'm not sure if there's a better way to get jvm-wide stats via jmx but it is *a* way to get it done.
>> 
>> Thanks,
>> Greg
>> 


Re: need help in understanding solr cloud stats data

Posted by "Ramkumar R. Aiyengar" <an...@gmail.com>.
We have had success with starting up Jolokia in the same servlet container
as Solr, and then using its REST/bulk API to query JMX from the application
of choice.
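As an illustration of why this is attractive: Jolokia's bulk-read protocol accepts a JSON array of read requests in a single POST, so several Solr mbeans can be fetched in one HTTP round trip. The host, port, and mbean names below are placeholders taken from examples earlier in this thread:

```text
POST http://solr-host:8983/jolokia/
[
  {"type": "read",
   "mbean": "solr/productCatalog:id=org.apache.solr.handler.component.SearchHandler,type=/select",
   "attribute": "requests"},
  {"type": "read",
   "mbean": "solr/productCatalog:id=org.apache.solr.search.LRUCache,type=queryResultCache",
   "attribute": "cumulative_hitratio"}
]
```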

Re: need help in understanding solr cloud stats data

Posted by Walter Underwood <wu...@wunderwood.org>.
I agree that sorting and filtering stats in Solr is not a good idea. There is certainly some use in aggregation, though. One request to /admin/mbeans replaces about 50 JMX requests.
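As a concrete sketch, a single request along these lines (host and core name are placeholders) returns stats for every registered mbean in one JSON response, instead of one JMX call per attribute:

```text
http://solr-host:8983/solr/productCatalog/admin/mbeans?stats=true&wt=json
```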

Is anybody working on https://issues.apache.org/jira/browse/SOLR-4735?

wunder

On Feb 4, 2014, at 8:13 AM, Otis Gospodnetic <ot...@gmail.com> wrote:

> +101 for more stats.  Was just saying that trying to pre-aggregate them
> along multiple dimensions is probably best left out of Solr.
> 
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/

--
Walter Underwood
wunder@wunderwood.org




Re: need help in understanding solr cloud stats data

Posted by Otis Gospodnetic <ot...@gmail.com>.
+101 for more stats.  Was just saying that trying to pre-aggregate them
along multiple dimensions is probably best left out of Solr.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Tue, Feb 4, 2014 at 10:49 AM, Mark Miller <ma...@gmail.com> wrote:

> I think that is silly. We can still offer per shard stats *and* let a user
> easily see stats for a collection without requiring they jump hoops or use
> a specific monitoring solution where someone else has already jumped hoops
> for them.
>
> You don't have to guess what ops people really want - *everyone* wants
> stats that make sense for the collections and cluster on top of the per
> shard stats. *Everyone* wouldn't mind seeing these without having to setup
> a monitoring solution first.
>
> If you want more than that, then you can fiddle with your monitoring
> solution.
>
> - Mark
>
> http://about.me/markrmiller
>
> On Feb 3, 2014, at 11:10 PM, Otis Gospodnetic <ot...@gmail.com>
> wrote:
>
> > Hi,
> >
> > Oh, I just saw Greg's email on dev@ about this.
> > IMHO aggregating in the search engine is not the way to go.  Leave that to
> > external tools, which are likely to be more flexible when it comes to this.
> > For example, our SPM for Solr can do all kinds of aggregations and
> > filtering by a number of Solr and SolrCloud-specific dimensions already,
> > without Solr having to do any sort of aggregation that it thinks Ops people
> > will really want.
> >
> > Otis
> > --
> > Performance Monitoring * Log Analytics * Search Analytics
> > Solr & Elasticsearch Support * http://sematext.com/
> >
> >

Re: need help in understating solr cloud stats data

Posted by Mark Miller <ma...@gmail.com>.
I think that is silly. We can still offer per shard stats *and* let a user easily see stats for a collection without requiring them to jump through hoops or use a specific monitoring solution where someone else has already jumped through those hoops for them.

You don’t have to guess what ops people really want - *everyone* wants stats that make sense for the collections and cluster on top of the per shard stats. *Everyone* wouldn’t mind seeing these without having to set up a monitoring solution first.

If you want more than that, then you can fiddle with your monitoring solution.

- Mark

http://about.me/markrmiller

On Feb 3, 2014, at 11:10 PM, Otis Gospodnetic <ot...@gmail.com> wrote:

> [...]


Re: need help in understating solr cloud stats data

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi,

Oh, I just saw Greg's email on dev@ about this.
IMHO aggregating in the search engine is not the way to go.  Leave that to
external tools, which are likely to be more flexible when it comes to this.
For example, our SPM for Solr can do all kinds of aggregations and
filtering by a number of Solr and SolrCloud-specific dimensions already,
without Solr having to do any sort of aggregation that it thinks Ops people
will really want.

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Mon, Feb 3, 2014 at 11:08 AM, Mark Miller <ma...@gmail.com> wrote:

> [...]

Re: need help in understating solr cloud stats data

Posted by Mark Miller <ma...@gmail.com>.
You should contribute that and spread the dev load with others :)

We need something like that at some point; it’s just that no one has done it yet. We currently expect you to aggregate in the monitoring layer, and that’s a lot to ask IMO.
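One concrete reason aggregating in the monitoring layer is a lot to ask: per-shard counters can simply be summed, but percentile metrics (such as the 75thPcRequestTime from the original question) cannot; averaging per-core percentiles is not the same as computing the percentile over the pooled requests. A minimal illustration with made-up timings and a nearest-rank percentile (illustrative only, not Solr's actual implementation):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

public class PercentileAggregation {

    // Nearest-rank percentile: the value at position ceil(p * n) in the sorted list.
    public static double percentile(List<Double> samples, double p) {
        List<Double> sorted = new ArrayList<>(samples);
        Collections.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.size());
        return sorted.get(rank - 1);
    }

    public static void main(String[] args) {
        // Made-up per-core request times (ms) for two cores of one collection.
        List<Double> core1 = Arrays.asList(1.0, 2.0, 3.0, 100.0);
        List<Double> core2 = Arrays.asList(1.0, 1.0, 2.0, 2.0);

        // Averaging the per-core 75th percentiles...
        double avgOfP75 = (percentile(core1, 0.75) + percentile(core2, 0.75)) / 2;

        // ...does not equal the 75th percentile over the pooled samples.
        List<Double> pooled = new ArrayList<>(core1);
        pooled.addAll(core2);
        double pooledP75 = percentile(pooled, 0.75);

        System.out.println(avgOfP75 + " vs " + pooledP75); // 2.5 vs 2.0
    }
}
```

This is why per-shard counts can be summed at the collection level, but per-shard percentiles can only be aggregated correctly from the underlying timings (or a mergeable sketch of them).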

- Mark

http://about.me/markrmiller

On Feb 3, 2014, at 10:49 AM, Greg Walters <gr...@answers.com> wrote:

> [...]


Re: need help in understating solr cloud stats data

Posted by Greg Walters <gr...@answers.com>.
I've had some issues monitoring Solr with the per-core mbeans and ended up writing a custom "request handler" that gets loaded and then registers itself as an mbean. When called, it polls all the per-core mbeans, then adds or averages them where appropriate before returning the requested value. I'm not sure if there's a better way to get JVM-wide stats via JMX, but it is *a* way to get it done.
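A minimal sketch of the approach described above, using the platform MBeanServer. The bean and attribute names here (CoreStat, "Requests") are hypothetical stand-ins for the per-core mbeans Solr actually exposes; a real version would query Solr's own JMX domain and object-name pattern:

```java
import java.lang.management.ManagementFactory;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

public class SolrStatAggregator {

    // Hypothetical per-core stat bean standing in for the mbeans each Solr core exposes.
    public interface CoreStatMBean { long getRequests(); }

    public static class CoreStat implements CoreStatMBean {
        private final long requests;
        public CoreStat(long requests) { this.requests = requests; }
        public long getRequests() { return requests; }
    }

    // The aggregating bean: polls every per-core bean matching an
    // ObjectName pattern and sums the attribute on each request.
    public interface AggregateStatMBean { long getTotalRequests(); }

    public static class AggregateStat implements AggregateStatMBean {
        private final MBeanServer server;
        private final ObjectName pattern;
        public AggregateStat(MBeanServer server, ObjectName pattern) {
            this.server = server;
            this.pattern = pattern;
        }
        public long getTotalRequests() {
            long total = 0;
            for (ObjectName name : server.queryNames(pattern, null)) {
                try {
                    total += (Long) server.getAttribute(name, "Requests");
                } catch (JMException e) {
                    // A core may be unloaded between query and read; skip it.
                }
            }
            return total;
        }
    }

    public static void main(String[] args) throws Exception {
        MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
        // Simulate two cores of one collection.
        mbs.registerMBean(new CoreStat(100), new ObjectName("solr:core=shard1,type=select"));
        mbs.registerMBean(new CoreStat(250), new ObjectName("solr:core=shard2,type=select"));
        AggregateStat agg = new AggregateStat(mbs, new ObjectName("solr:core=*,type=select"));
        // Register the aggregate so a JMX client sees one JVM-wide value.
        mbs.registerMBean(agg, new ObjectName("solr:type=aggregate"));
        System.out.println(agg.getTotalRequests()); // prints 350
    }
}
```

Note the aggregate bean's own name deliberately does not match the per-core pattern, so it never counts itself.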

Thanks,
Greg

On Feb 3, 2014, at 1:33 AM, adfel70 <ad...@gmail.com> wrote:

> [...]