Posted to users@kafka.apache.org by Vladimir Tretyakov <vl...@sematext.com> on 2014/06/02 14:18:18 UTC

How to parse some of JMX Bean's names.

Hello everyone,

We are adding Kafka 0.8.x monitoring support to SPM
<http://sematext.com/spm/> here at Sematext. Unfortunately, we quickly hit
an issue caused by the new bean naming convention that embeds things like
topic and host names in the beans along with metrics, separated by dashes,
making it hard to parse these beans.

To put it simply: it is hard/impossible to automatically figure out which
part of the bean name is e.g. consumer group, which is the topic, which is
the host name, and which is the name of the metric.

Let me show you what I mean:

kafka.consumer:type="ConsumerTopicMetrics",name="af_servers-spm_topic-BytesPerSec"

Here we actually CAN extract:

 * consumer group ('af_servers')

 * topic ('spm_topic')

 * metric (‘BytesPerSec’)

BUT what if the consumer group id and/or topic name contain '-'?

Then how would we extract consumer group and topic?

Here is a concrete example of this problem:

kafka.consumer:type="ConsumerTopicMetrics",name="af-servers-spm-topic-BytesPerSec"

How can we know what is group id or topic name here?
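
To make the ambiguity concrete, here is a minimal Scala sketch (an illustration only, not Kafka code). It assumes the metric name comes from a known list (the `knownMetrics` list and the `BeanNameSplits` object are hypothetical) and enumerates every way the remaining prefix could be split into a group id and a topic:

```scala
object BeanNameSplits {
  // Hypothetical list of metric suffixes a monitoring tool might know about.
  val knownMetrics: Seq[String] = Seq("BytesPerSec", "MessagesPerSec")

  /** Returns every (group, topic, metric) reading of a dash-delimited bean name. */
  def candidates(name: String): Seq[(String, String, String)] =
    knownMetrics.filter(m => name.endsWith("-" + m)).flatMap { metric =>
      // Strip "-<metric>" so that only "<group>-<topic>" remains.
      val prefix = name.dropRight(metric.length + 1)
      val tokens = prefix.split("-", -1).toSeq
      // Every dash in the prefix is a possible group/topic boundary.
      (1 until tokens.length).map { i =>
        (tokens.take(i).mkString("-"), tokens.drop(i).mkString("-"), metric)
      }
    }

  def main(args: Array[String]): Unit = {
    candidates("af_servers-spm_topic-BytesPerSec").foreach(println)
    candidates("af-servers-spm-topic-BytesPerSec").foreach(println)
  }
}
```

For the underscore-only name this gives exactly one reading. For the dashed name it gives three, and nothing in the name itself says which one is correct.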

This looks like a problem to me, but maybe I’m missing something?

Is it possible to have all these values (group id, topic name) as separate
attributes inside JMX bean?
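
As a rough sketch of what separate attributes could look like: the standard `javax.management.ObjectName` API already supports several key properties per bean. The `group` and `topic` keys and the `StructuredBeanName` object below are hypothetical, chosen only for illustration; they are not what Kafka 0.8.x registers.

```scala
import java.util.Hashtable
import javax.management.ObjectName

object StructuredBeanName {
  // Hypothetical layout: group and topic are their own key properties
  // instead of being concatenated into "name".
  def build(group: String, topic: String, metric: String): ObjectName = {
    val props = new Hashtable[String, String]()
    props.put("type", "ConsumerTopicMetrics")
    // quote() keeps values valid even if they contain reserved characters.
    props.put("group", ObjectName.quote(group))
    props.put("topic", ObjectName.quote(topic))
    props.put("name", metric)
    new ObjectName("kafka.consumer", props)
  }

  // Reads a quoted key property back without any string splitting.
  def read(bean: ObjectName, key: String): String =
    ObjectName.unquote(bean.getKeyProperty(key))

  def main(args: Array[String]): Unit = {
    val bean = build("af-servers", "spm-topic", "BytesPerSec")
    println(read(bean, "group"))
    println(read(bean, "topic"))
    println(bean.getKeyProperty("name"))
  }
}
```

With a layout like this, a monitoring agent would read each value by key, so dashes or underscores inside a group id or topic name would no longer matter.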

Or maybe the problem could be solved if a different delimiter was used,
such as the pipe (“|”)?

This is really needed, and it would be nice to have in order to build a good
monitoring tool.

Thx and best regards from Sematext.

Re: How to parse some of JMX Bean's names.

Posted by Neha Narkhede <ne...@gmail.com>.
Yes, it will require a review and I see that Jun reviewed it.


On Fri, Jun 20, 2014 at 5:03 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> Hi Neha,
>
> The patch is in https://issues.apache.org/jira/browse/KAFKA-1481 . It's
> super-simple.  Does it require a review or can one submit it to Jenkins or
> something else?
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/

Re: How to parse some of JMX Bean's names.

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Neha,

The patch is in https://issues.apache.org/jira/browse/KAFKA-1481 . It's
super-simple.  Does it require a review or can one submit it to Jenkins or
something else?

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/


On Wed, Jun 4, 2014 at 9:47 PM, Neha Narkhede <ne...@gmail.com> wrote:

> Agree that 0.9 is a few months out. 0.8.2, on the other hand, is probably
> less than a month away. On one hand, we are trying our best to minimize the
> set of changes to the existing consumer in the interest of saving time to
> work on the new consumer. Since the old consumer is very complex and has now
> been stable for some time, the motivation is to make fewer changes to it to
> maintain that stability. On the other hand, if there are critical bug
> fixes, it makes sense to patch the old consumer and do a point release.
>
> We would be happy to take a patch from you. How about we look at the size
> of the proposed changes and discuss a release timeline on the JIRA?
>
> Thanks,
> Neha

Re: How to parse some of JMX Bean's names.

Posted by Neha Narkhede <ne...@gmail.com>.
Agree that 0.9 is a few months out. 0.8.2, on the other hand, is probably
less than a month away. On one hand, we are trying our best to minimize the
set of changes to the existing consumer in the interest of saving time to
work on the new consumer. Since the old consumer is very complex and has now
been stable for some time, the motivation is to make fewer changes to it to
maintain that stability. On the other hand, if there are critical bug
fixes, it makes sense to patch the old consumer and do a point release.

We would be happy to take a patch from you. How about we look at the size
of the proposed changes and discuss a release timeline on the JIRA?

Thanks,
Neha


On Wed, Jun 4, 2014 at 3:09 PM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:

> Hi Guozhang,
>
> Since the new consumer is 2-3 months out ... hm, no, looks like 4 months
> out - October -
> https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan
>
> Any way you could change this in 0.8.1.2 or 0.8.2?  We can submit a patch.
>
> Thanks,
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/
>
>
> On Wed, Jun 4, 2014 at 4:20 PM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > I think we will make this change in the new consumer, which may be released
> > in 0.9.
> >
> > Guozhang
> >
> >
> > On Wed, Jun 4, 2014 at 12:13 PM, Vladimir Tretyakov <
> > vladimir.tretyakov@sematext.com> wrote:
> >
> > > Hi, thx Guozhang Wang, looking forward.
> > >
> > > When do you think these changes will be available? 0.8.2 (Jul 2014,
> > > https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan)?
> > > Or later?
> > >
> > > Best regards, Vladimir.
> > >
> > >
> > > On Wed, Jun 4, 2014 at 8:01 PM, Guozhang Wang <wa...@gmail.com> wrote:
> > >
> > > > Hello Vladimir, comments in-lined.
> > > >
> > > >
> > > > On Wed, Jun 4, 2014 at 6:08 AM, Vladimir Tretyakov <
> > > > vladimir.tretyakov@sematext.com> wrote:
> > > >
> > > > > Hi again, a few more questions from me:
> > > > >
> > > > > *1.*
> > > > >
> > > > > What I see in JMX:
> > > > >
> > > > > kafka.consumer:type="ZookeeperConsumerConnector",name="af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize"
> > > > >
> > > > > From code:
> > > > >
> > > > > newGauge(
> > > > >         config.clientId + "-" + config.groupId + "-" + topicThreadId._1 +
> > > > >         "-" + topicThreadId._2 + "-FetchQueueSize",
> > > > >         new Gauge[Int] {
> > > > >           def value = q.size
> > > > >         }
> > > > >       )
> > > > >
> > > > > I've tried to parse the parts as I understand them:
> > > > >
> > > > > config.clientId  >> af_servers
> > > > > topicThreadId._1 >> af_servers-spm_new_cluster_topic
> > > > > topicThreadId._2 >> af_servers_wawanawna-Dell-1401353748289-fcaaea29-0
> > > > >
> > > > > I can suppose that topicThreadId._1 will always look like
> > > > > GROUP_ID+TOPIC and that topicThreadId._2 will contain the CONSUMER HOST,
> > > > > but will that always be true?
> > > >
> > > >
> > > >
> > > > With the new consumer coming soon, the metrics naming schemes will very
> > > > likely be refined, so I cannot say this will always be true.
> > > >
> > > >
> > > >
> > > > >
> > > > > *2.*
> > > > >
> > > > > From the code I see that Kafka sometimes uses "_" as a separator, not only "-":
> > > > >
> > > > > val consumerIdString = {
> > > > >     var consumerUuid : String = null
> > > > >     config.consumerId match {
> > > > >       case Some(consumerId) // for testing only
> > > > >       => consumerUuid = consumerId
> > > > >       case None // generate unique consumerId automatically
> > > > >       => val uuid = UUID.randomUUID()
> > > > >       consumerUuid = "%s-%d-%s".format(
> > > > >         InetAddress.getLocalHost.getHostName, System.currentTimeMillis,
> > > > >         uuid.getMostSignificantBits().toHexString.substring(0,8))
> > > > >     }
> > > > >     config.groupId + "_" + consumerUuid
> > > > >   }
> > > > >
> > > > > That means that if a user has "_" in a host/topic/groupId name, it
> > > > > may be a problem to parse a string like:
> > > > >
> > > > > kafka.consumer:type="ZookeeperConsumerConnector",name="af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize"
> > > > >
> > > > > Look at the part "spm_new_cluster_topic-af_servers_wawanawna-Dell": what
> > > > > is the host name here, "servers_wawanawna-Dell" or "wawanawna-Dell"?
> > > > >
> > > > > So on one hand, if we want to be able to parse names without any problems,
> > > > > we have to avoid using "-" and "_" in host/topic/groupId/clientId, but
> > > > > at the same time I see (from
> > > > > http://grokbase.com/t/kafka/users/133xfsnpdh/cant-use-in-client-name):
> > > > >
> > > > > *"Client id is used for registering jmx beans for monitoring. Because of*
> > > > > *the restrictions in bean names, we limit the client id to be only*
> > > > > *alpha-numeric plus "-" and "_"."*
> > > > >
> > > > > Does that mean users can only use camelCase in their
> > > > > host/topic/groupId/clientId to distinguish one part of a name from
> > > > > another?
> > > > >
> > > > > Is this a problem? Or did I misunderstand something?
> > > > >
> > > >
> > > > Yeah I agree this is a problem, and we should fix it in the new consumer.
> > > >
> > > >
> > > > >
> > > > > Best regards from Sematext.
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Jun 3, 2014 at 3:24 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com> wrote:
> > > > >
> > > > > > Hi Guozhang,
> > > > > >
> > > > > > On Mon, Jun 2, 2014 at 7:18 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > > > > >
> > > > > > > That is indeed a problem. For now, we recommend that group names and
> > > > > > > topic names use "_" where there is a need for "-", but this should be
> > > > > > > fixed systematically.
> > > > > > >
> > > > > >
> > > > > > Right!
> > > > > >
> > > > > > For your use case, could you change your topic/group name to use "_"?
> > > > > >
> > > > > >
> > > > > > Our own Kafka doesn't use topics with "-" characters, so we don't have a
> > > > > > problem.
> > > > > >
> > > > > > The problem, in our case, is that we have a general (Kafka)
> > > monitoring
> > > > > tool
> > > > > > that other people use to monitor Kafka - see
> > > http://sematext.com/spm/
> > > > .
> > > > > >  So
> > > > > > we can't really tell people "hey, our tool will work but only if
> > you
> > > > > don't
> > > > > > have a dash in your topic names and hosts and ... because if you
> > use
> > > > > dashes
> > > > > > we won't know how to parse your Kafka's MBean names" :)
> > > > > >
> > > > > >
> > > > > > > Also, do you mind to file a JIRA ticket to keep track of this
> > > issue?
> > > > > >
> > > > > >
> > > > > > Here it is: https://issues.apache.org/jira/browse/KAFKA-1481
> > > > > >
> > > > > > Otis
> > > > > > --
> > > > > > Performance Monitoring * Log Analytics * Search Analytics
> > > > > > Solr & Elasticsearch Support * http://sematext.com/
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > On Mon, Jun 2, 2014 at 5:18 AM, Vladimir Tretyakov <
> > > > > > > vladimir.tretyakov@sematext.com> wrote:
> > > > > > >
> > > > > > > > Hello everyone,
> > > > > > > >
> > > > > > > > We are adding Kafka 0.8.x monitoring support to SPM
> > > > > > > > <http://sematext.com/spm/> here at Sematext. Unfortunately,
> we
> > > > > quickly
> > > > > > > hit
> > > > > > > > an issue caused by the new bean naming convention that embeds
> > > > things
> > > > > > like
> > > > > > > > topic and host names in the beans along with metrics,
> separated
> > > by
> > > > > > > dashes,
> > > > > > > > making it hard to parse these beans.
> > > > > > > >
> > > > > > > > To put it simply: it is hard/impossible to automatically
> figure
> > > out
> > > > > > which
> > > > > > > > part of the bean name is e.g. consumer group, which is the
> > topic,
> > > > > which
> > > > > > > is
> > > > > > > > the host name, and which is the name of the metric.
> > > > > > > >
> > > > > > > > Let me show you what I mean:
> > > > > > > >
> > > > > > > > kafka.consumer:type="ConsumerTopicMetrics",
> > > > > > > >
> > > > > > > >
>  name="af_servers-spm_topic-BytesPerSec"
> > > > > > > >
> > > > > > > > Here we actually CAN extract:
> > > > > > > >
> > > > > > > >  * consumer group ('af_servers')
> > > > > > > >
> > > > > > > >  * topic ('spm_topic')
> > > > > > > >
> > > > > > > >  * metric (‘BytesPerSec’)
> > > > > > > >
> > > > > > > > BUT what if the consumer group id and/or topic name contain
> > '-'?
> > > > > > > >
> > > > > > > > Then how would we extract consumer group and topic?
> > > > > > > >
> > > > > > > > Here is a concrete example of this problem:
> > > > > > > >
> > > > > > > > kafka.consumer:type="ConsumerTopicMetrics",
> > > > > > > >
> > > > > > > >                       name="af-servers-spm-topic-BytesPerSec"
> > > > > > > >
> > > > > > > > How can we know what is group id or topic name here?
> > > > > > > >
> > > > > > > > This looks like a problem to me, but maybe I’m missing
> > something?
> > > > > > > >
> > > > > > > > Is it possible to have all these values (group id, topic
> name)
> > as
> > > > > > > separate
> > > > > > > > attributes inside JMX bean?
> > > > > > > >
> > > > > > > > Or maybe the problem could be solved if a different delimiter
> > was
> > > > > used,
> > > > > > > > such as the pipe (“I”)?
> > > > > > > >
> > > > > > > > It is really needed things and will be nice to have it to
> build
> > > > good
> > > > > > tool
> > > > > > > > for monitoring.
> > > > > > > >
> > > > > > > > Thx and best regards from Sematext.
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > -- Guozhang
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>

Re: How to parse some of JMX Bean's names.

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Guozhang,

Since the new consumer is 2-3 months out ... hm, no, looks like 4 months
out - October -
https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan

Any way you could change this in 0.8.1.2 or 0.8.2?  We can submit a patch.

Thanks,
Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/



Re: How to parse some of JMX Bean's names.

Posted by Guozhang Wang <wa...@gmail.com>.
I think we will make this change in the new consumer, which may be released
in 0.9.

Guozhang





-- 
-- Guozhang

Re: How to parse some of JMX Bean's names.

Posted by Vladimir Tretyakov <vl...@sematext.com>.
Hi, thanks Guozhang Wang, looking forward to it.

When do you think these changes will be available? In 0.8.2 (Jul 2014, per
https://cwiki.apache.org/confluence/display/KAFKA/Future+release+plan )? Or
later?

Best regards, Vladimir.



Re: How to parse some of JMX Bean's names.

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Vladimir, comments inline.


On Wed, Jun 4, 2014 at 6:08 AM, Vladimir Tretyakov <
vladimir.tretyakov@sematext.com> wrote:

> Hi again, a few more questions from me:
>
> *1.*
>
> What I see in JMX:
>
>
> kafka.consumer:type="ZookeeperConsumerConnector",name="af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize"
>
> From code:
>
> newGauge(
>         config.clientId + "-" + config.groupId + "-" + topicThreadId._1 +
> "-" + topicThreadId._2 + "-FetchQueueSize",
>         new Gauge[Int] {
>           def value = q.size
>         }
>       )
>
> I've tried to parse the parts as I understand them.
>
> config.clientId       >> af_servers
> topicThreadId._1 >> af_servers-spm_new_cluster_topic
> topicThreadId._2 >> af_servers_wawanawna-Dell-1401353748289-fcaaea29-0
>
> Yes, I can suppose that topicThreadId._1 will always look like
> GROUP_ID+TOPIC and topicThreadId._2 will contain the CONSUMER HOST, but will
> that always be true?



With the new consumer coming soon, the metrics naming schemes will very
likely be refined, so I cannot say this will always be true.
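
To make the ambiguity concrete, here is a minimal sketch (not Kafka code). It
assumes the four values are concatenated in the order the quoted newGauge call
lists them, and that topicThreadId._1 holds the topic and topicThreadId._2 the
consumer thread id. GaugeNameAmbiguity and gaugeName are hypothetical names
used only for illustration; the sample values come from the bean name above.

```scala
// Illustrative sketch only: GaugeNameAmbiguity and gaugeName are hypothetical,
// and the sample values are taken from the bean name quoted in this thread.
object GaugeNameAmbiguity {
  // Mirrors the string concatenation in the quoted newGauge call.
  def gaugeName(clientId: String, groupId: String, topic: String, threadId: String): String =
    clientId + "-" + groupId + "-" + topic + "-" + threadId + "-FetchQueueSize"

  def main(args: Array[String]): Unit = {
    val name = gaugeName(
      "af_servers", "af_servers", "spm_new_cluster_topic",
      "af_servers_wawanawna-Dell-1401353748289-fcaaea29-0")
    // Five logical fields go in (four values plus the metric name), but the
    // thread id carries dashes of its own, so splitting yields more tokens.
    val tokens = name.split("-")
    println(s"logical fields: 5, dash-separated tokens: ${tokens.length}")

    // Two different (groupId, topic) pairs that produce the same bean name,
    // so no parser can tell them apart from the name alone.
    val a = gaugeName("c", "af-servers", "spm-topic", "t")
    val b = gaugeName("c", "af", "servers-spm-topic", "t")
    println(s"same name for different inputs: ${a == b}")
  }
}
```

The second check is the important one: once a group id or topic may contain
the delimiter, the mapping from fields to name is no longer reversible, no
matter how clever the parser is.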



>
> *2.*
>
> From the code I see that Kafka sometimes uses "_" as a separator, not only "-":
>
> val consumerIdString = {
>     var consumerUuid : String = null
>     config.consumerId match {
>       case Some(consumerId) // for testing only
>       => consumerUuid = consumerId
>       case None // generate unique consumerId automatically
>       => val uuid = UUID.randomUUID()
>       consumerUuid = "%s-%d-%s".format(
>         InetAddress.getLocalHost.getHostName, System.currentTimeMillis,
>         uuid.getMostSignificantBits().toHexString.substring(0,8))
>     }
>     config.groupId + "_" + consumerUuid
>   }
>
> That means that if a user uses "_" as part of a host/topic/groupId name, it
> may be a problem to parse a string like:
>
>
> kafka.consumer:type="ZookeeperConsumerConnector",name="af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize"
>
> Look at the part "spm_new_cluster_topic-af_servers_wawanawna-Dell": what is
> the host name here, "servers_wawanawna-Dell" or "wawanawna-Dell"?
>
> So on the one hand, if we want to be able to parse the name without any
> problems, we have to avoid using "-" and "_" in host/topic/groupId/clientId,
> but at the same time I see (from
> http://grokbase.com/t/kafka/users/133xfsnpdh/cant-use-in-client-name):
>
> *"Client id is used for registering jmx beans for monitoring. Because of
> the restrictions in bean names, we limit the client id to be only
> alpha-numeric plus "-" and "_"."*
>
> Does that mean a user can use only camelCase in
> host/topic/groupId/clientId to distinguish one part of the name from another?
>
> Is this a problem? Or did I misunderstand something?
>

Yeah I agree this is a problem, and we should fix it in the new consumer.
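
For reference, one way to read the "separate attributes" suggestion from the
original post is to carry each value as its own key property of the JMX
ObjectName. The sketch below only illustrates that idea: the keys groupId,
topic and name are assumptions made up for this example, not the layout that
Kafka 0.8.x registers, and KeyPropertySketch is a hypothetical helper. It uses
only the standard javax.management.ObjectName API.

```scala
import javax.management.ObjectName

// Hypothetical naming scheme: the keys groupId, topic and name are assumptions
// for illustration, not what Kafka 0.8.x actually registers.
object KeyPropertySketch {
  // Returns (groupId, topic, metric) read from separate key properties.
  def parse(beanName: String): (String, String, String) = {
    val objectName = new ObjectName(beanName)
    (objectName.getKeyProperty("groupId"),
     objectName.getKeyProperty("topic"),
     objectName.getKeyProperty("name"))
  }

  def main(args: Array[String]): Unit = {
    // Dashes inside the values are harmless because nothing is split on "-".
    val (groupId, topic, metric) = parse(
      "kafka.consumer:type=ConsumerTopicMetrics,groupId=af-servers,topic=spm-topic,name=BytesPerSec")
    println(s"groupId=$groupId topic=$topic metric=$metric")
  }
}
```

With a layout like this a monitoring tool would not need to guess where one
field ends and the next begins, since the JMX API returns each value by key.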


>
> Best regards from Sematext.
>
>
>
>
>
>
> On Tue, Jun 3, 2014 at 3:24 AM, Otis Gospodnetic <
> otis.gospodnetic@gmail.com
> > wrote:
>
> > Hi Guozhang,
> >
> > On Mon, Jun 2, 2014 at 7:18 PM, Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> > > That is indeed a problem, for now, we recommend group name and topic
> > names
> > > to use "_" when there is a need for "-", but this should be fixed
> > > systematically.
> > >
> >
> > Right!
> >
> > For you use case, could you change your topic/group name using "_"?
> >
> >
> > Our own Kafka doesn't use topics with "-" characters, so we don't have a
> > problem.
> >
> > The problem, in our case, is that we have a general (Kafka) monitoring
> tool
> > that other people use to monitor Kafka - see http://sematext.com/spm/ .
> >  So
> > we can't really tell people "hey, our tool will work but only if you
> don't
> > have a dash in your topic names and hosts and ... because if you use
> dashes
> > we won't know how to parse your Kafka's MBean names" :)
> >
> >
> > > Also, do you mind to file a JIRA ticket to keep track of this issue?
> >
> >
> > Here it is: https://issues.apache.org/jira/browse/KAFKA-1481
> >
> > Otis
> > --
> > Performance Monitoring * Log Analytics * Search Analytics
> > Solr & Elasticsearch Support * http://sematext.com/
> >



-- 
-- Guozhang

Re: How to parse some of JMX Bean's names.

Posted by Vladimir Tretyakov <vl...@sematext.com>.
Hi again, a few more questions from me:

*1.*

What I see in JMX:

kafka.consumer:type="ZookeeperConsumerConnector",name="af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize"

From code:

newGauge(
  config.clientId + "-" + config.groupId + "-" + topicThreadId._1 + "-" +
    topicThreadId._2 + "-FetchQueueSize",
  new Gauge[Int] {
    def value = q.size
  }
)

I've tried to parse the parts as I understand them:

config.clientId       >> af_servers
topicThreadId._1 >> af_servers-spm_new_cluster_topic
topicThreadId._2 >> af_servers_wawanawna-Dell-1401353748289-fcaaea29-0

Yes, I can assume that topicThreadId._1 will always look like
GROUP_ID+TOPIC and that topicThreadId._2 will contain the CONSUMER HOST, but
will that always be true?
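
A minimal sketch of one reading that reproduces the bean name above. It
assumes (this is not confirmed anywhere in the thread) that topicThreadId._1
is the topic and topicThreadId._2 is the consumer thread id. Under that
reading the second "af_servers" is config.groupId rather than part of
topicThreadId._1. GaugeNameSketch and fetchQueueSizeName are hypothetical
names used only for illustration:

```scala
// Hypothetical helper that mirrors the string concatenation in the
// newGauge call above; it is not part of Kafka.
object GaugeNameSketch {
  def fetchQueueSizeName(clientId: String, groupId: String,
                         topic: String, threadId: String): String =
    clientId + "-" + groupId + "-" + topic + "-" + threadId + "-FetchQueueSize"

  // Values read off the bean name seen in JMX (an assumed split).
  val observed: String = fetchQueueSizeName(
    clientId = "af_servers",
    groupId  = "af_servers",
    topic    = "spm_new_cluster_topic",
    threadId = "af_servers_wawanawna-Dell-1401353748289-fcaaea29-0")
}
```

Composing the name is easy; the open question is the reverse direction,
because nothing in the resulting string marks where one part ends.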


*2.*

From the code I see that Kafka sometimes uses "_" as a separator, not only "-":

val consumerIdString = {
    var consumerUuid : String = null
    config.consumerId match {
      case Some(consumerId) // for testing only
      => consumerUuid = consumerId
      case None // generate unique consumerId automatically
      => val uuid = UUID.randomUUID()
      consumerUuid = "%s-%d-%s".format(
        InetAddress.getLocalHost.getHostName, System.currentTimeMillis,
        uuid.getMostSignificantBits().toHexString.substring(0,8))
    }
    config.groupId + "_" + consumerUuid
  }

That means that if a user has "_" in a host/topic/groupId name, it may be a
problem to parse a string like:

kafka.consumer:type="ZookeeperConsumerConnector",name="af_servers-af_servers-spm_new_cluster_topic-af_servers_wawanawna-Dell-1401353748289-fcaaea29-0-FetchQueueSize"

Look at the part "spm_new_cluster_topic-af_servers_wawanawna-Dell": what is
the host name here, "servers_wawanawna-Dell" or "wawanawna-Dell"?
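
To make the ambiguity concrete, here is a small sketch that mirrors the
automatic branch of consumerIdString above. The host, timestamp and UUID
prefix are passed in as parameters (an assumption made so that the result is
deterministic), and ConsumerIdSketch is a hypothetical name:

```scala
// Hypothetical stand-in for the None branch of consumerIdString above.
object ConsumerIdSketch {
  def consumerId(groupId: String, host: String,
                 millis: Long, uuidPrefix: String): String =
    groupId + "_" + "%s-%d-%s".format(host, millis, uuidPrefix)

  // Two different (groupId, host) pairs ...
  val a: String = consumerId("af_servers", "wawanawna-Dell", 1401353748289L, "fcaaea29")
  val b: String = consumerId("af", "servers_wawanawna-Dell", 1401353748289L, "fcaaea29")
  // ... produce the same string, so the split cannot be recovered from it.
}
```

Both values come out as "af_servers_wawanawna-Dell-1401353748289-fcaaea29",
which is why the host name cannot be told apart from the group id here.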

So on the one hand, if we want to be able to parse the name without any
problems, we have to avoid using "-" and "_" in host/topic/groupId/clientId,
but at the same time I see (from
http://grokbase.com/t/kafka/users/133xfsnpdh/cant-use-in-client-name):

*"Client id is used for registering jmx beans for monitoring. Because of the
restrictions in bean names, we limit the client id to be only alpha-numeric
plus "-" and "_"."*

Does that mean a user can only use camelCase in host/topic/groupId/clientId
to distinguish one part of the name from another?

Is this a problem, or have I misunderstood something?

Best regards from Sematext.






On Tue, Jun 3, 2014 at 3:24 AM, Otis Gospodnetic <otis.gospodnetic@gmail.com
> wrote:

> Hi Guozhang,
>
> On Mon, Jun 2, 2014 at 7:18 PM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > That is indeed a problem, for now, we recommend group name and topic
> names
> > to use "_" when there is a need for "-", but this should be fixed
> > systematically.
> >
>
> Right!
>
> For you use case, could you change your topic/group name using "_"?
>
>
> Our own Kafka doesn't use topics with "-" characters, so we don't have a
> problem.
>
> The problem, in our case, is that we have a general (Kafka) monitoring tool
> that other people use to monitor Kafka - see http://sematext.com/spm/ .
>  So
> we can't really tell people "hey, our tool will work but only if you don't
> have a dash in your topic names and hosts and ... because if you use dashes
> we won't know how to parse your Kafka's MBean names" :)
>
>
> > Also, do you mind to file a JIRA ticket to keep track of this issue?
>
>
> Here it is: https://issues.apache.org/jira/browse/KAFKA-1481
>
> Otis
> --
> Performance Monitoring * Log Analytics * Search Analytics
> Solr & Elasticsearch Support * http://sematext.com/

Re: How to parse some of JMX Bean's names.

Posted by Otis Gospodnetic <ot...@gmail.com>.
Hi Guozhang,

On Mon, Jun 2, 2014 at 7:18 PM, Guozhang Wang <wa...@gmail.com> wrote:

> That is indeed a problem, for now, we recommend group name and topic names
> to use "_" when there is a need for "-", but this should be fixed
> systematically.
>

Right!

> For your use case, could you change your topic/group name using "_"?


Our own Kafka doesn't use topics with "-" characters, so we don't have a
problem.

The problem, in our case, is that we have a general (Kafka) monitoring tool
that other people use to monitor Kafka - see http://sematext.com/spm/ .  So
we can't really tell people "hey, our tool will work but only if you don't
have a dash in your topic names and hosts and ... because if you use dashes
we won't know how to parse your Kafka's MBean names" :)


> Also, do you mind to file a JIRA ticket to keep track of this issue?


Here it is: https://issues.apache.org/jira/browse/KAFKA-1481

Otis
--
Performance Monitoring * Log Analytics * Search Analytics
Solr & Elasticsearch Support * http://sematext.com/






Re: How to parse some of JMX Bean's names.

Posted by Guozhang Wang <wa...@gmail.com>.
That is indeed a problem. For now, we recommend that group and topic names
use "_" wherever a "-" would otherwise be needed, but this should be fixed
systematically.

For your use case, could you change your topic/group names to use "_"? Also,
would you mind filing a JIRA ticket to keep track of this issue?

Guozhang


On Mon, Jun 2, 2014 at 5:18 AM, Vladimir Tretyakov <
vladimir.tretyakov@sematext.com> wrote:

> Hello everyone,
>
> We are adding Kafka 0.8.x monitoring support to SPM
> <http://sematext.com/spm/> here at Sematext. Unfortunately, we quickly hit
> an issue caused by the new bean naming convention that embeds things like
> topic and host names in the beans along with metrics, separated by dashes,
> making it hard to parse these beans.
>
> To put it simply: it is hard/impossible to automatically figure out which
> part of the bean name is e.g. consumer group, which is the topic, which is
> the host name, and which is the name of the metric.
>
> Let me show you what I mean:
>
> kafka.consumer:type="ConsumerTopicMetrics",
>
>                        name="af_servers-spm_topic-BytesPerSec"
>
> Here we actually CAN extract:
>
>  * consumer group ('af_servers')
>
>  * topic ('spm_topic')
>
>  * metric (‘BytesPerSec’)
>
> BUT what if the consumer group id and/or topic name contain '-'?
>
> Then how would we extract consumer group and topic?
>
> Here is a concrete example of this problem:
>
> kafka.consumer:type="ConsumerTopicMetrics",
>
>                       name="af-servers-spm-topic-BytesPerSec"
>
> How can we know what is group id or topic name here?
>
> This looks like a problem to me, but maybe I’m missing something?
>
> Is it possible to have all these values (group id, topic name) as separate
> attributes inside JMX bean?
>
> Or maybe the problem could be solved if a different delimiter was used,
> such as the pipe (“I”)?
>
> It is really needed things and will be nice to have it to build good tool
> for monitoring.
>
> Thx and best regards from Sematext.
>



-- 
-- Guozhang

Fwd: How to parse some of JMX Bean's names.

Posted by Vladimir Tretyakov <vl...@sematext.com>.
Hello everyone,

We are adding Kafka 0.8.x monitoring support to SPM
<http://sematext.com/spm/> here at Sematext. Unfortunately, we quickly hit
an issue caused by the new bean naming convention that embeds things like
topic and host names in the beans along with metrics, separated by dashes,
making it hard to parse these beans.

To put it simply: it is hard/impossible to automatically figure out which
part of the bean name is e.g. consumer group, which is the topic, which is
the host name, and which is the name of the metric.

Let me show you what I mean:

kafka.consumer:type="ConsumerTopicMetrics",name="af_servers-spm_topic-BytesPerSec"

Here we actually CAN extract:

 * consumer group ('af_servers')

 * topic ('spm_topic')

 * metric ('BytesPerSec')

BUT what if the consumer group id and/or topic name contain '-'?

Then how would we extract consumer group and topic?

Here is a concrete example of this problem:

kafka.consumer:type="ConsumerTopicMetrics",name="af-servers-spm-topic-BytesPerSec"

How can we know which part is the group id and which is the topic name here?
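
As an illustration, the sketch below lists every (group, topic) pair that
such a name could stand for. It assumes the layout "group-topic-metric" and
that the metric name itself contains no dash; SplitSketch and candidates are
hypothetical names:

```scala
// Hypothetical helper: enumerate every way to split "group-topic-metric".
object SplitSketch {
  def candidates(name: String): Seq[(String, String)] = {
    val end = name.lastIndexOf('-')            // start of "-<metric>"
    if (end < 0) Seq.empty
    else {
      val prefix = name.substring(0, end)      // "group-topic" part
      prefix.indices
        .filter(i => prefix.charAt(i) == '-')  // every dash is a possible boundary
        .map(i => (prefix.substring(0, i), prefix.substring(i + 1)))
    }
  }
}
```

For "af_servers-spm_topic-BytesPerSec" this yields exactly one candidate, but
for "af-servers-spm-topic-BytesPerSec" it yields three, and nothing in the
name says which one is right.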

This looks like a problem to me, but maybe I’m missing something?

Is it possible to have all these values (group id, topic name) as separate
attributes inside JMX bean?
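
A rough sketch of what that could look like from the monitoring side, using
the standard javax.management.ObjectName API. The key names "groupId" and
"topic" are assumptions made up for this example, not names Kafka is known to
use, and TaggedBeanSketch is a hypothetical name:

```scala
import javax.management.ObjectName

// Hypothetical layout: each value is its own key property instead of
// being packed into one dash-separated "name" value.
object TaggedBeanSketch {
  val bean = new ObjectName(
    "kafka.consumer:type=ConsumerTopicMetrics,name=BytesPerSec," +
      "groupId=af-servers,topic=spm-topic")

  // Each part is read back by key, so dashes inside values do no harm.
  val groupId: String = bean.getKeyProperty("groupId")
  val topic: String   = bean.getKeyProperty("topic")
  val metric: String  = bean.getKeyProperty("name")
}
```

With a layout like this no string splitting is needed at all, because the
JMX object name already carries the boundaries between the values.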

Or maybe the problem could be solved if a different delimiter was used,
such as the pipe ("|")?

This is really needed, and it would be nice to have in order to build a good
monitoring tool.

Thx and best regards from Sematext.