Posted to dev@samza.apache.org by David Yu <da...@optimizely.com> on 2016/02/24 01:51:35 UTC

Understand Samza default metrics

Hi,

Where can I find detailed descriptions of the out-of-the-box metrics
provided by MetricsSnapshotReporterFactory and JmxReporterFactory?

I'm interested in seeing the basic metrics of my Samza job (e.g.
messages_processed_per_sec), but it's hard to pinpoint the specific
metric that shows me that.

Thanks,
David

Re: Understand Samza default metrics

Posted by David Yu <da...@optimizely.com>.
Thanks guys. This is really helpful! I assume we have a plan to publish
these docs on the Samza wiki.

-David

On Wed, Feb 24, 2016 at 10:34 AM, Xinyu Liu <xi...@linkedin.com.invalid>
wrote:

> Thanks, Shadi. The doc is really useful!
>
> @Milinda: thanks for pointing that out. "process-calls" includes both
> "process-envelopes" and "process-null-envelopes", so David's example
> should use "process-envelopes".
>
> Thanks,
> Xinyu

Re: Understand Samza default metrics

Posted by Xinyu Liu <xi...@linkedin.com.INVALID>.
Thanks, Shadi. The doc is really useful!

@Milinda: thanks for pointing that out. "process-calls" includes both
"process-envelopes" and "process-null-envelopes", so David's example
should use "process-envelopes".
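
To make that relationship concrete, here is a minimal Python sketch. It assumes the two counters have already been read from the same snapshot; the function name and the sample numbers are hypothetical, not part of Samza.

```python
def idle_call_fraction(process_calls, process_envelopes):
    """Fraction of run-loop process calls that handled no message.

    Assumes process-calls = process-envelopes + process-null-envelopes,
    as described in this thread.
    """
    if process_calls == 0:
        return 0.0
    null_envelopes = process_calls - process_envelopes
    return null_envelopes / process_calls


# Hypothetical counter values taken from one snapshot.
print(idle_call_fraction(1500, 1200))
```

A high fraction would suggest that the container is mostly polling an empty input stream, which is why a rate based on "process-calls" can overstate throughput.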

Thanks,
Xinyu

On Wed, Feb 24, 2016 at 9:52 AM, Abdollahian Noghabi, Shadi <
abdolla2@illinois.edu> wrote:

> I have attached the document to SAMZA-702.<
> https://issues.apache.org/jira/browse/SAMZA-702>

Re: Understand Samza default metrics

Posted by "Abdollahian Noghabi, Shadi" <ab...@illinois.edu>.
I have attached the document to SAMZA-702.<https://issues.apache.org/jira/browse/SAMZA-702>


On Feb 24, 2016, at 9:33 AM, Milinda Pathirage <mp...@umail.iu.edu> wrote:

Hi Shadi,

The attachment is not in your mail; I think the mailing list dropped it.
IMHO, we should create a JIRA issue and attach the doc to it so that we
can move it into the Samza docs.


Re: Understand Samza default metrics

Posted by Milinda Pathirage <mp...@umail.iu.edu>.
Hi Shadi,

The attachment is not in your mail; I think the mailing list dropped it.
IMHO, we should create a JIRA issue and attach the doc to it so that we
can move it into the Samza docs.

On Wed, Feb 24, 2016 at 12:27 PM, Abdollahian Noghabi, Shadi <
abdolla2@illinois.edu> wrote:

> I have a document with some of the metrics. I had gathered these around
> last summer, so they may be out-of-date. I have attached the document to
> this email. Hope it can help.


-- 
Milinda Pathirage

PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University

twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org

Re: Understand Samza default metrics

Posted by Jagadish Venkatraman <ja...@gmail.com>.
Not sure if this mail server strips attachments. Could you upload the doc
to JIRA and link it here?

On Wednesday, February 24, 2016, Abdollahian Noghabi, Shadi <
abdolla2@illinois.edu> wrote:

> I have a document with some of the metrics. I had gathered these around
> last summer, so they may be out-of-date. I have attached the document to
> this email. Hope it can help.

-- 
Sent from my iphone.

Re: Understand Samza default metrics

Posted by "Abdollahian Noghabi, Shadi" <ab...@illinois.edu>.
I have a document with some of the metrics. I had gathered these around last summer, so they may be out-of-date. I have attached the document to this email. Hope it can help.






> On Feb 24, 2016, at 7:10 AM, Milinda Pathirage <mp...@umail.iu.edu> wrote:
>
> Hi David and Xinyu,
>
> If you want the number of messages processed, "process-envelopes" is the
> correct metric. "process-calls" measures the number of times the
> RunLoop#process method is called, so it is updated even when no message is
> processed (this happens when there are no new messages in the input
> stream). "process-ns" can be used as the average time taken to process a
> message, but that average also includes the time spent on null messages,
> so I don't trust the accuracy of that metric.
>
> Each metric emitted by Samza contains a header that includes the job name,
> job id, container name and metric timestamp. You can use the timestamp to
> calculate messages-per-second values.
>
> If you are using a KV store, KeyValueStoreMetrics contains metrics such as
> bytes read, bytes written, puts and gets for each store.
>
> Thanks
> Milinda


Re: Understand Samza default metrics

Posted by Milinda Pathirage <mp...@umail.iu.edu>.
Hi David and Xinyu,

If you want the number of messages processed, "process-envelopes" is the
correct metric. "process-calls" measures the number of times the
RunLoop#process method is called, so it is updated even when no message is
processed (this happens when there are no new messages in the input
stream). "process-ns" can be used as the average time taken to process a
message, but that average also includes the time spent on null messages,
so I don't trust the accuracy of that metric.

Each metric emitted by Samza contains a header that includes the job name,
job id, container name and metric timestamp. You can use the timestamp to
calculate messages-per-second values.
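
As a rough illustration of that calculation, the Python sketch below derives a rate from two snapshots of the same container. It assumes the header timestamp is in epoch milliseconds and that "process-envelopes" is a cumulative counter; the dictionary layout, function name and sample values are hypothetical.

```python
def messages_per_second(earlier, later, counter="process-envelopes"):
    """Rate between two snapshots of the same container.

    Each snapshot is a dict holding "time" (assumed epoch milliseconds,
    taken from the snapshot header) and the cumulative counter value.
    """
    elapsed_ms = later["time"] - earlier["time"]
    if elapsed_ms <= 0:
        raise ValueError("snapshots must be in increasing time order")
    delta = later[counter] - earlier[counter]
    if delta < 0:
        # The counter went backwards, e.g. after a container restart.
        raise ValueError("counter reset between snapshots")
    return delta / (elapsed_ms / 1000.0)


# Hypothetical snapshots taken 60 seconds apart.
first = {"time": 1456275095000, "process-envelopes": 1200}
second = {"time": 1456275155000, "process-envelopes": 7200}
print(messages_per_second(first, second))
```

The sketch rejects a negative delta instead of guessing, since a restarted container would otherwise produce a misleading rate.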

If you are using a KV store, KeyValueStoreMetrics contains metrics such as
bytes read, bytes written, puts and gets for each store.

Thanks
Milinda

On Tue, Feb 23, 2016 at 8:26 PM, xinyu liu <xi...@gmail.com> wrote:

> Hi, David,
>
> I didn't find a wiki page that contains the descriptions of all Samza
> metrics. You can find the basic metrics by googling the following classes:
> SamzaContainerMetrics, TaskInstanceMetrics, SystemConsumersMetrics and
> SystemProducersMetrics. For your example, you can use the "process-calls"
> in SamzaContainerMetrics to get the processed message count, and divide the
> delta by time to get the messages processed per sec. In practice, you can
> either use JConsole to connect to the running Samza container or consume
> the MetricsSnapshot topic to get the detailed metrics.
>
> Thanks,
> Xinyu



-- 
Milinda Pathirage

PhD Student | Research Assistant
School of Informatics and Computing | Data to Insight Center
Indiana University

twitter: milindalakmal
skype: milinda.pathirage
blog: http://milinda.pathirage.org

Re: Understand Samza default metrics

Posted by xinyu liu <xi...@gmail.com>.
Hi, David,

I didn't find a wiki page that contains the descriptions of all Samza
metrics. You can find the basic metrics by googling the following classes:
SamzaContainerMetrics, TaskInstanceMetrics, SystemConsumersMetrics and
SystemProducersMetrics. For your example, you can use the "process-calls"
in SamzaContainerMetrics to get the processed message count, and divide the
delta by time to get the messages processed per sec. In practice, you can
either use JConsole to connect to the running Samza container or consume
the MetricsSnapshot topic to get the detailed metrics.
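
For the second option, a minimal Python sketch of reading one counter out of a single snapshot message might look like the following. It assumes the message is JSON with a "header" and a "metrics" section grouped by metrics class; the exact key names, the group name and the sample payload are assumptions for illustration, so check them against the messages your job actually emits.

```python
import json

# Hypothetical snapshot payload; the key names and group name are assumed.
SAMPLE = """
{"header": {"job-name": "my-job", "job-id": "1",
            "container-name": "samza-container-0", "time": 1456275095000},
 "metrics": {"org.apache.samza.container.SamzaContainerMetrics":
             {"process-calls": 1500, "process-envelopes": 1200}}}
"""

CONTAINER_GROUP = "org.apache.samza.container.SamzaContainerMetrics"


def read_counter(raw, group, name):
    """Return (container_name, timestamp, counter_value) from one snapshot."""
    snapshot = json.loads(raw)
    header = snapshot["header"]
    value = snapshot["metrics"][group][name]
    return header["container-name"], header["time"], value


print(read_counter(SAMPLE, CONTAINER_GROUP, "process-calls"))
```

Collecting these tuples per container over time gives the deltas needed for a per-second rate.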

Thanks,
Xinyu

On Tue, Feb 23, 2016 at 4:51 PM, David Yu <da...@optimizely.com> wrote:

> Hi,
>
> Where can I find detailed descriptions of the out-of-the-box metrics
> provided by MetricsSnapshotReporterFactory and JmxReporterFactory?
>
> I'm interested in seeing the basic metrics of my Samza job (e.g.
> messages_processed_per_sec), but it's hard to pinpoint the specific
> metric that shows me that.
>
> Thanks,
> David
>