Posted to dev@beam.apache.org by Alex Amato <aj...@google.com> on 2018/04/04 20:53:34 UTC

Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Hello Beam community,

Thank you everyone for your initial feedback on this proposal so far. I
have made some revisions based on that feedback. There were some larger
questions asking about alternatives; for each of these I have added a
section tagged with [Alternatives] and discussed my recommendation as well
as a few other choices we considered.

I would appreciate more feedback on the revised proposal. Please take
another look and let me know what you think:
https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit

Etienne, I would appreciate it if you could please take another look after
the revisions I have made as well.

Thanks again,
Alex

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
Hello,

I have rewritten most of the proposal, though I think some more research is
needed to get the metric specification right. I plan to do that research,
and would like to ask you all for more help to make this proposal better.
In particular, now that the metric format is by default designed to allow
metrics to pass through to monitoring collection systems such as Dropwizard
and Stackdriver, the metrics need to be complete enough to be compatible
with these systems.

I think some changes will be needed to fulfill this, but I wanted to send
out this document, which contains the general idea, and continue refining
it.

Please take a look and let me know what you think.
https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit

Major Revision: April 17, 2018

The design has been reworked to use a metric format which resembles the
Dropwizard and Stackdriver formats, allowing metrics to be passed through.

The generic bytes payload style of metrics is still available but is
reserved for complex use cases which do not fit into these typical metrics
collection systems.

Note: This document isn't 100% complete; there are a few areas which need
to be improved, and through our discussion and more research I want to
fill in these details. Please share any thoughts that you have.

   1. The metric specification and Metric proto schemas may need revisions:
      1. The distribution format needs to be refined so that it is
      compatible with Stackdriver and Dropwizard; the current example
      format may need a second distribution format alongside it (a rough
      sketch of one possibility follows this list).
      2. Annotations need to be examined in detail, to determine whether
      there are first-class annotations which should be supported so that
      they pass through properly to Dropwizard and Stackdriver.
      3. Aggregation functions may need parameters. For example, Top(n)
      may need to be parameterized. How should this best be supported?
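
To make this concrete, here is a minimal proto3 sketch of one possible
distribution payload; the message and field names are placeholders for
discussion, not the schema in the doc. It carries simple summary statistics
plus optional explicit buckets, roughly in the style Stackdriver
distributions use:

syntax = "proto3";

package beam.fn.metrics.sketch;

// Hypothetical distribution payload; names are illustrative only.
message DistributionData {
  int64 count = 1;   // number of reported values
  double sum = 2;    // sum of reported values
  double min = 3;
  double max = 4;

  // Optional histogram for sinks that want bucketed data. bucket_counts
  // has one entry per interval defined by bucket_bounds, plus an
  // underflow and an overflow bucket.
  repeated double bucket_bounds = 5;
  repeated int64 bucket_counts = 6;
}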






On Tue, Apr 17, 2018 at 11:10 AM Ben Chambers <bc...@apache.org> wrote:

> That sounds like a very reasonable choice -- given the discussion seemed
> to be focusing on the differences between these two categories, separating
> them will allow the proposal (and implementation) to address each category
> in the best way possible without needing to make compromises.
>
> Looking forward to the updated proposal.
>
> On Tue, Apr 17, 2018 at 10:53 AM Alex Amato <aj...@google.com> wrote:
>
>> Hello,
>>
>> I just wanted to give an update .
>>
>> After some discussion, I've realized that its best to break up the two
>> concepts, with two separate way of reporting monitoring data. These two
>> categories are:
>>
>>    1. Metrics - Counters, Gauges, Distributions. These are well defined
>>    concepts for monitoring information and ned to integrate with existing
>>    metrics collection systems such as Dropwizard and Stackdriver. Most metrics
>>    will go through this model, which will allow runners to process new metrics
>>    without adding extra code to support them, forwarding them to metric
>>    collection systems.
>>    2. Monitoring State - This supports general monitoring data which may
>>    not fit into the standard model for Metrics. For example an I/O source may
>>    provide a table of filenames+metadata, for files which are old and blocking
>>    the system. I will propose a general approach, similar to the URN+payload
>>    approach used in the doc right now.
>>
>> One thing to keep in mind -- even though it makes sense to allow each I/O
> source to define their own monitoring state, this then shifts
> responsibility for collecting that information to each runner and
> displaying that information to every consumer. It would be reasonable to
> see if there could be a set of 10 or so that covered most of the cases that
> could become the "standard" set (eg., watermark information, performance
> information, etc.).
>
>
>> I will rewrite most of the doc and propose separating these two very
>> different use cases, one which optimizes for integration with existing
>> monitoring systems. The other which optimizes for flexibility, allowing
>> more complex and custom metrics formats for other debugging scenarios.
>>
>> I just wanted to give a brief update on the direction of this change,
>> before writing it up in full detail.
>>
>>
>> On Mon, Apr 16, 2018 at 10:36 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> I agree that the user/system dichotomy is false, the real question of
>>> how counters can be scoped to avoid accidental (or even intentional)
>>> interference. A system that entirely controls the interaction between the
>>> "user" (from its perspective) and the underlying system can do this by
>>> prefixing all requested "user" counters with a prefix it will not use
>>> itself. Of course this breaks down whenever the wrapping isn't complete
>>> (either on the production or consumption side), but may be worth doing for
>>> some components (like the SDKs that value being able to provide this
>>> isolation for better behavior). Actual (human) end users are likely to be
>>> much less careful about avoiding conflicts than library authors who in turn
>>> are generally less careful than authors of the system itself.
>>>
>>> We could alternatively allow for specifying fully qualified URNs for
>>> counter names in the SDK APIs, and letting "normal" user counters be in the
>>> empty namespace rather than something like beam:metrics:{user,other,...},
>>> perhaps with SDKs prohibiting certain conflicting prefixes (which is less
>>> than ideal). A layer above the SDK that has similar absolute control over
>>> its "users" would have a similar decision to make.
>>>
>>>
>>> On Sat, Apr 14, 2018 at 4:00 PM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>> One reason I resist the user/system distinction is that Beam is a
>>>> multi-party system with at least SDK, runner, and pipeline. Often there may
>>>> be a DSL like SQL or Scio, or similarly someone may be building a platform
>>>> for their company where there is no user authoring the pipeline. Should
>>>> Scio, SQL, or MyCompanyFramework metrics end up in "user"? Who decides to
>>>> tack on the prefix? It looks like it is the SDK harness? Are there just
>>>> three namespaces "runner", "sdk", and "user"?  Most of what you'd
>>>> think of as "user" version "system" should simply be the different between
>>>> dynamically defined & typed metrics and fields in control plane protos. If
>>>> that layer of the namespaces is not finite and limited, who can extend make
>>>> a valid extension? Just some questions that I think would flesh out the
>>>> meaning of the "user" prefix.
>>>>
>>>> Kenn
>>>>
>>>> On Fri, Apr 13, 2018 at 5:26 PM Andrea Foegler <fo...@google.com>
>>>> wrote:
>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks, Robert!
>>>>>>>
>>>>>>> I think my lack of clarity is around the MetricSpec.  Maybe what's
>>>>>>> in my head and what's being proposed are the same thing.  When I read that
>>>>>>> the MetricSpec describes the proto structure, that sound kind of
>>>>>>> complicated to me.  But I may be misinterpreting it.  What I picture is
>>>>>>> something like a MetricSpec that looks like (note: my picture looks a lot
>>>>>>> like Stackdriver :):
>>>>>>>
>>>>>>> {
>>>>>>> name: "my_timer"
>>>>>>>
>>>>>>
>>>>>> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to
>>>>>> keep requiring namespaces). Or "beam:metric:[some non-user designation]"
>>>>>>
>>>>>
>>>>> Sure. Looks good.
>>>>>
>>>>>
>>>>>>
>>>>>> labels: { "ptransform" }
>>>>>>>
>>>>>>
>>>>>> How does an SDK act on this information?
>>>>>>
>>>>>
>>>>> The SDK is obligated to submit any metric values for that spec with a
>>>>> "ptransform" -> "transformName" in the labels field.  Autogenerating code
>>>>> from the spec to avoid typos should be easy.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> type: GAUGE
>>>>>>> value_type: int64
>>>>>>>
>>>>>>
>>>>>> I was lumping type and value_type into the same field, as a urn for
>>>>>> possibly extensibility, as they're tightly coupled (e.g. quantiles,
>>>>>> distributions).
>>>>>>
>>>>>
>>>>> My inclination is that keeping this set relatively small and fixed to
>>>>> a set that can be readily exported to external monitoring systems is more
>>>>> useful than the added indirection to support extensibility.  Lumping
>>>>> together seems reasonable.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> units: SECONDS
>>>>>>> description: "Times my stuff"
>>>>>>>
>>>>>>
>>>>>> Are both of these optional metadata, in the form of key-value field,
>>>>>> for flattened into the field itself (along with every other kind of
>>>>>> metadata you may want to attach)?
>>>>>>
>>>>>
>>>>> Optional metadata in the form of fixed fields.  Is there a use case
>>>>> for arbitrary metadata?  What would you do with it when exporting?
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> }
>>>>>>>
>>>>>>> Then metrics submitted would look like:
>>>>>>> {
>>>>>>> name: "my_timer"
>>>>>>> labels: {"ptransform": "MyTransform"}
>>>>>>> int_value: 100
>>>>>>> }
>>>>>>>
>>>>>>
>>>>>> Yes, or value could be a bytes field that is encoded according to
>>>>>> [value_]type above, if we want that extensibility (e.g. if we want to
>>>>>> bundle the pardo sub-timings together, we'd need a proto for the value, but
>>>>>> that seems to specific to hard code into the basic structure).
>>>>>>
>>>>>>
>>>>> The simplicity coming from the fact that there's only one proto format
>>>>>>> for the spec and for the value.  The only thing that varies are the entries
>>>>>>> in the map and the value field set.  It's pretty easy to establish
>>>>>>> contracts around this type of spec and even generate protos for use the in
>>>>>>> SDK that make the expectations explicit.
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>>>>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>>>>>>> their name.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I proposed keeping the "user" in there to avoid possible clashes
>>>>>>>> with the system namespaces. (No preference on counter vs. metric, I wasn't
>>>>>>>> trying to imply counter = SumInts)
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I like the generalization from entity -> labels.  I view the
>>>>>>>>> purpose of those fields to provide context.  And labels feel like they
>>>>>>>>> supports a richer set of contexts.
>>>>>>>>>
>>>>>>>>
>>>>>>>> If we think such a generalization provides value, I'm fine with
>>>>>>>> doing that now, as sets or key-value maps, if we have good enough examples
>>>>>>>> to justify this.
>>>>>>>>
>>>>>>>>
>>>>>>>>> The URN concept gets a little tricky.  I totally agree that the
>>>>>>>>> context fields should not be embedded in the name.
>>>>>>>>> There's a "name" which is the identifier that can be used to
>>>>>>>>> communicate what context values are supported / allowed for metrics with
>>>>>>>>> that name (for example, element_count expects a ptransform ID).  But then
>>>>>>>>> there's the context.  In Stackdriver, this context is a map of key-value
>>>>>>>>> pairs; the type is considered metadata associated with the name, but not
>>>>>>>>> communicated with the value.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I'm not quite following you here. If context contains a ptransform
>>>>>>>> id, then it cannot be associated with a single name.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Could the URN be "beam:namespace:name" and every metric have a map
>>>>>>>>> of key-value pairs for context?
>>>>>>>>>
>>>>>>>>
>>>>>>>> The URN is the name. Something like
>>>>>>>> "beam:metric:ptransform_execution_times:v1."
>>>>>>>>
>>>>>>>>
>>>>>>>>> Not sure where this fits in the discussion or if this is handled
>>>>>>>>> somewhere, but allowing for a metric configuration that's provided
>>>>>>>>> independently of the value allows for configuring "type", "units", etc in a
>>>>>>>>> uniform way without having to encode them in the metric name / value.
>>>>>>>>> Stackdriver expects each metric type has been configured ahead of time with
>>>>>>>>> these annotations / metadata.  Then values are reported separately.  For
>>>>>>>>> system metrics, the definitions can be packaged with the SDK.  For user
>>>>>>>>> metrics, they'd be defined at runtime.
>>>>>>>>>
>>>>>>>>
>>>>>>>> This feels like the metrics spec, that specifies that the metric
>>>>>>>> with name/URN X has this type plus a bunch of other metadata (e.g. units,
>>>>>>>> if they're not implicit in the type? This gets into whether the type should
>>>>>>>> be Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...}
>>>>>>>> + units metadata).
>>>>>>>>
>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Ben Chambers <bc...@apache.org>.
That sounds like a very reasonable choice -- given the discussion seemed to
be focusing on the differences between these two categories, separating
them will allow the proposal (and implementation) to address each category
in the best way possible without needing to make compromises.

Looking forward to the updated proposal.

On Tue, Apr 17, 2018 at 10:53 AM Alex Amato <aj...@google.com> wrote:

> Hello,
>
> I just wanted to give an update .
>
> After some discussion, I've realized that its best to break up the two
> concepts, with two separate way of reporting monitoring data. These two
> categories are:
>
>    1. Metrics - Counters, Gauges, Distributions. These are well defined
>    concepts for monitoring information and ned to integrate with existing
>    metrics collection systems such as Dropwizard and Stackdriver. Most metrics
>    will go through this model, which will allow runners to process new metrics
>    without adding extra code to support them, forwarding them to metric
>    collection systems.
>    2. Monitoring State - This supports general monitoring data which may
>    not fit into the standard model for Metrics. For example an I/O source may
>    provide a table of filenames+metadata, for files which are old and blocking
>    the system. I will propose a general approach, similar to the URN+payload
>    approach used in the doc right now.
>
> One thing to keep in mind -- even though it makes sense to allow each I/O
source to define their own monitoring state, this then shifts
responsibility for collecting that information to each runner and
displaying that information to every consumer. It would be reasonable to
see if there could be a set of 10 or so that covered most of the cases that
could become the "standard" set (eg., watermark information, performance
information, etc.).


> I will rewrite most of the doc and propose separating these two very
> different use cases, one which optimizes for integration with existing
> monitoring systems. The other which optimizes for flexibility, allowing
> more complex and custom metrics formats for other debugging scenarios.
>
> I just wanted to give a brief update on the direction of this change,
> before writing it up in full detail.
>
>
> On Mon, Apr 16, 2018 at 10:36 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> I agree that the user/system dichotomy is false, the real question of how
>> counters can be scoped to avoid accidental (or even intentional)
>> interference. A system that entirely controls the interaction between the
>> "user" (from its perspective) and the underlying system can do this by
>> prefixing all requested "user" counters with a prefix it will not use
>> itself. Of course this breaks down whenever the wrapping isn't complete
>> (either on the production or consumption side), but may be worth doing for
>> some components (like the SDKs that value being able to provide this
>> isolation for better behavior). Actual (human) end users are likely to be
>> much less careful about avoiding conflicts than library authors who in turn
>> are generally less careful than authors of the system itself.
>>
>> We could alternatively allow for specifying fully qualified URNs for
>> counter names in the SDK APIs, and letting "normal" user counters be in the
>> empty namespace rather than something like beam:metrics:{user,other,...},
>> perhaps with SDKs prohibiting certain conflicting prefixes (which is less
>> than ideal). A layer above the SDK that has similar absolute control over
>> its "users" would have a similar decision to make.
>>
>>
>> On Sat, Apr 14, 2018 at 4:00 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>> One reason I resist the user/system distinction is that Beam is a
>>> multi-party system with at least SDK, runner, and pipeline. Often there may
>>> be a DSL like SQL or Scio, or similarly someone may be building a platform
>>> for their company where there is no user authoring the pipeline. Should
>>> Scio, SQL, or MyCompanyFramework metrics end up in "user"? Who decides to
>>> tack on the prefix? It looks like it is the SDK harness? Are there just
>>> three namespaces "runner", "sdk", and "user"?  Most of what you'd think
>>> of as "user" version "system" should simply be the different between
>>> dynamically defined & typed metrics and fields in control plane protos. If
>>> that layer of the namespaces is not finite and limited, who can extend make
>>> a valid extension? Just some questions that I think would flesh out the
>>> meaning of the "user" prefix.
>>>
>>> Kenn
>>>
>>> On Fri, Apr 13, 2018 at 5:26 PM Andrea Foegler <fo...@google.com>
>>> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks, Robert!
>>>>>>
>>>>>> I think my lack of clarity is around the MetricSpec.  Maybe what's in
>>>>>> my head and what's being proposed are the same thing.  When I read that the
>>>>>> MetricSpec describes the proto structure, that sound kind of complicated to
>>>>>> me.  But I may be misinterpreting it.  What I picture is something like a
>>>>>> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
>>>>>> :):
>>>>>>
>>>>>> {
>>>>>> name: "my_timer"
>>>>>>
>>>>>
>>>>> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to
>>>>> keep requiring namespaces). Or "beam:metric:[some non-user designation]"
>>>>>
>>>>
>>>> Sure. Looks good.
>>>>
>>>>
>>>>>
>>>>> labels: { "ptransform" }
>>>>>>
>>>>>
>>>>> How does an SDK act on this information?
>>>>>
>>>>
>>>> The SDK is obligated to submit any metric values for that spec with a
>>>> "ptransform" -> "transformName" in the labels field.  Autogenerating code
>>>> from the spec to avoid typos should be easy.
>>>>
>>>>
>>>>>
>>>>>
>>>>>> type: GAUGE
>>>>>> value_type: int64
>>>>>>
>>>>>
>>>>> I was lumping type and value_type into the same field, as a urn for
>>>>> possibly extensibility, as they're tightly coupled (e.g. quantiles,
>>>>> distributions).
>>>>>
>>>>
>>>> My inclination is that keeping this set relatively small and fixed to a
>>>> set that can be readily exported to external monitoring systems is more
>>>> useful than the added indirection to support extensibility.  Lumping
>>>> together seems reasonable.
>>>>
>>>>
>>>>>
>>>>>
>>>>>> units: SECONDS
>>>>>> description: "Times my stuff"
>>>>>>
>>>>>
>>>>> Are both of these optional metadata, in the form of key-value field,
>>>>> for flattened into the field itself (along with every other kind of
>>>>> metadata you may want to attach)?
>>>>>
>>>>
>>>> Optional metadata in the form of fixed fields.  Is there a use case for
>>>> arbitrary metadata?  What would you do with it when exporting?
>>>>
>>>>
>>>>>
>>>>>
>>>>>> }
>>>>>>
>>>>>> Then metrics submitted would look like:
>>>>>> {
>>>>>> name: "my_timer"
>>>>>> labels: {"ptransform": "MyTransform"}
>>>>>> int_value: 100
>>>>>> }
>>>>>>
>>>>>
>>>>> Yes, or value could be a bytes field that is encoded according to
>>>>> [value_]type above, if we want that extensibility (e.g. if we want to
>>>>> bundle the pardo sub-timings together, we'd need a proto for the value, but
>>>>> that seems to specific to hard code into the basic structure).
>>>>>
>>>>>
>>>> The simplicity coming from the fact that there's only one proto format
>>>>>> for the spec and for the value.  The only thing that varies are the entries
>>>>>> in the map and the value field set.  It's pretty easy to establish
>>>>>> contracts around this type of spec and even generate protos for use the in
>>>>>> SDK that make the expectations explicit.
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>>>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>>>>>> their name.
>>>>>>>>
>>>>>>>
>>>>>>> I proposed keeping the "user" in there to avoid possible clashes
>>>>>>> with the system namespaces. (No preference on counter vs. metric, I wasn't
>>>>>>> trying to imply counter = SumInts)
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I like the generalization from entity -> labels.  I view the
>>>>>>>> purpose of those fields to provide context.  And labels feel like they
>>>>>>>> supports a richer set of contexts.
>>>>>>>>
>>>>>>>
>>>>>>> If we think such a generalization provides value, I'm fine with
>>>>>>> doing that now, as sets or key-value maps, if we have good enough examples
>>>>>>> to justify this.
>>>>>>>
>>>>>>>
>>>>>>>> The URN concept gets a little tricky.  I totally agree that the
>>>>>>>> context fields should not be embedded in the name.
>>>>>>>> There's a "name" which is the identifier that can be used to
>>>>>>>> communicate what context values are supported / allowed for metrics with
>>>>>>>> that name (for example, element_count expects a ptransform ID).  But then
>>>>>>>> there's the context.  In Stackdriver, this context is a map of key-value
>>>>>>>> pairs; the type is considered metadata associated with the name, but not
>>>>>>>> communicated with the value.
>>>>>>>>
>>>>>>>
>>>>>>> I'm not quite following you here. If context contains a ptransform
>>>>>>> id, then it cannot be associated with a single name.
>>>>>>>
>>>>>>>
>>>>>>>> Could the URN be "beam:namespace:name" and every metric have a map
>>>>>>>> of key-value pairs for context?
>>>>>>>>
>>>>>>>
>>>>>>> The URN is the name. Something like
>>>>>>> "beam:metric:ptransform_execution_times:v1."
>>>>>>>
>>>>>>>
>>>>>>>> Not sure where this fits in the discussion or if this is handled
>>>>>>>> somewhere, but allowing for a metric configuration that's provided
>>>>>>>> independently of the value allows for configuring "type", "units", etc in a
>>>>>>>> uniform way without having to encode them in the metric name / value.
>>>>>>>> Stackdriver expects each metric type has been configured ahead of time with
>>>>>>>> these annotations / metadata.  Then values are reported separately.  For
>>>>>>>> system metrics, the definitions can be packaged with the SDK.  For user
>>>>>>>> metrics, they'd be defined at runtime.
>>>>>>>>
>>>>>>>
>>>>>>> This feels like the metrics spec, that specifies that the metric
>>>>>>> with name/URN X has this type plus a bunch of other metadata (e.g. units,
>>>>>>> if they're not implicit in the type? This gets into whether the type should
>>>>>>> be Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...}
>>>>>>> + units metadata).
>>>>>>>
>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
Hello,

I just wanted to give an update.

After some discussion, I've realized that it's best to break up the two
concepts, with two separate ways of reporting monitoring data. These two
categories are:

   1. Metrics - Counters, Gauges, Distributions. These are well-defined
   concepts for monitoring information and need to integrate with existing
   metrics collection systems such as Dropwizard and Stackdriver. Most metrics
   will go through this model, which will allow runners to process new metrics
   without adding extra code to support them, forwarding them to metric
   collection systems.
   2. Monitoring State - This supports general monitoring data which may
   not fit into the standard model for Metrics. For example, an I/O source may
   provide a table of filenames+metadata for files which are old and blocking
   the system. I will propose a general approach, similar to the URN+payload
   approach used in the doc right now (a rough sketch follows this list).
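
As a rough sketch of the split (proto3; the message and field names below
are placeholders, not what the doc will end up defining):

syntax = "proto3";

package beam.fn.monitoring.sketch;

// Category 1: typed metrics that map onto Dropwizard/Stackdriver-style
// collection systems without runner-specific code per metric.
message Metric {
  string urn = 1;                  // e.g. a "beam:metric:..." name
  map<string, string> labels = 2;  // e.g. {"ptransform": "MyTransform"}
  oneof value {
    int64 counter_value = 3;
    double gauge_value = 4;
    DistributionData distribution_value = 5;
  }
}

message DistributionData {
  int64 count = 1;
  double sum = 2;
  double min = 3;
  double max = 4;
}

// Category 2: monitoring state that does not fit the typed model, carried
// as an opaque payload identified by a URN (e.g. an encoded table of
// filenames + metadata from an I/O source).
message MonitoringState {
  string urn = 1;    // tells the consumer how to interpret the payload
  bytes payload = 2;
}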

I will rewrite most of the doc and propose separating these two very
different use cases: one which optimizes for integration with existing
monitoring systems, and the other which optimizes for flexibility, allowing
more complex and custom metric formats for other debugging scenarios.

I just wanted to give a brief update on the direction of this change,
before writing it up in full detail.


On Mon, Apr 16, 2018 at 10:36 AM Robert Bradshaw <ro...@google.com>
wrote:

> I agree that the user/system dichotomy is false, the real question of how
> counters can be scoped to avoid accidental (or even intentional)
> interference. A system that entirely controls the interaction between the
> "user" (from its perspective) and the underlying system can do this by
> prefixing all requested "user" counters with a prefix it will not use
> itself. Of course this breaks down whenever the wrapping isn't complete
> (either on the production or consumption side), but may be worth doing for
> some components (like the SDKs that value being able to provide this
> isolation for better behavior). Actual (human) end users are likely to be
> much less careful about avoiding conflicts than library authors who in turn
> are generally less careful than authors of the system itself.
>
> We could alternatively allow for specifying fully qualified URNs for
> counter names in the SDK APIs, and letting "normal" user counters be in the
> empty namespace rather than something like beam:metrics:{user,other,...},
> perhaps with SDKs prohibiting certain conflicting prefixes (which is less
> than ideal). A layer above the SDK that has similar absolute control over
> its "users" would have a similar decision to make.
>
>
> On Sat, Apr 14, 2018 at 4:00 PM Kenneth Knowles <kl...@google.com> wrote:
>
>> One reason I resist the user/system distinction is that Beam is a
>> multi-party system with at least SDK, runner, and pipeline. Often there may
>> be a DSL like SQL or Scio, or similarly someone may be building a platform
>> for their company where there is no user authoring the pipeline. Should
>> Scio, SQL, or MyCompanyFramework metrics end up in "user"? Who decides to
>> tack on the prefix? It looks like it is the SDK harness? Are there just
>> three namespaces "runner", "sdk", and "user"?  Most of what you'd think
>> of as "user" version "system" should simply be the different between
>> dynamically defined & typed metrics and fields in control plane protos. If
>> that layer of the namespaces is not finite and limited, who can extend make
>> a valid extension? Just some questions that I think would flesh out the
>> meaning of the "user" prefix.
>>
>> Kenn
>>
>> On Fri, Apr 13, 2018 at 5:26 PM Andrea Foegler <fo...@google.com>
>> wrote:
>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com>
>>>> wrote:
>>>>
>>>>> Thanks, Robert!
>>>>>
>>>>> I think my lack of clarity is around the MetricSpec.  Maybe what's in
>>>>> my head and what's being proposed are the same thing.  When I read that the
>>>>> MetricSpec describes the proto structure, that sound kind of complicated to
>>>>> me.  But I may be misinterpreting it.  What I picture is something like a
>>>>> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
>>>>> :):
>>>>>
>>>>> {
>>>>> name: "my_timer"
>>>>>
>>>>
>>>> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to
>>>> keep requiring namespaces). Or "beam:metric:[some non-user designation]"
>>>>
>>>
>>> Sure. Looks good.
>>>
>>>
>>>>
>>>> labels: { "ptransform" }
>>>>>
>>>>
>>>> How does an SDK act on this information?
>>>>
>>>
>>> The SDK is obligated to submit any metric values for that spec with a
>>> "ptransform" -> "transformName" in the labels field.  Autogenerating code
>>> from the spec to avoid typos should be easy.
>>>
>>>
>>>>
>>>>
>>>>> type: GAUGE
>>>>> value_type: int64
>>>>>
>>>>
>>>> I was lumping type and value_type into the same field, as a urn for
>>>> possibly extensibility, as they're tightly coupled (e.g. quantiles,
>>>> distributions).
>>>>
>>>
>>> My inclination is that keeping this set relatively small and fixed to a
>>> set that can be readily exported to external monitoring systems is more
>>> useful than the added indirection to support extensibility.  Lumping
>>> together seems reasonable.
>>>
>>>
>>>>
>>>>
>>>>> units: SECONDS
>>>>> description: "Times my stuff"
>>>>>
>>>>
>>>> Are both of these optional metadata, in the form of key-value field,
>>>> for flattened into the field itself (along with every other kind of
>>>> metadata you may want to attach)?
>>>>
>>>
>>> Optional metadata in the form of fixed fields.  Is there a use case for
>>> arbitrary metadata?  What would you do with it when exporting?
>>>
>>>
>>>>
>>>>
>>>>> }
>>>>>
>>>>> Then metrics submitted would look like:
>>>>> {
>>>>> name: "my_timer"
>>>>> labels: {"ptransform": "MyTransform"}
>>>>> int_value: 100
>>>>> }
>>>>>
>>>>
>>>> Yes, or value could be a bytes field that is encoded according to
>>>> [value_]type above, if we want that extensibility (e.g. if we want to
>>>> bundle the pardo sub-timings together, we'd need a proto for the value, but
>>>> that seems to specific to hard code into the basic structure).
>>>>
>>>>
>>> The simplicity coming from the fact that there's only one proto format
>>>>> for the spec and for the value.  The only thing that varies are the entries
>>>>> in the map and the value field set.  It's pretty easy to establish
>>>>> contracts around this type of spec and even generate protos for use the in
>>>>> SDK that make the expectations explicit.
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>>>>> their name.
>>>>>>>
>>>>>>
>>>>>> I proposed keeping the "user" in there to avoid possible clashes with
>>>>>> the system namespaces. (No preference on counter vs. metric, I wasn't
>>>>>> trying to imply counter = SumInts)
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I like the generalization from entity -> labels.  I view the purpose
>>>>>>> of those fields to provide context.  And labels feel like they supports a
>>>>>>> richer set of contexts.
>>>>>>>
>>>>>>
>>>>>> If we think such a generalization provides value, I'm fine with doing
>>>>>> that now, as sets or key-value maps, if we have good enough examples to
>>>>>> justify this.
>>>>>>
>>>>>>
>>>>>>> The URN concept gets a little tricky.  I totally agree that the
>>>>>>> context fields should not be embedded in the name.
>>>>>>> There's a "name" which is the identifier that can be used to
>>>>>>> communicate what context values are supported / allowed for metrics with
>>>>>>> that name (for example, element_count expects a ptransform ID).  But then
>>>>>>> there's the context.  In Stackdriver, this context is a map of key-value
>>>>>>> pairs; the type is considered metadata associated with the name, but not
>>>>>>> communicated with the value.
>>>>>>>
>>>>>>
>>>>>> I'm not quite following you here. If context contains a ptransform
>>>>>> id, then it cannot be associated with a single name.
>>>>>>
>>>>>>
>>>>>>> Could the URN be "beam:namespace:name" and every metric have a map
>>>>>>> of key-value pairs for context?
>>>>>>>
>>>>>>
>>>>>> The URN is the name. Something like
>>>>>> "beam:metric:ptransform_execution_times:v1."
>>>>>>
>>>>>>
>>>>>>> Not sure where this fits in the discussion or if this is handled
>>>>>>> somewhere, but allowing for a metric configuration that's provided
>>>>>>> independently of the value allows for configuring "type", "units", etc in a
>>>>>>> uniform way without having to encode them in the metric name / value.
>>>>>>> Stackdriver expects each metric type has been configured ahead of time with
>>>>>>> these annotations / metadata.  Then values are reported separately.  For
>>>>>>> system metrics, the definitions can be packaged with the SDK.  For user
>>>>>>> metrics, they'd be defined at runtime.
>>>>>>>
>>>>>>
>>>>>> This feels like the metrics spec, that specifies that the metric with
>>>>>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>>>>>> they're not implicit in the type? This gets into whether the type should be
>>>>>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...}
>>>>>> + units metadata).
>>>>>>
>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
I agree that the user/system dichotomy is false; the real question is how
counters can be scoped to avoid accidental (or even intentional)
interference. A system that entirely controls the interaction between the
"user" (from its perspective) and the underlying system can do this by
prefixing all requested "user" counters with a prefix it will not use
itself. Of course this breaks down whenever the wrapping isn't complete
(either on the production or consumption side), but may be worth doing for
some components (like the SDKs that value being able to provide this
isolation for better behavior). Actual (human) end users are likely to be
much less careful about avoiding conflicts than library authors who in turn
are generally less careful than authors of the system itself.

We could alternatively allow for specifying fully qualified URNs for
counter names in the SDK APIs, and letting "normal" user counters be in the
empty namespace rather than something like beam:metrics:{user,other,...},
perhaps with SDKs prohibiting certain conflicting prefixes (which is less
than ideal). A layer above the SDK that has similar absolute control over
its "users" would have a similar decision to make.


On Sat, Apr 14, 2018 at 4:00 PM Kenneth Knowles <kl...@google.com> wrote:

> One reason I resist the user/system distinction is that Beam is a
> multi-party system with at least SDK, runner, and pipeline. Often there may
> be a DSL like SQL or Scio, or similarly someone may be building a platform
> for their company where there is no user authoring the pipeline. Should
> Scio, SQL, or MyCompanyFramework metrics end up in "user"? Who decides to
> tack on the prefix? It looks like it is the SDK harness? Are there just
> three namespaces "runner", "sdk", and "user"?  Most of what you'd think
> of as "user" version "system" should simply be the different between
> dynamically defined & typed metrics and fields in control plane protos. If
> that layer of the namespaces is not finite and limited, who can extend make
> a valid extension? Just some questions that I think would flesh out the
> meaning of the "user" prefix.
>
> Kenn
>
> On Fri, Apr 13, 2018 at 5:26 PM Andrea Foegler <fo...@google.com> wrote:
>
>>
>>
>> On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com>
>>> wrote:
>>>
>>>> Thanks, Robert!
>>>>
>>>> I think my lack of clarity is around the MetricSpec.  Maybe what's in
>>>> my head and what's being proposed are the same thing.  When I read that the
>>>> MetricSpec describes the proto structure, that sound kind of complicated to
>>>> me.  But I may be misinterpreting it.  What I picture is something like a
>>>> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
>>>> :):
>>>>
>>>> {
>>>> name: "my_timer"
>>>>
>>>
>>> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
>>> requiring namespaces). Or "beam:metric:[some non-user designation]"
>>>
>>
>> Sure. Looks good.
>>
>>
>>>
>>> labels: { "ptransform" }
>>>>
>>>
>>> How does an SDK act on this information?
>>>
>>
>> The SDK is obligated to submit any metric values for that spec with a
>> "ptransform" -> "transformName" in the labels field.  Autogenerating code
>> from the spec to avoid typos should be easy.
>>
>>
>>>
>>>
>>>> type: GAUGE
>>>> value_type: int64
>>>>
>>>
>>> I was lumping type and value_type into the same field, as a urn for
>>> possibly extensibility, as they're tightly coupled (e.g. quantiles,
>>> distributions).
>>>
>>
>> My inclination is that keeping this set relatively small and fixed to a
>> set that can be readily exported to external monitoring systems is more
>> useful than the added indirection to support extensibility.  Lumping
>> together seems reasonable.
>>
>>
>>>
>>>
>>>> units: SECONDS
>>>> description: "Times my stuff"
>>>>
>>>
>>> Are both of these optional metadata, in the form of key-value field, for
>>> flattened into the field itself (along with every other kind of metadata
>>> you may want to attach)?
>>>
>>
>> Optional metadata in the form of fixed fields.  Is there a use case for
>> arbitrary metadata?  What would you do with it when exporting?
>>
>>
>>>
>>>
>>>> }
>>>>
>>>> Then metrics submitted would look like:
>>>> {
>>>> name: "my_timer"
>>>> labels: {"ptransform": "MyTransform"}
>>>> int_value: 100
>>>> }
>>>>
>>>
>>> Yes, or value could be a bytes field that is encoded according to
>>> [value_]type above, if we want that extensibility (e.g. if we want to
>>> bundle the pardo sub-timings together, we'd need a proto for the value, but
>>> that seems to specific to hard code into the basic structure).
>>>
>>>
>> The simplicity coming from the fact that there's only one proto format
>>>> for the spec and for the value.  The only thing that varies are the entries
>>>> in the map and the value field set.  It's pretty easy to establish
>>>> contracts around this type of spec and even generate protos for use the in
>>>> SDK that make the expectations explicit.
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>>>> their name.
>>>>>>
>>>>>
>>>>> I proposed keeping the "user" in there to avoid possible clashes with
>>>>> the system namespaces. (No preference on counter vs. metric, I wasn't
>>>>> trying to imply counter = SumInts)
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>>>>> wrote:
>>>>>
>>>>>> I like the generalization from entity -> labels.  I view the purpose
>>>>>> of those fields to provide context.  And labels feel like they supports a
>>>>>> richer set of contexts.
>>>>>>
>>>>>
>>>>> If we think such a generalization provides value, I'm fine with doing
>>>>> that now, as sets or key-value maps, if we have good enough examples to
>>>>> justify this.
>>>>>
>>>>>
>>>>>> The URN concept gets a little tricky.  I totally agree that the
>>>>>> context fields should not be embedded in the name.
>>>>>> There's a "name" which is the identifier that can be used to
>>>>>> communicate what context values are supported / allowed for metrics with
>>>>>> that name (for example, element_count expects a ptransform ID).  But then
>>>>>> there's the context.  In Stackdriver, this context is a map of key-value
>>>>>> pairs; the type is considered metadata associated with the name, but not
>>>>>> communicated with the value.
>>>>>>
>>>>>
>>>>> I'm not quite following you here. If context contains a ptransform id,
>>>>> then it cannot be associated with a single name.
>>>>>
>>>>>
>>>>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>>>>> key-value pairs for context?
>>>>>>
>>>>>
>>>>> The URN is the name. Something like
>>>>> "beam:metric:ptransform_execution_times:v1."
>>>>>
>>>>>
>>>>>> Not sure where this fits in the discussion or if this is handled
>>>>>> somewhere, but allowing for a metric configuration that's provided
>>>>>> independently of the value allows for configuring "type", "units", etc in a
>>>>>> uniform way without having to encode them in the metric name / value.
>>>>>> Stackdriver expects each metric type has been configured ahead of time with
>>>>>> these annotations / metadata.  Then values are reported separately.  For
>>>>>> system metrics, the definitions can be packaged with the SDK.  For user
>>>>>> metrics, they'd be defined at runtime.
>>>>>>
>>>>>
>>>>> This feels like the metrics spec, that specifies that the metric with
>>>>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>>>>> they're not implicit in the type? This gets into whether the type should be
>>>>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...}
>>>>> + units metadata).
>>>>>
>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Kenneth Knowles <kl...@google.com>.
One reason I resist the user/system distinction is that Beam is a
multi-party system with at least SDK, runner, and pipeline. Often there may
be a DSL like SQL or Scio, or similarly someone may be building a platform
for their company where there is no user authoring the pipeline. Should
Scio, SQL, or MyCompanyFramework metrics end up in "user"? Who decides to
tack on the prefix? It looks like it is the SDK harness? Are there just
three namespaces "runner", "sdk", and "user"?  Most of what you'd think of
as "user" version "system" should simply be the different between
dynamically defined & typed metrics and fields in control plane protos. If
that layer of the namespaces is not finite and limited, who can extend make
a valid extension? Just some questions that I think would flesh out the
meaning of the "user" prefix.

Kenn

On Fri, Apr 13, 2018 at 5:26 PM Andrea Foegler <fo...@google.com> wrote:

>
>
> On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com>
>> wrote:
>>
>>> Thanks, Robert!
>>>
>>> I think my lack of clarity is around the MetricSpec.  Maybe what's in my
>>> head and what's being proposed are the same thing.  When I read that the
>>> MetricSpec describes the proto structure, that sound kind of complicated to
>>> me.  But I may be misinterpreting it.  What I picture is something like a
>>> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
>>> :):
>>>
>>> {
>>> name: "my_timer"
>>>
>>
>> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
>> requiring namespaces). Or "beam:metric:[some non-user designation]"
>>
>
> Sure. Looks good.
>
>
>>
>> labels: { "ptransform" }
>>>
>>
>> How does an SDK act on this information?
>>
>
> The SDK is obligated to submit any metric values for that spec with a
> "ptransform" -> "transformName" in the labels field.  Autogenerating code
> from the spec to avoid typos should be easy.
>
>
>>
>>
>>> type: GAUGE
>>> value_type: int64
>>>
>>
>> I was lumping type and value_type into the same field, as a urn for
>> possibly extensibility, as they're tightly coupled (e.g. quantiles,
>> distributions).
>>
>
> My inclination is that keeping this set relatively small and fixed to a
> set that can be readily exported to external monitoring systems is more
> useful than the added indirection to support extensibility.  Lumping
> together seems reasonable.
>
>
>>
>>
>>> units: SECONDS
>>> description: "Times my stuff"
>>>
>>
>> Are both of these optional metadata, in the form of key-value field, for
>> flattened into the field itself (along with every other kind of metadata
>> you may want to attach)?
>>
>
> Optional metadata in the form of fixed fields.  Is there a use case for
> arbitrary metadata?  What would you do with it when exporting?
>
>
>>
>>
>>> }
>>>
>>> Then metrics submitted would look like:
>>> {
>>> name: "my_timer"
>>> labels: {"ptransform": "MyTransform"}
>>> int_value: 100
>>> }
>>>
>>
>> Yes, or value could be a bytes field that is encoded according to
>> [value_]type above, if we want that extensibility (e.g. if we want to
>> bundle the pardo sub-timings together, we'd need a proto for the value, but
>> that seems to specific to hard code into the basic structure).
>>
>>
> The simplicity coming from the fact that there's only one proto format for
>>> the spec and for the value.  The only thing that varies are the entries in
>>> the map and the value field set.  It's pretty easy to establish contracts
>>> around this type of spec and even generate protos for use the in SDK that
>>> make the expectations explicit.
>>>
>>>
>>> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>>>
>>>>>
>>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>>> their name.
>>>>>
>>>>
>>>> I proposed keeping the "user" in there to avoid possible clashes with
>>>> the system namespaces. (No preference on counter vs. metric, I wasn't
>>>> trying to imply counter = SumInts)
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>>>> wrote:
>>>>
>>>>> I like the generalization from entity -> labels.  I view the purpose
>>>>> of those fields to provide context.  And labels feel like they supports a
>>>>> richer set of contexts.
>>>>>
>>>>
>>>> If we think such a generalization provides value, I'm fine with doing
>>>> that now, as sets or key-value maps, if we have good enough examples to
>>>> justify this.
>>>>
>>>>
>>>>> The URN concept gets a little tricky.  I totally agree that the
>>>>> context fields should not be embedded in the name.
>>>>> There's a "name" which is the identifier that can be used to
>>>>> communicate what context values are supported / allowed for metrics with
>>>>> that name (for example, element_count expects a ptransform ID).  But then
>>>>> there's the context.  In Stackdriver, this context is a map of key-value
>>>>> pairs; the type is considered metadata associated with the name, but not
>>>>> communicated with the value.
>>>>>
>>>>
>>>> I'm not quite following you here. If context contains a ptransform id,
>>>> then it cannot be associated with a single name.
>>>>
>>>>
>>>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>>>> key-value pairs for context?
>>>>>
>>>>
>>>> The URN is the name. Something like
>>>> "beam:metric:ptransform_execution_times:v1."
>>>>
>>>>
>>>>> Not sure where this fits in the discussion or if this is handled
>>>>> somewhere, but allowing for a metric configuration that's provided
>>>>> independently of the value allows for configuring "type", "units", etc in a
>>>>> uniform way without having to encode them in the metric name / value.
>>>>> Stackdriver expects each metric type has been configured ahead of time with
>>>>> these annotations / metadata.  Then values are reported separately.  For
>>>>> system metrics, the definitions can be packaged with the SDK.  For user
>>>>> metrics, they'd be defined at runtime.
>>>>>
>>>>
>>>> This feels like the metrics spec, that specifies that the metric with
>>>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>>>> they're not implicit in the type? This gets into whether the type should be
>>>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>>>> units metadata).
>>>>
>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Andrea Foegler <fo...@google.com>.
On Fri, Apr 13, 2018 at 5:00 PM Robert Bradshaw <ro...@google.com> wrote:

> On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com> wrote:
>
>> Thanks, Robert!
>>
>> I think my lack of clarity is around the MetricSpec.  Maybe what's in my
>> head and what's being proposed are the same thing.  When I read that the
>> MetricSpec describes the proto structure, that sound kind of complicated to
>> me.  But I may be misinterpreting it.  What I picture is something like a
>> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
>> :):
>>
>> {
>> name: "my_timer"
>>
>
> name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
> requiring namespaces). Or "beam:metric:[some non-user designation]"
>

Sure. Looks good.


>
> labels: { "ptransform" }
>>
>
> How does an SDK act on this information?
>

The SDK is obligated to submit any metric values for that spec with a
"ptransform" -> "transformName" in the labels field.  Autogenerating code
from the spec to avoid typos should be easy.
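
As a rough sketch of that contract (proto3; names are placeholders, not a
concrete proposal): the spec declares which labels a metric requires, and
every reported value for that spec carries them in its labels map:

syntax = "proto3";

package beam.fn.metricspec.sketch;

enum MetricType {
  METRIC_TYPE_UNSPECIFIED = 0;
  COUNTER = 1;
  GAUGE = 2;
  DISTRIBUTION = 3;
}

// Declared once per metric; labels lists the keys every reported value
// for this spec is obligated to supply.
message MetricSpec {
  string name = 1;             // e.g. "beam:metric:user:my_namespace:my_timer"
  repeated string labels = 2;  // e.g. ["ptransform"]
  MetricType type = 3;         // e.g. GAUGE
  string value_type = 4;       // e.g. "int64"
  string units = 5;            // e.g. "SECONDS"
  string description = 6;
}

// A reported value; labels must contain an entry for each key named in
// the spec, e.g. "ptransform" -> "MyTransform".
message MetricValue {
  string name = 1;
  map<string, string> labels = 2;
  int64 int_value = 3;
}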


>
>
>> type: GAUGE
>> value_type: int64
>>
>
> I was lumping type and value_type into the same field, as a urn for
> possibly extensibility, as they're tightly coupled (e.g. quantiles,
> distributions).
>

My inclination is that keeping this set relatively small and fixed to a set
that can be readily exported to external monitoring systems is more useful
than the added indirection to support extensibility.  Lumping together
seems reasonable.


>
>
>> units: SECONDS
>> description: "Times my stuff"
>>
>
> Are both of these optional metadata, in the form of key-value field, for
> flattened into the field itself (along with every other kind of metadata
> you may want to attach)?
>

Optional metadata in the form of fixed fields.  Is there a use case for
arbitrary metadata?  What would you do with it when exporting?


>
>
>> }
>>
>> Then metrics submitted would look like:
>> {
>> name: "my_timer"
>> labels: {"ptransform": "MyTransform"}
>> int_value: 100
>> }
>>
>
> Yes, or value could be a bytes field that is encoded according to
> [value_]type above, if we want that extensibility (e.g. if we want to
> bundle the pardo sub-timings together, we'd need a proto for the value, but
> that seems to specific to hard code into the basic structure).
>
>
The simplicity coming from the fact that there's only one proto format for
>> the spec and for the value.  The only thing that varies are the entries in
>> the map and the value field set.  It's pretty easy to establish contracts
>> around this type of spec and even generate protos for use the in SDK that
>> make the expectations explicit.
>>
>>
>> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>>
>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>> their name.
>>>>
>>>
>>> I proposed keeping the "user" in there to avoid possible clashes with
>>> the system namespaces. (No preference on counter vs. metric, I wasn't
>>> trying to imply counter = SumInts)
>>>
>>>
>>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>>> wrote:
>>>
>>>> I like the generalization from entity -> labels.  I view the purpose of
>>>> those fields to provide context.  And labels feel like they supports a
>>>> richer set of contexts.
>>>>
>>>
>>> If we think such a generalization provides value, I'm fine with doing
>>> that now, as sets or key-value maps, if we have good enough examples to
>>> justify this.
>>>
>>>
>>>> The URN concept gets a little tricky.  I totally agree that the context
>>>> fields should not be embedded in the name.
>>>> There's a "name" which is the identifier that can be used to
>>>> communicate what context values are supported / allowed for metrics with
>>>> that name (for example, element_count expects a ptransform ID).  But then
>>>> there's the context.  In Stackdriver, this context is a map of key-value
>>>> pairs; the type is considered metadata associated with the name, but not
>>>> communicated with the value.
>>>>
>>>
>>> I'm not quite following you here. If context contains a ptransform id,
>>> then it cannot be associated with a single name.
>>>
>>>
>>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>>> key-value pairs for context?
>>>>
>>>
>>> The URN is the name. Something like
>>> "beam:metric:ptransform_execution_times:v1."
>>>
>>>
>>>> Not sure where this fits in the discussion or if this is handled
>>>> somewhere, but allowing for a metric configuration that's provided
>>>> independently of the value allows for configuring "type", "units", etc in a
>>>> uniform way without having to encode them in the metric name / value.
>>>> Stackdriver expects each metric type has been configured ahead of time with
>>>> these annotations / metadata.  Then values are reported separately.  For
>>>> system metrics, the definitions can be packaged with the SDK.  For user
>>>> metrics, they'd be defined at runtime.
>>>>
>>>
>>> This feels like the metrics spec, that specifies that the metric with
>>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>>> they're not implicit in the type? This gets into whether the type should be
>>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>>> units metadata).
>>>
>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Apr 13, 2018 at 3:28 PM Andrea Foegler <fo...@google.com> wrote:

> Thanks, Robert!
>
> I think my lack of clarity is around the MetricSpec.  Maybe what's in my
> head and what's being proposed are the same thing.  When I read that the
> MetricSpec describes the proto structure, that sound kind of complicated to
> me.  But I may be misinterpreting it.  What I picture is something like a
> MetricSpec that looks like (note: my picture looks a lot like Stackdriver
> :):
>
> {
> name: "my_timer"
>

name: "beam:metric:user:my_namespace:my_timer" (assuming we want to keep
requiring namespaces). Or "beam:metric:[some non-user designation]"

labels: { "ptransform" }
>

How does an SDK act on this information?


> type: GAUGE
> value_type: int64
>

I was lumping type and value_type into the same field, as a URN for
possible extensibility, as they're tightly coupled (e.g. quantiles,
distributions).


> units: SECONDS
> description: "Times my stuff"
>

Are both of these optional metadata, in the form of a key-value field, or
flattened into the field itself (along with every other kind of metadata
you may want to attach)?


> }
>
> Then metrics submitted would look like:
> {
> name: "my_timer"
> labels: {"ptransform": "MyTransform"}
> int_value: 100
> }
>

Yes, or value could be a bytes field that is encoded according to
[value_]type above, if we want that extensibility (e.g. if we want to
bundle the pardo sub-timings together, we'd need a proto for the value, but
that seems too specific to hard-code into the basic structure).
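
As a rough sketch of what I mean (proto3; names are placeholders): simple
values get first-class typed fields, and anything richer is carried as
bytes interpreted according to the value_type declared in the spec:

syntax = "proto3";

package beam.fn.metricvalue.sketch;

// One reported value. Simple values use the typed fields; anything richer
// (e.g. bundled ParDo sub-timings) is encoded into encoded_value according
// to the spec's value_type.
message ReportedValue {
  string name = 1;
  map<string, string> labels = 2;
  oneof value {
    int64 int_value = 3;
    double double_value = 4;
    bytes encoded_value = 5;  // interpretation defined by the spec's value_type
  }
}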


> The simplicity coming from the fact that there's only one proto format for
> the spec and for the value.  The only thing that varies are the entries in
> the map and the value field set.  It's pretty easy to establish contracts
> around this type of spec and even generate protos for use in the SDK that
> make the expectations explicit.
>
>
> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>
>> I proposed keeping the "user" in there to avoid possible clashes with the
>> system namespaces. (No preference on counter vs. metric, I wasn't trying to
>> imply counter = SumInts)
>>
>>
>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>> wrote:
>>
>>> I like the generalization from entity -> labels.  I view the purpose of
>>> those fields to provide context.  And labels feel like they support a
>>> richer set of contexts.
>>>
>>
>> If we think such a generalization provides value, I'm fine with doing
>> that now, as sets or key-value maps, if we have good enough examples to
>> justify this.
>>
>>
>>> The URN concept gets a little tricky.  I totally agree that the context
>>> fields should not be embedded in the name.
>>> There's a "name" which is the identifier that can be used to communicate
>>> what context values are supported / allowed for metrics with that name (for
>>> example, element_count expects a ptransform ID).  But then there's the
>>> context.  In Stackdriver, this context is a map of key-value pairs; the
>>> type is considered metadata associated with the name, but not communicated
>>> with the value.
>>>
>>
>> I'm not quite following you here. If context contains a ptransform id,
>> then it cannot be associated with a single name.
>>
>>
>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>> key-value pairs for context?
>>>
>>
>> The URN is the name. Something like
>> "beam:metric:ptransform_execution_times:v1."
>>
>>
>>> Not sure where this fits in the discussion or if this is handled
>>> somewhere, but allowing for a metric configuration that's provided
>>> independently of the value allows for configuring "type", "units", etc in a
>>> uniform way without having to encode them in the metric name / value.
>>> Stackdriver expects each metric type has been configured ahead of time with
>>> these annotations / metadata.  Then values are reported separately.  For
>>> system metrics, the definitions can be packaged with the SDK.  For user
>>> metrics, they'd be defined at runtime.
>>>
>>
>> This feels like the metrics spec, that specifies that the metric with
>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>> they're not implicit in the type? This gets into whether the type should be
>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>> units metadata).
>>
>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Andrea Foegler <fo...@google.com>.
Thanks, Robert!

I think my lack of clarity is around the MetricSpec.  Maybe what's in my
head and what's being proposed are the same thing.  When I read that the
MetricSpec describes the proto structure, that sounds kind of complicated to
me.  But I may be misinterpreting it.  What I picture is something like a
MetricSpec that looks like (note: my picture looks a lot like Stackdriver
:):

{
name: "my_timer"
labels: { "ptransform" }
type: GAUGE
value_type: int64
units: SECONDS
description: "Times my stuff"
}

Then metrics submitted would look like:
{
name: "my_timer"
labels: {"ptransform": "MyTransform"}
int_value: 100
}

The simplicity comes from the fact that there's only one proto format for
the spec and for the value.  The only thing that varies are the entries in
the map and the value field set.  It's pretty easy to establish contracts
around this type of spec and even generate protos for use in the SDK that
make the expectations explicit.
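
In proto form, the pair could be sketched something like this (purely
illustrative; the exact field names, numbering, and types aren't a
proposal):

message MetricSpec {
  string name = 1;             // e.g. "my_timer"
  repeated string labels = 2;  // label keys expected, e.g. "ptransform"
  string type = 3;             // e.g. "GAUGE"
  string value_type = 4;       // e.g. "int64"
  string units = 5;            // e.g. "SECONDS"
  string description = 6;      // e.g. "Times my stuff"
}

message MetricValue {
  string name = 1;                 // matches a MetricSpec name
  map<string, string> labels = 2;  // e.g. {"ptransform": "MyTransform"}
  oneof value {
    int64 int_value = 3;
    double double_value = 4;
  }
}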


On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com> wrote:

> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>
>>
>> Or just "beam:counter:<namespace>:<name>" or even
>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>> their name.
>>
>
> I proposed keeping the "user" in there to avoid possible clashes with the
> system namespaces. (No preference on counter vs. metric, I wasn't trying to
> imply counter = SumInts)
>
>
> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com> wrote:
>
>> I like the generalization from entity -> labels.  I view the purpose of
>> those fields to provide context.  And labels feel like they support a
>> richer set of contexts.
>>
>
> If we think such a generalization provides value, I'm fine with doing that
> now, as sets or key-value maps, if we have good enough examples to justify
> this.
>
>
>> The URN concept gets a little tricky.  I totally agree that the context
>> fields should not be embedded in the name.
>> There's a "name" which is the identifier that can be used to communicate
>> what context values are supported / allowed for metrics with that name (for
>> example, element_count expects a ptransform ID).  But then there's the
>> context.  In Stackdriver, this context is a map of key-value pairs; the
>> type is considered metadata associated with the name, but not communicated
>> with the value.
>>
>
> I'm not quite following you here. If context contains a ptransform id,
> then it cannot be associated with a single name.
>
>
>> Could the URN be "beam:namespace:name" and every metric have a map of
>> key-value pairs for context?
>>
>
> The URN is the name. Something like
> "beam:metric:ptransform_execution_times:v1."
>
>
>> Not sure where this fits in the discussion or if this is handled
>> somewhere, but allowing for a metric configuration that's provided
>> independently of the value allows for configuring "type", "units", etc in a
>> uniform way without having to encode them in the metric name / value.
>> Stackdriver expects each metric type has been configured ahead of time with
>> these annotations / metadata.  Then values are reported separately.  For
>> system metrics, the definitions can be packaged with the SDK.  For user
>> metrics, they'd be defined at runtime.
>>
>
> This feels like the metrics spec, that specifies that the metric with
> name/URN X has this type plus a bunch of other metadata (e.g. units, if
> they're not implicit in the type? This gets into whether the type should be
> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
> units metadata).
>
>
>>
>>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Also, the only use for payloads is because "User Counter" is
>>>>>> currently a single URN, rather than using the namespacing characteristics
>>>>>> of URNs to map user names onto distinct metric names.
>>>>>>
>>>>>
>>>>> Can they be URNs? I don't see value in having a "user metric" URN
>>>>> where you then have to look elsewhere for what the real name is.
>>>>>
>>>>
>>>> Yes, that was my point with the parenthetical statement. I would rather
>>>> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
>>>> the payload field for this. So if we're going to keep the payload field, we
>>>> need more compelling usecases.
>>>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>> Kenn
>>>
>>>
>>>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>>>> parameters into a name though.) If we're going to choose names that the
>>>>> system and sdks agree to have specific meanings, and to avoid accidental
>>>>> collisions, making them full-fledged documented URNs has value.
>>>>>
>>>>
>>>>> Value is the "payload". Likely worth changing the name to avoid
>>>>> confusion with the payload above. It's bytes because it depends on the
>>>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>>>> payload). If we thing the types are generally limited, another option would
>>>>> be a oneof field (with a bytes option just in case) for transparency. There
>>>>> are pros and cons going this route.
>>>>>
>>>>> Type is what I proposed we add, instead of it being implicit in the
>>>>> name (and unknowable if one does not recognize the name). This makes things
>>>>> more open-ended and easier to evolve and work with.
>>>>>
>>>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>>>> mentioned I think it makes sense to pull this out as a separate field,
>>>>> especially when it makes sense to aggregate a single named counter across
>>>>> labels as well as for a single label (e.g. execution time of composite
>>>>> transforms).
>>>>>
>>>>> - Robert
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi folks -
>>>>>>
>>>>>> Before we totally go down the path of highly structured metric
>>>>>> protos, I'd like to propose considering a simple metrics interface between
>>>>>> the SDK and the runner.  Something more generic and closer to what most
>>>>>> monitoring systems would use.
>>>>>>
>>>>>> To use Spark as an example, the Metric system uses a simple metric
>>>>>> format of name, value and type to report all metrics in a single structure,
>>>>>> regardless of the source or context of the metric.
>>>>>>
>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>>>>
>>>>>> The subsystems have contracts for what metrics they will expose and
>>>>>> how they are calculated:
>>>>>>
>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>>>>
>>>>>> Codifying the system metrics in the SDK seems perfectly reasonable -
>>>>>> no reason to make the notion of metric generic at that level.  But at the
>>>>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>>>>> generic encoding of the metrics might make it easier to adapt and maintain
>>>>>> system.  The generic format can include information about downstream
>>>>>> consumers, if that's useful.
>>>>>>
>>>>>> Spark supports a number of Metric Sinks - external monitoring
>>>>>> systems.  If runners receive a simple list of metrics, implementing any
>>>>>> number of Sinks for Beam would be straightforward and would generally be a
>>>>>> one time implementation.  If instead all system metrics are sent embedded
>>>>>> in a highly structured, semantically meaningful structure, runner code
>>>>>> would need to be updated to support exporting the new metric. We seem to be
>>>>>> heading in the direction of "if you don't understand this metric, you can't
>>>>>> use it / export it".  But most systems seem to assume metrics are really
>>>>>> simple named values that can be handled a priori.
>>>>>>
>>>>>> So I guess my primary question is:  Is it necessary for Beam to treat
>>>>>> metrics as highly semantic, arbitrarily complex data?  Or could they
>>>>>> possibly be the sort of simple named values as they are in most monitoring
>>>>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>>>>> add meaning and structure, but simplifying that out before leaving SDK
>>>>>> code.  Is the coupling to a semantically meaningful structure between the
>>>>>> SDK and runner and necessary complexity?
>>>>>>
>>>>>> Andrea
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> *Thank you for this clarification. I think the table of files fits
>>>>>>>> into the model as one of type string-set (with union as aggregation). *
>>>>>>>> Its not a list of files, its a list of metadata for each file,
>>>>>>>> several pieces of data per file.
>>>>>>>>
>>>>>>>> Are you proposing that there would be separate URNs as well for
>>>>>>>> each entity being measured then, so the the URN defines the type of entity
>>>>>>>> being measured.
>>>>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>>>>> PCollection entities
>>>>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>>>>> PTransform entities
>>>>>>>>
>>>>>>>
>>>>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>>>>> execution times are never for PCollections, and even if they were it'd be
>>>>>>> semantically a very different beast (which should not re-use the same URN).
>>>>>>>
>>>>>>> *message MetricSpec {*
>>>>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>>>>> *  // passed as-is.*
>>>>>>>> *  string urn = 1;*
>>>>>>>>
>>>>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>>>>> *  bytes parameters_payload = 2;*
>>>>>>>>
>>>>>>>> *  // (Required) A URN that describes the type of values this
>>>>>>>> metric*
>>>>>>>> *  // records (e.g. durations that should be summed).*
>>>>>>>> *}*
>>>>>>>>
>>>>>>>> *message Metric[Values] {*
>>>>>>>> * // (Required) The original requesting MetricSpec.*
>>>>>>>> * MetricSpec metric_spec = 1;*
>>>>>>>>
>>>>>>>> * // A mapping of entities to (encoded) values.*
>>>>>>>> * map<string, bytes> values;*
>>>>>>>> This ignores the non-unqiueness of entity identifiers. This is why
>>>>>>>> in my doc, I have specified the entity type and its string identifier
>>>>>>>> @Ken, I believe you have pointed this out in the past, that
>>>>>>>> uniqueness is only guaranteed within a type of entity (all PCollections),
>>>>>>>> but not between entities (A Pcollection and PTransform may have the same
>>>>>>>> identifier).
>>>>>>>>
>>>>>>>
>>>>>>> See above for why this is not an issue. The extra complexity (in
>>>>>>> protos and code), the inability to use them as map keys, and the fact that
>>>>>>> they'll be 100% redundant for all entities for a given metric convinces me
>>>>>>> that it's not worth creating and tracking an enum for the type alongside
>>>>>>> the id.
>>>>>>>
>>>>>>>
>>>>>>>> *}*
>>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>
>>>>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> To Robert's proto:
>>>>>>>>>>
>>>>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>>>>  map<string, bytes> values;
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>>>>> URNs in the doc?
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I
>>>>>>>>>>>>> was going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics
>>>>>>>>>>>>>> of supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>>>>>> metrics right.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Ben
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters
>>>>>>>>>>>>>>> are used a lot but others are less mainstream so being too fine from the
>>>>>>>>>>>>>>> start can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <
>>>>>>>>>>>>>>> robertwb@google.com> a écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When you say type do you mean accumulator type, result
>>>>>>>>>>>>>>>>> type, or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined
>>>>>>>>>>>>>>>>>>> metric types based on the fact they just weren't used enough to justify
>>>>>>>>>>>>>>>>>>> support.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're
>>>>>>>>>>>>>>>>>>>> conflating the idea of custom metrics and custom metric types. I would
>>>>>>>>>>>>>>>>>>>> propose the MetricSpec field be augmented with an additional field "type"
>>>>>>>>>>>>>>>>>>>> which is a urn specifying the type of metric it is (i.e. the contents of
>>>>>>>>>>>>>>>>>>>> its payload, as well as the form of aggregation). Summing or maxing over
>>>>>>>>>>>>>>>>>>>> ints would be a typical example. Though we could pursue making this opaque
>>>>>>>>>>>>>>>>>>>> to the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all
>>>>>>>>>>>>>>>>>>>>> metrics refer to a primary entity, so I have restructured some of the
>>>>>>>>>>>>>>>>>>>>> protos a little bit.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity
>>>>>>>>>>>>>>>>>>>>> associated with it (e.g. PCollection, PTransform), we can design an
>>>>>>>>>>>>>>>>>>>>> approach which forwards the opaque bytes metric updates, without
>>>>>>>>>>>>>>>>>>>>> deserializing them. These are forwarded to user provided code which then
>>>>>>>>>>>>>>>>>>>>> would deserialize the metric update payloads and perform the custom
>>>>>>>>>>>>>>>>>>>>> aggregations.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to
>>>>>>>>>>>>>>>>>>>>> draw attention to this particular revision.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to
>>>>>>>>>>>>>>>>>>>>>> make a decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback
>>>>>>>>>>>>>>>>>>>>>> before then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to
>>>>>>>>>>>>>>>>>>>>>>> it easily in
>>>>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on
>>>>>>>>>>>>>>>>>>>>>>> this proposal so far. I
>>>>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed
>>>>>>>>>>>>>>>>>>>>>>> my recommendation as well
>>>>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could
>>>>>>>>>>>>>>>>>>>>>>> please take another look after
>>>>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Apr 13, 2018 at 4:30 PM Alex Amato <aj...@google.com> wrote:

> There are a few more confusing concepts in this thread
> *Name*
>
>    - Name can mean a *"string name"* used to refer to a metric in a
>    metrics system such as stackdriver, i.e. "ElementCount", "ExecutionTime"
>    - Name can mean a set of *context* fields added to a counter, either
>    embedded in a complex string, or in a structured name. Typically referring
>    to *aggregation entities, *which define how the metric updates get
>    aggregated into final metric values, i.e. all Metric updates with the same
>    field are aggregated together.
>       - e.g.my_ptransform_id-ElementCount
>       - e.g.{ name : 'ElementCount', 'ptransform_name' :
>       'my_ptransform_id' }
>    - The *URN* of a Metric, which identifies a proto to use in a payload
>    field for the Metric and MetricSpec. Note: The string name, can literally
>    be the URN value in most cases, except for metrics which can specify a
>    separate name (i.e. user counters).
>
> @Robert,
> You have proposed that metrics should contain the following parts, but I still
> don't fully understand what you mean by each one.
>
>    - Name - Why is a name a URN + bytes payload? What type of name are
>    you referring to, *string name*? *context*? *URN*? Or something else.
>
> As you say above, the URN can literally be the string name. I see no
reason why this can't be the case for user counters as well (the user
counter name becoming part of the urn). The payload, should we decide to
keep it, is "part" of the name because it helps identify what exactly we're
counting. I.e. {urnX, payload1} would be distinct from {urnX, payload2}.
The only reason to have a payload is to avoid sticking stuff that would be
ugly to parse into the URN.

>
>    - Entity - This is how the metric is aggregated together. If I
>    understand you correctly. And you correctly point out that a singular
>    entity is not sufficient, a set of labels may be more appropriate.
>
> Alternatively, the entity/labels specifies possible sub-partitions of the
metric identified by its name (as above).

>
>    - Value - *Are you saying this is just the metric value, not including
>    any fields related to entity or name.*
>
> Exactly. Like "5077." For some types it would be composite. The type also
indicates how it's encoded (e.g. as bytes, or which field of a oneof should
be populated).

>
>    - Type - I am not clear at all on what this is or what it would look
>    like. Are you referring to units, like milliseconds/seconds? Why
>    wouldn't it be part of the value payload? Is this some sort of reason to
>    separate it out from the value? What if the value has multiple fields for
>    example.
>
> Type would be "beam:metric_type:sum:ints" or
"beam:metric_type:distribution:doubles." We could separate "data type" from
"aggregation type" if desired, though of course the full cross-product
doesn't make sense. We could put the unit in the type (e.g. sum_durations
!= sum_ints), but, preferably, I'd put this as metadata on the counter
spec. It is often fully determined by the URN, but provided so one can
reason about the metric without having to interpret the URN. It also means
we don't have to have a separate URN for each user metric type. (In fact,
any metric the runner doesn't understand would be treated as a user metric,
and aggregated as such if it understands the type.)
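
To make that concrete, a user counter under this scheme might be reported
as something like (illustrative only):

{
name: "beam:metric:user:my_namespace:my_counter"
type: "beam:metric_type:sum:ints"
labels: {"ptransform": "MyTransform"}
int_value: 42
}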

> Some pros and cons as I see them
> Pros:
>
>    - More separation and flexibility for an SDK to specify labels
>    separately from the value/type. Though, maybe I don't understand enough,
>    and I am not so sure this is a con over just having the URN payload contain
>    everything in itself.
>
> We can't interpret a URN payload unless we know the URN. Separating things
out allows us to act on metrics without interpreting the URN (both for
unknown URNs, and simplifying the logic by not having to do lookups on the
URN everywhere).


> Cons:
>
>    - I think this means that the SDK must properly pick two separate
>    payloads and populate them correctly. We can run into issues where.
>       - Having one URN which specifies all the fields you would need to
>       populate for a specific metric avoids this, this was a concern brought up
>       by Luke. The runner would then be responsible for packaging metrics up to
>       send to external monitoring systems.
>
> I'm not following you here. We'd return exactly what Andrea suggested.


>
> @Andrea, please correct me if I misunderstand
> Thank you for the metric spec example in your last response, I think that
> makes the idea much more clear.
>
> Using your approach I see the following pros and cons
> Pros:
>
>    - Runners have a cleaner more reusable codepath to forwarding metrics
>    to external monitoring systems. This will mean less work on the runner side
>    to support each metric (perhaps none in many cases).
>    - SDKs may need less code as well to package up new metrics.
>    - As long as we expect SDKs to only send cataloged/requested metrics,
>    we can avoid the issues of SDKs creating too many metrics, metrics the
>    runner/engine don't understand, etc.
>
> Cons:
>
>    - Luke's concern with this approach was that this spec ends up boiling
>    down to just the name, in this case "my_timer". His concern is that with
>    many SDK implementations, we can have bugs using the wrong string name for
>    counters, or populating them with the wrong values.
>       - Note how the ParDoExecution time example in the doc lets you
>       build the group of metrics together, rather than reporting three different
>       ones from SDK->Runner. This sort of thing can make it more clear how to
>       fill in metrics in the SDK side. Then the RunnerHarness is responsible for
>       packaging the metrics up for monitoring systems, not the SDK side.
>    - Ruling out URNs+payloads altogether (Though, I don't think you are
>    suggesting this) is less extensible for custom runners+sdks+engines. I.e.
>    the table of I/O files example. It also rules out sending parameters for a
>    metric from the runner->SDK.
>    - Populating each metric spec in code in each Runner could be
>    similarly error prone. Instead of just stating "urn:namespace:my_timer",
>    you must specify this and each runner must get it correct:
>    - {
>       name: "my_timer"
>       labels: { "ptransform" }
>       type: GAUGE
>       value_type: int64
>       units: SECONDS
>       description: "Times my stuff"
>       }
>       - Would the MetricSpec be passed like that from the RunnerHarness
>       to the SDK? This part I am not so clear on.
>       - Do we want runners to accept and forward metrics they don't know
>    about? Another concern was to not accept them until both the SDK
>    and Runner have been updated to accept them. Consider the performance
>    implications of an SDK sending a noisy metric.
>
> This being said, I think some of this can be mitigated.
>
>    1. Could a URN describe a metric spec in the format you describe. I.e.
>    urn:namespace:my_timer is passed to the SDK, and it knows to somehow find a
>    catalog which tells it about the MetricSpec. This way with just a URN it
>    knows how to populate a common representation, like the MetricSpec you have
>    specified.
>    2. This common metric representation could be a specific URN+payload
>    used for most metrics, while extensible ones use a different URN+payload.
>
>
> Thanks for the discussion on this thread today. I'd like us to keep
> engaging like this to come to an agreement.
> Alex
>
> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>
>> I proposed keeping the "user" in there to avoid possible clashes with the
>> system namespaces. (No preference on counter vs. metric, I wasn't trying to
>> imply counter = SumInts)
>>
>>
>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>> wrote:
>>
>>> I like the generalization from entity -> labels.  I view the purpose of
>>> those fields to provide context.  And labels feel like they support a
>>> richer set of contexts.
>>>
>>
>> If we think such a generalization provides value, I'm fine with doing
>> that now, as sets or key-value maps, if we have good enough examples to
>> justify this.
>>
>>
>>> The URN concept gets a little tricky.  I totally agree that the context
>>> fields should not be embedded in the name.
>>> There's a "name" which is the identifier that can be used to communicate
>>> what context values are supported / allowed for metrics with that name (for
>>> example, element_count expects a ptransform ID).  But then there's the
>>> context.  In Stackdriver, this context is a map of key-value pairs; the
>>> type is considered metadata associated with the name, but not communicated
>>> with the value.
>>>
>>
>> I'm not quite following you here. If context contains a ptransform id,
>> then it cannot be associated with a single name.
>>
>>
>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>> key-value pairs for context?
>>>
>>
>> The URN is the name. Something like
>> "beam:metric:ptransform_execution_times:v1."
>>
>>
>>> Not sure where this fits in the discussion or if this is handled
>>> somewhere, but allowing for a metric configuration that's provided
>>> independently of the value allows for configuring "type", "units", etc in a
>>> uniform way without having to encode them in the metric name / value.
>>> Stackdriver expects each metric type has been configured ahead of time with
>>> these annotations / metadata.  Then values are reported separately.  For
>>> system metrics, the definitions can be packaged with the SDK.  For user
>>> metrics, they'd be defined at runtime.
>>>
>>
>> This feels like the metrics spec, that specifies that the metric with
>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>> they're not implicit in the type? This gets into whether the type should be
>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>> units metadata).
>>
>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Andrea Foegler <fo...@google.com>.
That's a great summary Alex, thanks!

This doesn't address all your questions, but in terms of how I see the
MetricSpec being specified / shared, it's something like this:
SDKs just share the same MetricSpec file which defines all the system
metrics guaranteed by Beam.  SDK-specific additions can be handled with an
addendum.
That spec can be read by the SDK and by the Runner.  The SDK is responsible
for populating the metric values according to the spec for all system
metrics.  The runner doesn't really need the spec for user-defined metrics,
since there's really nothing to do with them but forward them along.

I think this should eliminate any concerns around misspellings and such.
It would even be pretty simple to automatically generate protos for each
system MetricSpec and the code to convert from the proto to the MetricSpec.
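
For example, an entry in that shared spec file might look something like
this (just an illustration of the idea, not a concrete format):

{
name: "element_count"
labels: { "ptransform" }
type: SUM
value_type: int64
units: COUNT
description: "Number of elements processed by a PTransform."
}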

I do think runners should treat any metrics they don't know about just like
user metrics - metrics to be forwarded to downstream monitoring tools.

I think I'm unconvinced this Metrics API should handle cases like the I/O
files case.



On Fri, Apr 13, 2018 at 4:30 PM Alex Amato <aj...@google.com> wrote:

> There are a few more confusing concepts in this thread
> *Name*
>
>    - Name can mean a *"string name"* used to refer to a metric in a
>    metrics system such as stackdriver, i.e. "ElementCount", "ExecutionTime"
>    - Name can mean a set of *context* fields added to a counter, either
>    embedded in a complex string, or in a structured name. Typically referring
>    to *aggregation entities, *which define how the metric updates get
>    aggregated into final metric values, i.e. all Metric updates with the same
>    field are aggregated together.
>       - e.g.my_ptransform_id-ElementCount
>       - e.g.{ name : 'ElementCount', 'ptransform_name' :
>       'my_ptransform_id' }
>    - The *URN* of a Metric, which identifies a proto to use in a payload
>    field for the Metric and MetricSpec. Note: The string name, can literally
>    be the URN value in most cases, except for metrics which can specify a
>    separate name (i.e. user counters).
>
>
>
> @Robert,
> You have proposed that metrics should contain the following parts, but I still
> don't fully understand what you mean by each one.
>
>    - Name - Why is a name a URN + bytes payload? What type of name are
>    you referring to, *string name*? *context*? *URN*? Or something else.
>    - Entity - This is how the metric is aggregated together. If I
>    understand you correctly. And you correctly point out that a singular
>    entity is not sufficient, a set of labels may be more appropriate.
>    - Value - *Are you saying this is just the metric value, not including
>    any fields related to entity or name.*
>    - Type - I am not clear at all on what this is or what it would look
>    like. Are you referring to units, like milliseconds/seconds? Why
>    wouldn't it be part of the value payload? Is this some sort of reason to
>    separate it out from the value? What if the value has multiple fields for
>    example.
>
> Some pros and cons as I see them
> Pros:
>
>    - More separation and flexibility for an SDK to specify labels
>    separately from the value/type. Though, maybe I don't understand enough,
>    and I am not so sure this is a con over just having the URN payload contain
>    everything in itself.
>
> Cons:
>
>    - I think this means that the SDK must properly pick two separate
>    payloads and populate them correctly. We can run into issues where.
>       - Having one URN which specifies all the fields you would need to
>       populate for a specific metric avoids this, this was a concern brought up
>       by Luke. The runner would then be responsible for packaging metrics up to
>       send to external monitoring systems.
>
>
> @Andrea, please correct me if I misunderstand
> Thank you for the metric spec example in your last response, I think that
> makes the idea much more clear.
>
> Using your approach I see the following pros and cons
> Pros:
>
>    - Runners have a cleaner more reusable codepath to forwarding metrics
>    to external monitoring systems. This will mean less work on the runner side
>    to support each metric (perhaps none in many cases).
>    - SDKs may need less code as well to package up new metrics.
>    - As long as we expect SDKs to only send cataloged/requested metrics,
>    we can avoid the issues of SDKs creating too many metrics, metrics the
>    runner/engine don't understand, etc.
>
> Cons:
>
>    - Luke's concern with this approach was that this spec ends up boiling
>    down to just the name, in this case "my_timer". His concern is that with
>    many SDK implementations, we can have bugs using the wrong string name for
>    counters, or populating them with the wrong values.
>       - Note how the ParDoExecution time example in the doc lets you
>       build the group of metrics together, rather than reporting three different
>       ones from SDK->Runner. This sort of thing can make it more clear how to
>       fill in metrics in the SDK side. Then the RunnerHarness is responsible for
>       packaging the metrics up for monitoring systems, not the SDK side.
>    - Ruling out URNs+payloads altogether (Though, I don't think you are
>    suggesting this) is less extensible for custom runners+sdks+engines. I.e.
>    the table of I/O files example. It also rules out sending parameters for a
>    metric from the runner->SDK.
>    - Populating each metric spec in code in each Runner could be
>    similarly error prone. Instead of just stating "urn:namespace:my_timer",
>    you must specify this and each runner must get it correct:
>    - {
>       name: "my_timer"
>       labels: { "ptransform" }
>       type: GAUGE
>       value_type: int64
>       units: SECONDS
>       description: "Times my stuff"
>       }
>       - Would the MetricSpec be passed like that from the RunnerHarness
>       to the SDK? This part I am not so clear on.
>       - Do we want runners to accept and forward metrics they don't know
>    about? Another concern was to not accept them until both the SDK
>    and Runner have been updated to accept them. Consider the performance
>    implications of an SDK sending a noisy metric.
>
> This being said, I think some of this can be mitigated.
>
>    1. Could a URN describe a metric spec in the format you describe. I.e.
>    urn:namespace:my_timer is passed to the SDK, and it knows to somehow find a
>    catalog which tells it about the MetricSpec. This way with just a URN it
>    knows how to populate a common representation, like the MetricSpec you have
>    specified.
>    2. This common metric representation could be a specific URN+payload
>    used for most metrics, while extensible ones use a different URN+payload.
>
>
> Thanks for the discussion on this thread today. I'd like us to keep
> engaging like this to come to an agreement.
> Alex
>
> On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>
>> I proposed keeping the "user" in there to avoid possible clashes with the
>> system namespaces. (No preference on counter vs. metric, I wasn't trying to
>> imply counter = SumInts)
>>
>>
>> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com>
>> wrote:
>>
>>> I like the generalization from entity -> labels.  I view the purpose of
>>> those fields to provide context.  And labels feel like they support a
>>> richer set of contexts.
>>>
>>
>> If we think such a generalization provides value, I'm fine with doing
>> that now, as sets or key-value maps, if we have good enough examples to
>> justify this.
>>
>>
>>> The URN concept gets a little tricky.  I totally agree that the context
>>> fields should not be embedded in the name.
>>> There's a "name" which is the identifier that can be used to communicate
>>> what context values are supported / allowed for metrics with that name (for
>>> example, element_count expects a ptransform ID).  But then there's the
>>> context.  In Stackdriver, this context is a map of key-value pairs; the
>>> type is considered metadata associated with the name, but not communicated
>>> with the value.
>>>
>>
>> I'm not quite following you here. If context contains a ptransform id,
>> then it cannot be associated with a single name.
>>
>>
>>> Could the URN be "beam:namespace:name" and every metric have a map of
>>> key-value pairs for context?
>>>
>>
>> The URN is the name. Something like
>> "beam:metric:ptransform_execution_times:v1."
>>
>>
>>> Not sure where this fits in the discussion or if this is handled
>>> somewhere, but allowing for a metric configuration that's provided
>>> independently of the value allows for configuring "type", "units", etc in a
>>> uniform way without having to encode them in the metric name / value.
>>> Stackdriver expects each metric type has been configured ahead of time with
>>> these annotations / metadata.  Then values are reported separately.  For
>>> system metrics, the definitions can be packaged with the SDK.  For user
>>> metrics, they'd be defined at runtime.
>>>
>>
>> This feels like the metrics spec, that specifies that the metric with
>> name/URN X has this type plus a bunch of other metadata (e.g. units, if
>> they're not implicit in the type? This gets into whether the type should be
>> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
>> units metadata).
>>
>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Also, the only use for payloads is because "User Counter" is
>>>>>>> currently a single URN, rather than using the namespacing characteristics
>>>>>>> of URNs to map user names onto distinct metric names.
>>>>>>>
>>>>>>
>>>>>> Can they be URNs? I don't see value in having a "user metric" URN
>>>>>> where you then have to look elsewhere for what the real name is.
>>>>>>
>>>>>
>>>>> Yes, that was my point with the parenthetical statement. I would
>>>>> rather have "beam:counter:user:use_provide_namespace:user_provide_name"
>>>>> than use the payload field for this. So if we're going to keep the payload
>>>>> field, we need more compelling usecases.
>>>>>
>>>>
>>>> Or just "beam:counter:<namespace>:<name>" or even
>>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>>> their name.
>>>>
>>>> Kenn
>>>>
>>>>
>>>>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>>>>> parameters into a name though.) If we're going to choose names that the
>>>>>> system and sdks agree to have specific meanings, and to avoid accidental
>>>>>> collisions, making them full-fledged documented URNs has value.
>>>>>>
>>>>>
>>>>>> Value is the "payload". Likely worth changing the name to avoid
>>>>>> confusion with the payload above. It's bytes because it depends on the
>>>>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>>>>> payload). If we thing the types are generally limited, another option would
>>>>>> be a oneof field (with a bytes option just in case) for transparency. There
>>>>>> are pros and cons going this route.
>>>>>>
>>>>>> Type is what I proposed we add, instead of it being implicit in the
>>>>>> name (and unknowable if one does not recognize the name). This makes things
>>>>>> more open-ended and easier to evolve and work with.
>>>>>>
>>>>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>>>>> mentioned I think it makes sense to pull this out as a separate field,
>>>>>> especially when it makes sense to aggregate a single named counter across
>>>>>> labels as well as for a single label (e.g. execution time of composite
>>>>>> transforms).
>>>>>>
>>>>>> - Robert
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi folks -
>>>>>>>
>>>>>>> Before we totally go down the path of highly structured metric
>>>>>>> protos, I'd like to propose considering a simple metrics interface between
>>>>>>> the SDK and the runner.  Something more generic and closer to what most
>>>>>>> monitoring systems would use.
>>>>>>>
>>>>>>> To use Spark as an example, the Metric system uses a simple metric
>>>>>>> format of name, value and type to report all metrics in a single structure,
>>>>>>> regardless of the source or context of the metric.
>>>>>>>
>>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>>>>>
>>>>>>> The subsystems have contracts for what metrics they will expose and
>>>>>>> how they are calculated:
>>>>>>>
>>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>>>>>
>>>>>>> Codifying the system metrics in the SDK seems perfectly reasonable -
>>>>>>> no reason to make the notion of metric generic at that level.  But at the
>>>>>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>>>>>> generic encoding of the metrics might make it easier to adapt and maintain
>>>>>>> system.  The generic format can include information about downstream
>>>>>>> consumers, if that's useful.
>>>>>>>
>>>>>>> Spark supports a number of Metric Sinks - external monitoring
>>>>>>> systems.  If runners receive a simple list of metrics, implementing any
>>>>>>> number of Sinks for Beam would be straightforward and would generally be a
>>>>>>> one time implementation.  If instead all system metrics are sent embedded
>>>>>>> in a highly structured, semantically meaningful structure, runner code
>>>>>>> would need to be updated to support exporting the new metric. We seem to be
>>>>>>> heading in the direction of "if you don't understand this metric, you can't
>>>>>>> use it / export it".  But most systems seem to assume metrics are really
>>>>>>> simple named values that can be handled a priori.
>>>>>>>
>>>>>>> So I guess my primary question is:  Is it necessary for Beam to
>>>>>>> treat metrics as highly semantic, arbitrarily complex data?  Or could they
>>>>>>> possibly be the sort of simple named values as they are in most monitoring
>>>>>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>>>>>> add meaning and structure, but simplifying that out before leaving SDK
>>>>>>> code.  Is the coupling to a semantically meaningful structure between the
>>>>>>> SDK and runner and necessary complexity?
>>>>>>>
>>>>>>> Andrea
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <
>>>>>>> robertwb@google.com> wrote:
>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> *Thank you for this clarification. I think the table of files fits
>>>>>>>>> into the model as one of type string-set (with union as aggregation). *
>>>>>>>>> Its not a list of files, its a list of metadata for each file,
>>>>>>>>> several pieces of data per file.
>>>>>>>>>
>>>>>>>>> Are you proposing that there would be separate URNs as well for
>>>>>>>>> each entity being measured then, so the the URN defines the type of entity
>>>>>>>>> being measured.
>>>>>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>>>>>> PCollection entities
>>>>>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>>>>>> PTransform entities
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>>>>>> execution times are never for PCollections, and even if they were it'd be
>>>>>>>> semantically a very different beast (which should not re-use the same URN).
>>>>>>>>
>>>>>>>> *message MetricSpec {*
>>>>>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>>>>>> *  // For any URN that is not recognized (by whomever is
>>>>>>>>> inspecting*
>>>>>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>>>>>> *  // passed as-is.*
>>>>>>>>> *  string urn = 1;*
>>>>>>>>>
>>>>>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>>>>>> *  bytes parameters_payload = 2;*
>>>>>>>>>
>>>>>>>>> *  // (Required) A URN that describes the type of values this
>>>>>>>>> metric*
>>>>>>>>> *  // records (e.g. durations that should be summed).*
>>>>>>>>> *}*
>>>>>>>>>
>>>>>>>>> *message Metric[Values] {*
>>>>>>>>> * // (Required) The original requesting MetricSpec.*
>>>>>>>>> * MetricSpec metric_spec = 1;*
>>>>>>>>>
>>>>>>>>> * // A mapping of entities to (encoded) values.*
>>>>>>>>> * map<string, bytes> values;*
>>>>>>>>> This ignores the non-unqiueness of entity identifiers. This is why
>>>>>>>>> in my doc, I have specified the entity type and its string identifier
>>>>>>>>> @Ken, I believe you have pointed this out in the past, that
>>>>>>>>> uniqueness is only guaranteed within a type of entity (all PCollections),
>>>>>>>>> but not between entities (A Pcollection and PTransform may have the same
>>>>>>>>> identifier).
>>>>>>>>>
>>>>>>>>
>>>>>>>> See above for why this is not an issue. The extra complexity (in
>>>>>>>> protos and code), the inability to use them as map keys, and the fact that
>>>>>>>> they'll be 100% redundant for all entities for a given metric convinces me
>>>>>>>> that it's not worth creating and tracking an enum for the type alongside
>>>>>>>> the id.
>>>>>>>>
>>>>>>>>
>>>>>>>>> *}*
>>>>>>>>>
>>>>>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <
>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>
>>>>>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> To Robert's proto:
>>>>>>>>>>>
>>>>>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>>>>>  map<string, bytes> values;
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>>>>>> URNs in the doc?
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I
>>>>>>>>>>>>>> was going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom
>>>>>>>>>>>>>>> metrics of supported type" didn't include new ways of aggregating ints. As
>>>>>>>>>>>>>>> long as that means we have a fixed set of aggregations (that align with
>>>>>>>>>>>>>>> what what users want and metrics back end support) it seems like we are
>>>>>>>>>>>>>>> doing user metrics right.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Ben
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters
>>>>>>>>>>>>>>>> are used a lot but others are less mainstream so being too fine from the
>>>>>>>>>>>>>>>> start can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <
>>>>>>>>>>>>>>>> robertwb@google.com> a écrit :
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> When you say type do you mean accumulator type, result
>>>>>>>>>>>>>>>>>> type, or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined
>>>>>>>>>>>>>>>>>>>> metric types based on the fact they just weren't used enough to justify
>>>>>>>>>>>>>>>>>>>> support.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're
>>>>>>>>>>>>>>>>>>>>> conflating the idea of custom metrics and custom metric types. I would
>>>>>>>>>>>>>>>>>>>>> propose the MetricSpec field be augmented with an additional field "type"
>>>>>>>>>>>>>>>>>>>>> which is a urn specifying the type of metric it is (i.e. the contents of
>>>>>>>>>>>>>>>>>>>>> its payload, as well as the form of aggregation). Summing or maxing over
>>>>>>>>>>>>>>>>>>>>> ints would be a typical example. Though we could pursue making this opaque
>>>>>>>>>>>>>>>>>>>>> to the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all
>>>>>>>>>>>>>>>>>>>>>> metrics refer to a primary entity, so I have restructured some of the
>>>>>>>>>>>>>>>>>>>>>> protos a little bit.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity
>>>>>>>>>>>>>>>>>>>>>> associated with it (e.g. PCollection, PTransform), we can design an
>>>>>>>>>>>>>>>>>>>>>> approach which forwards the opaque bytes metric updates, without
>>>>>>>>>>>>>>>>>>>>>> deserializing them. These are forwarded to user provided code which then
>>>>>>>>>>>>>>>>>>>>>> would deserialize the metric update payloads and perform the custom
>>>>>>>>>>>>>>>>>>>>>> aggregations.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to
>>>>>>>>>>>>>>>>>>>>>> draw attention to this particular revision.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to
>>>>>>>>>>>>>>>>>>>>>>> make a decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback
>>>>>>>>>>>>>>>>>>>>>>> before then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to
>>>>>>>>>>>>>>>>>>>>>>>> it easily in
>>>>>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some
>>>>>>>>>>>>>>>>>>>>>>>> comments.
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on
>>>>>>>>>>>>>>>>>>>>>>>> this proposal so far. I
>>>>>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed
>>>>>>>>>>>>>>>>>>>>>>>> my recommendation as well
>>>>>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could
>>>>>>>>>>>>>>>>>>>>>>>> please take another look after
>>>>>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
There are a few more confusing concepts in this thread:
*Name*

   - Name can mean a *"string name"* used to refer to a metric in a metrics
   system such as stackdriver, i.e. "ElementCount", "ExecutionTime"
   - Name can mean a set of *context* fields added to a counter, either
   embedded in a complex string or carried in a structured name. These
   typically refer to *aggregation entities*, which define how metric updates
   get aggregated into final metric values, i.e. all metric updates with the
   same field values are aggregated together.
      - e.g. my_ptransform_id-ElementCount
      - e.g. { name : 'ElementCount', 'ptransform_name' : 'my_ptransform_id' }
   - The *URN* of a metric, which identifies a proto to use in a payload
   field for the Metric and MetricSpec. Note: the string name can literally
   be the URN value in most cases, except for metrics which can specify a
   separate name (e.g. user counters). A small sketch contrasting these three
   notions follows this list.
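
To make the distinction concrete, here is a rough sketch of the same counter
expressed in each of the three senses (the message, field, and URN names below
are just illustrative, not taken from the proposal doc):

// (1) Flat "string name", as a monitoring system might store it:
//       "my_ptransform_id-ElementCount"

// (2) Structured name: the context / aggregation entity is a separate field.
message StructuredMetricName {
  string name = 1;           // e.g. "ElementCount"
  string ptransform_id = 2;  // aggregation entity, e.g. "my_ptransform_id"
}

// (3) URN: identifies the metric and the proto to use for its payload, e.g.
//       "beam:metric:element_count:v1"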



@Robert,
You have proposed that metrics should contain the following parts, but I
still don't fully understand what you mean by each one.

   - Name - Why is a name a URN + bytes payload? What type of name are you
   referring to: *string name*? *context*? *URN*? Or something else?
   - Entity - This is how the metric is aggregated together, if I
   understand you correctly. And you correctly point out that a singular
   entity is not sufficient; a set of labels may be more appropriate.
   - Value - *Are you saying this is just the metric value, not including
   any fields related to entity or name?*
   - Type - I am not clear at all on what this is or what it would look
   like. Are you referring to units, like milliseconds/seconds? Why wouldn't
   it be part of the value payload? Is there some reason to separate it out
   from the value? What if the value has multiple fields, for example?

Some pros and cons as I see them:
Pros:

   - More separation and flexibility for an SDK to specify labels
   separately from the value/type. Though maybe I don't understand enough,
   and I am not so sure this is an advantage over just having the URN payload
   contain everything in itself.

Cons:

   - I think this means that the SDK must properly pick two separate
   payloads and populate them correctly. We can run into issues where the two
   are populated inconsistently.
      - Having one URN which specifies all the fields you would need to
      populate for a specific metric avoids this; this was a concern brought
      up by Luke. The runner would then be responsible for packaging metrics
      up to send to external monitoring systems (a rough sketch of this style
      follows this list).
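
As mentioned above, a minimal sketch of the "one URN, self-contained payload"
style, where a single URN such as "beam:metric:ptransform_execution_times:v1"
implies a payload that carries all of the related fields together (the message
and field names here are hypothetical):

message PTransformExecutionTimesPayload {
  // All timings for one ptransform are reported together under one URN,
  // rather than as three separately named metrics the SDK could mix up.
  int64 start_bundle_msecs = 1;
  int64 process_element_msecs = 2;
  int64 finish_bundle_msecs = 3;
}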


@Andrea, please correct me if I misunderstand.
Thank you for the metric spec example in your last response; I think that
makes the idea much clearer.

Using your approach, I see the following pros and cons:
Pros:

   - Runners have a cleaner, more reusable code path for forwarding metrics
   to external monitoring systems. This will mean less work on the runner side
   to support each metric (perhaps none in many cases).
   - SDKs may need less code as well to package up new metrics.
   - As long as we expect SDKs to only send cataloged/requested metrics,
   we can avoid the issues of SDKs creating too many metrics, metrics the
   runner/engine don't understand, etc.

Cons:

   - Luke's concern with this approach was that this spec ends up boiling
   down to just the name, in this case "my_timer". With many SDK
   implementations, we can have bugs that use the wrong string name for
   counters or populate them with the wrong values.
      - Note how the ParDo execution time example in the doc lets you build
      the group of metrics together, rather than reporting three different
      ones from SDK->Runner. This sort of thing can make it clearer how to
      fill in metrics on the SDK side. The RunnerHarness is then responsible
      for packaging the metrics up for monitoring systems, not the SDK side.
   - Ruling out URNs+payloads altogether (though I don't think you are
   suggesting this) is less extensible for custom runners+SDKs+engines, e.g.
   the table of I/O files example. It also rules out sending parameters for a
   metric from the runner->SDK.
   - Populating each metric spec in code in each Runner could be similarly
   error-prone. Instead of just stating "urn:namespace:my_timer", you must
   specify all of this, and each runner must get it correct (written out as a
   proto sketch after this list):
      {
      name: "my_timer"
      labels: { "ptransform" }
      type: GAUGE
      value_type: int64
      units: SECONDS
      description: "Times my stuff"
      }
      - Would the MetricSpec be passed like that from the RunnerHarness to
      the SDK? This part I am not so clear on.
   - Do we want runners to accept and forward metrics they don't know
   about? Another concern was to not accept them until both the SDK and
   Runner have been updated to support them. Consider the performance
   implications of an SDK sending a noisy metric.
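
Written out as a proto message, the spec above would look roughly like the
sketch below (the message name and field numbers are made up, and this is
distinct from the MetricSpec proto quoted earlier in the thread). Every runner
would have to construct and populate this consistently, versus just exchanging
the string "urn:namespace:my_timer" and agreeing on its meaning out of band:

message MetricDefinition {
  string name = 1;             // "my_timer"
  repeated string labels = 2;  // e.g. "ptransform"
  string type = 3;             // e.g. "GAUGE"
  string value_type = 4;       // e.g. "int64"
  string units = 5;            // e.g. "SECONDS"
  string description = 6;      // "Times my stuff"
}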

This being said, I think some of this can be mitigated.

   1. Could a URN describe a metric spec in the format you describe? E.g.
   urn:namespace:my_timer is passed to the SDK, and it knows to somehow find a
   catalog which tells it about the MetricSpec. This way, with just a URN, the
   SDK knows how to populate a common representation, like the MetricSpec you
   have specified (a rough sketch of such a catalog follows this list).
   2. This common metric representation could be a specific URN+payload
   used for most metrics, while extensible ones use a different URN+payload.


Thanks for the discussion on this thread today. I'd like us to keep
engaging like this to come to an agreement.
Alex

On Fri, Apr 13, 2018 at 2:23 PM Robert Bradshaw <ro...@google.com> wrote:

> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>
>>
>> Or just "beam:counter:<namespace>:<name>" or even
>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>> their name.
>>
>
> I proposed keeping the "user" in there to avoid possible clashes with the
> system namespaces. (No preference on counter vs. metric, I wasn't trying to
> imply counter = SumInts)
>
>
> On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com> wrote:
>
>> I like the generalization from entity -> labels.  I view the purpose of
>> those fields to provide context.  And labels feel like they supports a
>> richer set of contexts.
>>
>
> If we think such a generalization provides value, I'm fine with doing that
> now, as sets or key-value maps, if we have good enough examples to justify
> this.
>
>
>> The URN concept gets a little tricky.  I totally agree that the context
>> fields should not be embedded in the name.
>> There's a "name" which is the identifier that can be used to communicate
>> what context values are supported / allowed for metrics with that name (for
>> example, element_count expects a ptransform ID).  But then there's the
>> context.  In Stackdriver, this context is a map of key-value pairs; the
>> type is considered metadata associated with the name, but not communicated
>> with the value.
>>
>
> I'm not quite following you here. If context contains a ptransform id,
> then it cannot be associated with a single name.
>
>
>> Could the URN be "beam:namespace:name" and every metric have a map of
>> key-value pairs for context?
>>
>
> The URN is the name. Something like
> "beam:metric:ptransform_execution_times:v1."
>
>
>> Not sure where this fits in the discussion or if this is handled
>> somewhere, but allowing for a metric configuration that's provided
>> independently of the value allows for configuring "type", "units", etc in a
>> uniform way without having to encode them in the metric name / value.
>> Stackdriver expects each metric type has been configured ahead of time with
>> these annotations / metadata.  Then values are reported separately.  For
>> system metrics, the definitions can be packaged with the SDK.  For user
>> metrics, they'd be defined at runtime.
>>
>
> This feels like the metrics spec, that specifies that the metric with
> name/URN X has this type plus a bunch of other metadata (e.g. units, if
> they're not implicit in the type? This gets into whether the type should be
> Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
> units metadata).
>
>
>>
>>
>> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Also, the only use for payloads is because "User Counter" is
>>>>>> currently a single URN, rather than using the namespacing characteristics
>>>>>> of URNs to map user names onto distinct metric names.
>>>>>>
>>>>>
>>>>> Can they be URNs? I don't see value in having a "user metric" URN
>>>>> where you then have to look elsewhere for what the real name is.
>>>>>
>>>>
>>>> Yes, that was my point with the parenthetical statement. I would rather
>>>> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
>>>> the payload field for this. So if we're going to keep the payload field, we
>>>> need more compelling usecases.
>>>>
>>>
>>> Or just "beam:counter:<namespace>:<name>" or even
>>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>>> their name.
>>>
>>> Kenn
>>>
>>>
>>>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>>>> parameters into a name though.) If we're going to choose names that the
>>>>> system and sdks agree to have specific meanings, and to avoid accidental
>>>>> collisions, making them full-fledged documented URNs has value.
>>>>>
>>>>
>>>>> Value is the "payload". Likely worth changing the name to avoid
>>>>> confusion with the payload above. It's bytes because it depends on the
>>>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>>>> payload). If we thing the types are generally limited, another option would
>>>>> be a oneof field (with a bytes option just in case) for transparency. There
>>>>> are pros and cons going this route.
>>>>>
>>>>> Type is what I proposed we add, instead of it being implicit in the
>>>>> name (and unknowable if one does not recognize the name). This makes things
>>>>> more open-ended and easier to evolve and work with.
>>>>>
>>>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>>>> mentioned I think it makes sense to pull this out as a separate field,
>>>>> especially when it makes sense to aggregate a single named counter across
>>>>> labels as well as for a single label (e.g. execution time of composite
>>>>> transforms).
>>>>>
>>>>> - Robert
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Hi folks -
>>>>>>
>>>>>> Before we totally go down the path of highly structured metric
>>>>>> protos, I'd like to propose considering a simple metrics interface between
>>>>>> the SDK and the runner.  Something more generic and closer to what most
>>>>>> monitoring systems would use.
>>>>>>
>>>>>> To use Spark as an example, the Metric system uses a simple metric
>>>>>> format of name, value and type to report all metrics in a single structure,
>>>>>> regardless of the source or context of the metric.
>>>>>>
>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>>>>
>>>>>> The subsystems have contracts for what metrics they will expose and
>>>>>> how they are calculated:
>>>>>>
>>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>>>>
>>>>>> Codifying the system metrics in the SDK seems perfectly reasonable -
>>>>>> no reason to make the notion of metric generic at that level.  But at the
>>>>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>>>>> generic encoding of the metrics might make it easier to adapt and maintain
>>>>>> system.  The generic format can include information about downstream
>>>>>> consumers, if that's useful.
>>>>>>
>>>>>> Spark supports a number of Metric Sinks - external monitoring
>>>>>> systems.  If runners receive a simple list of metrics, implementing any
>>>>>> number of Sinks for Beam would be straightforward and would generally be a
>>>>>> one time implementation.  If instead all system metrics are sent embedded
>>>>>> in a highly structured, semantically meaningful structure, runner code
>>>>>> would need to be updated to support exporting the new metric. We seem to be
>>>>>> heading in the direction of "if you don't understand this metric, you can't
>>>>>> use it / export it".  But most systems seem to assume metrics are really
>>>>>> simple named values that can be handled a priori.
>>>>>>
>>>>>> So I guess my primary question is:  Is it necessary for Beam to treat
>>>>>> metrics as highly semantic, arbitrarily complex data?  Or could they
>>>>>> possibly be the sort of simple named values as they are in most monitoring
>>>>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>>>>> add meaning and structure, but simplifying that out before leaving SDK
>>>>>> code.  Is the coupling to a semantically meaningful structure between the
>>>>>> SDK and runner and necessary complexity?
>>>>>>
>>>>>> Andrea
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> *Thank you for this clarification. I think the table of files fits
>>>>>>>> into the model as one of type string-set (with union as aggregation). *
>>>>>>>> Its not a list of files, its a list of metadata for each file,
>>>>>>>> several pieces of data per file.
>>>>>>>>
>>>>>>>> Are you proposing that there would be separate URNs as well for
>>>>>>>> each entity being measured then, so the the URN defines the type of entity
>>>>>>>> being measured.
>>>>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>>>>> PCollection entities
>>>>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>>>>> PTransform entities
>>>>>>>>
>>>>>>>
>>>>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>>>>> execution times are never for PCollections, and even if they were it'd be
>>>>>>> semantically a very different beast (which should not re-use the same URN).
>>>>>>>
>>>>>>> *message MetricSpec {*
>>>>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>>>>> *  // passed as-is.*
>>>>>>>> *  string urn = 1;*
>>>>>>>>
>>>>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>>>>> *  bytes parameters_payload = 2;*
>>>>>>>>
>>>>>>>> *  // (Required) A URN that describes the type of values this
>>>>>>>> metric*
>>>>>>>> *  // records (e.g. durations that should be summed).*
>>>>>>>> *}*
>>>>>>>>
>>>>>>>> *message Metric[Values] {*
>>>>>>>> * // (Required) The original requesting MetricSpec.*
>>>>>>>> * MetricSpec metric_spec = 1;*
>>>>>>>>
>>>>>>>> * // A mapping of entities to (encoded) values.*
>>>>>>>> * map<string, bytes> values;*
>>>>>>>> This ignores the non-unqiueness of entity identifiers. This is why
>>>>>>>> in my doc, I have specified the entity type and its string identifier
>>>>>>>> @Ken, I believe you have pointed this out in the past, that
>>>>>>>> uniqueness is only guaranteed within a type of entity (all PCollections),
>>>>>>>> but not between entities (A Pcollection and PTransform may have the same
>>>>>>>> identifier).
>>>>>>>>
>>>>>>>
>>>>>>> See above for why this is not an issue. The extra complexity (in
>>>>>>> protos and code), the inability to use them as map keys, and the fact that
>>>>>>> they'll be 100% redundant for all entities for a given metric convinces me
>>>>>>> that it's not worth creating and tracking an enum for the type alongside
>>>>>>> the id.
>>>>>>>
>>>>>>>
>>>>>>>> *}*
>>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>
>>>>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> To Robert's proto:
>>>>>>>>>>
>>>>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>>>>  map<string, bytes> values;
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>>>>> URNs in the doc?
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I
>>>>>>>>>>>>> was going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics
>>>>>>>>>>>>>> of supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>>>>>> metrics right.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Ben
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters
>>>>>>>>>>>>>>> are used a lot but others are less mainstream so being too fine from the
>>>>>>>>>>>>>>> start can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <
>>>>>>>>>>>>>>> robertwb@google.com> a écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When you say type do you mean accumulator type, result
>>>>>>>>>>>>>>>>> type, or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined
>>>>>>>>>>>>>>>>>>> metric types based on the fact they just weren't used enough to justify
>>>>>>>>>>>>>>>>>>> support.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're
>>>>>>>>>>>>>>>>>>>> conflating the idea of custom metrics and custom metric types. I would
>>>>>>>>>>>>>>>>>>>> propose the MetricSpec field be augmented with an additional field "type"
>>>>>>>>>>>>>>>>>>>> which is a urn specifying the type of metric it is (i.e. the contents of
>>>>>>>>>>>>>>>>>>>> its payload, as well as the form of aggregation). Summing or maxing over
>>>>>>>>>>>>>>>>>>>> ints would be a typical example. Though we could pursue making this opaque
>>>>>>>>>>>>>>>>>>>> to the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all
>>>>>>>>>>>>>>>>>>>>> metrics refer to a primary entity, so I have restructured some of the
>>>>>>>>>>>>>>>>>>>>> protos a little bit.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity
>>>>>>>>>>>>>>>>>>>>> associated with it (e.g. PCollection, PTransform), we can design an
>>>>>>>>>>>>>>>>>>>>> approach which forwards the opaque bytes metric updates, without
>>>>>>>>>>>>>>>>>>>>> deserializing them. These are forwarded to user provided code which then
>>>>>>>>>>>>>>>>>>>>> would deserialize the metric update payloads and perform the custom
>>>>>>>>>>>>>>>>>>>>> aggregations.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to
>>>>>>>>>>>>>>>>>>>>> draw attention to this particular revision.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to
>>>>>>>>>>>>>>>>>>>>>> make a decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback
>>>>>>>>>>>>>>>>>>>>>> before then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to
>>>>>>>>>>>>>>>>>>>>>>> it easily in
>>>>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on
>>>>>>>>>>>>>>>>>>>>>>> this proposal so far. I
>>>>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed
>>>>>>>>>>>>>>>>>>>>>>> my recommendation as well
>>>>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could
>>>>>>>>>>>>>>>>>>>>>>> please take another look after
>>>>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:

>
> Or just "beam:counter:<namespace>:<name>" or even
> "beam:metric:<namespace>:<name>" since metrics have a type separate from
> their name.
>

I proposed keeping the "user" in there to avoid possible clashes with the
system namespaces. (No preference on counter vs. metric, I wasn't trying to
imply counter = SumInts)


On Fri, Apr 13, 2018 at 2:02 PM Andrea Foegler <fo...@google.com> wrote:

> I like the generalization from entity -> labels.  I view the purpose of
> those fields to provide context.  And labels feel like they supports a
> richer set of contexts.
>

If we think such a generalization provides value, I'm fine with doing that
now, as sets or key-value maps, if we have good enough examples to justify
this.


> The URN concept gets a little tricky.  I totally agree that the context
> fields should not be embedded in the name.
> There's a "name" which is the identifier that can be used to communicate
> what context values are supported / allowed for metrics with that name (for
> example, element_count expects a ptransform ID).  But then there's the
> context.  In Stackdriver, this context is a map of key-value pairs; the
> type is considered metadata associated with the name, but not communicated
> with the value.
>

I'm not quite following you here. If context contains a ptransform id, then
it cannot be associated with a single name.


> Could the URN be "beam:namespace:name" and every metric have a map of
> key-value pairs for context?
>

The URN is the name. Something like
"beam:metric:ptransform_execution_times:v1."


> Not sure where this fits in the discussion or if this is handled
> somewhere, but allowing for a metric configuration that's provided
> independently of the value allows for configuring "type", "units", etc in a
> uniform way without having to encode them in the metric name / value.
> Stackdriver expects each metric type has been configured ahead of time with
> these annotations / metadata.  Then values are reported separately.  For
> system metrics, the definitions can be packaged with the SDK.  For user
> metrics, they'd be defined at runtime.
>

This feels like the metrics spec, which specifies that the metric with
name/URN X has this type plus a bunch of other metadata (e.g. units, if
they're not implicit in the type; this gets into whether the type should be
Duration{Sum,Max,Distribution,...} vs. Int{Sum,Max,Distribution,...} +
units metadata).
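
For illustration only (the URNs, field names, and field numbers here are made
up, and the parameters payload is omitted), the two options might look like:

message MetricSpec {
  string urn = 1;    // e.g. "beam:metric:user:my_namespace:my_name"
  // Option A: units baked into the type URN, e.g.
  //   type = "beam:metrics:sum_duration:v1"
  // Option B: a generic numeric type plus separate units metadata, e.g.
  //   type = "beam:metrics:sum_int64:v1", units = "MILLISECONDS"
  string type = 2;
  string units = 3;  // only meaningful under option B
}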


>
>
> On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:
>
>>
>>
>> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> Also, the only use for payloads is because "User Counter" is currently
>>>>> a single URN, rather than using the namespacing characteristics of URNs to
>>>>> map user names onto distinct metric names.
>>>>>
>>>>
>>>> Can they be URNs? I don't see value in having a "user metric" URN where
>>>> you then have to look elsewhere for what the real name is.
>>>>
>>>
>>> Yes, that was my point with the parenthetical statement. I would rather
>>> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
>>> the payload field for this. So if we're going to keep the payload field, we
>>> need more compelling usecases.
>>>
>>
>> Or just "beam:counter:<namespace>:<name>" or even
>> "beam:metric:<namespace>:<name>" since metrics have a type separate from
>> their name.
>>
>> Kenn
>>
>>
>>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>>> parameters into a name though.) If we're going to choose names that the
>>>> system and sdks agree to have specific meanings, and to avoid accidental
>>>> collisions, making them full-fledged documented URNs has value.
>>>>
>>>
>>>> Value is the "payload". Likely worth changing the name to avoid
>>>> confusion with the payload above. It's bytes because it depends on the
>>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>>> payload). If we thing the types are generally limited, another option would
>>>> be a oneof field (with a bytes option just in case) for transparency. There
>>>> are pros and cons going this route.
>>>>
>>>> Type is what I proposed we add, instead of it being implicit in the
>>>> name (and unknowable if one does not recognize the name). This makes things
>>>> more open-ended and easier to evolve and work with.
>>>>
>>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>>> mentioned I think it makes sense to pull this out as a separate field,
>>>> especially when it makes sense to aggregate a single named counter across
>>>> labels as well as for a single label (e.g. execution time of composite
>>>> transforms).
>>>>
>>>> - Robert
>>>>
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
>>>> wrote:
>>>>
>>>>> Hi folks -
>>>>>
>>>>> Before we totally go down the path of highly structured metric protos,
>>>>> I'd like to propose considering a simple metrics interface between the SDK
>>>>> and the runner.  Something more generic and closer to what most monitoring
>>>>> systems would use.
>>>>>
>>>>> To use Spark as an example, the Metric system uses a simple metric
>>>>> format of name, value and type to report all metrics in a single structure,
>>>>> regardless of the source or context of the metric.
>>>>>
>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>>>
>>>>> The subsystems have contracts for what metrics they will expose and
>>>>> how they are calculated:
>>>>>
>>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>>>
>>>>> Codifying the system metrics in the SDK seems perfectly reasonable -
>>>>> no reason to make the notion of metric generic at that level.  But at the
>>>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>>>> generic encoding of the metrics might make it easier to adapt and maintain
>>>>> system.  The generic format can include information about downstream
>>>>> consumers, if that's useful.
>>>>>
>>>>> Spark supports a number of Metric Sinks - external monitoring
>>>>> systems.  If runners receive a simple list of metrics, implementing any
>>>>> number of Sinks for Beam would be straightforward and would generally be a
>>>>> one time implementation.  If instead all system metrics are sent embedded
>>>>> in a highly structured, semantically meaningful structure, runner code
>>>>> would need to be updated to support exporting the new metric. We seem to be
>>>>> heading in the direction of "if you don't understand this metric, you can't
>>>>> use it / export it".  But most systems seem to assume metrics are really
>>>>> simple named values that can be handled a priori.
>>>>>
>>>>> So I guess my primary question is:  Is it necessary for Beam to treat
>>>>> metrics as highly semantic, arbitrarily complex data?  Or could they
>>>>> possibly be the sort of simple named values as they are in most monitoring
>>>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>>>> add meaning and structure, but simplifying that out before leaving SDK
>>>>> code.  Is the coupling to a semantically meaningful structure between the
>>>>> SDK and runner and necessary complexity?
>>>>>
>>>>> Andrea
>>>>>
>>>>>
>>>>>
>>>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> *Thank you for this clarification. I think the table of files fits
>>>>>>> into the model as one of type string-set (with union as aggregation). *
>>>>>>> Its not a list of files, its a list of metadata for each file,
>>>>>>> several pieces of data per file.
>>>>>>>
>>>>>>> Are you proposing that there would be separate URNs as well for each
>>>>>>> entity being measured then, so the the URN defines the type of entity being
>>>>>>> measured.
>>>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>>>> PCollection entities
>>>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>>>> PTransform entities
>>>>>>>
>>>>>>
>>>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>>>> execution times are never for PCollections, and even if they were it'd be
>>>>>> semantically a very different beast (which should not re-use the same URN).
>>>>>>
>>>>>> *message MetricSpec {*
>>>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>>>> *  // passed as-is.*
>>>>>>> *  string urn = 1;*
>>>>>>>
>>>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>>>> *  bytes parameters_payload = 2;*
>>>>>>>
>>>>>>> *  // (Required) A URN that describes the type of values this metric*
>>>>>>> *  // records (e.g. durations that should be summed).*
>>>>>>> *}*
>>>>>>>
>>>>>>> *message Metric[Values] {*
>>>>>>> * // (Required) The original requesting MetricSpec.*
>>>>>>> * MetricSpec metric_spec = 1;*
>>>>>>>
>>>>>>> * // A mapping of entities to (encoded) values.*
>>>>>>> * map<string, bytes> values;*
>>>>>>> This ignores the non-unqiueness of entity identifiers. This is why
>>>>>>> in my doc, I have specified the entity type and its string identifier
>>>>>>> @Ken, I believe you have pointed this out in the past, that
>>>>>>> uniqueness is only guaranteed within a type of entity (all PCollections),
>>>>>>> but not between entities (A Pcollection and PTransform may have the same
>>>>>>> identifier).
>>>>>>>
>>>>>>
>>>>>> See above for why this is not an issue. The extra complexity (in
>>>>>> protos and code), the inability to use them as map keys, and the fact that
>>>>>> they'll be 100% redundant for all entities for a given metric convinces me
>>>>>> that it's not worth creating and tracking an enum for the type alongside
>>>>>> the id.
>>>>>>
>>>>>>
>>>>>>> *}*
>>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>> To Robert's proto:
>>>>>>>>>
>>>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>>>  map<string, bytes> values;
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>>>> URNs in the doc?
>>>>>>>>>
>>>>>>>>>>
>>>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>>>
>>>>>>>>
>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I
>>>>>>>>>>>> was going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics
>>>>>>>>>>>>> of supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>>>>> metrics right.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Ben
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters
>>>>>>>>>>>>>> are used a lot but others are less mainstream so being too fine from the
>>>>>>>>>>>>>> start can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com>
>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When you say type do you mean accumulator type, result
>>>>>>>>>>>>>>>> type, or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're
>>>>>>>>>>>>>>>>>>> conflating the idea of custom metrics and custom metric types. I would
>>>>>>>>>>>>>>>>>>> propose the MetricSpec field be augmented with an additional field "type"
>>>>>>>>>>>>>>>>>>> which is a urn specifying the type of metric it is (i.e. the contents of
>>>>>>>>>>>>>>>>>>> its payload, as well as the form of aggregation). Summing or maxing over
>>>>>>>>>>>>>>>>>>> ints would be a typical example. Though we could pursue making this opaque
>>>>>>>>>>>>>>>>>>> to the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all
>>>>>>>>>>>>>>>>>>>> metrics refer to a primary entity, so I have restructured some of the
>>>>>>>>>>>>>>>>>>>> protos a little bit.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity
>>>>>>>>>>>>>>>>>>>> associated with it (e.g. PCollection, PTransform), we can design an
>>>>>>>>>>>>>>>>>>>> approach which forwards the opaque bytes metric updates, without
>>>>>>>>>>>>>>>>>>>> deserializing them. These are forwarded to user provided code which then
>>>>>>>>>>>>>>>>>>>> would deserialize the metric update payloads and perform the custom
>>>>>>>>>>>>>>>>>>>> aggregations.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to
>>>>>>>>>>>>>>>>>>>> draw attention to this particular revision.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to
>>>>>>>>>>>>>>>>>>>>> make a decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before
>>>>>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to
>>>>>>>>>>>>>>>>>>>>>> it easily in
>>>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on
>>>>>>>>>>>>>>>>>>>>>> this proposal so far. I
>>>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed
>>>>>>>>>>>>>>>>>>>>>> my recommendation as well
>>>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please
>>>>>>>>>>>>>>>>>>>>>> take another look after
>>>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Andrea Foegler <fo...@google.com>.
I like the generalization from entity -> labels.  I view the purpose of
those fields as providing context.  And labels feel like they support a
richer set of contexts.

The URN concept gets a little tricky.  I totally agree that the context
fields should not be embedded in the name.
There's a "name" which is the identifier that can be used to communicate
what context values are supported / allowed for metrics with that name (for
example, element_count expects a ptransform ID).  But then there's the
context.  In Stackdriver, this context is a map of key-value pairs; the
type is considered metadata associated with the name, but not communicated
with the value.  Could the URN be "beam:namespace:name" and every metric
have a map of key-value pairs for context?

Not sure where this fits in the discussion or if this is handled somewhere,
but allowing for a metric configuration that's provided independently of
the value allows for configuring "type", "units", etc. in a uniform way
without having to encode them in the metric name / value.  Stackdriver
expects each metric type to have been configured ahead of time with these
annotations / metadata.  Then values are reported separately.  For system
metrics, the definitions can be packaged with the SDK.  For user metrics,
they'd be defined at runtime.
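
As a rough sketch (message and field names are illustrative, loosely following
Stackdriver's descriptor/value split), the configuration and the reported
values could be two separate messages:

// Configured once, ahead of time: packaged with the SDK for system metrics,
// registered at runtime for user metrics.
message MetricDescriptor {
  string name = 1;        // e.g. "beam:namespace:name"
  string kind = 2;        // GAUGE, CUMULATIVE, ...
  string value_type = 3;  // INT64, DOUBLE, DISTRIBUTION, ...
  string units = 4;       // "ms", "By", ...
  string description = 5;
}

// Reported repeatedly: carries only the name, the context labels, and the
// value.
message ReportedValue {
  string name = 1;
  map<string, string> labels = 2;  // e.g. { "ptransform": "my_ptransform_id" }
  bytes value = 3;
}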



On Fri, Apr 13, 2018 at 1:32 PM Kenneth Knowles <kl...@google.com> wrote:

>
>
> On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com> wrote:
>>
>>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> Also, the only use for payloads is because "User Counter" is currently
>>>> a single URN, rather than using the namespacing characteristics of URNs to
>>>> map user names onto distinct metric names.
>>>>
>>>
>>> Can they be URNs? I don't see value in having a "user metric" URN where
>>> you then have to look elsewhere for what the real name is.
>>>
>>
>> Yes, that was my point with the parenthetical statement. I would rather
>> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
>> the payload field for this. So if we're going to keep the payload field, we
>> need more compelling usecases.
>>
>
> Or just "beam:counter:<namespace>:<name>" or even
> "beam:metric:<namespace>:<name>" since metrics have a type separate from
> their name.
>
> Kenn
>
>
>> A payload avoids the messiness of having to pack (and parse) arbitrary
>>> parameters into a name though.) If we're going to choose names that the
>>> system and sdks agree to have specific meanings, and to avoid accidental
>>> collisions, making them full-fledged documented URNs has value.
>>>
>>
>>> Value is the "payload". Likely worth changing the name to avoid
>>> confusion with the payload above. It's bytes because it depends on the
>>> type. I would try to avoid nesting it too deeply (e.g. a payload within a
>>> payload). If we thing the types are generally limited, another option would
>>> be a oneof field (with a bytes option just in case) for transparency. There
>>> are pros and cons going this route.
>>>
>>> Type is what I proposed we add, instead of it being implicit in the name
>>> (and unknowable if one does not recognize the name). This makes things more
>>> open-ended and easier to evolve and work with.
>>>
>>> Entity could be generalized to Label, or LabelSet if desired. But as
>>> mentioned I think it makes sense to pull this out as a separate field,
>>> especially when it makes sense to aggregate a single named counter across
>>> labels as well as for a single label (e.g. execution time of composite
>>> transforms).
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
>>> wrote:
>>>
>>>> Hi folks -
>>>>
>>>> Before we totally go down the path of highly structured metric protos,
>>>> I'd like to propose considering a simple metrics interface between the SDK
>>>> and the runner.  Something more generic and closer to what most monitoring
>>>> systems would use.
>>>>
>>>> To use Spark as an example, the Metric system uses a simple metric
>>>> format of name, value and type to report all metrics in a single structure,
>>>> regardless of the source or context of the metric.
>>>>
>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>>
>>>> The subsystems have contracts for what metrics they will expose and how
>>>> they are calculated:
>>>>
>>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>>
>>>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>>>> reason to make the notion of metric generic at that level.  But at the
>>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>>> generic encoding of the metrics might make it easier to adapt and maintain
>>>> the system.  The generic format can include information about downstream
>>>> consumers, if that's useful.
>>>>
>>>> Spark supports a number of Metric Sinks - external monitoring systems.
>>>> If runners receive a simple list of metrics, implementing any number of
>>>> Sinks for Beam would be straightforward and would generally be a one time
>>>> implementation.  If instead all system metrics are sent embedded in a
>>>> highly structured, semantically meaningful structure, runner code would
>>>> need to be updated to support exporting the new metric. We seem to be
>>>> heading in the direction of "if you don't understand this metric, you can't
>>>> use it / export it".  But most systems seem to assume metrics are really
>>>> simple named values that can be handled a priori.
>>>>
>>>> So I guess my primary question is:  Is it necessary for Beam to treat
>>>> metrics as highly semantic, arbitrarily complex data?  Or could they
>>>> possibly be the sort of simple named values as they are in most monitoring
>>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>>> add meaning and structure, but simplifying that out before leaving SDK
>>>> code.  Is the coupling to a semantically meaningful structure between the
>>>> SDK and runner a necessary complexity?
>>>>
>>>> Andrea
>>>>
>>>>
>>>>
>>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> *Thank you for this clarification. I think the table of files fits
>>>>>> into the model as one of type string-set (with union as aggregation). *
>>>>>> It's not a list of files, it's a list of metadata for each file,
>>>>>> several pieces of data per file.
>>>>>>
>>>>>> Are you proposing that there would be separate URNs as well for each
>>>>>> entity being measured then, so the URN defines the type of entity being
>>>>>> measured.
>>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>>> PCollection entities
>>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>>> PTransform entities
>>>>>>
>>>>>
>>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>>> execution times are never for PCollections, and even if they were it'd be
>>>>> semantically a very different beast (which should not re-use the same URN).
>>>>>
>>>>> *message MetricSpec {*
>>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>>> *  // passed as-is.*
>>>>>> *  string urn = 1;*
>>>>>>
>>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>>> *  bytes parameters_payload = 2;*
>>>>>>
>>>>>> *  // (Required) A URN that describes the type of values this metric*
>>>>>> *  // records (e.g. durations that should be summed).*
>>>>>> *}*
>>>>>>
>>>>>> *message Metric[Values] {*
>>>>>> * // (Required) The original requesting MetricSpec.*
>>>>>> * MetricSpec metric_spec = 1;*
>>>>>>
>>>>>> * // A mapping of entities to (encoded) values.*
>>>>>> * map<string, bytes> values;*
>>>>>> This ignores the non-uniqueness of entity identifiers. This is why in
>>>>>> my doc, I have specified the entity type and its string identifier
>>>>>> @Ken, I believe you have pointed this out in the past, that
>>>>>> uniqueness is only guaranteed within a type of entity (all PCollections),
>>>>>> but not between entities (A PCollection and PTransform may have the same
>>>>>> identifier).
>>>>>>
>>>>>
>>>>> See above for why this is not an issue. The extra complexity (in
>>>>> protos and code), the inability to use them as map keys, and the fact that
>>>>> they'll be 100% redundant for all entities for a given metric convinces me
>>>>> that it's not worth creating and tracking an enum for the type alongside
>>>>> the id.
>>>>>
>>>>>
>>>>>> *}*
>>>>>>
>>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> To Robert's proto:
>>>>>>>>
>>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>>  map<string, bytes> values;
>>>>>>>>>
>>>>>>>>
>>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>>> URNs in the doc?
>>>>>>>>
>>>>>>>>>
>>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>>
>>>>>>>
>>>>>>>> }
>>>>>>>>>
>>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics
>>>>>>>>>>>> of supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>>>> that means we have a fixed set of aggregations (that align with what
>>>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>>>> metrics right.
>>>>>>>>>>>>
>>>>>>>>>>>> - Ben
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are
>>>>>>>>>>>>> used a lot but others are less mainstream so being too fine from the start
>>>>>>>>>>>>> can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com>
>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When you say type do you mean accumulator type, result type,
>>>>>>>>>>>>>>> or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>>>> system metrics?
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're
>>>>>>>>>>>>>>>>>> conflating the idea of custom metrics and custom metric types. I would
>>>>>>>>>>>>>>>>>> propose the MetricSpec field be augmented with an additional field "type"
>>>>>>>>>>>>>>>>>> which is a urn specifying the type of metric it is (i.e. the contents of
>>>>>>>>>>>>>>>>>> its payload, as well as the form of aggregation). Summing or maxing over
>>>>>>>>>>>>>>>>>> ints would be a typical example. Though we could pursue making this opaque
>>>>>>>>>>>>>>>>>> to the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and its spec would
>>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all
>>>>>>>>>>>>>>>>>>> metrics refer to a primary entity, so I have restructured some of the
>>>>>>>>>>>>>>>>>>> protos a little bit.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity
>>>>>>>>>>>>>>>>>>> associated with it (e.g. PCollection, PTransform), we can design an
>>>>>>>>>>>>>>>>>>> approach which forwards the opaque bytes metric updates, without
>>>>>>>>>>>>>>>>>>> deserializing them. These are forwarded to user provided code which then
>>>>>>>>>>>>>>>>>>> would deserialize the metric update payloads and perform the custom
>>>>>>>>>>>>>>>>>>> aggregations.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make
>>>>>>>>>>>>>>>>>>>> a decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before
>>>>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on
>>>>>>>>>>>>>>>>>>>>> this proposal so far. I
>>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please
>>>>>>>>>>>>>>>>>>>>> take another look after
>>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Kenneth Knowles <kl...@google.com>.
On Fri, Apr 13, 2018 at 1:27 PM Robert Bradshaw <ro...@google.com> wrote:

> On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com> wrote:
>
>> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> Also, the only use for payloads is because "User Counter" is currently a
>>> single URN, rather than using the namespacing characteristics of URNs to
>>> map user names onto distinct metric names.
>>>
>>
>> Can they be URNs? I don't see value in having a "user metric" URN where
>> you then have to look elsewhere for what the real name is.
>>
>
> Yes, that was my point with the parenthetical statement. I would rather
> have "beam:counter:user:use_provide_namespace:user_provide_name" than use
> the payload field for this. So if we're going to keep the payload field, we
> need more compelling use cases.
>

Or just "beam:counter:<namespace>:<name>" or even
"beam:metric:<namespace>:<name>" since metrics have a type separate from
their name.
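
For illustration (the URN strings below are hypothetical, not agreed names),
that would make a user counter spec look roughly like:

spec = {
    # Fully namespaced name; no extra payload is needed to recover the
    # user-provided namespace and name.
    "urn": "beam:metric:my_namespace:my_counter",
    # Type carried in its own field, describing encoding and aggregation.
    "type": "beam:metrics:sum_int64",
}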

Kenn


> A payload avoids the messiness of having to pack (and parse) arbitrary
>> parameters into a name though.) If we're going to choose names that the
>> system and sdks agree to have specific meanings, and to avoid accidental
>> collisions, making them full-fledged documented URNs has value.
>>
>
>> Value is the "payload". Likely worth changing the name to avoid confusion
>> with the payload above. It's bytes because it depends on the type. I would
>> try to avoid nesting it too deeply (e.g. a payload within a payload). If we
>> think the types are generally limited, another option would be a oneof
>> field (with a bytes option just in case) for transparency. There are pros
>> and cons going this route.
>>
>> Type is what I proposed we add, instead of it being implicit in the name
>> (and unknowable if one does not recognize the name). This makes things more
>> open-ended and easier to evolve and work with.
>>
>> Entity could be generalized to Label, or LabelSet if desired. But as
>> mentioned I think it makes sense to pull this out as a separate field,
>> especially when it makes sense to aggregate a single named counter across
>> labels as well as for a single label (e.g. execution time of composite
>> transforms).
>>
>> - Robert
>>
>>
>>
>> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
>> wrote:
>>
>>> Hi folks -
>>>
>>> Before we totally go down the path of highly structured metric protos,
>>> I'd like to propose considering a simple metrics interface between the SDK
>>> and the runner.  Something more generic and closer to what most monitoring
>>> systems would use.
>>>
>>> To use Spark as an example, the Metric system uses a simple metric
>>> format of name, value and type to report all metrics in a single structure,
>>> regardless of the source or context of the metric.
>>>
>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>>
>>> The subsystems have contracts for what metrics they will expose and how
>>> they are calculated:
>>>
>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>>
>>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>>> reason to make the notion of metric generic at that level.  But at the
>>> point the metric is leaving the SDK and going to the runner, a simpler,
>>> generic encoding of the metrics might make it easier to adapt and maintain
>>> the system.  The generic format can include information about downstream
>>> consumers, if that's useful.
>>>
>>> Spark supports a number of Metric Sinks - external monitoring systems.
>>> If runners receive a simple list of metrics, implementing any number of
>>> Sinks for Beam would be straightforward and would generally be a one time
>>> implementation.  If instead all system metrics are sent embedded in a
>>> highly structured, semantically meaningful structure, runner code would
>>> need to be updated to support exporting the new metric. We seem to be
>>> heading in the direction of "if you don't understand this metric, you can't
>>> use it / export it".  But most systems seem to assume metrics are really
>>> simple named values that can be handled a priori.
>>>
>>> So I guess my primary question is:  Is it necessary for Beam to treat
>>> metrics as highly semantic, arbitrarily complex data?  Or could they
>>> possibly be the sort of simple named values as they are in most monitoring
>>> systems and in Spark?  With the SDK potentially providing scaffolding to
>>> add meaning and structure, but simplifying that out before leaving SDK
>>> code.  Is the coupling to a semantically meaningful structure between the
>>> SDK and runner a necessary complexity?
>>>
>>> Andrea
>>>
>>>
>>>
>>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com> wrote:
>>>>
>>>>>
>>>>> *Thank you for this clarification. I think the table of files fits
>>>>> into the model as one of type string-set (with union as aggregation). *
>>>>> It's not a list of files, it's a list of metadata for each file, several
>>>>> pieces of data per file.
>>>>>
>>>>> Are you proposing that there would be separate URNs as well for each
>>>>> entity being measured then, so the URN defines the type of entity being
>>>>> measured.
>>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>>> PCollection entities
>>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>>> PTransform entities
>>>>>
>>>>
>>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>>> execution times are never for PCollections, and even if they were it'd be
>>>> semantically a very different beast (which should not re-use the same URN).
>>>>
>>>> *message MetricSpec {*
>>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>>> *  // it) the parameter payload should be treated as opaque and*
>>>>> *  // passed as-is.*
>>>>> *  string urn = 1;*
>>>>>
>>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>>> *  bytes parameters_payload = 2;*
>>>>>
>>>>> *  // (Required) A URN that describes the type of values this metric*
>>>>> *  // records (e.g. durations that should be summed).*
>>>>> *}*
>>>>>
>>>>> *message Metric[Values] {*
>>>>> * // (Required) The original requesting MetricSpec.*
>>>>> * MetricSpec metric_spec = 1;*
>>>>>
>>>>> * // A mapping of entities to (encoded) values.*
>>>>> * map<string, bytes> values;*
>>>>> This ignores the non-uniqueness of entity identifiers. This is why in
>>>>> my doc, I have specified the entity type and its string identifier
>>>>> @Ken, I believe you have pointed this out in the past, that uniqueness
>>>>> is only guaranteed within a type of entity (all PCollections), but not
>>>>> between entities (A PCollection and PTransform may have the same
>>>>> identifier).
>>>>>
>>>>
>>>> See above for why this is not an issue. The extra complexity (in protos
>>>> and code), the inability to use them as map keys, and the fact that they'll
>>>> be 100% redundant for all entities for a given metric convinces me that
>>>> it's not worth creating and tracking an enum for the type alongside the id.
>>>>
>>>>
>>>>> *}*
>>>>>
>>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>>
>>>>>>> To Robert's proto:
>>>>>>>
>>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>>  map<string, bytes> values;
>>>>>>>>
>>>>>>>
>>>>>>> Are the keys here the names of the metrics, aka what is used for
>>>>>>> URNs in the doc?
>>>>>>>
>>>>>>>>
>>>>>> They're the entities to which a metric is attached, e.g. a
>>>>>> PTransform, a PCollection, or perhaps a process/worker.
>>>>>>
>>>>>>
>>>>>>> }
>>>>>>>>
>>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <
>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>>> that means we have a fixed set of aggregations (that align with what
>>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>>> metrics right.
>>>>>>>>>>>
>>>>>>>>>>> - Ben
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are
>>>>>>>>>>>> used a lot but others are less mainstream so being too fine from the start
>>>>>>>>>>>> can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>>
>>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com>
>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> When you say type do you mean accumulator type, result type,
>>>>>>>>>>>>>> or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>>> system metrics?
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating
>>>>>>>>>>>>>>>>> the idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for
>>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and its spec would
>>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated
>>>>>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an approach which
>>>>>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without deserializing them. These
>>>>>>>>>>>>>>>>>> are forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make
>>>>>>>>>>>>>>>>>>> a decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before
>>>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback.
>>>>>>>>>>>>>>>>>>>> There were some larger
>>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please
>>>>>>>>>>>>>>>>>>>> take another look after
>>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Apr 13, 2018 at 1:19 PM Kenneth Knowles <kl...@google.com> wrote:

> On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> Also, the only use for payloads is because "User Counter" is currently a
>> single URN, rather than using the namespacing characteristics of URNs to
>> map user names onto distinct metric names.
>>
>
> Can they be URNs? I don't see value in having a "user metric" URN where
> you then have to look elsewhere for what the real name is.
>

Yes, that was my point with the parenthetical statement. I would rather
have "beam:counter:user:use_provide_namespace:user_provide_name" than use
the payload field for this. So if we're going to keep the payload field, we
need more compelling use cases.
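
As a sketch of how mechanical that mapping is (the prefix and helper below
are placeholders, not part of the proposal):

def user_counter_urn(namespace, name):
    # e.g. user_counter_urn("my_module", "parse_errors")
    #   -> "beam:counter:user:my_module:parse_errors"
    return "beam:counter:user:%s:%s" % (namespace, name)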


> A payload avoids the messiness of having to pack (and parse) arbitrary
> parameters into a name though.) If we're going to choose names that the
> system and sdks agree to have specific meanings, and to avoid accidental
> collisions, making them full-fledged documented URNs has value.
>

> Value is the "payload". Likely worth changing the name to avoid confusion
> with the payload above. It's bytes because it depends on the type. I would
> try to avoid nesting it too deeply (e.g. a payload within a payload). If we
> think the types are generally limited, another option would be a oneof
> field (with a bytes option just in case) for transparency. There are pros
> and cons going this route.
>
> Type is what I proposed we add, instead of it being implicit in the name
> (and unknowable if one does not recognize the name). This makes things more
> open-ended and easier to evolve and work with.
>
> Entity could be generalized to Label, or LabelSet if desired. But as
> mentioned I think it makes sense to pull this out as a separate field,
> especially when it makes sense to aggregate a single named counter across
> labels as well as for a single label (e.g. execution time of composite
> transforms).
>
> - Robert
>
>
>
> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
> wrote:
>
>> Hi folks -
>>
>> Before we totally go down the path of highly structured metric protos,
>> I'd like to propose considering a simple metrics interface between the SDK
>> and the runner.  Something more generic and closer to what most monitoring
>> systems would use.
>>
>> To use Spark as an example, the Metric system uses a simple metric format
>> of name, value and type to report all metrics in a single structure,
>> regardless of the source or context of the metric.
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>
>> The subsystems have contracts for what metrics they will expose and how
>> they are calculated:
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>
>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>> reason to make the notion of metric generic at that level.  But at the
>> point the metric is leaving the SDK and going to the runner, a simpler,
>> generic encoding of the metrics might make it easier to adapt and maintain
>> the system.  The generic format can include information about downstream
>> consumers, if that's useful.
>>
>> Spark supports a number of Metric Sinks - external monitoring systems.
>> If runners receive a simple list of metrics, implementing any number of
>> Sinks for Beam would be straightforward and would generally be a one time
>> implementation.  If instead all system metrics are sent embedded in a
>> highly structured, semantically meaningful structure, runner code would
>> need to be updated to support exporting the new metric. We seem to be
>> heading in the direction of "if you don't understand this metric, you can't
>> use it / export it".  But most systems seem to assume metrics are really
>> simple named values that can be handled a priori.
>>
>> So I guess my primary question is:  Is it necessary for Beam to treat
>> metrics as highly semantic, arbitrarily complex data?  Or could they
>> possibly be the sort of simple named values as they are in most monitoring
>> systems and in Spark?  With the SDK potentially providing scaffolding to
>> add meaning and structure, but simplifying that out before leaving SDK
>> code.  Is the coupling to a semantically meaningful structure between the
>> SDK and runner a necessary complexity?
>>
>> Andrea
>>
>>
>>
>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com> wrote:
>>>
>>>>
>>>> *Thank you for this clarification. I think the table of files fits into
>>>> the model as one of type string-set (with union as aggregation). *
>>>> It's not a list of files, it's a list of metadata for each file, several
>>>> pieces of data per file.
>>>>
>>>> Are you proposing that there would be separate URNs as well for each
>>>> entity being measured then, so the URN defines the type of entity being
>>>> measured.
>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>> PCollection entities
>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>> PTransform entities
>>>>
>>>
>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>> execution times are never for PCollections, and even if they were it'd be
>>> semantically a very different beast (which should not re-use the same URN).
>>>
>>> *message MetricSpec {*
>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>> *  // it) the parameter payload should be treated as opaque and*
>>>> *  // passed as-is.*
>>>> *  string urn = 1;*
>>>>
>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>> *  bytes parameters_payload = 2;*
>>>>
>>>> *  // (Required) A URN that describes the type of values this metric*
>>>> *  // records (e.g. durations that should be summed).*
>>>> *}*
>>>>
>>>> *message Metric[Values] {*
>>>> * // (Required) The original requesting MetricSpec.*
>>>> * MetricSpec metric_spec = 1;*
>>>>
>>>> * // A mapping of entities to (encoded) values.*
>>>> * map<string, bytes> values;*
>>>> This ignores the non-uniqueness of entity identifiers. This is why in
>>>> my doc, I have specified the entity type and its string identifier
>>>> @Ken, I believe you have pointed this out in the past, that uniqueness
>>>> is only guaranteed within a type of entity (all PCollections), but not
>>>> between entities (A PCollection and PTransform may have the same
>>>> identifier).
>>>>
>>>
>>> See above for why this is not an issue. The extra complexity (in protos
>>> and code), the inability to use them as map keys, and the fact that they'll
>>> be 100% redundant for all entities for a given metric convinces me that
>>> it's not worth creating and tracking an enum for the type alongside the id.
>>>
>>>
>>>> *}*
>>>>
>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> To Robert's proto:
>>>>>>
>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>  map<string, bytes> values;
>>>>>>>
>>>>>>
>>>>>> Are the keys here the names of the metrics, aka what is used for URNs
>>>>>> in the doc?
>>>>>>
>>>>>>>
>>>>> They're the entities to which a metric is attached, e.g. a PTransform,
>>>>> a PCollection, or perhaps a process/worker.
>>>>>
>>>>>
>>>>>> }
>>>>>>>
>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>> that means we have a fixed set of aggregations (that align with what
>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>> metrics right.
>>>>>>>>>>
>>>>>>>>>> - Ben
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are
>>>>>>>>>>> used a lot but others are less mainstream so being too fine from the start
>>>>>>>>>>> can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>
>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com>
>>>>>>>>>>> a écrit :
>>>>>>>>>>>
>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> When you say type do you mean accumulator type, result type,
>>>>>>>>>>>>> or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>> system metrics?
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating
>>>>>>>>>>>>>>>> the idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every
>>>>>>>>>>>>>>>> every type X one would have a single URN for UserMetric and its spec would
>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated
>>>>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an approach which
>>>>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without deserializing them. These
>>>>>>>>>>>>>>>>> are forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before
>>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There
>>>>>>>>>>>>>>>>>>> were some larger
>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please
>>>>>>>>>>>>>>>>>>> take another look after
>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Kenneth Knowles <kl...@google.com>.
On Fri, Apr 13, 2018 at 1:07 PM Robert Bradshaw <ro...@google.com> wrote:

> Also, the only use for payloads is because "User Counter" is currently a
> single URN, rather than using the namespacing characteristics of URNs to
> map user names onto distinct metric names.
>

Can they be URNs? I don't see value in having a "user metric" URN where you
then have to look elsewhere for what the real name is.

Kenn


> A payload avoids the messiness of having to pack (and parse) arbitrary
> parameters into a name though.) If we're going to choose names that the
> system and sdks agree to have specific meanings, and to avoid accidental
> collisions, making them full-fledged documented URNs has value.
>
> Value is the "payload". Likely worth changing the name to avoid confusion
> with the payload above. It's bytes because it depends on the type. I would
> try to avoid nesting it too deeply (e.g. a payload within a payload). If we
> think the types are generally limited, another option would be a oneof
> field (with a bytes option just in case) for transparency. There are pros
> and cons going this route.
>
> Type is what I proposed we add, instead of it being implicit in the name
> (and unknowable if one does not recognize the name). This makes things more
> open-ended and easier to evolve and work with.
>
> Entity could be generalized to Label, or LabelSet if desired. But as
> mentioned I think it makes sense to pull this out as a separate field,
> especially when it makes sense to aggregate a single named counter across
> labels as well as for a single label (e.g. execution time of composite
> transforms).
>
> - Robert
>
>
>
> On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com>
> wrote:
>
>> Hi folks -
>>
>> Before we totally go down the path of highly structured metric protos,
>> I'd like to propose considering a simple metrics interface between the SDK
>> and the runner.  Something more generic and closer to what most monitoring
>> systems would use.
>>
>> To use Spark as an example, the Metric system uses a simple metric format
>> of name, value and type to report all metrics in a single structure,
>> regardless of the source or context of the metric.
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>>
>> The subsystems have contracts for what metrics they will expose and how
>> they are calculated:
>>
>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>>
>> Codifying the system metrics in the SDK seems perfectly reasonable - no
>> reason to make the notion of metric generic at that level.  But at the
>> point the metric is leaving the SDK and going to the runner, a simpler,
>> generic encoding of the metrics might make it easier to adapt and maintain
>> the system.  The generic format can include information about downstream
>> consumers, if that's useful.
>>
>> Spark supports a number of Metric Sinks - external monitoring systems.
>> If runners receive a simple list of metrics, implementing any number of
>> Sinks for Beam would be straightforward and would generally be a one time
>> implementation.  If instead all system metrics are sent embedded in a
>> highly structured, semantically meaningful structure, runner code would
>> need to be updated to support exporting the new metric. We seem to be
>> heading in the direction of "if you don't understand this metric, you can't
>> use it / export it".  But most systems seem to assume metrics are really
>> simple named values that can be handled a priori.
>>
>> So I guess my primary question is:  Is it necessary for Beam to treat
>> metrics as highly semantic, arbitrarily complex data?  Or could they
>> possibly be the sort of simple named values as they are in most monitoring
>> systems and in Spark?  With the SDK potentially providing scaffolding to
>> add meaning and structure, but simplifying that out before leaving SDK
>> code.  Is the coupling to a semantically meaningful structure between the
>> SDK and runner a necessary complexity?
>>
>> Andrea
>>
>>
>>
>> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com> wrote:
>>>
>>>>
>>>> *Thank you for this clarification. I think the table of files fits into
>>>> the model as one of type string-set (with union as aggregation). *
>>>> It's not a list of files, it's a list of metadata for each file, several
>>>> pieces of data per file.
>>>>
>>>> Are you proposing that there would be separate URNs as well for each
>>>> entity being measured then, so the URN defines the type of entity being
>>>> measured.
>>>> "urn.beam.metrics.PCollectionByteCount" is a URN for always for
>>>> PCollection entities
>>>> "urn.beam.metrics.PTransformExecutionTime" is a URN is always for
>>>> PTransform entities
>>>>
>>>
>>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>>> execution times are never for PCollections, and even if they were it'd be
>>> semantically a very different beast (which should not re-use the same URN).
>>>
>>> *message MetricSpec {*
>>>> *  // (Required) A URN that describes the accompanying payload.*
>>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>>> *  // it) the parameter payload should be treated as opaque and*
>>>> *  // passed as-is.*
>>>> *  string urn = 1;*
>>>>
>>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>>> *  // the URN does not require any arguments, this may be omitted.*
>>>> *  bytes parameters_payload = 2;*
>>>>
>>>> *  // (Required) A URN that describes the type of values this metric*
>>>> *  // records (e.g. durations that should be summed).*
>>>> *}*
>>>>
>>>> *message Metric[Values] {*
>>>> * // (Required) The original requesting MetricSpec.*
>>>> * MetricSpec metric_spec = 1;*
>>>>
>>>> * // A mapping of entities to (encoded) values.*
>>>> * map<string, bytes> values;*
>>>> This ignores the non-uniqueness of entity identifiers. This is why in
>>>> my doc, I have specified the entity type and its string identifier
>>>> @Ken, I believe you have pointed this out in the past, that uniqueness
>>>> is only guaranteed within a type of entity (all PCollections), but not
>>>> between entities (A PCollection and PTransform may have the same
>>>> identifier).
>>>>
>>>
>>> See above for why this is not an issue. The extra complexity (in protos
>>> and code), the inability to use them as map keys, and the fact that they'll
>>> be 100% redundant for all entities for a given metric convinces me that
>>> it's not worth creating and tracking an enum for the type alongside the id.
>>>
>>>
>>>> *}*
>>>>
>>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com>
>>>>> wrote:
>>>>>
>>>>>>
>>>>>> To Robert's proto:
>>>>>>
>>>>>>  // A mapping of entities to (encoded) values.
>>>>>>>  map<string, bytes> values;
>>>>>>>
>>>>>>
>>>>>> Are the keys here the names of the metrics, aka what is used for URNs
>>>>>> in the doc?
>>>>>>
>>>>>>>
>>>>> They're the entities to which a metric is attached, e.g. a PTransform,
>>>>> a PCollection, or perhaps a process/worker.
>>>>>
>>>>>
>>>>>> }
>>>>>>>
>>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>>> that means we have a fixed set of aggregations (that align with what
>>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>>> metrics right.
>>>>>>>>>>
>>>>>>>>>> - Ben
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are
>>>>>>>>>>> used a lot but others are less mainstream so being too fine from the start
>>>>>>>>>>> can just add complexity and bugs in impls IMHO.
>>>>>>>>>>>
>>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com>
>>>>>>>>>>> a écrit :
>>>>>>>>>>>
>>>>>>>>>>>> By "type" of metric, I mean both the data types (including
>>>>>>>>>>>> their encoding) and accumulator strategy. So sumint would be a type, as
>>>>>>>>>>>> would double-distribution.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> When you say type do you mean accumulator type, result type,
>>>>>>>>>>>>> or accumulator strategy? Specifically, what is the "type" of sumint,
>>>>>>>>>>>>> sumlong, meanlong, etc?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating
>>>>>>>>>>>>>>>> the idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every
>>>>>>>>>>>>>>>> type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The point of this change was to futureproof the
>>>>>>>>>>>>>>>>> possibility of allowing custom user metrics, with custom aggregation
>>>>>>>>>>>>>>>>> functions for its metric updates.
>>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated
>>>>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an approach which
>>>>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without deserializing them. These
>>>>>>>>>>>>>>>>> are forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before
>>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There
>>>>>>>>>>>>>>>>>>> were some larger
>>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of
>>>>>>>>>>>>>>>>>>> these I have added a
>>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please
>>>>>>>>>>>>>>>>>>> take another look after
>>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
+1 to keeping things simple, both in code and the model to understand.

I like thinking of things as (name, value, type) triples. Historically,
we've packed the entity name (e.g. PTransform name) into the string name
field and parsed it out in various places; I think it's worth pulling this
out and making it explicit instead, so metrics would be (name, entity,
value, type) tuples. In the current proposal:

Name is the URN + a possible bytes payload. (Actually, it's a bit unclear
if there's any relationship between counters with the same name and
different payloads. Also, the only use for payloads is because "User
Counter" is currently a single URN, rather than using the namespacing
characteristics of URNs to map user names onto distinct metric names. A
payload avoids the messiness of having to pack (and parse) arbitrary
parameters into a name though.) If we're going to choose names that the
system and SDKs agree to have specific meanings, and to avoid accidental
collisions, making them full-fledged documented URNs has value.

Value is the "payload". Likely worth changing the name to avoid confusion
with the payload above. It's bytes because it depends on the type. I would
try to avoid nesting it too deeply (e.g. a payload within a payload). If we
think the types are generally limited, another option would be a oneof
field (with a bytes option just in case) for transparency. There are pros
and cons to going this route.

Type is what I proposed we add, instead of it being implicit in the name
(and unknowable if one does not recognize the name). This makes things more
open-ended and easier to evolve and work with.

Entity could be generalized to Label, or LabelSet if desired. But as
mentioned I think it makes sense to pull this out as a separate field,
especially when it makes sense to aggregate a single named counter across
labels as well as for a single label (e.g. execution time of composite
transforms).
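
To make the shape concrete, here is a rough proto sketch of the tuple I'm
describing. It is purely illustrative; the field names, numbers and example
URNs below are placeholders I made up, not a worked-out proposal:

message MetricSpec {
  // Name: a full-fledged, documented URN, e.g. something like
  // "beam:metric:ptransform_execution_time:v1" or, for a user metric,
  // "beam:metric:user:my_namespace:my_counter:v1".
  string urn = 1;

  // Optional parameters to the URN (treated as opaque if unrecognized).
  bytes payload = 2;

  // Type: a URN fixing both the value encoding and the aggregation,
  // e.g. something like "beam:metric_type:sum_int64:v1" or
  // "beam:metric_type:distribution_double:v1".
  string type = 3;
}

message MetricValues {
  // The spec these values were reported against.
  MetricSpec spec = 1;

  // Entity (a PTransform id, PCollection id, worker id, ...) to value,
  // encoded as dictated by the type URN above.
  map<string, bytes> values = 2;
}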

- Robert



On Fri, Apr 13, 2018 at 12:36 PM Andrea Foegler <fo...@google.com> wrote:

> Hi folks -
>
> Before we totally go down the path of highly structured metric protos, I'd
> like to propose considering a simple metrics interface between the SDK and
> the runner.  Something more generic and closer to what most monitoring
> systems would use.
>
> To use Spark as an example, the Metric system uses a simple metric format
> of name, value and type to report all metrics in a single structure,
> regardless of the source or context of the metric.
>
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html
>
> The subsystems have contracts for what metrics they will expose and how
> they are calculated:
>
> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html
>
> Codifying the system metrics in the SDK seems perfectly reasonable - no
> reason to make the notion of metric generic at that level.  But at the
> point the metric is leaving the SDK and going to the runner, a simpler,
> generic encoding of the metrics might make it easier to adapt and maintain
> the system.  The generic format can include information about downstream
> consumers, if that's useful.
>
> Spark supports a number of Metric Sinks - external monitoring systems.  If
> runners receive a simple list of metrics, implementing any number of Sinks
> for Beam would be straightforward and would generally be a one time
> implementation.  If instead all system metrics are sent embedded in a
> highly structured, semantically meaningful structure, runner code would
> need to be updated to support exporting the new metric. We seem to be
> heading in the direction of "if you don't understand this metric, you can't
> use it / export it".  But most systems seem to assume metrics are really
> simple named values that can be handled a priori.
>
> So I guess my primary question is:  Is it necessary for Beam to treat
> metrics as highly semantic, arbitrarily complex data?  Or could they
> possibly be the sort of simple named values as they are in most monitoring
> systems and in Spark?  With the SDK potentially providing scaffolding to
> add meaning and structure, but simplifying that out before leaving SDK
> code.  Is the coupling to a semantically meaningful structure between the
> SDK and runner a necessary complexity?
>
> Andrea
>
>
>
> On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com> wrote:
>>
>>>
>>> *Thank you for this clarification. I think the table of files fits into
>>> the model as one of type string-set (with union as aggregation). *
>>> It's not a list of files; it's a list of metadata for each file, with
>>> several pieces of data per file.
>>>
>>> Are you proposing that there would be separate URNs as well for each
>>> entity being measured then, so that the URN defines the type of entity
>>> being measured?
>>> "urn.beam.metrics.PCollectionByteCount" is a URN that is always for
>>> PCollection entities
>>> "urn.beam.metrics.PTransformExecutionTime" is a URN that is always for
>>> PTransform entities
>>>
>>
>> Yes. FWIW, it may not even be needed to put this in the name, e.g.
>> execution times are never for PCollections, and even if they were it'd be
>> semantically a very different beast (which should not re-use the same URN).
>>
>> *message MetricSpec {*
>>> *  // (Required) A URN that describes the accompanying payload.*
>>> *  // For any URN that is not recognized (by whomever is inspecting*
>>> *  // it) the parameter payload should be treated as opaque and*
>>> *  // passed as-is.*
>>> *  string urn = 1;*
>>>
>>> *  // (Optional) The data specifying any parameters to the URN. If*
>>> *  // the URN does not require any arguments, this may be omitted.*
>>> *  bytes parameters_payload = 2;*
>>>
>>> *  // (Required) A URN that describes the type of values this metric*
>>> *  // records (e.g. durations that should be summed).*
>>> *}*
>>>
>>> *message Metric[Values] {*
>>> * // (Required) The original requesting MetricSpec.*
>>> * MetricSpec metric_spec = 1;*
>>>
>>> * // A mapping of entities to (encoded) values.*
>>> * map<string, bytes> values;*
>>> This ignores the non-uniqueness of entity identifiers. This is why in my
>>> doc, I have specified the entity type and its string identifier.
>>> @Ken, I believe you have pointed this out in the past, that uniqueness
>>> is only guaranteed within a type of entity (all PCollections), but not
>>> between entities (a PCollection and a PTransform may have the same
>>> identifier).
>>>
>>
>> See above for why this is not an issue. The extra complexity (in protos
>> and code), the inability to use them as map keys, and the fact that they'll
>> be 100% redundant for all entities for a given metric convinces me that
>> it's not worth creating and tracking an enum for the type alongside the id.
>>
>>
>>> *}*
>>>
>>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com> wrote:
>>>>
>>>>>
>>>>> To Robert's proto:
>>>>>
>>>>>  // A mapping of entities to (encoded) values.
>>>>>>  map<string, bytes> values;
>>>>>>
>>>>>
>>>>> Are the keys here the names of the metrics, aka what is used for URNs
>>>>> in the doc?
>>>>>
>>>>>>
>>>> They're the entities to which a metric is attached, e.g. a PTransform,
>>>> a PCollection, or perhaps a process/worker.
>>>>
>>>>
>>>>> }
>>>>>>
>>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>>> metrics right.
>>>>>>>>>
>>>>>>>>> - Ben
>>>>>>>>>
>>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are
>>>>>>>>>> used a lot but others are less mainstream so being too fine from the start
>>>>>>>>>> can just add complexity and bugs in impls IMHO.
>>>>>>>>>>
>>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>>>>>>>> écrit :
>>>>>>>>>>
>>>>>>>>>>> By "type" of metric, I mean both the data types (including their
>>>>>>>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>>>>>>>> double-distribution.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>>>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>>>>>>>> meanlong, etc?
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>>> standard type.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating
>>>>>>>>>>>>>>> the idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every
>>>>>>>>>>>>>>> type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The point of this change was to futureproof the possibility
>>>>>>>>>>>>>>>> of allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>>>>>>>> metric updates.
>>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated
>>>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an approach which
>>>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without deserializing them. These
>>>>>>>>>>>>>>>> are forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before
>>>>>>>>>>>>>>>>> then and I will post the final decisions made to this thread Friday
>>>>>>>>>>>>>>>>> afternoon.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There
>>>>>>>>>>>>>>>>>> were some larger
>>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of these
>>>>>>>>>>>>>>>>>> I have added a
>>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please
>>>>>>>>>>>>>>>>>> take another look after
>>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Andrea Foegler <fo...@google.com>.
Hi folks -

Before we totally go down the path of highly structured metric protos, I'd
like to propose considering a simple metrics interface between the SDK and
the runner.  Something more generic and closer to what most monitoring
systems would use.

To use Spark as an example, the Metric system uses a simple metric format
of name, value and type to report all metrics in a single structure,
regardless of the source or context of the metric.
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-MetricsSystem.html

The subsystems have contracts for what metrics they will expose and how
they are calculated:
https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-taskmetrics-ShuffleWriteMetrics.html

Codifying the system metrics in the SDK seems perfectly reasonable - no
reason to make the notion of metric generic at that level.  But at the
point the metric is leaving the SDK and going to the runner, a simpler,
generic encoding of the metrics might make it easier to adapt and maintain
the system.  The generic format can include information about downstream
consumers, if that's useful.

Spark supports a number of Metric Sinks - external monitoring systems.  If
runners receive a simple list of metrics, implementing any number of Sinks
for Beam would be straightforward and would generally be a one time
implementation.  If instead all system metrics are sent embedded in a
highly structured, semantically meaningful structure, runner code would
need to be updated to support exporting the new metric. We seem to be
heading in the direction of "if you don't understand this metric, you can't
use it / export it".  But most systems seem to assume metrics are really
simple named values that can be handled a priori.
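
For the sake of discussion, the flat record I have in mind could be as
small as the sketch below (the field names and example strings are
invented, not a concrete proposal):

// One entry per metric reported from the SDK to the runner.
message SimpleMetric {
  string name = 1;   // e.g. "ptransform/<id>/execution_time_msec"
  string type = 2;   // e.g. "counter", "gauge"
  double value = 3;  // a plain numeric value, as most metric sinks expect
}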

So I guess my primary question is:  Is it necessary for Beam to treat
metrics as highly semantic, arbitrarily complex data?  Or could they
possibly be the sort of simple named values as they are in most monitoring
systems and in Spark?  With the SDK potentially providing scaffolding to
add meaning and structure, but simplifying that out before leaving SDK
code.  Is the coupling to a semantically meaningful structure between the
SDK and runner a necessary complexity?

Andrea



On Fri, Apr 13, 2018 at 10:20 AM Robert Bradshaw <ro...@google.com>
wrote:

> On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com> wrote:
>
>>
>> *Thank you for this clarification. I think the table of files fits into
>> the model as one of type string-set (with union as aggregation). *
>> It's not a list of files; it's a list of metadata for each file, with
>> several pieces of data per file.
>>
>> Are you proposing that there would be separate URNs as well for each
>> entity being measured then, so that the URN defines the type of entity
>> being measured?
>> "urn.beam.metrics.PCollectionByteCount" is a URN that is always for
>> PCollection entities
>> "urn.beam.metrics.PTransformExecutionTime" is a URN that is always for
>> PTransform entities
>>
>
> Yes. FWIW, it may not even be needed to put this in the name, e.g.
> execution times are never for PCollections, and even if they were it'd be
> semantically a very different beast (which should not re-use the same URN).
>
> *message MetricSpec {*
>> *  // (Required) A URN that describes the accompanying payload.*
>> *  // For any URN that is not recognized (by whomever is inspecting*
>> *  // it) the parameter payload should be treated as opaque and*
>> *  // passed as-is.*
>> *  string urn = 1;*
>>
>> *  // (Optional) The data specifying any parameters to the URN. If*
>> *  // the URN does not require any arguments, this may be omitted.*
>> *  bytes parameters_payload = 2;*
>>
>> *  // (Required) A URN that describes the type of values this metric*
>> *  // records (e.g. durations that should be summed).*
>> *}*
>>
>> *message Metric[Values] {*
>> * // (Required) The original requesting MetricSpec.*
>> * MetricSpec metric_spec = 1;*
>>
>> * // A mapping of entities to (encoded) values.*
>> * map<string, bytes> values;*
>> This ignores the non-uniqueness of entity identifiers. This is why in my
>> doc, I have specified the entity type and its string identifier.
>> @Ken, I believe you have pointed this out in the past, that uniqueness is
>> only guaranteed within a type of entity (all PCollections), but not between
>> entities (a PCollection and a PTransform may have the same identifier).
>>
>
> See above for why this is not an issue. The extra complexity (in protos
> and code), the inability to use them as map keys, and the fact that they'll
> be 100% redundant for all entities for a given metric convinces me that
> it's not worth creating and tracking an enum for the type alongside the id.
>
>
>> *}*
>>
>> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>>
>>>> To Robert's proto:
>>>>
>>>>  // A mapping of entities to (encoded) values.
>>>>>  map<string, bytes> values;
>>>>>
>>>>
>>>> Are the keys here the names of the metrics, aka what is used for URNs
>>>> in the doc?
>>>>
>>>>>
>>> They're the entities to which a metric is attached, e.g. a PTransform, a
>>> PCollection, or perhaps a process/worker.
>>>
>>>
>>>> }
>>>>>
>>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>>> metrics right.
>>>>>>>>
>>>>>>>> - Ben
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are
>>>>>>>>> used a lot but others are less mainstream so being too fine from the start
>>>>>>>>> can just add complexity and bugs in impls IMHO.
>>>>>>>>>
>>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>>>>>>> écrit :
>>>>>>>>>
>>>>>>>>>> By "type" of metric, I mean both the data types (including their
>>>>>>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>>>>>>> double-distribution.
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>>>>>>> meanlong, etc?
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>>> standard type.
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> The metric api is designed to prevent user defined metric
>>>>>>>>>>>>> types based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating
>>>>>>>>>>>>>> the idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every
>>>>>>>>>>>>>> type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <
>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The point of this change was to futureproof the possibility
>>>>>>>>>>>>>>> of allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>>>>>>> metric updates.
>>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated
>>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an approach which
>>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without deserializing them. These
>>>>>>>>>>>>>>> are forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I think it has also simplified some of the URN metric
>>>>>>>>>>>>>>> protos, as they do not need to keep track of ptransform names inside
>>>>>>>>>>>>>>> themselves now. The result is simpler structures, for the metrics as the
>>>>>>>>>>>>>>> entities are pulled outside of the metric.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Please make sure that you provide your feedback before then
>>>>>>>>>>>>>>>> and I will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There
>>>>>>>>>>>>>>>>> were some larger
>>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of these
>>>>>>>>>>>>>>>>> I have added a
>>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>>>>>>>> another look after
>>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Apr 13, 2018 at 10:10 AM Alex Amato <aj...@google.com> wrote:

>
> *Thank you for this clarification. I think the table of files fits into
> the model as one of type string-set (with union as aggregation). *
> It's not a list of files; it's a list of metadata for each file, with
> several pieces of data per file.
>
> Are you proposing that there would be separate URNs as well for each
> entity being measured then, so that the URN defines the type of entity
> being measured?
> "urn.beam.metrics.PCollectionByteCount" is a URN that is always for
> PCollection entities
> "urn.beam.metrics.PTransformExecutionTime" is a URN that is always for
> PTransform entities
>

Yes. FWIW, it may not even be needed to put this in the name, e.g.
execution times are never for PCollections, and even if they were it'd be
semantically a very different beast (which should not re-use the same URN).

*message MetricSpec {*
> *  // (Required) A URN that describes the accompanying payload.*
> *  // For any URN that is not recognized (by whomever is inspecting*
> *  // it) the parameter payload should be treated as opaque and*
> *  // passed as-is.*
> *  string urn = 1;*
>
> *  // (Optional) The data specifying any parameters to the URN. If*
> *  // the URN does not require any arguments, this may be omitted.*
> *  bytes parameters_payload = 2;*
>
> *  // (Required) A URN that describes the type of values this metric*
> *  // records (e.g. durations that should be summed).*
> *}*
>
> *message Metric[Values] {*
> * // (Required) The original requesting MetricSpec.*
> * MetricSpec metric_spec = 1;*
>
> * // A mapping of entities to (encoded) values.*
> * map<string, bytes> values;*
> This ignores the non-uniqueness of entity identifiers. This is why in my
> doc, I have specified the entity type and its string identifier.
> @Ken, I believe you have pointed this out in the past, that uniqueness is
> only guaranteed within a type of entity (all PCollections), but not between
> entities (a PCollection and a PTransform may have the same identifier).
>

See above for why this is not an issue. The extra complexity (in protos and
code), the inability to use them as map keys, and the fact that they'll be
100% redundant for all entities for a given metric convinces me that it's
not worth creating and tracking an enum for the type alongside the id.


> *}*
>
> On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com>
> wrote:
>
>> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com> wrote:
>>
>>>
>>> To Robert's proto:
>>>
>>>  // A mapping of entities to (encoded) values.
>>>>  map<string, bytes> values;
>>>>
>>>
>>> Are the keys here the names of the metrics, aka what is used for URNs in
>>> the doc?
>>>
>>>>
>> They're the entities to which a metric is attached, e.g. a PTransform, a
>> PCollection, or perhaps a process/worker.
>>
>>
>>> }
>>>>
>>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com> wrote:
>>>>>
>>>>>> Agree with all of this. It echoes a thread on the doc that I was
>>>>>> going to bring here. Let's keep it simple and use concrete use cases to
>>>>>> drive additional abstraction if/when it becomes compelling.
>>>>>>
>>>>>> Kenn
>>>>>>
>>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>>> metrics right.
>>>>>>>
>>>>>>> - Ben
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> Maybe leave it out until proven it is needed. ATM counters are used
>>>>>>>> a lot but others are less mainstream so being too fine from the start can
>>>>>>>> just add complexity and bugs in impls IMHO.
>>>>>>>>
>>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>>>>>> écrit :
>>>>>>>>
>>>>>>>>> By "type" of metric, I mean both the data types (including their
>>>>>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>>>>>> double-distribution.
>>>>>>>>>
>>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <
>>>>>>>>> bjchambers@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>>>>>> meanlong, etc?
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Fully custom metric types is the "more speculative and
>>>>>>>>>>> difficult" feature that I was proposing we kick down the road (and may
>>>>>>>>>>> never get to). What I'm suggesting is that we support custom metrics of
>>>>>>>>>>> standard type.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The metric api is designed to prevent user defined metric types
>>>>>>>>>>>> based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>>
>>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>>> system metrivs?
>>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One thing that has occurred to me is that we're conflating the
>>>>>>>>>>>>> idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every
>>>>>>>>>>>>> type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>>
>>>>>>>>>>>>> - Robert
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The point of this change was to futureproof the possibility
>>>>>>>>>>>>>> of allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>>>>>> metric updates.
>>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated
>>>>>>>>>>>>>> with it (e.g. PCollection, PTransform), we can design an approach which
>>>>>>>>>>>>>> forwards the opaque bytes metric updates, without deserializing them. These
>>>>>>>>>>>>>> are forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I think it has also simplified some of the URN metric protos,
>>>>>>>>>>>>>> as they do not need to keep track of ptransform names inside themselves
>>>>>>>>>>>>>> now. The result is simpler structures, for the metrics as the entities are
>>>>>>>>>>>>>> pulled outside of the metric.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <
>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Please make sure that you provide your feedback before then
>>>>>>>>>>>>>>> and I will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There
>>>>>>>>>>>>>>>> were some larger
>>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of these I
>>>>>>>>>>>>>>>> have added a
>>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised
>>>>>>>>>>>>>>>> proposal. Please take
>>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>>>>>>> another look after
>>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
*Thank you for this clarification. I think the table of files fits into the
model as one of type string-set (with union as aggregation). *
It's not a list of files; it's a list of metadata for each file, with
several pieces of data per file.

Are you proposing that there would be separate URNs as well for each entity
being measured then, so that the URN defines the type of entity being
measured?
"urn.beam.metrics.PCollectionByteCount" is a URN that is always for
PCollection entities
"urn.beam.metrics.PTransformExecutionTime" is a URN that is always for
PTransform entities

*message MetricSpec {*
*  // (Required) A URN that describes the accompanying payload.*
*  // For any URN that is not recognized (by whomever is inspecting*
*  // it) the parameter payload should be treated as opaque and*
*  // passed as-is.*
*  string urn = 1;*

*  // (Optional) The data specifying any parameters to the URN. If*
*  // the URN does not require any arguments, this may be omitted.*
*  bytes parameters_payload = 2;*

*  // (Required) A URN that describes the type of values this metric*
*  // records (e.g. durations that should be summed).*
*}*

*message Metric[Values] {*
* // (Required) The original requesting MetricSpec.*
* MetricSpec metric_spec = 1;*

* // A mapping of entities to (encoded) values.*
* map<string, bytes> values;*
This ignores the non-uniqueness of entity identifiers. This is why in my
doc, I have specified the entity type and its string identifier.
@Ken, I believe you have pointed this out in the past, that uniqueness is
only guaranteed within a type of entity (all PCollections), but not between
entities (a PCollection and a PTransform may have the same identifier).
*}*
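
For reference, the entity reference in my doc looks roughly like the
sketch below (the message and enum names here are only illustrative):

// Carries the entity type alongside the identifier, so that colliding
// identifiers from different entity types stay unambiguous.
message AggregationEntity {
  enum Kind {
    UNSPECIFIED = 0;
    PTRANSFORM = 1;
    PCOLLECTION = 2;
    // ... other entity kinds as needed
  }
  Kind kind = 1;   // the type of entity being measured
  string id = 2;   // unique only within a single kind
}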

On Fri, Apr 13, 2018 at 9:14 AM Robert Bradshaw <ro...@google.com> wrote:

> On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com> wrote:
>
>>
>> To Robert's proto:
>>
>>  // A mapping of entities to (encoded) values.
>>>  map<string, bytes> values;
>>>
>>
>> Are the keys here the names of the metrics, aka what is used for URNs in
>> the doc?
>>
>>>
> They're the entities to which a metric is attached, e.g. a PTransform, a
> PCollection, or perhaps a process/worker.
>
>
>> }
>>>
>>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com> wrote:
>>>>
>>>>> Agree with all of this. It echoes a thread on the doc that I was going
>>>>> to bring here. Let's keep it simple and use concrete use cases to drive
>>>>> additional abstraction if/when it becomes compelling.
>>>>>
>>>>> Kenn
>>>>>
>>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>>> that means we have a fixed set of aggregations (that align with what
>>>>>> users want and metrics back end support) it seems like we are doing user
>>>>>> metrics right.
>>>>>>
>>>>>> - Ben
>>>>>>
>>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> Maybe leave it out until proven it is needed. ATM counters are used
>>>>>>> a lot but others are less mainstream so being too fine from the start can
>>>>>>> just add complexity and bugs in impls IMHO.
>>>>>>>
>>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>>>>> écrit :
>>>>>>>
>>>>>>>> By "type" of metric, I mean both the data types (including their
>>>>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>>>>> double-distribution.
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>>>>> meanlong, etc?
>>>>>>>>>
>>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Fully custom metric types is the "more speculative and difficult"
>>>>>>>>>> feature that I was proposing we kick down the road (and may never get to).
>>>>>>>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <
>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>>> The metric api is designed to prevent user defined metric types
>>>>>>>>>>> based on the fact they just weren't used enough to justify support.
>>>>>>>>>>>
>>>>>>>>>>> Is there a reason we are bringing that complexity back?
>>>>>>>>>>> Shouldn't we just need the ability for the standard set plus any special
>>>>>>>>>>> system metrivs?
>>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>>
>>>>>>>>>>>> One thing that has occurred to me is that we're conflating the
>>>>>>>>>>>> idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>>
>>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every
>>>>>>>>>>>> type X one would have a single URN for UserMetric and it spec would
>>>>>>>>>>>> designate the type and payload designate the (qualified) name.
>>>>>>>>>>>>
>>>>>>>>>>>> - Robert
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>>> I have made a revision today which is to make all metrics
>>>>>>>>>>>>> refer to a primary entity, so I have restructured some of the protos a
>>>>>>>>>>>>> little bit.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>>>>> metric updates.
>>>>>>>>>>>>> Now that each metric has an aggregation_entity associated with
>>>>>>>>>>>>> it (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I think it has also simplified some of the URN metric protos,
>>>>>>>>>>>>> as they do not need to keep track of ptransform names inside themselves
>>>>>>>>>>>>> now. The result is simpler structures, for the metrics as the entities are
>>>>>>>>>>>>> pulled outside of the metric.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Please make sure that you provide your feedback before then
>>>>>>>>>>>>>> and I will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <
>>>>>>>>>>>>>> iemejia@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it
>>>>>>>>>>>>>>> easily in
>>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There
>>>>>>>>>>>>>>> were some larger
>>>>>>>>>>>>>>> >> questions asking about alternatives. For each of these I
>>>>>>>>>>>>>>> have added a
>>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>>>>>>>> Please take
>>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>>>>>> another look after
>>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Fri, Apr 13, 2018 at 8:31 AM Kenneth Knowles <kl...@google.com> wrote:

>
> To Robert's proto:
>
>  // A mapping of entities to (encoded) values.
>>  map<string, bytes> values;
>>
>
> Are the keys here the names of the metrics, aka what is used for URNs in
> the doc?
>
>>
They're the entities to which a metric is attached, e.g. a PTransform, a
PCollection, or perhaps a process/worker.
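
To make that concrete, here is a minimal sketch of what such an entity-keyed
map could hold, assuming made-up entity id formats and a simple fixed-width
int64 encoding (the actual encoding would be whatever the metric's spec
dictates):

import struct

def encode_int64(x):
    # Illustrative encoding only; the proposal leaves the payload encoding
    # up to the metric's spec/type, not a fixed int64 format.
    return struct.pack(">q", x)

# Keys are entity ids (a PTransform, a PCollection, a worker process); the id
# formats below are invented for the example. Values are the encoded
# accumulators reported for each entity.
values = {
    "ptransform/ParDo(ParseFn)": encode_int64(1024),
    "pcollection/pc_17": encode_int64(52310),
    "process/worker_3": encode_int64(87),
}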


> }
>>
>> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>> Agree with all of this. It echoes a thread on the doc that I was going
>>>> to bring here. Let's keep it simple and use concrete use cases to drive
>>>> additional abstraction if/when it becomes compelling.
>>>>
>>>> Kenn
>>>>
>>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>>> wrote:
>>>>
>>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>>> that means we have a fixed set of aggregations (that align with what what
>>>>> users want and metrics back end support) it seems like we are doing user
>>>>> metrics right.
>>>>>
>>>>> - Ben
>>>>>
>>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> Maybe leave it out until proven it is needed. ATM counters are used a
>>>>>> lot but others are less mainstream so being too fine from the start can
>>>>>> just add complexity and bugs in impls IMHO.
>>>>>>
>>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>>>> écrit :
>>>>>>
>>>>>>> By "type" of metric, I mean both the data types (including their
>>>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>>>> double-distribution.
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>>>> meanlong, etc?
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Fully custom metric types is the "more speculative and difficult"
>>>>>>>>> feature that I was proposing we kick down the road (and may never get to).
>>>>>>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>>>>>>
>>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The metric api is designed to prevent user defined metric types
>>>>>>>>>> based on the fact they just weren't used enough to justify support.
>>>>>>>>>>
>>>>>>>>>> Is there a reason we are bringing that complexity back? Shouldn't
>>>>>>>>>> we just need the ability for the standard set plus any special system
>>>>>>>>>> metrivs?
>>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>>
>>>>>>>>>>> One thing that has occurred to me is that we're conflating the
>>>>>>>>>>> idea of custom metrics and custom metric types. I would propose
>>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>>
>>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every type
>>>>>>>>>>> X one would have a single URN for UserMetric and it spec would designate
>>>>>>>>>>> the type and payload designate the (qualified) name.
>>>>>>>>>>>
>>>>>>>>>>> - Robert
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>>> I have made a revision today which is to make all metrics refer
>>>>>>>>>>>> to a primary entity, so I have restructured some of the protos a little bit.
>>>>>>>>>>>>
>>>>>>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>>>> metric updates.
>>>>>>>>>>>> Now that each metric has an aggregation_entity associated with
>>>>>>>>>>>> it (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>>
>>>>>>>>>>>> I think it has also simplified some of the URN metric protos,
>>>>>>>>>>>> as they do not need to keep track of ptransform names inside themselves
>>>>>>>>>>>> now. The result is simpler structures, for the metrics as the entities are
>>>>>>>>>>>> pulled outside of the metric.
>>>>>>>>>>>>
>>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Please make sure that you provide your feedback before then
>>>>>>>>>>>>> and I will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nice, I created a short link so people can refer to it easily
>>>>>>>>>>>>>> in
>>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>>> >> have made some revisions based on the feedback. There were
>>>>>>>>>>>>>> some larger
>>>>>>>>>>>>>> >> questions asking about alternatives. For each of these I
>>>>>>>>>>>>>> have added a
>>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>>>>>>> Please take
>>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>>>>> another look after
>>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>>> >>
>>>>>>>>>>>>>> >
>>>>>>>>>>>>>>
>>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Kenneth Knowles <kl...@google.com>.
Thanks for the extensive clarification and +1 to Robert's distinction. It
seems fundamental, perfectly analogous to the distinction between a
variable and its type. The universe of "things you measure" is unbounded,
while the universe of "ways to measure" is vastly smaller and separable.

The remaining debate was around "ways to measure", of which there are maybe
half a dozen in common use, with widespread agreement as to what they are and
how they work - basically what Dropwizard metrics provides.

So I still have doubts about building an abstracted extensible model for
shoehorning in "any message you want to send and collect" to an API and
concept with an established meaning and scope, especially since we only
have one or two motivating examples. OTOH these examples are esoteric
enough that it really wouldn't make sense to bake them into anything, so
they really do require a generic framework. The question is: does it make
sense to be _this_ framework?

So when you say "Perhaps there should be another word for this concept" I
have to agree. Often naming discussions can be a useless time sink, but in
this case I think we are more in the realm of "should this be called
'addition' or 'exponentiation'?" where the name actually matters quite a
lot. But again, OTOH, for users as long as counters, distributions, gauges,
and the like are usable without having to know anything else or think in
terms of this abstract framework, then I can reluctantly accept calling it
"Metric" at the proto layer in order to use the plumbing we've already half
built.

To Robert's proto:

 // A mapping of entities to (encoded) values.
>  map<string, bytes> values;
>

Are the keys here the names of the metrics, aka what is used for URNs in
the doc?

Kenn



> }
>
> On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com> wrote:
>>
>>> Agree with all of this. It echoes a thread on the doc that I was going
>>> to bring here. Let's keep it simple and use concrete use cases to drive
>>> additional abstraction if/when it becomes compelling.
>>>
>>> Kenn
>>>
>>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>>> wrote:
>>>
>>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>>> supported type" didn't include new ways of aggregating ints. As long as
>>>> that means we have a fixed set of aggregations (that align with what what
>>>> users want and metrics back end support) it seems like we are doing user
>>>> metrics right.
>>>>
>>>> - Ben
>>>>
>>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Maybe leave it out until proven it is needed. ATM counters are used a
>>>>> lot but others are less mainstream so being too fine from the start can
>>>>> just add complexity and bugs in impls IMHO.
>>>>>
>>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>>> écrit :
>>>>>
>>>>>> By "type" of metric, I mean both the data types (including their
>>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>>> double-distribution.
>>>>>>
>>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>>> meanlong, etc?
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Fully custom metric types is the "more speculative and difficult"
>>>>>>>> feature that I was proposing we kick down the road (and may never get to).
>>>>>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The metric api is designed to prevent user defined metric types
>>>>>>>>> based on the fact they just weren't used enough to justify support.
>>>>>>>>>
>>>>>>>>> Is there a reason we are bringing that complexity back? Shouldn't
>>>>>>>>> we just need the ability for the standard set plus any special system
>>>>>>>>> metrivs?
>>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>>
>>>>>>>>>> One thing that has occurred to me is that we're conflating the
>>>>>>>>>> idea of custom metrics and custom metric types. I would propose
>>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>>
>>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every type
>>>>>>>>>> X one would have a single URN for UserMetric and it spec would designate
>>>>>>>>>> the type and payload designate the (qualified) name.
>>>>>>>>>>
>>>>>>>>>> - Robert
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>>> I have made a revision today which is to make all metrics refer
>>>>>>>>>>> to a primary entity, so I have restructured some of the protos a little bit.
>>>>>>>>>>>
>>>>>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>>> metric updates.
>>>>>>>>>>> Now that each metric has an aggregation_entity associated with
>>>>>>>>>>> it (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>>
>>>>>>>>>>> I think it has also simplified some of the URN metric protos, as
>>>>>>>>>>> they do not need to keep track of ptransform names inside themselves now.
>>>>>>>>>>> The result is simpler structures, for the metrics as the entities are
>>>>>>>>>>> pulled outside of the metric.
>>>>>>>>>>>
>>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>>
>>>>>>>>>>>> Please make sure that you provide your feedback before then and
>>>>>>>>>>>> I will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Nice, I created a short link so people can refer to it easily
>>>>>>>>>>>>> in
>>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>>> >
>>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <
>>>>>>>>>>>>> ajamato@google.com> wrote:
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>>> >> have made some revisions based on the feedback. There were
>>>>>>>>>>>>> some larger
>>>>>>>>>>>>> >> questions asking about alternatives. For each of these I
>>>>>>>>>>>>> have added a
>>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>>>>>> Please take
>>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>>>> another look after
>>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>>> >> Alex
>>>>>>>>>>>>> >>
>>>>>>>>>>>>> >
>>>>>>>>>>>>>
>>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
On Thu, Apr 12, 2018 at 8:17 PM Alex Amato <aj...@google.com> wrote:

> I agree that there is some confusion about concepts. Here are several
> concepts which have come up in discussions, as I see them (not official
> names).
>
> *Metric*
>
>    - For the purposes of my document, I have been referring to a Metric
>    as any sort of information the SDK can send to the Runner
>       - This does not mean only quantitative, aggregated values.
>       - This can include other useful '*monitoring information*', for
>       supporting debugging/monitoring scenarios such as
>          - A table of files which are not yet finished reading, causing a
>          streaming pipeline to be blocked
>       - It has been pointed out to me, that when many people hear metric,
>    a very specific thing comes to mind, in particular quantitative,
>    aggregated values. *That is NOT what my document is limited to. I
>    consider both that type of metric, and more arbitrary 'monitoring
>    information', like a table of files with statuses in the proposal.*
>    - Perhaps there should be another word for this concept, yet I have
>    not yet come up with a good one, "monitoring information", "monitoring
>    item" perhaps.
>
>
> *Metric types/Metric classes*
>
>    - A collection of information reported on
>    ProcessBundleProgressResponse and ProcessBundleResponse from the SDK to the
>    RunnerHarness.
>       - e.g. execution time of par do functions.
>    - In my proposal they are defined by a URN and two structs which are
>    serialized into a MetricSpec and Metric bytes payload field, for requesting
>    and responding to the metrics.
>       - e.g. beam:metric:ptransform_execution_times:v1 defines the
>       information needed to describe how a ptransform
>    - All metrics which are passed across the FN API have a *metric type*
>
>
> *User metrics*
>
>    - A metric added by a pipeline writer, using an SDK API to create
>    these.
>    - In my proposal the various *UserMetric types are a Metric Type. *
>       - e.g. “urn:beam:metric:user_distribution_data:v1” and “urn:beam:metric:user_counter_data:v1”
>       define two metric types for packaging these user metrics and
>       communicating them across the FN API.
>       - SDK writers would need to write code to package the user metrics
>       from SDK API calls into their associated metric types to send them across
>       the FN API.
>
> *Custom metric types*
>
>    - A metric type which is not included in a catalog of first class beam
>    metrics. This can be thought of as metrics a custom engine+runner+sdk
>    (system as a whole) collects which is not part of the beam model.
>       - e.g. a closed source runner can define its own URNs and metrics,
>       extending the beam model
>          - for example an I/O source specific to a closed source
>          engine+runner+sdk may export a table of files it is reading with statuses
>          as a custom metric type
>
>
> *Custom User Metrics with Custom Metric Types *
>
>    - Not proposed to support by the doc
>    - A user specified metric, written by a pipeline writer with a custom
>    metric type, likely would be implemented using a general mechanism to
>    attach the custom metric.
>    - May have a custom user specified aggregation function as well.
>
>
> *Reporting metrics to external systems such as drop wizard*
>
>    - My doc does not specifically cover this, it assumes that a runner
>    harness would be responsible for reporting metrics in formats specific to
>    those external systems, such as Drop Wizard. It assumes that the
>    URNs+Metric types provided will be specified enough so that it would be
>    possible to make such a translation.
>    - Each metric type would need to be handled in the RunnerHarness, to
>    collect and report the metric to an external system
>    - Some concern has come up about this, and if this should dictate the
>    format of the metrics which the SDK sends to the RunnerHarness of the FN
>    API, rather than using the more custom URN+payload approach.
>    - Though there could be URNs specifically designed to do this, the
>       intention of the design in the doc is to not require SDKs to give string
>       "names" to metrics, just to fill in URN payloads, and the Runner Harness
>       will pick names for metrics if needed to send to external systems.
>
> Just wanted to clarify this a bit. I hope the example of the table of
> files being a more complex metric type describes the usage of custom metric
> types. I'll update the doc with this
>

Thank you for this clarification. I think the table of files fits into the
model as one of type string-set (with union as aggregation).
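
As a minimal sketch of that idea (the URN, helper, and file names below are
illustrative, not something from the doc): the accumulator is just a set of
strings and the aggregation is set union.

# Hedged sketch of a "string-set" metric type whose aggregation is set union.
STRING_SET_UNION_URN = "beam:metric_type:string_set_union:v1"  # illustrative

def merge_string_sets(updates):
    # Union the per-bundle sets of unfinished files into one reported set.
    result = set()
    for update in updates:
        result |= update
    return result

unfinished_in_bundle_1 = {"gs://bucket/shard-000", "gs://bucket/shard-001"}
unfinished_in_bundle_2 = {"gs://bucket/shard-001", "gs://bucket/shard-002"}
print(merge_string_sets([unfinished_in_bundle_1, unfinished_in_bundle_2]))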


> @Robert, I am not sure if you are proposing anything that is not in the
> current form of the doc.
>

Yes, I am.

Currently, the URN of the metric spec specifies both (1) the semantic
meaning of this metric (i.e. what exactly is being instrumented, whether
that be processing time seconds or output bytes) and (2) the formatting and
aggregation function of the (otherwise opaque) payload bytes.  I am
proposing we make (2) explicit via a URN in the MetricSpec as well, such
that a runner that does not understand (1) can still aggregate and even
report/display this data if it so chooses.

Concretely,

message MetricSpec {
  // (Required) A URN that describes the accompanying payload.
  // For any URN that is not recognized (by whomever is inspecting
  // it) the parameter payload should be treated as opaque and
  // passed as-is.
  string urn = 1;

  // (Optional) The data specifying any parameters to the URN. If
  // the URN does not require any arguments, this may be omitted.
  bytes parameters_payload = 2;

  // (Required) A URN that describes the type of values this metric
  // records (e.g. durations that should be summed).
  string type = 3;
}

message Metric[Values] {
 // (Required) The original requesting MetricSpec.
 MetricSpec metric_spec = 1;

 // A mapping of entities to (encoded) values.
 map<string, bytes> values = 2;
}
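
A minimal sketch of what this split could buy the runner, assuming
illustrative type URNs and a fixed-width int64 encoding that are not part of
the proto above: the runner aggregates payloads it does not semantically
understand by dispatching on the type URN alone.

import struct

# Illustrative type URNs mapped to aggregation functions over encoded values.
AGGREGATORS = {
    "beam:metric_type:sum_int64:v1":
        lambda vals: struct.pack(
            ">q", sum(struct.unpack(">q", v)[0] for v in vals)),
    "beam:metric_type:max_int64:v1":
        lambda vals: struct.pack(
            ">q", max(struct.unpack(">q", v)[0] for v in vals)),
}

def aggregate(type_urn, encoded_values):
    # encoded_values: payload bytes reported for one entity across workers.
    # The runner never needs to know what the metric means, only its type.
    return AGGREGATORS[type_urn](encoded_values)

# e.g. aggregate("beam:metric_type:sum_int64:v1",
#                [struct.pack(">q", 3), struct.pack(">q", 4)])
# yields the encoding of 7.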

On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com> wrote:
>
>> Agree with all of this. It echoes a thread on the doc that I was going to
>> bring here. Let's keep it simple and use concrete use cases to drive
>> additional abstraction if/when it becomes compelling.
>>
>> Kenn
>>
>> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com>
>> wrote:
>>
>>> Sounds perfect. Just wanted to make sure that "custom metrics of
>>> supported type" didn't include new ways of aggregating ints. As long as
>>> that means we have a fixed set of aggregations (that align with what what
>>> users want and metrics back end support) it seems like we are doing user
>>> metrics right.
>>>
>>> - Ben
>>>
>>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> Maybe leave it out until proven it is needed. ATM counters are used a
>>>> lot but others are less mainstream so being too fine from the start can
>>>> just add complexity and bugs in impls IMHO.
>>>>
>>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a
>>>> écrit :
>>>>
>>>>> By "type" of metric, I mean both the data types (including their
>>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>>> double-distribution.
>>>>>
>>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> When you say type do you mean accumulator type, result type, or
>>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>>> meanlong, etc?
>>>>>>
>>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Fully custom metric types is the "more speculative and difficult"
>>>>>>> feature that I was proposing we kick down the road (and may never get to).
>>>>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The metric api is designed to prevent user defined metric types
>>>>>>>> based on the fact they just weren't used enough to justify support.
>>>>>>>>
>>>>>>>> Is there a reason we are bringing that complexity back? Shouldn't
>>>>>>>> we just need the ability for the standard set plus any special system
>>>>>>>> metrivs?
>>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>>
>>>>>>>>> One thing that has occurred to me is that we're conflating the
>>>>>>>>> idea of custom metrics and custom metric types. I would propose
>>>>>>>>> the MetricSpec field be augmented with an additional field "type" which is
>>>>>>>>> a urn specifying the type of metric it is (i.e. the contents of its
>>>>>>>>> payload, as well as the form of aggregation). Summing or maxing over ints
>>>>>>>>> would be a typical example. Though we could pursue making this opaque to
>>>>>>>>> the runner in the long run, that's a more speculative (and difficult)
>>>>>>>>> feature to tackle. This would allow the runner to at least aggregate and
>>>>>>>>> report/return to the SDK metrics that it did not itself understand the
>>>>>>>>> semantic meaning of. (It would probably simplify much of the specialization
>>>>>>>>> in the runner itself for metrics that it *did* understand as well.)
>>>>>>>>>
>>>>>>>>> In addition, rather than having UserMetricOfTypeX for every type X
>>>>>>>>> one would have a single URN for UserMetric and it spec would designate the
>>>>>>>>> type and payload designate the (qualified) name.
>>>>>>>>>
>>>>>>>>> - Robert
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>>> I have made a revision today which is to make all metrics refer
>>>>>>>>>> to a primary entity, so I have restructured some of the protos a little bit.
>>>>>>>>>>
>>>>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>>> metric updates.
>>>>>>>>>> Now that each metric has an aggregation_entity associated with it
>>>>>>>>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>>
>>>>>>>>>> I think it has also simplified some of the URN metric protos, as
>>>>>>>>>> they do not need to keep track of ptransform names inside themselves now.
>>>>>>>>>> The result is simpler structures, for the metrics as the entities are
>>>>>>>>>> pulled outside of the metric.
>>>>>>>>>>
>>>>>>>>>> I have mentioned this in the doc now, and wanted to draw
>>>>>>>>>> attention to this particular revision.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>>
>>>>>>>>>>> Please make sure that you provide your feedback before then and
>>>>>>>>>>> I will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>>> >
>>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> Thank you everyone for your initial feedback on this
>>>>>>>>>>>> proposal so far. I
>>>>>>>>>>>> >> have made some revisions based on the feedback. There were
>>>>>>>>>>>> some larger
>>>>>>>>>>>> >> questions asking about alternatives. For each of these I
>>>>>>>>>>>> have added a
>>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>>> recommendation as well
>>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>>>>> Please take
>>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>>> >>
>>>>>>>>>>>> >>
>>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>>> another look after
>>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>>> >>
>>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>>> >> Alex
>>>>>>>>>>>> >>
>>>>>>>>>>>> >
>>>>>>>>>>>>
>>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
I agree that there is some confusion about concepts. Here are several
concepts which have come up in discussions, as I see them (not official
names).

*Metric*

   - For the purposes of my document, I have been referring to a Metric as
   any sort of information the SDK can send to the Runner
      - This does not mean only quantitative, aggregated values.
      - This can include other useful '*monitoring information*', for
      supporting debugging/monitoring scenarios such as
         - A table of files which are not yet finished reading, causing a
         streaming pipeline to be blocked
      - It has been pointed out to me, that when many people hear metric, a
   very specific thing comes to mind, in particular quantitative,
   aggregated values. *That is NOT what my document is limited to. I
   consider both that type of metric, and more arbitrary 'monitoring
   information', like a table of files with statuses in the proposal.*
   - Perhaps there should be another word for this concept, though I have not
   yet come up with a good one; "monitoring information" or "monitoring item"
   perhaps.


*Metric types/Metric classes*

   - A collection of information reported on ProcessBundleProgressResponse
   and ProcessBundleResponse from the SDK to the RunnerHarness.
      - e.g. execution time of par do functions.
   - In my proposal they are defined by a URN and two structs which are
   serialized into a MetricSpec and Metric bytes payload field, for requesting
   and responding to the metrics.
      - e.g. beam:metric:ptransform_execution_times:v1 defines the
      information needed to describe the execution times of a ptransform
   - All metrics which are passed across the FN API have a *metric type*
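
For illustration only, a minimal sketch of that request/response pairing,
using placeholder field and entity names rather than the actual proto fields
from the doc:

# Hedged sketch: the runner asks for a metric type by URN via a spec, and the
# SDK answers with a metric for that same URN. Field names are placeholders.
requested_spec = {
    "urn": "beam:metric:ptransform_execution_times:v1",
    # plus whatever parameters this metric type's spec struct defines
}

reported_metric = {
    "urn": "beam:metric:ptransform_execution_times:v1",
    "aggregation_entity": "ptransform/ParDo(ParseFn)",  # entity kept outside
    "payload": {"total_msecs": 5300},  # placeholder shape for the payload struct
}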


*User metrics*

   - A metric added by a pipeline writer, using an SDK API to create these.
   - In my proposal the various *UserMetric types are a Metric Type. *
      - e.g. “urn:beam:metric:user_distribution_data:v1” and
      “urn:beam:metric:user_counter_data:v1”
      define two metric types for packaging these user metrics and
      communicating them across the FN API.
      - SDK writers would need to write code to package the user metrics
      from SDK API calls into their associated metric types to send them across
      the FN API.

*Custom metric types*

   - A metric type which is not included in a catalog of first class beam
   metrics. This can be thought of as metrics a custom engine+runner+sdk
   (system as a whole) collects which is not part of the beam model.
      - e.g. a closed source runner can define its own URNs and metrics,
      extending the beam model
         - for example an I/O source specific to a closed source
         engine+runner+sdk may export a table of files it is reading with
         statuses as a custom metric type


*Custom User Metrics with Custom Metric Types *

   - Not proposed to be supported by the doc
   - A user specified metric, written by a pipeline writer with a custom
   metric type, likely would be implemented using a general mechanism to
   attach the custom metric.
   - May have a custom user specified aggregation function as well.


*Reporting metrics to external systems such as Dropwizard*

   - My doc does not specifically cover this; it assumes that a runner
   harness would be responsible for reporting metrics in formats specific to
   those external systems, such as Dropwizard. It assumes that the
   URNs+Metric types provided will be specified enough so that it would be
   possible to make such a translation.
   - Each metric type would need to be handled in the RunnerHarness, to
   collect and report the metric to an external system
   - Some concern has come up about this, and whether it should dictate the
   format of the metrics which the SDK sends to the RunnerHarness over the FN
   API, rather than using the more custom URN+payload approach.
   - Though there could be URNs specifically designed to do this, the
      intention of the design in the doc is to not require SDKs to give string
      "names" to metrics, just to fill in URN payloads, and the Runner Harness
      will pick names for metrics if needed to send to external systems.

Just wanted to clarify this a bit. I hope the example of the table of files
being a more complex metric type illustrates the usage of custom metric
types. I'll update the doc with this.
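
For illustration only, a minimal sketch of how an SDK harness might package a
user counter into the user counter metric type above; the payload field names
are placeholders, since the final protos are still being refined:

def package_user_counter(namespace, name, value):
    # Hedged sketch: "metric_namespace"/"metric_name"/"value" stand in for
    # whatever fields the final user counter payload proto ends up defining.
    return {
        "urn": "urn:beam:metric:user_counter_data:v1",
        "payload": {
            "metric_namespace": namespace,
            "metric_name": name,
            "value": value,
        },
    }

print(package_user_counter("my.pipeline.ParseFn", "parsed_records", 42))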

@Robert, I am not sure if you are proposing anything that is not in the
current form of the doc.

On Thu, Apr 12, 2018 at 9:25 AM Kenneth Knowles <kl...@google.com> wrote:

> Agree with all of this. It echoes a thread on the doc that I was going to
> bring here. Let's keep it simple and use concrete use cases to drive
> additional abstraction if/when it becomes compelling.
>
> Kenn
>
> On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com> wrote:
>
>> Sounds perfect. Just wanted to make sure that "custom metrics of
>> supported type" didn't include new ways of aggregating ints. As long as
>> that means we have a fixed set of aggregations (that align with what what
>> users want and metrics back end support) it seems like we are doing user
>> metrics right.
>>
>> - Ben
>>
>> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> Maybe leave it out until proven it is needed. ATM counters are used a
>>> lot but others are less mainstream so being too fine from the start can
>>> just add complexity and bugs in impls IMHO.
>>>
>>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a écrit :
>>>
>>>> By "type" of metric, I mean both the data types (including their
>>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>>> double-distribution.
>>>>
>>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>>>> wrote:
>>>>
>>>>> When you say type do you mean accumulator type, result type, or
>>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>>> meanlong, etc?
>>>>>
>>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Fully custom metric types is the "more speculative and difficult"
>>>>>> feature that I was proposing we kick down the road (and may never get to).
>>>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>>>
>>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> The metric api is designed to prevent user defined metric types
>>>>>>> based on the fact they just weren't used enough to justify support.
>>>>>>>
>>>>>>> Is there a reason we are bringing that complexity back? Shouldn't we
>>>>>>> just need the ability for the standard set plus any special system metrivs?
>>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks. I think this has simplified things.
>>>>>>>>
>>>>>>>> One thing that has occurred to me is that we're conflating the idea
>>>>>>>> of custom metrics and custom metric types. I would propose the MetricSpec
>>>>>>>> field be augmented with an additional field "type" which is a urn
>>>>>>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>>>>>>> well as the form of aggregation). Summing or maxing over ints would be a
>>>>>>>> typical example. Though we could pursue making this opaque to the runner in
>>>>>>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>>>>>>> This would allow the runner to at least aggregate and report/return to the
>>>>>>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>>>>>>> would probably simplify much of the specialization in the runner itself for
>>>>>>>> metrics that it *did* understand as well.)
>>>>>>>>
>>>>>>>> In addition, rather than having UserMetricOfTypeX for every type X
>>>>>>>> one would have a single URN for UserMetric and it spec would designate the
>>>>>>>> type and payload designate the (qualified) name.
>>>>>>>>
>>>>>>>> - Robert
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>>> I have made a revision today which is to make all metrics refer to
>>>>>>>>> a primary entity, so I have restructured some of the protos a little bit.
>>>>>>>>>
>>>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>>>> metric updates.
>>>>>>>>> Now that each metric has an aggregation_entity associated with it
>>>>>>>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>>
>>>>>>>>> I think it has also simplified some of the URN metric protos, as
>>>>>>>>> they do not need to keep track of ptransform names inside themselves now.
>>>>>>>>> The result is simpler structures, for the metrics as the entities are
>>>>>>>>> pulled outside of the metric.
>>>>>>>>>
>>>>>>>>> I have mentioned this in the doc now, and wanted to draw attention
>>>>>>>>> to this particular revision.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> I've gathered a lot of feedback so far and want to make a
>>>>>>>>>> decision by Friday, and begin working on related PRs next week.
>>>>>>>>>>
>>>>>>>>>> Please make sure that you provide your feedback before then and I
>>>>>>>>>> will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>>
>>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>>
>>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>>> >
>>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>> >>
>>>>>>>>>>> >> Hello beam community,
>>>>>>>>>>> >>
>>>>>>>>>>> >> Thank you everyone for your initial feedback on this proposal
>>>>>>>>>>> so far. I
>>>>>>>>>>> >> have made some revisions based on the feedback. There were
>>>>>>>>>>> some larger
>>>>>>>>>>> >> questions asking about alternatives. For each of these I have
>>>>>>>>>>> added a
>>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>>> recommendation as well
>>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>>> >>
>>>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>>>> Please take
>>>>>>>>>>> >> another look and let me know
>>>>>>>>>>> >>
>>>>>>>>>>> >>
>>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>>> >>
>>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>>> another look after
>>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>>> >>
>>>>>>>>>>> >> Thanks again,
>>>>>>>>>>> >> Alex
>>>>>>>>>>> >>
>>>>>>>>>>> >
>>>>>>>>>>>
>>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Kenneth Knowles <kl...@google.com>.
Agree with all of this. It echoes a thread on the doc that I was going to
bring here. Let's keep it simple and use concrete use cases to drive
additional abstraction if/when it becomes compelling.

Kenn

On Thu, Apr 12, 2018 at 9:21 AM Ben Chambers <bj...@gmail.com> wrote:

> Sounds perfect. Just wanted to make sure that "custom metrics of supported
> type" didn't include new ways of aggregating ints. As long as that means we
> have a fixed set of aggregations (that align with what what users want and
> metrics back end support) it seems like we are doing user metrics right.
>
> - Ben
>
> On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Maybe leave it out until proven it is needed. ATM counters are used a lot
>> but others are less mainstream so being too fine from the start can just
>> add complexity and bugs in impls IMHO.
>>
>> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a écrit :
>>
>>> By "type" of metric, I mean both the data types (including their
>>> encoding) and accumulator strategy. So sumint would be a type, as would
>>> double-distribution.
>>>
>>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>>> wrote:
>>>
>>>> When you say type do you mean accumulator type, result type, or
>>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>>> meanlong, etc?
>>>>
>>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> Fully custom metric types is the "more speculative and difficult"
>>>>> feature that I was proposing we kick down the road (and may never get to).
>>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>>
>>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>>>> wrote:
>>>>>
>>>>>> The metric api is designed to prevent user defined metric types based
>>>>>> on the fact they just weren't used enough to justify support.
>>>>>>
>>>>>> Is there a reason we are bringing that complexity back? Shouldn't we
>>>>>> just need the ability for the standard set plus any special system metrivs?
>>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thanks. I think this has simplified things.
>>>>>>>
>>>>>>> One thing that has occurred to me is that we're conflating the idea
>>>>>>> of custom metrics and custom metric types. I would propose the MetricSpec
>>>>>>> field be augmented with an additional field "type" which is a urn
>>>>>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>>>>>> well as the form of aggregation). Summing or maxing over ints would be a
>>>>>>> typical example. Though we could pursue making this opaque to the runner in
>>>>>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>>>>>> This would allow the runner to at least aggregate and report/return to the
>>>>>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>>>>>> would probably simplify much of the specialization in the runner itself for
>>>>>>> metrics that it *did* understand as well.)
>>>>>>>
>>>>>>> In addition, rather than having UserMetricOfTypeX for every type X
>>>>>>> one would have a single URN for UserMetric and it spec would designate the
>>>>>>> type and payload designate the (qualified) name.
>>>>>>>
>>>>>>> - Robert
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you everyone for your feedback so far.
>>>>>>>> I have made a revision today which is to make all metrics refer to
>>>>>>>> a primary entity, so I have restructured some of the protos a little bit.
>>>>>>>>
>>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>>> metric updates.
>>>>>>>> Now that each metric has an aggregation_entity associated with it
>>>>>>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>>
>>>>>>>> I think it has also simplified some of the URN metric protos, as
>>>>>>>> they do not need to keep track of ptransform names inside themselves now.
>>>>>>>> The result is simpler structures, for the metrics as the entities are
>>>>>>>> pulled outside of the metric.
>>>>>>>>
>>>>>>>> I have mentioned this in the doc now, and wanted to draw attention
>>>>>>>> to this particular revision.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> I've gathered a lot of feedback so far and want to make a decision
>>>>>>>>> by Friday, and begin working on related PRs next week.
>>>>>>>>>
>>>>>>>>> Please make sure that you provide your feedback before then and I
>>>>>>>>> will post the final decisions made to this thread Friday afternoon.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>>>>>> future discussions, website, etc.
>>>>>>>>>>
>>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>>
>>>>>>>>>> Thanks for sharing.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>>> >
>>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>>>>>> wrote:
>>>>>>>>>> >>
>>>>>>>>>> >> Hello beam community,
>>>>>>>>>> >>
>>>>>>>>>> >> Thank you everyone for your initial feedback on this proposal
>>>>>>>>>> so far. I
>>>>>>>>>> >> have made some revisions based on the feedback. There were
>>>>>>>>>> some larger
>>>>>>>>>> >> questions asking about alternatives. For each of these I have
>>>>>>>>>> added a
>>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>>> recommendation as well
>>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>>> >>
>>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>>> Please take
>>>>>>>>>> >> another look and let me know
>>>>>>>>>> >>
>>>>>>>>>> >>
>>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>>> >>
>>>>>>>>>> >> Etienne, I would appreciate it if you could please take
>>>>>>>>>> another look after
>>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>>> >>
>>>>>>>>>> >> Thanks again,
>>>>>>>>>> >> Alex
>>>>>>>>>> >>
>>>>>>>>>> >
>>>>>>>>>>
>>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Ben Chambers <bj...@gmail.com>.
Sounds perfect. Just wanted to make sure that "custom metrics of supported
type" didn't include new ways of aggregating ints. As long as that means we
have a fixed set of aggregations (that align with what users want and what
metrics backends support) it seems like we are doing user metrics right.

- Ben

On Wed, Apr 11, 2018, 11:30 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Maybe leave it out until proven it is needed. ATM counters are used a lot
> but others are less mainstream so being too fine from the start can just
> add complexity and bugs in impls IMHO.
>
> Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a écrit :
>
>> By "type" of metric, I mean both the data types (including their
>> encoding) and accumulator strategy. So sumint would be a type, as would
>> double-distribution.
>>
>> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
>> wrote:
>>
>>> When you say type do you mean accumulator type, result type, or
>>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>>> meanlong, etc?
>>>
>>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> Fully custom metric types is the "more speculative and difficult"
>>>> feature that I was proposing we kick down the road (and may never get to).
>>>> What I'm suggesting is that we support custom metrics of standard type.
>>>>
>>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>>> wrote:
>>>>
>>>>> The metric api is designed to prevent user defined metric types based
>>>>> on the fact they just weren't used enough to justify support.
>>>>>
>>>>> Is there a reason we are bringing that complexity back? Shouldn't we
>>>>> just need the ability for the standard set plus any special system metrivs?
>>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>>
>>>>>> Thanks. I think this has simplified things.
>>>>>>
>>>>>> One thing that has occurred to me is that we're conflating the idea
>>>>>> of custom metrics and custom metric types. I would propose the MetricSpec
>>>>>> field be augmented with an additional field "type" which is a urn
>>>>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>>>>> well as the form of aggregation). Summing or maxing over ints would be a
>>>>>> typical example. Though we could pursue making this opaque to the runner in
>>>>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>>>>> This would allow the runner to at least aggregate and report/return to the
>>>>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>>>>> would probably simplify much of the specialization in the runner itself for
>>>>>> metrics that it *did* understand as well.)
>>>>>>
>>>>>> In addition, rather than having UserMetricOfTypeX for every type X
>>>>>> one would have a single URN for UserMetric and it spec would designate the
>>>>>> type and payload designate the (qualified) name.
>>>>>>
>>>>>> - Robert
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you everyone for your feedback so far.
>>>>>>> I have made a revision today which is to make all metrics refer to a
>>>>>>> primary entity, so I have restructured some of the protos a little bit.
>>>>>>>
>>>>>>> The point of this change was to futureproof the possibility of
>>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>>> metric updates.
>>>>>>> Now that each metric has an aggregation_entity associated with it
>>>>>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>>> update payloads and perform the custom aggregations.
>>>>>>>
>>>>>>> I think it has also simplified some of the URN metric protos, as
>>>>>>> they do not need to keep track of ptransform names inside themselves now.
>>>>>>> The result is simpler structures, for the metrics as the entities are
>>>>>>> pulled outside of the metric.
>>>>>>>
>>>>>>> I have mentioned this in the doc now, and wanted to draw attention
>>>>>>> to this particular revision.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> I've gathered a lot of feedback so far and want to make a decision
>>>>>>>> by Friday, and begin working on related PRs next week.
>>>>>>>>
>>>>>>>> Please make sure that you provide your feedback before then and I
>>>>>>>> will post the final decisions made to this thread Friday afternoon.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>>>>> future discussions, website, etc.
>>>>>>>>>
>>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>>
>>>>>>>>> Thanks for sharing.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>>> robertwb@google.com> wrote:
>>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>>> >
>>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>>>>> wrote:
>>>>>>>>> >>
>>>>>>>>> >> Hello beam community,
>>>>>>>>> >>
>>>>>>>>> >> Thank you everyone for your initial feedback on this proposal
>>>>>>>>> so far. I
>>>>>>>>> >> have made some revisions based on the feedback. There were some
>>>>>>>>> larger
>>>>>>>>> >> questions asking about alternatives. For each of these I have
>>>>>>>>> added a
>>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>>> recommendation as well
>>>>>>>>> >> as as few other choices we considered.
>>>>>>>>> >>
>>>>>>>>> >> I would appreciate more feedback on the revised proposal.
>>>>>>>>> Please take
>>>>>>>>> >> another look and let me know
>>>>>>>>> >>
>>>>>>>>> >>
>>>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>>> >>
>>>>>>>>> >> Etienne, I would appreciate it if you could please take another
>>>>>>>>> look after
>>>>>>>>> >> the revisions I have made as well.
>>>>>>>>> >>
>>>>>>>>> >> Thanks again,
>>>>>>>>> >> Alex
>>>>>>>>> >>
>>>>>>>>> >
>>>>>>>>>
>>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Maybe leave it out until proven it is needed. ATM counters are used a lot
but others are less mainstream, so being too fine-grained from the start can
just add complexity and bugs in impls IMHO.

Le 12 avr. 2018 08:06, "Robert Bradshaw" <ro...@google.com> a écrit :

> By "type" of metric, I mean both the data types (including their encoding)
> and accumulator strategy. So sumint would be a type, as would
> double-distribution.
>
> On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com>
> wrote:
>
>> When you say type do you mean accumulator type, result type, or
>> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
>> meanlong, etc?
>>
>> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> Fully custom metric types is the "more speculative and difficult"
>>> feature that I was proposing we kick down the road (and may never get to).
>>> What I'm suggesting is that we support custom metrics of standard type.
>>>
>>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>>> wrote:
>>>
>>>> The metric api is designed to prevent user defined metric types based
>>>> on the fact they just weren't used enough to justify support.
>>>>
>>>> Is there a reason we are bringing that complexity back? Shouldn't we
>>>> just need the ability for the standard set plus any special system metrivs?
>>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>>
>>>>> Thanks. I think this has simplified things.
>>>>>
>>>>> One thing that has occurred to me is that we're conflating the idea of
>>>>> custom metrics and custom metric types. I would propose the MetricSpec
>>>>> field be augmented with an additional field "type" which is a urn
>>>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>>>> well as the form of aggregation). Summing or maxing over ints would be a
>>>>> typical example. Though we could pursue making this opaque to the runner in
>>>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>>>> This would allow the runner to at least aggregate and report/return to the
>>>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>>>> would probably simplify much of the specialization in the runner itself for
>>>>> metrics that it *did* understand as well.)
>>>>>
>>>>> In addition, rather than having UserMetricOfTypeX for every type X one
>>>>> would have a single URN for UserMetric and it spec would designate the type
>>>>> and payload designate the (qualified) name.
>>>>>
>>>>> - Robert
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com> wrote:
>>>>>
>>>>>> Thank you everyone for your feedback so far.
>>>>>> I have made a revision today which is to make all metrics refer to a
>>>>>> primary entity, so I have restructured some of the protos a little bit.
>>>>>>
>>>>>> The point of this change was to futureproof the possibility of
>>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>>> metric updates.
>>>>>> Now that each metric has an aggregation_entity associated with it
>>>>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>>> forwarded to user provided code which then would deserialize the metric
>>>>>> update payloads and perform the custom aggregations.
>>>>>>
>>>>>> I think it has also simplified some of the URN metric protos, as they
>>>>>> do not need to keep track of ptransform names inside themselves now. The
>>>>>> result is simpler structures, for the metrics as the entities are pulled
>>>>>> outside of the metric.
>>>>>>
>>>>>> I have mentioned this in the doc now, and wanted to draw attention to
>>>>>> this particular revision.
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com>
>>>>>> wrote:
>>>>>>
>>>>>>> I've gathered a lot of feedback so far and want to make a decision
>>>>>>> by Friday, and begin working on related PRs next week.
>>>>>>>
>>>>>>> Please make sure that you provide your feedback before then and I
>>>>>>> will post the final decisions made to this thread Friday afternoon.
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>>>> future discussions, website, etc.
>>>>>>>>
>>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>>
>>>>>>>> Thanks for sharing.
>>>>>>>>
>>>>>>>>
>>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>>> robertwb@google.com> wrote:
>>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>>> >
>>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>>>> wrote:
>>>>>>>> >>
>>>>>>>> >> Hello beam community,
>>>>>>>> >>
>>>>>>>> >> Thank you everyone for your initial feedback on this proposal so
>>>>>>>> far. I
>>>>>>>> >> have made some revisions based on the feedback. There were some
>>>>>>>> larger
>>>>>>>> >> questions asking about alternatives. For each of these I have
>>>>>>>> added a
>>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>>> recommendation as well
>>>>>>>> >> as as few other choices we considered.
>>>>>>>> >>
>>>>>>>> >> I would appreciate more feedback on the revised proposal. Please
>>>>>>>> take
>>>>>>>> >> another look and let me know
>>>>>>>> >>
>>>>>>>> >> https://docs.google.com/document/d/
>>>>>>>> 1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>>> >>
>>>>>>>> >> Etienne, I would appreciate it if you could please take another
>>>>>>>> look after
>>>>>>>> >> the revisions I have made as well.
>>>>>>>> >>
>>>>>>>> >> Thanks again,
>>>>>>>> >> Alex
>>>>>>>> >>
>>>>>>>> >
>>>>>>>>
>>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
By "type" of metric, I mean both the data types (including their encoding)
and accumulator strategy. So sumint would be a type, as would
double-distribution.
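
To make that concrete, here is a minimal sketch, in Python and with purely
illustrative URNs and names (not taken from the proposal), of how a standard
metric type could pair a value encoding with an accumulation strategy:

    # Hypothetical sketch only: a metric "type" bundles a value encoding with
    # an accumulation strategy. The URNs and names below are illustrative and
    # are not taken from the proposal.
    from dataclasses import dataclass
    from typing import Any, Callable

    @dataclass(frozen=True)
    class MetricType:
        urn: str                            # identifies encoding + aggregation
        zero: Any                           # identity element for aggregation
        combine: Callable[[Any, Any], Any]  # merges two accumulated values

    SUM_INT64 = MetricType("beam:metric_type:sum_int64:v1", 0, lambda a, b: a + b)
    MAX_INT64 = MetricType("beam:metric_type:max_int64:v1", None,
                           lambda a, b: b if a is None else max(a, b))

    # A double-distribution type would accumulate (count, sum, min, max)
    # tuples and merge them element-wise; omitted here for brevity.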

On Wed, Apr 11, 2018 at 10:39 PM Ben Chambers <bj...@gmail.com> wrote:

> When you say type do you mean accumulator type, result type, or
> accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
> meanlong, etc?
>
> On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com> wrote:
>
>> Fully custom metric types is the "more speculative and difficult" feature
>> that I was proposing we kick down the road (and may never get to). What I'm
>> suggesting is that we support custom metrics of standard type.
>>
>> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org>
>> wrote:
>>
>>> The metric api is designed to prevent user defined metric types based on
>>> the fact they just weren't used enough to justify support.
>>>
>>> Is there a reason we are bringing that complexity back? Shouldn't we
>>> just need the ability for the standard set plus any special system metrivs?
>>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>>> wrote:
>>>
>>>> Thanks. I think this has simplified things.
>>>>
>>>> One thing that has occurred to me is that we're conflating the idea of
>>>> custom metrics and custom metric types. I would propose the MetricSpec
>>>> field be augmented with an additional field "type" which is a urn
>>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>>> well as the form of aggregation). Summing or maxing over ints would be a
>>>> typical example. Though we could pursue making this opaque to the runner in
>>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>>> This would allow the runner to at least aggregate and report/return to the
>>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>>> would probably simplify much of the specialization in the runner itself for
>>>> metrics that it *did* understand as well.)
>>>>
>>>> In addition, rather than having UserMetricOfTypeX for every type X one
>>>> would have a single URN for UserMetric and it spec would designate the type
>>>> and payload designate the (qualified) name.
>>>>
>>>> - Robert
>>>>
>>>>
>>>>
>>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com> wrote:
>>>>
>>>>> Thank you everyone for your feedback so far.
>>>>> I have made a revision today which is to make all metrics refer to a
>>>>> primary entity, so I have restructured some of the protos a little bit.
>>>>>
>>>>> The point of this change was to futureproof the possibility of
>>>>> allowing custom user metrics, with custom aggregation functions for its
>>>>> metric updates.
>>>>> Now that each metric has an aggregation_entity associated with it
>>>>> (e.g. PCollection, PTransform), we can design an approach which forwards
>>>>> the opaque bytes metric updates, without deserializing them. These are
>>>>> forwarded to user provided code which then would deserialize the metric
>>>>> update payloads and perform the custom aggregations.
>>>>>
>>>>> I think it has also simplified some of the URN metric protos, as they
>>>>> do not need to keep track of ptransform names inside themselves now. The
>>>>> result is simpler structures, for the metrics as the entities are pulled
>>>>> outside of the metric.
>>>>>
>>>>> I have mentioned this in the doc now, and wanted to draw attention to
>>>>> this particular revision.
>>>>>
>>>>>
>>>>>
>>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com> wrote:
>>>>>
>>>>>> I've gathered a lot of feedback so far and want to make a decision by
>>>>>> Friday, and begin working on related PRs next week.
>>>>>>
>>>>>> Please make sure that you provide your feedback before then and I
>>>>>> will post the final decisions made to this thread Friday afternoon.
>>>>>>
>>>>>>
>>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>>> future discussions, website, etc.
>>>>>>>
>>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>>
>>>>>>> Thanks for sharing.
>>>>>>>
>>>>>>>
>>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <
>>>>>>> robertwb@google.com> wrote:
>>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>>> >
>>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>>> wrote:
>>>>>>> >>
>>>>>>> >> Hello beam community,
>>>>>>> >>
>>>>>>> >> Thank you everyone for your initial feedback on this proposal so
>>>>>>> far. I
>>>>>>> >> have made some revisions based on the feedback. There were some
>>>>>>> larger
>>>>>>> >> questions asking about alternatives. For each of these I have
>>>>>>> added a
>>>>>>> >> section tagged with [Alternatives] and discussed my
>>>>>>> recommendation as well
>>>>>>> >> as as few other choices we considered.
>>>>>>> >>
>>>>>>> >> I would appreciate more feedback on the revised proposal. Please
>>>>>>> take
>>>>>>> >> another look and let me know
>>>>>>> >>
>>>>>>> >>
>>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>>> >>
>>>>>>> >> Etienne, I would appreciate it if you could please take another
>>>>>>> look after
>>>>>>> >> the revisions I have made as well.
>>>>>>> >>
>>>>>>> >> Thanks again,
>>>>>>> >> Alex
>>>>>>> >>
>>>>>>> >
>>>>>>>
>>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Ben Chambers <bj...@gmail.com>.
When you say type, do you mean the accumulator type, the result type, or the
accumulator strategy? Specifically, what is the "type" of sumint, sumlong,
meanlong, etc.?

On Wed, Apr 11, 2018, 9:38 PM Robert Bradshaw <ro...@google.com> wrote:

> Fully custom metric types is the "more speculative and difficult" feature
> that I was proposing we kick down the road (and may never get to). What I'm
> suggesting is that we support custom metrics of standard type.
>
> On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org> wrote:
>
>> The metric api is designed to prevent user defined metric types based on
>> the fact they just weren't used enough to justify support.
>>
>> Is there a reason we are bringing that complexity back? Shouldn't we just
>> need the ability for the standard set plus any special system metrivs?
>> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com>
>> wrote:
>>
>>> Thanks. I think this has simplified things.
>>>
>>> One thing that has occurred to me is that we're conflating the idea of
>>> custom metrics and custom metric types. I would propose the MetricSpec
>>> field be augmented with an additional field "type" which is a urn
>>> specifying the type of metric it is (i.e. the contents of its payload, as
>>> well as the form of aggregation). Summing or maxing over ints would be a
>>> typical example. Though we could pursue making this opaque to the runner in
>>> the long run, that's a more speculative (and difficult) feature to tackle.
>>> This would allow the runner to at least aggregate and report/return to the
>>> SDK metrics that it did not itself understand the semantic meaning of. (It
>>> would probably simplify much of the specialization in the runner itself for
>>> metrics that it *did* understand as well.)
>>>
>>> In addition, rather than having UserMetricOfTypeX for every type X one
>>> would have a single URN for UserMetric and it spec would designate the type
>>> and payload designate the (qualified) name.
>>>
>>> - Robert
>>>
>>>
>>>
>>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com> wrote:
>>>
>>>> Thank you everyone for your feedback so far.
>>>> I have made a revision today which is to make all metrics refer to a
>>>> primary entity, so I have restructured some of the protos a little bit.
>>>>
>>>> The point of this change was to futureproof the possibility of allowing
>>>> custom user metrics, with custom aggregation functions for its metric
>>>> updates.
>>>> Now that each metric has an aggregation_entity associated with it (e.g.
>>>> PCollection, PTransform), we can design an approach which forwards the
>>>> opaque bytes metric updates, without deserializing them. These are
>>>> forwarded to user provided code which then would deserialize the metric
>>>> update payloads and perform the custom aggregations.
>>>>
>>>> I think it has also simplified some of the URN metric protos, as they
>>>> do not need to keep track of ptransform names inside themselves now. The
>>>> result is simpler structures, for the metrics as the entities are pulled
>>>> outside of the metric.
>>>>
>>>> I have mentioned this in the doc now, and wanted to draw attention to
>>>> this particular revision.
>>>>
>>>>
>>>>
>>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com> wrote:
>>>>
>>>>> I've gathered a lot of feedback so far and want to make a decision by
>>>>> Friday, and begin working on related PRs next week.
>>>>>
>>>>> Please make sure that you provide your feedback before then and I will
>>>>> post the final decisions made to this thread Friday afternoon.
>>>>>
>>>>>
>>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>>>
>>>>>> Nice, I created a short link so people can refer to it easily in
>>>>>> future discussions, website, etc.
>>>>>>
>>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>>
>>>>>> Thanks for sharing.
>>>>>>
>>>>>>
>>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com>
>>>>>> wrote:
>>>>>> > Thanks for the nice writeup. I added some comments.
>>>>>> >
>>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>>> wrote:
>>>>>> >>
>>>>>> >> Hello beam community,
>>>>>> >>
>>>>>> >> Thank you everyone for your initial feedback on this proposal so
>>>>>> far. I
>>>>>> >> have made some revisions based on the feedback. There were some
>>>>>> larger
>>>>>> >> questions asking about alternatives. For each of these I have
>>>>>> added a
>>>>>> >> section tagged with [Alternatives] and discussed my recommendation
>>>>>> as well
>>>>>> >> as as few other choices we considered.
>>>>>> >>
>>>>>> >> I would appreciate more feedback on the revised proposal. Please
>>>>>> take
>>>>>> >> another look and let me know
>>>>>> >>
>>>>>> >>
>>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>>> >>
>>>>>> >> Etienne, I would appreciate it if you could please take another
>>>>>> look after
>>>>>> >> the revisions I have made as well.
>>>>>> >>
>>>>>> >> Thanks again,
>>>>>> >> Alex
>>>>>> >>
>>>>>> >
>>>>>>
>>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
Fully custom metric types are the "more speculative and difficult" feature
that I was proposing we kick down the road (and may never get to). What I'm
suggesting is that we support custom metrics of standard types.

On Wed, Apr 11, 2018 at 5:52 PM Ben Chambers <bc...@apache.org> wrote:

> The metric api is designed to prevent user defined metric types based on
> the fact they just weren't used enough to justify support.
>
> Is there a reason we are bringing that complexity back? Shouldn't we just
> need the ability for the standard set plus any special system metrivs?
> On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com> wrote:
>
>> Thanks. I think this has simplified things.
>>
>> One thing that has occurred to me is that we're conflating the idea of
>> custom metrics and custom metric types. I would propose the MetricSpec
>> field be augmented with an additional field "type" which is a urn
>> specifying the type of metric it is (i.e. the contents of its payload, as
>> well as the form of aggregation). Summing or maxing over ints would be a
>> typical example. Though we could pursue making this opaque to the runner in
>> the long run, that's a more speculative (and difficult) feature to tackle.
>> This would allow the runner to at least aggregate and report/return to the
>> SDK metrics that it did not itself understand the semantic meaning of. (It
>> would probably simplify much of the specialization in the runner itself for
>> metrics that it *did* understand as well.)
>>
>> In addition, rather than having UserMetricOfTypeX for every type X one
>> would have a single URN for UserMetric and it spec would designate the type
>> and payload designate the (qualified) name.
>>
>> - Robert
>>
>>
>>
>> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com> wrote:
>>
>>> Thank you everyone for your feedback so far.
>>> I have made a revision today which is to make all metrics refer to a
>>> primary entity, so I have restructured some of the protos a little bit.
>>>
>>> The point of this change was to futureproof the possibility of allowing
>>> custom user metrics, with custom aggregation functions for its metric
>>> updates.
>>> Now that each metric has an aggregation_entity associated with it (e.g.
>>> PCollection, PTransform), we can design an approach which forwards the
>>> opaque bytes metric updates, without deserializing them. These are
>>> forwarded to user provided code which then would deserialize the metric
>>> update payloads and perform the custom aggregations.
>>>
>>> I think it has also simplified some of the URN metric protos, as they do
>>> not need to keep track of ptransform names inside themselves now. The
>>> result is simpler structures, for the metrics as the entities are pulled
>>> outside of the metric.
>>>
>>> I have mentioned this in the doc now, and wanted to draw attention to
>>> this particular revision.
>>>
>>>
>>>
>>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com> wrote:
>>>
>>>> I've gathered a lot of feedback so far and want to make a decision by
>>>> Friday, and begin working on related PRs next week.
>>>>
>>>> Please make sure that you provide your feedback before then and I will
>>>> post the final decisions made to this thread Friday afternoon.
>>>>
>>>>
>>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>>
>>>>> Nice, I created a short link so people can refer to it easily in
>>>>> future discussions, website, etc.
>>>>>
>>>>> https://s.apache.org/beam-fn-api-metrics
>>>>>
>>>>> Thanks for sharing.
>>>>>
>>>>>
>>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com>
>>>>> wrote:
>>>>> > Thanks for the nice writeup. I added some comments.
>>>>> >
>>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com>
>>>>> wrote:
>>>>> >>
>>>>> >> Hello beam community,
>>>>> >>
>>>>> >> Thank you everyone for your initial feedback on this proposal so
>>>>> far. I
>>>>> >> have made some revisions based on the feedback. There were some
>>>>> larger
>>>>> >> questions asking about alternatives. For each of these I have added
>>>>> a
>>>>> >> section tagged with [Alternatives] and discussed my recommendation
>>>>> as well
>>>>> >> as as few other choices we considered.
>>>>> >>
>>>>> >> I would appreciate more feedback on the revised proposal. Please
>>>>> take
>>>>> >> another look and let me know
>>>>> >>
>>>>> >>
>>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>>> >>
>>>>> >> Etienne, I would appreciate it if you could please take another
>>>>> look after
>>>>> >> the revisions I have made as well.
>>>>> >>
>>>>> >> Thanks again,
>>>>> >> Alex
>>>>> >>
>>>>> >
>>>>>
>>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Ben Chambers <bc...@apache.org>.
The metric API is designed to prevent user-defined metric types, based on
the fact that they just weren't used enough to justify support.

Is there a reason we are bringing that complexity back? Shouldn't we just
need the ability to support the standard set plus any special system metrics?

On Wed, Apr 11, 2018, 5:43 PM Robert Bradshaw <ro...@google.com> wrote:

> Thanks. I think this has simplified things.
>
> One thing that has occurred to me is that we're conflating the idea of
> custom metrics and custom metric types. I would propose the MetricSpec
> field be augmented with an additional field "type" which is a urn
> specifying the type of metric it is (i.e. the contents of its payload, as
> well as the form of aggregation). Summing or maxing over ints would be a
> typical example. Though we could pursue making this opaque to the runner in
> the long run, that's a more speculative (and difficult) feature to tackle.
> This would allow the runner to at least aggregate and report/return to the
> SDK metrics that it did not itself understand the semantic meaning of. (It
> would probably simplify much of the specialization in the runner itself for
> metrics that it *did* understand as well.)
>
> In addition, rather than having UserMetricOfTypeX for every type X one
> would have a single URN for UserMetric and it spec would designate the type
> and payload designate the (qualified) name.
>
> - Robert
>
>
>
> On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com> wrote:
>
>> Thank you everyone for your feedback so far.
>> I have made a revision today which is to make all metrics refer to a
>> primary entity, so I have restructured some of the protos a little bit.
>>
>> The point of this change was to futureproof the possibility of allowing
>> custom user metrics, with custom aggregation functions for its metric
>> updates.
>> Now that each metric has an aggregation_entity associated with it (e.g.
>> PCollection, PTransform), we can design an approach which forwards the
>> opaque bytes metric updates, without deserializing them. These are
>> forwarded to user provided code which then would deserialize the metric
>> update payloads and perform the custom aggregations.
>>
>> I think it has also simplified some of the URN metric protos, as they do
>> not need to keep track of ptransform names inside themselves now. The
>> result is simpler structures, for the metrics as the entities are pulled
>> outside of the metric.
>>
>> I have mentioned this in the doc now, and wanted to draw attention to
>> this particular revision.
>>
>>
>>
>> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com> wrote:
>>
>>> I've gathered a lot of feedback so far and want to make a decision by
>>> Friday, and begin working on related PRs next week.
>>>
>>> Please make sure that you provide your feedback before then and I will
>>> post the final decisions made to this thread Friday afternoon.
>>>
>>>
>>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> Nice, I created a short link so people can refer to it easily in
>>>> future discussions, website, etc.
>>>>
>>>> https://s.apache.org/beam-fn-api-metrics
>>>>
>>>> Thanks for sharing.
>>>>
>>>>
>>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com>
>>>> wrote:
>>>> > Thanks for the nice writeup. I added some comments.
>>>> >
>>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com> wrote:
>>>> >>
>>>> >> Hello beam community,
>>>> >>
>>>> >> Thank you everyone for your initial feedback on this proposal so
>>>> far. I
>>>> >> have made some revisions based on the feedback. There were some
>>>> larger
>>>> >> questions asking about alternatives. For each of these I have added a
>>>> >> section tagged with [Alternatives] and discussed my recommendation
>>>> as well
>>>> >> as as few other choices we considered.
>>>> >>
>>>> >> I would appreciate more feedback on the revised proposal. Please take
>>>> >> another look and let me know
>>>> >>
>>>> >>
>>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>>> >>
>>>> >> Etienne, I would appreciate it if you could please take another look
>>>> after
>>>> >> the revisions I have made as well.
>>>> >>
>>>> >> Thanks again,
>>>> >> Alex
>>>> >>
>>>> >
>>>>
>>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
Thanks. I think this has simplified things.

One thing that has occurred to me is that we're conflating the idea of
custom metrics and custom metric types. I would propose the MetricSpec
field be augmented with an additional field "type", which is a URN
specifying the type of metric it is (i.e. the contents of its payload, as
well as the form of aggregation). Summing or maxing over ints would be a
typical example. Though we could pursue making this opaque to the runner in
the long run, that's a more speculative (and difficult) feature to tackle.
This would allow the runner to at least aggregate and report/return to the
SDK metrics that it did not itself understand the semantic meaning of. (It
would probably simplify much of the specialization in the runner itself for
metrics that it *did* understand as well.)

In addition, rather than having a UserMetricOfTypeX URN for every type X, one
would have a single URN for UserMetric; its spec would designate the type
and its payload would designate the (qualified) name.

- Robert
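
As a rough illustration of that shape (the field names and URNs below are
assumptions made for the sketch, not the actual Fn API protos), a spec for a
user counter might look like:

    # Hypothetical sketch: the metric URN says *what* is being reported, the
    # type URN says how its payload is encoded and aggregated, and for a user
    # metric the payload carries the (qualified) name. Illustrative names
    # only, not the actual Fn API protos.
    from dataclasses import dataclass

    @dataclass
    class MetricSpec:
        urn: str        # a single shared URN for all user metrics (assumed)
        type: str       # type URN, e.g. sum-over-int64 or double-distribution
        payload: bytes  # user metric: the qualified name; else type-specific

    user_counter_spec = MetricSpec(
        urn="beam:metric:user:v1",             # assumed URN, for illustration
        type="beam:metric_type:sum_int64:v1",  # assumed URN, for illustration
        payload=b"my.namespace.elements_processed",
    )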



On Wed, Apr 11, 2018 at 5:12 PM Alex Amato <aj...@google.com> wrote:

> Thank you everyone for your feedback so far.
> I have made a revision today which is to make all metrics refer to a
> primary entity, so I have restructured some of the protos a little bit.
>
> The point of this change was to futureproof the possibility of allowing
> custom user metrics, with custom aggregation functions for its metric
> updates.
> Now that each metric has an aggregation_entity associated with it (e.g.
> PCollection, PTransform), we can design an approach which forwards the
> opaque bytes metric updates, without deserializing them. These are
> forwarded to user provided code which then would deserialize the metric
> update payloads and perform the custom aggregations.
>
> I think it has also simplified some of the URN metric protos, as they do
> not need to keep track of ptransform names inside themselves now. The
> result is simpler structures, for the metrics as the entities are pulled
> outside of the metric.
>
> I have mentioned this in the doc now, and wanted to draw attention to this
> particular revision.
>
>
>
> On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com> wrote:
>
>> I've gathered a lot of feedback so far and want to make a decision by
>> Friday, and begin working on related PRs next week.
>>
>> Please make sure that you provide your feedback before then and I will
>> post the final decisions made to this thread Friday afternoon.
>>
>>
>> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> Nice, I created a short link so people can refer to it easily in
>>> future discussions, website, etc.
>>>
>>> https://s.apache.org/beam-fn-api-metrics
>>>
>>> Thanks for sharing.
>>>
>>>
>>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com>
>>> wrote:
>>> > Thanks for the nice writeup. I added some comments.
>>> >
>>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com> wrote:
>>> >>
>>> >> Hello beam community,
>>> >>
>>> >> Thank you everyone for your initial feedback on this proposal so far.
>>> I
>>> >> have made some revisions based on the feedback. There were some larger
>>> >> questions asking about alternatives. For each of these I have added a
>>> >> section tagged with [Alternatives] and discussed my recommendation as
>>> well
>>> >> as as few other choices we considered.
>>> >>
>>> >> I would appreciate more feedback on the revised proposal. Please take
>>> >> another look and let me know
>>> >>
>>> >>
>>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>> >>
>>> >> Etienne, I would appreciate it if you could please take another look
>>> after
>>> >> the revisions I have made as well.
>>> >>
>>> >> Thanks again,
>>> >> Alex
>>> >>
>>> >
>>>
>>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
Thank you everyone for your feedback so far.
I have made a revision today which is to make all metrics refer to a
primary entity, so I have restructured some of the protos a little bit.

The point of this change was to future-proof the possibility of allowing
custom user metrics, with custom aggregation functions for their metric
updates.
Now that each metric has an aggregation_entity associated with it (e.g.
PCollection, PTransform), we can design an approach which forwards the
opaque bytes of the metric updates without deserializing them. These are
forwarded to user-provided code, which would then deserialize the metric
update payloads and perform the custom aggregations.

I think it has also simplified some of the URN metric protos, as they do
not need to keep track of PTransform names inside themselves now. The
result is simpler structures for the metrics, as the entities are pulled
outside of the metric.

I have mentioned this in the doc now, and wanted to draw attention to this
particular revision.
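
A minimal sketch of the restructured shape described above (field names and
URNs are assumptions for illustration, not the actual protos):

    # Hypothetical sketch: the aggregation entity lives outside the metric, so
    # the runner knows where to aggregate even when the update payload is
    # opaque bytes that only SDK or user code can deserialize. All names are
    # illustrative only.
    from dataclasses import dataclass

    @dataclass
    class AggregationEntity:
        kind: str  # e.g. "PTRANSFORM" or "PCOLLECTION"
        id: str    # reference to the entity in the pipeline proto

    @dataclass
    class MetricUpdate:
        entity: AggregationEntity  # pulled outside of the metric itself
        urn: str                   # which metric this update belongs to
        payload: bytes             # opaque to the runner; forwarded as-is

    update = MetricUpdate(
        entity=AggregationEntity(kind="PTRANSFORM", id="pardo-17"),  # assumed id
        urn="beam:metric:user:v1",                                   # assumed URN
        payload=b"\x2a",  # e.g. an encoded counter delta of 42
    )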



On Tue, Apr 10, 2018 at 9:53 AM Alex Amato <aj...@google.com> wrote:

> I've gathered a lot of feedback so far and want to make a decision by
> Friday, and begin working on related PRs next week.
>
> Please make sure that you provide your feedback before then and I will
> post the final decisions made to this thread Friday afternoon.
>
>
> On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com> wrote:
>
>> Nice, I created a short link so people can refer to it easily in
>> future discussions, website, etc.
>>
>> https://s.apache.org/beam-fn-api-metrics
>>
>> Thanks for sharing.
>>
>>
>> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com>
>> wrote:
>> > Thanks for the nice writeup. I added some comments.
>> >
>> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com> wrote:
>> >>
>> >> Hello beam community,
>> >>
>> >> Thank you everyone for your initial feedback on this proposal so far. I
>> >> have made some revisions based on the feedback. There were some larger
>> >> questions asking about alternatives. For each of these I have added a
>> >> section tagged with [Alternatives] and discussed my recommendation as
>> well
>> >> as as few other choices we considered.
>> >>
>> >> I would appreciate more feedback on the revised proposal. Please take
>> >> another look and let me know
>> >>
>> >>
>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>> >>
>> >> Etienne, I would appreciate it if you could please take another look
>> after
>> >> the revisions I have made as well.
>> >>
>> >> Thanks again,
>> >> Alex
>> >>
>> >
>>
>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Alex Amato <aj...@google.com>.
I've gathered a lot of feedback so far and want to make a decision by
Friday, and begin working on related PRs next week.

Please make sure that you provide your feedback before then and I will post
the final decisions made to this thread Friday afternoon.

On Thu, Apr 5, 2018 at 1:38 AM Ismaël Mejía <ie...@gmail.com> wrote:

> Nice, I created a short link so people can refer to it easily in
> future discussions, website, etc.
>
> https://s.apache.org/beam-fn-api-metrics
>
> Thanks for sharing.
>
>
> On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com>
> wrote:
> > Thanks for the nice writeup. I added some comments.
> >
> > On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com> wrote:
> >>
> >> Hello beam community,
> >>
> >> Thank you everyone for your initial feedback on this proposal so far. I
> >> have made some revisions based on the feedback. There were some larger
> >> questions asking about alternatives. For each of these I have added a
> >> section tagged with [Alternatives] and discussed my recommendation as
> well
> >> as as few other choices we considered.
> >>
> >> I would appreciate more feedback on the revised proposal. Please take
> >> another look and let me know
> >>
> >>
> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
> >>
> >> Etienne, I would appreciate it if you could please take another look
> after
> >> the revisions I have made as well.
> >>
> >> Thanks again,
> >> Alex
> >>
> >
>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Ismaël Mejía <ie...@gmail.com>.
Nice, I created a short link so people can refer to it easily in
future discussions, website, etc.

https://s.apache.org/beam-fn-api-metrics

Thanks for sharing.


On Wed, Apr 4, 2018 at 11:28 PM, Robert Bradshaw <ro...@google.com> wrote:
> Thanks for the nice writeup. I added some comments.
>
> On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com> wrote:
>>
>> Hello beam community,
>>
>> Thank you everyone for your initial feedback on this proposal so far. I
>> have made some revisions based on the feedback. There were some larger
>> questions asking about alternatives. For each of these I have added a
>> section tagged with [Alternatives] and discussed my recommendation as well
>> as as few other choices we considered.
>>
>> I would appreciate more feedback on the revised proposal. Please take
>> another look and let me know
>>
>> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>>
>> Etienne, I would appreciate it if you could please take another look after
>> the revisions I have made as well.
>>
>> Thanks again,
>> Alex
>>
>

Re: Updated [Proposal] Apache Beam Fn API : Defining and adding SDK Metrics

Posted by Robert Bradshaw <ro...@google.com>.
Thanks for the nice writeup. I added some comments.

On Wed, Apr 4, 2018 at 1:53 PM Alex Amato <aj...@google.com> wrote:

> Hello beam community,
>
> Thank you everyone for your initial feedback on this proposal so far. I
> have made some revisions based on the feedback. There were some larger
> questions asking about alternatives. For each of these I have added a
> section tagged with [Alternatives] and discussed my recommendation as well
> as as few other choices we considered.
>
> I would appreciate more feedback on the revised proposal. Please take
> another look and let me know
>
> https://docs.google.com/document/d/1MtBZYV7NAcfbwyy9Op8STeFNBxtljxgy69FkHMvhTMA/edit
>
> Etienne, I would appreciate it if you could please take another look after
> the revisions I have made as well.
>
> Thanks again,
> Alex
>
>