Posted to dev@beam.apache.org by Aviem Zur <av...@gmail.com> on 2017/04/07 18:59:45 UTC

[DISCUSSION] PAssert success/failure count validation for all runners

Currently, PAssert assertions may never actually execute, and tests will pass
while silently hiding issues.

Up until now, several runners have implemented an assertion that the expected
number of successful assertions has actually happened, and that no failed
assertions have happened (the runners which check this are the Dataflow
runner and the Spark runner).

This has been valuable in the past to find bugs which were hidden by
passing tests.

The work to repeat this in https://issues.apache.org/jira/browse/BEAM-1726 has
surfaced bugs in the Flink runner that were also hidden by passing tests.
However, with the removal of aggregators in
https://issues.apache.org/jira/browse/BEAM-1148, this ticket will be harder
to implement, since the Flink runner does not support metrics.

I believe that validating that runners do in fact support the Beam model is a
blocker for the first stable release. (BEAM-1726 was also marked as a blocker
for the Flink runner.)

I think we have one of two choices here:
1. Keep implementing this for each runner separately.
2. Implement this in a runner-agnostic way (for runners which support
metrics, use metrics; for those that do not, use a fallback implementation,
perhaps using files or some other method; a rough sketch follows below).
This should be covered by the following ticket:
https://issues.apache.org/jira/browse/BEAM-1763
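
To make option 2 a bit more concrete, here is a rough sketch, in Java against
the Beam metrics API, of the kind of runner-agnostic check that could run
after the pipeline finishes. The class name, the counter names and the
expected-count bookkeeping are assumptions made up for illustration, not an
actual proposal:

    import org.apache.beam.sdk.PipelineResult;
    import org.apache.beam.sdk.metrics.MetricNameFilter;
    import org.apache.beam.sdk.metrics.MetricQueryResults;
    import org.apache.beam.sdk.metrics.MetricResult;
    import org.apache.beam.sdk.metrics.MetricsFilter;

    public class PAssertCountVerifier {
      // Illustrative names; the real counters would be defined inside PAssert.
      private static final String NAMESPACE = "PAssert";
      private static final String SUCCESS_COUNTER = "PAssertSuccess";
      private static final String FAILURE_COUNTER = "PAssertFailure";

      /** Fails if any assertion failed, or if fewer assertions than expected succeeded. */
      public static void verify(PipelineResult result, int expectedAssertions) {
        long failures = sumCounter(result, FAILURE_COUNTER);
        if (failures > 0) {
          throw new AssertionError(failures + " PAssert assertion(s) failed");
        }
        long successes = sumCounter(result, SUCCESS_COUNTER);
        // A lower bound rather than an exact match, to tolerate runners whose
        // attempted metric values may over-report.
        if (successes < expectedAssertions) {
          throw new AssertionError(
              "Expected " + expectedAssertions + " successful assertions but only "
                  + successes + " were recorded");
        }
      }

      private static long sumCounter(PipelineResult result, String name) {
        MetricQueryResults metrics = result.metrics().queryMetrics(
            MetricsFilter.builder()
                .addNameFilter(MetricNameFilter.named(NAMESPACE, name))
                .build());
        long total = 0;
        for (MetricResult<Long> counter : metrics.getCounters()) {
          total += counter.getAttempted();
        }
        return total;
      }
    }

A test harness (or shared test utilities) could then call
PAssertCountVerifier.verify(result, expectedAssertions) after run(),
independently of which runner executed the pipeline, as long as that runner
reports metrics.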

Thoughts?

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Ismaël Mejía <ie...@gmail.com>.
I have the impression this conversation went into a different
sub-discussion, ignoring the core subject: whether it makes sense to
keep implementing PAssert validation as we are doing it right now (1), or
to do it in a runner-agnostic way (2).

Big +1 for (2).

I also think this is critical enough to be part of the First
Stable Release (FSR). Again, I have the impression that removing
aggregators is a top priority, but we cannot remove them if we don't
cover the exact same use cases with metrics and have an acceptable
level of support from the runners.

Regards,
Ismaël

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Aviem Zur <av...@gmail.com>.
So to summarize, most seem to agree that:

1) We should verify that PAssert execution occurred in all runners.
2) We should verify this using metrics in sdk-java-core for runners which
support metrics. This will save runner writers from having to verify
this in their runner code. See:
https://issues.apache.org/jira/browse/BEAM-1763
3) Runners which do not support metrics need to verify this in another way
(or alternatively, implement metrics support to get this functionality
automatically via (2)). A rough, hypothetical sketch of such a fallback
appears at the end of this message.

This work is tracked in the following ticket:
https://issues.apache.org/jira/browse/BEAM-2001
Runners which support this via either (2) or (3) will have their respective
subtask marked as resolved.
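
As a purely hypothetical illustration of the kind of fallback that (3) leaves
room for (and of the "files or some other method" idea from the original
message): a runner that executes locally, e.g. a test runner, could count
success markers written to a temporary directory. Every name below is made up
for the example; this is not a concrete proposal:

    import java.io.IOException;
    import java.nio.charset.StandardCharsets;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.nio.file.Paths;
    import java.util.UUID;
    import org.apache.beam.sdk.transforms.DoFn;

    /** Writes one marker file per successful assertion; the test counts them afterwards. */
    class WriteSuccessMarkerFn extends DoFn<String, Void> {
      private final String markerDir;

      WriteSuccessMarkerFn(String markerDir) {
        this.markerDir = markerDir;
      }

      @ProcessElement
      public void processElement(ProcessContext c) throws IOException {
        // Only meaningful when the runner executes on the same machine as the test.
        Path marker = Paths.get(markerDir, "passert-success-" + UUID.randomUUID());
        Files.write(marker, c.element().getBytes(StandardCharsets.UTF_8));
      }
    }

    // Wired up as: successElements.apply(ParDo.of(new WriteSuccessMarkerFn(markerDir)));
    // After pipeline.run().waitUntilFinish(), the test could then check:
    //   long successes = Files.list(Paths.get(markerDir)).count();
    //   if (successes < expectedAssertions) { throw new AssertionError(...); }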

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Pablo Estrada <pa...@google.com.INVALID>.
Hi all,
sorry about the long silence. I was on vacation all of last week.
Also, the implementation of my proposal is in this PR:
https://github.com/apache/beam/pull/2417.

I don't currently have plans to work on
https://issues.apache.org/jira/browse/BEAM-1763. My main focus right
now is the removal of aggregators from the Java SDK, which is tracked by
https://issues.apache.org/jira/browse/BEAM-775.

As per the previous discussion, it seems reasonable that I go ahead with PR
2417 and move on with the removal of aggregators from other parts of the
SDK. Is this reasonable to the community?
Best
-P.


Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Dan Halperin <dh...@google.com.INVALID>.
(I'll also note that the bit about URNs and whatnot is decouplable -- we
have Pipeline surgery APIs right now, and will someday have
URN-with-payload-based-surgery APIs, but we can certainly do the work to
make PAssert more overridable now and be ready for full Runner API work
later).

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Dan Halperin <dh...@google.com.INVALID>.
I believe Pablo's existing proposal is here:
https://lists.apache.org/thread.html/CADJfNJBEuWYhhH1mzMwwvUL9Wv2HyFc8_E=9zYBKwUgT8ca1HA@mail.gmail.com

The idea is that we'll stick with the current design -- aggregator- (but
now metric)-driven validation of PAssert. Runners that do not support these
things can override the validation step to do something different.

This seems to me to satisfy all parties and unblock removal of aggregators.
If a runner supports aggregators but not metrics because the semantics are
slightly different, that runner can override the behavior.

I agree that all runners doing sensible things with PAssert should be a
first stable release blocker. But I do not think it's important that all
runners verify them the same way. There has been no proposal that provides
a single validation mechanism that works well with all runners.

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Aljoscha Krettek <al...@apache.org>.
That sounds very good! Now we only have to manage to get this in before the first stable release because I think this is a very important signal for ensuring Runner correctness.

@Pablo Do you already have plans regarding 3., i.e. stable URNs for the assertions? And also for verifying them in a runner-agnostic way in TestStream, i.e. https://issues.apache.org/jira/browse/BEAM-1763?

Best,
Aljoscha

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
On Sat, Apr 8, 2017 at 7:00 AM, Aljoscha Krettek <al...@apache.org>
wrote:

> @kenn What’s the design you’re mentioning? (I probably missed it because
> I’m not completely up to date on the Jiras and ML because of Flink Forward
> preparations)
>

There are three parts (I hope I say this in a way that makes everyone happy):

1. Each assertion transform is followed by a verifier transform that fails
if it sees a non-success (in addition to bumping metrics).
2. Use the same trick PAssert already uses: flatten in a dummy value to
reduce the risk that the verifier transform never runs.
3. Stable URNs for the assertion and verifier transforms, so a runner has a
good chance to wire in custom implementations, if it helps.

I think someone mentioned it earlier, but these also work better with
metrics that overcount, since it is now about covering the verifier
transforms rather than an absolute number of successes.
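
For illustration only, a minimal sketch of what parts 1 and 2 could look like.
The AssertionResult stand-in, the counter names and the wiring below are
assumptions made up for the example, not PAssert's actual internals:

    import java.io.Serializable;
    import org.apache.beam.sdk.Pipeline;
    import org.apache.beam.sdk.metrics.Counter;
    import org.apache.beam.sdk.metrics.Metrics;
    import org.apache.beam.sdk.transforms.Create;
    import org.apache.beam.sdk.transforms.DoFn;
    import org.apache.beam.sdk.transforms.Flatten;
    import org.apache.beam.sdk.transforms.ParDo;
    import org.apache.beam.sdk.values.PCollection;
    import org.apache.beam.sdk.values.PCollectionList;

    /** Illustrative stand-in for PAssert's internal assertion result. */
    class AssertionResult implements Serializable {
      final String error; // null means the assertion succeeded
      AssertionResult(String error) { this.error = error; }
      boolean isSuccess() { return error == null; }
    }

    class VerifierSketch {
      static PCollection<Void> attachVerifier(
          Pipeline pipeline, PCollection<AssertionResult> assertionResults) {
        // Part 2: flatten in a dummy success so the verifier transform always has input.
        // (The dummy also bumps the success counter, one more reason to check coverage
        // of the verifier transforms rather than an exact total.)
        PCollection<AssertionResult> withDummy =
            PCollectionList.of(assertionResults)
                .and(pipeline.apply("DummySuccess", Create.of(new AssertionResult(null))))
                .apply(Flatten.<AssertionResult>pCollections());

        // Part 1: bump metrics and fail hard on any non-success.
        return withDummy.apply("VerifyAssertions",
            ParDo.of(new DoFn<AssertionResult, Void>() {
              private final Counter successes = Metrics.counter("PAssert", "PAssertSuccess");
              private final Counter failures = Metrics.counter("PAssert", "PAssertFailure");

              @ProcessElement
              public void processElement(ProcessContext c) {
                if (c.element().isSuccess()) {
                  successes.inc();
                } else {
                  failures.inc();
                  throw new AssertionError(c.element().error);
                }
              }
            }));
      }
    }

The hard failure inside the verifier DoFn is what gives a signal even on
runners without metrics support, which is the point of part 1.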

Kenn


Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Aljoscha Krettek <al...@apache.org>.
@kenn What’s the design you’re mentioning? (I probably missed it because I’m not completely up to date on the Jiras and ML because of Flink Forward preparations)

> On 7. Apr 2017, at 12:42, Kenneth Knowles <kl...@google.com.INVALID> wrote:
> 
> We also have a design that improves the signal even without metrics, so I'm
> pretty happy with this.

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Kenneth Knowles <kl...@google.com.INVALID>.
We also have a design that improves the signal even without metrics, so I'm
pretty happy with this.

Re: [DISCUSSION] PAssert success/failure count validation for all runners

Posted by Lukasz Cwik <lc...@google.com.INVALID>.
I like the usage of metrics since it doesn't depend on external resources.
I believe there could be some small amount of code shared between runners
for the PAssert metric verification.

I would say that PAssert by itself and PAssert with metrics are two levels
of testing available. For runners that don't support metrics, PAssert
gives a signal (albeit a weaker one), and ones that do support metrics will
have a stronger signal for execution correctness.
