You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@beam.apache.org by Kenneth Knowles <kl...@google.com.INVALID> on 2017/01/03 18:19:59 UTC

Re: PipelineResult state management

Just to close the loop on the thread: There is, indeed, a JIRA for this,
and I see you have found it. It has a title similar to Dan's suggestion :-)

    https://issues.apache.org/jira/browse/BEAM-849

Kenn

On Tue, Dec 27, 2016 at 11:08 AM, Dan Halperin <dh...@google.com.invalid>
wrote:

> Right now we're using PipelineResult as a sort of hybrid Future (e.g,
> waitUntilFinish is like Future.get()), and I think Stas has a reasonable
> point that this is confusing.
>
> I know that figuring out the PipelineResult API for real is part of the
> pre-stable-release work -- sounds like the community should start gathering
> thoughts. JIRA is indeed probably a good place to do this -- there's
> probably alread a JIRA for "make PipelineResult a real API".
>
> Dan
>
> On Thu, Dec 22, 2016 at 3:39 PM, Stephen Sisk <si...@google.com.invalid>
> wrote:
>
> > Thanks for speaking up that you found it confusing. Feedback is always
> > appreciated!
> >
> > For the State enum - I hear you that returning nulls can cause issues
> when
> > you're expecting an enum.
> >
> > We could in theory have the API return a new result object that includes
> > "Did the wait succeed?" + "What was the state?" Having said that, I
> suspect
> > you could then argue that waitUntilFinish would be cleaner if it just
> > returned a bool for whether or not it
> >
> > If you feel strongly about this, I'd suggest filing a JIRA ticket. A
> > breaking API change would have a higher bar (and this is a method that I
> > suspect has a wide audience, so a very high bar), but a 2nd copy of
> > waitUntilFinish/cancel() that just returned whether or not they
> > accomplished their goal might be interesting. I suspect now that I've
> > suggested a specific potential solution other folks might chime in as
> well.
> > :)
> >
> > S
> >
> > On Thu, Dec 22, 2016 at 11:34 AM Stas Levin <st...@gmail.com> wrote:
> >
> > > Thanks for the detailed response.
> > >
> > > I did not have a particular scenario in mind, just found it a bit
> > confusing
> > > and was wondering if it was just me.
> > >
> > > Another thing that kind of threw me off was that waitUntilFinish()
> should
> > > return "null" (according to the documentation) in case it times out,
> > which
> > > IMHO is somewhat unexpected seeing that State is an enum. I guess the
> > > intent is to stress the fact that the returned "null" is the state of
> the
> > > *waiting* rather than the state of the underlying pipeline, yet it
> wasn't
> > > until the first couple of NullPointerExceptions that I realized this :)
> > >
> > > On Thu, Dec 22, 2016 at 9:05 PM Stephen Sisk <si...@google.com.invalid>
> > > wrote:
> > >
> > > > Is there a particular scenario you're worried about here? I can see
> how
> > > > it's important to be consistent once in a terminal state, but I don't
> > see
> > > > how the API makes that worse. Since any of these methods retrieving
> > state
> > > > ultimately rely on the runner, if the runner is returning different
> > > values
> > > > over time after returning a terminal state, then it's all moot.
> > > >
> > > > To put it another way, let's say I have the following code:
> > > > pipeline.waitUntilFinish();
> > > > State myState = pipeline.getState();
> > > > State myState2 = pipeline.getState();
> > > > // what guarantee is there that myState == myState2 ?
> > > >
> > > > Or let's say I have this:
> > > > pipeline.cancel();
> > > > State myState = pipeline.getState();
> > > > // if myState != cancel | fail, you will have angry tests
> > > >
> > > > It's noteworthy that waitUntilFinish and cancel result in terminal
> > > states,
> > > > where we shouldn't change to another state after we return from the
> > > > function. I'm not sure if that is documented/the consensus of the
> > > > community, but it's been how I think about those states. If a runner
> > were
> > > > changing states after being in a terminal state, then that can cause
> > > > issues. People like to write their tests/production code using
> > > > pipeline.waitUntilFinish() and then retrieving state, look at
> results,
> > > > etc... immediately after that. However, the fact that waitUntilFinish
> > > > returns a state doesn't cause that problem - it's that the runner
> impl
> > > > returns different values over time and that's true even if we only
> > return
> > > > state from getState.
> > > >
> > > > Is there a particular scenario driving your question?
> > > >
> > > > S
> > > >
> > > >
> > > > On Thu, Dec 22, 2016 at 10:29 AM Stas Levin <st...@gmail.com>
> > wrote:
> > > >
> > > > > I see, thanks for the snippet.
> > > > >
> > > > > Won't the API be more robust (i.e. not leave it up to implementors'
> > > > > interpretation) by not exposing the state in any way other than
> > > > getState()?
> > > > > The snippet above will still work by mutating existing state, and
> in
> > > > > addition such an API will prevent returning any other (possibly
> > > > > inconsistent) state.
> > > > >
> > > > > So instead of doing:
> > > > >
> > > > > State myState = pipeline.waitUntilFinish();
> > > > > State myOtherState = pipeline.getState();
> > > > >
> > > > > // technically (myState != state) can be true
> > > > >
> > > > > Users will do:
> > > > >
> > > > > pipeline.waitUntilFinish();
> > > > > State myState = pipeline.getState();
> > > > >
> > > > > What do you think?
> > > > >
> > > > >
> > > > > On Thu, Dec 22, 2016 at 7:32 PM Lukasz Cwik
> <lcwik@google.com.invalid
> > >
> > > > > wrote:
> > > > >
> > > > > > I think cancel and waitUntilFinish return state because they are
> > > > > > interacting with the runner and are likely to have pipeline state
> > > > > > information available to them at the time when performing that
> > > > operation.
> > > > > >
> > > > > > For example, waitUntilFinish(Duration) could just be a thin
> veneer
> > > of:
> > > > > > State state;
> > > > > > do {
> > > > > >   state = getState();
> > > > > >   if (state is terminal) {
> > > > > >     return state;
> > > > > >   }
> > > > > >   sleep
> > > > > > } while (time remaining)
> > > > > > return state;
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Dec 22, 2016 at 3:18 AM, Stas Levin <staslevin@gmail.com
> >
> > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I was wondering if the current API for PipelineResult might
> open
> > > the
> > > > > door
> > > > > > > to inconsistencies stemming from cancel() or waitUntilFinish()
> > > > > returning
> > > > > > > one state, and getState() returning another? Are such cases
> > legit?
> > > > > > >
> > > > > > > PipelineResult's API has a getState() method:
> > > > > > >
> > > > > > > State getState();
> > > > > > >
> > > > > > > at the same time other methods such as cancel() and
> > > waitUntilFinish()
> > > > > > > return State as well:
> > > > > > >
> > > > > > > State waitUntilFinish(Duration duration);
> > > > > > >
> > > > > > > State cancel() throws IOException;
> > > > > > >
> > > > > > > Is this intentional?
> > > > > > >
> > > > > > > The alternative would be making cancel() and waitUntilFinish()
> > > return
> > > > > > void,
> > > > > > > so that the only (and thus consistent) way to obtain a
> > > > PipelineResult's
> > > > > > > state would be via getState().
> > > > > > >
> > > > > > > Am I missing something?
> > > > > > >
> > > > > > > Regards,
> > > > > > > Stas
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>