Posted to dev@beam.apache.org by Romain Manni-Bucau <rm...@gmail.com> on 2018/02/16 10:11:05 UTC

@TearDown guarantees

Hi guys,

I'm a bit concerned about this PR: https://github.com/apache/beam/pull/4637

I understand the intent but I'd like to share how I see it and why it is an
issue for me:

1. you can't help it if the JVM crashes, in any case. Tomcat, for instance,
tried preallocating some memory to free in case of an OOME and then attempt
a recovery, but it never proved useful and was dropped recently. This is a
good example that you can't do anything when there is a cataclysm, and
therefore no framework or lib will be blamed for it
2. if you expose an API, its behavior must be well defined. In the case of
a portable library like Beam it is even more important, otherwise it leads
users to not use the API or the project :(.


These two points lead me to say that if the JVM crashes it is ok not to call
teardown, and this is even implicit in any programming environment, so no need
to mention it. However, a runner not calling teardown is a bug and not
a feature or something intended, because it can have a huge impact on the
user flow.

The user workarounds are to use custom threads with timeouts to execute the
actions, or things like that - all bad solutions to paper over a buggy API,
if you remove the contract guarantee.

To make it obvious: substring(from, to): will substring the current string
between from and to... or not. Would you use that function?

What I ask is to add in the javadoc that the contract requires the runner
to call it. Which means the core doesn't guarantee it but imposes it on the
runner. This way the non-portable behavior stays where it belongs,
in the vendor-specific code. It leads to a reliable API for the end user
and lets runners document that they don't - yet - respect the API when relevant.

wdyt?
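To make the stake concrete, here is a minimal plain-Java sketch (no Beam dependency; the class and "runner" names are hypothetical, not Beam API) of the pattern under discussion: a resource acquired in a setup hook and released in a teardown hook, where a runner that skips teardown leaks the resource.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TeardownContractDemo {
    static final AtomicInteger OPEN_CONNECTIONS = new AtomicInteger();

    // Stand-in for a user DoFn; the comments mark the Beam lifecycle hooks
    // each method corresponds to in this illustration.
    static class LifecycleFn {
        void setup()    { OPEN_CONNECTIONS.incrementAndGet(); }  // @Setup: open connection
        void process(String element) { /* use the connection */ } // @ProcessElement
        void teardown() { OPEN_CONNECTIONS.decrementAndGet(); }  // @TearDown: release it
    }

    // A "runner" honoring the contract: teardown guaranteed via try/finally.
    static void runHonoringContract(Iterable<String> elements) {
        LifecycleFn fn = new LifecycleFn();
        fn.setup();
        try {
            for (String e : elements) fn.process(e);
        } finally {
            fn.teardown(); // guaranteed unless the JVM itself dies
        }
    }

    // A "runner" that skips teardown: the connection count leaks.
    static void runSkippingTeardown(Iterable<String> elements) {
        LifecycleFn fn = new LifecycleFn();
        fn.setup();
        for (String e : elements) fn.process(e);
        // no teardown call: resource leaked
    }

    public static void main(String[] args) {
        runHonoringContract(java.util.List.of("a", "b"));
        System.out.println("after honoring contract: " + OPEN_CONNECTIONS.get());  // 0
        runSkippingTeardown(java.util.List.of("a", "b"));
        System.out.println("after skipping teardown: " + OPEN_CONNECTIONS.get());  // 1
    }
}
```

This is only an illustration of why a documented guarantee matters to users; it says nothing about how a distributed runner would implement it.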

Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
@TearDown refers to DoFn teardown, not process teardown (it's basically a
destructor), so it's also runner defined.

There may be a place for a container that lives as long as the process (not
tied to the DoFn life). However that would be something new to add.
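A rough plain-Java sketch of that distinction (names are illustrative, not Beam API): a per-instance resource tied to the DoFn's setup/teardown versus a process-lifetime resource that only a JVM shutdown hook - or the hypothetical process-scoped container mentioned above - would release.

```java
public class ScopeDemo {
    // Process-scoped: lives as long as the JVM; released by a shutdown hook,
    // not by any DoFn teardown.
    static final StringBuilder PROCESS_LOG = new StringBuilder();
    static {
        Runtime.getRuntime().addShutdownHook(
            new Thread(() -> PROCESS_LOG.append("process-scope released;")));
    }

    // DoFn-scoped: released whenever the runner decides to destroy this instance.
    static class Fn {
        void setup()    { PROCESS_LOG.append("fn setup;"); }
        void teardown() { PROCESS_LOG.append("fn teardown;"); }
    }

    public static void main(String[] args) {
        Fn fn = new Fn();
        fn.setup();
        fn.teardown(); // runner-defined: may happen many times, or never, per process
        System.out.println(PROCESS_LOG);
    }
}
```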

On Fri, Feb 16, 2018, 8:52 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> finish bundle is well defined and must be called, right, but not at the end,
> so as a user you still miss teardown. Bundles are defined by the runner and
> you can have 100000 bundles per batch (even more for a stream ;)) so you
> don't want to release your resources or handle your execution auditing in it;
> you want that at the end, so in teardown.
>
> So yes we must have teardown reliable somehow.
>
>
>
> 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:
>
>> +1 I think @FinishBundle is the right thing to look at here.
>>
>> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
>> wrote:
>>
>>> Hi Romain
>>>
>>> Is it not @FinishBundle your solution ?
>>>
>>> Regards
>>> JB
>>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com> a
>>> écrit:
>>>>
>>>> I see, Reuven, so it is actually a broken contract for end users more
>>>> than a bug. Concretely a user must have a way to execute code once the
>>>> DoFn is no longer used, and a DoFn is populated by the user in the
>>>> context of an execution.
>>>> It means that if the environment wants to pool (cache) the instances, it
>>>> must provide a postBorrowFromCache and preReturnToCache to let the user
>>>> handle that - we're back to EJB and passivation ;).
>>>>
>>>> Personally I think it is fine to cache the instances for the duration
>>>> of an execution but not across executions. Concretely, if you check out the
>>>> API, it should just not be possible for a runner, since this lifecycle is
>>>> not covered, and the fact teardown cannot be called today is an
>>>> implementation bug/leak surfacing in the API.
>>>>
>>>> So I see 2 options:
>>>>
>>>> 1. make it mandatory and get rid of the caching - which shouldn't help
>>>> much in the current state in terms of perf anyway
>>>> 2. keep teardown a final release hook (which is not that useful because
>>>> of the previous point) and add a clean cache lifecycle management
>>>>
>>>> I'm tempted to say 1 is saner short term, in particular because beam is 2.x
>>>> and users already use it this way.
>>>>
>>>>
>>>>
>>>>
>>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>
>>>>> So the concern is that @TearDown might not be called?
>>>>>
>>>>> Let's understand the reason for @TearDown. The runner is free to cache
>>>>> the DoFn object across many invocations, and indeed in streaming this is
>>>>> often a critical optimization. However if the runner does decide to destroy
>>>>> the DoFn object (e.g. because it's being evicted from cache), often users
>>>>> need a callback to tear down associated resources (file handles, RPC
>>>>> connections, etc.).
>>>>>
>>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>>> runner might never tear down the DoFn object! The runner might well decide
>>>>> to cache the object forever, in which case there is never a time to call
>>>>> @TearDown. There is no violation of semantics here.
>>>>>
>>>>> Also, the point about not calling teardown if the JVM crashes might
>>>>> well sound implicit with no need to mention it. However empirically users
>>>>> do misunderstand even this, so it's worth mentioning.
>>>>>
>>>>> Reuven
>>>>>
>

Re: @TearDown guarantees

Posted by Thomas Groh <tg...@google.com>.
On perf: deserialization of an arbitrary object is expensive. This cost is
amortized over all of the elements that the object processes, but for a
runner with small bundles, that cost never gets meaningfully amortized -
deserializing a DoFn instance of unknown complexity to process one element
means we pay the full decoding cost potentially once per element. Reusing
user Fns permits us to amortize across worker lifetimes, which is many
times more beneficial.
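A back-of-envelope sketch of that amortization argument (the costs below are made-up numbers, purely for illustration):

```java
public class AmortizationDemo {
    // Per-element overhead = fixed deserialization cost / number of elements
    // processed by that DoFn instance over its lifetime. Units: microseconds.
    static double perElementOverheadMicros(double deserializeCostMicros,
                                           long elementsPerInstance) {
        return deserializeCostMicros / elementsPerInstance;
    }

    public static void main(String[] args) {
        double cost = 5_000; // hypothetical 5 ms to deserialize a complex DoFn
        // One-element bundles with no reuse: the full cost hits every element.
        System.out.println(perElementOverheadMicros(cost, 1));         // 5000.0
        // Instance reused across a worker lifetime: the cost vanishes.
        System.out.println(perElementOverheadMicros(cost, 1_000_000)); // 0.005
    }
}
```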

On resilience: you should distinguish "reliably" from "always". Users can
depend on "always" for correctness, but can't depend on something done
"reliably" for that. "Reliably" can generally be depended on for
performance, which is why Teardown exists.

Runners can call the Teardown method *almost always* if a DoFn instance
will not be reused (in the absence of arbitrary failures, the runner *MUST*
call Teardown, according to the original spec), but *SHOULD* and *MUST* are
extremely different in terms of implementation requirements. If you say
that Teardown *MUST* be called, what this means is that a runner *MUST NOT*
have resources that fail arbitrarily, and that is not an acceptable
restriction for any existing distributed backend.

If you need something which is always called, at a granularity coarser than
once per element, that is exactly what FinishBundle provides - and it is why
the method exists.
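The flush-at-bundle-granularity pattern described here, sketched in plain Java (no Beam dependency; the runner loop and class names are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;

public class FinishBundleDemo {
    static final List<List<String>> SINK = new ArrayList<>();

    // Stand-in for a batching DoFn; comments mark the lifecycle hooks
    // each method corresponds to in this illustration.
    static class BatchingFn {
        private final List<String> buffer = new ArrayList<>();
        void startBundle()     { buffer.clear(); }   // @StartBundle
        void process(String e) { buffer.add(e); }    // @ProcessElement: buffer only
        void finishBundle() {                        // @FinishBundle: reliable flush point
            if (!buffer.isEmpty()) SINK.add(new ArrayList<>(buffer));
            buffer.clear();
        }
    }

    public static void main(String[] args) {
        BatchingFn fn = new BatchingFn();
        // Hypothetical runner delivering two bundles of runner-chosen size.
        for (List<String> bundle : List.of(List.of("a", "b", "c"), List.of("d"))) {
            fn.startBundle();
            bundle.forEach(fn::process);
            fn.finishBundle(); // must be called for correctness, unlike @TearDown
        }
        System.out.println(SINK.size() + " flushes"); // 2 flushes
    }
}
```

Note the sketch also shows the cost Romain raises: the number of flushes equals the number of bundles, which the user does not control.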

The original proposal for Setup/Teardown:
https://docs.google.com/document/d/1LLQqggSePURt3XavKBGV7SZJYQ4NW8yCu63lBchzMRk/edit#

On Fri, Feb 16, 2018 at 9:39 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> So do I get it right that a leak of the Dataflow implementation impacts the
> API? Also it sounds like these perf issues are due to blind serialization
> instead of modeling what is serialized - nothing should be slow enough in
> the serialization at that level; do you have more details on that particular
> point?
>
> It also means you accept leaking particular instance data like passwords
> etc (typically all the @AutoValue builder ones) since you don't call - or
> not reliably - a post-execution hook, which should get solved ASAP.
>

> @Thomas: I understand your update was to align the Dataflow behavior with
> the API, but actually the opposite should be done: align the Dataflow impl
> on the API. If we disagree that tearDown is [1;1] - I'm fine with that -
> then teardown is not really usable for users and we miss such a hook.
> "the fact that we leave the runner discretion on when it can call
> teardown does not make this poorly-defined; it means that users should not
> depend on teardown being called for correct behavior, and *this has
> always been true and will continue to be true*."
> This is not really the case; you say it yourself: "[...] does not make
> this poorly-defined [...] it means that users should not depend on
> teardown". This literally means @TearDown is not part of the API. Once
> again I'm fine with it, but this kind of API is needed.
> "*this has always been true and will continue to be true*"
> Not really either, since it was not clear before and was runner dependent,
> so users could depend on it.
>
> With both statements I think it should just get fixed and made reliable -
> which is technically possible IMHO - instead of creating a new API which
> would make teardown a cache hook, which is an implementation detail that
> shouldn't surface in the API.
>
> What is missing is an @AfterExecution. @FinishBundle runs once the bundle
> finishes, so it is not a "finally" for the DoFn regarding the execution.
>
> Side note: the success callback hook which has been discussed N times
> doesn't match the need, which is really per instance (= accessible from that
> particular instance and not globally) in both success and failure cases.
>
>
> 2018-02-16 18:18 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>
>> Which runner's bundling are you concerned with? It sounds like the Flink
>> runner?
>>
>
> Flink, Spark, DirectRunner, DataFlow at least (others would be good but
> are out of scope)
>
>
>>
>> Kenn
>>
>>
>> On Fri, Feb 16, 2018 at 9:04 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>>
>>> 2018-02-16 17:59 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>>>
>>>> What I am hearing is this:
>>>>
>>>>  - @FinishBundle does what you want (a reliable "flush" call) but your
>>>> runner is not doing a good job of bundling
>>>>
>>>
>>> Nope, finishbundle is defined but a bundle is not. Typically for 1 million
>>> rows I'll get 1 million calls in Flink and 1 call in Spark (today), so this
>>> is not a way to call a final task to release DoFn internal instances or do
>>> some one-time auditing.
>>>
>>>
>>>>  - @Teardown has well-defined semantics and they are not what you want
>>>>
>>>
>>> "Note that calls to the annotated method are best effort, and may not
>>> occur for arbitrary reasons"
>>>
>>> is not really "well-defined" and is also a breaking change compared to
>>> the < 2.3.x (x >= 1) releases.
>>>
>>>
>>>> So you are hoping for something that is called less frequently but is
>>>> still mandatory.
>>>>
>>>> Just trying to establish the basics to start over and get this on track
>>>> to solving the real problem.
>>>>
>>>
>>> Concretely I need a well-defined lifecycle for any DoFn executed in beam,
>>> and today there is no such thing, making it impossible to correctly develop
>>> transforms/fns on the user side.
>>>
>>>
>>>>
>>>> Kenn
>>>>
>>>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Yes, exactly, JB. I just want to ensure the sdk/core API is clear and well
defined, and that any violation of it falls into a runner bug. What I
don't want is a buggy impl leaking into the SDK/core definition.



2018-02-18 17:56 GMT+01:00 Jean-Baptiste Onofré <jb...@nanthrax.net>:

> My bad, I thought you talked about a guarantee in the Runner API.
>
> If it's a semantic point in the SDK (enforcement instead of best effort),
> and then if the runner doesn't respect that, it's a limitation/bug in the
> runner, I would agree with that.
>
> Regards
> JB
>
> On 18/02/2018 16:58, Romain Manni-Bucau wrote:
>
>>
>>
>> Le 18 févr. 2018 15:39, "Jean-Baptiste Onofré" <jb@nanthrax.net <mailto:
>> jb@nanthrax.net>> a écrit :
>>
>>     Hi,
>>
>>     I think, as you said, it depends on the protocol and the IO.
>>
>>     For instance, in first version of JdbcIO, I created the connections
>>     in @Setup
>>     and released in @Teardown.
>>
>>     But, in case of streaming system, it's not so good (especially for
>>     pooling) as
>>     the connection stays open for a very long time.
>>
>>
>> Hmm, that can be debated in practice (both pooling and connection holding
>> for jdbc in a beam context), but let's assume it.
>>
>>
>>     So, I updated to deal with connection in @StartBundle and release in
>>     @FinishBundle.
>>
>>
>>
>> Which leads to an unpredictable bundle size and therefore very bad write
>> performance - the read size is faked by an in-memory buffer I guess, which
>> breaks the bundle definition, but let's ignore that too for now.
>>
>>
>>     So, I think it depends on the kind of connection: the kind of connection
>>     actually holding resources should be managed in the bundle (at least for
>>     now); the other kind of connection (just wrapping configuration but not
>>     holding resources, like the Apache HTTP Components client for instance)
>>     could be dealt with in the DoFn lifecycle.
>>
>>
>>
>> Once again, I would be ok with bundles for now - but it doesn't solve the
>> real issue - if bundles were up to the user. Since they are not, it doesn't
>> help and can just degrade the overall behavior in both batch and streaming.
>>
>> I fully understand beam doesn't handle that properly today. What does
>> block us from doing it? Nothing technical, so why not do it?
>>
>> Technically:
>>
>> 1. Teardown can be guaranteed
>> 2. Bundle size can be highly influenced / configured by the user
>>
>> Both are needed to be able to propose a strong api compared to competitors,
>> and to aim at not only having disadvantages when going portable.
>>
>> Let's just do it, no?
>>
>>
>>     Regards
>>     JB
>>
>>     On 02/18/2018 11:05 AM, Romain Manni-Bucau wrote:
>>      >
>>      >
>>      > Le 18 févr. 2018 00:23, "Kenneth Knowles" <klk@google.com
>>     <ma...@google.com>
>>      > <mailto:klk@google.com <ma...@google.com>>> a écrit :
>>      >
>>      >     On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau
>>     <rmannibucau@gmail.com <ma...@gmail.com>
>>      >     <mailto:rmannibucau@gmail.com
>>
>>     <ma...@gmail.com>>> wrote:
>>      >
>>      >             If you give an example of a high-level need (e.g.
>>     "I'm trying to
>>      >             write an IO for system $x and it requires the following
>>      >             initialization and the following cleanup logic and
>>     the following
>>      >             processing in between") I'll be better able to help
>> you.
>>      >
>>      >
>>      >         Take a simple example of a transform requiring a
>>      >         connection. Using bundles is a perf killer since the size is
>>      >         not controlled. Using teardown doesn't allow you to release
>>      >         the connection since it is a best-effort thing. Not releasing
>>      >         the connection makes you pay a lot - aws ;) - or prevents you
>>      >         from launching other processing - concurrency limit.
>>      >
>>      >
>>      >     For this example @Teardown is an exact fit. If things die so
>>     badly that
>>      >     @Teardown is not called then nothing else can be called to
>>     close the
>>      >     connection either. What AWS service are you thinking of that
>>     stays open for
>>      >     a long time when everything at the other end has died?
>>      >
>>      >
>>      > You assume connections are kind of stateless, but some (proprietary)
>>      > protocols require some closing exchanges which are not only "I'm
>>      > leaving".
>>      >
>>      > For aws I was thinking about starting some services - machines - on
>>      > the fly in a pipeline startup and closing them at the end. If teardown
>>      > is not called you leak machines and money. You can say it can be done
>>      > another way... as can the full pipeline ;).
>>      >
>>      > I don't want to be picky, but if beam can't handle its components'
>>      > lifecycle it can't be used at scale for generic pipelines and is bound
>>      > to some particular IOs.
>>      >
>>      > What prevents enforcing teardown - ignoring the interstellar crash
>>      > case, which can't be handled by any human system? Nothing technically.
>>      > Why do you push to not handle it? Is it due to some legacy code on
>>      > dataflow or something else?
>>      >
>>      > Also what does it mean for the users? The direct runner does it, so if
>>      > a user uses the RI in tests, will he get a different behavior in prod?
>>      > Also don't forget the user doesn't know what the IOs he composes use,
>>      > so this is so impacting for the whole product that it must be handled
>>      > IMHO.
>>      >
>>      > I understand the portability culture is new in the big data world, but
>>      > it is not a reason to ignore what people did for years and do it wrong
>>      > before doing it right ;).
>>      >
>>      > My proposal is to list what can prevent guaranteeing - under normal IT
>>      > conditions - the execution of teardown. Then we see if we can handle
>>      > it, and only if there is a technical reason we can't do we make it
>>      > experimental/unsupported in the api. I know spark and flink can; any
>>      > unknown blocker for other runners?
>>      >
>>      > Technical note: even a kill should go through java shutdown hooks,
>>      > otherwise your environment (beam's enclosing software) is fully
>>      > unhandled and your overall system is uncontrolled. The only case where
>>      > this is not true is when the software is always owned by a vendor and
>>      > never installed on a customer environment. In this case it belongs to
>>      > the vendor to handle the beam API, and not to beam to adjust its API
>>      > for a vendor - otherwise all features unsupported by one runner should
>>      > be made optional, right?
>>      >
>>      > Not all state is about the network, even in distributed systems, so
>>      > this is key to have an explicit and defined lifecycle.
>>      >
>>      >
>>      >     Kenn
>>      >
>>      >
>>
>>     --
>>     Jean-Baptiste Onofré
>>     jbonofre@apache.org <ma...@apache.org>
>>     http://blog.nanthrax.net
>>     Talend - http://www.talend.com
>>
>>
>>

Re: @TearDown guarantees

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
My bad, I thought you talked about a guarantee in the Runner API.

If it's a semantic point in the SDK (enforcement instead of best effort),
and then if the runner doesn't respect that, it's a limitation/bug in
the runner, I would agree with that.

Regards
JB
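If the SDK stated the enforcement and runners implemented it, the runner-side shape could be as simple as a try/finally plus a JVM shutdown hook, as the thread suggests. A plain-Java sketch (all names hypothetical; a hard `kill -9` still escapes, which is the "cataclysm" case everyone agrees is out of scope):

```java
public class EnforcedTeardownDemo {
    interface Teardownable { void teardown(); }

    // Hypothetical runner-side helper: teardown is guaranteed on the normal
    // and failure paths via finally, and on JVM shutdown via a hook.
    static void executeWithGuaranteedTeardown(Teardownable fn, Runnable work) {
        Thread hook = new Thread(fn::teardown);
        Runtime.getRuntime().addShutdownHook(hook);
        try {
            work.run();
        } finally {
            // Normal path reached: drop the hook and tear down exactly once.
            Runtime.getRuntime().removeShutdownHook(hook);
            fn.teardown();
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        try {
            executeWithGuaranteedTeardown(
                () -> log.append("teardown;"),
                () -> { throw new RuntimeException("element failure"); });
        } catch (RuntimeException expected) {
            // the failure propagated, but teardown still ran
        }
        System.out.println(log); // teardown;
    }
}
```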

>     something else?
>      >
>      > Also what does it mean for the users? Direct runner does it so if
>     a user udes
>      > the RI in test, he will get a different behavior in prod? Also
>     dont forget the
>      > user doesnt know what the IOs he composes use so this is so
>     impacting for the
>      > whole product than he must be handled IMHO.
>      >
>      > I understand the portability culture is new in big data world but
>     it is not a
>      > reason to ignore what people did for years and do it wrong before
>     doing right ;).
>      >
>      > My proposal is to list what can prevent to guarantee - in the
>     normal IT
>      > conditions - the execution of teardown. Then we see if we can
>     handle it and only
>      > if there is a technical reason we cant we make it
>     experimental/unsupported in
>      > the api. I know spark and flink can, any unknown blocker for
>     other runners?
>      >
>      > Technical note: even a kill should go through java shutdown hooks
>     otherwise your
>      > environment (beam enclosing software) is fully unhandled and your
>     overall system
>      > is uncontrolled. Only case where it is not true is when the
>     software is always
>      > owned by a vendor and never installed on customer environment. In
>     this case it
>      > belongd to the vendor to handle beam API and not to beam to
>     adjust its API for a
>      > vendor - otherwise all unsupported features by one runner should
>     be made
>      > optional right?
>      >
>      > All state is not about network, even in distributed systems so
>     this is key to
>      > have an explicit and defined lifecycle.
>      >
>      >
>      >     Kenn
>      >
>      >
> 
>     --
>     Jean-Baptiste Onofré
>     jbonofre@apache.org <ma...@apache.org>
>     http://blog.nanthrax.net
>     Talend - http://www.talend.com
> 
> 

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 18 févr. 2018 15:39, "Jean-Baptiste Onofré" <jb...@nanthrax.net> a écrit :

Hi,

I think, as you said, it depends on the protocol and the IO.

For instance, in the first version of JdbcIO, I created the connections in
@Setup and released them in @Teardown.

But in the case of a streaming system, that's not so good (especially for
pooling), as the connection stays open for a very long time.


Hmm, this can be debated in practice (both pooling and connection holding for
JDBC in a Beam context), but let's assume it.


So, I updated it to acquire the connection in @StartBundle and release it in
@FinishBundle.



Which leads to an unpredictable bundle size and therefore very bad
performance on the write side - the read side is faked by an in-memory
buffer, I guess, which breaks the bundle definition, but let's ignore that
too for now.


So, I think it depends on the kind of connection: the kind that actually
holds resources should be managed per bundle (at least for now), while the
other kind (just wrapping configuration but not holding resources, like the
Apache HTTP Components Client for instance) could be dealt with in the DoFn
lifecycle.



Once again, I would be OK with bundles for now - but it only solves the
real issue if bundle sizing is up to the user. Since it is not, it doesn't
help and can just degrade the overall behavior in both batch and streaming.

I fully understand Beam doesn't handle this properly today. What blocks
doing it? Nothing technical, so why not do it?

Technically:

1. Teardown can be guaranteed
2. Bundle size can be highly influenced / configured by the user

Both are needed to propose a strong API compared to competitors, and to
ensure that going portable is not only a disadvantage for users.

Let's just do it, no?
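To make point 1 concrete, here is a minimal plain-Java sketch (not Beam code;
all names here are illustrative) of a worker loop that guarantees teardown runs
for every setup - on success, on failure, and, via a shutdown hook, on an
external SIGTERM:

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java sketch (not Beam code): a worker loop that guarantees
// teardown() runs for every setup(), on success and on failure, and
// registers a JVM shutdown hook as a last resort for an external SIGTERM.
public class LifecycleSketch {
    final List<String> events = new ArrayList<>();

    void setup() { events.add("setup"); }

    void process(int element) {
        if (element < 0) throw new RuntimeException("processing failure");
        events.add("process:" + element);
    }

    void teardown() { events.add("teardown"); }

    void run(int[] elements) {
        // last-resort cleanup if the JVM receives SIGTERM mid-run
        Thread hook = new Thread(this::teardown);
        Runtime.getRuntime().addShutdownHook(hook);
        setup();
        try {
            for (int e : elements) process(e);
        } finally {
            Runtime.getRuntime().removeShutdownHook(hook);
            teardown(); // guaranteed on success *and* on exception
        }
    }
}
```

The only cases left uncovered are hard crashes (kill -9, hardware failure),
which no framework can intercept anyway.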


Regards
JB

On 02/18/2018 11:05 AM, Romain Manni-Bucau wrote:
>
>
> Le 18 févr. 2018 00:23, "Kenneth Knowles" <klk@google.com
> <ma...@google.com>> a écrit :
>
>     On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
rmannibucau@gmail.com
>     <ma...@gmail.com>> wrote:
>
>             If you give an example of a high-level need (e.g. "I'm trying
to
>             write an IO for system $x and it requires the following
>             initialization and the following cleanup logic and the
following
>             processing in between") I'll be better able to help you.
>
>
>         Take a simple example of a transform requiring a connection. Using
>         bundles is a perf killer since size is not controlled. Using
teardown
>         doesnt allow you to release the connection since it is a best
effort
>         thing. Not releasing the connection makes you pay a lot - aws ;)
- or
>         prevents you to launch other processings - concurrent limit.
>
>
>     For this example @Teardown is an exact fit. If things die so badly
that
>     @Teardown is not called then nothing else can be called to close the
>     connection either. What AWS service are you thinking of that stays
open for
>     a long time when everything at the other end has died?
>
>
> You assume connections are kind of stateless but some (proprietary)
protocols
> requires some closing exchanges which are not only "im leaving".
>
> For aws i was thinking about starting some services - machines - on the
fly in a
> pipeline startup and closing them at the end. If teardown is not called
you leak
> machines and money. You can say it can be done another way...as the full
> pipeline ;).
>
> I dont want to be picky but if beam cant handle its components lifecycle
it can
> be used at scale for generic pipelines and if bound to some particular IO.
>
> What does prevent to enforce teardown - ignoring the interstellar crash
case
> which cant be handled by any human system? Nothing technically. Why do
you push
> to not handle it? Is it due to some legacy code on dataflow or something
else?
>
> Also what does it mean for the users? Direct runner does it so if a user
udes
> the RI in test, he will get a different behavior in prod? Also dont
forget the
> user doesnt know what the IOs he composes use so this is so impacting for
the
> whole product than he must be handled IMHO.
>
> I understand the portability culture is new in big data world but it is
not a
> reason to ignore what people did for years and do it wrong before doing
right ;).
>
> My proposal is to list what can prevent to guarantee - in the normal IT
> conditions - the execution of teardown. Then we see if we can handle it
and only
> if there is a technical reason we cant we make it
experimental/unsupported in
> the api. I know spark and flink can, any unknown blocker for other
runners?
>
> Technical note: even a kill should go through java shutdown hooks
otherwise your
> environment (beam enclosing software) is fully unhandled and your overall
system
> is uncontrolled. Only case where it is not true is when the software is
always
> owned by a vendor and never installed on customer environment. In this
case it
> belongd to the vendor to handle beam API and not to beam to adjust its
API for a
> vendor - otherwise all unsupported features by one runner should be made
> optional right?
>
> All state is not about network, even in distributed systems so this is
key to
> have an explicit and defined lifecycle.
>
>
>     Kenn
>
>

--
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: @TearDown guarantees

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi,

I think, as you said, it depends on the protocol and the IO.

For instance, in the first version of JdbcIO, I created the connections in @Setup
and released them in @Teardown.

But in the case of a streaming system, that's not so good (especially for pooling),
as the connection stays open for a very long time.

So, I updated it to acquire the connection in @StartBundle and release it in @FinishBundle.

So, I think it depends on the kind of connection: the kind that actually
holds resources should be managed per bundle (at least for now), while the
other kind (just wrapping configuration but not holding resources, like the
Apache HTTP Components Client for instance) could be dealt with in the DoFn lifecycle.
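A minimal plain-Java sketch of the two placements described above (not the
Beam SDK; all names are illustrative), counting how often the connection is
opened:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Plain-Java sketch (not the Beam SDK): contrasts opening a connection per
// DoFn instance (@Setup/@Teardown) with opening it per bundle
// (@StartBundle/@FinishBundle). The counter records connection opens.
public class ConnectionPlacementSketch {
    static final AtomicInteger opens = new AtomicInteger();

    interface Fn {
        void startBundle();
        void process(String element);
        void finishBundle();
    }

    // Per-instance: the connection lives for the whole DoFn lifetime.
    static class PerInstanceFn implements Fn {
        PerInstanceFn() { opens.incrementAndGet(); }           // like @Setup
        public void startBundle() {}
        public void process(String element) {}
        public void finishBundle() {}
        void teardown() { /* close the connection here */ }    // like @Teardown
    }

    // Per-bundle: the connection is opened and closed around each bundle.
    static class PerBundleFn implements Fn {
        public void startBundle() { opens.incrementAndGet(); } // like @StartBundle
        public void process(String element) {}
        public void finishBundle() { /* close it here */ }     // like @FinishBundle
    }

    static void runBundles(Fn fn, int bundles) {
        for (int b = 0; b < bundles; b++) {
            fn.startBundle();
            fn.process("element");
            fn.finishBundle();
        }
    }
}
```

The per-instance variant pays one open per DoFn but holds the connection
between bundles; the per-bundle variant never holds it between bundles but
pays one open per bundle - which hurts when bundle sizes are small and
unpredictable.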

Regards
JB

On 02/18/2018 11:05 AM, Romain Manni-Bucau wrote:
> 
> 
> Le 18 févr. 2018 00:23, "Kenneth Knowles" <klk@google.com
> <ma...@google.com>> a écrit :
> 
>     On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <rmannibucau@gmail.com
>     <ma...@gmail.com>> wrote:
> 
>             If you give an example of a high-level need (e.g. "I'm trying to
>             write an IO for system $x and it requires the following
>             initialization and the following cleanup logic and the following
>             processing in between") I'll be better able to help you.
> 
> 
>         Take a simple example of a transform requiring a connection. Using
>         bundles is a perf killer since size is not controlled. Using teardown
>         doesnt allow you to release the connection since it is a best effort
>         thing. Not releasing the connection makes you pay a lot - aws ;) - or
>         prevents you to launch other processings - concurrent limit.
> 
> 
>     For this example @Teardown is an exact fit. If things die so badly that
>     @Teardown is not called then nothing else can be called to close the
>     connection either. What AWS service are you thinking of that stays open for
>     a long time when everything at the other end has died?
> 
> 
> You assume connections are kind of stateless but some (proprietary) protocols
> requires some closing exchanges which are not only "im leaving".
> 
> For aws i was thinking about starting some services - machines - on the fly in a
> pipeline startup and closing them at the end. If teardown is not called you leak
> machines and money. You can say it can be done another way...as the full
> pipeline ;).
> 
> I dont want to be picky but if beam cant handle its components lifecycle it can
> be used at scale for generic pipelines and if bound to some particular IO.
> 
> What does prevent to enforce teardown - ignoring the interstellar crash case
> which cant be handled by any human system? Nothing technically. Why do you push
> to not handle it? Is it due to some legacy code on dataflow or something else?
> 
> Also what does it mean for the users? Direct runner does it so if a user udes
> the RI in test, he will get a different behavior in prod? Also dont forget the
> user doesnt know what the IOs he composes use so this is so impacting for the
> whole product than he must be handled IMHO.
> 
> I understand the portability culture is new in big data world but it is not a
> reason to ignore what people did for years and do it wrong before doing right ;).
> 
> My proposal is to list what can prevent to guarantee - in the normal IT
> conditions - the execution of teardown. Then we see if we can handle it and only
> if there is a technical reason we cant we make it experimental/unsupported in
> the api. I know spark and flink can, any unknown blocker for other runners?
> 
> Technical note: even a kill should go through java shutdown hooks otherwise your
> environment (beam enclosing software) is fully unhandled and your overall system
> is uncontrolled. Only case where it is not true is when the software is always
> owned by a vendor and never installed on customer environment. In this case it
> belongd to the vendor to handle beam API and not to beam to adjust its API for a
> vendor - otherwise all unsupported features by one runner should be made
> optional right?
> 
> All state is not about network, even in distributed systems so this is key to
> have an explicit and defined lifecycle.
> 
> 
>     Kenn
> 
> 

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
If I understand, I think I agree with Ismaël fully. We should be able to
make these ValidatesRunner tests and turn them on. At the small scale like
that, it should always pass. It is OK if a lightning strike causes the test
to fail. Many other tests will fail too :-)

I really like the new javadoc - I just had to read it to compare the tests
with the spec, and it was very easy.
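For reference, the property such a lifecycle test checks can be sketched in
plain Java (illustrative names, not the actual ParDoLifecycleTest code):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the property the promoted ValidatesRunner tests check, modeled
// in plain Java: after an exception in the element-processing method, the
// runner must still call teardown before discarding the DoFn instance.
public class TeardownAfterExceptionSketch {
    static class RecordingFn {
        final List<String> calls = new ArrayList<>();
        void setup() { calls.add("setup"); }
        void processElement(String element) {
            calls.add("process");
            throw new RuntimeException("element failure");
        }
        void teardown() { calls.add("teardown"); }
    }

    // A minimal "runner" honoring the contract: teardown before discard.
    static List<String> runAndFail(RecordingFn fn) {
        fn.setup();
        try {
            fn.processElement("x");
        } catch (RuntimeException e) {
            // the pipeline fails, but the instance is still torn down
        } finally {
            fn.teardown();
        }
        return fn.calls;
    }
}
```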

Kenn

On Wed, Feb 21, 2018 at 12:47 AM, Ismaël Mejía <ie...@gmail.com> wrote:

> Hello, thanks Eugene for improving the documentation so we can close
> this thread.
>
> Reuven, I understood the semantics of the methods; what surprised me was
> that I read the new documentation as saying a runner could simply skip
> calling @Teardown, and we have already dealt with the issues of not doing
> this: when there is an exception in the element methods
> (startBundle/processElement/finishBundle), we can leak resources by not
> calling teardown, as the Spark runner user reported in the link I sent.
>
> So, considering that a runner should do its best to call that method, I
> promoted some of the tests in ParDoLifecycleTest to ValidatesRunner to
> ensure that runners call teardown after exceptions, and I filed BEAM-3245
> so that the DataflowRunner does its best to respect the lifecycle when it
> can. (Note I auto-assigned this JIRA, but it is up to you guys to reassign
> it to the person who can work on it.)
>
>
> On Wed, Feb 21, 2018 at 7:26 AM, Reuven Lax <re...@google.com> wrote:
> > To close the loop here:
> >
> > Romain, I think your actual concern was that the Javadoc made it sound
> like
> > a runner could simply decide not to call Teardown. If so, then I agree
> with
> > you - the Javadoc was misleading (and appears it was confusing to Ismael
> as
> > well). If a runner destroys a DoFn, it _must_ call TearDown before it
> calls
> > Setup on a new DoFn.
> >
> > If so, then most of the back and forth on this thread had little to do
> with
> > your actual concern. However it did take almost three days of discussion
> > before Eugene understood what your real concern was, leading to the side
> > discussions.
> >
> > Reuven
> >
> > On Mon, Feb 19, 2018 at 6:08 PM, Reuven Lax <re...@google.com> wrote:
> >>
> >> +1 This PR clarifies the semantics quite a bit.
> >>
> >> On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov <kirpichov@google.com
> >
> >> wrote:
> >>>
> >>> I've sent out a PR editing the Javadoc
> >>> https://github.com/apache/beam/pull/4711 . Hopefully, that should be
> >>> sufficient.
> >>>
> >>> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
> >>>>
> >>>> Ismael, your understanding is appropriate for FinishBundle.
> >>>>
> >>>> One basic issue with this understanding, is that the lifecycle of a
> DoFn
> >>>> is much longer than a single bundle (which I think you expressed by
> adding
> >>>> the *s). How long the DoFn lives is not defined. In fact a runner is
> >>>> completely free to decide that it will _never_ destroy the DoFn, in
> which
> >>>> case TearDown is never called simply because the DoFn was never torn
> down.
> >>>>
> >>>> Also, as mentioned before, the runner can only call TearDown in cases
> >>>> where the shutdown is in its control. If the JVM is shut down
> externally,
> >>>> the runner has no chance to call TearDown. This means that while
> TearDown is
> >>>> appropriate for cleaning up in-process resources (open connections,
> etc.),
> >>>> it's not the right answer for cleaning up persistent resources. If
> you rely
> >>>> on TearDown to delete VMs or delete files, there will be cases in
> which
> >>>> those files of VMs are not deleted.
> >>>>
> >>>> What we are _not_ saying is that the runner is free to just ignore
> >>>> TearDown. If the runner is explicitly destroying a DoFn object, it
> should
> >>>> call TearDown.
> >>>>
> >>>> Reuven
> >>>>
> >>>>
> >>>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com>
> wrote:
> >>>>>
> >>>>> I also had a different understanding of the lifecycle of a DoFn.
> >>>>>
> >>>>> My understanding of the use case for every method in the DoFn was
> clear
> >>>>> and
> >>>>> perfectly aligned with Thomas explanation, but what I understood was
> >>>>> that in a
> >>>>> general terms ‘@Setup was where I got resources/prepare connections
> and
> >>>>> @Teardown where I free them’, so calling Teardown seemed essential to
> >>>>> have a
> >>>>> complete lifecycle:
> >>>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
> >>>>>
> >>>>> The fact that @Teardown could not be called is a new detail for me
> too,
> >>>>> and I
> >>>>> also find weird to have a method that may or not be called as part of
> >>>>> an API,
> >>>>> why would users implement teardown if it will not be called? In that
> >>>>> case
> >>>>> probably a cleaner approach would be to get rid of that method
> >>>>> altogether, no?
> >>>>>
> >>>>> But well maybe that’s not so easy too, there was another point: Some
> >>>>> user
> >>>>> reported an issue with leaking resources using KafkaIO in the Spark
> >>>>> runner, for
> >>>>> ref.
> >>>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
> >>>>>
> >>>>> In that moment my understanding was that there was something fishy
> >>>>> because we
> >>>>> should be calling Teardown to close correctly the connections and
> free
> >>>>> the
> >>>>> resources in case of exceptions on start/process/finish, so I filled
> a
> >>>>> JIRA and
> >>>>> fixed this by enforcing the call of teardown for the Spark runner and
> >>>>> the Flink
> >>>>> runner:
> >>>>> https://issues.apache.org/jira/browse/BEAM-3187
> >>>>> https://issues.apache.org/jira/browse/BEAM-3244
> >>>>>
> >>>>> As you can see not calling this method does have consequences at
> least
> >>>>> for
> >>>>> non-containerized runners. Of course a runner that uses containers
> >>>>> could not
> >>>>> care about cleaning the resources this way, but a long living JVM in
> a
> >>>>> Hadoop
> >>>>> environment probably won’t have the same luck. So I am not sure that
> >>>>> having a
> >>>>> loose semantic there is the right option, I mean, runners could
> simply
> >>>>> guarantee
> >>>>> that they call teardown and if teardown takes too long they can
> decide
> >>>>> to send a
> >>>>> signal or kill the process/container/etc and go ahead, that way at
> >>>>> least users
> >>>>> would have a motivation to implement the teardown method, otherwise
> it
> >>>>> doesn’t
> >>>>> make any sense to have it (API wise).
> >>>>>
> >>>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov
> >>>>> <ki...@google.com> wrote:
> >>>>> > Romain, would it be fair to say that currently the goal of your
> >>>>> > participation in this discussion is to identify situations where
> >>>>> > @Teardown
> >>>>> > in principle could have been called, but some of the current
> runners
> >>>>> > don't
> >>>>> > make a good enough effort to call it? If yes - as I said before,
> >>>>> > please, by
> >>>>> > all means, file bugs of the form "Runner X doesn't call @Teardown
> in
> >>>>> > situation Y" if you're aware of any, and feel free to send PRs
> fixing
> >>>>> > runner
> >>>>> > X to reliably call @Teardown in situation Y. I think we all agree
> >>>>> > that this
> >>>>> > would be a good improvement.
> >>>>> >
> >>>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau
> >>>>> > <rm...@gmail.com>
> >>>>> > wrote:
> >>>>> >>
> >>>>> >>
> >>>>> >>
> >>>>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
> >>>>> >>
> >>>>> >>
> >>>>> >>
> >>>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
> >>>>> >> <rm...@gmail.com> wrote:
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit
> :
> >>>>> >>>
> >>>>> >>> How do you call teardown? There are cases in which the Java code
> >>>>> >>> gets no
> >>>>> >>> indication that the restart is happening (e.g. cases where the
> >>>>> >>> machine
> >>>>> >>> itself is taken down)
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> This is a bug, 0 downtime maintenance is very doable in 2018 ;).
> >>>>> >>> Crashes
> >>>>> >>> are bugs, kill -9 to shutdown is a bug too. Other cases let call
> >>>>> >>> shutdown
> >>>>> >>> with a hook worse case.
> >>>>> >>
> >>>>> >>
> >>>>> >> What you say here is simply not true.
> >>>>> >>
> >>>>> >> There are many scenarios in which workers shutdown with no
> >>>>> >> opportunity for
> >>>>> >> any sort of shutdown hook. Sometimes the entire machine gets
> >>>>> >> shutdown, and
> >>>>> >> not even the OS will have much of a chance to do anything. At
> scale
> >>>>> >> this
> >>>>> >> will happen with some regularity, and a distributed system that
> >>>>> >> assumes this
> >>>>> >> will not happen is a poor distributed system.
> >>>>> >>
> >>>>> >>
> >>>>> >> This is part of the infra and there is no reason the machine is
> >>>>> >> shutdown
> >>>>> >> without shutting down what runs on it before except if it is a bug
> >>>>> >> in the
> >>>>> >> software or setup. I can hear you maybe dont do it everywhere but
> >>>>> >> there is
> >>>>> >> no blocker to do it. Means you can shutdown the machines and
> >>>>> >> guarantee
> >>>>> >> teardown is called.
> >>>>> >>
> >>>>> >> Where i go is simply that it is doable and beam sdk core can
> assume
> >>>>> >> setup
> >>>>> >> is well done. If there is a best effort downside due to that -
> with
> >>>>> >> the
> >>>>> >> meaning you defined - it is an impl bug or a user installation
> >>>>> >> issue.
> >>>>> >>
> >>>>> >> Technically all is true.
> >>>>> >>
> >>>>> >> What can prevent teardown is a hardware failure or so. This is
> fine
> >>>>> >> and
> >>>>> >> doesnt need to be in doc since it is life in IT and obvious or
> must
> >>>>> >> be very
> >>>>> >> explicit to avoid current ambiguity.
> >>>>> >>
> >>>>> >>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau
> >>>>> >>> <rm...@gmail.com>
> >>>>> >>> wrote:
> >>>>> >>>>
> >>>>> >>>> Restarting doesnt mean you dont call teardown. Except a bug
> there
> >>>>> >>>> is no
> >>>>> >>>> reason - technically - it happens, no reason.
> >>>>> >>>>
> >>>>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a
> écrit :
> >>>>> >>>>>
> >>>>> >>>>> Workers restarting is not a bug, it's standard often expected.
> >>>>> >>>>>
> >>>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
> >>>>> >>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>
> >>>>> >>>>>> Nothing, as mentionned it is a bug so recovery is a bug
> recovery
> >>>>> >>>>>> (procedure)
> >>>>> >>>>>>
> >>>>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov"
> >>>>> >>>>>> <ki...@google.com> a
> >>>>> >>>>>> écrit :
> >>>>> >>>>>>>
> >>>>> >>>>>>> So what would you like to happen if there is a crash? The
> DoFn
> >>>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
> >>>>> >>>>>>> exists. What
> >>>>> >>>>>>> should Teardown be called on?
> >>>>> >>>>>>>
> >>>>> >>>>>>>
> >>>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
> >>>>> >>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>
> >>>>> >>>>>>>> This is what i want and not 999999 teardowns for 1000000
> >>>>> >>>>>>>> setups
> >>>>> >>>>>>>> until there is an unexpected crash (= a bug).
> >>>>> >>>>>>>>
> >>>>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a
> >>>>> >>>>>>>> écrit :
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
> >>>>> >>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
> >>>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> @Reuven: in practise it is created by pool of 256 but
> >>>>> >>>>>>>>>>>> leads to
> >>>>> >>>>>>>>>>>> the same pattern, the teardown is just a "if
> >>>>> >>>>>>>>>>>> (iCreatedThem) releaseThem();"
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
> >>>>> >>>>>>>>>>> workers,
> >>>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
> >>>>> >>>>>>>>>>> created per
> >>>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
> >>>>> >>>>>>>>>>> threads on each
> >>>>> >>>>>>>>>>> worker.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Nop was the other way around, in this case on AWS you can
> >>>>> >>>>>>>>>> get 256
> >>>>> >>>>>>>>>> instances at once but not 512 (which will be 2x256). So
> when
> >>>>> >>>>>>>>>> you compute the
> >>>>> >>>>>>>>>> distribution you allocate to some fn the role to own the
> >>>>> >>>>>>>>>> instance lookup and
> >>>>> >>>>>>>>>> releasing.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> I still don't understand. Let's be more precise. If you
> write
> >>>>> >>>>>>>>> the
> >>>>> >>>>>>>>> following code:
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
> >>>>> >>>>>>>>> created. The runner might decided to create a million
> >>>>> >>>>>>>>> instances of this
> >>>>> >>>>>>>>> class across your worker pool, which means that you will
> get
> >>>>> >>>>>>>>> a million Setup
> >>>>> >>>>>>>>> and Teardown calls.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Anyway this was just an example of an external resource
> you
> >>>>> >>>>>>>>>> must
> >>>>> >>>>>>>>>> release. Real topic is that beam should define asap a
> >>>>> >>>>>>>>>> guaranteed generic
> >>>>> >>>>>>>>>> lifecycle to let user embrace its programming model.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> @Eugene:
> >>>>> >>>>>>>>>>>> 1. wait logic is about passing the value which is not
> >>>>> >>>>>>>>>>>> always
> >>>>> >>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
> >>>>> >>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the
> >>>>> >>>>>>>>>>>> SDK
> >>>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
> >>>>> >>>>>>>>>>>> implies bean
> >>>>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
> >>>>> >>>>>>>>>>>> sources and dofn (not
> >>>>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> A. Source
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
> >>>>> >>>>>>>>>>>> estimateSize and split. As an user you can expect both
> to
> >>>>> >>>>>>>>>>>> be called on the
> >>>>> >>>>>>>>>>>> same bean instance to avoid to pay the same connection
> >>>>> >>>>>>>>>>>> cost(s) twice.
> >>>>> >>>>>>>>>>>> Concretely:
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> connect()
> >>>>> >>>>>>>>>>>> try {
> >>>>> >>>>>>>>>>>>   estimateSize()
> >>>>> >>>>>>>>>>>>   split()
> >>>>> >>>>>>>>>>>> } finally {
> >>>>> >>>>>>>>>>>>   disconnect()
> >>>>> >>>>>>>>>>>> }
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> connect()
> >>>>> >>>>>>>>>>>> try {
> >>>>> >>>>>>>>>>>>   estimateSize()
> >>>>> >>>>>>>>>>>> } finally {
> >>>>> >>>>>>>>>>>>   disconnect()
> >>>>> >>>>>>>>>>>> }
> >>>>> >>>>>>>>>>>> connect()
> >>>>> >>>>>>>>>>>> try {
> >>>>> >>>>>>>>>>>>   split()
> >>>>> >>>>>>>>>>>> } finally {
> >>>>> >>>>>>>>>>>>   disconnect()
> >>>>> >>>>>>>>>>>> }
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> + a workaround with an internal estimate size since this
> >>>>> >>>>>>>>>>>> primitive is often called in split but you dont want to
> >>>>> >>>>>>>>>>>> connect twice in the
> >>>>> >>>>>>>>>>>> second phase.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Why do you need that? Simply cause you want to define an
> >>>>> >>>>>>>>>>>> API to
> >>>>> >>>>>>>>>>>> implement sources which initializes the source bean and
> >>>>> >>>>>>>>>>>> destroys it.
> >>>>> >>>>>>>>>>>> I insists it is a very very basic concern for such API.
> >>>>> >>>>>>>>>>>> However
> >>>>> >>>>>>>>>>>> beam doesn't embraces it and doesn't assume it so
> building
> >>>>> >>>>>>>>>>>> any API on top of
> >>>>> >>>>>>>>>>>> beam is very hurtful today and for direct beam users you
> >>>>> >>>>>>>>>>>> hit the exact same
> >>>>> >>>>>>>>>>>> issues - check how IO are implemented, the static
> >>>>> >>>>>>>>>>>> utilities which create
> >>>>> >>>>>>>>>>>> volatile connections preventing to reuse existing
> >>>>> >>>>>>>>>>>> connection in a single
> >>>>> >>>>>>>>>>>> method
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> B. DoFn & SDF
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> As a fn dev you expect the same from the beam runtime:
> >>>>> >>>>>>>>>>>> init();
> >>>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); }
> and
> >>>>> >>>>>>>>>>>> that it is
> >>>>> >>>>>>>>>>>> executed on the exact same instance to be able to be
> >>>>> >>>>>>>>>>>> stateful at that level
> >>>>> >>>>>>>>>>>> for expensive connections/operations/flow state
> handling.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> As you mentionned with the million example, this
> sequence
> >>>>> >>>>>>>>>>>> should
> >>>>> >>>>>>>>>>>> happen for each single instance so 1M times for your
> >>>>> >>>>>>>>>>>> example.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is
> >>>>> >>>>>>>>>>>> a generalisation of both cases (source and DoFn).
> >>>>> >>>>>>>>>>>> Therefore it creates way more instances and requires a
> >>>>> >>>>>>>>>>>> far more strict/explicit definition of the exact
> >>>>> >>>>>>>>>>>> lifecycle and of which instance does what. Since Beam
> >>>>> >>>>>>>>>>>> handles the full lifecycle of the bean instances it must
> >>>>> >>>>>>>>>>>> provide init/destroy hooks (setup/teardown) which can be
> >>>>> >>>>>>>>>>>> stateful.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> If you take the JDBC example which was mentioned
> >>>>> >>>>>>>>>>>> earlier: today, because of the teardown issue, it uses
> >>>>> >>>>>>>>>>>> bundles. Since bundle size is not defined - and will not
> >>>>> >>>>>>>>>>>> be with SDF - it must use a pool to be able to reuse a
> >>>>> >>>>>>>>>>>> connection instance and not hurt performance. Now with
> >>>>> >>>>>>>>>>>> SDF and the increase in splits, how do you handle the
> >>>>> >>>>>>>>>>>> pool size? Generally in batch you use a single
> >>>>> >>>>>>>>>>>> connection per thread to avoid consuming all database
> >>>>> >>>>>>>>>>>> connections. With a pool you have 2 choices: 1. use a
> >>>>> >>>>>>>>>>>> pool of 1, 2. use a pool a bit larger, but multiplied by
> >>>>> >>>>>>>>>>>> the number of beans you will likely x2 or x3 the
> >>>>> >>>>>>>>>>>> connection count and make the execution fail with "no
> >>>>> >>>>>>>>>>>> more connection available". If you picked 1 (a pool of
> >>>>> >>>>>>>>>>>> #1), then you still have to have a reliable teardown per
> >>>>> >>>>>>>>>>>> pool instance (close() generally) to ensure you release
> >>>>> >>>>>>>>>>>> the pool and don't leak the connection information in
> >>>>> >>>>>>>>>>>> the JVM. In all cases you come back to the
> >>>>> >>>>>>>>>>>> init()/destroy() lifecycle, even if you fake getting
> >>>>> >>>>>>>>>>>> connections with bundles.
> >>>>> >>>>>>>>>>>>
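The pool-of-1 shape argued for above can be sketched with a stand-in Connection class (real code would use java.sql plus a driver; the class and method names here are illustrative, not a Beam or JDBC API):

```java
import java.util.ArrayList;
import java.util.List;

// Models a DoFn-like bean that opens one connection per instance in
// setup() and releases it in teardown(); without a reliable teardown
// the connection would leak for the lifetime of the JVM.
public class JdbcStyleFn {
    static final List<String> log = new ArrayList<>();

    static class Connection {            // stand-in for java.sql.Connection
        void close() { log.add("close"); }
    }

    private Connection connection;

    void setup()             { log.add("open"); connection = new Connection(); }
    void process(String row) { log.add("write:" + row); /* uses connection */ }
    void teardown()          { if (connection != null) { connection.close(); connection = null; } }

    public static void main(String[] args) {
        JdbcStyleFn fn = new JdbcStyleFn();
        fn.setup();                      // once per instance, not per bundle
        try {
            fn.process("a");
            fn.process("b");
        } finally {
            fn.teardown();               // must be guaranteed to avoid leaks
        }
        System.out.println(log);
    }
}
```

Note the single open/close pair regardless of how many elements (or bundles) the instance processes, which is the performance point being made.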
> >>>>> >>>>>>>>>>>> Just to make it obvious: SDF is mentioned only because
> >>>>> >>>>>>>>>>>> SDF amplifies all the current issues caused by the loose
> >>>>> >>>>>>>>>>>> definition of the bean lifecycle, nothing else.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Romain Manni-Bucau
> >>>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn |
> Book
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
> >>>>> >>>>>>>>>>>>> can be
> >>>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
> >>>>> >>>>>>>>>>>>> the thread above,
> >>>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
> >>>>> >>>>>>>>>>>>> that.
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main
> >>>>> >>>>>>>>>>>>> author of
> >>>>> >>>>>>>>>>>>> most design documents related to SDF and of its
> >>>>> >>>>>>>>>>>>> implementation in the Java
> >>>>> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated
> to
> >>>>> >>>>>>>>>>>>> the topic of
> >>>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming
> up)
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>> I kind of agree except transforms lack a lifecycle
> too.
> >>>>> >>>>>>>>>>>>>> My
> >>>>> >>>>>>>>>>>>>> understanding is that sdf could be a way to unify it
> and
> >>>>> >>>>>>>>>>>>>> clean the api.
> >>>>> >>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle
> of
> >>>>> >>>>>>>>>>>>>> transforms?
> >>>>> >>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers"
> >>>>> >>>>>>>>>>>>>> <bc...@apache.org>
> >>>>> >>>>>>>>>>>>>> a écrit :
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
> >>>>> >>>>>>>>>>>>>>> DoFn's
> >>>>> >>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is
> necessary,
> >>>>> >>>>>>>>>>>>>>> it is around an
> >>>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
> >>>>> >>>>>>>>>>>>>>> discussions/proposals
> >>>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
> >>>>> >>>>>>>>>>>>>>> haven't been
> >>>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
> >>>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
> >>>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to
> do
> >>>>> >>>>>>>>>>>>>>> a
> >>>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
> >>>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the
> >>>>> >>>>>>>>>>>>>>> chance of
> >>>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
> >>>>> >>>>>>>>>>>>>>> destination).
> >>>>> >>>>>>>>>>>>>>>
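The three FileIO-style steps above can be sketched with java.nio.file (a simplified single-process model; a real FileIO spreads step 1 across many workers, and the class name here is made up):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

// Write N shards to temp files, move them to the final destination only
// once all are complete, then clean up leftover temp files - minimizing
// the window in which partial results are visible.
public class TempFileSink {
    public static List<Path> write(Path dir, List<String> shards) throws IOException {
        Files.createDirectories(dir);
        List<Path> temps = new ArrayList<>();
        for (int i = 0; i < shards.size(); i++) {           // step 1: temp shards
            Path tmp = dir.resolve("tmp-" + i);
            Files.writeString(tmp, shards.get(i));
            temps.add(tmp);
        }
        List<Path> finals = new ArrayList<>();
        for (Path tmp : temps) {                            // step 2: bulk move
            Path dest = dir.resolve(tmp.getFileName().toString().replace("tmp-", "out-"));
            finals.add(Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE));
        }
        for (Path tmp : temps) Files.deleteIfExists(tmp);   // step 3: cleanup
        return finals;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sink");
        System.out.println(write(dir, List.of("a", "b")));
    }
}
```

The key property is that step 3 is tied to the transform's completion, not to any single DoFn instance's teardown - which is exactly why per-DoFn teardown is argued to be insufficient here.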
> >>>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
> >>>>> >>>>>>>>>>>>>>> workers,
> >>>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
> >>>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one
> >>>>> >>>>>>>>>>>>>>> worker. This
> >>>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
> >>>>> >>>>>>>>>>>>>>> stuff done to ensure it
> >>>>> >>>>>>>>>>>>>>> runs on one worker.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is
> not
> >>>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule
> >>>>> >>>>>>>>>>>>>>> some cleanup work for
> >>>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
> >>>>> >>>>>>>>>>>>>>> relatively straightforward,
> >>>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some
> problems,
> >>>>> >>>>>>>>>>>>>>> such as BigQuery
> >>>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
> >>>>> >>>>>>>>>>>>>>> into BigQuery.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you
> >>>>> >>>>>>>>>>>>>>> want to
> >>>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
> >>>>> >>>>>>>>>>>>>>> wait until the end of
> >>>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until
> >>>>> >>>>>>>>>>>>>>> you know nobody will
> >>>>> >>>>>>>>>>>>>>> need the resource anymore.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
> >>>>> >>>>>>>>>>>>>>> where
> >>>>> >>>>>>>>>>>>>>> you could have a transform that output resource
> >>>>> >>>>>>>>>>>>>>> objects. Each resource
> >>>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
> >>>>> >>>>>>>>>>>>>>> would be something
> >>>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
> >>>>> >>>>>>>>>>>>>>> resource, and what
> >>>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon
> as
> >>>>> >>>>>>>>>>>>>>> that part of the
> >>>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no
> >>>>> >>>>>>>>>>>>>>> longer need the resources,
> >>>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at
> pipeline
> >>>>> >>>>>>>>>>>>>>> shutdown, or
> >>>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your
> use
> >>>>> >>>>>>>>>>>>>>> case?
> >>>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
> >>>>> >>>>>>>>>>>>>>> sufficient?
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> Yes, 1M. Let's try to explain, simplifying the
> >>>>> >>>>>>>>>>>>>>>> overall execution. Each instance - one fn, so likely
> >>>>> >>>>>>>>>>>>>>>> in a thread of a worker - has its lifecycle.
> >>>>> >>>>>>>>>>>>>>>> Caricaturally: "new" and garbage collection.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocate
> >>>>> >>>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> What I want is for any "new" to have a following
> >>>>> >>>>>>>>>>>>>>>> setup before any process or startBundle, and the
> >>>>> >>>>>>>>>>>>>>>> last time Beam has the instance before it is gc-ed,
> >>>>> >>>>>>>>>>>>>>>> and after the last finishBundle, it calls teardown.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> It is as simple as that.
> >>>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way
> >>>>> >>>>>>>>>>>>>>>> that makes a fn not self-contained to implement
> >>>>> >>>>>>>>>>>>>>>> basic transforms.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax"
> >>>>> >>>>>>>>>>>>>>>> <re...@google.com> a
> >>>>> >>>>>>>>>>>>>>>> écrit :
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain
> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
> >>>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
> >>>>> >>>>>>>>>>>>>>>>>> Rather than focusing on the semantics of the
> >>>>> >>>>>>>>>>>>>>>>>> existing methods -- which have been noted to meet
> >>>>> >>>>>>>>>>>>>>>>>> many existing use cases -- it would be helpful to
> >>>>> >>>>>>>>>>>>>>>>>> focus more on the reason you are looking for
> >>>>> >>>>>>>>>>>>>>>>>> something with different semantics.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
> >>>>> >>>>>>>>>>>>>>>>>> trying
> >>>>> >>>>>>>>>>>>>>>>>> to do):
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that
> was
> >>>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the
> pipeline.
> >>>>> >>>>>>>>>>>>>>>>>> If this is the case,
> >>>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only
> initialized
> >>>>> >>>>>>>>>>>>>>>>>> once (and not once per
> >>>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do
> you
> >>>>> >>>>>>>>>>>>>>>>>> know when the pipeline
> >>>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it
> reaches
> >>>>> >>>>>>>>>>>>>>>>>> step X", then what
> >>>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> When the DoFn is logically no longer needed, i.e.
> >>>>> >>>>>>>>>>>>>>>>>> when the batch is done or the stream is stopped
> >>>>> >>>>>>>>>>>>>>>>>> (manually or by a JVM shutdown)
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers,
> >>>>> >>>>>>>>>>>>>>>>> and each
> >>>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy
> >>>>> >>>>>>>>>>>>>>>>> of the same DoFn). How
> >>>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000
> =
> >>>>> >>>>>>>>>>>>>>>>> 1M cleanups) and when
> >>>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
> >>>>> >>>>>>>>>>>>>>>>> shut down? When an
> >>>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may
> be
> >>>>> >>>>>>>>>>>>>>>>> temporary - may be
> >>>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within
> some
> >>>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
> >>>>> >>>>>>>>>>>>>>>>>> methods are not a good fit
> >>>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
> >>>>> >>>>>>>>>>>>>>>>>> within the DoFn), you could
> >>>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that
> it
> >>>>> >>>>>>>>>>>>>>>>>> produced. For instance:
> >>>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some
> token
> >>>>> >>>>>>>>>>>>>>>>>> that
> >>>>> >>>>>>>>>>>>>>>>>> stores information about resources)
> >>>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
> >>>>> >>>>>>>>>>>>>>>>>> retries
> >>>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
> >>>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
> >>>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources,
> and
> >>>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
> >>>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
> >>>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data
> >>>>> >>>>>>>>>>>>>>>>>> it is
> >>>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
> >>>>> >>>>>>>>>>>>>>>>>> use or have been finished
> >>>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
> >>>>> >>>>>>>>>>>>>>>>>> important to ensuring
> >>>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
> >>>>> >>>>>>>>>>>>>>>>>>
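The a)-f) sequence can be modeled as resource IDs travelling with the data, so that cleanup is driven by the data itself rather than by a DoFn's lifecycle (a toy single-process model of the proposal; every name here is illustrative):

```java
import java.util.*;

// Toy model of the proposed cleanup API: resources are represented by
// IDs carried in the data, initialized before use and freed once the
// downstream segment has reported completion.
public class ResourceLifecycle {
    static final Map<String, String> live = new HashMap<>();

    static String init(String id) { live.put(id, "open"); return id; }
    static String use(String id)  { return "used:" + id; }   // pipeline segment
    static void   free(String id) { live.remove(id); }

    public static void main(String[] args) {
        List<String> ids = List.of("res-1", "res-2");          // a) generate IDs
        List<String> results = new ArrayList<>();
        for (String id : ids) {
            init(id);                                          // c) initialize
            results.add(use(id));                              // d) use, emit "done"
            free(id);                                          // f) free after done
        }
        System.out.println(results + " leaked=" + live.size());
    }
}
```

The "Require Deterministic Input" steps (b and e) are what make this safe under retries in a real runner; they have no equivalent in this single-process toy.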
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
> >>>>> >>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If
> it
> >>>>> >>>>>>>>>>>>>>>>>> is
> >>>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are
> >>>>> >>>>>>>>>>>>>>>>>> trying to accomplish? That
> >>>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
> >>>>> >>>>>>>>>>>>>>>>>> existing options and
> >>>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
> >>>>> >>>>>>>>>>>>>>>>>> cases, but it means each transform is different in
> >>>>> >>>>>>>>>>>>>>>>>> its lifecycle handling. I dislike that a lot at
> >>>>> >>>>>>>>>>>>>>>>>> scale and as a user, since you can't put any
> >>>>> >>>>>>>>>>>>>>>>>> unified practice on top of Beam; it also makes
> >>>>> >>>>>>>>>>>>>>>>>> Beam very hard to integrate, or to use to build
> >>>>> >>>>>>>>>>>>>>>>>> higher-level libraries or software.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
> >>>>> >>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> -- Ben
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many
> >>>>> >>>>>>>>>>>>>>>>>>>> of the
> >>>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
> >>>>> >>>>>>>>>>>>>>>>>>>> machine.
> >>>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be
> called
> >>>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's
> logically
> >>>>> >>>>>>>>>>>>>>>>>>>> impossible or impractical
> >>>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or
> >>>>> >>>>>>>>>>>>>>>>>>>> you can list some of the
> >>>>> >>>>>>>>>>>>>>>>>>>> examples above.
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
> >>>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
> >>>>> >>>>>>>>>>>>>>>>>>>> called - it's not just
> >>>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
> >>>>> >>>>>>>>>>>>>>>>>>>> important (e.g. cleaning up
> >>>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting
> down a
> >>>>> >>>>>>>>>>>>>>>>>>>> large number of VMs you
> >>>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one
> of
> >>>>> >>>>>>>>>>>>>>>>>>>> the other methods that
> >>>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come
> at
> >>>>> >>>>>>>>>>>>>>>>>>>> a cost, e.g. no
> >>>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee, sadly,
> >>>>> >>>>>>>>>>>>>>>>>>> so I'm not sure which other method you speak
> >>>>> >>>>>>>>>>>>>>>>>>> about. Concretely, if you make it really
> >>>>> >>>>>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds
> >>>>> >>>>>>>>>>>>>>>>>>> like to me - then users can't use it to clean
> >>>>> >>>>>>>>>>>>>>>>>>> anything, but if you make it "can happen but it
> >>>>> >>>>>>>>>>>>>>>>>>> is unexpected and means something happened" then
> >>>>> >>>>>>>>>>>>>>>>>>> it is fine to have a manual - or auto if fancy -
> >>>>> >>>>>>>>>>>>>>>>>>> recovery procedure. This is where it makes all
> >>>>> >>>>>>>>>>>>>>>>>>> the difference and impacts the developers and ops
> >>>>> >>>>>>>>>>>>>>>>>>> (all users basically).
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain
> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means
> >>>>> >>>>>>>>>>>>>>>>>>>>> that. It
> >>>>> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is
> >>>>> >>>>>>>>>>>>>>>>>>>>> what triggered this thread.
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
> >>>>> >>>>>>>>>>>>>>>>>>>>> prevents
> >>>>> >>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be
> very
> >>>>> >>>>>>>>>>>>>>>>>>>>> badly and wrongly
> >>>>> >>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
> >>>>> >>>>>>>>>>>>>>>>>>>>> LinkedIn |
> >>>>> >>>>>>>>>>>>>>>>>>>>> Book
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to
> call
> >>>>> >>>>>>>>>>>>>>>>>>>>>> it:
> >>>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have
> (intergalactic
> >>>>> >>>>>>>>>>>>>>>>>>>>>> crash), and in a number of
> >>>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
> >>>>> >>>>>>>>>>>>>>>>>>>>>> container has crashed (eg user code
> >>>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over
> >>>>> >>>>>>>>>>>>>>>>>>>>>> JNI and it segfaulted), JVM
> >>>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
> >>>>> >>>>>>>>>>>>>>>>>>>>>> worker has lost network
> >>>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it
> won't
> >>>>> >>>>>>>>>>>>>>>>>>>>>> be able to do anything
> >>>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a
> >>>>> >>>>>>>>>>>>>>>>>>>>>> preemptible VM and it was preempted by
> >>>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice
> or
> >>>>> >>>>>>>>>>>>>>>>>>>>>> if the worker was too busy
> >>>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
> >>>>> >>>>>>>>>>>>>>>>>>>>>> functions) until the preemption
> >>>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying
> hardware
> >>>>> >>>>>>>>>>>>>>>>>>>>>> simply failed (which
> >>>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many
> other
> >>>>> >>>>>>>>>>>>>>>>>>>>>> conditions.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
> >>>>> >>>>>>>>>>>>>>>>>>>>>> describe
> >>>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs
> for
> >>>>> >>>>>>>>>>>>>>>>>>>>>> cases where you observed a
> >>>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where
> it
> >>>>> >>>>>>>>>>>>>>>>>>>>>> was possible to call it but
> >>>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain
> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level
> need
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> (e.g.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x
> and
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> it requires the following
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> logic and the following processing
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> you.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> requiring a connection. Using bundles is a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> perf killer since their size is not
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> you to release the connection since it is
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> a best-effort thing. Not releasing the
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection makes you pay a lot - AWS ;) -
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> or prevents you from launching other
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> processing - concurrency limit.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> If
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called then nothing else can be
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> AWS service are you thinking of
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> everything at the other end has died?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> stateless, but some (proprietary) protocols
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> require closing exchanges which are not
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> just "I'm leaving".
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> services - machines - on the fly at
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> pipeline startup and closing them at the
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> end. If teardown is not called you leak
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> machines and money. You can say it can be
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> done another way... as can the full
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> pipeline ;).
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle its components' lifecycle it can't
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> be used at scale for generic pipelines and
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> is bound to some particular IOs.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar-crash case, which can't be
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handled by any human system? Nothing,
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> it? Is it due to some legacy code on
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Dataflow, or something else?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> implemented
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> kind of change you're asking
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> is not called then it is a bug and we are
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> done :).
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> direct runner does it, so if a user uses
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the RI in tests, he will get a different
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> user doesn't know what the IOs he composes
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> use, so this is so impactful for the whole
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> product that it must be handled IMHO.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in the big data world, but it is not a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> reason to ignore what people did for years
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> and do it wrong before doing it right ;).
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions -
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the execution of teardown. Then we see if
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> we can handle it, and only if there is a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> technical reason we can't do we make it
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Spark and Flink can; any unknown blocker
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> for other runners?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> through Java shutdown hooks, otherwise your
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> environment (the software enclosing Beam)
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> is fully unhandled and your overall system
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> is uncontrolled. The only case where that
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> is not true is when the software is always
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> owned by a vendor and never installed on a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> customer environment. In that case it
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> belongs to the vendor to handle the Beam
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> API, and not to Beam to adjust its API for
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> a vendor - otherwise all features
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> unsupported by one runner should be made
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> optional, right?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in distributed systems, so it is key to
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> have an explicit and defined lifecycle.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
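The shutdown-hook mechanism referenced in that technical note is plain JDK API; a minimal sketch (note that SIGKILL, unlike a plain kill/SIGTERM, still bypasses hooks, which is why hooks alone cannot make teardown a 100% guarantee):

```java
// Registers a JVM shutdown hook that runs on normal exit and on SIGTERM
// (a plain kill), but not on SIGKILL or a hard JVM crash.
public class ShutdownHookDemo {
    public static void main(String[] args) {
        Thread hook = new Thread(() -> System.out.println("releasing resources"));
        Runtime.getRuntime().addShutdownHook(hook);
        System.out.println("working");
        // removeShutdownHook returns true when the hook was registered
        // and shutdown has not started yet.
        boolean removed = Runtime.getRuntime().removeShutdownHook(hook);
        System.out.println("removed=" + removed);
    }
}
```

An enclosing application can register such a hook to drain or tear down an embedded Beam pipeline on a graceful kill, which is the scenario the note describes.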
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>
> >>>>
> >>
> >
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
Yeah, I think the documentation Eugene added is now clearer. +1 to adding
ValidatesRunner tests here.

On Wed, Feb 21, 2018 at 12:47 AM, Ismaël Mejía <ie...@gmail.com> wrote:

> Hello, thanks Eugene for improving the documentation so we can close
> this thread.
>
> Reuven, I understood the semantics of the methods, what surprised me was
> that I
> interpreted the new documentation as if a runner could simply skip
> calling
> @Teardown, and we have already dealt with the issues of not doing this when
> there is an exception on the element methods
> (startBundle/processElement/finishBundle),  we can leak resources by not
> calling
> teardown, as the Spark runner user reported in the link I sent.
>
> So considering that a runner should try at best to call that method, I
> promoted
> some of the methods of ParDoLifecycleTest to be ValidatesRunner to ensure
> that
> runners call teardown after exceptions, and I filed BEAM-3245 so that the
> DataflowRunner tries its best to respect the lifecycle when it can. (Note
> I
> auto-assigned this JIRA but it is up to you guys to reassign it to the
> person
> who can work on it).
>
>
> On Wed, Feb 21, 2018 at 7:26 AM, Reuven Lax <re...@google.com> wrote:
> > To close the loop here:
> >
> > Romain, I think your actual concern was that the Javadoc made it sound
> like
> > a runner could simply decide not to call Teardown. If so, then I agree
> with
> > you - the Javadoc was misleading (and appears it was confusing to Ismael
> as
> > well). If a runner destroys a DoFn, it _must_ call TearDown before it
> calls
> > Setup on a new DoFn.
> >
> > If so, then most of the back and forth on this thread had little to do
> with
> > your actual concern. However it did take almost three days of discussion
> > before Eugene understood what your real concern was, leading to the side
> > discussions.
> >
> > Reuven
> >
> > On Mon, Feb 19, 2018 at 6:08 PM, Reuven Lax <re...@google.com> wrote:
> >>
> >> +1 This PR clarifies the semantics quite a bit.
> >>
> >> On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov <kirpichov@google.com
> >
> >> wrote:
> >>>
> >>> I've sent out a PR editing the Javadoc
> >>> https://github.com/apache/beam/pull/4711 . Hopefully, that should be
> >>> sufficient.
> >>>
> >>> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
> >>>>
> >>>> Ismael, your understanding is appropriate for FinishBundle.
> >>>>
> >>>> One basic issue with this understanding is that the lifecycle of a
> DoFn
> >>>> is much longer than a single bundle (which I think you expressed by
> adding
> >>>> the *s). How long the DoFn lives is not defined. In fact a runner is
> >>>> completely free to decide that it will _never_ destroy the DoFn, in
> which
> >>>> case TearDown is never called simply because the DoFn was never torn
> down.
> >>>>
> >>>> Also, as mentioned before, the runner can only call TearDown in cases
> >>>> where the shutdown is in its control. If the JVM is shut down
> externally,
> >>>> the runner has no chance to call TearDown. This means that while
> TearDown is
> >>>> appropriate for cleaning up in-process resources (open connections,
> etc.),
> >>>> it's not the right answer for cleaning up persistent resources. If
> you rely
> >>>> on TearDown to delete VMs or delete files, there will be cases in
> which
> >>>> those files or VMs are not deleted.
> >>>>
> >>>> What we are _not_ saying is that the runner is free to just ignore
> >>>> TearDown. If the runner is explicitly destroying a DoFn object, it
> should
> >>>> call TearDown.
> >>>>
> >>>> Reuven
> >>>>
> >>>>
> >>>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com>
> wrote:
> >>>>>
> >>>>> I also had a different understanding of the lifecycle of a DoFn.
> >>>>>
> >>>>> My understanding of the use case for every method in the DoFn was
> clear
> >>>>> and
> >>>>> perfectly aligned with Thomas' explanation, but what I understood was
> >>>>> that in
> >>>>> general terms ‘@Setup was where I got resources/prepare connections
> and
> >>>>> @Teardown where I free them’, so calling Teardown seemed essential to
> >>>>> have a
> >>>>> complete lifecycle:
> >>>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
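
[The sequence Ismaël describes can be sketched as a toy, Beam-free harness. The method names mirror the DoFn annotations, but none of this is Beam's actual runner code:]

```java
import java.util.ArrayList;
import java.util.List;

// Toy sketch of the expected call order:
// Setup -> (StartBundle -> ProcessElement* -> FinishBundle)* -> Teardown.
public class DoFnOrderDemo {
    static final List<String> calls = new ArrayList<>();

    static class RecordingFn {
        void setup()          { calls.add("setup"); }
        void startBundle()    { calls.add("startBundle"); }
        void processElement() { calls.add("processElement"); }
        void finishBundle()   { calls.add("finishBundle"); }
        void teardown()       { calls.add("teardown"); }
    }

    public static void main(String[] args) {
        RecordingFn fn = new RecordingFn();
        fn.setup();
        try {
            // A fn may process many bundles over its lifetime.
            for (int bundle = 0; bundle < 2; bundle++) {
                fn.startBundle();
                for (int element = 0; element < 3; element++) {
                    fn.processElement();
                }
                fn.finishBundle();
            }
        } finally {
            // The point under discussion: teardown closes the lifecycle.
            fn.teardown();
        }
        System.out.println(calls);
    }
}
```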
> >>>>>
> >>>>> The fact that @Teardown could not be called is a new detail for me
> too,
> >>>>> and I
> >>>>> also find it weird to have a method that may or may not be called as part of
> >>>>> an API,
> >>>>> why would users implement teardown if it will not be called? In that
> >>>>> case
> >>>>> probably a cleaner approach would be to get rid of that method
> >>>>> altogether, no?
> >>>>>
> >>>>> But well, maybe that’s not so easy either; there was another point: some
> >>>>> user
> >>>>> reported an issue with leaking resources using KafkaIO in the Spark
> >>>>> runner, for
> >>>>> ref.
> >>>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
> >>>>>
> >>>>> At that moment my understanding was that there was something fishy,
> >>>>> because we
> >>>>> should be calling Teardown to correctly close the connections and
> free
> >>>>> the
> >>>>> resources in case of exceptions on start/process/finish, so I filed
> a
> >>>>> JIRA and
> >>>>> fixed this by enforcing the call of teardown for the Spark runner and
> >>>>> the Flink
> >>>>> runner:
> >>>>> https://issues.apache.org/jira/browse/BEAM-3187
> >>>>> https://issues.apache.org/jira/browse/BEAM-3244
> >>>>>
> >>>>> As you can see not calling this method does have consequences at
> least
> >>>>> for
> >>>>> non-containerized runners. Of course a runner that uses containers
> >>>>> might not
> >>>>> care about cleaning the resources this way, but a long-living JVM in
> a
> >>>>> Hadoop
> >>>>> environment probably won’t have the same luck. So I am not sure that
> >>>>> having a
> >>>>> loose semantic there is the right option, I mean, runners could
> simply
> >>>>> guarantee
> >>>>> that they call teardown and if teardown takes too long they can
> decide
> >>>>> to send a
> >>>>> signal or kill the process/container/etc and go ahead, that way at
> >>>>> least users
> >>>>> would have a motivation to implement the teardown method, otherwise
> it
> >>>>> doesn’t
> >>>>> make any sense to have it (API wise).
> >>>>>
> >>>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov
> >>>>> <ki...@google.com> wrote:
> >>>>> > Romain, would it be fair to say that currently the goal of your
> >>>>> > participation in this discussion is to identify situations where
> >>>>> > @Teardown
> >>>>> > in principle could have been called, but some of the current
> runners
> >>>>> > don't
> >>>>> > make a good enough effort to call it? If yes - as I said before,
> >>>>> > please, by
> >>>>> > all means, file bugs of the form "Runner X doesn't call @Teardown
> in
> >>>>> > situation Y" if you're aware of any, and feel free to send PRs
> fixing
> >>>>> > runner
> >>>>> > X to reliably call @Teardown in situation Y. I think we all agree
> >>>>> > that this
> >>>>> > would be a good improvement.
> >>>>> >
> >>>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau
> >>>>> > <rm...@gmail.com>
> >>>>> > wrote:
> >>>>> >>
> >>>>> >>
> >>>>> >>
> >>>>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
> >>>>> >>
> >>>>> >>
> >>>>> >>
> >>>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
> >>>>> >> <rm...@gmail.com> wrote:
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit
> :
> >>>>> >>>
> >>>>> >>> How do you call teardown? There are cases in which the Java code
> >>>>> >>> gets no
> >>>>> >>> indication that the restart is happening (e.g. cases where the
> >>>>> >>> machine
> >>>>> >>> itself is taken down)
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
> >>>>> >>> Crashes
> >>>>> >>> are bugs, and kill -9 to shut down is a bug too. Other cases can call
> >>>>> >>> shutdown
> >>>>> >>> with a hook in the worst case.
> >>>>> >>
> >>>>> >>
> >>>>> >> What you say here is simply not true.
> >>>>> >>
> >>>>> >> There are many scenarios in which workers shutdown with no
> >>>>> >> opportunity for
> >>>>> >> any sort of shutdown hook. Sometimes the entire machine gets
> >>>>> >> shutdown, and
> >>>>> >> not even the OS will have much of a chance to do anything. At
> scale
> >>>>> >> this
> >>>>> >> will happen with some regularity, and a distributed system that
> >>>>> >> assumes this
> >>>>> >> will not happen is a poor distributed system.
> >>>>> >>
> >>>>> >>
> >>>>> >> This is part of the infra, and there is no reason the machine is
> >>>>> >> shut down
> >>>>> >> without first shutting down what runs on it, except if it is a bug
> >>>>> >> in the
> >>>>> >> software or setup. I hear that maybe you don't do it everywhere, but
> >>>>> >> there is
> >>>>> >> no blocker to doing it. That means you can shut down the machines and
> >>>>> >> guarantee
> >>>>> >> teardown is called.
> >>>>> >>
> >>>>> >> My point is simply that it is doable, and the beam sdk core can
> assume
> >>>>> >> the setup
> >>>>> >> is well done. If there is a best-effort downside due to that -
> with
> >>>>> >> the
> >>>>> >> meaning you defined - it is an impl bug or a user installation
> >>>>> >> issue.
> >>>>> >>
> >>>>> >> Technically all is true.
> >>>>> >>
> >>>>> >> What can prevent teardown is a hardware failure or the like. This is
> fine
> >>>>> >> and
> >>>>> >> doesn't need to be in the doc, since it is a fact of life in IT and
> >>>>> >> obvious - or it must
> >>>>> >> be very
> >>>>> >> explicit to avoid the current ambiguity.
> >>>>> >>
> >>>>> >>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau
> >>>>> >>> <rm...@gmail.com>
> >>>>> >>> wrote:
> >>>>> >>>>
> >>>>> >>>> Restarting doesn't mean you don't call teardown. Except for a bug,
> there
> >>>>> >>>> is no
> >>>>> >>>> reason - technically - that it happens, no reason.
> >>>>> >>>>
> >>>>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a
> écrit :
> >>>>> >>>>>
> >>>>> >>>>> Workers restarting is not a bug; it's standard and often expected.
> >>>>> >>>>>
> >>>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
> >>>>> >>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>
> >>>>> >>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug
> recovery
> >>>>> >>>>>> (procedure)
> >>>>> >>>>>>
> >>>>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov"
> >>>>> >>>>>> <ki...@google.com> a
> >>>>> >>>>>> écrit :
> >>>>> >>>>>>>
> >>>>> >>>>>>> So what would you like to happen if there is a crash? The
> DoFn
> >>>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
> >>>>> >>>>>>> exists. What
> >>>>> >>>>>>> should Teardown be called on?
> >>>>> >>>>>>>
> >>>>> >>>>>>>
> >>>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
> >>>>> >>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>
> >>>>> >>>>>>>> This is what I want - and not 999999 teardowns for 1000000
> >>>>> >>>>>>>> setups
> >>>>> >>>>>>>> until there is an unexpected crash (= a bug).
> >>>>> >>>>>>>>
> >>>>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a
> >>>>> >>>>>>>> écrit :
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
> >>>>> >>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
> >>>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256, but
> >>>>> >>>>>>>>>>>> it leads to
> >>>>> >>>>>>>>>>>> the same pattern, the teardown is just a "if
> >>>>> >>>>>>>>>>>> (iCreatedThem) releaseThem();"
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
> >>>>> >>>>>>>>>>> workers,
> >>>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
> >>>>> >>>>>>>>>>> created per
> >>>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
> >>>>> >>>>>>>>>>> threads on each
> >>>>> >>>>>>>>>>> worker.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Nope, it was the other way around: in this case on AWS you can
> >>>>> >>>>>>>>>> get 256
> >>>>> >>>>>>>>>> instances at once but not 512 (which would be 2x256). So
> when
> >>>>> >>>>>>>>>> you compute the
> >>>>> >>>>>>>>>> distribution you allocate to some fn the role of owning the
> >>>>> >>>>>>>>>> instance lookup and
> >>>>> >>>>>>>>>> release.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> I still don't understand. Let's be more precise. If you
> write
> >>>>> >>>>>>>>> the
> >>>>> >>>>>>>>> following code:
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
> >>>>> >>>>>>>>> created. The runner might decide to create a million
> >>>>> >>>>>>>>> instances of this
> >>>>> >>>>>>>>> class across your worker pool, which means that you will
> get
> >>>>> >>>>>>>>> a million Setup
> >>>>> >>>>>>>>> and Teardown calls.
> >>>>> >>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>> Anyway, this was just an example of an external resource
> you
> >>>>> >>>>>>>>>> must
> >>>>> >>>>>>>>>> release. The real topic is that beam should define asap a
> >>>>> >>>>>>>>>> guaranteed, generic
> >>>>> >>>>>>>>>> lifecycle to let users embrace its programming model.
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> @Eugene:
> >>>>> >>>>>>>>>>>> 1. wait logic is about passing the value which is not
> >>>>> >>>>>>>>>>>> always
> >>>>> >>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
> >>>>> >>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the
> >>>>> >>>>>>>>>>>> SDK
> >>>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
> >>>>> >>>>>>>>>>>> implies bean
> >>>>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
> >>>>> >>>>>>>>>>>> sources and dofn (not
> >>>>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> A. Source
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
> >>>>> >>>>>>>>>>>> estimateSize and split. As a user you would expect both
> to
> >>>>> >>>>>>>>>>>> be called on the
> >>>>> >>>>>>>>>>>> same bean instance to avoid paying the same connection
> >>>>> >>>>>>>>>>>> cost(s) twice.
> >>>>> >>>>>>>>>>>> Concretely:
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> connect()
> >>>>> >>>>>>>>>>>> try {
> >>>>> >>>>>>>>>>>>   estimateSize()
> >>>>> >>>>>>>>>>>>   split()
> >>>>> >>>>>>>>>>>> } finally {
> >>>>> >>>>>>>>>>>>   disconnect()
> >>>>> >>>>>>>>>>>> }
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> connect()
> >>>>> >>>>>>>>>>>> try {
> >>>>> >>>>>>>>>>>>   estimateSize()
> >>>>> >>>>>>>>>>>> } finally {
> >>>>> >>>>>>>>>>>>   disconnect()
> >>>>> >>>>>>>>>>>> }
> >>>>> >>>>>>>>>>>> connect()
> >>>>> >>>>>>>>>>>> try {
> >>>>> >>>>>>>>>>>>   split()
> >>>>> >>>>>>>>>>>> } finally {
> >>>>> >>>>>>>>>>>>   disconnect()
> >>>>> >>>>>>>>>>>> }
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> + a workaround with an internal estimate size since this
> >>>>> >>>>>>>>>>>> primitive is often called in split but you don't want to
> >>>>> >>>>>>>>>>>> connect twice in the
> >>>>> >>>>>>>>>>>> second phase.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Why do you need that? Simply because you want to define an
> >>>>> >>>>>>>>>>>> API to
> >>>>> >>>>>>>>>>>> implement sources which initializes the source bean and
> >>>>> >>>>>>>>>>>> destroys it.
> >>>>> >>>>>>>>>>>> I insist it is a very, very basic concern for such an API.
> >>>>> >>>>>>>>>>>> However
> >>>>> >>>>>>>>>>>> beam doesn't embrace it and doesn't assume it, so
> building
> >>>>> >>>>>>>>>>>> any API on top of
> >>>>> >>>>>>>>>>>> beam is very painful today, and as a direct beam user you
> >>>>> >>>>>>>>>>>> hit the exact same
> >>>>> >>>>>>>>>>>> issues - check how the IOs are implemented: the static
> >>>>> >>>>>>>>>>>> utilities which create
> >>>>> >>>>>>>>>>>> volatile connections prevent reusing an existing
> >>>>> >>>>>>>>>>>> connection in a single
> >>>>> >>>>>>>>>>>> method
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> B. DoFn & SDF
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> As a fn dev you expect the same from the beam runtime:
> >>>>> >>>>>>>>>>>> init();
> >>>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); }
> and
> >>>>> >>>>>>>>>>>> that it is
> >>>>> >>>>>>>>>>>> executed on the exact same instance to be able to be
> >>>>> >>>>>>>>>>>> stateful at that level
> >>>>> >>>>>>>>>>>> for expensive connections/operations/flow state
> handling.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> As you mentioned with the million example, this
> sequence
> >>>>> >>>>>>>>>>>> should
> >>>>> >>>>>>>>>>>> happen for each single instance so 1M times for your
> >>>>> >>>>>>>>>>>> example.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is
> a
> >>>>> >>>>>>>>>>>> generalisation of both cases (source and dofn).
> Therefore
> >>>>> >>>>>>>>>>>> it creates way
> >>>>> >>>>>>>>>>>> more instances and requires a way more
> >>>>> >>>>>>>>>>>> strict/explicit definition of
> >>>>> >>>>>>>>>>>> the exact lifecycle and which instance does what. Since
> >>>>> >>>>>>>>>>>> beam handles the
> >>>>> >>>>>>>>>>>> full lifecycle of the bean instances it must provide
> >>>>> >>>>>>>>>>>> init/destroy hooks
> >>>>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> If you take the JDBC example which was mentioned
> earlier.
> >>>>> >>>>>>>>>>>> Today, because of the teardown issue it uses bundles.
> >>>>> >>>>>>>>>>>> Since bundle size is
> >>>>> >>>>>>>>>>>> not defined - and will not be with SDF - it must use a pool
> >>>>> >>>>>>>>>>>> to be able to reuse
> >>>>> >>>>>>>>>>>> a connection instance without hurting performance. Now
> >>>>> >>>>>>>>>>>> with the SDF and the
> >>>>> >>>>>>>>>>>> split increase, how do you handle the pool size?
> Generally
> >>>>> >>>>>>>>>>>> in batch you use
> >>>>> >>>>>>>>>>>> a single connection per thread to avoid consuming all
> >>>>> >>>>>>>>>>>> database connections.
> >>>>> >>>>>>>>>>>> With a pool you have 2 choices: 1. use a pool of 1, 2.
> use
> >>>>> >>>>>>>>>>>> a pool a bit
> >>>>> >>>>>>>>>>>> higher, but multiplied by the number of beans you will
> >>>>> >>>>>>>>>>>> likely 2x or 3x the
> >>>>> >>>>>>>>>>>> connection count and make the execution fail with "no
> more
> >>>>> >>>>>>>>>>>> connection
> >>>>> >>>>>>>>>>>> available". If you picked 1 (pool of #1), then you still
> >>>>> >>>>>>>>>>>> have to have a
> >>>>> >>>>>>>>>>>> reliable teardown by pool instance (close() generally)
> to
> >>>>> >>>>>>>>>>>> ensure you release
> >>>>> >>>>>>>>>>>> the pool and don't leak the connection information in
> the
> >>>>> >>>>>>>>>>>> JVM. In all cases
> >>>>> >>>>>>>>>>>> you come back to the init()/destroy() lifecycle even if
> >>>>> >>>>>>>>>>>> you fake getting
> >>>>> >>>>>>>>>>>> connections with bundles.
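
[The JDBC-style concern here boils down to pairing setup() with teardown(). A minimal, Beam-free sketch - the FakeConnection type and all names are illustrative - shows what leaks when teardown is skipped:]

```java
// A per-instance "connection" acquired in setup() must be released in
// teardown(), or it leaks for the lifetime of the JVM.
public class ConnectionLeakDemo {
    static int openConnections = 0;

    static class FakeConnection {
        FakeConnection() { openConnections++; }
        void close()     { openConnections--; }
    }

    static class JdbcStyleFn {
        private FakeConnection connection;
        void setup()    { connection = new FakeConnection(); }
        void process()  { /* would run statements on the connection */ }
        void teardown() { connection.close(); }
    }

    static void run(boolean callTeardown) {
        JdbcStyleFn fn = new JdbcStyleFn();
        fn.setup();
        fn.process();
        if (callTeardown) {
            fn.teardown();
        }
    }

    public static void main(String[] args) {
        run(true);
        System.out.println("with teardown, open = " + openConnections);    // 0
        run(false);
        System.out.println("without teardown, open = " + openConnections); // 1: leaked
    }
}
```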
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Just to make it obvious: SDF mentions are just because SDF
> >>>>> >>>>>>>>>>>> implies
> >>>>> >>>>>>>>>>>> all the current issues with the loose definition of the
> >>>>> >>>>>>>>>>>> bean lifecycles at
> >>>>> >>>>>>>>>>>> an exponential level, nothing else.
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> Romain Manni-Bucau
> >>>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn |
> Book
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
> >>>>> >>>>>>>>>>>>> can be
> >>>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
> >>>>> >>>>>>>>>>>>> the thread above,
> >>>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
> >>>>> >>>>>>>>>>>>> that.
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main
> >>>>> >>>>>>>>>>>>> author of
> >>>>> >>>>>>>>>>>>> most design documents related to SDF and of its
> >>>>> >>>>>>>>>>>>> implementation in the Java
> >>>>> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated
> to
> >>>>> >>>>>>>>>>>>> the topic of
> >>>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming
> up)
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>> I kind of agree except transforms lack a lifecycle
> too.
> >>>>> >>>>>>>>>>>>>> My
> >>>>> >>>>>>>>>>>>>> understanding is that sdf could be a way to unify it
> and
> >>>>> >>>>>>>>>>>>>> clean the api.
> >>>>> >>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle
> of
> >>>>> >>>>>>>>>>>>>> transforms?
> >>>>> >>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers"
> >>>>> >>>>>>>>>>>>>> <bc...@apache.org>
> >>>>> >>>>>>>>>>>>>> a écrit :
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
> >>>>> >>>>>>>>>>>>>>> DoFns
> >>>>> >>>>>>>>>>>>>>> is appropriate? In many cases where cleanup is
> necessary,
> >>>>> >>>>>>>>>>>>>>> it is around an
> >>>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
> >>>>> >>>>>>>>>>>>>>> discussions/proposals
> >>>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
> >>>>> >>>>>>>>>>>>>>> haven't been
> >>>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
> >>>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
> >>>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to
> do
> >>>>> >>>>>>>>>>>>>>> a
> >>>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
> >>>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the
> >>>>> >>>>>>>>>>>>>>> chance of
> >>>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
> >>>>> >>>>>>>>>>>>>>> destination).
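
[The three steps Ben lists can be sketched with plain java.nio, independent of Beam; the paths and shard count below are illustrative:]

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

// Plain-JDK sketch of the write-temp / bulk-move / cleanup pattern.
public class TempFileMoveDemo {
    static long movedCount;

    public static void main(String[] args) {
        try {
            Path tempDir = Files.createTempDirectory("shards-tmp");
            Path finalDir = Files.createTempDirectory("shards-final");
            int shards = 3;

            // 1. Write every shard to a temporary file first.
            List<Path> tempFiles = new ArrayList<>();
            for (int i = 0; i < shards; i++) {
                Path tmp = tempDir.resolve("shard-" + i + ".tmp");
                Files.write(tmp, ("shard " + i).getBytes());
                tempFiles.add(tmp);
            }

            // 2. Only once all shards are complete, move them to the final
            //    destination; readers never see a partially written shard.
            for (Path tmp : tempFiles) {
                String name = tmp.getFileName().toString().replace(".tmp", ".txt");
                Files.move(tmp, finalDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
            }

            // 3. Clean up the now-empty temporary directory.
            Files.deleteIfExists(tempDir);

            movedCount = Files.list(finalDir).count();
            System.out.println("final shards: " + movedCount);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

Step 3 is exactly the cleanup whose placement (DoFn teardown vs. a transform-level hook) is being debated.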
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
> >>>>> >>>>>>>>>>>>>>> workers,
> >>>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
> >>>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one
> >>>>> >>>>>>>>>>>>>>> worker. This
> >>>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
> >>>>> >>>>>>>>>>>>>>> stuff done to ensure it
> >>>>> >>>>>>>>>>>>>>> runs on one worker.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is
> not
> >>>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule
> >>>>> >>>>>>>>>>>>>>> some cleanup work for
> >>>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
> >>>>> >>>>>>>>>>>>>>> relatively straightforward,
> >>>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some
> problems,
> >>>>> >>>>>>>>>>>>>>> such as BigQuery
> >>>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
> >>>>> >>>>>>>>>>>>>>> into BigQuery.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you
> >>>>> >>>>>>>>>>>>>>> want to
> >>>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
> >>>>> >>>>>>>>>>>>>>> wait until the end of
> >>>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until
> >>>>> >>>>>>>>>>>>>>> you know nobody will
> >>>>> >>>>>>>>>>>>>>> need the resource anymore.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
> >>>>> >>>>>>>>>>>>>>> where
> >>>>> >>>>>>>>>>>>>>> you could have a transform that output resource
> >>>>> >>>>>>>>>>>>>>> objects. Each resource
> >>>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
> >>>>> >>>>>>>>>>>>>>> would be something
> >>>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
> >>>>> >>>>>>>>>>>>>>> resource, and what
> >>>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon
> as
> >>>>> >>>>>>>>>>>>>>> that part of the
> >>>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no
> >>>>> >>>>>>>>>>>>>>> longer need the resources,
> >>>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at
> pipeline
> >>>>> >>>>>>>>>>>>>>> shutdown, or
> >>>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your
> use
> >>>>> >>>>>>>>>>>>>>> case?
> >>>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
> >>>>> >>>>>>>>>>>>>>> sufficient?
> >>>>> >>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the
> >>>>> >>>>>>>>>>>>>>>> overall
> >>>>> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a
> >>>>> >>>>>>>>>>>>>>>> thread of a worker - has
> >>>>> >>>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage
> >>>>> >>>>>>>>>>>>>>>> collection.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocate
> >>>>> >>>>>>>>>>>>>>>> (deserialization) but it doesnt matter here.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> What I want is any "new" to have a following setup
> >>>>> >>>>>>>>>>>>>>>> before
> >>>>> >>>>>>>>>>>>>>>> any process or startBundle, and the last time beam
> has
> >>>>> >>>>>>>>>>>>>>>> the instance before it
> >>>>> >>>>>>>>>>>>>>>> is gc-ed and after the last finishBundle it calls
> >>>>> >>>>>>>>>>>>>>>> teardown.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> It is as simple as that.
> >>>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way
> >>>>> >>>>>>>>>>>>>>>> that makes a fn not
> >>>>> >>>>>>>>>>>>>>>> self-contained to implement basic transforms.
> >>>>> >>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax"
> >>>>> >>>>>>>>>>>>>>>> <re...@google.com> a
> >>>>> >>>>>>>>>>>>>>>> écrit :
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain
> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
> >>>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
> >>>>> >>>>>>>>>>>>>>>>>> Rather
> >>>>> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing
> >>>>> >>>>>>>>>>>>>>>>>> methods -- which have been
> >>>>> >>>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it
> would
> >>>>> >>>>>>>>>>>>>>>>>> be helpful to focus
> >>>>> >>>>>>>>>>>>>>>>>> more on the reason you are looking for something
> >>>>> >>>>>>>>>>>>>>>>>> with different semantics.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
> >>>>> >>>>>>>>>>>>>>>>>> trying
> >>>>> >>>>>>>>>>>>>>>>>> to do):
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that
> was
> >>>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the
> pipeline.
> >>>>> >>>>>>>>>>>>>>>>>> If this is the case,
> >>>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only
> initialized
> >>>>> >>>>>>>>>>>>>>>>>> once (and not once per
> >>>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do
> you
> >>>>> >>>>>>>>>>>>>>>>>> know when the pipeline
> >>>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it
> reaches
> >>>>> >>>>>>>>>>>>>>>>>> step X", then what
> >>>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> When the dofn is logically no longer needed, i.e.
> >>>>> >>>>>>>>>>>>>>>>>> when the
> >>>>> >>>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or
> by a
> >>>>> >>>>>>>>>>>>>>>>>> jvm shutdown)
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers,
> >>>>> >>>>>>>>>>>>>>>>> and each
> >>>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy
> >>>>> >>>>>>>>>>>>>>>>> of the same DoFn). How
> >>>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000
> =
> >>>>> >>>>>>>>>>>>>>>>> 1M cleanups) and when
> >>>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
> >>>>> >>>>>>>>>>>>>>>>> shut down? When an
> >>>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may
> be
> >>>>> >>>>>>>>>>>>>>>>> temporary - may be
> >>>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within
> some
> >>>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
> >>>>> >>>>>>>>>>>>>>>>>> methods are not a good fit
> >>>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
> >>>>> >>>>>>>>>>>>>>>>>> within the DoFn), you could
> >>>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that
> it
> >>>>> >>>>>>>>>>>>>>>>>> produced. For instance:
> >>>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some
> token
> >>>>> >>>>>>>>>>>>>>>>>> that
> >>>>> >>>>>>>>>>>>>>>>>> stores information about resources)
> >>>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
> >>>>> >>>>>>>>>>>>>>>>>> retries
> >>>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
> >>>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
> >>>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources,
> and
> >>>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
> >>>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
> >>>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data
> >>>>> >>>>>>>>>>>>>>>>>> it is
> >>>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
> >>>>> >>>>>>>>>>>>>>>>>> use or have been finished
> >>>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
> >>>>> >>>>>>>>>>>>>>>>>> important to ensuring
> >>>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
> >>>>> >>>>>>>>>>>>>>>>>> industrialize some api on top of beam.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If
> it
> >>>>> >>>>>>>>>>>>>>>>>> is
> >>>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are
> >>>>> >>>>>>>>>>>>>>>>>> trying to accomplish? That
> >>>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
> >>>>> >>>>>>>>>>>>>>>>>> existing options and
> >>>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
> >>>>> >>>>>>>>>>>>>>>>>> cases, but it
> >>>>> >>>>>>>>>>>>>>>>>> means each transform is different in its lifecycle
> >>>>> >>>>>>>>>>>>>>>>>> handling, and I
> >>>>> >>>>>>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you
> >>>>> >>>>>>>>>>>>>>>>>> can't put any unified
> >>>>> >>>>>>>>>>>>>>>>>> practice on top of beam; it also makes beam very
> >>>>> >>>>>>>>>>>>>>>>>> hard to integrate or to use
> >>>>> >>>>>>>>>>>>>>>>>> to build higher-level libraries or software.
> >>>>> >>>>>>>>>>>>>>>>>> to build higher level libraries or softwares.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
> >>>>> >>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> -- Ben
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many
> >>>>> >>>>>>>>>>>>>>>>>>>> of the
> >>>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
> >>>>> >>>>>>>>>>>>>>>>>>>> machine.
> >>>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be
> called
> >>>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's
> logically
> >>>>> >>>>>>>>>>>>>>>>>>>> impossible or impractical
> >>>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or
> >>>>> >>>>>>>>>>>>>>>>>>>> you can list some of the
> >>>>> >>>>>>>>>>>>>>>>>>>> examples above.
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
> >>>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
> >>>>> >>>>>>>>>>>>>>>>>>>> called - it's not just
> >>>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
> >>>>> >>>>>>>>>>>>>>>>>>>> important (e.g. cleaning up
> >>>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting
> down a
> >>>>> >>>>>>>>>>>>>>>>>>>> large number of VMs you
> >>>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one
> of
> >>>>> >>>>>>>>>>>>>>>>>>>> the other methods that
> >>>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come
> at
> >>>>> >>>>>>>>>>>>>>>>>>>> a cost, e.g. no
> >>>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee, sadly,
> >>>>> >>>>>>>>>>>>>>>>>>> so I'm not sure which other method you speak
> >>>>> >>>>>>>>>>>>>>>>>>> about.
> >>>>> >>>>>>>>>>>>>>>>>>> Concretely, if you make it really unreliable -
> >>>>> >>>>>>>>>>>>>>>>>>> this is what "best effort" sounds like to me -
> >>>>> >>>>>>>>>>>>>>>>>>> then users can't use it to clean up anything. But
> >>>>> >>>>>>>>>>>>>>>>>>> if you make it "can fail to happen, but that is
> >>>>> >>>>>>>>>>>>>>>>>>> unexpected and means something went wrong", then
> >>>>> >>>>>>>>>>>>>>>>>>> it is fine to have a manual - or automatic, if
> >>>>> >>>>>>>>>>>>>>>>>>> fancy - recovery procedure. This is where it makes
> >>>>> >>>>>>>>>>>>>>>>>>> all the difference and impacts the developers and
> >>>>> >>>>>>>>>>>>>>>>>>> ops (all users, basically).
> >>>>> >>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain
> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means
> >>>>> >>>>>>>>>>>>>>>>>>>>> that. It
> >>>>> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is
> >>>>> >>>>>>>>>>>>>>>>>>>>> what triggered this thread.
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
> >>>>> >>>>>>>>>>>>>>>>>>>>> prevents
> >>>>> >>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be
> very
> >>>>> >>>>>>>>>>>>>>>>>>>>> badly and wrongly
> >>>>> >>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
> >>>>> >>>>>>>>>>>>>>>>>>>>> LinkedIn |
> >>>>> >>>>>>>>>>>>>>>>>>>>> Book
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to
> call
> >>>>> >>>>>>>>>>>>>>>>>>>>>> it:
> >>>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have
> (intergalactic
> >>>>> >>>>>>>>>>>>>>>>>>>>>> crash), and in a number of
> >>>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
> >>>>> >>>>>>>>>>>>>>>>>>>>>> container has crashed (eg user code
> >>>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over
> >>>>> >>>>>>>>>>>>>>>>>>>>>> JNI and it segfaulted), JVM
> >>>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
> >>>>> >>>>>>>>>>>>>>>>>>>>>> worker has lost network
> >>>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it
> won't
> >>>>> >>>>>>>>>>>>>>>>>>>>>> be able to do anything
> >>>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a
> >>>>> >>>>>>>>>>>>>>>>>>>>>> preemptible VM and it was preempted by
> >>>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice
> or
> >>>>> >>>>>>>>>>>>>>>>>>>>>> if the worker was too busy
> >>>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
> >>>>> >>>>>>>>>>>>>>>>>>>>>> functions) until the preemption
> >>>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying
> hardware
> >>>>> >>>>>>>>>>>>>>>>>>>>>> simply failed (which
> >>>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many
> other
> >>>>> >>>>>>>>>>>>>>>>>>>>>> conditions.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
> >>>>> >>>>>>>>>>>>>>>>>>>>>> describe
> >>>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs
> for
> >>>>> >>>>>>>>>>>>>>>>>>>>>> cases where you observed a
> >>>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where
> it
> >>>>> >>>>>>>>>>>>>>>>>>>>>> was possible to call it but
> >>>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain
> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles"
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Manni-Bucau
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level
> need
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> (e.g.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x
> and
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> it requires the following
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> logic and the following processing
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> you.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> requiring a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> since their size is not controlled. Using
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> teardown doesn't let you release the
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection since it is a best-effort thing.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Not releasing the connection makes you pay
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> a lot - AWS ;) - or prevents you from
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> launching other processing - concurrent
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> limit.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> If
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called then nothing else can be
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> AWS service are you thinking of
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> everything at the other end has died?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> stateless, but some (proprietary) protocols
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> require closing exchanges which are more
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> than just "I'm leaving".
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> services - machines - on the fly at pipeline
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> startup and closing them at the end. If
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> money. You can say it can be done another
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> way... as can the full pipeline ;).
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle its components' lifecycle it can't be
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> used at scale for generic pipelines and ends
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> up bound to some particular IOs.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar-crash case, which can't be
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handled by any human system? Nothing,
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> it? Is it due to some legacy code on
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Dataflow, or something else?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> implemented
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> kind of change you're asking
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> is not called then it is a bug, and we are
> >>>>> >>>>>>>>>>>>>>>>>>>>>>> done :).
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> direct runner does it, so if a user uses the
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> reference implementation in tests, will he
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> get a different behavior in prod? Also don't
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> forget the user doesn't know what the IOs he
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impactful for
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the whole product that it must be handled
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> IMHO.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in the big data world, but that is not a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> reason to ignore what people did for years
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> and do it wrong before doing it right ;).
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions -
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the execution of teardown. Then we see if we
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> can handle it, and only if there is a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> technical reason we can't do we make it
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Spark and Flink can; any unknown blocker for
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> other runners?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> through Java shutdown hooks, otherwise your
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> environment (the software enclosing Beam) is
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> uncontrolled. The only case where that is
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> not true is when the software is always
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> owned by a vendor and never installed on a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> customer environment. In that case it
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> belongs to the vendor to handle the Beam API
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> and not to Beam to adjust its API for a
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> vendor - otherwise all features unsupported
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> by one runner should be made optional,
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> right?
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
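The shutdown-hook point above can be sketched in plain Java. All names here are illustrative, not Beam APIs: a hook registered with Runtime.addShutdownHook runs on normal exit and on SIGTERM (a plain `kill`), though not on `kill -9` or hardware failure.

```java
// Sketch: running teardown logic from a JVM shutdown hook.
// Covers normal exit and SIGTERM, but NOT `kill -9` or hardware failure.
// TeardownHook and its methods are hypothetical names.
class TeardownHook {
    private static volatile boolean released = false;

    // Idempotent release, safe to call from the hook or directly.
    static void releaseResources() {
        released = true; // close connections, delete temp files, etc.
    }

    static boolean isReleased() {
        return released;
    }

    static void install() {
        Runtime.getRuntime().addShutdownHook(new Thread(TeardownHook::releaseResources));
    }
}
```

Because the release is idempotent, calling it both from the hook and from a regular teardown path is safe.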
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> distributed systems, so it is key to have an
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> explicit, defined lifecycle.
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>>>>>>
> >>>>> >>>>>>>>>>>>
> >>>>> >>>>>>>>>>>
> >>>>> >>>>>>>>>>
> >>>>> >>>>>>>>>
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>
> >>>>
> >>
> >
>

Re: @TearDown guarantees

Posted by Ismaël Mejía <ie...@gmail.com>.
Hello, thanks Eugene for improving the documentation so we can close
this thread.

Reuven, I understood the semantics of the methods; what surprised me was that I
interpreted the new documentation as if a runner could simply choose not to call
@Teardown. We have already dealt with the issues of not doing this when
there is an exception in the element methods
(startBundle/processElement/finishBundle): we can leak resources by not calling
teardown, as the Spark runner user reported in the link I sent.

So, considering that a runner should try its best to call that method, I promoted
some of the methods of ParDoLifecycleTest to be ValidatesRunner tests to ensure
that runners call teardown after exceptions, and I filed BEAM-3245 so the
DataflowRunner tries its best to respect the lifecycle when it can. (Note I
auto-assigned this JIRA but it is up to you guys to reassign it to the person
who can work on it.)
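The property those lifecycle tests assert can be illustrated with a plain-Java simulation (a sketch, not the actual ParDoLifecycleTest code): even when the element method throws, the harness must still reach teardown.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the property being validated: teardown is invoked even when
// processElement throws. Plain Java simulation; not Beam runner code.
class TeardownOnError {
    final List<String> calls = new ArrayList<>();

    void run(Runnable processElement) {
        calls.add("setup");
        try {
            processElement.run();
        } catch (RuntimeException e) {
            calls.add("error:" + e.getMessage()); // failure is recorded/propagated
        } finally {
            calls.add("teardown"); // must happen on both the normal and error paths
        }
    }
}
```

The try/finally is the whole point: the failure path and the success path converge on teardown.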


On Wed, Feb 21, 2018 at 7:26 AM, Reuven Lax <re...@google.com> wrote:
> To close the loop here:
>
> Romain, I think your actual concern was that the Javadoc made it sound like
> a runner could simply decide not to call Teardown. If so, then I agree with
> you - the Javadoc was misleading (and appears it was confusing to Ismael as
> well). If a runner destroys a DoFn, it _must_ call TearDown before it calls
> Setup on a new DoFn.
>
> If so, then most of the back and forth on this thread had little to do with
> your actual concern. However it did take almost three days of discussion
> before Eugene understood what your real concern was, leading to the side
> discussions.
>
> Reuven
>
> On Mon, Feb 19, 2018 at 6:08 PM, Reuven Lax <re...@google.com> wrote:
>>
>> +1 This PR clarifies the semantics quite a bit.
>>
>> On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov <ki...@google.com>
>> wrote:
>>>
>>> I've sent out a PR editing the Javadoc
>>> https://github.com/apache/beam/pull/4711 . Hopefully, that should be
>>> sufficient.
>>>
>>> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
>>>>
>>>> Ismael, your understanding is appropriate for FinishBundle.
>>>>
>>>> One basic issue with this understanding, is that the lifecycle of a DoFn
>>>> is much longer than a single bundle (which I think you expressed by adding
>>>> the *s). How long the DoFn lives is not defined. In fact a runner is
>>>> completely free to decide that it will _never_ destroy the DoFn, in which
>>>> case TearDown is never called simply because the DoFn was never torn down.
>>>>
>>>> Also, as mentioned before, the runner can only call TearDown in cases
>>>> where the shutdown is in its control. If the JVM is shut down externally,
>>>> the runner has no chance to call TearDown. This means that while TearDown is
>>>> appropriate for cleaning up in-process resources (open connections, etc.),
>>>> it's not the right answer for cleaning up persistent resources. If you rely
>>>> on TearDown to delete VMs or delete files, there will be cases in which
>>>> those files or VMs are not deleted.
>>>>
>>>> What we are _not_ saying is that the runner is free to just ignore
>>>> TearDown. If the runner is explicitly destroying a DoFn object, it should
>>>> call TearDown.
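That contract can be sketched as follows (DoFnSlot and Fn are hypothetical names, not the Beam runner API): a slot that replaces a DoFn instance tears the old one down before setting up the replacement.

```java
// Sketch of the contract: a runner that destroys a DoFn instance must
// call teardown on it before calling setup on a replacement.
// DoFnSlot and Fn are illustrative names, not Beam classes.
class DoFnSlot {
    interface Fn {
        void setup();
        void teardown();
    }

    private Fn current;

    void install(Fn next) {
        if (current != null) {
            current.teardown(); // never skipped when destruction is under the runner's control
        }
        next.setup();
        current = next;
    }
}
```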
>>>>
>>>> Reuven
>>>>
>>>>
>>>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:
>>>>>
>>>>> I also had a different understanding of the lifecycle of a DoFn.
>>>>>
>>>>> My understanding of the use case for every method in the DoFn was clear
>>>>> and perfectly aligned with Thomas' explanation, but what I understood
>>>>> was that in general terms '@Setup is where I get resources/prepare
>>>>> connections and @Teardown is where I free them', so calling Teardown
>>>>> seemed essential to have a complete lifecycle:
>>>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
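The lifecycle above can be sketched as a plain-Java driver (a simulation, not the Beam SDK), where the starred methods repeat per bundle/element and teardown closes the sequence via try/finally:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of: Setup -> (StartBundle -> ProcessElement* -> FinishBundle)* -> Teardown,
// with teardown guaranteed by try/finally. Plain Java; not actual runner code.
class LifecycleDriver {
    final List<String> calls = new ArrayList<>();

    void run(List<List<String>> bundles) {
        calls.add("setup");
        try {
            for (List<String> bundle : bundles) {
                calls.add("startBundle");
                for (String element : bundle) {
                    calls.add("process:" + element);
                }
                calls.add("finishBundle");
            }
        } finally {
            calls.add("teardown"); // runs on normal completion and on exception
        }
    }
}
```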
>>>>>
>>>>> The fact that @Teardown might not be called is a new detail for me too,
>>>>> and I also find it weird to have a method that may or may not be called
>>>>> as part of an API; why would users implement teardown if it will not be
>>>>> called? In that case probably a cleaner approach would be to get rid of
>>>>> that method altogether, no?
>>>>>
>>>>> But well, maybe that's not so easy either. There was another point: some
>>>>> user
>>>>> reported an issue with leaking resources using KafkaIO in the Spark
>>>>> runner, for
>>>>> ref.
>>>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>>>>>
>>>>> In that moment my understanding was that there was something fishy
>>>>> because we
>>>>> should be calling Teardown to close correctly the connections and free
>>>>> the
>>>>> resources in case of exceptions on start/process/finish, so I filed a
>>>>> JIRA and
>>>>> fixed this by enforcing the call of teardown for the Spark runner and
>>>>> the Flink
>>>>> runner:
>>>>> https://issues.apache.org/jira/browse/BEAM-3187
>>>>> https://issues.apache.org/jira/browse/BEAM-3244
>>>>>
>>>>> As you can see, not calling this method does have consequences, at least
>>>>> for non-containerized runners. Of course a runner that uses containers
>>>>> could choose not to care about cleaning the resources this way, but a
>>>>> long-lived JVM in a Hadoop environment probably won't have the same
>>>>> luck. So I am not sure that having a
>>>>> loose semantic there is the right option, I mean, runners could simply
>>>>> guarantee
>>>>> that they call teardown and if teardown takes too long they can decide
>>>>> to send a
>>>>> signal or kill the process/container/etc and go ahead, that way at
>>>>> least users
>>>>> would have a motivation to implement the teardown method, otherwise it
>>>>> doesn’t
>>>>> make any sense to have it (API wise).
>>>>>
>>>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov
>>>>> <ki...@google.com> wrote:
>>>>> > Romain, would it be fair to say that currently the goal of your
>>>>> > participation in this discussion is to identify situations where
>>>>> > @Teardown
>>>>> > in principle could have been called, but some of the current runners
>>>>> > don't
>>>>> > make a good enough effort to call it? If yes - as I said before,
>>>>> > please, by
>>>>> > all means, file bugs of the form "Runner X doesn't call @Teardown in
>>>>> > situation Y" if you're aware of any, and feel free to send PRs fixing
>>>>> > runner
>>>>> > X to reliably call @Teardown in situation Y. I think we all agree
>>>>> > that this
>>>>> > would be a good improvement.
>>>>> >
>>>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau
>>>>> > <rm...@gmail.com>
>>>>> > wrote:
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On 19 Feb 2018 22:56, "Reuven Lax" <re...@google.com> wrote:
>>>>> >>
>>>>> >>
>>>>> >>
>>>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>>>>> >> <rm...@gmail.com> wrote:
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>> On 19 Feb 2018 21:28, "Reuven Lax" <re...@google.com> wrote:
>>>>> >>>
>>>>> >>> How do you call teardown? There are cases in which the Java code
>>>>> >>> gets no
>>>>> >>> indication that the restart is happening (e.g. cases where the
>>>>> >>> machine
>>>>> >>> itself is taken down)
>>>>> >>>
>>>>> >>>
>>>>> >>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
>>>>> >>> Crashes are bugs, and kill -9 to shut down is a bug too. Other cases
>>>>> >>> can call shutdown with a hook, worst case.
>>>>> >>
>>>>> >>
>>>>> >> What you say here is simply not true.
>>>>> >>
>>>>> >> There are many scenarios in which workers shutdown with no
>>>>> >> opportunity for
>>>>> >> any sort of shutdown hook. Sometimes the entire machine gets
>>>>> >> shut down, and
>>>>> >> not even the OS will have much of a chance to do anything. At scale
>>>>> >> this
>>>>> >> will happen with some regularity, and a distributed system that
>>>>> >> assumes this
>>>>> >> will not happen is a poor distributed system.
>>>>> >>
>>>>> >>
>>>>> >> This is part of the infra, and there is no reason the machine is shut
>>>>> >> down without first shutting down what runs on it, except if there is
>>>>> >> a bug in the software or setup. I hear that you maybe don't do it
>>>>> >> everywhere, but there is no blocker to doing it. It means you can
>>>>> >> shut down the machines and guarantee teardown is called.
>>>>> >>
>>>>> >> Where I'm going is simply that it is doable, and the Beam SDK core
>>>>> >> can assume the setup is done well. If there is a best-effort downside
>>>>> >> due to that - with the meaning you defined - it is an implementation
>>>>> >> bug or a user installation issue.
>>>>> >>
>>>>> >> Technically all of this is true.
>>>>> >>
>>>>> >> What can prevent teardown is a hardware failure or similar. That is
>>>>> >> fine and doesn't need to be in the doc, since it is life in IT and
>>>>> >> obvious - or it must be made very explicit to avoid the current
>>>>> >> ambiguity.
>>>>> >>
>>>>> >>
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>>
>>>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau
>>>>> >>> <rm...@gmail.com>
>>>>> >>> wrote:
>>>>> >>>>
>>>>> >>>> Restarting doesn't mean you don't call teardown. Except for a bug,
>>>>> >>>> there is no reason - technically - that it happens. No reason.
>>>>> >>>>
>>>>> >>>> On 19 Feb 2018 21:14, "Reuven Lax" <re...@google.com> wrote:
>>>>> >>>>>
>>>>> >>>>> Workers restarting is not a bug; it's standard and often expected.
>>>>> >>>>>
>>>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>>>>> >>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>
>>>>> >>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug recovery
>>>>> >>>>>> (procedure)
>>>>> >>>>>>
>>>>> >>>>>> On 19 Feb 2018 19:42, "Eugene Kirpichov"
>>>>> >>>>>> <ki...@google.com>
>>>>> >>>>>> wrote:
>>>>> >>>>>>>
>>>>> >>>>>>> So what would you like to happen if there is a crash? The DoFn
>>>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
>>>>> >>>>>>> exists. What
>>>>> >>>>>>> should Teardown be called on?
>>>>> >>>>>>>
>>>>> >>>>>>>
>>>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>>>>> >>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>
>>>>> >>>>>>>> This is what I want, and not 999,999 teardowns for 1,000,000
>>>>> >>>>>>>> setups
>>>>> >>>>>>>> until there is an unexpected crash (= a bug).
>>>>> >>>>>>>>
>>>>> >>>>>>>> On 19 Feb 2018 18:57, "Reuven Lax" <re...@google.com>
>>>>> >>>>>>>> wrote:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>>>>> >>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>>>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256, but it
>>>>> >>>>>>>>>>>> leads to
>>>>> >>>>>>>>>>>> the same pattern; the teardown is just an "if
>>>>> >>>>>>>>>>>> (iCreatedThem) releaseThem();"
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
>>>>> >>>>>>>>>>> workers,
>>>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
>>>>> >>>>>>>>>>> created per
>>>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
>>>>> >>>>>>>>>>> threads on each
>>>>> >>>>>>>>>>> worker.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Nope, it was the other way around: in this case on AWS you
>>>>> >>>>>>>>>> can get 256 instances at once but not 512 (which would be
>>>>> >>>>>>>>>> 2x256). So when you compute the distribution, you allocate to
>>>>> >>>>>>>>>> some fn the role of owning the instance lookup and release.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> I still don't understand. Let's be more precise. If you write
>>>>> >>>>>>>>> the
>>>>> >>>>>>>>> following code:
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>> >>>>>>>>>
>>>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
>>>>> >>>>>>>>> created. The runner might decide to create a million
>>>>> >>>>>>>>> instances of this
>>>>> >>>>>>>>> class across your worker pool, which means that you will get
>>>>> >>>>>>>>> a million Setup
>>>>> >>>>>>>>> and Teardown calls.
>>>>> >>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>> Anyway, this was just an example of an external resource you
>>>>> >>>>>>>>>> must release. The real topic is that Beam should define ASAP
>>>>> >>>>>>>>>> a guaranteed, generic lifecycle to let users embrace its
>>>>> >>>>>>>>>> programming model.
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> @Eugene:
>>>>> >>>>>>>>>>>> 1. wait logic is about passing the value which is not
>>>>> >>>>>>>>>>>> always
>>>>> >>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>> >>>>>>>>>>>> 2. SDF: I'll try to detail why I mention SDF more here
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the
>>>>> >>>>>>>>>>>> SDK
>>>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
>>>>> >>>>>>>>>>>> implies bean
>>>>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
>>>>> >>>>>>>>>>>> sources and dofn (not
>>>>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> A. Source
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>>> >>>>>>>>>>>> estimateSize and split. As a user you would expect both to
>>>>> >>>>>>>>>>>> be called on the
>>>>> >>>>>>>>>>>> same bean instance to avoid paying the same connection
>>>>> >>>>>>>>>>>> cost(s) twice.
>>>>> >>>>>>>>>>>> Concretely:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> connect()
>>>>> >>>>>>>>>>>> try {
>>>>> >>>>>>>>>>>>   estimateSize()
>>>>> >>>>>>>>>>>>   split()
>>>>> >>>>>>>>>>>> } finally {
>>>>> >>>>>>>>>>>>   disconnect()
>>>>> >>>>>>>>>>>> }
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> connect()
>>>>> >>>>>>>>>>>> try {
>>>>> >>>>>>>>>>>>   estimateSize()
>>>>> >>>>>>>>>>>> } finally {
>>>>> >>>>>>>>>>>>   disconnect()
>>>>> >>>>>>>>>>>> }
>>>>> >>>>>>>>>>>> connect()
>>>>> >>>>>>>>>>>> try {
>>>>> >>>>>>>>>>>>   split()
>>>>> >>>>>>>>>>>> } finally {
>>>>> >>>>>>>>>>>>   disconnect()
>>>>> >>>>>>>>>>>> }
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> + a workaround with an internal estimate size since this
>>>>> >>>>>>>>>>>> primitive is often called in split but you dont want to
>>>>> >>>>>>>>>>>> connect twice in the
>>>>> >>>>>>>>>>>> second phase.
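The workaround described can be sketched like this (a hypothetical source shape, not Beam's BoundedSource API): each primitive opens and closes its own connection, and estimateSize caches its result so split does not reconnect for the size.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the workaround: connect per primitive, and cache the size
// computed during estimateSize so split can reuse it without a second
// connection for the estimate. Hypothetical shape, not Beam's Source API.
class SizeCachingSource {
    static final AtomicInteger connects = new AtomicInteger();

    private long cachedSize = -1;

    private void connect() { connects.incrementAndGet(); }
    private void disconnect() { /* close the connection */ }

    long estimateSize() {
        if (cachedSize >= 0) {
            return cachedSize; // reuse the cached value, no reconnect
        }
        connect();
        try {
            cachedSize = 1024; // stand-in for a real size query
            return cachedSize;
        } finally {
            disconnect();
        }
    }

    int split(long desiredBundleSize) {
        long size = estimateSize(); // served from the cache if already computed
        connect();
        try {
            return (int) Math.max(1, size / desiredBundleSize);
        } finally {
            disconnect();
        }
    }
}
```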
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Why do you need that? Simply because you want to define an
>>>>> >>>>>>>>>>>> API to implement sources which initializes the source bean
>>>>> >>>>>>>>>>>> and destroys it. I insist it is a very, very basic concern
>>>>> >>>>>>>>>>>> for such an API. However Beam doesn't embrace it and
>>>>> >>>>>>>>>>>> doesn't assume it, so building any API on top of Beam is
>>>>> >>>>>>>>>>>> very painful today, and as a direct Beam user you hit the
>>>>> >>>>>>>>>>>> exact same issues - check how the IOs are implemented: the
>>>>> >>>>>>>>>>>> static utilities which create short-lived connections,
>>>>> >>>>>>>>>>>> preventing reuse of an existing connection in a single
>>>>> >>>>>>>>>>>> method
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> B. DoFn & SDF
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> As a fn dev you expect the same from the beam runtime:
>>>>> >>>>>>>>>>>> init();
>>>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and
>>>>> >>>>>>>>>>>> that it is
>>>>> >>>>>>>>>>>> executed on the exact same instance to be able to be
>>>>> >>>>>>>>>>>> stateful at that level
>>>>> >>>>>>>>>>>> for expensive connections/operations/flow state handling.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> As you mentioned with the million example, this sequence
>>>>> >>>>>>>>>>>> should happen for each single instance, so 1M times for
>>>>> >>>>>>>>>>>> your example.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore
>>>>> >>>>>>>>>>>> it creates way
>>>>> >>>>>>>>>>>> more instances and requires a way more
>>>>> >>>>>>>>>>>> strict/explicit definition of
>>>>> >>>>>>>>>>>> the exact lifecycle and which instance does what. Since
>>>>> >>>>>>>>>>>> beam handles the
>>>>> >>>>>>>>>>>> full lifecycle of the bean instances it must provide
>>>>> >>>>>>>>>>>> init/destroy hooks
>>>>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> If you take the JDBC example which was mentioned earlier.
>>>>> >>>>>>>>>>>> Today, because of the teardown issue it uses bundles.
>>>>> >>>>>>>>>>>> Since bundle size is
>>>>> >>>>>>>>>>>> not defined - and will not be with SDF - it must use a pool
>>>>> >>>>>>>>>>>> to be able to reuse
>>>>> >>>>>>>>>>>> a connection instance and not wreck performance. Now
>>>>> >>>>>>>>>>>> with the SDF and the
>>>>> >>>>>>>>>>>> split increase, how do you handle the pool size? Generally
>>>>> >>>>>>>>>>>> in batch you use
>>>>> >>>>>>>>>>>> a single connection per thread to avoid consuming all
>>>>> >>>>>>>>>>>> database connections.
>>>>> >>>>>>>>>>>> With a pool you have 2 choices: 1. use a pool of 1; 2. use
>>>>> >>>>>>>>>>>> a slightly
>>>>> >>>>>>>>>>>> larger pool, but, multiplied by the number of beans, you will
>>>>> >>>>>>>>>>>> likely 2x or 3x the
>>>>> >>>>>>>>>>>> connection count and make the execution fail with "no more
>>>>> >>>>>>>>>>>> connection
>>>>> >>>>>>>>>>>> available". If you picked 1 (a pool of one), then you still
>>>>> >>>>>>>>>>>> have to have a
>>>>> >>>>>>>>>>>> reliable teardown by pool instance (close() generally) to
>>>>> >>>>>>>>>>>> ensure you release
>>>>> >>>>>>>>>>>> the pool and don't leak the connection information in the
>>>>> >>>>>>>>>>>> JVM. In all case
>>>>> >>>>>>>>>>>> you come back to the init()/destroy() lifecycle even if
>>>>> >>>>>>>>>>>> you fake to get
>>>>> >>>>>>>>>>>> connections with bundles.
>>>>> >>>>>>>>>>>>
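The pool-of-one option can be sketched like this (hypothetical code: `SingleConnectionPool` and its `Connection` interface are stand-ins, not Beam or JDBC types). Note it only works if `close()` is reliably invoked from a teardown hook; otherwise the connection leaks for the lifetime of the JVM, which is exactly the concern being discussed.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class SingleConnectionPool implements AutoCloseable {
    public interface Connection extends AutoCloseable {}

    private final Connection connection;           // opened once, e.g. in @Setup
    private final AtomicBoolean closed = new AtomicBoolean(false);

    public SingleConnectionPool(Connection connection) {
        this.connection = connection;
    }

    // Every bundle borrows the same instance: no per-bundle connect cost.
    public Connection borrow() {
        if (closed.get()) throw new IllegalStateException("pool is closed");
        return connection;
    }

    // Meant to be called from a teardown hook; if the runner skips teardown,
    // the single connection is never released.
    @Override
    public void close() throws Exception {
        if (closed.compareAndSet(false, true)) {
            connection.close();
        }
    }
}
```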
>>>>> >>>>>>>>>>>> Just to make it obvious: the SDF mentions are just because
>>>>> >>>>>>>>>>>> SDF amplifies
>>>>> >>>>>>>>>>>> all the current issues with the loose definition of the
>>>>> >>>>>>>>>>>> bean lifecycles to
>>>>> >>>>>>>>>>>> an exponential level, nothing else.
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> Romain Manni-Bucau
>>>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>>>>> >>>>>>>>>>>> <ki...@google.com>:
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
>>>>> >>>>>>>>>>>>> can be
>>>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
>>>>> >>>>>>>>>>>>> the thread above,
>>>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
>>>>> >>>>>>>>>>>>> that.
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main
>>>>> >>>>>>>>>>>>> author of
>>>>> >>>>>>>>>>>>> most design documents related to SDF and of its
>>>>> >>>>>>>>>>>>> implementation in the Java
>>>>> >>>>>>>>>>>>> direct and dataflow runners, that SDF is fully unrelated to
>>>>> >>>>>>>>>>>>> the topic of
>>>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>>>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too.
>>>>> >>>>>>>>>>>>>> My
>>>>> >>>>>>>>>>>>>> understanding is that SDF could be a way to unify it and
>>>>> >>>>>>>>>>>>>> clean the API.
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Otherwise how do we normalize - with a single API - the
>>>>> >>>>>>>>>>>>>> lifecycle of transforms?
>>>>> >>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers"
>>>>> >>>>>>>>>>>>>> <bc...@apache.org>
>>>>> >>>>>>>>>>>>>> a écrit :
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
>>>>> >>>>>>>>>>>>>>> DoFn's
>>>>> >>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary,
>>>>> >>>>>>>>>>>>>>> it is around an
>>>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
>>>>> >>>>>>>>>>>>>>> discussions/proposals
>>>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
>>>>> >>>>>>>>>>>>>>> haven't been
>>>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do
>>>>> >>>>>>>>>>>>>>> a
>>>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the
>>>>> >>>>>>>>>>>>>>> chance of
>>>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
>>>>> >>>>>>>>>>>>>>> destination).
>>>>> >>>>>>>>>>>>>>>
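The three FileIO-style steps above can be sketched as follows (a hypothetical helper using `java.nio` directly, not Beam's actual FileIO internals; names like `ShardedWriteSketch` are illustrative). Step 3 is exactly the part that needs a reliable "transform is done" hook rather than per-DoFn teardown.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class ShardedWriteSketch {
    // Step 1: each worker writes its shard to a temporary file.
    public static Path writeTempShard(Path tmpDir, int shard, String data) throws IOException {
        Path p = tmpDir.resolve("shard-" + shard + ".tmp");
        Files.writeString(p, data);
        return p;
    }

    // Step 2: once ALL shards are complete, one worker moves them into place,
    // so readers never observe a partial result set.
    public static List<Path> finalizeShards(List<Path> tmpShards, Path destDir) throws IOException {
        List<Path> finals = new ArrayList<>();
        for (Path p : tmpShards) {
            String name = p.getFileName().toString().replace(".tmp", "");
            finals.add(Files.move(p, destDir.resolve(name), StandardCopyOption.REPLACE_EXISTING));
        }
        return finals;
    }

    // Step 3: cleanup of any leftover temporaries (e.g. from failed attempts).
    // This must run after the whole transform finishes, not per DoFn instance.
    public static void cleanupLeftovers(Path tmpDir) throws IOException {
        try (DirectoryStream<Path> ds = Files.newDirectoryStream(tmpDir, "*.tmp")) {
            for (Path p : ds) Files.delete(p);
        }
    }
}
```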
>>>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
>>>>> >>>>>>>>>>>>>>> workers,
>>>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one
>>>>> >>>>>>>>>>>>>>> worker. This
>>>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
>>>>> >>>>>>>>>>>>>>> stuff done to ensure it
>>>>> >>>>>>>>>>>>>>> runs on one worker.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule
>>>>> >>>>>>>>>>>>>>> some cleanup work for
>>>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
>>>>> >>>>>>>>>>>>>>> relatively straightforward,
>>>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
>>>>> >>>>>>>>>>>>>>> such as BigQuery
>>>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
>>>>> >>>>>>>>>>>>>>> into BigQuery.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you
>>>>> >>>>>>>>>>>>>>> want to
>>>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
>>>>> >>>>>>>>>>>>>>> wait until the end of
>>>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until
>>>>> >>>>>>>>>>>>>>> you know nobody will
>>>>> >>>>>>>>>>>>>>> need the resource anymore.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
>>>>> >>>>>>>>>>>>>>> where
>>>>> >>>>>>>>>>>>>>> you could have a transform that output resource
>>>>> >>>>>>>>>>>>>>> objects. Each resource
>>>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
>>>>> >>>>>>>>>>>>>>> would be something
>>>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
>>>>> >>>>>>>>>>>>>>> resource, and what
>>>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
>>>>> >>>>>>>>>>>>>>> that part of the
>>>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no
>>>>> >>>>>>>>>>>>>>> longer need the resources,
>>>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
>>>>> >>>>>>>>>>>>>>> shutdown, or
>>>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
>>>>> >>>>>>>>>>>>>>> case?
>>>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
>>>>> >>>>>>>>>>>>>>> sufficient?
>>>>> >>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>>>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain, simplifying the
>>>>> >>>>>>>>>>>>>>>> overall
>>>>> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a
>>>>> >>>>>>>>>>>>>>>> thread of a worker - has
>>>>> >>>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage
>>>>> >>>>>>>>>>>>>>>> collection.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
>>>>> >>>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> What I want is for any "new" to be followed by a setup
>>>>> >>>>>>>>>>>>>>>> before
>>>>> >>>>>>>>>>>>>>>> any process or startBundle, and, the last time beam has
>>>>> >>>>>>>>>>>>>>>> the instance before it
>>>>> >>>>>>>>>>>>>>>> is gc-ed and after the last finishBundle, for it to call
>>>>> >>>>>>>>>>>>>>>> teardown.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> It is as simple as that.
>>>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way that
>>>>> >>>>>>>>>>>>>>>> makes a fn not
>>>>> >>>>>>>>>>>>>>>> self-contained to implement basic transforms.
>>>>> >>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax"
>>>>> >>>>>>>>>>>>>>>> <re...@google.com> a
>>>>> >>>>>>>>>>>>>>>> écrit :
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>>>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
>>>>> >>>>>>>>>>>>>>>>>> Rather
>>>>> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing
>>>>> >>>>>>>>>>>>>>>>>> methods -- which have been
>>>>> >>>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it would
>>>>> >>>>>>>>>>>>>>>>>> be helpful to focus
>>>>> >>>>>>>>>>>>>>>>>> more on the reason you are looking for something
>>>>> >>>>>>>>>>>>>>>>>> with different semantics.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
>>>>> >>>>>>>>>>>>>>>>>> trying
>>>>> >>>>>>>>>>>>>>>>>> to do):
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline.
>>>>> >>>>>>>>>>>>>>>>>> If this is the case,
>>>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
>>>>> >>>>>>>>>>>>>>>>>> once (and not once per
>>>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
>>>>> >>>>>>>>>>>>>>>>>> know when the pipeline
>>>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
>>>>> >>>>>>>>>>>>>>>>>> step X", then what
>>>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> When the DoFn is no longer needed logically, i.e.
>>>>> >>>>>>>>>>>>>>>>>> when the
>>>>> >>>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or
>>>>> >>>>>>>>>>>>>>>>>> by a JVM shutdown)
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers,
>>>>> >>>>>>>>>>>>>>>>> and each
>>>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy
>>>>> >>>>>>>>>>>>>>>>> of the same DoFn). How
>>>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 =
>>>>> >>>>>>>>>>>>>>>>> 1M cleanups) and when
>>>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
>>>>> >>>>>>>>>>>>>>>>> shut down? When an
>>>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
>>>>> >>>>>>>>>>>>>>>>> temporary - may be
>>>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
>>>>> >>>>>>>>>>>>>>>>>> methods are not a good fit
>>>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
>>>>> >>>>>>>>>>>>>>>>>> within the DoFn), you could
>>>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
>>>>> >>>>>>>>>>>>>>>>>> produced. For instance:
>>>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token
>>>>> >>>>>>>>>>>>>>>>>> that
>>>>> >>>>>>>>>>>>>>>>>> stores information about resources)
>>>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
>>>>> >>>>>>>>>>>>>>>>>> retries
>>>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
>>>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data
>>>>> >>>>>>>>>>>>>>>>>> it is
>>>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
>>>>> >>>>>>>>>>>>>>>>>> use or have been finished
>>>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
>>>>> >>>>>>>>>>>>>>>>>> important to ensuring
>>>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>> >>>>>>>>>>>>>>>>>> industrialize some API on top of beam.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it
>>>>> >>>>>>>>>>>>>>>>>> is
>>>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are
>>>>> >>>>>>>>>>>>>>>>>> trying to accomplish? That
>>>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
>>>>> >>>>>>>>>>>>>>>>>> existing options and
>>>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
>>>>> >>>>>>>>>>>>>>>>>> cases, but that
>>>>> >>>>>>>>>>>>>>>>>> means each transform is different in its lifecycle
>>>>> >>>>>>>>>>>>>>>>>> handling. I dislike
>>>>> >>>>>>>>>>>>>>>>>> that a lot at scale and as a user, since you can't
>>>>> >>>>>>>>>>>>>>>>>> put any unified
>>>>> >>>>>>>>>>>>>>>>>> practice on top of beam; it also makes beam very
>>>>> >>>>>>>>>>>>>>>>>> hard to integrate or to use
>>>>> >>>>>>>>>>>>>>>>>> to build higher-level libraries or software.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> This is why I tried to not start the workaround
>>>>> >>>>>>>>>>>>>>>>>> discussions and just stay at API level.
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> -- Ben
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>>>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many
>>>>> >>>>>>>>>>>>>>>>>>>> of the
>>>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
>>>>> >>>>>>>>>>>>>>>>>>>> machine.
>>>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
>>>>> >>>>>>>>>>>>>>>>>>>> impossible or impractical
>>>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or
>>>>> >>>>>>>>>>>>>>>>>>>> you can list some of the
>>>>> >>>>>>>>>>>>>>>>>>>> examples above.
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
>>>>> >>>>>>>>>>>>>>>>>>>> called - it's not just
>>>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
>>>>> >>>>>>>>>>>>>>>>>>>> important (e.g. cleaning up
>>>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
>>>>> >>>>>>>>>>>>>>>>>>>> large number of VMs you
>>>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of
>>>>> >>>>>>>>>>>>>>>>>>>> the other methods that
>>>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at
>>>>> >>>>>>>>>>>>>>>>>>>> a cost, e.g. no
>>>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so
>>>>> >>>>>>>>>>>>>>>>>>> I am not sure
>>>>> >>>>>>>>>>>>>>>>>>> which other method you speak about.
>>>>> >>>>>>>>>>>>>>>>>>> Concretely, if you make it really
>>>>> >>>>>>>>>>>>>>>>>>> unreliable - this is what best effort sounds like to
>>>>> >>>>>>>>>>>>>>>>>>> me - then users cannot use it
>>>>> >>>>>>>>>>>>>>>>>>> to clean anything, but if you make it "it can fail
>>>>> >>>>>>>>>>>>>>>>>>> to be called, but that is unexpected and
>>>>> >>>>>>>>>>>>>>>>>>> means something went wrong", then it is fine to have a
>>>>> >>>>>>>>>>>>>>>>>>> manual - or automatic if fancy
>>>>> >>>>>>>>>>>>>>>>>>> - recovery procedure. This is where it makes all
>>>>> >>>>>>>>>>>>>>>>>>> the difference and impacts
>>>>> >>>>>>>>>>>>>>>>>>> the developers and ops (all users basically).
>>>>> >>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means
>>>>> >>>>>>>>>>>>>>>>>>>>> that. It
>>>>> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is
>>>>> >>>>>>>>>>>>>>>>>>>>> what triggered this thread.
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
>>>>> >>>>>>>>>>>>>>>>>>>>> prevents
>>>>> >>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be very
>>>>> >>>>>>>>>>>>>>>>>>>>> badly and wrongly
>>>>> >>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
>>>>> >>>>>>>>>>>>>>>>>>>>> LinkedIn |
>>>>> >>>>>>>>>>>>>>>>>>>>> Book
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>>>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call
>>>>> >>>>>>>>>>>>>>>>>>>>>> it:
>>>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
>>>>> >>>>>>>>>>>>>>>>>>>>>> crash), and in a number of
>>>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
>>>>> >>>>>>>>>>>>>>>>>>>>>> container has crashed (eg user code
>>>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over
>>>>> >>>>>>>>>>>>>>>>>>>>>> JNI and it segfaulted), JVM
>>>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
>>>>> >>>>>>>>>>>>>>>>>>>>>> worker has lost network
>>>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't
>>>>> >>>>>>>>>>>>>>>>>>>>>> be able to do anything
>>>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a
>>>>> >>>>>>>>>>>>>>>>>>>>>> preemptible VM and it was preempted by
>>>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or
>>>>> >>>>>>>>>>>>>>>>>>>>>> if the worker was too busy
>>>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
>>>>> >>>>>>>>>>>>>>>>>>>>>> functions) until the preemption
>>>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
>>>>> >>>>>>>>>>>>>>>>>>>>>> simply failed (which
>>>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
>>>>> >>>>>>>>>>>>>>>>>>>>>> conditions.
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
>>>>> >>>>>>>>>>>>>>>>>>>>>> describe
>>>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
>>>>> >>>>>>>>>>>>>>>>>>>>>> cases where you observed a
>>>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
>>>>> >>>>>>>>>>>>>>>>>>>>>> was possible to call it but
>>>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>>>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Manni-Bucau
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> (e.g.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> it requires the following
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> logic and the following processing
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> you.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> requiring a
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> since size is not controlled.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesn't allow you to release
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> the connection since it is a best
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> effort thing. Not releasing the connection
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> makes you pay a lot - aws ;) - or
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> prevents you from launching other processing -
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> concurrent limit.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> If
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called then nothing else can be
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> AWS service are you thinking of
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> everything at the other end has died?
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless,
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> but
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require some
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> closing exchanges which are not
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> only "I'm leaving".
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For aws i was thinking about starting some
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> services
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly in a pipeline startup
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> and closing them at the end.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> and money. You can say it can be
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if beam can't
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle its
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> components' lifecycle, it can't be used at scale
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> for generic pipelines and is
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown -
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> ignoring
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar crash case, which can't be
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handled by any human system?
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Nothing, technically. Why do you push to not
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle it? Is it due to some
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> legacy code on dataflow or something else?
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> implemented
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> kind of change you're asking
>>>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is
>>>>> >>>>>>>>>>>>>>>>>>>>>>> not
>>>>> >>>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> runner
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> he get a different behavior
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in prod? Also don't forget the user doesn't
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> know what the IOs he composes use,
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> so this is so impacting for the whole product
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> that it must be handled IMHO.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in the big
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> data world, but it is not a reason to ignore
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> what people did for years and do
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> it wrong before doing right ;).
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent us from
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions - the
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> execution of teardown. Then we
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> see if we can handle it, and only if there is
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> a technical reason we can't do we
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> make it experimental/unsupported in the API.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I know spark and flink can; any
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> java
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> (the software enclosing beam) is fully
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> unhandled and your overall system is
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> uncontrolled. The only case where this is not
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> a vendor and never installed on
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> a customer environment. In that case it belongs
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> to the vendor to handle the beam
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> API and not to beam to adjust its API for a
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> vendor - otherwise all
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> made optional, right?
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
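The shutdown-hook point can be illustrated with plain JDK APIs (this is standard java.lang.Runtime behavior, not anything Beam-specific): a normal exit or a regular kill (SIGTERM) runs registered hooks, while only a hard kill -9 or a JVM crash bypasses them, which is why a graceful stop procedure leaves room for a last teardown attempt.

```java
public class ShutdownHookSketch {
    public static void main(String[] args) {
        Thread cleanup = new Thread(() -> {
            // Last-chance cleanup: close pooled connections, release
            // external resources, etc. Runs on System.exit and SIGTERM,
            // but NOT on SIGKILL (kill -9) or a hard JVM crash.
            System.out.println("teardown attempted");
        });
        Runtime.getRuntime().addShutdownHook(cleanup);

        System.out.println("pipeline running");
        // Normal return from main triggers the hook before the JVM exits.
    }
}
```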
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> All state is not about network, even in
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> distributed
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> and defined lifecycle.
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>>>>>>
>>>>> >>>>>>>>>>>>
>>>>> >>>>>>>>>>>
>>>>> >>>>>>>>>>
>>>>> >>>>>>>>>
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >
>>>>
>>>>
>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 21 févr. 2018 07:26, "Reuven Lax" <re...@google.com> a écrit :

To close the loop here:

Romain, I think your actual concern was that the Javadoc made it sound like
a runner could simply decide not to call Teardown. If so, then I agree with
you - the Javadoc was misleading (and appears it was confusing to Ismael as
well). If a runner destroys a DoFn, it _must_ call TearDown before it calls
Setup on a new DoFn.


95% yes. The remaining 5% being that a runner+setup must do its best to call
it whatever happens, and the setup must allow it (avoiding blind kills as a
normal stop procedure, for instance).


If so, then most of the back and forth on this thread had little to do with
your actual concern. However it did take almost three days of discussion
before Eugene understood what your real concern was, leading to the side
discussions.


The underlying issue which popped up is that beam doesn't yet give much
importance to the lifecycle of the instances it manages in a lot of places,
so I'm quite careful when a potential regression happens at the API level.
This is key if beam is to enable building DSLs and other APIs on top of
itself.

Now I'm not sure what was unclear in the first mail; happy to get feedback
on it - feel free to ping me offline to not bother everyone ;).

And thanks for the fix again.



Reuven

On Mon, Feb 19, 2018 at 6:08 PM, Reuven Lax <re...@google.com> wrote:

> +1 This PR clarifies the semantics quite a bit.
>
> On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov <ki...@google.com>
> wrote:
>
>> I've sent out a PR editing the Javadoc https://github.com/apa
>> che/beam/pull/4711 . Hopefully, that should be sufficient.
>>
>> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
>>
>>> Ismael, your understanding is appropriate for FinishBundle.
>>>
>>> One basic issue with this understanding, is that the lifecycle of a DoFn
>>> is much longer than a single bundle (which I think you expressed by adding
>>> the *s). How long the DoFn lives is not defined. In fact a runner is
>>> completely free to decide that it will _never_ destroy the DoFn, in which
>>> case TearDown is never called simply because the DoFn was never torn down.
>>>
>>> Also, as mentioned before, the runner can only call TearDown in cases
>>> where the shutdown is in its control. If the JVM is shut down externally,
>>> the runner has no chance to call TearDown. This means that while TearDown
>>> is appropriate for cleaning up in-process resources (open connections,
>>> etc.), it's not the right answer for cleaning up persistent resources. If
>>> you rely on TearDown to delete VMs or delete files, there will be cases in
>>> which those files or VMs are not deleted.
>>>
>>> What we are _not_ saying is that the runner is free to just ignore
>>> TearDown. If the runner is explicitly destroying a DoFn object, it should
>>> call TearDown.
>>>
>>> Reuven
>>>
>>>
>>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> I also had a different understanding of the lifecycle of a DoFn.
>>>>
>>>> My understanding of the use case for every method in the DoFn was clear
>>>> and
>>>> perfectly aligned with Thomas' explanation, but what I understood was
>>>> that in
>>>> general terms ‘@Setup was where I got resources/prepare connections and
>>>> @Teardown where I free them’, so calling Teardown seemed essential to
>>>> have a
>>>> complete lifecycle:
>>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
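The lifecycle above can be sketched as a small driver that guarantees teardown via try/finally (a plain-Java sketch with no Beam dependency; all names are illustrative, not Beam API):

```java
import java.util.List;

// Minimal sketch of the lifecycle discussed above:
// Setup -> (StartBundle -> ProcessElement* -> FinishBundle)* -> Teardown,
// where teardown is guaranteed by try/finally as long as the JVM survives.
class LifecycleDriver {
    interface Fn<T> {
        default void setup() {}
        default void startBundle() {}
        void processElement(T element);
        default void finishBundle() {}
        default void teardown() {}
    }

    static <T> void run(Fn<T> fn, List<List<T>> bundles) {
        fn.setup();
        try {
            for (List<T> bundle : bundles) {
                fn.startBundle();
                for (T element : bundle) {
                    fn.processElement(element);
                }
                fn.finishBundle();
            }
        } finally {
            // Called on success and on in-process failure alike; only a JVM
            // crash or an external kill can skip this.
            fn.teardown();
        }
    }
}
```

Under such a contract, teardown is skipped only when the process itself dies, which is exactly the distinction the rest of this thread debates.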
>>>>
>>>> The fact that @Teardown could not be called is a new detail for me too,
>>>> and I
>>>> also find it weird to have a method that may or may not be called as part of
>>>> an API,
>>>> why would users implement teardown if it will not be called? In that
>>>> case
>>>> probably a cleaner approach would be to get rid of that method
>>>> altogether, no?
>>>>
>>>> But well maybe that’s not so easy too, there was another point: Some
>>>> user
>>>> reported an issue with leaking resources using KafkaIO in the Spark
>>>> runner, for
>>>> ref.
>>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>>>>
>>>> In that moment my understanding was that there was something fishy
>>>> because we
>>>> should be calling Teardown to correctly close the connections and free
>>>> the
>>>> resources in case of exceptions on start/process/finish, so I filed a
>>>> JIRA and
>>>> fixed this by enforcing the call of teardown for the Spark runner and
>>>> the Flink
>>>> runner:
>>>> https://issues.apache.org/jira/browse/BEAM-3187
>>>> https://issues.apache.org/jira/browse/BEAM-3244
>>>>
>>>> As you can see not calling this method does have consequences at least
>>>> for
>>>> non-containerized runners. Of course a runner that uses containers
>>>> could not
>>>> care about cleaning the resources this way, but a long living JVM in a
>>>> Hadoop
>>>> environment probably won’t have the same luck. So I am not sure that
>>>> having a
>>>> loose semantic there is the right option, I mean, runners could simply
>>>> guarantee
>>>> that they call teardown and if teardown takes too long they can decide
>>>> to send a
>>>> signal or kill the process/container/etc and go ahead, that way at
>>>> least users
>>>> would have a motivation to implement the teardown method, otherwise it
>>>> doesn’t
>>>> make any sense to have it (API wise).
>>>>
>>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <
>>>> kirpichov@google.com> wrote:
>>>> > Romain, would it be fair to say that currently the goal of your
>>>> > participation in this discussion is to identify situations where
>>>> @Teardown
>>>> > in principle could have been called, but some of the current runners
>>>> don't
>>>> > make a good enough effort to call it? If yes - as I said before,
>>>> please, by
>>>> > all means, file bugs of the form "Runner X doesn't call @Teardown in
>>>> > situation Y" if you're aware of any, and feel free to send PRs fixing
>>>> runner
>>>> > X to reliably call @Teardown in situation Y. I think we all agree
>>>> that this
>>>> > would be a good improvement.
>>>> >
>>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>>>> >> <rm...@gmail.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>>>> >>>
>>>> >>> How do you call teardown? There are cases in which the Java code
>>>> gets no
>>>> >>> indication that the restart is happening (e.g. cases where the
>>>> machine
>>>> >>> itself is taken down)
>>>> >>>
>>>> >>>
>>>> >>> This is a bug, 0 downtime maintenance is very doable in 2018 ;).
>>>> Crashes
>>>> >>> are bugs, and kill -9 to shutdown is a bug too. Other cases can call
>>>> shutdown
>>>> >>> with a hook in the worst case.
>>>> >>
>>>> >>
>>>> >> What you say here is simply not true.
>>>> >>
>>>> >> There are many scenarios in which workers shutdown with no
>>>> opportunity for
>>>> >> any sort of shutdown hook. Sometimes the entire machine gets
>>>> shutdown, and
>>>> >> not even the OS will have much of a chance to do anything. At scale
>>>> this
>>>> >> will happen with some regularity, and a distributed system that
>>>> assumes this
>>>> >> will not happen is a poor distributed system.
>>>> >>
>>>> >>
>>>> >> This is part of the infra and there is no reason the machine is
>>>> shutdown
>>>> >> without shutting down what runs on it before except if it is a bug
>>>> in the
>>>> >> software or setup. I hear that you maybe don't do it everywhere, but
>>>> there is
>>>> >> no blocker to doing it. It means you can shut down the machines and
>>>> guarantee
>>>> >> teardown is called.
>>>> >>
>>>> >> Where I'm going is simply that it is doable and the Beam SDK core can assume
>>>> setup
>>>> >> is well done. If there is a best effort downside due to that - with
>>>> the
>>>> >> meaning you defined - it is an impl bug or a user installation issue.
>>>> >>
>>>> >> Technically all is true.
>>>> >>
>>>> >> What can prevent teardown is a hardware failure or so. This is fine
>>>> and
>>>> >> doesn't need to be in the doc since it is life in IT and obvious, or must
>>>> be very
>>>> >> explicit to avoid current ambiguity.
>>>> >>
>>>> >>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com>
>>>> >>> wrote:
>>>> >>>>
>>>> >>>> Restarting doesn't mean you don't call teardown. Except for a bug there
>>>> is no
>>>> >>>> reason - technically - for it to happen, no reason.
>>>> >>>>
>>>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>>>> >>>>>
>>>> >>>>> Workers restarting is not a bug; it's standard and often expected.
>>>> >>>>>
>>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>>>> >>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Nothing, as mentioned it is a bug so recovery is a bug recovery
>>>> >>>>>> (procedure)
>>>> >>>>>>
>>>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com>
>>>> a
>>>> >>>>>> écrit :
>>>> >>>>>>>
>>>> >>>>>>> So what would you like to happen if there is a crash? The DoFn
>>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
>>>> exists. What
>>>> >>>>>>> should Teardown be called on?
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>>>> >>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> This is what i want and not 999999 teardowns for 1000000 setups
>>>> >>>>>>>> until there is an unexpected crash (= a bug).
>>>> >>>>>>>>
>>>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a
>>>> écrit :
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>>>> >>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256 but
>>>> leads to
>>>> >>>>>>>>>>>> the same pattern, the teardown is just a "if
>>>> (iCreatedThem) releaseThem();"
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
>>>> workers,
>>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
>>>> created per
>>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
>>>> threads on each
>>>> >>>>>>>>>>> worker.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Nope, it was the other way around: in this case on AWS you can
>>>> get 256
>>>> >>>>>>>>>> instances at once but not 512 (which would be 2x256). So when
>>>> you compute the
>>>> >>>>>>>>>> distribution you allocate to some fn the role of owning the
>>>> instance lookup and
>>>> >>>>>>>>>> release.
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> I still don't understand. Let's be more precise. If you write
>>>> the
>>>> >>>>>>>>> following code:
>>>> >>>>>>>>>
>>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>> >>>>>>>>>
>>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
>>>> >>>>>>>>> created. The runner might decide to create a million
>>>> instances of this
>>>> >>>>>>>>> class across your worker pool, which means that you will get
>>>> a million Setup
>>>> >>>>>>>>> and Teardown calls.
>>>> >>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Anyway this was just an example of an external resource you
>>>> must
>>>> >>>>>>>>>> release. The real topic is that beam should define asap a
>>>> guaranteed generic
>>>> >>>>>>>>>> lifecycle to let users embrace its programming model.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> @Eugene:
>>>> >>>>>>>>>>>> 1. wait logic is about passing the value which is not
>>>> always
>>>> >>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>> >>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
>>>> implies bean
>>>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
>>>> sources and dofn (not
>>>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> A. Source
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>> >>>>>>>>>>>> estimateSize and split. As an user you can expect both to
>>>> be called on the
>>>> >>>>>>>>>>>> same bean instance to avoid to pay the same connection
>>>> cost(s) twice.
>>>> >>>>>>>>>>>> Concretely:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> connect()
>>>> >>>>>>>>>>>> try {
>>>> >>>>>>>>>>>>   estimateSize()
>>>> >>>>>>>>>>>>   split()
>>>> >>>>>>>>>>>> } finally {
>>>> >>>>>>>>>>>>   disconnect()
>>>> >>>>>>>>>>>> }
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> connect()
>>>> >>>>>>>>>>>> try {
>>>> >>>>>>>>>>>>   estimateSize()
>>>> >>>>>>>>>>>> } finally {
>>>> >>>>>>>>>>>>   disconnect()
>>>> >>>>>>>>>>>> }
>>>> >>>>>>>>>>>> connect()
>>>> >>>>>>>>>>>> try {
>>>> >>>>>>>>>>>>   split()
>>>> >>>>>>>>>>>> } finally {
>>>> >>>>>>>>>>>>   disconnect()
>>>> >>>>>>>>>>>> }
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> + a workaround with an internal estimate size since this
>>>> >>>>>>>>>>>> primitive is often called in split but you don't want to
>>>> connect twice in the
>>>> >>>>>>>>>>>> second phase.
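The internal-estimate workaround mentioned here can be sketched like so (illustrative names, no Beam types; `Connector` and `Connection` are hypothetical stand-ins for whatever opens the expensive connection):

```java
// Sketch of the workaround: the size computed while connected is cached, so
// split() reuses it instead of opening a second connection in the second phase.
class CachedEstimateSource {
    interface Connection extends AutoCloseable {
        long countBytes();
        @Override void close();  // narrowed override: no checked exception
    }
    interface Connector { Connection connect(); }

    private final Connector connector;  // hypothetical connection factory
    private long cachedEstimate = -1;

    CachedEstimateSource(Connector connector) { this.connector = connector; }

    long estimateSize() {
        if (cachedEstimate < 0) {
            try (Connection c = connector.connect()) {  // connect once...
                cachedEstimate = c.countBytes();        // ...and remember it
            }
        }
        return cachedEstimate;
    }

    int split(long desiredBundleSizeBytes) {
        // Reuses the cached estimate rather than reconnecting.
        long size = estimateSize();
        return (int) Math.max(1, size / desiredBundleSizeBytes);
    }
}
```

With a guaranteed init/destroy lifecycle on the source bean, this caching dance would be unnecessary: one connection could span both primitives.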
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Why do you need that? Simply because you want to define an
>>>> API to
>>>> >>>>>>>>>>>> implement sources which initializes the source bean and
>>>> destroys it.
>>>> >>>>>>>>>>>> I insist it is a very, very basic concern for such an API.
>>>> However
>>>> >>>>>>>>>>>> beam doesn't embrace it and doesn't assume it, so building
>>>> any API on top of
>>>> >>>>>>>>>>>> beam is very painful today, and for direct beam users you
>>>> hit the exact same
>>>> >>>>>>>>>>>> issues - check how IOs are implemented, the static
>>>> utilities which create
>>>> >>>>>>>>>>>> volatile connections preventing reuse of an existing
>>>> connection in a single
>>>> >>>>>>>>>>>> method
>>>> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> B. DoFn & SDF
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> As a fn dev you expect the same from the beam runtime:
>>>> init();
>>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and
>>>> that it is
>>>> >>>>>>>>>>>> executed on the exact same instance to be able to be
>>>> stateful at that level
>>>> >>>>>>>>>>>> for expensive connections/operations/flow state handling.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> As you mentionned with the million example, this sequence
>>>> should
>>>> >>>>>>>>>>>> happen for each single instance so 1M times for your
>>>> example.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore
>>>> it creates way
>>>> >>>>>>>>>>>> more instances and requires to have a way more
>>>> strict/explicit definition of
>>>> >>>>>>>>>>>> the exact lifecycle and which instance does what. Since
>>>> beam handles the
>>>> >>>>>>>>>>>> full lifecycle of the bean instances it must provide
>>>> init/destroy hooks
>>>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Take the JDBC example which was mentioned earlier.
>>>> >>>>>>>>>>>> Today, because of the teardown issue it uses bundles.
>>>> Since bundles size is
>>>> >>>>>>>>>>>> not defined - and will not with SDF, it must use a pool to
>>>> be able to reuse
>>>> >>>>>>>>>>>> a connection instance to not hurt performance. Now
>>>> with the SDF and the
>>>> >>>>>>>>>>>> split increase, how do you handle the pool size? Generally
>>>> in batch you use
>>>> >>>>>>>>>>>> a single connection per thread to avoid consuming all
>>>> database connections.
>>>> >>>>>>>>>>>> With a pool you have 2 choices: 1. use a pool of 1, 2. use
>>>> a pool a bit
>>>> >>>>>>>>>>>> higher but multiplied by the number of beans you will
>>>> likely x2 or 3 the
>>>> >>>>>>>>>>>> connection count and make the execution fail with "no more
>>>> connection
>>>> >>>>>>>>>>>> available". I you picked 1 (pool of #1), then you still
>>>> have to have a
>>>> >>>>>>>>>>>> reliable teardown by pool instance (close() generally) to
>>>> ensure you release
>>>> >>>>>>>>>>>> the pool and don't leak the connection information in the
>>>> JVM. In all cases
>>>> >>>>>>>>>>>> you come back to the init()/destroy() lifecycle even if
>>>> you fake to get
>>>> >>>>>>>>>>>> connections with bundles.
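The "pool of 1 per fn instance" pattern this paragraph argues about might look like the following sketch (`ConnectionProvider` and `Conn` are illustrative stand-ins for a JDBC DataSource and java.sql.Connection; not Beam API):

```java
// Sketch: the connection is opened lazily, reused across bundles, and
// released exactly once in teardown - the "if (iCreatedThem) releaseThem();"
// pattern mentioned earlier in the thread.
class PooledFn {
    interface Conn extends AutoCloseable {
        void write(String row);
        @Override void close();  // narrowed override: no checked exception
    }
    interface ConnectionProvider { Conn open(); }

    private final ConnectionProvider provider;
    private Conn conn;  // the "pool" of size 1

    PooledFn(ConnectionProvider provider) { this.provider = provider; }

    void processElement(String row) {
        if (conn == null) {
            conn = provider.open();  // open once, whatever the bundle size is
        }
        conn.write(row);
    }

    void teardown() {
        if (conn != null) {  // release only what this instance created
            conn.close();
            conn = null;
        }
    }
}
```

If teardown is never invoked, the one connection per instance leaks, which is precisely the failure mode the Spark/Flink JIRAs above were about.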
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Just to make it obvious: SDF mentions are just cause SDF
>>>> imply
>>>> >>>>>>>>>>>> all the current issues with the loose definition of the
>>>> bean lifecycles at
>>>> >>>>>>>>>>>> an exponential level, nothing else.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Romain Manni-Bucau
>>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
>>>> can be
>>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
>>>> the thread above,
>>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
>>>> that.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main
>>>> author of
>>>> >>>>>>>>>>>>> most design documents related to SDF and of its
>>>> implementation in the Java
>>>> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to
>>>> the topic of
>>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> I kind of agree except transforms lack a lifecycle too.
>>>> My
>>>> >>>>>>>>>>>>>> understanding is that sdf could be a way to unify it and
>>>> clean the api.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of
>>>> >>>>>>>>>>>>>> transforms?
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
>>>> bchambers@apache.org>
>>>> >>>>>>>>>>>>>> a écrit :
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
>>>> DoFn's
>>>> >>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary,
>>>> it is around an
>>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
>>>> discussions/proposals
>>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
>>>> haven't been
>>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do
>>>> a
>>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the
>>>> chance of
>>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
>>>> destination).
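The three steps can be sketched in plain Java (paths, shard layout, and the `ShardedWrite` name are illustrative, not FileIO's actual implementation):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Sketch of the three FileIO-style steps listed above: write every shard to
// a temporary file, move them all to the final destination only once all are
// complete, then clean up any temp files left behind (e.g. after a failure).
class ShardedWrite {
    static List<Path> run(Path tmpDir, Path outDir, List<String> shards) throws IOException {
        List<Path> tmpFiles = new ArrayList<>();
        List<Path> finalFiles = new ArrayList<>();
        try {
            // 1. Write to a bunch (N shards) of temporary files.
            for (int i = 0; i < shards.size(); i++) {
                Path tmp = tmpDir.resolve("shard-" + i + ".tmp");
                Files.writeString(tmp, shards.get(i));
                tmpFiles.add(tmp);
            }
            // 2. When all temporary files are complete, move them into the
            //    final destination, so readers never see a partial result set.
            for (int i = 0; i < tmpFiles.size(); i++) {
                Path dest = outDir.resolve("shard-" + i + ".txt");
                Files.move(tmpFiles.get(i), dest);
                finalFiles.add(dest);
            }
        } finally {
            // 3. Cleanup all the temporary files that were not moved.
            for (Path tmp : tmpFiles) {
                Files.deleteIfExists(tmp);
            }
        }
        return finalFiles;
    }
}
```

Note that step 3 here runs at the end of the transform, not in any DoFn's @TearDown, which is exactly Ben's point about needing transform-level cleanup.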
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
>>>> workers,
>>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one
>>>> worker. This
>>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
>>>> stuff done to ensure it
>>>> >>>>>>>>>>>>>>> runs on one worker.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule
>>>> some cleanup work for
>>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
>>>> relatively straightforward,
>>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
>>>> such as BigQuery
>>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
>>>> into BigQuery.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you
>>>> want to
>>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
>>>> wait until the end of
>>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until
>>>> you know nobody will
>>>> >>>>>>>>>>>>>>> need the resource anymore.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
>>>> where
>>>> >>>>>>>>>>>>>>> you could have a transform that output resource
>>>> objects. Each resource
>>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
>>>> would be something
>>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
>>>> resource, and what
>>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
>>>> that part of the
>>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no
>>>> longer need the resources,
>>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
>>>> shutdown, or
>>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
>>>> case?
>>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
>>>> sufficient?
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>> >>>>>>>>>>>>>>>> execution. Each instance - one fn so likely in a
>>>> thread of a worker - has
>>>> >>>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage
>>>> collection.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> In practice, new is often an unsafe allocate
>>>> >>>>>>>>>>>>>>>> (deserialization) but it doesnt matter here.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> What I want is for any "new" to have a following setup
>>>> before
>>>> >>>>>>>>>>>>>>>> any process or startbundle, and the last time beam has
>>>> the instance before it
>>>> >>>>>>>>>>>>>>>> is gc-ed and after last finishbundle it calls teardown.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> It is as simple as that.
>>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way making a fn not
>>>> self
>>>> >>>>>>>>>>>>>>>> contained to implement basic transforms.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com>
>>>> a
>>>> >>>>>>>>>>>>>>>> écrit :
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
>>>> Rather
>>>> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing
>>>> methods -- which have been
>>>> >>>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it would
>>>> be helpful to focus
>>>> >>>>>>>>>>>>>>>>>> more on the reason you are looking for something
>>>> with different semantics.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
>>>> trying
>>>> >>>>>>>>>>>>>>>>>> to do):
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline.
>>>> If this is the case,
>>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
>>>> once (and not once per
>>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
>>>> know when the pipeline
>>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
>>>> step X", then what
>>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the
>>>> >>>>>>>>>>>>>>>>>> batch is done or stream is stopped (manually or by a
>>>> jvm shutdown)
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers,
>>>> and each
>>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy
>>>> of the same DoFn). How
>>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 =
>>>> 1M cleanups) and when
>>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
>>>> shut down? When an
>>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
>>>> temporary - may be
>>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
>>>> methods are not a good fit
>>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
>>>> within the DoFn), you could
>>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
>>>> produced. For instance:
>>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token
>>>> that
>>>> >>>>>>>>>>>>>>>>>> stores information about resources)
>>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
>>>> retries
>>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
>>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data
>>>> it is
>>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
>>>> use or have been finished
>>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
>>>> important to ensuring
>>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
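A toy sketch of this "resources as data" flow (a single-threaded stand-in; `ResourceManager` is an illustrative interface, and a real pipeline would thread the IDs through PCollections with the deterministic-input barriers described above):

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Toy sketch of the idea above: resource IDs are produced as ordinary
// elements, the segment that needs them runs, and a final cleanup step frees
// every ID it received - cleanup is driven by the data flowing through, not
// by DoFn teardown.
class ResourceCleanupPipeline {
    interface ResourceManager {
        String allocate();     // steps a+c: generate an ID and init the resource
        void free(String id);  // step f: free the resource behind the ID
    }

    static List<String> run(ResourceManager mgr, List<String> inputs,
                            Function<String, String> work) {
        List<String> outputs = new ArrayList<>();
        Set<String> inUse = new LinkedHashSet<>();
        for (String input : inputs) {
            inUse.add(mgr.allocate());       // resource ID travels with the data
            outputs.add(work.apply(input));  // step d: segment using the resource
        }
        // steps e+f: once the segment is known to be done, free everything.
        for (String id : inUse) {
            mgr.free(id);
        }
        return outputs;
    }
}
```

Because the IDs are data, every allocation is matched by a free as long as the pipeline itself completes, independent of how many DoFn instances existed or whether their teardown ran.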
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>> >>>>>>>>>>>>>>>>>> industrialize some API on top of beam.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it
>>>> is
>>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are
>>>> trying to accomplish? That
>>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
>>>> existing options and
>>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
>>>> cases, but that
>>>> >>>>>>>>>>>>>>>>>> means each transform is different in its lifecycle
>>>> handling, and I
>>>> >>>>>>>>>>>>>>>>>> dislike it a lot at scale and as a user since you
>>>> can't put any unified
>>>> >>>>>>>>>>>>>>>>>> practice on top of beam; it also makes beam very
>>>> hard to integrate or to use
>>>> >>>>>>>>>>>>>>>>>> to build higher-level libraries or software.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
>>>> >>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> -- Ben
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many
>>>> of the
>>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
>>>> machine.
>>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
>>>> impossible or impractical
>>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or
>>>> you can list some of the
>>>> >>>>>>>>>>>>>>>>>>>> examples above.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
>>>> called - it's not just
>>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
>>>> important (e.g. cleaning up
>>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
>>>> large number of VMs you
>>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of
>>>> the other methods that
>>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at
>>>> a cost, e.g. no
>>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so
>>>> I'm not sure
>>>> >>>>>>>>>>>>>>>>>>> which other method you speak about.
>>>> Concretely, if you make it really
>>>> >>>>>>>>>>>>>>>>>>> unreliable - this is what best effort sounds like to me
>>>> - then users can't use it
>>>> >>>>>>>>>>>>>>>>>>> to clean anything, but if you make it "can happen
>>>> but it is unexpected and
>>>> >>>>>>>>>>>>>>>>>>> means something happened" then it is fine to have a
>>>> manual - or auto if fancy
>>>> >>>>>>>>>>>>>>>>>>> - recovery procedure. This is where it makes all
>>>> the difference and impacts
>>>> >>>>>>>>>>>>>>>>>>> the developers and ops (all users basically).
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means
>>>> that. It
>>>> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is
>>>> what triggered this thread.
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
>>>> prevents
>>>> >>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be very
>>>> badly and wrongly
>>>> >>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
>>>> LinkedIn |
>>>> >>>>>>>>>>>>>>>>>>>>> Book
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call
>>>> it:
>>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
>>>> crash), and in a number of
>>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
>>>> container has crashed (eg user code
>>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over
>>>> JNI and it segfaulted), JVM
>>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
>>>> worker has lost network
>>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't
>>>> be able to do anything
>>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a
>>>> preemptible VM and it was preempted by
>>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or
>>>> if the worker was too busy
>>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
>>>> functions) until the preemption
>>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
>>>> simply failed (which
>>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
>>>> conditions.
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
>>>> describe
>>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
>>>> cases where you observed a
>>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
>>>> was possible to call it but
>>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain
>>>> Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
>>>> Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
>>>> (e.g.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and
>>>> it requires the following
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
>>>> logic and the following processing
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform
>>>> requiring a
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
>>>> since size is not controlled.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesn't allow you to release
>>>> the connection since it is a best
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> effort thing. Not releasing the connection
>>>> makes you pay a lot - aws ;) - or
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> prevents you from launching other processing -
>>>> concurrent limit.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
>>>> called then nothing else can be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
>>>> AWS service are you thinking of
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
>>>> everything at the other end has died?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless,
>>>> but
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require some
>>>> closing exchanges which are not
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> only "I'm leaving".
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some
>>>> services
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly in a pipeline startup
>>>> and closing them at the end.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines
>>>> and money. You can say it can be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
>>>> handle its
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale
>>>> for generic pipelines and is
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> What does prevent to enforce teardown -
>>>> ignoring
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar crash case which cant be
>>>> handled by any human system?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Nothing technically. Why do you push to not
>>>> handle it? Is it due to some
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> legacy code on dataflow or something else?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and
>>>> implemented
>>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
>>>> kind of change you're asking
>>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is
>>>> not
>>>> >>>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The direct
>>>> runner
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will he
>>>> get a different behavior
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in prod? Also don't forget the user doesn't
>>>> know what the IOs he composes use,
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> so this is so impactful for the whole product
>>>> that it must be handled IMHO.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new
>>>> in big
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore
>>>> what people did for years and do
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> it wrong before doing right ;).
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions - the
>>>> execution of teardown. Then we
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> see if we can handle it, and only if there is
>>>> a technical reason we can't do we
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> make it experimental/unsupported in the API.
>>>> I know Spark and Flink can; any
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
>>>> Java
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment
>>>> (the software enclosing Beam) is fully
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> unhandled and your overall system is
>>>> uncontrolled. The only case where it is not
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by
>>>> a vendor and never installed on
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> a customer environment. In this case it belongs
>>>> to the vendor to handle the Beam
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> API, and not to Beam to adjust its API for a
>>>> vendor - otherwise all
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be
>>>> made optional, right?
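As a concrete illustration of the technical note above: a graceful kill (SIGTERM) does run registered JVM shutdown hooks, while kill -9 (SIGKILL) and hardware failures bypass them entirely. A minimal sketch of wiring teardown work into a hook (the hook body and class name are illustrative):

```java
// Sketch: registering teardown work as a JVM shutdown hook.
// A graceful SIGTERM runs hooks before the JVM exits; SIGKILL and
// hardware failures bypass them, which is the "best effort" gap.
public class TeardownHook {
    public static void main(String[] args) {
        Thread hook = new Thread(() -> System.out.println("teardown: releasing resources"));
        Runtime.getRuntime().addShutdownHook(hook);

        // removeShutdownHook returns true only if the hook was registered,
        // which lets us check the wiring without actually exiting the JVM.
        boolean registered = Runtime.getRuntime().removeShutdownHook(hook);
        System.out.println("registered=" + registered);
    }
}
```
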
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> All state is not about network, even in
>>>> distributed
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit
>>>> and defined lifecycle.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
To close the loop here:

Romain, I think your actual concern was that the Javadoc made it sound like
a runner could simply decide not to call Teardown. If so, then I agree with
you - the Javadoc was misleading (and appears it was confusing to Ismael as
well). If a runner destroys a DoFn, it _must_ call TearDown before it calls
Setup on a new DoFn.

If so, then most of the back and forth on this thread had little to do with
your actual concern. However it did take almost three days of discussion
before Eugene understood what your real concern was, leading to the side
discussions.

Reuven

On Mon, Feb 19, 2018 at 6:08 PM, Reuven Lax <re...@google.com> wrote:

> +1 This PR clarifies the semantics quite a bit.
>
> On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov <ki...@google.com>
> wrote:
>
>> I've sent out a PR editing the Javadoc https://github.com/apa
>> che/beam/pull/4711 . Hopefully, that should be sufficient.
>>
>> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
>>
>>> Ismael, your understanding is appropriate for FinishBundle.
>>>
>>> One basic issue with this understanding, is that the lifecycle of a DoFn
>>> is much longer than a single bundle (which I think you expressed by adding
>>> the *s). How long the DoFn lives is not defined. In fact a runner is
>>> completely free to decide that it will _never_ destroy the DoFn, in which
>>> case TearDown is never called simply because the DoFn was never torn down.
>>>
>>> Also, as mentioned before, the runner can only call TearDown in cases
>>> where the shutdown is in its control. If the JVM is shut down externally,
>>> the runner has no chance to call TearDown. This means that while TearDown
>>> is appropriate for cleaning up in-process resources (open connections,
>>> etc.), it's not the right answer for cleaning up persistent resources. If
>>> you rely on TearDown to delete VMs or delete files, there will be cases in
>>> which those files or VMs are not deleted.
>>>
>>> What we are _not_ saying is that the runner is free to just ignore
>>> TearDown. If the runner is explicitly destroying a DoFn object, it should
>>> call TearDown.
>>>
>>> Reuven
>>>
>>>
>>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:
>>>
>>>> I also had a different understanding of the lifecycle of a DoFn.
>>>>
>>>> My understanding of the use case for every method in the DoFn was clear
>>>> and
>>>> perfectly aligned with Thomas explanation, but what I understood was
>>>> that in a
>>>> general terms ‘@Setup was where I got resources/prepare connections and
>>>> @Teardown where I free them’, so calling Teardown seemed essential to
>>>> have a
>>>> complete lifecycle:
>>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
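The lifecycle Ismaël describes can be simulated outside Beam with a small driver loop. The names below (runLifecycle, the recorded call strings) are illustrative, not Beam API; the sketch only shows the ordering a user would expect a runner to guarantee:

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative driver simulating the DoFn lifecycle a runner would enforce:
// Setup once, then StartBundle / ProcessElement* / FinishBundle per bundle,
// then Teardown exactly once before the instance is discarded.
public class LifecycleDemo {
    static List<String> runLifecycle(List<List<String>> bundles) {
        List<String> calls = new ArrayList<>();
        calls.add("setup");
        for (List<String> bundle : bundles) {
            calls.add("startBundle");
            for (String element : bundle) {
                calls.add("process:" + element);
            }
            calls.add("finishBundle");
        }
        calls.add("teardown");
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(runLifecycle(List.of(List.of("a", "b"), List.of("c"))));
    }
}
```
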
>>>>
>>>> The fact that @Teardown might not be called is a new detail for me too,
>>>> and I
>>>> also find it weird to have a method that may or may not be called as part of
>>>> an API,
>>>> why would users implement teardown if it will not be called? In that
>>>> case
>>>> probably a cleaner approach would be to get rid of that method
>>>> altogether, no?
>>>>
>>>> But well maybe that’s not so easy too, there was another point: Some
>>>> user
>>>> reported an issue with leaking resources using KafkaIO in the Spark
>>>> runner, for
>>>> ref.
>>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>>>>
>>>> In that moment my understanding was that there was something fishy
>>>> because we
>>>> should be calling Teardown to close correctly the connections and free
>>>> the
>>>> resources in case of exceptions on start/process/finish, so I filed a
>>>> JIRA and
>>>> fixed this by enforcing the call of teardown for the Spark runner and
>>>> the Flink
>>>> runner:
>>>> https://issues.apache.org/jira/browse/BEAM-3187
>>>> https://issues.apache.org/jira/browse/BEAM-3244
>>>>
>>>> As you can see not calling this method does have consequences at least
>>>> for
>>>> non-containerized runners. Of course a runner that uses containers
>>>> might not
>>>> care about cleaning the resources this way, but a long-living JVM in a
>>>> Hadoop
>>>> environment probably won’t have the same luck. So I am not sure that
>>>> having a
>>>> loose semantic there is the right option, I mean, runners could simply
>>>> guarantee
>>>> that they call teardown and if teardown takes too long they can decide
>>>> to send a
>>>> signal or kill the process/container/etc and go ahead, that way at
>>>> least users
>>>> would have a motivation to implement the teardown method, otherwise it
>>>> doesn’t
>>>> make any sense to have it (API wise).
>>>>
>>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <
>>>> kirpichov@google.com> wrote:
>>>> > Romain, would it be fair to say that currently the goal of your
>>>> > participation in this discussion is to identify situations where
>>>> @Teardown
>>>> > in principle could have been called, but some of the current runners
>>>> don't
>>>> > make a good enough effort to call it? If yes - as I said before,
>>>> please, by
>>>> > all means, file bugs of the form "Runner X doesn't call @Teardown in
>>>> > situation Y" if you're aware of any, and feel free to send PRs fixing
>>>> runner
>>>> > X to reliably call @Teardown in situation Y. I think we all agree
>>>> that this
>>>> > would be a good improvement.
>>>> >
>>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com>
>>>> > wrote:
>>>> >>
>>>> >>
>>>> >>
>>>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>>>> >>
>>>> >>
>>>> >>
>>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>>>> >> <rm...@gmail.com> wrote:
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>>>> >>>
>>>> >>> How do you call teardown? There are cases in which the Java code
>>>> gets no
>>>> >>> indication that the restart is happening (e.g. cases where the
>>>> machine
>>>> >>> itself is taken down)
>>>> >>>
>>>> >>>
>>>> >>> This is a bug, 0 downtime maintenance is very doable in 2018 ;).
>>>> Crashes
>>>> >>> are bugs, and kill -9 to shut down is a bug too. Other cases can call
>>>> shutdown
>>>> >>> with a hook in the worst case.
>>>> >>
>>>> >>
>>>> >> What you say here is simply not true.
>>>> >>
>>>> >> There are many scenarios in which workers shutdown with no
>>>> opportunity for
>>>> >> any sort of shutdown hook. Sometimes the entire machine gets
>>>> shutdown, and
>>>> >> not even the OS will have much of a chance to do anything. At scale
>>>> this
>>>> >> will happen with some regularity, and a distributed system that
>>>> assumes this
>>>> >> will not happen is a poor distributed system.
>>>> >>
>>>> >>
>>>> >> This is part of the infra and there is no reason the machine is
>>>> shutdown
>>>> >> without shutting down what runs on it before except if it is a bug
>>>> in the
>>>> >> software or setup. I hear that you maybe don't do it everywhere, but
>>>> there is
>>>> >> no blocker to doing it. It means you can shut down the machines and
>>>> guarantee
>>>> >> teardown is called.
>>>> >>
>>>> >> Where I'm going is simply that it is doable and the Beam SDK core can assume
>>>> setup
>>>> >> is well done. If there is a best effort downside due to that - with
>>>> the
>>>> >> meaning you defined - it is an impl bug or a user installation issue.
>>>> >>
>>>> >> Technically all is true.
>>>> >>
>>>> >> What can prevent teardown is a hardware failure or so. This is fine
>>>> and
>>>> >> doesn't need to be in the doc since it is life in IT and obvious, or must
>>>> be very
>>>> >> explicit to avoid current ambiguity.
>>>> >>
>>>> >>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>>
>>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com>
>>>> >>> wrote:
>>>> >>>>
>>>> >>>> Restarting doesn't mean you don't call teardown. Except for a bug, there
>>>> is no
>>>> >>>> reason - technically - that it happens.
>>>> >>>>
>>>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>>>> >>>>>
>>>> >>>>> Workers restarting is not a bug; it's standard and often expected.
>>>> >>>>>
>>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>>>> >>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>
>>>> >>>>>> Nothing, as mentioned it is a bug, so recovery is a bug recovery
>>>> >>>>>> (procedure)
>>>> >>>>>>
>>>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com>
>>>> a
>>>> >>>>>> écrit :
>>>> >>>>>>>
>>>> >>>>>>> So what would you like to happen if there is a crash? The DoFn
>>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
>>>> exists. What
>>>> >>>>>>> should Teardown be called on?
>>>> >>>>>>>
>>>> >>>>>>>
>>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>>>> >>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>
>>>> >>>>>>>> This is what I want, and not 999999 teardowns for 1000000 setups
>>>> >>>>>>>> until there is an unexpected crash (= a bug).
>>>> >>>>>>>>
>>>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a
>>>> écrit :
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>>>> >>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256 but
>>>> leads to
>>>> >>>>>>>>>>>> the same pattern, the teardown is just a "if
>>>> (iCreatedThem) releaseThem();"
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
>>>> workers,
>>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
>>>> created per
>>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
>>>> threads on each
>>>> >>>>>>>>>>> worker.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Nop was the other way around, in this case on AWS you can
>>>> get 256
>>>> >>>>>>>>>> instances at once but not 512 (which will be 2x256). So when
>>>> you compute the
>>>> >>>>>>>>>> distribution you allocate to some fn the role to own the
>>>> instance lookup and
>>>> >>>>>>>>>> releasing.
>>>> >>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>>>>>>> I still don't understand. Let's be more precise. If you write
>>>> the
>>>> >>>>>>>>> following code:
>>>> >>>>>>>>>
>>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>> >>>>>>>>>
>>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
>>>> >>>>>>>>> created. The runner might decided to create a million
>>>> instances of this
>>>> >>>>>>>>> class across your worker pool, which means that you will get
>>>> a million Setup
>>>> >>>>>>>>> and Teardown calls.
>>>> >>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>> Anyway this was just an example of an external resource you
>>>> must
>>>> >>>>>>>>>> release. The real topic is that Beam should define ASAP a
>>>> guaranteed generic
>>>> >>>>>>>>>> lifecycle to let users embrace its programming model.
>>>> >>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> @Eugene:
>>>> >>>>>>>>>>>> 1. wait logic is about passing the value which is not
>>>> always
>>>> >>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>> >>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
>>>> implies bean
>>>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
>>>> sources and dofn (not
>>>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> A. Source
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>> >>>>>>>>>>>> estimateSize and split. As an user you can expect both to
>>>> be called on the
>>>> >>>>>>>>>>>> same bean instance to avoid to pay the same connection
>>>> cost(s) twice.
>>>> >>>>>>>>>>>> Concretely:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> connect()
>>>> >>>>>>>>>>>> try {
>>>> >>>>>>>>>>>>   estimateSize()
>>>> >>>>>>>>>>>>   split()
>>>> >>>>>>>>>>>> } finally {
>>>> >>>>>>>>>>>>   disconnect()
>>>> >>>>>>>>>>>> }
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> connect()
>>>> >>>>>>>>>>>> try {
>>>> >>>>>>>>>>>>   estimateSize()
>>>> >>>>>>>>>>>> } finally {
>>>> >>>>>>>>>>>>   disconnect()
>>>> >>>>>>>>>>>> }
>>>> >>>>>>>>>>>> connect()
>>>> >>>>>>>>>>>> try {
>>>> >>>>>>>>>>>>   split()
>>>> >>>>>>>>>>>> } finally {
>>>> >>>>>>>>>>>>   disconnect()
>>>> >>>>>>>>>>>> }
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> + a workaround with an internal estimate size since this
>>>> >>>>>>>>>>>> primitive is often called in split but you dont want to
>>>> connect twice in the
>>>> >>>>>>>>>>>> second phase.
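A minimal sketch of the workaround described above: connect once, cache the estimate so split does not pay the connection cost again, and disconnect in a finally block. FakeSource and its methods are illustrative names, not the Beam Source API:

```java
import java.util.List;

// Illustrative sketch, not Beam API: a source whose lifecycle is driven once,
// with the estimate cached so split() avoids a second remote call.
class FakeSource {
    int connects = 0;           // counts how often we pay the connection cost
    private boolean connected;
    private Long cachedEstimate;

    void connect() { connects++; connected = true; }
    void disconnect() { connected = false; }

    long estimateSize() {
        if (!connected) throw new IllegalStateException("not connected");
        if (cachedEstimate == null) cachedEstimate = 1000L; // pretend remote call
        return cachedEstimate;
    }

    List<String> split(int desired) {
        if (!connected) throw new IllegalStateException("not connected");
        estimateSize();         // reuses the cached value, no second remote call
        return List.of("shard-0", "shard-1");
    }
}

public class SourceLifecycle {
    public static void main(String[] args) {
        FakeSource source = new FakeSource();
        source.connect();                      // init once
        try {
            long size = source.estimateSize();
            List<String> shards = source.split(2);
            System.out.println(size + " " + shards + " connects=" + source.connects);
        } finally {
            source.disconnect();               // destroy once
        }
    }
}
```
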
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Why do you need that? Simply cause you want to define an
>>>> API to
>>>> >>>>>>>>>>>> implement sources which initializes the source bean and
>>>> destroys it.
>>>> >>>>>>>>>>>> I insist it is a very, very basic concern for such an API.
>>>> However
>>>> >>>>>>>>>>>> Beam doesn't embrace it and doesn't assume it, so building
>>>> any API on top of
>>>> >>>>>>>>>>>> Beam is very painful today, and for direct Beam users you
>>>> hit the exact same
>>>> >>>>>>>>>>>> issues - check how IO are implemented, the static
>>>> utilities which create
>>>> >>>>>>>>>>>> volatile connections preventing to reuse existing
>>>> connection in a single
>>>> >>>>>>>>>>>> method
>>>> >>>>>>>>>>>> (https://github.com/apache/bea
>>>> m/blob/master/sdks/java/io/elasticsearch/src/main/java/org/
>>>> apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> B. DoFn & SDF
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> As a fn dev you expect the same from the beam runtime:
>>>> init();
>>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and
>>>> that it is
>>>> >>>>>>>>>>>> executed on the exact same instance to be able to be
>>>> stateful at that level
>>>> >>>>>>>>>>>> for expensive connections/operations/flow state handling.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> As you mentionned with the million example, this sequence
>>>> should
>>>> >>>>>>>>>>>> happen for each single instance so 1M times for your
>>>> example.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore
>>>> it creates way
>>>> >>>>>>>>>>>> more instances and requires to have a way more
>>>> strict/explicit definition of
>>>> >>>>>>>>>>>> the exact lifecycle and which instance does what. Since
>>>> beam handles the
>>>> >>>>>>>>>>>> full lifecycle of the bean instances it must provide
>>>> init/destroy hooks
>>>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> If you take the JDBC example which was mentionned earlier.
>>>> >>>>>>>>>>>> Today, because of the teardown issue it uses bundles.
>>>> Since bundles size is
>>>> >>>>>>>>>>>> not defined - and will not with SDF, it must use a pool to
>>>> be able to reuse
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> a connection instance to keep correct performance. Now
>>>> with the SDF and the
>>>> >>>>>>>>>>>> split increase, how do you handle the pool size? Generally
>>>> in batch you use
>>>> >>>>>>>>>>>> a single connection per thread to avoid to consume all
>>>> database connections.
>>>> >>>>>>>>>>>> With a pool you have 2 choices: 1. use a pool of 1, 2. use
>>>> a pool a bit
>>>> >>>>>>>>>>>> higher but multiplied by the number of beans you will
>>>> likely x2 or 3 the
>>>> >>>>>>>>>>>> connection count and make the execution fail with "no more
>>>> connection
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> available". If you picked 1 (pool of #1), then you still
>>>> have to have a
>>>> >>>>>>>>>>>> reliable teardown by pool instance (close() generally) to
>>>> ensure you release
>>>> >>>>>>>>>>>> the pool and don't leak the connection information in the
>>>> JVM. In all cases
>>>> >>>>>>>>>>>> you come back to the init()/destroy() lifecycle even if
>>>> you fake to get
>>>> >>>>>>>>>>>> connections with bundles.
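The pool-of-one pattern can be sketched in plain Java; ConnectionPool and PooledFn are illustrative names, not Beam or JDBC API. Skipping the teardown call leaves the open-connection counter at 1, which is exactly the leak described above:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative sketch: a pool of one connection per fn instance.
class ConnectionPool {
    static final AtomicInteger OPEN = new AtomicInteger(); // global open-connection count
    void acquire() { OPEN.incrementAndGet(); }
    void close() { OPEN.decrementAndGet(); }               // reliable teardown releases it
}

class PooledFn {
    private ConnectionPool pool;
    void setup() { pool = new ConnectionPool(); pool.acquire(); } // @Setup-like init
    void process(String element) { /* use the pooled connection */ }
    void teardown() { pool.close(); }                      // @Teardown-like destroy
}

public class PoolDemo {
    public static void main(String[] args) {
        PooledFn fn = new PooledFn();
        fn.setup();
        fn.process("a");
        fn.teardown();   // without this call, OPEN stays at 1: a leaked connection
        System.out.println("open=" + ConnectionPool.OPEN.get());
    }
}
```
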
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Just to make it obvious: SDF mentions are just cause SDF
>>>> imply
>>>> >>>>>>>>>>>> all the current issues with the loose definition of the
>>>> bean lifecycles at
>>>> >>>>>>>>>>>> an exponential level, nothing else.
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> Romain Manni-Bucau
>>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
>>>> can be
>>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
>>>> the thread above,
>>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
>>>> that.
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main
>>>> author of
>>>> >>>>>>>>>>>>> most design documents related to SDF and of its
>>>> implementation in the Java
>>>> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to
>>>> the topic of
>>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>
>>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> I kind of agree except transforms lack a lifecycle too.
>>>> My
>>>> >>>>>>>>>>>>>> understanding is that sdf could be a way to unify it and
>>>> clean the api.
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of
>>>> >>>>>>>>>>>>>> transforms?
>>>> >>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
>>>> bchambers@apache.org>
>>>> >>>>>>>>>>>>>> a écrit :
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
>>>> DoFn's
>>>> >>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary,
>>>> it is around an
>>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
>>>> discussions/proposals
>>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
>>>> haven't been
>>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do
>>>> a
>>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the
>>>> chance of
>>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
>>>> destination).
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
>>>> workers,
>>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one
>>>> worker. This
>>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
>>>> stuff done to ensure it
>>>> >>>>>>>>>>>>>>> runs on one worker.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule
>>>> some cleanup work for
>>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
>>>> relatively straightforward,
>>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
>>>> such as BigQuery
>>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
>>>> into BigQuery.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you
>>>> want to
>>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
>>>> wait until the end of
>>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until
>>>> you know nobody will
>>>> >>>>>>>>>>>>>>> need the resource anymore.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
>>>> where
>>>> >>>>>>>>>>>>>>> you could have a transform that output resource
>>>> objects. Each resource
>>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
>>>> would be something
>>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
>>>> resource, and what
>>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
>>>> that part of the
>>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no
>>>> longer need the resources,
>>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
>>>> shutdown, or
>>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
>>>> case?
>>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
>>>> sufficient?
>>>> >>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a
>>>> thread of a worker - has
>>>> >>>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage
>>>> collection.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> In practice, new is often an unsafe allocate
>>>> >>>>>>>>>>>>>>>> (deserialization) but it doesn't matter here.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> What I want is for any "new" to have a following setup
>>>> before
>>>> >>>>>>>>>>>>>>>> any process or startBundle, and the last time Beam has
>>>> the instance before it
>>>> >>>>>>>>>>>>>>>> is GC-ed, after the last finishBundle, it calls teardown.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> It is as simple as that.
>>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn not
>>>> self
>>>> >>>>>>>>>>>>>>>> contained to implement basic transforms.
>>>> >>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com>
>>>> a
>>>> >>>>>>>>>>>>>>>> écrit :
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
>>>> Rather
>>>> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing
>>>> methods -- which have been
>>>> >>>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it would
>>>> be helpful to focus
>>>> >>>>>>>>>>>>>>>>>> more on the reason you are looking for something
>>>> with different semantics.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
>>>> trying
>>>> >>>>>>>>>>>>>>>>>> to do):
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline.
>>>> If this is the case,
>>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
>>>> once (and not once per
>>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
>>>> know when the pipeline
>>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
>>>> step X", then what
>>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> When the DoFn is no longer needed logically, i.e. when the
>>>> >>>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a
>>>> JVM shutdown)
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers,
>>>> and each
>>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy
>>>> of the same DoFn). How
>>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 =
>>>> 1M cleanups) and when
>>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
>>>> shut down? When an
>>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
>>>> temporary - may be
>>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
>>>> methods are not a good fit
>>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
>>>> within the DoFn), you could
>>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
>>>> produced. For instance:
>>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token
>>>> that
>>>> >>>>>>>>>>>>>>>>>> stores information about resources)
>>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
>>>> retries
>>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
>>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data
>>>> it is
>>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
>>>> use or have been finished
>>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
>>>> important to ensuring
>>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
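The resource-ID pattern described above can be sketched in plain Java. This is a hedged illustration, not Beam code: the class names and methods are invented for the example, and the "pipeline" is just sequential method calls standing in for the ParDo steps (c), (d), and (f).

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch of the pattern: resource IDs flow through the pipeline as data, so a
// final "free" step can release exactly the resources that were initialized.
public class ResourceTokens {
    private final Set<String> live = new LinkedHashSet<>();

    // (c) ParDo that initializes the resources; the tokens travel onward as data.
    public List<String> initialize(List<String> ids) {
        live.addAll(ids);
        return new ArrayList<>(ids);
    }

    // (d) Pipeline segments use the resources and eventually output "done" facts.
    public List<String> useAndMarkDone(List<String> tokens) {
        return tokens;
    }

    // (f) ParDo that frees the resources named by the "done" tokens.
    public void free(List<String> doneTokens) {
        live.removeAll(doneTokens);
    }

    public int liveCount() {
        return live.size();
    }
}
```

Because the tokens are checkpointed as pipeline data (with "Require Deterministic Input" at steps b and e), the free step sees exactly the set of resources that were created, regardless of retries.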
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>> >>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it
>>>> is
>>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are
>>>> trying to accomplish? That
>>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
>>>> existing options and
>>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
>>>> >>>>>>>>>>>>>>>>>> cases, but it means each transform is different in its
>>>> >>>>>>>>>>>>>>>>>> lifecycle handling. I dislike that a lot at scale and
>>>> >>>>>>>>>>>>>>>>>> as a user, since you can't put any unified practice on
>>>> >>>>>>>>>>>>>>>>>> top of Beam; it also makes Beam very hard to integrate
>>>> >>>>>>>>>>>>>>>>>> or to use to build higher-level libraries or software.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> This is why i tried to not start the workaround
>>>> >>>>>>>>>>>>>>>>>> discussions and just stay at API level.
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> -- Ben
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many
>>>> of the
>>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
>>>> machine.
>>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
>>>> impossible or impractical
>>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or
>>>> you can list some of the
>>>> >>>>>>>>>>>>>>>>>>>> examples above.
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
>>>> called - it's not just
>>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
>>>> important (e.g. cleaning up
>>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
>>>> large number of VMs you
>>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of
>>>> the other methods that
>>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at
>>>> a cost, e.g. no
>>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so
>>>> >>>>>>>>>>>>>>>>>>> I'm not sure which other method you speak about.
>>>> >>>>>>>>>>>>>>>>>>> Concretely, if you make it really unreliable - this
>>>> >>>>>>>>>>>>>>>>>>> is what "best effort" sounds like to me - then users
>>>> >>>>>>>>>>>>>>>>>>> cannot use it to clean anything, but if you make it
>>>> >>>>>>>>>>>>>>>>>>> "can happen but it is unexpected and means something
>>>> >>>>>>>>>>>>>>>>>>> happened" then it is fine to have a manual - or auto
>>>> >>>>>>>>>>>>>>>>>>> if fancy - recovery procedure. This is where it makes
>>>> >>>>>>>>>>>>>>>>>>> all the difference and impacts the developers and ops
>>>> >>>>>>>>>>>>>>>>>>> (all users basically).
>>>> >>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means
>>>> that. It
>>>> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is
>>>> what triggered this thread.
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
>>>> prevents
>>>> >>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be very
>>>> badly and wrongly
>>>> >>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
>>>> LinkedIn |
>>>> >>>>>>>>>>>>>>>>>>>>> Book
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call
>>>> it:
>>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
>>>> crash), and in a number of
>>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
>>>> container has crashed (eg user code
>>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over
>>>> JNI and it segfaulted), JVM
>>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
>>>> worker has lost network
>>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't
>>>> be able to do anything
>>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a
>>>> preemptible VM and it was preempted by
>>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or
>>>> if the worker was too busy
>>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
>>>> functions) until the preemption
>>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
>>>> simply failed (which
>>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
>>>> conditions.
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
>>>> describe
>>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
>>>> cases where you observed a
>>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
>>>> was possible to call it but
>>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain
>>>> Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
>>>> Manni-Bucau
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
>>>> (e.g.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and
>>>> it requires the following
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
>>>> logic and the following processing
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform
>>>> requiring a
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
>>>> since size is not controlled.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesn't allow you to release
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> the connection, since it is a best-effort
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> thing. Not releasing the connection makes you
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> pay a lot - AWS ;) - or prevents you from
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> launching other processing - concurrency limit.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
>>>> called then nothing else can be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
>>>> AWS service are you thinking of
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
>>>> everything at the other end has died?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless,
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> but some (proprietary) protocols require some
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> closing exchanges which are not only "I'm
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> leaving".
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> services - machines - on the fly at pipeline
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> startup and closing them at the end. If
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> money. You can say it can be done another
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> way... as can the full pipeline ;).
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle its components' lifecycle it can't be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> used at scale for generic pipelines and is
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar crash case which can't be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handled by any human system? Nothing
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Is it due to some legacy code on Dataflow or
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> something else?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and
>>>> implemented
>>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
>>>> kind of change you're asking
>>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is
>>>> not
>>>> >>>>>>>>>>>>>>>>>>>>>>> call then it is a bug and we are done :).
>>>> >>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> direct runner does it, so if a user uses the RI
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> in tests, he will get a different behavior in
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> prod? Also don't forget the user doesn't know
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> what the IOs he composes use, so this is so
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> impacting for the whole product that it must be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handled IMHO.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new
>>>> in big
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore
>>>> what people did for years and do
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> it wrong before doing right ;).
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - in normal IT conditions - the
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> execution of teardown. Then we see if we can
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle it, and only if there is a technical
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> reason we can't do we make it
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Spark and Flink can; any unknown blocker for
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> other runners?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Java shutdown hooks, otherwise your environment
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> (the software enclosing Beam) is fully
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> unhandled and your overall system is
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> uncontrolled. The only case where that is not
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by a
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> vendor and never installed on a customer
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> environment. In that case it belongs to the
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> vendor to handle the Beam API and not to Beam
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> to adjust its API for a vendor - otherwise all
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> made optional, right?
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> All state is not about network, even in
>>>> distributed
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit
>>>> and defined lifecycle.
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>>>>>>
>>>> >>>>>>>>>>>>
>>>> >>>>>>>>>>>
>>>> >>>>>>>>>>
>>>> >>>>>>>>>
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
+1 This PR clarifies the semantics quite a bit.

On Mon, Feb 19, 2018 at 3:24 PM, Eugene Kirpichov <ki...@google.com>
wrote:

> I've sent out a PR editing the Javadoc https://github.com/
> apache/beam/pull/4711 . Hopefully, that should be sufficient.
>
> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
>
>> Ismael, your understanding is appropriate for FinishBundle.
>>
>> One basic issue with this understanding, is that the lifecycle of a DoFn
>> is much longer than a single bundle (which I think you expressed by adding
>> the *s). How long the DoFn lives is not defined. In fact a runner is
>> completely free to decide that it will _never_ destroy the DoFn, in which
>> case TearDown is never called simply because the DoFn was never torn down.
>>
>> Also, as mentioned before, the runner can only call TearDown in cases
>> where the shutdown is in its control. If the JVM is shut down externally,
>> the runner has no chance to call TearDown. This means that while TearDown
>> is appropriate for cleaning up in-process resources (open connections,
>> etc.), it's not the right answer for cleaning up persistent resources. If
>> you rely on TearDown to delete VMs or delete files, there will be cases in
>> which those files or VMs are not deleted.
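Reuven's distinction can be made concrete in plain Java. This is a hedged illustration, not Beam code: the class and its "journal" are invented for the example. In-process cleanup in `teardown()` is lost when the JVM dies first, while a finalize step driven by durably recorded state can still reclaim persistent resources like VMs.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch: teardown-based cleanup of a persistent resource silently leaks when
// the process dies first; replaying a durable journal still cleans up.
public class PersistentCleanup {
    private final Set<String> liveVms = new LinkedHashSet<>();
    private final Set<String> journal = new LinkedHashSet<>(); // durable record

    public void startVm(String id) {
        journal.add(id);   // record the intent before acting on it
        liveVms.add(id);
    }

    // In-process cleanup: only runs if the JVM survives to call it.
    public void teardown() {
        liveVms.clear();
    }

    // Out-of-band finalize: works even after a crash, by replaying the journal.
    public void finalizeFromJournal() {
        liveVms.removeAll(journal);
    }

    public boolean hasLeaks() {
        return !liveVms.isEmpty();
    }
}
```

If the JVM is killed before `teardown()` runs, only `finalizeFromJournal()` (run elsewhere, later) can reclaim the VMs, which is why persistent cleanup belongs in the pipeline's data flow rather than in @Teardown.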
>>
>> What we are _not_ saying is that the runner is free to just ignore
>> TearDown. If the runner is explicitly destroying a DoFn object, it should
>> call TearDown.
>>
>> Reuven
>>
>>
>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> I also had a different understanding of the lifecycle of a DoFn.
>>>
>>> My understanding of the use case for every method in the DoFn was clear
>>> and
>>> perfectly aligned with Thomas explanation, but what I understood was
>>> that in a
>>> general terms ‘@Setup was where I got resources/prepare connections and
>>> @Teardown where I free them’, so calling Teardown seemed essential to
>>> have a
>>> complete lifecycle:
>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
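Ismaël's expected lifecycle can be sketched as a toy "runner" in plain Java. This is a hypothetical sketch, not the Beam SDK: the method names mirror the annotations but all code here is invented, with teardown in a `finally` block so it runs even when processing throws — the guarantee this thread is debating.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of a runner driving Setup -> (StartBundle -> ProcessElement* ->
// FinishBundle)* -> Teardown on a single DoFn-like bean instance.
public class LifecycleSketch {

    // Returns the ordered log of lifecycle calls for the given bundles.
    public static List<String> runBundles(int[][] bundles) {
        List<String> calls = new ArrayList<>();
        calls.add("setup");
        try {
            for (int[] bundle : bundles) {
                calls.add("startBundle");
                for (int element : bundle) {
                    calls.add("process:" + element);
                }
                calls.add("finishBundle");
            }
        } finally {
            // The point under debate: is this call guaranteed by the runner?
            calls.add("teardown");
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println(runBundles(new int[][] {{1, 2}, {3}}));
    }
}
```

Note that the `*` markers in the lifecycle line correspond to the loops: many bundles per setup, many elements per bundle, but (under the contested contract) exactly one teardown per setup.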
>>>
>>> The fact that @Teardown could not be called is a new detail for me too,
>>> and I
>>> also find weird to have a method that may or not be called as part of an
>>> API,
>>> why would users implement teardown if it will not be called? In that case
>>> probably a cleaner approach would be to get rid of that method
>>> altogether, no?
>>>
>>> But well maybe that’s not so easy too, there was another point: Some user
>>> reported an issue with leaking resources using KafkaIO in the Spark
>>> runner, for
>>> ref.
>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>>>
>>> In that moment my understanding was that there was something fishy
>>> because we
>>> should be calling Teardown to close correctly the connections and free
>>> the
>>> resources in case of exceptions on start/process/finish, so I filed a
>>> JIRA and
>>> fixed this by enforcing the call of teardown for the Spark runner and
>>> the Flink
>>> runner:
>>> https://issues.apache.org/jira/browse/BEAM-3187
>>> https://issues.apache.org/jira/browse/BEAM-3244
>>>
>>> As you can see not calling this method does have consequences at least
>>> for
>>> non-containerized runners. Of course a runner that uses containers could
>>> not
>>> care about cleaning the resources this way, but a long living JVM in a
>>> Hadoop
>>> environment probably won’t have the same luck. So I am not sure that
>>> having a
>>> loose semantic there is the right option, I mean, runners could simply
>>> guarantee
>>> that they call teardown and if teardown takes too long they can decide
>>> to send a
>>> signal or kill the process/container/etc and go ahead, that way at least
>>> users
>>> would have a motivation to implement the teardown method, otherwise it
>>> doesn’t
>>> make any sense to have it (API wise).
>>>
>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <ki...@google.com>
>>> wrote:
>>> > Romain, would it be fair to say that currently the goal of your
>>> > participation in this discussion is to identify situations where
>>> @Teardown
>>> > in principle could have been called, but some of the current runners
>>> don't
>>> > make a good enough effort to call it? If yes - as I said before,
>>> please, by
>>> > all means, file bugs of the form "Runner X doesn't call @Teardown in
>>> > situation Y" if you're aware of any, and feel free to send PRs fixing
>>> runner
>>> > X to reliably call @Teardown in situation Y. I think we all agree that
>>> this
>>> > would be a good improvement.
>>> >
>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
>>> rmannibucau@gmail.com>
>>> > wrote:
>>> >>
>>> >>
>>> >>
>>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>>> >>
>>> >>
>>> >>
>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>>> >> <rm...@gmail.com> wrote:
>>> >>>
>>> >>>
>>> >>>
>>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>>> >>>
>>> >>> How do you call teardown? There are cases in which the Java code
>>> gets no
>>> >>> indication that the restart is happening (e.g. cases where the
>>> machine
>>> >>> itself is taken down)
>>> >>>
>>> >>>
>>> >>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
>>> >>> Crashes are bugs, and kill -9 to shut down is a bug too. In other
>>> >>> cases, call shutdown with a hook in the worst case.
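The shutdown-hook fallback alluded to here is worth pinning down, because it is exactly where the two sides of this debate diverge. A minimal sketch (`HookDemo` is an invented name): JVM shutdown hooks run on normal exit and on SIGTERM, but never on `kill -9`, an OOM kill, or power loss.

```java
// Sketch: registering a cleanup shutdown hook. The hook covers graceful
// shutdowns only - hard kills bypass it, which is the gap being debated.
public class HookDemo {
    static volatile boolean cleaned = false;

    public static Thread registerCleanup() {
        Thread hook = new Thread(() -> cleaned = true);
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerCleanup();
        // On a clean exit (or SIGTERM) the hook runs; on kill -9 it never will.
    }
}
```

So a runner that invokes teardown from a shutdown hook narrows the window where cleanup is skipped, but cannot close it entirely.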
>>> >>
>>> >>
>>> >> What you say here is simply not true.
>>> >>
>>> >> There are many scenarios in which workers shutdown with no
>>> opportunity for
>>> >> any sort of shutdown hook. Sometimes the entire machine gets
>>> shutdown, and
>>> >> not even the OS will have much of a chance to do anything. At scale
>>> this
>>> >> will happen with some regularity, and a distributed system that
>>> assumes this
>>> >> will not happen is a poor distributed system.
>>> >>
>>> >>
>>> >> This is part of the infra, and there is no reason a machine is shut
>>> >> down without first shutting down what runs on it, except if it is a bug
>>> >> in the software or setup. I hear that you maybe don't do it everywhere,
>>> >> but there is no blocker to doing it. It means you can shut down the
>>> >> machines and guarantee teardown is called.
>>> >>
>>> >> Where I'm going is simply that it is doable, and the Beam SDK core can
>>> >> assume setup is done well. If there is a best-effort downside due to
>>> >> that - with the meaning you defined - it is an implementation bug or a
>>> >> user installation issue.
>>> >>
>>> >> Technically all is true.
>>> >>
>>> >> What can prevent teardown is a hardware failure or the like. This is
>>> >> fine and doesn't need to be in the doc, since it is life in IT and
>>> >> obvious - or it must be made very explicit to avoid the current
>>> >> ambiguity.
>>> >>
>>> >>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
>>> rmannibucau@gmail.com>
>>> >>> wrote:
>>> >>>>
>>> >>>> Restarting doesn't mean you don't call teardown. Barring a bug,
>>> >>>> there is no reason - technically - that it happens. No reason.
>>> >>>>
>>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>>> >>>>>
>>> >>>>> Workers restarting is not a bug; it's standard and often expected.
>>> >>>>>
>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>>> >>>>> <rm...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug-recovery
>>> >>>>>> procedure.
>>> >>>>>>
>>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com>
>>> a
>>> >>>>>> écrit :
>>> >>>>>>>
>>> >>>>>>> So what would you like to happen if there is a crash? The DoFn
>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
>>> exists. What
>>> >>>>>>> should Teardown be called on?
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>>> >>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>> This is what i want and not 999999 teardowns for 1000000 setups
>>> >>>>>>>> until there is an unexpected crash (= a bug).
>>> >>>>>>>>
>>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a
>>> écrit :
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>>> >>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256, but
>>> >>>>>>>>>>>> that leads to the same pattern; the teardown is just an
>>> >>>>>>>>>>>> "if (iCreatedThem) releaseThem();"
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
>>> workers,
>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
>>> created per
>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
>>> threads on each
>>> >>>>>>>>>>> worker.
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> No, it was the other way around: in this case on AWS you can
>>> >>>>>>>>>> get 256 instances at once but not 512 (which would be 2x256).
>>> >>>>>>>>>> So when you compute the distribution you allocate to some fn
>>> >>>>>>>>>> the role of owning the instance lookup and release.
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> I still don't understand. Let's be more precise. If you write
>>> the
>>> >>>>>>>>> following code:
>>> >>>>>>>>>
>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>> >>>>>>>>>
>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
>>> >>>>>>>>> created. The runner might decided to create a million
>>> instances of this
>>> >>>>>>>>> class across your worker pool, which means that you will get a
>>> million Setup
>>> >>>>>>>>> and Teardown calls.
>>> >>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> Anyway, this was just an example of an external resource you
>>> >>>>>>>>>> must release. The real topic is that Beam should define ASAP a
>>> >>>>>>>>>> guaranteed generic lifecycle to let users embrace its
>>> >>>>>>>>>> programming model.
>>> >>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> @Eugene:
>>> >>>>>>>>>>>> 1. wait logic is about passing the value which is not always
>>> >>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>> >>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
>>> implies bean
>>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
>>> sources and dofn (not
>>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> A. Source
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>> >>>>>>>>>>>> estimateSize and split. As a user you would expect both to
>>> >>>>>>>>>>>> be called on the same bean instance, to avoid paying the
>>> >>>>>>>>>>>> same connection cost(s) twice.
>>> >>>>>>>>>>>> Concretely:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> connect()
>>> >>>>>>>>>>>> try {
>>> >>>>>>>>>>>>   estimateSize()
>>> >>>>>>>>>>>>   split()
>>> >>>>>>>>>>>> } finally {
>>> >>>>>>>>>>>>   disconnect()
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> connect()
>>> >>>>>>>>>>>> try {
>>> >>>>>>>>>>>>   estimateSize()
>>> >>>>>>>>>>>> } finally {
>>> >>>>>>>>>>>>   disconnect()
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>> connect()
>>> >>>>>>>>>>>> try {
>>> >>>>>>>>>>>>   split()
>>> >>>>>>>>>>>> } finally {
>>> >>>>>>>>>>>>   disconnect()
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> ...plus a workaround with an internal estimated size, since
>>> >>>>>>>>>>>> this primitive is often called in split but you don't want
>>> >>>>>>>>>>>> to connect twice in the second phase.
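The "internal estimated size" workaround can be sketched in plain Java. This is a hedged sketch, not Beam's Source API: `CachingSource`, `connect()`, and the fixed size of 100 are invented for the example. The first `estimateSize()` pays the connection cost and caches the result, so `split()` - which typically needs the size too - does not reconnect for it.

```java
// Sketch: cache the estimate computed on first use so split() reuses it
// instead of paying the connection cost a second time.
public class CachingSource {
    private Long cachedSize;
    private int connects;

    private void connect() { connects++; }   // stand-in for an expensive handshake
    private void disconnect() { }

    public long estimateSize() {
        if (cachedSize == null) {            // pay the connection only once
            connect();
            try {
                cachedSize = 100L;           // pretend we queried the backend
            } finally {
                disconnect();
            }
        }
        return cachedSize;
    }

    public int split(int desiredSplits) {
        long size = estimateSize();          // reuses the cached estimate
        connect();
        try {
            return (int) Math.min(desiredSplits, size);
        } finally {
            disconnect();
        }
    }

    public int connectionCount() { return connects; }
}
```

With the guaranteed connect/try/finally lifecycle described earlier in the message, none of this caching machinery would be needed - one connection would span both primitives.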
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Why do you need that? Simply because you want to define an
>>> >>>>>>>>>>>> API for implementing sources which initializes the source
>>> >>>>>>>>>>>> bean and destroys it. I insist it is a very, very basic
>>> >>>>>>>>>>>> concern for such an API. However Beam doesn't embrace it and
>>> >>>>>>>>>>>> doesn't assume it, so building any API on top of Beam is
>>> >>>>>>>>>>>> very painful today, and direct Beam users hit the exact same
>>> >>>>>>>>>>>> issues - check how IOs are implemented: the static utilities
>>> >>>>>>>>>>>> which create throwaway connections prevent reusing an
>>> >>>>>>>>>>>> existing connection in a single method
>>> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/
>>> elasticsearch/src/main/java/org/apache/beam/sdk/io/
>>> elasticsearch/ElasticsearchIO.java#L862).
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> B. DoFn & SDF
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> As a fn dev you expect the same from the beam runtime:
>>> init();
>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and
>>> that it is
>>> >>>>>>>>>>>> executed on the exact same instance to be able to be
>>> stateful at that level
>>> >>>>>>>>>>>> for expensive connections/operations/flow state handling.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> As you mentioned with the million example, this sequence
>>> should
>>> >>>>>>>>>>>> happen for each single instance so 1M times for your
>>> example.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore
>>> it creates way
>>> >>>>>>>>>>>> more instances and requires to have a way more
>>> strict/explicit definition of
>>> >>>>>>>>>>>> the exact lifecycle and which instance does what. Since
>>> beam handles the
>>> >>>>>>>>>>>> full lifecycle of the bean instances it must provide
>>> init/destroy hooks
>>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Take the JDBC example which was mentioned earlier. Today,
>>> >>>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle
>>> >>>>>>>>>>>> size is not defined - and will not be with SDF - it must use
>>> >>>>>>>>>>>> a pool to be able to reuse a connection instance without
>>> >>>>>>>>>>>> killing performance. Now with SDF and the increase in
>>> >>>>>>>>>>>> splits, how do you handle the pool size? Generally in batch
>>> >>>>>>>>>>>> you use a single connection per thread to avoid consuming
>>> >>>>>>>>>>>> all database connections. With a pool you have 2 choices:
>>> >>>>>>>>>>>> 1. use a pool of 1, or 2. use a somewhat bigger pool - but
>>> >>>>>>>>>>>> multiplied by the number of beans you will likely 2x or 3x
>>> >>>>>>>>>>>> the connection count and make the execution fail with "no
>>> >>>>>>>>>>>> more connection available". If you picked 1 (a pool of 1),
>>> >>>>>>>>>>>> then you still have to have a reliable teardown per pool
>>> >>>>>>>>>>>> instance (close() generally) to ensure you release the pool
>>> >>>>>>>>>>>> and don't leak the connection information in the JVM. In all
>>> >>>>>>>>>>>> cases you come back to the init()/destroy() lifecycle, even
>>> >>>>>>>>>>>> if you pretend to get connections with bundles.
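The "pool of 1 per instance" option can be sketched in plain Java. This is a hedged illustration, not a real JDBC pool or Beam API - `PooledJdbcFn` and `Connection` are invented stand-ins. The connection is borrowed per bundle and returned to the pool instead of being reconnected, and only `teardown()` actually closes it - which is exactly why an unreliable teardown leaks the connection.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: a per-DoFn-instance pool of 1, reused across bundles, whose
// connections are only reliably closed in teardown().
public class PooledJdbcFn {
    static class Connection {
        boolean open = true;
        void close() { open = false; }
    }

    private final Deque<Connection> pool = new ArrayDeque<>();
    private boolean torndown;

    private Connection borrow() {
        Connection c = pool.poll();
        return c != null ? c : new Connection(); // lazy connect on first use
    }

    public void processBundle(int elements) {
        Connection c = borrow();
        try {
            for (int i = 0; i < elements; i++) {
                // write element i over c
            }
        } finally {
            pool.push(c); // keep for the next bundle instead of reconnecting
        }
    }

    public void teardown() { // if never called, the pooled connection leaks
        while (!pool.isEmpty()) {
            pool.pop().close();
        }
        torndown = true;
    }

    public boolean leaked() {
        return !torndown && !pool.isEmpty();
    }
}
```

The pool only papers over the undefined bundle size; the bean still needs a guaranteed init/destroy pair to release it.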
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Just to make it obvious: the SDF mentions are just because
>>> >>>>>>>>>>>> SDF amplifies all the current issues with the loose
>>> >>>>>>>>>>>> definition of the bean lifecycles to an exponential level,
>>> >>>>>>>>>>>> nothing else.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Romain Manni-Bucau
>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
>>> can be
>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
>>> the thread above,
>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
>>> that.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main author
>>> of
>>> >>>>>>>>>>>>> most design documents related to SDF and of its
>>> implementation in the Java
>>> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to
>>> the topic of
>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>> >>>>>>>>>>>>>> understanding is that sdf could be a way to unify it and
>>> clean the api.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of
>>> >>>>>>>>>>>>>> transforms?
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
>>> bchambers@apache.org>
>>> >>>>>>>>>>>>>> a écrit :
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
>>> DoFn's
>>> >>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary,
>>> it is around an
>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
>>> discussions/proposals
>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
>>> haven't been
>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance
>>> of
>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
>>> destination).
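The three steps above can be sketched with plain JDK file operations - a toy sketch, not FileIO's actual implementation; the class name, paths, and shard count are all made up for illustration:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Stream;

public class TempFileSink {
    // Writes `shards` temp files, publishes them atomically-ish, cleans up,
    // and returns the number of files visible in the final directory.
    static long run(int shards) {
        try {
            Path tmpDir = Files.createTempDirectory("shards-tmp");
            Path finalDir = Files.createTempDirectory("shards-final");

            // 1. Write N shards to temporary files (done in parallel by N
            //    workers in a real runner).
            List<Path> tmpFiles = new ArrayList<>();
            for (int i = 0; i < shards; i++) {
                Path p = tmpDir.resolve("shard-" + i + ".tmp");
                Files.write(p, ("data-" + i).getBytes());
                tmpFiles.add(p);
            }

            // 2. Once all shards are complete, move them to the final
            //    destination so readers never see a partial result set.
            for (Path p : tmpFiles) {
                String name = p.getFileName().toString().replace(".tmp", ".txt");
                Files.move(p, finalDir.resolve(name));
            }

            // 3. Clean up anything left in the temporary directory.
            try (DirectoryStream<Path> leftovers = Files.newDirectoryStream(tmpDir)) {
                for (Path p : leftovers) Files.delete(p);
            }
            Files.delete(tmpDir);

            try (Stream<Path> finals = Files.list(finalDir)) {
                return finals.count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that step 3 is exactly the part that leaks if the process dies between 2 and 3 - which is the cleanup problem the thread is circling around.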
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
>>> workers,
>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one worker.
>>> This
>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
>>> stuff done to ensure it
>>> >>>>>>>>>>>>>>> runs on one worker.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some
>>> cleanup work for
>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
>>> relatively straightforward,
>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
>>> such as BigQuery
>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
>>> into BigQuery.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want
>>> to
>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
>>> wait until the end of
>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until you
>>> know nobody will
>>> >>>>>>>>>>>>>>> need the resource anymore.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
>>> where
>>> >>>>>>>>>>>>>>> you could have a transform that output resource objects.
>>> Each resource
>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
>>> would be something
>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
>>> resource, and what
>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
>>> that part of the
>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer
>>> need the resources,
>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
>>> shutdown, or
>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
>>> case?
>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
>>> sufficient?
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Yes, 1M. Let me explain by simplifying the overall
>>> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread
>>> >>>>>>>>>>>>>>>> of a worker - has its own lifecycle. Caricaturally:
>>> >>>>>>>>>>>>>>>> "new" and garbage collection.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
>>> >>>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> What I want is for any "new" to be followed by a setup
>>> >>>>>>>>>>>>>>>> before any processElement or startBundle, and, the last
>>> >>>>>>>>>>>>>>>> time Beam holds the instance before it is GC-ed and
>>> >>>>>>>>>>>>>>>> after the last finishBundle, for it to call teardown.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> It is as simple as that.
>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way that
>>> >>>>>>>>>>>>>>>> makes a fn not self-contained in order to implement
>>> >>>>>>>>>>>>>>>> basic transforms.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com>
>>> a
>>> >>>>>>>>>>>>>>>> écrit :
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
>>> >>>>>>>>>>>>>>>>>> Rather than focusing on the semantics of the existing
>>> >>>>>>>>>>>>>>>>>> methods -- which have been noted to meet many existing
>>> >>>>>>>>>>>>>>>>>> use cases -- it would be helpful to focus more on the
>>> >>>>>>>>>>>>>>>>>> reason you are looking for something with different
>>> >>>>>>>>>>>>>>>>>> semantics.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
>>> trying
>>> >>>>>>>>>>>>>>>>>> to do):
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline.
>>> If this is the case,
>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
>>> once (and not once per
>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
>>> know when the pipeline
>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
>>> step X", then what
>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> When the DoFn is no longer needed logically, i.e. when
>>> >>>>>>>>>>>>>>>>>> the batch is done or the stream is stopped (manually
>>> >>>>>>>>>>>>>>>>>> or by a JVM shutdown)
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and
>>> each
>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of
>>> the same DoFn). How
>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 =
>>> 1M cleanups) and when
>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
>>> shut down? When an
>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
>>> temporary - may be
>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
>>> methods are not a good fit
>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
>>> within the DoFn), you could
>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
>>> produced. For instance:
>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token
>>> that
>>> >>>>>>>>>>>>>>>>>> stores information about resources)
>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
>>> retries
>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it
>>> is
>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
>>> use or have been finished
>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
>>> important to ensuring
>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>> >>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying
>>> to accomplish? That
>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
>>> existing options and
>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
>>> >>>>>>>>>>>>>>>>>> cases, but that means each transform is different in
>>> >>>>>>>>>>>>>>>>>> its lifecycle handling. I dislike that a lot at scale
>>> >>>>>>>>>>>>>>>>>> and as a user, since you can't put any unified
>>> >>>>>>>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard
>>> >>>>>>>>>>>>>>>>>> to integrate or to use to build higher-level libraries
>>> >>>>>>>>>>>>>>>>>> or software.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
>>> >>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> -- Ben
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of
>>> the
>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
>>> machine.
>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
>>> impossible or impractical
>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you
>>> can list some of the
>>> >>>>>>>>>>>>>>>>>>>> examples above.
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
>>> called - it's not just
>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
>>> important (e.g. cleaning up
>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
>>> large number of VMs you
>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of
>>> the other methods that
>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a
>>> cost, e.g. no
>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so
>>> >>>>>>>>>>>>>>>>>>> I'm not sure which other method you mean. Concretely,
>>> >>>>>>>>>>>>>>>>>>> if you make it really unreliable - this is what "best
>>> >>>>>>>>>>>>>>>>>>> effort" sounds like to me - then users can't use it
>>> >>>>>>>>>>>>>>>>>>> to clean up anything; but if you make it "can fail to
>>> >>>>>>>>>>>>>>>>>>> happen, but that is unexpected and means something
>>> >>>>>>>>>>>>>>>>>>> went wrong", then it is fine to have a manual - or
>>> >>>>>>>>>>>>>>>>>>> automatic, if fancy - recovery procedure. This is
>>> >>>>>>>>>>>>>>>>>>> where it makes all the difference and impacts the
>>> >>>>>>>>>>>>>>>>>>> developers and ops (all users, basically).
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> Agreed, Eugene, except that "best effort" means
>>> >>>>>>>>>>>>>>>>>>>>> just that. It is also often used to mean "at will",
>>> >>>>>>>>>>>>>>>>>>>>> and this is what triggered this thread.
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
>>> >>>>>>>>>>>>>>>>>>>>> prevents it", but "best effort" is too open and can
>>> >>>>>>>>>>>>>>>>>>>>> be badly and wrongly perceived by users (as I did).
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
>>> LinkedIn |
>>> >>>>>>>>>>>>>>>>>>>>> Book
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call
>>> it:
>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
>>> crash), and in a number of
>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
>>> container has crashed (eg user code
>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over JNI
>>> and it segfaulted), JVM
>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
>>> worker has lost network
>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't
>>> be able to do anything
>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible
>>> VM and it was preempted by
>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or
>>> if the worker was too busy
>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
>>> functions) until the preemption
>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
>>> simply failed (which
>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
>>> conditions.
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
>>> describe
>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
>>> cases where you observed a
>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
>>> was possible to call it but
>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
>>> Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
>>> (e.g.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and
>>> it requires the following
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
>>> logic and the following processing
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> a connection. Using bundles is a perf killer
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> since their size is not controlled. Using
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> teardown doesn't let you release the
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection, since it is a best-effort thing.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Not releasing the connection makes you pay a
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> lot - AWS ;) - or prevents you from launching
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> other processing - concurrency limits.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
>>> called then nothing else can be
>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
>>> AWS service are you thinking of
>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
>>> everything at the other end has died?
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless,
>>> >>>>>>>>>>>>>>>>>>>>>>>>> but some (proprietary) protocols require
>>> >>>>>>>>>>>>>>>>>>>>>>>>> closing exchanges that are more than just "I'm
>>> >>>>>>>>>>>>>>>>>>>>>>>>> leaving".
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS, I was thinking about starting some
>>> >>>>>>>>>>>>>>>>>>>>>>>>> services - machines - on the fly at pipeline
>>> >>>>>>>>>>>>>>>>>>>>>>>>> startup and closing them at the end. If
>>> >>>>>>>>>>>>>>>>>>>>>>>>> teardown is not called, you leak machines and
>>> >>>>>>>>>>>>>>>>>>>>>>>>> money. You can say it can be done another
>>> >>>>>>>>>>>>>>>>>>>>>>>>> way... as can the full pipeline ;).
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle its components' lifecycle, it can't be
>>> >>>>>>>>>>>>>>>>>>>>>>>>> used at scale for generic pipelines and stays
>>> >>>>>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>> >>>>>>>>>>>>>>>>>>>>>>>>> interstellar-crash case, which can't be handled
>>> >>>>>>>>>>>>>>>>>>>>>>>>> by any human system? Nothing, technically. Why
>>> >>>>>>>>>>>>>>>>>>>>>>>>> do you push to not handle it? Is it due to some
>>> >>>>>>>>>>>>>>>>>>>>>>>>> legacy code in Dataflow, or something else?
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
>>> kind of change you're asking
>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is
>>> >>>>>>>>>>>>>>>>>>>>>>> not called, then it is a bug and we are done :).
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for users? The direct
>>> >>>>>>>>>>>>>>>>>>>>>>>>> runner does it, so if a user uses the reference
>>> >>>>>>>>>>>>>>>>>>>>>>>>> implementation in tests, will he get a
>>> >>>>>>>>>>>>>>>>>>>>>>>>> different behavior in prod? Also, don't forget
>>> >>>>>>>>>>>>>>>>>>>>>>>>> the user doesn't know what the IOs he composes
>>> >>>>>>>>>>>>>>>>>>>>>>>>> use, so this is so impactful for the whole
>>> >>>>>>>>>>>>>>>>>>>>>>>>> product that it must be handled, IMHO.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in
>>> >>>>>>>>>>>>>>>>>>>>>>>>> the big data world, but that is not a reason to
>>> >>>>>>>>>>>>>>>>>>>>>>>>> ignore what people have done for years and to
>>> >>>>>>>>>>>>>>>>>>>>>>>>> do it wrong before doing it right ;).
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions - the
>>> >>>>>>>>>>>>>>>>>>>>>>>>> execution of teardown. Then we see if we can
>>> >>>>>>>>>>>>>>>>>>>>>>>>> handle it, and only if there is a technical
>>> >>>>>>>>>>>>>>>>>>>>>>>>> reason we can't do we make it
>>> >>>>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Spark and Flink can; any unknown blocker for
>>> >>>>>>>>>>>>>>>>>>>>>>>>> other runners?
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Java shutdown hooks; otherwise your environment
>>> >>>>>>>>>>>>>>>>>>>>>>>>> (the software enclosing Beam) is fully
>>> >>>>>>>>>>>>>>>>>>>>>>>>> unhandled and your overall system is
>>> >>>>>>>>>>>>>>>>>>>>>>>>> uncontrolled. The only case where this is not
>>> >>>>>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by a
>>> >>>>>>>>>>>>>>>>>>>>>>>>> vendor and never installed on a customer
>>> >>>>>>>>>>>>>>>>>>>>>>>>> environment. In that case it belongs to the
>>> >>>>>>>>>>>>>>>>>>>>>>>>> vendor to handle the Beam API, and not to Beam
>>> >>>>>>>>>>>>>>>>>>>>>>>>> to adjust its API for a vendor - otherwise all
>>> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be
>>> >>>>>>>>>>>>>>>>>>>>>>>>> made optional, right?
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in
>>> >>>>>>>>>>>>>>>>>>>>>>>>> distributed systems, so it is key to have an
>>> >>>>>>>>>>>>>>>>>>>>>>>>> explicit, defined lifecycle.
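For context on the shutdown-hook point: the JDK's `Runtime.addShutdownHook` runs registered threads on normal termination and SIGTERM, but never on `kill -9` or a hard JVM crash - which is exactly the boundary this thread is arguing about. A minimal sketch (class and thread names are illustrative):

```java
public class TeardownHook {
    // Registers a cleanup action to run at JVM shutdown and returns the
    // hook thread so it can later be deregistered if cleanup already ran.
    public static Thread register(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "teardown-hook");
        // Runs on System.exit() and SIGTERM; never on SIGKILL or a crash.
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Thread hook = register(() -> System.out.println("releasing resources"));
        // A well-behaved supervisor sends SIGTERM first, giving this hook a
        // chance to run; "kill -9" skips it entirely.
        // Deregistering is possible while the JVM is still running:
        boolean removed = Runtime.getRuntime().removeShutdownHook(hook);
        System.out.println(removed);
    }
}
```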
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>
>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
+1, thks Eugene


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-20 0:24 GMT+01:00 Eugene Kirpichov <ki...@google.com>:

> I've sent out a PR editing the Javadoc https://github.com/
> apache/beam/pull/4711 . Hopefully, that should be sufficient.
>
> On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:
>
>> Ismael, your understanding is appropriate for FinishBundle.
>>
>> One basic issue with this understanding, is that the lifecycle of a DoFn
>> is much longer than a single bundle (which I think you expressed by adding
>> the *s). How long the DoFn lives is not defined. In fact a runner is
>> completely free to decide that it will _never_ destroy the DoFn, in which
>> case TearDown is never called simply because the DoFn was never torn down.
>>
>> Also, as mentioned before, the runner can only call TearDown in cases
>> where the shutdown is in its control. If the JVM is shut down externally,
>> the runner has no chance to call TearDown. This means that while TearDown
>> is appropriate for cleaning up in-process resources (open connections,
>> etc.), it's not the right answer for cleaning up persistent resources. If
>> you rely on TearDown to delete VMs or delete files, there will be cases in
>> which those files or VMs are not deleted.
>>
>> What we are _not_ saying is that the runner is free to just ignore
>> TearDown. If the runner is explicitly destroying a DoFn object, it should
>> call TearDown.
>>
>> Reuven
>>
>>
>> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:
>>
>>> I also had a different understanding of the lifecycle of a DoFn.
>>>
>>> My understanding of the use case for every method in the DoFn was clear
>>> and perfectly aligned with Thomas's explanation, but what I understood
>>> was that, in general terms, '@Setup is where I get resources/prepare
>>> connections and @Teardown is where I free them', so calling Teardown
>>> seemed essential to have a complete lifecycle:
>>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
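The lifecycle above can be simulated in plain Java (no Beam dependency; the method names merely mirror the DoFn annotations) to show the ordering users expect, with teardown in a `finally` so it runs even if a bundle throws:

```java
import java.util.ArrayList;
import java.util.List;

// Plain-Java stand-in for a DoFn; the harness below drives it through the
// expected order: setup, then bundles, then teardown.
class LifecycleFn {
    final List<String> calls = new ArrayList<>();
    void setup()          { calls.add("setup"); }
    void startBundle()    { calls.add("startBundle"); }
    void processElement() { calls.add("processElement"); }
    void finishBundle()   { calls.add("finishBundle"); }
    void teardown()       { calls.add("teardown"); }

    // The try/finally is the crux of the thread: teardown runs whenever the
    // harness itself survives, even when a bundle fails.
    static List<String> run(int bundles, int elementsPerBundle) {
        LifecycleFn fn = new LifecycleFn();
        fn.setup();
        try {
            for (int b = 0; b < bundles; b++) {
                fn.startBundle();
                for (int e = 0; e < elementsPerBundle; e++) fn.processElement();
                fn.finishBundle();
            }
        } finally {
            fn.teardown();
        }
        return fn.calls;
    }
}
```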
>>>
>>> The fact that @Teardown might not be called is a new detail for me too,
>>> and I also find it weird to have a method that may or may not be called
>>> as part of an API. Why would users implement teardown if it will not be
>>> called? In that case, probably a cleaner approach would be to get rid of
>>> the method altogether, no?
>>>
>>> But well, maybe that's not so easy either; there was another point: some
>>> user reported an issue with leaking resources using KafkaIO in the Spark
>>> runner, for reference:
>>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>>>
>>> At that moment my understanding was that something was fishy, because we
>>> should be calling Teardown to close the connections correctly and free
>>> the resources in case of exceptions on start/process/finish, so I filed
>>> a JIRA and fixed this by enforcing the call of teardown for the Spark
>>> and Flink runners:
>>> https://issues.apache.org/jira/browse/BEAM-3187
>>> https://issues.apache.org/jira/browse/BEAM-3244
>>>
>>> As you can see, not calling this method does have consequences, at least
>>> for non-containerized runners. Of course a runner that uses containers
>>> might not care about cleaning up the resources this way, but a
>>> long-lived JVM in a Hadoop environment probably won't have the same
>>> luck. So I am not sure that having a loose semantic there is the right
>>> option. I mean, runners could simply guarantee that they call teardown,
>>> and if teardown takes too long they can decide to send a signal or kill
>>> the process/container/etc. and go ahead. That way at least users would
>>> have a motivation to implement the teardown method; otherwise it doesn't
>>> make any sense to have it (API-wise).
>>>
>>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <ki...@google.com>
>>> wrote:
>>> > Romain, would it be fair to say that currently the goal of your
>>> > participation in this discussion is to identify situations where
>>> @Teardown
>>> > in principle could have been called, but some of the current runners
>>> don't
>>> > make a good enough effort to call it? If yes - as I said before,
>>> please, by
>>> > all means, file bugs of the form "Runner X doesn't call @Teardown in
>>> > situation Y" if you're aware of any, and feel free to send PRs fixing
>>> runner
>>> > X to reliably call @Teardown in situation Y. I think we all agree that
>>> this
>>> > would be a good improvement.
>>> >
>>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
>>> rmannibucau@gmail.com>
>>> > wrote:
>>> >>
>>> >>
>>> >>
>>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>>> >>
>>> >>
>>> >>
>>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>>> >> <rm...@gmail.com> wrote:
>>> >>>
>>> >>>
>>> >>>
>>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>>> >>>
>>> >>> How do you call teardown? There are cases in which the Java code
>>> gets no
>>> >>> indication that the restart is happening (e.g. cases where the
>>> machine
>>> >>> itself is taken down)
>>> >>>
>>> >>>
>>> >>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
>>> >>> Crashes are bugs, and kill -9 to shut down is a bug too. In other
>>> >>> cases, call shutdown with a hook, worst case.
>>> >>
>>> >>
>>> >> What you say here is simply not true.
>>> >>
>>> >> There are many scenarios in which workers shutdown with no
>>> opportunity for
>>> >> any sort of shutdown hook. Sometimes the entire machine gets
>>> shutdown, and
>>> >> not even the OS will have much of a chance to do anything. At scale
>>> this
>>> >> will happen with some regularity, and a distributed system that
>>> assumes this
>>> >> will not happen is a poor distributed system.
>>> >>
>>> >>
>>> >> This is part of the infra, and there is no reason a machine is shut
>>> >> down without first shutting down what runs on it, except if it is a
>>> >> bug in the software or the setup. I hear that you maybe don't do it
>>> >> everywhere, but there is no blocker to doing it. That means you can
>>> >> shut down the machines and guarantee teardown is called.
>>> >>
>>> >> Where I'm going is simply that it is doable, and the Beam SDK core can
>>> >> assume the setup is well done. If there is a best-effort downside due
>>> >> to that - with the meaning you defined - it is an implementation bug
>>> >> or a user installation issue.
>>> >>
>>> >> Technically, all of this holds.
>>> >>
>>> >> What can prevent teardown is a hardware failure or the like. This is
>>> >> fine and doesn't need to be in the doc, since it is life in IT and
>>> >> obvious - or it must be made very explicit to avoid the current
>>> >> ambiguity.
>>> >>
>>> >>
>>> >>>
>>> >>>
>>> >>>
>>> >>>
>>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
>>> rmannibucau@gmail.com>
>>> >>> wrote:
>>> >>>>
>>> >>>> Restarting doesn't mean you don't call teardown. Except for a bug,
>>> >>>> there is no reason - technically - for it to happen; no reason.
>>> >>>>
>>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>>> >>>>>
>>> >>>>> Workers restarting is not a bug; it's standard and often expected.
>>> >>>>>
>>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>>> >>>>> <rm...@gmail.com> wrote:
>>> >>>>>>
>>> >>>>>> Nothing; as mentioned, it is a bug, so the recovery is a
>>> >>>>>> bug-recovery (procedure)
>>> >>>>>>
>>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com>
>>> a
>>> >>>>>> écrit :
>>> >>>>>>>
>>> >>>>>>> So what would you like to happen if there is a crash? The DoFn
>>> >>>>>>> instance no longer exists because the JVM it ran on no longer
>>> exists. What
>>> >>>>>>> should Teardown be called on?
>>> >>>>>>>
>>> >>>>>>>
>>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>>> >>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>
>>> >>>>>>>> This is what I want, and not 999999 teardowns for 1000000 setups
>>> >>>>>>>> until there is an unexpected crash (= a bug).
>>> >>>>>>>>
>>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a
>>> écrit :
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>>> >>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>>> >>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> @Reuven: in practice they are created in pools of 256, but
>>> >>>>>>>>>>>> it leads to the same pattern; the teardown is just an "if
>>> >>>>>>>>>>>> (iCreatedThem) releaseThem();"
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
>>> workers,
>>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
>>> created per
>>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
>>> threads on each
>>> >>>>>>>>>>> worker.
>>> >>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> Nope, it was the other way around: in this case on AWS you can
>>> >>>>>>>>>> get 256 instances at once but not 512 (which would be 2x256).
>>> >>>>>>>>>> So when you compute the distribution, you allocate to some fn
>>> >>>>>>>>>> the role of owning the instance lookup and release.
>>> >>>>>>>>>
>>> >>>>>>>>>
>>> >>>>>>>>> I still don't understand. Let's be more precise. If you write
>>> the
>>> >>>>>>>>> following code:
>>> >>>>>>>>>
>>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>> >>>>>>>>>
>>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
>>> >>>>>>>>> created. The runner might decide to create a million instances
>>> >>>>>>>>> of this class across your worker pool, which means that you
>>> >>>>>>>>> will get a million Setup and Teardown calls.
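One way to square "a million instances" with safe cleanup is for each instance to release only what it itself created, idempotently - a plain-Java sketch of that guard (the class and the connection counter are hypothetical stand-ins, not Beam API):

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Each DoFn-like instance acquires lazily and releases at most once, so a
// runner creating a million copies gets a million matched setup/teardown
// pairs without any double-release, however many copies it decides to make.
class GuardedFn {
    static final AtomicInteger liveConnections = new AtomicInteger();

    private final AtomicBoolean created = new AtomicBoolean(false);

    void setup() {
        if (created.compareAndSet(false, true)) {
            liveConnections.incrementAndGet(); // stand-in for opening a connection
        }
    }

    void teardown() {
        // Release only what this instance created, and only once.
        if (created.compareAndSet(true, false)) {
            liveConnections.decrementAndGet();
        }
    }
}
```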
>>> >>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>> Anyway, this was just an example of an external resource you
>>> >>>>>>>>>> must release. The real topic is that Beam should define, ASAP,
>>> >>>>>>>>>> a guaranteed generic lifecycle to let users embrace its
>>> >>>>>>>>>> programming model.
>>> >>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> @Eugene:
>>> >>>>>>>>>>>> 1. the Wait logic is about passing the value, which is not
>>> >>>>>>>>>>>> always possible (around 15% of cases, from my rough estimate)
>>> >>>>>>>>>>>> 2. SDF: I'll try to detail why I mention SDF here
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Concretely, Beam exposes a portable API (included in the SDK
>>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
>>> >>>>>>>>>>>> implies bean lifecycles. I'll not detail them all, but just
>>> >>>>>>>>>>>> use sources and DoFns (not SDF) to illustrate the idea I'm
>>> >>>>>>>>>>>> trying to develop.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> A. Source
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>> >>>>>>>>>>>> estimateSize and split. As a user you expect both to
>>> be called on the
>>> >>>>>>>>>>>> same bean instance to avoid paying the same connection
>>> cost(s) twice.
>>> >>>>>>>>>>>> Concretely:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> connect()
>>> >>>>>>>>>>>> try {
>>> >>>>>>>>>>>>   estimateSize()
>>> >>>>>>>>>>>>   split()
>>> >>>>>>>>>>>> } finally {
>>> >>>>>>>>>>>>   disconnect()
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> connect()
>>> >>>>>>>>>>>> try {
>>> >>>>>>>>>>>>   estimateSize()
>>> >>>>>>>>>>>> } finally {
>>> >>>>>>>>>>>>   disconnect()
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>> connect()
>>> >>>>>>>>>>>> try {
>>> >>>>>>>>>>>>   split()
>>> >>>>>>>>>>>> } finally {
>>> >>>>>>>>>>>>   disconnect()
>>> >>>>>>>>>>>> }
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> + a workaround caching the estimated size internally, since
>>> >>>>>>>>>>>> this primitive is often called in split but you don't want to
>>> connect twice in the
>>> >>>>>>>>>>>> second phase.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Why do you need that? Simply because you want to define an
>>> API to
>>> >>>>>>>>>>>> implement sources which initializes the source bean and
>>> destroys it.
>>> >>>>>>>>>>>> I insist this is a very basic concern for such an API.
>>> However
>>> >>>>>>>>>>>> Beam doesn't embrace or assume it, so building
>>> any API on top of
>>> >>>>>>>>>>>> Beam is very painful today, and direct Beam users
>>> hit the exact same
>>> >>>>>>>>>>>> issues - check how IOs are implemented: the static utilities
>>> create
>>> >>>>>>>>>>>> short-lived connections, preventing reuse of an existing
>>> connection within a single
>>> >>>>>>>>>>>> method
>>> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/
>>> elasticsearch/src/main/java/org/apache/beam/sdk/io/
>>> elasticsearch/ElasticsearchIO.java#L862).
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> B. DoFn & SDF
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> As a fn dev you expect the same from the Beam runtime:
>>> init();
>>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and
>>> that it is
>>> >>>>>>>>>>>> executed on the exact same instance, to be able to be
>>> stateful at that level
>>> >>>>>>>>>>>> for expensive connections/operations/flow-state handling.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> As you mentioned with the million example, this sequence
>>> should
>>> >>>>>>>>>>>> happen for each single instance, so 1M times in your
>>> example.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>> >>>>>>>>>>>> generalisation of both cases (source and DoFn). Therefore
>>> it creates way
>>> >>>>>>>>>>>> more instances and requires a much
>>> stricter/more explicit definition of
>>> >>>>>>>>>>>> the exact lifecycle and of which instance does what. Since
>>> Beam handles the
>>> >>>>>>>>>>>> full lifecycle of the bean instances, it must provide
>>> init/destroy hooks
>>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
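>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> To make the expected contract concrete, here is a plain-Java
>>> >>>>>>>>>>>> sketch of the init/process/destroy sequence I mean (a model of
>>> >>>>>>>>>>>> the guarantee only, not actual Beam API):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical harness, not a Beam class: it only models the call
// sequence a fn developer expects the runtime to guarantee per instance.
class FnLifecycle {
    final List<String> calls = new ArrayList<>();

    void setup()    { calls.add("setup"); }    // expensive init, e.g. open a connection
    void process()  { calls.add("process"); }
    void teardown() { calls.add("teardown"); } // release the resource; must be guaranteed

    // What the runtime should do with each instance it creates:
    void run(int elements) {
        setup();
        try {
            for (int i = 0; i < elements; i++) process();
        } finally {
            teardown(); // called even if process() throws
        }
    }
}
```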
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Take the JDBC example which was mentioned earlier.
>>> >>>>>>>>>>>> Today, because of the teardown issue, it uses bundles. Since
>>> bundle size is
>>> >>>>>>>>>>>> not defined - and will not be with SDF - it must use a pool to
>>> be able to reuse
>>> >>>>>>>>>>>> a connection instance and not hurt performance. Now with
>>> the SDF and the
>>> >>>>>>>>>>>> split increase, how do you handle the pool size? Generally
>>> in batch you use
>>> >>>>>>>>>>>> a single connection per thread to avoid consuming all
>>> database connections.
>>> >>>>>>>>>>>> With a pool you have 2 choices: 1. use a pool of 1; 2. use
>>> a pool a bit
>>> >>>>>>>>>>>> larger but, multiplied by the number of beans, you will
>>> likely x2 or x3 the
>>> >>>>>>>>>>>> connection count and make the execution fail with "no more
>>> connection
>>> >>>>>>>>>>>> available". If you picked 1 (pool of #1), then you still
>>> have to have a
>>> >>>>>>>>>>>> reliable teardown per pool instance (close() generally) to
>>> ensure you release
>>> >>>>>>>>>>>> the pool and don't leak the connection information in the
>>> JVM. In all cases
>>> >>>>>>>>>>>> you come back to the init()/destroy() lifecycle, even if you
>>> pretend to scope
>>> >>>>>>>>>>>> connections to bundles.
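>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> A sketch of the pool-of-1 option, with a fake connection class
>>> >>>>>>>>>>>> standing in for a real JDBC connection (illustrative names
>>> >>>>>>>>>>>> only): it is opened in setup() and only a guaranteed
>>> >>>>>>>>>>>> teardown() releases it.

```java
// FakeConnection stands in for a real JDBC connection; all names here
// are illustrative, not Beam or JDBC API.
class FakeConnection {
    boolean open = true;
    void close() { open = false; }
}

class JdbcLikeFn {
    FakeConnection connection; // the "pool of 1"

    void setup() { connection = new FakeConnection(); }
    void process(String row) {
        if (!connection.open) throw new IllegalStateException("connection closed");
        // write the row using the per-instance connection...
    }
    void teardown() { connection.close(); } // if never called, the connection leaks
}
```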
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Just to make it obvious: the SDF mentions are only because
>>> SDF amplifies
>>> >>>>>>>>>>>> all the current issues with the loose definition of the
>>> bean lifecycle to
>>> >>>>>>>>>>>> an exponential level, nothing else.
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> Romain Manni-Bucau
>>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning
>>> can be
>>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in
>>> the thread above,
>>> >>>>>>>>>>>>> and I believe it should become the canonical way to do
>>> that.
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> (I would like to reiterate one more time, as the main author
>>> of
>>> >>>>>>>>>>>>> most design documents related to SDF and of its
>>> implementation in the Java
>>> >>>>>>>>>>>>> direct and Dataflow runners, that SDF is fully unrelated to
>>> the topic of
>>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up.)
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>>
>>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>> >>>>>>>>>>>>>> understanding is that SDF could be a way to unify it and
>>> clean up the API.
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Otherwise how do we normalize - with a single API - the
>>> lifecycle of
>>> >>>>>>>>>>>>>> transforms?
>>> >>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
>>> bchambers@apache.org>
>>> >>>>>>>>>>>>>> a écrit :
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
>>> DoFns
>>> >>>>>>>>>>>>>>> is appropriate? In many cases where cleanup is necessary,
>>> it is around an
>>> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
>>> discussions/proposals
>>> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
>>> haven't been
>>> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
>>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance
>>> of
>>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
>>> destination).
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
>>> workers,
>>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>> >>>>>>>>>>>>>>> The move step should only happen once, so on one worker.
>>> This
>>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some
>>> stuff done to ensure it
>>> >>>>>>>>>>>>>>> runs on one worker.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some
>>> cleanup work for
>>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is
>>> relatively straightforward,
>>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
>>> such as BigQuery
>>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import
>>> into BigQuery.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want
>>> to
>>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
>>> wait until the end of
>>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until you
>>> know nobody will
>>> >>>>>>>>>>>>>>> need the resource anymore.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API,
>>> where
>>> >>>>>>>>>>>>>>> you could have a transform that output resource objects.
>>> Each resource
>>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
>>> would be something
>>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
>>> resource, and what
>>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
>>> that part of the
>>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer
>>> need the resources,
>>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
>>> shutdown, or
>>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
>>> case?
>>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
>>> sufficient?
>>> >>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread
>>> of a worker - has
>>> >>>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage
>>> collection.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
>>> >>>>>>>>>>>>>>>> (deserialization) but it doesn't matter here.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> What I want is for any "new" to be followed by setup
>>> before
>>> >>>>>>>>>>>>>>>> any process or startBundle, and, the last time Beam has
>>> the instance before it
>>> >>>>>>>>>>>>>>>> is gc-ed and after the last finishBundle, a call to teardown.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> It is as simple as that.
>>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way that
>>> makes a fn not self
>>> >>>>>>>>>>>>>>>> contained to implement basic transforms.
>>> >>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com>
>>> a
>>> >>>>>>>>>>>>>>>> écrit :
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
>>> Rather
>>> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing
>>> methods -- which have been
>>> >>>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it would
>>> be helpful to focus
>>> >>>>>>>>>>>>>>>>>> more on the reason you are looking for something with
>>> different semantics.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
>>> trying
>>> >>>>>>>>>>>>>>>>>> to do):
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline.
>>> If this is the case,
>>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
>>> once (and not once per
>>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
>>> know when the pipeline
>>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
>>> step X", then what
>>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> When the DoFn is logically no longer needed, i.e. when
>>> >>>>>>>>>>>>>>>>>> the batch is done or the stream is stopped (manually or
>>> by a JVM shutdown)
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and
>>> each
>>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of
>>> the same DoFn). How
>>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 =
>>> 1M cleanups) and when
>>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is
>>> shut down? When an
>>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
>>> temporary - may be
>>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
>>> methods are not a good fit
>>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
>>> within the DoFn), you could
>>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
>>> produced. For instance:
>>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token
>>> that
>>> >>>>>>>>>>>>>>>>>> stores information about resources)
>>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent
>>> retries
>>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
>>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it
>>> is
>>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in
>>> use or have been finished
>>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
>>> important to ensuring
>>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
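>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> A minimal plain-Java model of the pattern (the names and
>>> >>>>>>>>>>>>>>>>>> token format are illustrative, not an actual API):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of steps (a)-(f): resources travel through the
// pipeline as data (tokens), so the "free" step sees exactly the
// resources that were initialized, even across retries.
class ResourcePattern {
    static final Set<String> live = new HashSet<>();

    static String init(String id) { live.add(id); return id; }           // step (c)
    static String use(String id)  { return id + ":done"; }               // step (d)
    static void free(String token) { live.remove(token.split(":")[0]); } // step (f)

    static void run(List<String> ids) {            // ids are deterministic (steps a-b)
        List<String> tokens = new ArrayList<>();
        for (String id : ids) tokens.add(use(init(id)));
        for (String t : tokens) free(t);           // every acquired resource is freed
    }
}
```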
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case-by-case, to
>>> >>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
>>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying
>>> to accomplish? That
>>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
>>> existing options and
>>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
>>> cases, but it
>>> >>>>>>>>>>>>>>>>>> means each transform is different in its lifecycle
>>> handling. I
>>> >>>>>>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you
>>> can't put any unified
>>> >>>>>>>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard
>>> to integrate or to use
>>> >>>>>>>>>>>>>>>>>> to build higher-level libraries or software.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
>>> >>>>>>>>>>>>>>>>>> discussion and just stay at the API level.
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> -- Ben
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of
>>> the
>>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
>>> machine.
>>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
>>> impossible or impractical
>>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you
>>> can list some of the
>>> >>>>>>>>>>>>>>>>>>>> examples above.
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
>>> called - it's not just
>>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
>>> important (e.g. cleaning up
>>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
>>> large number of VMs you
>>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of
>>> the other methods that
>>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a
>>> cost, e.g. no
>>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so
>>> I'm
>>> >>>>>>>>>>>>>>>>>>> not sure which other method you mean. Concretely,
>>> if you make it really
>>> >>>>>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to
>>> me - then users can't use it
>>> >>>>>>>>>>>>>>>>>>> to clean anything; but if you make it "can happen but
>>> it is unexpected and
>>> >>>>>>>>>>>>>>>>>>> means something happened" then it is fine to have a
>>> manual - or automatic if fancy
>>> >>>>>>>>>>>>>>>>>>> - recovery procedure. This is where it makes all the
>>> difference and impacts
>>> >>>>>>>>>>>>>>>>>>> the developers and ops (all users, basically).
>>> >>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> Agreed Eugene, except that "best effort" means that.
>>> It
>>> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will", and this is
>>> what triggered this thread.
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state
>>> prevents
>>> >>>>>>>>>>>>>>>>>>>>> it", but "best effort" is too open and can be
>>> perceived very badly and wrongly
>>> >>>>>>>>>>>>>>>>>>>>> by users (as I did).
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github |
>>> LinkedIn |
>>> >>>>>>>>>>>>>>>>>>>>> Book
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call
>>> it:
>>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
>>> crash), and in a number of
>>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker
>>> container has crashed (eg user code
>>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over JNI
>>> and it segfaulted), JVM
>>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
>>> worker has lost network
>>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't
>>> be able to do anything
>>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible
>>> VM and it was preempted by
>>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or
>>> if the worker was too busy
>>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
>>> functions) until the preemption
>>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
>>> simply failed (which
>>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
>>> conditions.
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to
>>> describe
>>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
>>> cases where you observed a
>>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
>>> was possible to call it but
>>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
>>> Manni-Bucau
>>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
>>> (e.g.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and
>>> it requires the following
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup
>>> logic and the following processing
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform
>>> requiring a
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
>>> since their size is not controlled.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesn't allow you to release
>>> the connection since it is a best
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> effort thing. Not releasing the connection
>>> makes you pay a lot - AWS ;) - or
>>> >>>>>>>>>>>>>>>>>>>>>>>>>> prevents you from launching other processing -
>>> concurrency limit.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
>>> called then nothing else can be
>>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What
>>> AWS service are you thinking of
>>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when
>>> everything at the other end has died?
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless,
>>> but
>>> >>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require some
>>> closing exchanges which are not
>>> >>>>>>>>>>>>>>>>>>>>>>>>> only "I'm leaving".
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some
>>> services
>>> >>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly at pipeline startup
>>> and closing them at the end.
>>> >>>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines
>>> and money. You can say it can be
>>> >>>>>>>>>>>>>>>>>>>>>>>>> done another way... as can the full pipeline ;).
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
>>> handle its
>>> >>>>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale
>>> for generic pipelines and is
>>> >>>>>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown -
>>> ignoring
>>> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar crash case, which can't be
>>> handled by any human system?
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Nothing, technically. Why do you push not to
>>> handle it? Is it due to some
>>> >>>>>>>>>>>>>>>>>>>>>>>>> legacy code in Dataflow or something else?
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
>>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
>>> kind of change you're asking
>>> >>>>>>>>>>>>>>>>>>>>>>>> for.
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is
>>> not
>>> >>>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>> >>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for users? The direct
>>> runner
>>> >>>>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, he
>>> will get a different behavior
>>> >>>>>>>>>>>>>>>>>>>>>>>>> in prod? Also don't forget the user doesn't know
>>> what the IOs he composes use,
>>> >>>>>>>>>>>>>>>>>>>>>>>>> so this is so impactful for the whole product
>>> that it must be handled IMHO.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in
>>> the big
>>> >>>>>>>>>>>>>>>>>>>>>>>>> data world, but it is not a reason to ignore
>>> what people did for years and to do
>>> >>>>>>>>>>>>>>>>>>>>>>>>> it wrong before doing it right ;).
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
>>> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions - the
>>> execution of teardown. Then we
>>> >>>>>>>>>>>>>>>>>>>>>>>>> see if we can handle it, and only if there is a
>>> technical reason we can't do we
>>> >>>>>>>>>>>>>>>>>>>>>>>>> make it experimental/unsupported in the API. I
>>> know Spark and Flink can; any
>>> >>>>>>>>>>>>>>>>>>>>>>>>> known blocker for other runners?
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
>>> Java
>>> >>>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment
>>> (the software enclosing Beam) is fully
>>> >>>>>>>>>>>>>>>>>>>>>>>>> unhandled and your overall system is
>>> uncontrolled. The only case where it is not
>>> >>>>>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by a
>>> vendor and never installed in a
>>> >>>>>>>>>>>>>>>>>>>>>>>>> customer environment. In this case it belongs
>>> to the vendor to handle the Beam
>>> >>>>>>>>>>>>>>>>>>>>>>>>> API and not to Beam to adjust its API for a
>>> vendor - otherwise all
>>> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be
>>> made optional, right?
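>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> For reference, the hook mechanism is one line of
>>> >>>>>>>>>>>>>>>>>>>>>>>>> standard Java (sketch only; it covers normal
>>> >>>>>>>>>>>>>>>>>>>>>>>>> shutdown and SIGTERM, not kill -9 or hardware
>>> >>>>>>>>>>>>>>>>>>>>>>>>> failure):

```java
// A plain JVM shutdown hook a harness could register to attempt
// teardown on a normal shutdown or SIGTERM. The class name is
// illustrative; only Runtime.addShutdownHook is standard Java.
class TeardownHook {
    static Thread register(Runnable teardown) {
        Thread hook = new Thread(teardown, "teardown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```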
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in
>>> distributed
>>> >>>>>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit,
>>> defined lifecycle.
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>>>>>>
>>> >>>>>>>>>>>>
>>> >>>>>>>>>>>
>>> >>>>>>>>>>
>>> >>>>>>>>>
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>
>>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
I've sent out a PR editing the Javadoc
https://github.com/apache/beam/pull/4711 . Hopefully, that should be
sufficient.

On Mon, Feb 19, 2018 at 3:20 PM Reuven Lax <re...@google.com> wrote:

> Ismael, your understanding is appropriate for FinishBundle.
>
> One basic issue with this understanding, is that the lifecycle of a DoFn
> is much longer than a single bundle (which I think you expressed by adding
> the *s). How long the DoFn lives is not defined. In fact a runner is
> completely free to decide that it will _never_ destroy the DoFn, in which
> case TearDown is never called simply because the DoFn was never torn down.
>
> Also, as mentioned before, the runner can only call TearDown in cases
> where the shutdown is in its control. If the JVM is shut down externally,
> the runner has no chance to call TearDown. This means that while TearDown
> is appropriate for cleaning up in-process resources (open connections,
> etc.), it's not the right answer for cleaning up persistent resources. If
> you rely on TearDown to delete VMs or delete files, there will be cases in
> which those files or VMs are not deleted.
>
> What we are _not_ saying is that the runner is free to just ignore
> TearDown. If the runner is explicitly destroying a DoFn object, it should
> call TearDown.
>
> Reuven
>
>
> On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:
>
>> I also had a different understanding of the lifecycle of a DoFn.
>>
>> My understanding of the use case for every method in the DoFn was clear
>> and
>> perfectly aligned with Thomas explanation, but what I understood was that
>> in a
>> general terms ‘@Setup was where I got resources/prepare connections and
>> @Teardown where I free them’, so calling Teardown seemed essential to
>> have a
>> complete lifecycle:
>> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
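>>
>> A plain-Java model of that sequence (not Beam API, just the expected
>> call order):

```java
import java.util.ArrayList;
import java.util.List;

// Model of: Setup -> (StartBundle -> ProcessElement* -> FinishBundle)* -> Teardown
// LifecycleModel is a hypothetical class used only to illustrate the order.
class LifecycleModel {
    final List<String> calls = new ArrayList<>();

    void run(int bundles, int elementsPerBundle) {
        calls.add("Setup");
        try {
            for (int b = 0; b < bundles; b++) {
                calls.add("StartBundle");
                for (int e = 0; e < elementsPerBundle; e++) calls.add("ProcessElement");
                calls.add("FinishBundle");
            }
        } finally {
            calls.add("Teardown"); // the disputed guarantee
        }
    }
}
```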
>>
>> The fact that @Teardown could not be called is a new detail for me too,
>> and I
>> also find weird to have a method that may or not be called as part of an
>> API,
>> why would users implement teardown if it will not be called? In that case
>> probably a cleaner approach would be to get rid of that method
>> altogether, no?
>>
>> But well maybe that’s not so easy too, there was another point: Some user
>> reported an issue with leaking resources using KafkaIO in the Spark
>> runner, for
>> ref.
>> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>>
>> At that moment my understanding was that something fishy was going on,
>> because we
>> should be calling Teardown to correctly close the connections and free the
>> resources in case of exceptions on start/process/finish, so I filed a
>> JIRA and
>> fixed this by enforcing the call of teardown for the Spark runner and the
>> Flink
>> runner:
>> https://issues.apache.org/jira/browse/BEAM-3187
>> https://issues.apache.org/jira/browse/BEAM-3244
>>
>> As you can see not calling this method does have consequences at least for
>> non-containerized runners. Of course a runner that uses containers could
>> not
>> care about cleaning the resources this way, but a long living JVM in a
>> Hadoop
>> environment probably won’t have the same luck. So I am not sure that
>> having a
>> loose semantic there is the right option, I mean, runners could simply
>> guarantee
>> that they call teardown and if teardown takes too long they can decide to
>> send a
>> signal or kill the process/container/etc and go ahead, that way at least
>> users
>> would have a motivation to implement the teardown method, otherwise it
>> doesn’t
>> make any sense to have it (API wise).
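>>
>> A sketch of that "teardown with a timeout" idea in plain Java
>> (illustrative names, not an actual runner API):

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// The runner waits a fixed time for teardown to complete, then gives up
// (and may kill the worker/process anyway). BoundedTeardown is hypothetical.
class BoundedTeardown {
    static boolean call(Runnable teardown, long timeoutMillis) {
        ExecutorService es = Executors.newSingleThreadExecutor();
        Future<?> f = es.submit(teardown);
        try {
            f.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return true;  // teardown completed in time
        } catch (InterruptedException | TimeoutException | ExecutionException e) {
            f.cancel(true); // give up; the runner can now kill the process
            return false;
        } finally {
            es.shutdownNow();
        }
    }
}
```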
>>
>> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <ki...@google.com>
>> wrote:
>> > Romain, would it be fair to say that currently the goal of your
>> > participation in this discussion is to identify situations where
>> @Teardown
>> > in principle could have been called, but some of the current runners
>> don't
>> > make a good enough effort to call it? If yes - as I said before,
>> please, by
>> > all means, file bugs of the form "Runner X doesn't call @Teardown in
>> > situation Y" if you're aware of any, and feel free to send PRs fixing
>> runner
>> > X to reliably call @Teardown in situation Y. I think we all agree that
>> this
>> > would be a good improvement.
>> >
>> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
>> rmannibucau@gmail.com>
>> > wrote:
>> >>
>> >>
>> >>
>> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>> >>
>> >>
>> >>
>> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>> >> <rm...@gmail.com> wrote:
>> >>>
>> >>>
>> >>>
>> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>> >>>
>> >>> How do you call teardown? There are cases in which the Java code gets
>> no
>> >>> indication that the restart is happening (e.g. cases where the machine
>> >>> itself is taken down)
>> >>>
>> >>>
>> >>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
>> >>> Crashes are bugs, and kill -9 as a way to shut down is a bug too. Other
>> >>> cases can call shutdown with a hook in the worst case.
>> >>
>> >>
>> >> What you say here is simply not true.
>> >>
>> >> There are many scenarios in which workers shutdown with no opportunity
>> for
>> >> any sort of shutdown hook. Sometimes the entire machine gets shutdown,
>> and
>> >> not even the OS will have much of a chance to do anything. At scale
>> this
>> >> will happen with some regularity, and a distributed system that
>> assumes this
>> >> will not happen is a poor distributed system.
>> >>
>> >>
>> >> This is part of the infrastructure, and there is no reason a machine is
>> >> shut down without first shutting down what runs on it, except if there
>> >> is a bug in the software or in the setup. I hear that maybe you don't do
>> >> it everywhere, but there is no blocker to doing it. That means you can
>> >> shut down the machines and still guarantee that teardown is called.
>> >>
>> >> Where I am going is simply that it is doable, and the Beam SDK core can
>> >> assume setup is done properly. If there is a "best effort" downside -
>> >> with the meaning you defined - it is an implementation bug or a user
>> >> installation issue.
>> >>
>> >> Technically all of this is true.
>> >>
>> >> What can prevent teardown is a hardware failure or the like. That is
>> >> fine and doesn't need to be in the doc, since it is life in IT and
>> >> obvious - or it must be made very explicit to avoid the current
>> >> ambiguity.
>> >>
>> >>
>> >>>
>> >>>
>> >>>
>> >>>
>> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
>> rmannibucau@gmail.com>
>> >>> wrote:
>> >>>>
>> >>>> Restarting doesn't mean you don't call teardown. Barring a bug, there
>> >>>> is no reason - technically - for that to happen.
>> >>>>
>> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>> >>>>>
>> >>>>> Workers restarting is not a bug; it's standard and often expected.
>> >>>>>
>> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>> >>>>> <rm...@gmail.com> wrote:
>> >>>>>>
>> >>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug-recovery
>> >>>>>> procedure.
>> >>>>>>
>> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com>
>> a
>> >>>>>> écrit :
>> >>>>>>>
>> >>>>>>> So what would you like to happen if there is a crash? The DoFn
>> >>>>>>> instance no longer exists because the JVM it ran on no longer
>> exists. What
>> >>>>>>> should Teardown be called on?
>> >>>>>>>
>> >>>>>>>
>> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>> >>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>
>> >>>>>>>> This is what I want: not 999999 teardowns for 1000000 setups
>> >>>>>>>> unless there is an unexpected crash (= a bug).
>> >>>>>>>>
>> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit
>> :
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>> >>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>> >>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> @Reuven: in practice they are created by a pool of 256, but it
>> >>>>>>>>>>>> leads to the same pattern; the teardown is just an
>> >>>>>>>>>>>> "if (iCreatedThem) releaseThem();"
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
>> workers,
>> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
>> created per
>> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
>> threads on each
>> >>>>>>>>>>> worker.
>> >>>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> No, it was the other way around: in this case on AWS you can get
>> >>>>>>>>>> 256 instances at once but not 512 (which would be 2x256). So when
>> >>>>>>>>>> you compute the distribution, you assign to some fn the role of
>> >>>>>>>>>> owning the instance lookup and release.
>> >>>>>>>>>
>> >>>>>>>>>
>> >>>>>>>>> I still don't understand. Let's be more precise. If you write
>> the
>> >>>>>>>>> following code:
>> >>>>>>>>>
>> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>> >>>>>>>>>
>> >>>>>>>>> There is no way to control how many instances of MyDoFn are
>> >>>>>>>>> created. The runner might decided to create a million instances
>> of this
>> >>>>>>>>> class across your worker pool, which means that you will get a
>> million Setup
>> >>>>>>>>> and Teardown calls.
>> >>>>>>>>>
>> >>>>>>>>>>
>> >>>>>>>>>> Anyway, this was just an example of an external resource you
>> >>>>>>>>>> must release. The real topic is that Beam should define, ASAP, a
>> >>>>>>>>>> guaranteed generic lifecycle to let users embrace its programming
>> >>>>>>>>>> model.
>> >>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> @Eugene:
>> >>>>>>>>>>>> 1. the Wait logic is about passing along the value, which is
>> >>>>>>>>>>>> not always possible (in maybe 15% of cases, from my rough
>> >>>>>>>>>>>> estimate)
>> >>>>>>>>>>>> 2. SDF: I'll try to detail below why I mention SDF so much
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>> >>>>>>>>>>>> core). This API defines a *container* API and therefore
>> implies bean
>> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the
>> sources and dofn (not
>> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> A. Source
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>> >>>>>>>>>>>> estimateSize and split. As a user you expect both to be called
>> >>>>>>>>>>>> on the same bean instance, to avoid paying the same connection
>> >>>>>>>>>>>> cost(s) twice.
>> >>>>>>>>>>>> Concretely:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> connect()
>> >>>>>>>>>>>> try {
>> >>>>>>>>>>>>   estimateSize()
>> >>>>>>>>>>>>   split()
>> >>>>>>>>>>>> } finally {
>> >>>>>>>>>>>>   disconnect()
>> >>>>>>>>>>>> }
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> connect()
>> >>>>>>>>>>>> try {
>> >>>>>>>>>>>>   estimateSize()
>> >>>>>>>>>>>> } finally {
>> >>>>>>>>>>>>   disconnect()
>> >>>>>>>>>>>> }
>> >>>>>>>>>>>> connect()
>> >>>>>>>>>>>> try {
>> >>>>>>>>>>>>   split()
>> >>>>>>>>>>>> } finally {
>> >>>>>>>>>>>>   disconnect()
>> >>>>>>>>>>>> }
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> ...plus a workaround with an internal estimated size, since
>> >>>>>>>>>>>> this primitive is often called in split but you don't want to
>> >>>>>>>>>>>> connect twice in the second phase.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Why do you need that? Simply because you want to define an API
>> >>>>>>>>>>>> for implementing sources that initializes the source bean and
>> >>>>>>>>>>>> destroys it. I insist that this is a very basic concern for
>> >>>>>>>>>>>> such an API. However Beam doesn't embrace it and doesn't assume
>> >>>>>>>>>>>> it, so building any API on top of Beam is very painful today,
>> >>>>>>>>>>>> and direct Beam users hit the exact same issues - check how the
>> >>>>>>>>>>>> IOs are implemented: static utilities which create short-lived
>> >>>>>>>>>>>> connections, preventing reuse of an existing connection in a
>> >>>>>>>>>>>> single method
>> >>>>>>>>>>>> (
>> https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862
>> ).
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Same logic applies to the reader which is then created.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> B. DoFn & SDF
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> As a fn dev you expect the same from the Beam runtime: init();
>> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); }, executed
>> >>>>>>>>>>>> on the exact same instance, to be able to be stateful at that
>> >>>>>>>>>>>> level for expensive connections/operations/flow-state handling.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> As you mentioned with the million example, this sequence should
>> >>>>>>>>>>>> happen for each single instance, so 1M times in your example.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it
>> creates way
>> >>>>>>>>>>>> more instances and requires to have a way more
>> strict/explicit definition of
>> >>>>>>>>>>>> the exact lifecycle and which instance does what. Since beam
>> handles the
>> >>>>>>>>>>>> full lifecycle of the bean instances it must provide
>> init/destroy hooks
>> >>>>>>>>>>>> (setup/teardown) which can be stateful.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Take the JDBC example which was mentioned earlier. Today,
>> >>>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle
>> >>>>>>>>>>>> size is not defined - and will not be with SDF - it must use a
>> >>>>>>>>>>>> pool to be able to reuse a connection instance and not wreck
>> >>>>>>>>>>>> performance. Now, with SDF and the increase in splits, how do
>> >>>>>>>>>>>> you size the pool? Generally in batch you use a single
>> >>>>>>>>>>>> connection per thread to avoid consuming all the database
>> >>>>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of
>> >>>>>>>>>>>> 1; 2. use a slightly bigger pool, but multiplied by the number
>> >>>>>>>>>>>> of beans you will likely 2x or 3x the connection count and make
>> >>>>>>>>>>>> the execution fail with "no more connection available". If you
>> >>>>>>>>>>>> picked 1 (a pool of #1), then you still need a reliable
>> >>>>>>>>>>>> teardown per pool instance (close(), generally) to ensure you
>> >>>>>>>>>>>> release the pool and don't leak the connection information in
>> >>>>>>>>>>>> the JVM. In all cases you come back to the init()/destroy()
>> >>>>>>>>>>>> lifecycle, even if you pretend to get connections with bundles.
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Just to make it obvious: the SDF mentions are only because SDF
>> >>>>>>>>>>>> amplifies all the current issues with the loose definition of
>> >>>>>>>>>>>> the bean lifecycles to an exponential level, nothing else.
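The init()/process()/destroy() contract being argued for can be sketched with a small Beam-independent harness (illustrative names, not the Beam API): every setup() is paired with exactly one teardown() on the same instance, even when processing fails.

```java
// Illustrative harness (not a runner): demonstrates the requested contract -
// a guaranteed teardown() for every setup(), including on failure paths.
class FnLifecycle {
    static int setups = 0, teardowns = 0;

    void setup() { setups++; }          // e.g. open a pooled JDBC connection
    void process(int element) {
        if (element < 0) throw new IllegalArgumentException("bad element");
    }
    void teardown() { teardowns++; }    // e.g. return/close the connection

    static void run(FnLifecycle fn, int[] elements) {
        fn.setup();
        try {
            for (int e : elements) fn.process(e);
        } finally {
            fn.teardown();              // guaranteed, not best-effort
        }
    }
}
```

With this shape, a failing element still releases the connection, which is the difference between a guaranteed lifecycle and a best-effort one.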
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> Romain Manni-Bucau
>> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>> >>>>>>>>>>>>
>> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>> >>>>>>>>>>>> <ki...@google.com>:
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can
>> be
>> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in the
>> thread above,
>> >>>>>>>>>>>>> and I believe it should become the canonical way to do that.
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main author
>> of
>> >>>>>>>>>>>>> most design documents related to SDF and of its
>> implementation in the Java
>> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to
>> the topic of
>> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>>
>> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> I kind of agree, except that transforms lack a lifecycle
>> >>>>>>>>>>>>>> too. My understanding is that SDF could be a way to unify it
>> >>>>>>>>>>>>>> and clean up the API.
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Otherwise, how do we normalize - with a single API - the
>> >>>>>>>>>>>>>> lifecycle of transforms?
>> >>>>>>>>>>>>>>
>> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
>> bchambers@apache.org>
>> >>>>>>>>>>>>>> a écrit :
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFns
>> >>>>>>>>>>>>>>> is appropriate? In many cases where cleanup is necessary, it
>> >>>>>>>>>>>>>>> is around an entire composite PTransform. I think there have
>> >>>>>>>>>>>>>>> been discussions/proposals around a more methodical
>> >>>>>>>>>>>>>>> "cleanup" option, but those haven't been implemented, to the
>> >>>>>>>>>>>>>>> best of my knowledge.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
>> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance
>> of
>> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
>> destination).
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many
>> workers,
>> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>> >>>>>>>>>>>>>>> The move step should only happen once, so on one worker.
>> This
>> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff
>> done to ensure it
>> >>>>>>>>>>>>>>> runs on one worker.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some
>> cleanup work for
>> >>>>>>>>>>>>>>> when the transform is "done". In batch this is relatively
>> straightforward,
>> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
>> such as BigQuery
>> >>>>>>>>>>>>>>> sink leaving files around that have failed to import into
>> BigQuery.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want
>> to
>> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to
>> wait until the end of
>> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until you
>> know nobody will
>> >>>>>>>>>>>>>>> need the resource anymore.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where
>> >>>>>>>>>>>>>>> you could have a transform that output resource objects.
>> Each resource
>> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
>> would be something
>> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
>> resource, and what
>> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
>> that part of the
>> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer
>> need the resources,
>> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
>> shutdown, or
>> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
>> case?
>> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
>> sufficient?
>> >>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain, simplifying the overall
>> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of
>> >>>>>>>>>>>>>>>> a worker - has its lifecycle. Caricaturally: "new" and
>> >>>>>>>>>>>>>>>> garbage collection.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocate
>> >>>>>>>>>>>>>>>> (deserialization), but that doesn't matter here.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> What I want is for any "new" to be followed by a setup
>> >>>>>>>>>>>>>>>> before any process or startBundle, and, the last time Beam
>> >>>>>>>>>>>>>>>> has the instance before it is GC-ed and after the last
>> >>>>>>>>>>>>>>>> finishBundle, for it to call teardown.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> It is as simple as that.
>> >>>>>>>>>>>>>>>> This way there is no need to combine fns in a way that
>> >>>>>>>>>>>>>>>> makes a fn not self-contained just to implement basic
>> >>>>>>>>>>>>>>>> transforms.
>> >>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com>
>> a
>> >>>>>>>>>>>>>>>> écrit :
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather
>> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing methods -
>> >>>>>>>>>>>>>>>>>> which have been noted to meet many existing use cases -
>> >>>>>>>>>>>>>>>>>> it would be helpful to focus more on the reason you are
>> >>>>>>>>>>>>>>>>>> looking for something with different semantics.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
>> trying
>> >>>>>>>>>>>>>>>>>> to do):
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline.
>> If this is the case,
>> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
>> once (and not once per
>> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
>> know when the pipeline
>> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
>> step X", then what
>> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when
>> >>>>>>>>>>>>>>>>>> the batch is done or the stream is stopped (manually or
>> >>>>>>>>>>>>>>>>>> by a JVM shutdown).
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> I'm really not following what this means.
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and
>> each
>> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of
>> the same DoFn). How
>> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M
>> cleanups) and when
>> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut
>> down? When an
>> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
>> temporary - may be
>> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
>> methods are not a good fit
>> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources
>> within the DoFn), you could
>> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
>> produced. For instance:
>> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token
>> that
>> >>>>>>>>>>>>>>>>>> stores information about resources)
>> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
>> >>>>>>>>>>>>>>>>>> from changing resource IDs)
>> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
>> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it
>> is
>> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use
>> or have been finished
>> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
>> important to ensuring
>> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
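Steps (a)-(f) above can be sketched, very roughly and outside Beam, as data-driven cleanup (all names are illustrative): resources travel through the "pipeline" as tokens, and the final stage frees exactly the resources whose work is recorded as done.

```java
import java.util.*;

// Illustrative sketch of the data-driven cleanup pattern: resource IDs flow
// through the stages as data, so the cleanup stage knows exactly what to free.
class ResourceCleanupSketch {
    static Set<String> live = new HashSet<>();

    static List<String> allocate(int n) {        // step (c): init resources
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String id = "res-" + i;
            live.add(id);
            ids.add(id);
        }
        return ids;
    }
    static List<String> use(List<String> ids) {  // step (d): emit "done" facts
        return new ArrayList<>(ids);
    }
    static void free(List<String> doneIds) {     // step (f): release resources
        for (String id : doneIds) live.remove(id);
    }
}
```

Because the "done" facts are themselves pipeline data, checkpointing them (the "require deterministic input" steps) is what makes the cleanup reliable.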
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>> >>>>>>>>>>>>>>>>>> industrialize some APIs on top of Beam.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
>> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying
>> to accomplish? That
>> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
>> existing options and
>> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases,
>> >>>>>>>>>>>>>>>>>> but that means each transform is different in its
>> >>>>>>>>>>>>>>>>>> lifecycle handling. I dislike that a lot at scale and as
>> >>>>>>>>>>>>>>>>>> a user, since you can't put any unified practice on top
>> >>>>>>>>>>>>>>>>>> of Beam; it also makes Beam very hard to integrate, or to
>> >>>>>>>>>>>>>>>>>> use to build higher-level libraries or software.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
>> >>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> -- Ben
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of
>> the
>> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
>> machine.
>> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
>> impossible or impractical
>> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you
>> can list some of the
>> >>>>>>>>>>>>>>>>>>>> examples above.
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> Sounds ok to me
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
>> called - it's not just
>> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
>> important (e.g. cleaning up
>> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
>> large number of VMs you
>> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of
>> the other methods that
>> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a
>> cost, e.g. no
>> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>> Sadly, FinishBundle has the exact same guarantee, so
>> >>>>>>>>>>>>>>>>>>> I'm not sure which other method you mean. Concretely, if
>> >>>>>>>>>>>>>>>>>>> you make it really unreliable - which is what "best
>> >>>>>>>>>>>>>>>>>>> effort" sounds like to me - then users can't use it to
>> >>>>>>>>>>>>>>>>>>> clean up anything; but if you make it "it can happen,
>> >>>>>>>>>>>>>>>>>>> but it is unexpected and means something happened", then
>> >>>>>>>>>>>>>>>>>>> it is fine to have a manual - or automatic, if fancy -
>> >>>>>>>>>>>>>>>>>>> recovery procedure. This is where it makes all the
>> >>>>>>>>>>>>>>>>>>> difference and impacts the developers and ops (all
>> >>>>>>>>>>>>>>>>>>> users, basically).
>> >>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> Agreed, Eugene, except that "best effort" means that.
>> >>>>>>>>>>>>>>>>>>>>> It is also often used to mean "at will", and this is
>> >>>>>>>>>>>>>>>>>>>>> what triggered this thread.
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> I'm fine with using "except if the machine state
>> >>>>>>>>>>>>>>>>>>>>> prevents it", but "best effort" is too open and can be
>> >>>>>>>>>>>>>>>>>>>>> perceived very badly and wrongly by users (like I
>> >>>>>>>>>>>>>>>>>>>>> did).
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn
>> |
>> >>>>>>>>>>>>>>>>>>>>> Book
>> >>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call
>> it:
>> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
>> crash), and in a number of
>> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container
>> has crashed (eg user code
>> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over JNI
>> and it segfaulted), JVM
>> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the
>> worker has lost network
>> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't
>> be able to do anything
>> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible
>> VM and it was preempted by
>> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or
>> if the worker was too busy
>> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
>> functions) until the preemption
>> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
>> simply failed (which
>> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
>> conditions.
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
>> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
>> cases where you observed a
>> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
>> was possible to call it but
>> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau
>> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
>> Manni-Bucau
>> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
>> (e.g.
>> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and
>> it requires the following
>> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic
>> and the following processing
>> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer, since
>> >>>>>>>>>>>>>>>>>>>>>>>>>> their size is not controlled. Using teardown
>> >>>>>>>>>>>>>>>>>>>>>>>>>> doesn't let you release the connection, since it
>> >>>>>>>>>>>>>>>>>>>>>>>>>> is a best-effort thing. Not releasing the
>> >>>>>>>>>>>>>>>>>>>>>>>>>> connection makes you pay a lot - AWS ;) - or
>> >>>>>>>>>>>>>>>>>>>>>>>>>> prevents you from launching other processing -
>> >>>>>>>>>>>>>>>>>>>>>>>>>> concurrency limits.
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not
>> called then nothing else can be
>> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS
>> service are you thinking of
>> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything
>> at the other end has died?
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but
>> >>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require closing
>> >>>>>>>>>>>>>>>>>>>>>>>>> exchanges which are not just "I'm leaving".
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS, I was thinking about starting some
>> >>>>>>>>>>>>>>>>>>>>>>>>> services - machines - on the fly at pipeline
>> >>>>>>>>>>>>>>>>>>>>>>>>> startup and closing them at the end. If teardown
>> >>>>>>>>>>>>>>>>>>>>>>>>> is not called, you leak machines and money. You
>> >>>>>>>>>>>>>>>>>>>>>>>>> can say it can be done another way... as can the
>> >>>>>>>>>>>>>>>>>>>>>>>>> full pipeline ;).
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
>> >>>>>>>>>>>>>>>>>>>>>>>>> handle its components' lifecycle, it can't be used
>> >>>>>>>>>>>>>>>>>>>>>>>>> at scale for generic pipelines and is bound to
>> >>>>>>>>>>>>>>>>>>>>>>>>> some particular IOs.
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>> >>>>>>>>>>>>>>>>>>>>>>>>> interstellar-crash case, which can't be handled by
>> >>>>>>>>>>>>>>>>>>>>>>>>> any human system? Nothing, technically. Why do you
>> >>>>>>>>>>>>>>>>>>>>>>>>> push to not handle it? Is it due to some legacy
>> >>>>>>>>>>>>>>>>>>>>>>>>> code on Dataflow, or something else?
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
>> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what
>> kind of change you're asking
>> >>>>>>>>>>>>>>>>>>>>>>>> for.
>> >>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>> >>>>>>>>>>>>>>>>>>>>>>> called, then it is a bug, and we are done :).
>> >>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for users? The direct
>> >>>>>>>>>>>>>>>>>>>>>>>>> runner does it, so if a user uses the RI in tests,
>> >>>>>>>>>>>>>>>>>>>>>>>>> will he get a different behavior in prod? Also,
>> >>>>>>>>>>>>>>>>>>>>>>>>> don't forget that the user doesn't know what the
>> >>>>>>>>>>>>>>>>>>>>>>>>> IOs he composes use, so this is so impactful for
>> >>>>>>>>>>>>>>>>>>>>>>>>> the whole product that it must be handled, IMHO.
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in
>> >>>>>>>>>>>>>>>>>>>>>>>>> the big data world, but it is not a reason to
>> >>>>>>>>>>>>>>>>>>>>>>>>> ignore what people have done for years and do it
>> >>>>>>>>>>>>>>>>>>>>>>>>> wrong before doing it right ;).
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent the
>> >>>>>>>>>>>>>>>>>>>>>>>>> execution of teardown from being guaranteed -
>> >>>>>>>>>>>>>>>>>>>>>>>>> under normal IT conditions. Then we see if we can
>> >>>>>>>>>>>>>>>>>>>>>>>>> handle it, and only if there is a technical reason
>> >>>>>>>>>>>>>>>>>>>>>>>>> we can't do we make it experimental/unsupported in
>> >>>>>>>>>>>>>>>>>>>>>>>>> the API. I know Spark and Flink can; any unknown
>> >>>>>>>>>>>>>>>>>>>>>>>>> blockers for other runners?
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
>> >>>>>>>>>>>>>>>>>>>>>>>>> Java shutdown hooks; otherwise your environment
>> >>>>>>>>>>>>>>>>>>>>>>>>> (the software enclosing Beam) is fully unhandled
>> >>>>>>>>>>>>>>>>>>>>>>>>> and your overall system is uncontrolled. The only
>> >>>>>>>>>>>>>>>>>>>>>>>>> case where this is not true is when the software
>> >>>>>>>>>>>>>>>>>>>>>>>>> is always owned by a vendor and never installed on
>> >>>>>>>>>>>>>>>>>>>>>>>>> a customer environment. In that case it belongs to
>> >>>>>>>>>>>>>>>>>>>>>>>>> the vendor to handle the Beam API, and not to Beam
>> >>>>>>>>>>>>>>>>>>>>>>>>> to adjust its API for a vendor - otherwise all
>> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be made
>> >>>>>>>>>>>>>>>>>>>>>>>>> optional, right?
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in
>> >>>>>>>>>>>>>>>>>>>>>>>>> distributed systems, so it is key to have an
>> >>>>>>>>>>>>>>>>>>>>>>>>> explicit, well-defined lifecycle.
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>> >>>>>>>>>>>>>>>>>>>>>>>>>
>> >>>>>>>>>>>>>>>>>>>>>>>>>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
Ismael, your understanding is appropriate for FinishBundle.

One basic issue with this understanding is that the lifecycle of a DoFn is
much longer than a single bundle (which I think you expressed by adding the
*s). How long the DoFn lives is not defined. In fact a runner is completely
free to decide that it will _never_ destroy the DoFn, in which case
TearDown is never called simply because the DoFn was never torn down.

Also, as mentioned before, the runner can only call TearDown in cases where
the shutdown is in its control. If the JVM is shut down externally, the
runner has no chance to call TearDown. This means that while TearDown is
appropriate for cleaning up in-process resources (open connections, etc.),
it's not the right answer for cleaning up persistent resources. If you rely
on TearDown to delete VMs or delete files, there will be cases in which
those files or VMs are not deleted.

What we are _not_ saying is that the runner is free to just ignore
TearDown. If the runner is explicitly destroying a DoFn object, it should
call TearDown.

Reuven
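The lifecycle contract under discussion (Setup once per instance, StartBundle/FinishBundle per bundle, Teardown when the runner destroys the instance) can be sketched with a tiny stand-in harness. All names below are illustrative stand-ins, not the actual Beam API:

```java
import java.util.ArrayList;
import java.util.List;

public class LifecycleSketch {
    // Stand-in for a user DoFn that logs each lifecycle call.
    static class ConnectionDoFn {
        final List<String> log = new ArrayList<>();
        void setup() { log.add("setup"); }               // e.g. open a connection
        void startBundle() { log.add("startBundle"); }
        void processElement(String e) { log.add("process:" + e); }
        void finishBundle() { log.add("finishBundle"); } // e.g. flush buffered writes
        void teardown() { log.add("teardown"); }         // e.g. close the connection
    }

    // A runner that owns the instance calls teardown whenever it destroys it;
    // the only way to skip the finally block is for the JVM itself to die.
    public static List<String> run(List<List<String>> bundles) {
        ConnectionDoFn fn = new ConnectionDoFn();
        fn.setup();
        try {
            for (List<String> bundle : bundles) {
                fn.startBundle();
                for (String e : bundle) fn.processElement(e);
                fn.finishBundle();
            }
        } finally {
            fn.teardown();
        }
        return fn.log;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of(List.of("a", "b"), List.of("c"))));
    }
}
```

Note how teardown brackets the whole instance lifetime, not a single bundle: in-process resources are a good fit for it, while persistent resources (files, VMs) need a stronger mechanism, as described above.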


On Mon, Feb 19, 2018 at 2:35 PM, Ismaël Mejía <ie...@gmail.com> wrote:

> I also had a different understanding of the lifecycle of a DoFn.
>
> My understanding of the use case for every method in the DoFn was clear and
> perfectly aligned with Thomas's explanation, but what I understood was that in
> general terms ‘@Setup was where I got resources/prepared connections and
> @Teardown where I freed them’, so calling Teardown seemed essential to have a
> complete lifecycle:
> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
>
> The fact that @Teardown could not be called is a new detail for me too, and I
> also find it weird to have a method that may or may not be called as part of
> an API; why would users implement teardown if it will not be called? In that
> case probably a cleaner approach would be to get rid of that method
> altogether, no?
>
> But well, maybe that's not so easy either; there was another point: a user
> reported an issue with leaking resources using KafkaIO in the Spark runner,
> for ref.
> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>
> At that moment my understanding was that there was something fishy because we
> should be calling Teardown to close the connections correctly and free the
> resources in case of exceptions on start/process/finish, so I filed a JIRA
> and fixed this by enforcing the call of teardown for the Spark runner and the
> Flink runner:
> https://issues.apache.org/jira/browse/BEAM-3187
> https://issues.apache.org/jira/browse/BEAM-3244
>
> As you can see, not calling this method does have consequences, at least for
> non-containerized runners. Of course a runner that uses containers could
> choose not to care about cleaning the resources this way, but a long-lived
> JVM in a Hadoop environment probably won't have the same luck. So I am not
> sure that having loose semantics there is the right option; I mean, runners
> could simply guarantee that they call teardown, and if teardown takes too
> long they can decide to send a signal or kill the process/container/etc. and
> go ahead. That way at least users would have a motivation to implement the
> teardown method; otherwise it doesn't make any sense to have it (API-wise).
>
> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <ki...@google.com>
> wrote:
> > Romain, would it be fair to say that currently the goal of your
> > participation in this discussion is to identify situations where
> @Teardown
> > in principle could have been called, but some of the current runners
> don't
> > make a good enough effort to call it? If yes - as I said before, please,
> by
> > all means, file bugs of the form "Runner X doesn't call @Teardown in
> > situation Y" if you're aware of any, and feel free to send PRs fixing
> runner
> > X to reliably call @Teardown in situation Y. I think we all agree that
> this
> > would be a good improvement.
> >
> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
> rmannibucau@gmail.com>
> > wrote:
> >>
> >>
> >>
> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
> >>
> >>
> >>
> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
> >> <rm...@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
> >>>
> >>> How do you call teardown? There are cases in which the Java code gets
> no
> >>> indication that the restart is happening (e.g. cases where the machine
> >>> itself is taken down)
> >>>
> >>>
> >>> This is a bug; 0-downtime maintenance is very doable in 2018 ;). Crashes
> >>> are bugs, and kill -9 to shut down is a bug too. The other cases can call
> >>> shutdown with a hook, worst case.
> >>
> >>
> >> What you say here is simply not true.
> >>
> >> There are many scenarios in which workers shutdown with no opportunity
> for
> >> any sort of shutdown hook. Sometimes the entire machine gets shutdown,
> and
> >> not even the OS will have much of a chance to do anything. At scale this
> >> will happen with some regularity, and a distributed system that assumes
> this
> >> will not happen is a poor distributed system.
> >>
> >>
> >> This is part of the infra, and there is no reason the machine is shut down
> >> without first shutting down what runs on it, except if it is a bug in the
> >> software or setup. I hear that maybe you don't do it everywhere, but there
> >> is no blocker to doing it. It means you can shut down the machines and
> >> guarantee teardown is called.
> >>
> >> Where I'm going is simply that it is doable, and the Beam SDK core can
> >> assume setup is done well. If there is a best-effort downside due to that -
> >> with the meaning you defined - it is an impl bug or a user installation
> >> issue.
> >>
> >> Technically all is true.
> >>
> >> What can prevent teardown is a hardware failure or the like. This is fine
> >> and doesn't need to be in the doc since it is life in IT and obvious, or it
> >> must be made very explicit to avoid the current ambiguity.
> >>
> >>
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
> rmannibucau@gmail.com>
> >>> wrote:
> >>>>
> >>>> Restarting doesn't mean you don't call teardown. Except for a bug, there
> >>>> is no reason - technically - that it happens, no reason.
> >>>>
> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
> >>>>>
> >>>>> Workers restarting is not a bug; it's standard and often expected.
> >>>>>
> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
> >>>>> <rm...@gmail.com> wrote:
> >>>>>>
> >>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug-recovery
> >>>>>> (procedure).
> >>>>>>
> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
> >>>>>> écrit :
> >>>>>>>
> >>>>>>> So what would you like to happen if there is a crash? The DoFn
> >>>>>>> instance no longer exists because the JVM it ran on no longer
> exists. What
> >>>>>>> should Teardown be called on?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
> >>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> This is what I want, and not 999999 teardowns for 1000000 setups
> >>>>>>>> until there is an unexpected crash (= a bug).
> >>>>>>>>
> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
> >>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
> >>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256 but leads to
> >>>>>>>>>>>> the same pattern; the teardown is just an "if (iCreatedThem)
> >>>>>>>>>>>> releaseThem();"
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
> workers,
> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
> created per
> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
> threads on each
> >>>>>>>>>>> worker.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> No, it was the other way around: in this case on AWS you can get
> >>>>>>>>>> 256 instances at once but not 512 (which would be 2x256). So when
> >>>>>>>>>> you compute the distribution you assign to some fn the role of
> >>>>>>>>>> owning the instance lookup and release.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I still don't understand. Let's be more precise. If you write the
> >>>>>>>>> following code:
> >>>>>>>>>
> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
> >>>>>>>>>
> >>>>>>>>> There is no way to control how many instances of MyDoFn are
> >>>>>>>>> created. The runner might decided to create a million instances
> of this
> >>>>>>>>> class across your worker pool, which means that you will get a
> million Setup
> >>>>>>>>> and Teardown calls.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Anyway, this was just an example of an external resource you must
> >>>>>>>>>> release. The real topic is that Beam should define ASAP a
> >>>>>>>>>> guaranteed generic lifecycle to let users embrace its programming
> >>>>>>>>>> model.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Eugene:
> >>>>>>>>>>>> 1. the Wait logic is about passing the value, which is not
> >>>>>>>>>>>> always possible (around 15% of cases, from my rough estimate)
> >>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Concretely, Beam exposes a portable API (included in the SDK
> >>>>>>>>>>>> core). This API defines a *container* API and therefore implies
> >>>>>>>>>>>> bean lifecycles. I won't detail them all but will just use
> >>>>>>>>>>>> sources and dofn (not sdf) to illustrate the idea I'm trying to
> >>>>>>>>>>>> develop.
> >>>>>>>>>>>>
> >>>>>>>>>>>> A. Source
> >>>>>>>>>>>>
> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
> >>>>>>>>>>>> estimateSize and split. As a user you can expect both to be
> >>>>>>>>>>>> called on the same bean instance to avoid paying the same
> >>>>>>>>>>>> connection cost(s) twice. Concretely:
> >>>>>>>>>>>>
> >>>>>>>>>>>> connect()
> >>>>>>>>>>>> try {
> >>>>>>>>>>>>   estimateSize()
> >>>>>>>>>>>>   split()
> >>>>>>>>>>>> } finally {
> >>>>>>>>>>>>   disconnect()
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
> >>>>>>>>>>>>
> >>>>>>>>>>>> connect()
> >>>>>>>>>>>> try {
> >>>>>>>>>>>>   estimateSize()
> >>>>>>>>>>>> } finally {
> >>>>>>>>>>>>   disconnect()
> >>>>>>>>>>>> }
> >>>>>>>>>>>> connect()
> >>>>>>>>>>>> try {
> >>>>>>>>>>>>   split()
> >>>>>>>>>>>> } finally {
> >>>>>>>>>>>>   disconnect()
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> + a workaround with an internal estimated size, since this
> >>>>>>>>>>>> primitive is often called in split but you don't want to connect
> >>>>>>>>>>>> twice in the second phase.
> >>>>>>>>>>>>
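A minimal sketch of the internal-estimate workaround just described, with hypothetical names (not the Beam Source API): the size computed in estimateSize() is cached so that split() does not pay a second connection just to size itself:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class CachedSizeSource {
    // Counts connect() calls so the savings are observable in the demo.
    public static final AtomicInteger connects = new AtomicInteger();

    private Long cachedSize;

    private void connect() { connects.incrementAndGet(); } // stand-in for a real connection
    private void disconnect() { }                          // stand-in for closing it
    private long computeSize() { return 42L; }             // stand-in for a remote size query

    public long estimateSize() {
        if (cachedSize == null) {                // only the first call connects
            connect();
            try { cachedSize = computeSize(); } finally { disconnect(); }
        }
        return cachedSize;
    }

    public List<String> split(int desiredShards) {
        long size = estimateSize();              // reuses the cached value: no second connect
        connect();
        try {
            return List.of("shard-0-of-" + desiredShards + "-size-" + size); // placeholder plan
        } finally { disconnect(); }
    }

    public static void main(String[] args) {
        CachedSizeSource s = new CachedSizeSource();
        s.estimateSize();
        s.split(4);
        System.out.println("connections opened: " + connects.get());
    }
}
```

With the cache, estimateSize() plus split() opens two connections instead of three; a connect-once-around-both lifecycle, as argued above, would bring it down to one.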
> >>>>>>>>>>>> Why do you need that? Simply because you want to define an API
> >>>>>>>>>>>> to implement sources which initializes the source bean and
> >>>>>>>>>>>> destroys it. I insist: it is a very, very basic concern for such
> >>>>>>>>>>>> an API. However, Beam doesn't embrace it and doesn't assume it,
> >>>>>>>>>>>> so building any API on top of Beam is very painful today, and
> >>>>>>>>>>>> direct Beam users hit the exact same issues - check how IOs are
> >>>>>>>>>>>> implemented: the static utilities which create volatile
> >>>>>>>>>>>> connections, preventing reuse of an existing connection in a
> >>>>>>>>>>>> single method
> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
> >>>>>>>>>>>>
> >>>>>>>>>>>> The same logic applies to the reader which is then created.
> >>>>>>>>>>>>
> >>>>>>>>>>>> B. DoFn & SDF
> >>>>>>>>>>>>
> >>>>>>>>>>>> As a fn dev you expect the same from the Beam runtime: init();
> >>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and that it
> >>>>>>>>>>>> is executed on the exact same instance, to be able to be stateful
> >>>>>>>>>>>> at that level for expensive connections/operations/flow-state
> >>>>>>>>>>>> handling.
> >>>>>>>>>>>>
> >>>>>>>>>>>> As you mentioned with the million example, this sequence should
> >>>>>>>>>>>> happen for each single instance, so 1M times in your example.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Now, why did I mention SDF several times? Because SDF is a
> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it
> >>>>>>>>>>>> creates way more instances and requires a much more
> >>>>>>>>>>>> strict/explicit definition of the exact lifecycle and of which
> >>>>>>>>>>>> instance does what. Since Beam handles the full lifecycle of the
> >>>>>>>>>>>> bean instances, it must provide init/destroy hooks
> >>>>>>>>>>>> (setup/teardown) which can be stateful.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Take the JDBC example which was mentioned earlier. Today,
> >>>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle size
> >>>>>>>>>>>> is not defined - and will not be with SDF - it must use a pool to
> >>>>>>>>>>>> be able to reuse a connection instance so as not to kill
> >>>>>>>>>>>> performance. Now, with SDF and the increase in splits, how do you
> >>>>>>>>>>>> handle the pool size? Generally in batch you use a single
> >>>>>>>>>>>> connection per thread to avoid consuming all database
> >>>>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1;
> >>>>>>>>>>>> 2. use a somewhat bigger pool, but multiplied by the number of
> >>>>>>>>>>>> beans you will likely 2x or 3x the connection count and make the
> >>>>>>>>>>>> execution fail with "no more connections available". If you
> >>>>>>>>>>>> picked 1 (pool of #1), then you still have to have a reliable
> >>>>>>>>>>>> teardown per pool instance (close() generally) to ensure you
> >>>>>>>>>>>> release the pool and don't leak the connection information in
> >>>>>>>>>>>> the JVM. In all cases you come back to the init()/destroy()
> >>>>>>>>>>>> lifecycle, even if you pretend to get connections with bundles.
> >>>>>>>>>>>>
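The "pool of 1" option can be sketched as follows; the names are hypothetical and the connection is a stand-in object, but the shape shows why a reliable close()/teardown is needed to avoid leaking it:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class PooledWriterFn implements AutoCloseable {
    // Counts live connections so the leak-free behavior is observable.
    public static final AtomicInteger openConnections = new AtomicInteger();

    private Object connection; // stand-in for e.g. java.sql.Connection

    private Object connection() {
        if (connection == null) {              // lazy "pool" of exactly one
            openConnections.incrementAndGet();
            connection = new Object();
        }
        return connection;
    }

    public void write(String row) {
        Object c = connection();               // every bundle reuses the same connection
        // in a real implementation: c.prepareStatement(...), execute, etc.
    }

    @Override
    public void close() {                      // what a guaranteed @Teardown would run
        if (connection != null) {
            openConnections.decrementAndGet(); // release, instead of leaking in the JVM
            connection = null;
        }
    }
}
```

However many bundles write through the instance, exactly one connection is open until close() runs; if the runner never calls the teardown hook, that one connection is leaked, which is the complaint above.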
> >>>>>>>>>>>> Just to make it obvious: the SDF mentions are just because SDF
> >>>>>>>>>>>> implies all the current issues with the loose definition of the
> >>>>>>>>>>>> bean lifecycles at an exponential level, nothing else.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Romain Manni-Bucau
> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can
> be
> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in the
> thread above,
> >>>>>>>>>>>>> and I believe it should become the canonical way to do that.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main author of
> >>>>>>>>>>>>> most design documents related to SDF and of its
> implementation in the Java
> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to
> the topic of
> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
> >>>>>>>>>>>>>> understanding is that SDF could be a way to unify it and clean
> >>>>>>>>>>>>>> the API.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Otherwise, how to normalize - in a single API - the lifecycle
> >>>>>>>>>>>>>> of transforms?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
> bchambers@apache.org>
> >>>>>>>>>>>>>> a écrit :
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFns
> >>>>>>>>>>>>>>> is appropriate? In many cases where cleanup is necessary, it
> >>>>>>>>>>>>>>> is around an entire composite PTransform. I think there have
> >>>>>>>>>>>>>>> been discussions/proposals around a more methodical "cleanup"
> >>>>>>>>>>>>>>> option, but those haven't been implemented, to the best of my
> >>>>>>>>>>>>>>> knowledge.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
> destination).
> >>>>>>>>>>>>>>>
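The three steps above can be sketched with plain java.nio file operations (a hypothetical stand-in, not the actual FileIO implementation):

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;

public class TempFileFinalize {
    public static void writeAndFinalize(Path tmpDir, Path outDir, List<String> shards)
            throws IOException {
        Files.createDirectories(tmpDir);
        Files.createDirectories(outDir);
        // 1. Write each shard to a temporary file.
        for (int i = 0; i < shards.size(); i++) {
            Files.writeString(tmpDir.resolve("shard-" + i + ".tmp"), shards.get(i));
        }
        // 2. All temporaries are complete: move them to the final destination,
        //    so readers never see a partial result there.
        for (int i = 0; i < shards.size(); i++) {
            Files.move(tmpDir.resolve("shard-" + i + ".tmp"),
                       outDir.resolve("shard-" + i),
                       StandardCopyOption.ATOMIC_MOVE);
        }
        // 3. Clean up any leftover temporaries and the temp directory itself.
        try (DirectoryStream<Path> leftovers = Files.newDirectoryStream(tmpDir)) {
            for (Path p : leftovers) Files.delete(p);
        }
        Files.delete(tmpDir);
    }

    // Self-contained demo; wraps checked exceptions for brevity.
    public static List<String> demo() {
        try {
            Path base = Files.createTempDirectory("finalize-demo");
            writeAndFinalize(base.resolve("tmp"), base.resolve("out"), List.of("a", "b"));
            try (var out = Files.list(base.resolve("out"))) {
                return out.map(p -> p.getFileName().toString()).sorted().toList();
            }
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```

The crash-safety point made next in the thread applies directly: if the process dies between steps 1 and 3, the temporary files stay behind, which is exactly the cleanup problem that a per-DoFn teardown alone cannot solve.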
> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
> >>>>>>>>>>>>>>> The move step should only happen once, so on one worker.
> This
> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff
> done to ensure it
> >>>>>>>>>>>>>>> runs on one worker.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some
> cleanup work for
> >>>>>>>>>>>>>>> when the transform is "done". In batch this is relatively
> straightforward,
> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
> such as BigQuery
> >>>>>>>>>>>>>>> sink leaving files around that have failed to import into
> BigQuery.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait
> until the end of
> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until you
> know nobody will
> >>>>>>>>>>>>>>> need the resource anymore.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where
> >>>>>>>>>>>>>>> you could have a transform that output resource objects.
> Each resource
> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
> would be something
> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
> resource, and what
> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
> that part of the
> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer
> need the resources,
> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
> shutdown, or
> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
> case?
> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
> sufficient?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain, simplifying the overall
> >>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a
> >>>>>>>>>>>>>>>> worker - has its lifecycle. Caricaturally: "new" and garbage
> >>>>>>>>>>>>>>>> collection.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
> >>>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> What I want is for any "new" to be followed by setup before
> >>>>>>>>>>>>>>>> any process or startBundle, and, the last time Beam has the
> >>>>>>>>>>>>>>>> instance before it is gc-ed and after the last finishBundle,
> >>>>>>>>>>>>>>>> for it to call teardown.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> It is as simple as that. This way there is no need to combine
> >>>>>>>>>>>>>>>> fns in a way that makes a fn not self-contained to implement
> >>>>>>>>>>>>>>>> basic transforms.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
> >>>>>>>>>>>>>>>> écrit :
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather
> >>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing methods --
> >>>>>>>>>>>>>>>>>> which have been noted to meet many existing use cases -- it
> >>>>>>>>>>>>>>>>>> would be helpful to focus more on the reason you are
> >>>>>>>>>>>>>>>>>> looking for something with different semantics.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
> trying
> >>>>>>>>>>>>>>>>>> to do):
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If
> this is the case,
> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
> once (and not once per
> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
> know when the pipeline
> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
> step X", then what
> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> When the dofn is logically no longer needed, i.e. when the
> >>>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a
> >>>>>>>>>>>>>>>>>> JVM shutdown).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'm really not following what this means.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and
> each
> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of
> the same DoFn). How
> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M
> cleanups) and when
> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut
> down? When an
> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
> temporary - may be
> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
> methods are not a good fit
> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources within
> the DoFn), you could
> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
> produced. For instance:
> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
> >>>>>>>>>>>>>>>>>> stores information about resources)
> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
> >>>>>>>>>>>>>>>>>> from changing resource IDs)
> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use
> or have been finished
> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
> important to ensuring
> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
> >>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying
> to accomplish? That
> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
> existing options and
> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases,
> >>>>>>>>>>>>>>>>>> but that means each transform is different in its lifecycle
> >>>>>>>>>>>>>>>>>> handling. I dislike that a lot at scale and as a user,
> >>>>>>>>>>>>>>>>>> since you can't put any unified practice on top of Beam; it
> >>>>>>>>>>>>>>>>>> also makes Beam very hard to integrate or to use to build
> >>>>>>>>>>>>>>>>>> higher-level libraries or software.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround discussions
> >>>>>>>>>>>>>>>>>> and just stay at the API level.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> -- Ben
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of
> the
> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
> machine.
> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
> impossible or impractical
> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you
> can list some of the
> >>>>>>>>>>>>>>>>>>>> examples above.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Sounds ok to me
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
> called - it's not just
> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
> important (e.g. cleaning up
> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
> large number of VMs you
> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the
> other methods that
> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a
> cost, e.g. no
> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm
> >>>>>>>>>>>>>>>>>>> not sure which other method you're speaking about.
> >>>>>>>>>>>>>>>>>>> Concretely, if you make it really unreliable - this is
> >>>>>>>>>>>>>>>>>>> what "best effort" sounds like to me - then users can't
> >>>>>>>>>>>>>>>>>>> use it to clean anything, but if you make it "can happen
> >>>>>>>>>>>>>>>>>>> but it is unexpected and means something happened" then it
> >>>>>>>>>>>>>>>>>>> is fine to have a manual - or automatic, if fancy -
> >>>>>>>>>>>>>>>>>>> recovery procedure. This is where it makes all the
> >>>>>>>>>>>>>>>>>>> difference and impacts the developers and ops (all users,
> >>>>>>>>>>>>>>>>>>> basically).
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Agree, Eugene, except that "best effort" means that. It
> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will", and this is what
> >>>>>>>>>>>>>>>>>>>>> triggered this thread.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents
> >>>>>>>>>>>>>>>>>>>>> it", but "best effort" is too open and can be perceived
> >>>>>>>>>>>>>>>>>>>>> very badly and wrongly by users (as I did).
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn |
> >>>>>>>>>>>>>>>>>>>>> Book
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it:
> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
> crash), and in a number of
> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container
> has crashed (eg user code
> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over JNI
> and it segfaulted), JVM
> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the worker
> has lost network
> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be
> able to do anything
> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible
> VM and it was preempted by
> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or if
> the worker was too busy
> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
> functions) until the preemption
> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
> simply failed (which
> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
> conditions.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
> cases where you observed a
> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
> was possible to call it but
> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
> Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
> (e.g.
> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it
> requires the following
> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic
> and the following processing
> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
> since size is not controlled.
> >>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesnt allow you to release the
> connection since it is a best
> >>>>>>>>>>>>>>>>>>>>>>>>>> effort thing. Not releasing the connection
> makes you pay a lot - aws ;) - or
> >>>>>>>>>>>>>>>>>>>>>>>>>> prevents you to launch other processings -
> concurrent limit.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not called
> then nothing else can be
> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS
> service are you thinking of
> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything
> at the other end has died?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are stateless, but some
> >>>>>>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require closing
> >>>>>>>>>>>>>>>>>>>>>>>>> exchanges that are more than just "I'm leaving".
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> For aws i was thinking about starting some
> services
> >>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly in a pipeline startup
> and closing them at the end.
> >>>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines and
> money. You can say it can be
> >>>>>>>>>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't
> >>>>>>>>>>>>>>>>>>>>>>>>> handle its components' lifecycle, it can't be
> >>>>>>>>>>>>>>>>>>>>>>>>> used at scale for generic pipelines and stays
> >>>>>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> What does prevent to enforce teardown - ignoring
> >>>>>>>>>>>>>>>>>>>>>>>>> the interstellar crash case which cant be
> handled by any human system?
> >>>>>>>>>>>>>>>>>>>>>>>>> Nothing technically. Why do you push to not
> handle it? Is it due to some
> >>>>>>>>>>>>>>>>>>>>>>>>> legacy code on dataflow or something else?
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what kind
> of change you're asking
> >>>>>>>>>>>>>>>>>>>>>>>> for.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is
> >>>>>>>>>>>>>>>>>>>>>>> not called then it is a bug and we are done :).
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The
> >>>>>>>>>>>>>>>>>>>>>>>>> direct runner calls teardown, so if a user
> >>>>>>>>>>>>>>>>>>>>>>>>> relies on the reference implementation in
> >>>>>>>>>>>>>>>>>>>>>>>>> tests, he will get a different behavior in
> >>>>>>>>>>>>>>>>>>>>>>>>> prod? Also, don't forget the user doesn't know
> >>>>>>>>>>>>>>>>>>>>>>>>> what the IOs he composes use, so this is so
> >>>>>>>>>>>>>>>>>>>>>>>>> impactful for the whole product that it must be
> >>>>>>>>>>>>>>>>>>>>>>>>> handled IMHO.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in
> big
> >>>>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore what
> people did for years and do
> >>>>>>>>>>>>>>>>>>>>>>>>> it wrong before doing right ;).
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent us from
> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions - the
> >>>>>>>>>>>>>>>>>>>>>>>>> execution of teardown. Then we see if we can
> >>>>>>>>>>>>>>>>>>>>>>>>> handle each case, and only if there is a
> >>>>>>>>>>>>>>>>>>>>>>>>> technical reason we can't do we make it
> >>>>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know
> >>>>>>>>>>>>>>>>>>>>>>>>> Spark and Flink can; any unknown blocker for
> >>>>>>>>>>>>>>>>>>>>>>>>> other runners?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through
> >>>>>>>>>>>>>>>>>>>>>>>>> Java shutdown hooks; otherwise your environment
> >>>>>>>>>>>>>>>>>>>>>>>>> (the software enclosing Beam) is fully
> >>>>>>>>>>>>>>>>>>>>>>>>> unhandled and your overall system is
> >>>>>>>>>>>>>>>>>>>>>>>>> uncontrolled. The only case where that is not
> >>>>>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by a
> >>>>>>>>>>>>>>>>>>>>>>>>> vendor and never installed on a customer
> >>>>>>>>>>>>>>>>>>>>>>>>> environment. In that case it belongs to the
> >>>>>>>>>>>>>>>>>>>>>>>>> vendor to handle the Beam API, and not to Beam
> >>>>>>>>>>>>>>>>>>>>>>>>> to adjust its API for a vendor - otherwise all
> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be
> >>>>>>>>>>>>>>>>>>>>>>>>> made optional, right?
> >>>>>>>>>>>>>>>>>>>>>>>>>
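The shutdown-hook point above can be sketched in plain Java. This is a minimal illustration, not Beam code; the class and field names are invented for the example. On normal JVM exit, including a plain `kill` (SIGTERM), registered hooks run; only `kill -9` (SIGKILL) or a hard crash skips them.

```java
public class ShutdownHookDemo {
    static volatile boolean tornDown = false;

    // Stand-in for calling teardown on the DoFn instances still alive.
    static final Thread TEARDOWN_HOOK = new Thread(() -> tornDown = true);

    public static void main(String[] args) {
        // Registered hooks run on normal JVM exit and on SIGTERM;
        // only SIGKILL or a hard crash bypasses them.
        Runtime.getRuntime().addShutdownHook(TEARDOWN_HOOK);
        System.out.println("pipeline work here");
        // Normal return triggers the hook before the JVM exits.
    }
}
```

This is the mechanism the argument relies on: an orderly stop of the enclosing software gives the runner a chance to call teardown.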
> >>>>>>>>>>>>>>>>>>>>>>>>> All state is not about network, even in
> distributed
> >>>>>>>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit and
> defined lifecycle.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>
> >>
> >>
> >
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Agreed.

Let's try one more time:


Any issue with removing "best effort"?
If yes, any issue with explaining that it only happens on failure and is
not a runner choice?

If either of those is fine, then we close this thread and just decide who
fixes it; if not, we must define and discuss why there is a teardown at
all and how to implement the need in Beam.



Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-19 23:35 GMT+01:00 Ismaël Mejía <ie...@gmail.com>:

> I also had a different understanding of the lifecycle of a DoFn.
>
> My understanding of the use case for every method in the DoFn was clear and
> perfectly aligned with Thomas's explanation, but what I understood was,
> in general terms, ‘@Setup is where I get resources/prepare connections
> and @Teardown is where I free them’, so calling Teardown seemed essential
> to have a complete lifecycle:
> Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown
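That lifecycle can be modeled with a toy harness in plain Java. This is a sketch of the contract under discussion, not the Beam API; TrackingFn and run are invented names, and the point is that teardown is paired with setup even when a bundle throws.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleModel {
    // Hypothetical stand-in for a user DoFn; the method names mirror the
    // Beam annotations, but this is an illustration only.
    static class TrackingFn {
        final List<String> calls = new ArrayList<>();
        void setup()          { calls.add("setup"); }
        void startBundle()    { calls.add("startBundle"); }
        void processElement() { calls.add("processElement"); }
        void finishBundle()   { calls.add("finishBundle"); }
        void teardown()       { calls.add("teardown"); }
    }

    // A toy "runner" driving Setup -> (StartBundle -> ProcessElement* ->
    // FinishBundle)* -> Teardown, with teardown in a finally block so it
    // runs even if a bundle throws.
    static List<String> run(TrackingFn fn, int bundles, int elementsPerBundle) {
        fn.setup();
        try {
            for (int b = 0; b < bundles; b++) {
                fn.startBundle();
                for (int e = 0; e < elementsPerBundle; e++) {
                    fn.processElement();
                }
                fn.finishBundle();
            }
        } finally {
            fn.teardown();
        }
        return fn.calls;
    }
}
```

The finally block is the whole argument in miniature: within one healthy JVM, nothing prevents the runner from always pairing teardown with setup.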
>
> The fact that @Teardown might not be called is a new detail for me too,
> and I also find it weird to have a method that may or may not be called
> as part of an API; why would users implement teardown if it will not be
> called? In that case probably a cleaner approach would be to get rid of
> that method altogether, no?
>
> But well maybe that’s not so easy too, there was another point: Some user
> reported an issue with leaking resources using KafkaIO in the Spark
> runner, for
> ref.
> https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622
>
> At that moment my understanding was that something was fishy, because we
> should be calling Teardown to correctly close the connections and free
> the resources in case of exceptions on start/process/finish, so I filed
> a JIRA and fixed this by enforcing the call of teardown for the Spark
> and Flink runners:
> https://issues.apache.org/jira/browse/BEAM-3187
> https://issues.apache.org/jira/browse/BEAM-3244
>
> As you can see, not calling this method does have consequences, at least
> for non-containerized runners. Of course a runner that uses containers
> might not care about cleaning up resources this way, but a long-lived
> JVM in a Hadoop environment probably won't have the same luck. So I am
> not sure that having loose semantics there is the right option; I mean,
> runners could simply guarantee that they call teardown, and if teardown
> takes too long they can decide to send a signal or kill the
> process/container/etc. and go ahead. That way at least users would have
> a motivation to implement the teardown method; otherwise it doesn't make
> any sense to have it (API-wise).
>
> On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <ki...@google.com>
> wrote:
> > Romain, would it be fair to say that currently the goal of your
> > participation in this discussion is to identify situations where
> @Teardown
> > in principle could have been called, but some of the current runners
> don't
> > make a good enough effort to call it? If yes - as I said before, please,
> by
> > all means, file bugs of the form "Runner X doesn't call @Teardown in
> > situation Y" if you're aware of any, and feel free to send PRs fixing
> runner
> > X to reliably call @Teardown in situation Y. I think we all agree that
> this
> > would be a good improvement.
> >
> > On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <
> rmannibucau@gmail.com>
> > wrote:
> >>
> >>
> >>
> >> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
> >>
> >>
> >>
> >> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
> >> <rm...@gmail.com> wrote:
> >>>
> >>>
> >>>
> >>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
> >>>
> >>> How do you call teardown? There are cases in which the Java code gets
> no
> >>> indication that the restart is happening (e.g. cases where the machine
> >>> itself is taken down)
> >>>
> >>>
> >>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
> >>> Crashes are bugs, and using kill -9 to shut down is a bug too. The
> >>> other cases can still call shutdown via a hook in the worst case.
> >>
> >>
> >> What you say here is simply not true.
> >>
> >> There are many scenarios in which workers shutdown with no opportunity
> for
> >> any sort of shutdown hook. Sometimes the entire machine gets shutdown,
> and
> >> not even the OS will have much of a chance to do anything. At scale this
> >> will happen with some regularity, and a distributed system that assumes
> this
> >> will not happen is a poor distributed system.
> >>
> >>
> >> This is part of the infra, and there is no reason a machine is shut
> >> down without first shutting down what runs on it, except if there is a
> >> bug in the software or the setup. I hear that you maybe don't do it
> >> everywhere, but there is no blocker to doing it. That means you can
> >> shut down the machines and guarantee teardown is called.
> >>
> >> Where I'm going is simply that it is doable, and the Beam SDK core can
> >> assume the setup is well done. If there is a best-effort downside due
> >> to that - with the meaning you defined - it is an implementation bug
> >> or a user installation issue.
> >>
> >> Technically, all of this holds.
> >>
> >> What can prevent teardown is a hardware failure or the like. That is
> >> fine and doesn't need to be in the doc, since it is life in IT and
> >> obvious - or it must be made very explicit to avoid the current
> >> ambiguity.
> >>
> >>
> >>>
> >>>
> >>>
> >>>
> >>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <
> rmannibucau@gmail.com>
> >>> wrote:
> >>>>
> >>>> Restarting doesn't mean you don't call teardown. Barring a bug,
> >>>> there is no reason - technically - for it not to happen.
> >>>>
> >>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
> >>>>>
> >>>>> Workers restarting is not a bug; it's standard and often expected.
> >>>>>
> >>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
> >>>>> <rm...@gmail.com> wrote:
> >>>>>>
> >>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug-recovery
> >>>>>> procedure.
> >>>>>>
> >>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
> >>>>>> écrit :
> >>>>>>>
> >>>>>>> So what would you like to happen if there is a crash? The DoFn
> >>>>>>> instance no longer exists because the JVM it ran on no longer
> exists. What
> >>>>>>> should Teardown be called on?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
> >>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>> This is what I want - not 999,999 teardowns for 1,000,000 setups
> >>>>>>>> because of an unexpected crash (= a bug).
> >>>>>>>>
> >>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
> >>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
> >>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Reuven: in practice they are created by pools of 256, but
> >>>>>>>>>>>> it leads to the same pattern; the teardown is just an "if
> >>>>>>>>>>>> (iCreatedThem) releaseThem();"
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> How do you control "256?" Even if you have a pool of 256
> workers,
> >>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are
> created per
> >>>>>>>>>>> worker. In theory the runner might decide to create 1000
> threads on each
> >>>>>>>>>>> worker.
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Nope, it was the other way around: in this case, on AWS you
> >>>>>>>>>> can get 256 instances at once but not 512 (which would be
> >>>>>>>>>> 2x256). So when you compute the distribution, you assign to
> >>>>>>>>>> some fn the role of owning the instance lookup and release.
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> I still don't understand. Let's be more precise. If you write the
> >>>>>>>>> following code:
> >>>>>>>>>
> >>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
> >>>>>>>>>
> >>>>>>>>> There is no way to control how many instances of MyDoFn are
> >>>>>>>>> created. The runner might decided to create a million instances
> of this
> >>>>>>>>> class across your worker pool, which means that you will get a
> million Setup
> >>>>>>>>> and Teardown calls.
> >>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> Anyway, this was just an example of an external resource you
> >>>>>>>>>> must release. The real topic is that Beam should define, ASAP,
> >>>>>>>>>> a guaranteed generic lifecycle to let users embrace its
> >>>>>>>>>> programming model.
> >>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> @Eugene:
> >>>>>>>>>>>> 1. the Wait logic is about passing the value along, which is
> >>>>>>>>>>>> not always possible (in roughly 15% of cases, from my rough
> >>>>>>>>>>>> estimate)
> >>>>>>>>>>>> 2. SDF: I'll try to detail below why I keep mentioning SDF
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
> >>>>>>>>>>>> core). This API defines a *container* API and therefore
> implies bean
> >>>>>>>>>>>> lifecycles. I'll not detail them all but just use the sources
> and dofn (not
> >>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
> >>>>>>>>>>>>
> >>>>>>>>>>>> A. Source
> >>>>>>>>>>>>
> >>>>>>>>>>>> A source computes a partition plan with 2 primitives:
> >>>>>>>>>>>> estimateSize and split. As a user you expect both to be
> >>>>>>>>>>>> called on the same bean instance, to avoid paying the same
> >>>>>>>>>>>> connection cost(s) twice. Concretely:
> >>>>>>>>>>>>
> >>>>>>>>>>>> connect()
> >>>>>>>>>>>> try {
> >>>>>>>>>>>>   estimateSize()
> >>>>>>>>>>>>   split()
> >>>>>>>>>>>> } finally {
> >>>>>>>>>>>>   disconnect()
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> this is not guaranteed by the API so you must do:
> >>>>>>>>>>>>
> >>>>>>>>>>>> connect()
> >>>>>>>>>>>> try {
> >>>>>>>>>>>>   estimateSize()
> >>>>>>>>>>>> } finally {
> >>>>>>>>>>>>   disconnect()
> >>>>>>>>>>>> }
> >>>>>>>>>>>> connect()
> >>>>>>>>>>>> try {
> >>>>>>>>>>>>   split()
> >>>>>>>>>>>> } finally {
> >>>>>>>>>>>>   disconnect()
> >>>>>>>>>>>> }
> >>>>>>>>>>>>
> >>>>>>>>>>>> + a workaround with an internal estimated size, since this
> >>>>>>>>>>>> primitive is often called in split but you don't want to
> >>>>>>>>>>>> connect twice in the second phase.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Why do you need that? Simply because you want to define an
> >>>>>>>>>>>> API for implementing sources that initializes the source
> >>>>>>>>>>>> bean and destroys it.
> >>>>>>>>>>>> I insist this is a very, very basic concern for such an API.
> >>>>>>>>>>>> However, Beam doesn't embrace it and doesn't assume it, so
> >>>>>>>>>>>> building any API on top of Beam is very painful today, and
> >>>>>>>>>>>> direct Beam users hit the exact same issues - check how the
> >>>>>>>>>>>> IOs are implemented: static utilities that create
> >>>>>>>>>>>> short-lived connections, preventing the reuse of an existing
> >>>>>>>>>>>> connection across a single method
> >>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Same logic applies to the reader which is then created.
> >>>>>>>>>>>>
> >>>>>>>>>>>> B. DoFn & SDF
> >>>>>>>>>>>>
> >>>>>>>>>>>> As a fn dev you expect the same from the Beam runtime:
> >>>>>>>>>>>> init(); try { while (...) process(); } finally { destroy(); }
> >>>>>>>>>>>> executed on the exact same instance, so that it can be
> >>>>>>>>>>>> stateful at that level for expensive
> >>>>>>>>>>>> connections/operations/flow-state handling.
> >>>>>>>>>>>>
> >>>>>>>>>>>> As you mentioned with the million example, this sequence
> >>>>>>>>>>>> should happen for each single instance, so 1M times in your
> >>>>>>>>>>>> example.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Now, why did I mention SDF several times? Because SDF is a
> >>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it
> >>>>>>>>>>>> creates far more instances and requires a much stricter and
> >>>>>>>>>>>> more explicit definition of the exact lifecycle and of which
> >>>>>>>>>>>> instance does what. Since Beam handles the full lifecycle of
> >>>>>>>>>>>> the bean instances, it must provide init/destroy hooks
> >>>>>>>>>>>> (setup/teardown) which can be stateful.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Take the JDBC example mentioned earlier. Today, because of
> >>>>>>>>>>>> the teardown issue, it uses bundles. Since bundle size is
> >>>>>>>>>>>> not defined - and will not be with SDF - it must use a pool
> >>>>>>>>>>>> to be able to reuse a connection instance without killing
> >>>>>>>>>>>> performance. Now, with SDF and the increase in splits, how
> >>>>>>>>>>>> do you handle the pool size? Generally in batch you use a
> >>>>>>>>>>>> single connection per thread to avoid consuming all database
> >>>>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool
> >>>>>>>>>>>> of 1; 2. use a slightly bigger pool, but multiplied by the
> >>>>>>>>>>>> number of beans you will likely 2x or 3x the connection
> >>>>>>>>>>>> count and make the execution fail with "no more connection
> >>>>>>>>>>>> available". If you picked 1 (a pool of 1), then you still
> >>>>>>>>>>>> need a reliable teardown per pool instance (close()
> >>>>>>>>>>>> generally) to ensure you release the pool and don't leak the
> >>>>>>>>>>>> connection information in the JVM. In all cases you come
> >>>>>>>>>>>> back to the init()/destroy() lifecycle, even if you fake it
> >>>>>>>>>>>> by getting connections per bundle.
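The "if (iCreatedThem) releaseThem()" pattern discussed above can be sketched in plain Java. This is a toy model with an invented Connection type, not JDBC or Beam code; it just shows why a skipped teardown means a leaked connection for the lifetime of the JVM.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PooledFnSketch {
    // Hypothetical connection type standing in for a JDBC connection;
    // the counter tracks how many are currently open.
    static class Connection {
        static final AtomicInteger open = new AtomicInteger();
        Connection() { open.incrementAndGet(); }
        void close() { open.decrementAndGet(); }
    }

    private Connection connection; // lazily created, at most one per instance

    // Called from startBundle/processElement: the connection is reused
    // across bundles instead of being reopened for each one.
    Connection connection() {
        if (connection == null) {
            connection = new Connection();
        }
        return connection;
    }

    // The @Teardown counterpart: "if (iCreatedThem) releaseThem()".
    // If this is never invoked, the connection leaks until the JVM dies.
    void teardown() {
        if (connection != null) {
            connection.close();
            connection = null;
        }
    }
}
```

The design point: holding the connection at instance level is only safe if the runtime commits to pairing teardown with setup; otherwise the user is pushed back to per-bundle open/close.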
> >>>>>>>>>>>>
> >>>>>>>>>>>> Just to make it obvious: I mention SDF only because SDF
> >>>>>>>>>>>> amplifies all the current issues with the loose definition
> >>>>>>>>>>>> of the bean lifecycle to an exponential level, nothing else.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> Romain Manni-Bucau
> >>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
> >>>>>>>>>>>>
> >>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can
> be
> >>>>>>>>>>>>> accomplished using the Wait transform as I suggested in the
> thread above,
> >>>>>>>>>>>>> and I believe it should become the canonical way to do that.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> (Would like to reiterate one more time, as the main author of
> >>>>>>>>>>>>> most design documents related to SDF and of its
> implementation in the Java
> >>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to
> the topic of
> >>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
> >>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too.
> >>>>>>>>>>>>>> My understanding is that SDF could be a way to unify it
> >>>>>>>>>>>>>> and clean up the API.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Otherwise, how do we normalize - with a single API - the
> >>>>>>>>>>>>>> lifecycle of transforms?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <
> bchambers@apache.org>
> >>>>>>>>>>>>>> a écrit :
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific
> DoFn's
> >>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary, it
> is around an
> >>>>>>>>>>>>>>> entire composite PTransform. I think there have been
> discussions/proposals
> >>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those
> haven't been
> >>>>>>>>>>>>>>> implemented, to the best of my knowledge.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
> >>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
> >>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
> >>>>>>>>>>>>>>> bulk copy to put them in the final destination.
> >>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
> >>>>>>>>>>>>>>> seeing partial/incomplete results in the final
> destination).
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
> >>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
> >>>>>>>>>>>>>>> The move step should only happen once, so on one worker.
> This
> >>>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff
> done to ensure it
> >>>>>>>>>>>>>>> runs on one worker.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
> >>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some
> cleanup work for
> >>>>>>>>>>>>>>> when the transform is "done". In batch this is relatively
> straightforward,
> >>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems,
> such as BigQuery
> >>>>>>>>>>>>>>> sink leaving files around that have failed to import into
> BigQuery.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
> >>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait
> until the end of
> >>>>>>>>>>>>>>> the window? In practice, you just want to wait until you
> know nobody will
> >>>>>>>>>>>>>>> need the resource anymore.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where
> >>>>>>>>>>>>>>> you could have a transform that output resource objects.
> Each resource
> >>>>>>>>>>>>>>> object would have logic for cleaning it up. And there
> would be something
> >>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that
> resource, and what
> >>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as
> that part of the
> >>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer
> need the resources,
> >>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline
> shutdown, or
> >>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Would something like this be a better fit for your use
> case?
> >>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn
> sufficient?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the
> >>>>>>>>>>>>>>>> overall execution. Each instance - one fn, so likely in
> >>>>>>>>>>>>>>>> a thread of a worker - has its lifecycle. Caricaturally:
> >>>>>>>>>>>>>>>> "new" and garbage collection.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
> >>>>>>>>>>>>>>>> (deserialization), but that doesn't matter here.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> What I want is for any "new" to be followed by setup
> >>>>>>>>>>>>>>>> before any processElement or startBundle, and, the last
> >>>>>>>>>>>>>>>> time Beam has the instance before it is GC-ed and after
> >>>>>>>>>>>>>>>> the last finishBundle, for it to call teardown.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> It is as simple as that. This way there is no need to
> >>>>>>>>>>>>>>>> combine fns in a way that makes a fn not self-contained
> >>>>>>>>>>>>>>>> to implement basic transforms.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
> >>>>>>>>>>>>>>>> écrit :
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
> >>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
> >>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track.
> >>>>>>>>>>>>>>>>>> Rather than focusing on the semantics of the existing
> >>>>>>>>>>>>>>>>>> methods -- which have been noted to meet many existing
> >>>>>>>>>>>>>>>>>> use cases -- it would be helpful to focus more on the
> >>>>>>>>>>>>>>>>>> reason you are looking for something with different
> >>>>>>>>>>>>>>>>>> semantics.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are
> trying
> >>>>>>>>>>>>>>>>>> to do):
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
> >>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If
> this is the case,
> >>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized
> once (and not once per
> >>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you
> know when the pipeline
> >>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches
> step X", then what
> >>>>>>>>>>>>>>>>>> about a streaming pipeline?
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> When the DoFn is no longer needed logically, i.e. when
> >>>>>>>>>>>>>>>>>> the batch is done or the stream is stopped (manually
> >>>>>>>>>>>>>>>>>> or by a JVM shutdown).
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> I'm really not following what this means.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and
> each
> >>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of
> the same DoFn). How
> >>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M
> cleanups) and when
> >>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut
> down? When an
> >>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be
> temporary - may be
> >>>>>>>>>>>>>>>>> about to start back up)? Something else?
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
> >>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle
> methods are not a good fit
> >>>>>>>>>>>>>>>>>> for this (they are focused on managing resources within
> the DoFn), you could
> >>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it
> produced. For instance:
> >>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
> >>>>>>>>>>>>>>>>>> stores information about resources)
> >>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
> >>>>>>>>>>>>>>>>>> from changing resource IDs)
> >>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
> >>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
> >>>>>>>>>>>>>>>>>> eventually output the fact they're done
> >>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
> >>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
> >>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use
> or have been finished
> >>>>>>>>>>>>>>>>>> by using the require deterministic input. This is
> important to ensuring
> >>>>>>>>>>>>>>>>>> everything is actually cleaned up.
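Steps (a)-(f) can be sketched as a plain-Java data flow. This is a toy model with invented names; a real pipeline would use PTransforms with "require deterministic input" barriers and an external service rather than an in-process registry.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class ResourceTokenSketch {
    // Hypothetical registry of live resources; a real pipeline would talk
    // to an external service (VMs, temp files, ...) instead.
    static final Set<String> LIVE = new HashSet<>();

    static String allocate(String id) { LIVE.add(id); return id; }
    static String use(String id) { return id + ":done"; }
    static void free(String doneToken) { LIVE.remove(doneToken.split(":")[0]); }

    // Models steps (a)-(f): resource IDs flow as data, so the "free" stage
    // only sees tokens for work that actually completed, and cleanup is
    // checkpointable rather than tied to any single DoFn's @Teardown.
    static Set<String> run(List<String> ids) {
        ids.stream()
           .map(ResourceTokenSketch::allocate)   // (c) initialize resources
           .map(ResourceTokenSketch::use)        // (d) segment using them
           .forEach(ResourceTokenSketch::free);  // (f) cleanup stage
        return LIVE;
    }
}
```

The point of the design: because the cleanup stage is driven by the data itself, a retry or restart replays the same tokens, instead of depending on a lifecycle callback that a dead worker can never run.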
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
> >>>>>>>>>>>>>>>>>> industrialize some APIs on top of Beam.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
> >>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying
> to accomplish? That
> >>>>>>>>>>>>>>>>>> would help me understand both the problems with
> existing options and
> >>>>>>>>>>>>>>>>>> possibly what could be done to help.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all
> >>>>>>>>>>>>>>>>>> cases, but that means each transform is different in
> >>>>>>>>>>>>>>>>>> its lifecycle handling. I dislike that a lot at scale
> >>>>>>>>>>>>>>>>>> and as a user, since you can't put any unified
> >>>>>>>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard
> >>>>>>>>>>>>>>>>>> to integrate or to use to build higher-level libraries
> >>>>>>>>>>>>>>>>>> or software.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
> >>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> -- Ben
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of
> the
> >>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine
> machine.
> >>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
> >>>>>>>>>>>>>>>>>>>> except in various situations where it's logically
> impossible or impractical
> >>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you
> can list some of the
> >>>>>>>>>>>>>>>>>>>> examples above.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Sounds ok to me
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
> >>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be
> called - it's not just
> >>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very
> important (e.g. cleaning up
> >>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a
> large number of VMs you
> >>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the
> other methods that
> >>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a
> cost, e.g. no
> >>>>>>>>>>>>>>>>>>>> pass-by-reference).
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so
> >>>>>>>>>>>>>>>>>>> I'm not sure which other method you mean. Concretely,
> >>>>>>>>>>>>>>>>>>> if you make it really unreliable - this is what "best
> >>>>>>>>>>>>>>>>>>> effort" sounds like to me - then users can't use it
> >>>>>>>>>>>>>>>>>>> to clean anything; but if you make it "can fail to
> >>>>>>>>>>>>>>>>>>> happen, but that is unexpected and means something
> >>>>>>>>>>>>>>>>>>> went wrong", then it is fine to have a manual - or
> >>>>>>>>>>>>>>>>>>> automatic, if fancy - recovery procedure. This is
> >>>>>>>>>>>>>>>>>>> where it makes all the difference and impacts the
> >>>>>>>>>>>>>>>>>>> developers and ops (all users, basically).
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It
> >>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is what
> triggered this thread.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents
> >>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be very
> badly and wrongly
> >>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn |
> >>>>>>>>>>>>>>>>>>>>> Book
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it:
> >>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic
> crash), and in a number of
> >>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container
> has crashed (eg user code
> >>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over JNI
> and it segfaulted), JVM
> >>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the worker
> has lost network
> >>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be
> able to do anything
> >>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible
> VM and it was preempted by
> >>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or if
> the worker was too busy
> >>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown
> functions) until the preemption
> >>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware
> simply failed (which
> >>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other
> conditions.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
> >>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for
> cases where you observed a
> >>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it
> was possible to call it but
> >>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
> >>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
> >>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain
> Manni-Bucau
> >>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need
> (e.g.
> >>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it
> requires the following
> >>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic
> and the following processing
> >>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
> >>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer
> since size is not controlled.
> >>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesnt allow you to release the
> connection since it is a best
> >>>>>>>>>>>>>>>>>>>>>>>>>> effort thing. Not releasing the connection
> makes you pay a lot - aws ;) - or
> >>>>>>>>>>>>>>>>>>>>>>>>>> prevents you to launch other processings -
> concurrent limit.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
> >>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not called
> then nothing else can be
> >>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS
> service are you thinking of
> >>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything
> at the other end has died?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but
> >>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require closing
> >>>>>>>>>>>>>>>>>>>>>>>>> exchanges which are not only "I'm leaving".
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services
> >>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly at pipeline startup and
> >>>>>>>>>>>>>>>>>>>>>>>>> closing them at the end. If teardown is not called,
> >>>>>>>>>>>>>>>>>>>>>>>>> you leak machines and money. You can say it can be
> >>>>>>>>>>>>>>>>>>>>>>>>> done another way... as can the full pipeline ;).
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle
> >>>>>>>>>>>>>>>>>>>>>>>>> its components' lifecycle, it can't be used at
> >>>>>>>>>>>>>>>>>>>>>>>>> scale for generic pipelines and is bound to some
> >>>>>>>>>>>>>>>>>>>>>>>>> particular IOs.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
> >>>>>>>>>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by
> >>>>>>>>>>>>>>>>>>>>>>>>> any human system? Nothing, technically. Why do you
> >>>>>>>>>>>>>>>>>>>>>>>>> push to not handle it? Is it due to some legacy
> >>>>>>>>>>>>>>>>>>>>>>>>> code on Dataflow or something else?
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
> >>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what kind
> of change you're asking
> >>>>>>>>>>>>>>>>>>>>>>>> for.
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
> >>>>>>>>>>>>>>>>>>>>>>> called, then it is a bug and we are done :).
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for users? The direct
> >>>>>>>>>>>>>>>>>>>>>>>>> runner does it, so if a user uses the RI in tests,
> >>>>>>>>>>>>>>>>>>>>>>>>> will he get a different behavior in prod? Also,
> >>>>>>>>>>>>>>>>>>>>>>>>> don't forget the user doesn't know what the IOs he
> >>>>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impactful for the whole
> >>>>>>>>>>>>>>>>>>>>>>>>> product that it must be handled, IMHO.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in
> big
> >>>>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore what
> people did for years and do
> >>>>>>>>>>>>>>>>>>>>>>>>> it wrong before doing right ;).
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent
> >>>>>>>>>>>>>>>>>>>>>>>>> guaranteeing - under normal IT conditions - the
> >>>>>>>>>>>>>>>>>>>>>>>>> execution of teardown. Then we see if we can handle
> >>>>>>>>>>>>>>>>>>>>>>>>> it, and only if there is a technical reason we
> >>>>>>>>>>>>>>>>>>>>>>>>> can't, we make it experimental/unsupported in the
> >>>>>>>>>>>>>>>>>>>>>>>>> API. I know Spark and Flink can; any unknown
> >>>>>>>>>>>>>>>>>>>>>>>>> blocker for other runners?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
> >>>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the
> >>>>>>>>>>>>>>>>>>>>>>>>> software enclosing Beam) is fully unhandled and
> >>>>>>>>>>>>>>>>>>>>>>>>> your overall system is uncontrolled. The only case
> >>>>>>>>>>>>>>>>>>>>>>>>> where this is not true is when the software is
> >>>>>>>>>>>>>>>>>>>>>>>>> always owned by a vendor and never installed on a
> >>>>>>>>>>>>>>>>>>>>>>>>> customer environment. In that case it belongs to
> >>>>>>>>>>>>>>>>>>>>>>>>> the vendor to handle the Beam API, and not to Beam
> >>>>>>>>>>>>>>>>>>>>>>>>> to adjust its API for a vendor - otherwise all
> >>>>>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be made
> >>>>>>>>>>>>>>>>>>>>>>>>> optional, right?
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in
> >>>>>>>>>>>>>>>>>>>>>>>>> distributed systems, so it is key to have an
> >>>>>>>>>>>>>>>>>>>>>>>>> explicit, defined lifecycle.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Kenn
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>
> >>
> >>
> >
>

Re: @TearDown guarantees

Posted by Ismaël Mejía <ie...@gmail.com>.
I also had a different understanding of the lifecycle of a DoFn.

My understanding of the use case for every method in the DoFn was clear and
perfectly aligned with Thomas's explanation, but what I understood was that, in
general terms, ‘@Setup is where I get resources/prepare connections and
@Teardown is where I free them’, so calling Teardown seemed essential to have a
complete lifecycle:
Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown

The fact that @Teardown might not be called is a new detail for me too, and I
also find it weird to have a method that may or may not be called as part of an
API. Why would users implement teardown if it will not be called? In that case,
probably a cleaner approach would be to get rid of the method altogether, no?

But maybe that's not so easy either; there was another point: a user
reported an issue with leaking resources when using KafkaIO in the Spark
runner, for ref.
https://apachebeam.slack.com/archives/C1AAFJYMP/p1510596938000622

At that moment my understanding was that something was fishy, because we
should be calling Teardown to correctly close the connections and free the
resources in case of exceptions on start/process/finish, so I filed a JIRA and
fixed this by enforcing the call of teardown in the Spark runner and the Flink
runner:
https://issues.apache.org/jira/browse/BEAM-3187
https://issues.apache.org/jira/browse/BEAM-3244

As you can see, not calling this method does have consequences, at least for
non-containerized runners. Of course a runner that uses containers might not
care about cleaning up resources this way, but a long-living JVM in a Hadoop
environment probably won't have the same luck. So I am not sure that having
loose semantics there is the right option. I mean, runners could simply
guarantee that they call teardown, and if teardown takes too long they can
decide to send a signal or kill the process/container/etc. and move on. That
way at least users would have a motivation to implement the teardown method;
otherwise it doesn't make any sense to have it (API-wise).
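The lifecycle sequence above (Setup → StartBundle* → ProcessElement* → FinishBundle* → Teardown) can be sketched in plain Java. This is a hand-rolled harness, not the Beam SDK: the class names `MockDoFn` and `LifecycleDemo` are illustrative assumptions, and the `run` method stands in for what a runner would do when it honors the full contract, including calling teardown in a finally block.

```java
import java.util.ArrayList;
import java.util.List;

// Minimal stand-in for a DoFn, recording the order of lifecycle calls.
// Plain-Java sketch of the expected sequence, not Beam SDK code.
class MockDoFn {
    final List<String> calls = new ArrayList<>();
    void setup()          { calls.add("setup"); }
    void startBundle()    { calls.add("startBundle"); }
    void processElement() { calls.add("processElement"); }
    void finishBundle()   { calls.add("finishBundle"); }
    void teardown()       { calls.add("teardown"); }
}

public class LifecycleDemo {
    // Drives one DoFn instance through N bundles of M elements each,
    // mirroring: Setup -> (StartBundle -> ProcessElement* -> FinishBundle)* -> Teardown
    static List<String> run(MockDoFn fn, int bundles, int elementsPerBundle) {
        fn.setup();
        try {
            for (int b = 0; b < bundles; b++) {
                fn.startBundle();
                for (int e = 0; e < elementsPerBundle; e++) {
                    fn.processElement();
                }
                fn.finishBundle();
            }
        } finally {
            // The guarantee under discussion: invoked once per instance
            // unless the JVM itself dies.
            fn.teardown();
        }
        return fn.calls;
    }

    public static void main(String[] args) {
        System.out.println(run(new MockDoFn(), 2, 1));
    }
}
```

Running it with 2 bundles of 1 element shows a single setup/teardown pair bracketing both bundles, which is the "complete lifecycle" reading of the contract discussed below.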

On Mon, Feb 19, 2018 at 11:30 PM, Eugene Kirpichov <ki...@google.com> wrote:
> Romain, would it be fair to say that currently the goal of your
> participation in this discussion is to identify situations where @Teardown
> in principle could have been called, but some of the current runners don't
> make a good enough effort to call it? If yes - as I said before, please, by
> all means, file bugs of the form "Runner X doesn't call @Teardown in
> situation Y" if you're aware of any, and feel free to send PRs fixing runner
> X to reliably call @Teardown in situation Y. I think we all agree that this
> would be a good improvement.
>
> On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>>
>>
>>
>> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>>
>>
>>
>> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau
>> <rm...@gmail.com> wrote:
>>>
>>>
>>>
>>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>>>
>>> How do you call teardown? There are cases in which the Java code gets no
>>> indication that the restart is happening (e.g. cases where the machine
>>> itself is taken down)
>>>
>>>
>>> This is a bug; zero-downtime maintenance is very doable in 2018 ;). Crashes
>>> are bugs, and kill -9 to shut down is a bug too. In other cases, let's call
>>> shutdown with a hook, worst case.
>>
>>
>> What you say here is simply not true.
>>
>> There are many scenarios in which workers shutdown with no opportunity for
>> any sort of shutdown hook. Sometimes the entire machine gets shutdown, and
>> not even the OS will have much of a chance to do anything. At scale this
>> will happen with some regularity, and a distributed system that assumes this
>> will not happen is a poor distributed system.
>>
>>
>> This is part of the infra, and there is no reason the machine is shut down
>> without first shutting down what runs on it, except if there is a bug in the
>> software or the setup. I hear that maybe you don't do it everywhere, but
>> there is no blocker to doing it. It means you can shut down the machines and
>> guarantee teardown is called.
>>
>> Where I'm going is simply that it is doable, and the Beam SDK core can
>> assume setup is done well. If there is a best-effort downside due to that -
>> with the meaning you defined - it is an implementation bug or a user
>> installation issue.
>>
>> Technically, all of this is true.
>>
>> What can prevent teardown is a hardware failure or the like. This is fine
>> and doesn't need to be in the doc, since it is life in IT and obvious - or
>> it must be made very explicit to avoid the current ambiguity.
>>
>>
>>>
>>>
>>>
>>>
>>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>>
>>>> Restarting doesn't mean you don't call teardown. Except for a bug, there
>>>> is - technically - no reason it happens.
>>>>
>>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>>>>>
>>>>> Workers restarting is not a bug, it's standard often expected.
>>>>>
>>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau
>>>>> <rm...@gmail.com> wrote:
>>>>>>
>>>>>> Nothing; as mentioned, it is a bug, so recovery is a bug-recovery
>>>>>> (procedure)
>>>>>>
>>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>>>>>> écrit :
>>>>>>>
>>>>>>> So what would you like to happen if there is a crash? The DoFn
>>>>>>> instance no longer exists because the JVM it ran on no longer exists. What
>>>>>>> should Teardown be called on?
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau
>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>
>>>>>>>> This is what I want, and not 999999 teardowns for 1000000 setups
>>>>>>>> until there is an unexpected crash (= a bug).
>>>>>>>>
>>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau
>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau
>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256, but it leads
>>>>>>>>>>>> to the same pattern; the teardown is just a "if (iCreatedThem)
>>>>>>>>>>>> releaseThem();"
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>>>>>> worker.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Nope, it was the other way around: in this case on AWS you can get
>>>>>>>>>> 256 instances at once but not 512 (which would be 2x256). So when
>>>>>>>>>> you compute the distribution, you allocate to some fn the role of
>>>>>>>>>> owning the instance lookup and release.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I still don't understand. Let's be more precise. If you write the
>>>>>>>>> following code:
>>>>>>>>>
>>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>>>>>
>>>>>>>>> There is no way to control how many instances of MyDoFn are
>>>>>>>>> created. The runner might decided to create a million instances of this
>>>>>>>>> class across your worker pool, which means that you will get a million Setup
>>>>>>>>> and Teardown calls.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Anyway, this was just an example of an external resource you must
>>>>>>>>>> release. The real topic is that Beam should define ASAP a
>>>>>>>>>> guaranteed, generic lifecycle to let users embrace its programming
>>>>>>>>>> model.
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> @Eugene:
>>>>>>>>>>>> 1. wait logic is about passing the value which is not always
>>>>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>>>>>>>>>>> core). This API defines a *container* API and therefore implies bean
>>>>>>>>>>>> lifecycles. I'll not detail them all but just use the sources and dofn (not
>>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>>>>>>>>>>
>>>>>>>>>>>> A. Source
>>>>>>>>>>>>
>>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>>>>>>>>>> estimateSize and split. As an user you can expect both to be called on the
>>>>>>>>>>>> same bean instance to avoid to pay the same connection cost(s) twice.
>>>>>>>>>>>> Concretely:
>>>>>>>>>>>>
>>>>>>>>>>>> connect()
>>>>>>>>>>>> try {
>>>>>>>>>>>>   estimateSize()
>>>>>>>>>>>>   split()
>>>>>>>>>>>> } finally {
>>>>>>>>>>>>   disconnect()
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>>>>>
>>>>>>>>>>>> connect()
>>>>>>>>>>>> try {
>>>>>>>>>>>>   estimateSize()
>>>>>>>>>>>> } finally {
>>>>>>>>>>>>   disconnect()
>>>>>>>>>>>> }
>>>>>>>>>>>> connect()
>>>>>>>>>>>> try {
>>>>>>>>>>>>   split()
>>>>>>>>>>>> } finally {
>>>>>>>>>>>>   disconnect()
>>>>>>>>>>>> }
>>>>>>>>>>>>
>>>>>>>>>>>> Plus a workaround with an internal estimated size, since this
>>>>>>>>>>>> primitive is often called in split but you don't want to connect
>>>>>>>>>>>> twice in the second phase.
>>>>>>>>>>>>
>>>>>>>>>>>> Why do you need that? Simply because you want to define an API
>>>>>>>>>>>> for implementing sources which initializes the source bean and
>>>>>>>>>>>> destroys it. I insist this is a very basic concern for such an
>>>>>>>>>>>> API. However Beam doesn't embrace it and doesn't assume it, so
>>>>>>>>>>>> building any API on top of Beam is very painful today, and direct
>>>>>>>>>>>> Beam users hit the exact same issues - check how IOs are
>>>>>>>>>>>> implemented: the static utilities create volatile connections,
>>>>>>>>>>>> preventing reuse of an existing connection within a single method
>>>>>>>>>>>> (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>>>>>>
>>>>>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>>>>>
>>>>>>>>>>>> B. DoFn & SDF
>>>>>>>>>>>>
>>>>>>>>>>>> As a fn dev you expect the same from the beam runtime: init();
>>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } and that it is
>>>>>>>>>>>> executed on the exact same instance to be able to be stateful at that level
>>>>>>>>>>>> for expensive connections/operations/flow state handling.
>>>>>>>>>>>>
>>>>>>>>>>>> As you mentionned with the million example, this sequence should
>>>>>>>>>>>> happen for each single instance so 1M times for your example.
>>>>>>>>>>>>
>>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>>>>>> more instances and requires to have a way more strict/explicit definition of
>>>>>>>>>>>> the exact lifecycle and which instance does what. Since beam handles the
>>>>>>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>>>>>
>>>>>>>>>>>> If you take the JDBC example which was mentioned earlier: today,
>>>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle size
>>>>>>>>>>>> is not defined - and will not be with SDF - it must use a pool to
>>>>>>>>>>>> be able to reuse a connection instance so as not to hurt
>>>>>>>>>>>> performance. Now, with SDF and the increase in splits, how do you
>>>>>>>>>>>> handle the pool size? Generally in batch you use a single
>>>>>>>>>>>> connection per thread to avoid consuming all database
>>>>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1;
>>>>>>>>>>>> 2. use a somewhat bigger pool, but, multiplied by the number of
>>>>>>>>>>>> beans, you will likely 2x or 3x the connection count and make the
>>>>>>>>>>>> execution fail with "no more connections available". If you
>>>>>>>>>>>> picked 1 (a pool of 1), then you still need a reliable teardown
>>>>>>>>>>>> per pool instance (close() generally) to ensure you release the
>>>>>>>>>>>> pool and don't leak the connection information in the JVM. In all
>>>>>>>>>>>> cases you come back to the init()/destroy() lifecycle, even if
>>>>>>>>>>>> you fake getting connections with bundles.
>>>>>>>>>>>>
>>>>>>>>>>>> Just to make it obvious: SDF mentions are just cause SDF imply
>>>>>>>>>>>> all the current issues with the loose definition of the bean lifecycles at
>>>>>>>>>>>> an exponential level, nothing else.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn | Book
>>>>>>>>>>>>
>>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov
>>>>>>>>>>>> <ki...@google.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (Would like to reiterate one more time, as the main author of
>>>>>>>>>>>>> most design documents related to SDF and of its implementation in the Java
>>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau
>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>>>>>>>>>>>> understanding is that SDF could be a way to unify it and clean
>>>>>>>>>>>>>> up the API.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Otherwise, how do we normalize - with a single API - the
>>>>>>>>>>>>>> lifecycle of transforms?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's
>>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary, it is around an
>>>>>>>>>>>>>>> entire composite PTransform. I think there have been discussions/proposals
>>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those haven't been
>>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
>>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some cleanup work for
>>>>>>>>>>>>>>> when the transform is "done". In batch this is relatively straightforward,
>>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems, such as BigQuery
>>>>>>>>>>>>>>> sink leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
>>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait until the end of
>>>>>>>>>>>>>>> the window? In practice, you just want to wait until you know nobody will
>>>>>>>>>>>>>>> need the resource anymore.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where
>>>>>>>>>>>>>>> you could have a transform that output resource objects. Each resource
>>>>>>>>>>>>>>> object would have logic for cleaning it up. And there would be something
>>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that resource, and what
>>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as that part of the
>>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer need the resources,
>>>>>>>>>>>>>>> they would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Would something like this be a better fit for your use case?
>>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau
>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain, simplifying the overall
>>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a
>>>>>>>>>>>>>>>> worker - has its lifecycle. Caricaturally: "new" and garbage
>>>>>>>>>>>>>>>> collection.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocate
>>>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What I want is for any "new" to be followed by setup before
>>>>>>>>>>>>>>>> any process or startBundle, and, the last time Beam has the
>>>>>>>>>>>>>>>> instance before it is GC-ed and after the last finishBundle,
>>>>>>>>>>>>>>>> for it to call teardown.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It is as simple as that. This way there is no need to combine
>>>>>>>>>>>>>>>> fns in a way that makes a fn not self-contained in order to
>>>>>>>>>>>>>>>> implement basic transforms.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>>>>>> écrit :
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau
>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers"
>>>>>>>>>>>>>>>>>> <bc...@apache.org> a écrit :
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather
>>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing methods -
>>>>>>>>>>>>>>>>>> which have been noted to meet many existing use cases - it
>>>>>>>>>>>>>>>>>> would be helpful to focus more on the reason you are
>>>>>>>>>>>>>>>>>> looking for something with different semantics.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying
>>>>>>>>>>>>>>>>>> to do):
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the
>>>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a
>>>>>>>>>>>>>>>>>> JVM shutdown)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle methods are not a good fit
>>>>>>>>>>>>>>>>>> for this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
>>>>>>>>>>>>>>>>>> from changing resource IDs)
>>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been finished
>>>>>>>>>>>>>>>>>> by using the require deterministic input. This is important to ensuring
>>>>>>>>>>>>>>>>>> everything is actually cleaned up.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
>>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying to accomplish? That
>>>>>>>>>>>>>>>>>> would help me understand both the problems with existing options and
>>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases,
>>>>>>>>>>>>>>>>>> but it means each transform is different in its lifecycle
>>>>>>>>>>>>>>>>>> handling. I dislike that a lot at scale and as a user,
>>>>>>>>>>>>>>>>>> since you can't put any unified practice on top of Beam; it
>>>>>>>>>>>>>>>>>> also makes Beam very hard to integrate, or to use to build
>>>>>>>>>>>>>>>>>> higher-level libraries or software.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround discussions
>>>>>>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau
>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov
>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>>>>>>>>>>>>>>>>>>> except in various situations where it's logically impossible or impractical
>>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm
>>>>>>>>>>>>>>>>>>> not sure which other method you mean. Concretely, if you make it really
>>>>>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users
>>>>>>>>>>>>>>>>>>> can't use it to clean anything, but if you make it "can happen but it is
>>>>>>>>>>>>>>>>>>> unexpected and means something happened" then it is fine to have a manual
>>>>>>>>>>>>>>>>>>> - or automatic if fancy - recovery procedure. This is where it makes all
>>>>>>>>>>>>>>>>>>> the difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Agree, Eugene, except that "best effort" doesn't only
>>>>>>>>>>>>>>>>>>>>> mean that. It is also often used to say "at will", and this is what
>>>>>>>>>>>>>>>>>>>>> triggered this thread.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents
>>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be very badly and wrongly
>>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>> @rmannibucau |  Blog | Old Blog | Github | LinkedIn |
>>>>>>>>>>>>>>>>>>>>> Book
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov
>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it:
>>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user code
>>>>>>>>>>>>>>>>>>>>>> in a different thread called a C library over JNI and it segfaulted), JVM
>>>>>>>>>>>>>>>>>>>>>> bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted by
>>>>>>>>>>>>>>>>>>>>>> the underlying cluster manager without notice or if the worker was too busy
>>>>>>>>>>>>>>>>>>>>>> with other stuff (eg calling other Teardown functions) until the preemption
>>>>>>>>>>>>>>>>>>>>>> timeout elapsed, in case the underlying hardware simply failed (which
>>>>>>>>>>>>>>>>>>>>>> happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
>>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it but
>>>>>>>>>>>>>>>>>>>>>> the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov
>>>>>>>>>>>>>>>>>>>>>>> <ki...@google.com>:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles"
>>>>>>>>>>>>>>>>>>>>>>>>> <kl...@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since size is not controlled.
>>>>>>>>>>>>>>>>>>>>>>>>>> Using teardown doesnt allow you to release the connection since it is a best
>>>>>>>>>>>>>>>>>>>>>>>>>> effort thing. Not releasing the connection makes you pay a lot - aws ;) - or
>>>>>>>>>>>>>>>>>>>>>>>>>> prevents you to launch other processings - concurrent limit.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not called then nothing else can be
>>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS service are you thinking of
>>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but
>>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require closing exchanges which are not
>>>>>>>>>>>>>>>>>>>>>>>>> only "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services
>>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly at pipeline startup and closing them at the end.
>>>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines and money. You can say it can
>>>>>>>>>>>>>>>>>>>>>>>>> be done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle
>>>>>>>>>>>>>>>>>>>>>>>>> its components' lifecycle it can't be used at scale for generic pipelines
>>>>>>>>>>>>>>>>>>>>>>>>> and is bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system?
>>>>>>>>>>>>>>>>>>>>>>>>> Nothing, technically. So why push to not handle it? Is it due to some
>>>>>>>>>>>>>>>>>>>>>>>>> legacy code in Dataflow, or something else?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
>>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what kind of change you're asking
>>>>>>>>>>>>>>>>>>>>>>>> for.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>>>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The direct
>>>>>>>>>>>>>>>>>>>>>>>>> runner does it, so if a user uses the RI in tests, will he get a different
>>>>>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs he
>>>>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impactful for the whole product that it must
>>>>>>>>>>>>>>>>>>>>>>>>> be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the
>>>>>>>>>>>>>>>>>>>>>>>>> big data world, but that is not a reason to ignore what people did for
>>>>>>>>>>>>>>>>>>>>>>>>> years and to do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent - under
>>>>>>>>>>>>>>>>>>>>>>>>> normal IT conditions - guaranteeing the execution of teardown. Then we see
>>>>>>>>>>>>>>>>>>>>>>>>> if we can handle it, and only if there is a technical reason we can't do
>>>>>>>>>>>>>>>>>>>>>>>>> we make it experimental/unsupported in the API. I know Spark and Flink
>>>>>>>>>>>>>>>>>>>>>>>>> can; any unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing Beam)
>>>>>>>>>>>>>>>>>>>>>>>>> is fully unhandled and your overall system is uncontrolled. The only case
>>>>>>>>>>>>>>>>>>>>>>>>> where this is not true is when the software is always owned by a vendor
>>>>>>>>>>>>>>>>>>>>>>>>> and never installed on a customer environment. In that case it belongs to
>>>>>>>>>>>>>>>>>>>>>>>>> the vendor to handle the Beam API, and not to Beam to adjust its API for
>>>>>>>>>>>>>>>>>>>>>>>>> a vendor - otherwise all features unsupported by one runner should be
>>>>>>>>>>>>>>>>>>>>>>>>> made optional, right?
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in
>>>>>>>>>>>>>>>>>>>>>>>>> distributed systems, so it is key to have an explicit and well-defined
>>>>>>>>>>>>>>>>>>>>>>>>> lifecycle.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>
>>
>>
>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
Romain, would it be fair to say that currently the goal of your
participation in this discussion is to identify situations where @Teardown
in principle could have been called, but some of the current runners don't
make a good enough effort to call it? If yes - as I said before, please, by
all means, file bugs of the form "Runner X doesn't call @Teardown in
situation Y" if you're aware of any, and feel free to send PRs fixing
runner X to reliably call @Teardown in situation Y. I think we all agree
that this would be a good improvement.

On Mon, Feb 19, 2018 at 2:03 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
>
> Le 19 févr. 2018 22:56, "Reuven Lax" <re...@google.com> a écrit :
>
>
>
> On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau <rmannibucau@gmail.com
> > wrote:
>
>>
>>
>> Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :
>>
>> How do you call teardown? There are cases in which the Java code gets no
>> indication that the restart is happening (e.g. cases where the machine
>> itself is taken down)
>>
>>
>> This is a bug; zero-downtime maintenance is very doable in 2018 ;).
>> Crashes are bugs, and kill -9 to shut down is a bug too. In other cases,
>> worst case, call shutdown via a hook.
>>
>
> What you say here is simply not true.
>
> There are many scenarios in which workers shutdown with no opportunity for
> any sort of shutdown hook. Sometimes the entire machine gets shutdown, and
> not even the OS will have much of a chance to do anything. At scale this
> will happen with some regularity, and a distributed system that assumes
> this will not happen is a poor distributed system.
>
>
> This is part of the infra, and there is no reason a machine is shut down
> without first shutting down what runs on it, except if it is a bug in the
> software or setup. I hear that maybe you don't do it everywhere, but there
> is no blocker to doing it. It means you can shut down the machines and
> guarantee teardown is called.
>
> My point is simply that it is doable, and the Beam SDK core can assume
> setup is done properly. If there is a best-effort downside due to that -
> with the meaning you defined - it is an implementation bug or a user
> installation issue.
>
> Technically all is true.
>
> What can prevent teardown is a hardware failure or the like. This is fine
> and doesn't need to be in the doc, since it is life in IT and obvious - or
> it must be made very explicit to avoid the current ambiguity.
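To illustrate the "shutdown with a hook" point above, a plain-JDK sketch (nothing Beam-specific; the `released` flag is a hypothetical stand-in for releasing a connection). Shutdown hooks run on normal exit and on SIGTERM (a plain kill), but not on kill -9 or a hardware failure, which is exactly the line being drawn in this thread:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class ShutdownHookSketch {
    // Stand-in for "was teardown reached"; a real fn would release a connection here.
    static final AtomicBoolean released = new AtomicBoolean(false);

    public static void main(String[] args) {
        Thread hook = new Thread(() -> released.set(true));
        // Hooks run on normal exit and on SIGTERM ("kill"), but NOT on
        // "kill -9" (SIGKILL) or a machine losing power.
        Runtime.getRuntime().addShutdownHook(hook);

        // ... pipeline work would happen here ...

        // In an orderly shutdown we can even run the cleanup eagerly and
        // deregister the hook so it does not fire twice:
        if (Runtime.getRuntime().removeShutdownHook(hook)) {
            hook.run();
        }
        System.out.println("released=" + released.get());
    }
}
```

Run normally this prints `released=true`; the unrecoverable cases (SIGKILL, hardware loss) are precisely the ones no hook can cover.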
>
>
>
>>
>>
>>
>> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> Restarting doesn't mean you don't call teardown. Except for a bug,
>>> there is - technically - no reason it happens.
>>>
>>> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>>>
>>>> Workers restarting is not a bug; it's standard and often expected.
>>>>
>>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Nothing; as mentioned it is a bug, so recovery is a bug-recovery
>>>>> (procedure)
>>>>>
>>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>>>>> écrit :
>>>>>
>>>>>> So what would you like to happen if there is a crash? The DoFn
>>>>>> instance no longer exists because the JVM it ran on no longer exists. What
>>>>>> should Teardown be called on?
>>>>>>
>>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> This is what I want, and not 999,999 teardowns for 1,000,000 setups
>>>>>>> until there is an unexpected crash (= a bug).
>>>>>>>
>>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> @Reuven: in practice it is created by a pool of 256, but that
>>>>>>>>>>> leads to the same pattern; the teardown is just an "if
>>>>>>>>>>> (iCreatedThem) releaseThem();"
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>>>>> worker.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Nope, it was the other way around: in this case on AWS you can get
>>>>>>>>> 256 instances at once but not 512 (which would be 2x256). So when you
>>>>>>>>> compute the distribution you allocate to some fn the role of owning the
>>>>>>>>> instance lookup and release.
>>>>>>>>>
>>>>>>>>
>>>>>>>> I still don't understand. Let's be more precise. If you write the
>>>>>>>> following code:
>>>>>>>>
>>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>>>>
>>>>>>>> There is no way to control how many instances of MyDoFn are
>>>>>>>> created. The runner might decided to create a million instances of this
>>>>>>>> class across your worker pool, which means that you will get a million
>>>>>>>> Setup and Teardown calls.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Anyway, this was just an example of an external resource you must
>>>>>>>>> release. The real topic is that Beam should define, ASAP, a guaranteed
>>>>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> @Eugene:
>>>>>>>>>>> 1. the wait logic is about passing the value, which is not always
>>>>>>>>>>> possible (in roughly 15% of cases, from my raw estimate)
>>>>>>>>>>> 2. SDF: I'll try to detail why I mention SDF more here
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>>>>>>>>>> core). This API defines a *container* API and therefore implies bean
>>>>>>>>>>> lifecycles. I'll not detail them all but just use the sources and dofn (not
>>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>>>>>>>>>
>>>>>>>>>>> A. Source
>>>>>>>>>>>
>>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>>>>>>>>> estimateSize and split. As a user you can expect both to be called on
>>>>>>>>>>> the same bean instance to avoid paying the same connection cost(s)
>>>>>>>>>>> twice. Concretely:
>>>>>>>>>>>
>>>>>>>>>>> connect()
>>>>>>>>>>> try {
>>>>>>>>>>>   estimateSize()
>>>>>>>>>>>   split()
>>>>>>>>>>> } finally {
>>>>>>>>>>>   disconnect()
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>>>>
>>>>>>>>>>> connect()
>>>>>>>>>>> try {
>>>>>>>>>>>   estimateSize()
>>>>>>>>>>> } finally {
>>>>>>>>>>>   disconnect()
>>>>>>>>>>> }
>>>>>>>>>>> connect()
>>>>>>>>>>> try {
>>>>>>>>>>>   split()
>>>>>>>>>>> } finally {
>>>>>>>>>>>   disconnect()
>>>>>>>>>>> }
>>>>>>>>>>>
>>>>>>>>>>> + a workaround with an internal estimated size, since this
>>>>>>>>>>> primitive is often called in split but you don't want to connect twice
>>>>>>>>>>> in the second phase.
>>>>>>>>>>>
>>>>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>>>>> I insist this is a very, very basic concern for such an API. However,
>>>>>>>>>>> Beam doesn't embrace it and doesn't assume it, so building any API on
>>>>>>>>>>> top of Beam is very painful today, and direct Beam users hit the exact
>>>>>>>>>>> same issues - check how IOs are implemented: the static utilities that
>>>>>>>>>>> create throwaway connections prevent reusing an existing connection in
>>>>>>>>>>> a single method (
>>>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862
>>>>>>>>>>> ).
>>>>>>>>>>>
>>>>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>>>>
>>>>>>>>>>> B. DoFn & SDF
>>>>>>>>>>>
>>>>>>>>>>> As a fn dev you expect the same from the Beam runtime: init();
>>>>>>>>>>> try { while (...) process(); } finally { destroy(); } - and that it is
>>>>>>>>>>> executed on the exact same instance, to be able to be stateful at that
>>>>>>>>>>> level for expensive connections/operations/flow-state handling.
>>>>>>>>>>>
>>>>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>>>>> happen for each single instance, so 1M times in your example.
>>>>>>>>>>>
>>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>>>>> more instances and requires to have a way more strict/explicit definition
>>>>>>>>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>>>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>>>>
>>>>>>>>>>> If you take the JDBC example which was mentioned earlier: today,
>>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle size is
>>>>>>>>>>> not defined - and will not be with SDF - it must use a pool to be able
>>>>>>>>>>> to reuse a connection instance so as not to hurt performance. Now,
>>>>>>>>>>> with SDF and the increase in splits, how do you size the pool?
>>>>>>>>>>> Generally in batch you use a single connection per thread to avoid
>>>>>>>>>>> consuming all database connections. With a pool you have 2 choices:
>>>>>>>>>>> 1. use a pool of 1, or 2. use a slightly bigger pool - but multiplied
>>>>>>>>>>> by the number of beans you will likely x2 or x3 the connection count
>>>>>>>>>>> and make the execution fail with "no more connections available". If
>>>>>>>>>>> you picked 1 (a pool of 1), then you still need a reliable teardown
>>>>>>>>>>> per pool instance (generally close()) to ensure you release the pool
>>>>>>>>>>> and don't leak the connection information in the JVM. In all cases you
>>>>>>>>>>> come back to the init()/destroy() lifecycle, even if you fake it by
>>>>>>>>>>> getting connections per bundle.
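The init()/destroy() contract argued for here can be sketched without any Beam dependency (all names below - Connection, ConnectionFn - are illustrative stand-ins, not Beam's API): one setup per instance, many process calls, exactly one teardown, so an open-resource counter must end at zero.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Connection is a stand-in for any expensive external resource (JDBC, AWS client...).
class Connection implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger();
    Connection() { OPEN.incrementAndGet(); }
    @Override public void close() { OPEN.decrementAndGet(); }
}

// ConnectionFn mimics the lifecycle contract under discussion; it is NOT Beam's API.
class ConnectionFn {
    private Connection connection;

    void setup()    { connection = new Connection(); }    // like @Setup: once per instance
    void process(String element) { /* use connection */ } // like @ProcessElement: many times
    void teardown() { connection.close(); }               // like @Teardown: must pair with setup
}

public class LifecycleDemo {
    public static void main(String[] args) {
        ConnectionFn fn = new ConnectionFn();
        fn.setup();
        try {
            fn.process("record-1");
            fn.process("record-2");
        } finally {
            fn.teardown(); // the guarantee under discussion: reached unless the JVM itself dies
        }
        System.out.println("open connections: " + Connection.OPEN.get());
    }
}
```

This prints `open connections: 0`; the whole debate is about whether a runner may legitimately skip the finally-equivalent step.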
>>>>>>>>>>>
>>>>>>>>>>> Just to make it obvious: the SDF mentions are just because SDF
>>>>>>>>>>> amplifies all the current issues caused by the loose definition of the
>>>>>>>>>>> bean lifecycles, nothing else.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>
>>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>
>>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>>>>
>>>>>>>>>>>> (Would like to reiterate one more time, as the main author of
>>>>>>>>>>>> most design documents related to SDF and of its implementation in the Java
>>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I kind of agree, except that transforms lack a lifecycle too.
>>>>>>>>>>>>> My understanding is that SDF could be a way to unify it and clean up
>>>>>>>>>>>>> the API.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Otherwise, how do we normalize - with a single API - the
>>>>>>>>>>>>> lifecycle of transforms?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's
>>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary, it is around an
>>>>>>>>>>>>>> entire composite PTransform. I think there have been discussions/proposals
>>>>>>>>>>>>>> around a more methodical "cleanup" option, but those haven't been
>>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a
>>>>>>>>>>>>>> bulk copy to put them in the final destination.
>>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>>>>> seeing partial/incomplete results in the final destination).
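As a toy, single-process illustration of these three steps (plain java.nio.file, not FileIO's actual distributed implementation):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class TempFileSinkSketch {
    static long finalCount; // exposed for checking; a real sink would not need this

    public static void main(String[] args) throws IOException {
        Path tmpDir = Files.createTempDirectory("shards-tmp");
        Path outDir = Files.createTempDirectory("shards-out");
        List<Path> tmpFiles = new ArrayList<>();

        // 1. Write N shards to temporary files.
        for (int shard = 0; shard < 3; shard++) {
            Path p = tmpDir.resolve("shard-" + shard + ".tmp");
            Files.write(p, List.of("data for shard " + shard));
            tmpFiles.add(p);
        }

        // 2. Only once ALL shards are complete, move them to the final
        //    destination, so readers never observe a partial result set.
        for (Path p : tmpFiles) {
            Files.move(p, outDir.resolve(
                    p.getFileName().toString().replace(".tmp", ".txt")));
        }

        // 3. Clean up: the temporary directory is now empty and can go away.
        Files.delete(tmpDir);

        try (var listing = Files.list(outDir)) {
            finalCount = listing.count();
        }
        System.out.println("final files: " + finalCount);
    }
}
```

In Beam the three steps run as separate DoFns on different workers, which is why per-DoFn teardown alone cannot express the cleanup.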
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not
>>>>>>>>>>>>>> enough. We need an API for a PTransform to schedule some cleanup work for
>>>>>>>>>>>>>> when the transform is "done". In batch this is relatively straightforward,
>>>>>>>>>>>>>> but doesn't exist. This is the source of some problems, such as BigQuery
>>>>>>>>>>>>>> sink leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
>>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait until the end of
>>>>>>>>>>>>>> the window? In practice, you just want to wait until you know nobody will
>>>>>>>>>>>>>> need the resource anymore.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where
>>>>>>>>>>>>>> you could have a transform that output resource objects. Each resource
>>>>>>>>>>>>>> object would have logic for cleaning it up. And there would be something
>>>>>>>>>>>>>> that indicated what parts of the pipeline needed that resource, and what
>>>>>>>>>>>>>> kind of temporal lifetime those objects had. As soon as that part of the
>>>>>>>>>>>>>> pipeline had advanced far enough that it would no longer need the
>>>>>>>>>>>>>> resources, they would get cleaned up. This can be done at pipeline
>>>>>>>>>>>>>> shutdown, or incrementally during a streaming pipeline, etc.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Would something like this be a better fit for your use case?
>>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a
>>>>>>>>>>>>>>> worker - has its lifecycle. Caricaturally: "new" and garbage
>>>>>>>>>>>>>>> collection.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
>>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What I want is for any "new" to be followed by setup before
>>>>>>>>>>>>>>> any process or startBundle, and, the last time Beam holds the
>>>>>>>>>>>>>>> instance before it is GC-ed and after the last finishBundle, for
>>>>>>>>>>>>>>> Beam to call teardown.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a
>>>>>>>>>>>>>>> fn not self-contained just to implement basic transforms.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>>>>> écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <
>>>>>>>>>>>>>>>>> bchambers@apache.org> a écrit :
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather
>>>>>>>>>>>>>>>>> than focusing on the semantics of the existing methods -- which have
>>>>>>>>>>>>>>>>> been noted to meet many existing use cases -- it would be helpful to
>>>>>>>>>>>>>>>>> focus more on the reason you are looking for something with different
>>>>>>>>>>>>>>>>> semantics.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying
>>>>>>>>>>>>>>>>> to do):
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> When the DoFn is no longer needed logically, i.e. when the
>>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a JVM
>>>>>>>>>>>>>>>>> shutdown)
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some
>>>>>>>>>>>>>>>>> region of the pipeline. While, the DoFn lifecycle methods are not a good
>>>>>>>>>>>>>>>>> fit for this (they are focused on managing resources within the DoFn), you
>>>>>>>>>>>>>>>>> could model this on how FileIO finalizes the files that it produced. For
>>>>>>>>>>>>>>>>> instance:
>>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
>>>>>>>>>>>>>>>>> from changing resource IDs)
>>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is
>>>>>>>>>>>>>>>>> this case, could you elaborate on what you are trying to accomplish? That
>>>>>>>>>>>>>>>>> would help me understand both the problems with existing options and
>>>>>>>>>>>>>>>>> possibly what could be done to help.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases,
>>>>>>>>>>>>>>>>> but it means each transform is different in its lifecycle handling.
>>>>>>>>>>>>>>>>> I dislike it a lot at scale and as a user, since you can't put any
>>>>>>>>>>>>>>>>> unified practice on top of Beam; it also makes Beam very hard to
>>>>>>>>>>>>>>>>> integrate or to use to build higher-level libraries or software.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> This is why I tried not to start the workaround
>>>>>>>>>>>>>>>>> discussions and just stay at the API level.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called
>>>>>>>>>>>>>>>>>>> except in various situations where it's logically impossible or impractical
>>>>>>>>>>>>>>>>>>> to guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm not sure
>>>>>>>>>>>>>>>>>> which other method you speak about. Concretely, if you make it really
>>>>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users can't
>>>>>>>>>>>>>>>>>> use it to clean anything, but if you make it "it can happen but it is
>>>>>>>>>>>>>>>>>> unexpected and means something happened" then it is fine to have a manual -
>>>>>>>>>>>>>>>>>> or auto if fancy - recovery procedure. This is where it makes all the
>>>>>>>>>>>>>>>>>> difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It
>>>>>>>>>>>>>>>>>>>> is also often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents
>>>>>>>>>>>>>>>>>>>> it" but "best effort" is too open and can be very badly and wrongly
>>>>>>>>>>>>>>>>>>>> perceived by users (like I did).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it:
>>>>>>>>>>>>>>>>>>>>> in the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
>>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Feb 18, 2018 at 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>>>>> klk@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection,
>>>>>>>>>>>>>>>>>>>>>>>>> since it is a best-effort thing. Not releasing the connection makes you pay
>>>>>>>>>>>>>>>>>>>>>>>>> a lot - AWS ;) - or prevents you from launching other processing - concurrency limits.
>>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not called then nothing else can be
>>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS service are you thinking of
>>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but
>>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require closing exchanges which are not
>>>>>>>>>>>>>>>>>>>>>>>> only "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services
>>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly at pipeline startup and closing them at the end.
>>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines and money. You can say it can
>>>>>>>>>>>>>>>>>>>>>>>> be done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>>>>>>>>>>> is bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring
>>>>>>>>>>>>>>>>>>>>>>>> the interstellar crash case, which can't be handled by any human system?
>>>>>>>>>>>>>>>>>>>>>>>> Nothing, technically. Why do you push to not handle it? Is it due to some
>>>>>>>>>>>>>>>>>>>>>>>> legacy code in Dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented
>>>>>>>>>>>>>>>>>>>>>>> this way (best-effort). So I'm not sure what kind of change you're asking
>>>>>>>>>>>>>>>>>>>>>>> for.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, he will get a different behavior
>>>>>>>>>>>>>>>>>>>>>>>> in prod? Also don't forget the user doesn't know what the IOs he composes
>>>>>>>>>>>>>>>>>>>>>>>> use, so this is so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big
>>>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>>>>> do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>>>>>>> in normal IT conditions - the execution of teardown. Then
>>>>>>>>>>>>>>>>>>>>>>>> we see if we can handle it, and only if there is a technical reason we can't
>>>>>>>>>>>>>>>>>>>>>>>> do we make it experimental/unsupported in the API. I know Spark and Flink
>>>>>>>>>>>>>>>>>>>>>>>> can; any unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing Beam) is
>>>>>>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. The only case where
>>>>>>>>>>>>>>>>>>>>>>>> that is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>>>>>>>>>>> installed on a customer environment. In that case it belongs to the vendor to
>>>>>>>>>>>>>>>>>>>>>>>> handle the Beam API, and not to Beam to adjust its API for a vendor - otherwise
>>>>>>>>>>>>>>>>>>>>>>>> all features unsupported by one runner should be made optional, right?
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit, defined lifecycle.
>>>>>>>>>>>>>>>>>>>>>>>>
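[Editor's note: the shutdown-hook point above can be illustrated with a minimal, self-contained Java sketch. This is not Beam API; the class and field names are hypothetical. Only `kill -9` (SIGKILL) and hard crashes bypass hooks; a plain kill (SIGTERM) or normal exit runs them.]

```java
// Minimal sketch: a JVM shutdown hook gives an embedding environment a place
// to trigger teardown-like cleanup on a normal exit or a plain kill (SIGTERM).
public class ShutdownHookSketch {
    static volatile boolean tornDown = false;

    // Registers a hook that would run the teardown logic at JVM exit.
    static Thread registerTeardownHook() {
        Thread hook = new Thread(() -> tornDown = true, "teardown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        Thread hook = registerTeardownHook();
        // removeShutdownHook returns true only if the hook was registered,
        // which lets us verify registration without exiting the JVM.
        boolean wasRegistered = Runtime.getRuntime().removeShutdownHook(hook);
        System.out.println("hook registered: " + wasRegistered); // prints "hook registered: true"
    }
}
```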
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>
>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
On Feb 19, 2018 at 22:56, "Reuven Lax" <re...@google.com> wrote:



On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
>
> On Feb 19, 2018 at 21:28, "Reuven Lax" <re...@google.com> wrote:
>
> How do you call teardown? There are cases in which the Java code gets no
> indication that the restart is happening (e.g. cases where the machine
> itself is taken down)
>
>
> This is a bug; 0-downtime maintenance is very doable in 2018 ;). Crashes
> are bugs, and kill -9 to shut down is a bug too. Other cases can call shutdown
> with a hook in the worst case.
>

What you say here is simply not true.

There are many scenarios in which workers shutdown with no opportunity for
any sort of shutdown hook. Sometimes the entire machine gets shutdown, and
not even the OS will have much of a chance to do anything. At scale this
will happen with some regularity, and a distributed system that assumes
this will not happen is a poor distributed system.


This is part of the infra, and there is no reason the machine is shut down
without first shutting down what runs on it, except if it is a bug in the
software or setup. I hear that maybe you don't do it everywhere, but there is
no blocker to doing it. That means you can shut down the machines and
guarantee teardown is called.

Where I'm going is simply that it is doable, and the Beam SDK core can assume
setup is done well. If there is a best-effort downside due to that - with the
meaning you defined - it is an impl bug or a user installation issue.

Technically all is true.

What can prevent teardown is a hardware failure or the like. This is fine and
doesn't need to be in the doc, since it is life in IT and obvious - or it must
be made very explicit to avoid the current ambiguity.



>
>
>
> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Restarting doesn't mean you don't call teardown. Except for a bug, there is
>> no technical reason it happens.
>>
>> On Feb 19, 2018 at 21:14, "Reuven Lax" <re...@google.com> wrote:
>>
>>> Workers restarting is not a bug; it's standard and often expected.
>>>
>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> Nothing, as mentioned it is a bug, so recovery is a bug recovery
>>>> (procedure)
>>>>
>>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>>>> écrit :
>>>>
>>>>> So what would you like to happen if there is a crash? The DoFn
>>>>> instance no longer exists because the JVM it ran on no longer exists. What
>>>>> should Teardown be called on?
>>>>>
>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> This is what I want: not 999,999 teardowns for 1,000,000 setups until
>>>>>> there is an unexpected crash (= a bug).
>>>>>>
>>>>>> On Feb 19, 2018 at 18:57, "Reuven Lax" <re...@google.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> @Reuven: in practice it is created by a pool of 256 but leads to
>>>>>>>>>> the same pattern; the teardown is just an "if (iCreatedThem) releaseThem();"
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>>>> worker.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Nope, it was the other way around: in this case on AWS you can get 256
>>>>>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>>>>>> the distribution you allocate to some fn the role of owning the instance
>>>>>>>> lookup and releasing.
>>>>>>>>
>>>>>>>
>>>>>>> I still don't understand. Let's be more precise. If you write the
>>>>>>> following code:
>>>>>>>
>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>>>
>>>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>>>> The runner might decided to create a million instances of this class across
>>>>>>> your worker pool, which means that you will get a million Setup and
>>>>>>> Teardown calls.
>>>>>>>
>>>>>>>
>>>>>>>> Anyway this was just an example of an external resource you must
>>>>>>>> release. The real topic is that Beam should define ASAP a guaranteed,
>>>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> @Eugene:
>>>>>>>>>> 1. the Wait logic is about passing the value, which is not always
>>>>>>>>>> possible (around 15% of cases from my raw estimate)
>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>>>>>>>>> core). This API defines a *container* API and therefore implies bean
>>>>>>>>>> lifecycles. I'll not detail them all but just use the sources and dofn (not
>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>>>>>>>>
>>>>>>>>>> A. Source
>>>>>>>>>>
>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>>>>>>>> estimateSize and split. As an user you can expect both to be called on the
>>>>>>>>>> same bean instance to avoid to pay the same connection cost(s) twice.
>>>>>>>>>> Concretely:
>>>>>>>>>>
>>>>>>>>>> connect()
>>>>>>>>>> try {
>>>>>>>>>>   estimateSize()
>>>>>>>>>>   split()
>>>>>>>>>> } finally {
>>>>>>>>>>   disconnect()
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>>>
>>>>>>>>>> connect()
>>>>>>>>>> try {
>>>>>>>>>>   estimateSize()
>>>>>>>>>> } finally {
>>>>>>>>>>   disconnect()
>>>>>>>>>> }
>>>>>>>>>> connect()
>>>>>>>>>> try {
>>>>>>>>>>   split()
>>>>>>>>>> } finally {
>>>>>>>>>>   disconnect()
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> plus a workaround with an internal estimated size, since this
>>>>>>>>>> primitive is often called in split but you don't want to connect twice in
>>>>>>>>>> the second phase.
>>>>>>>>>>
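[Editor's note: the contrast above can be sketched in plain Java. This is a self-contained simulation with stand-in connect/disconnect/estimateSize/split helpers, not the actual Beam Source API.]

```java
// Sketch of the lifecycle the email argues for: one connect/disconnect pair
// wrapping both planning primitives, instead of one pair per call.
public class SourceLifecycleSketch {
    static int connects = 0;
    static int disconnects = 0;

    static void connect()      { connects++; }
    static void disconnect()   { disconnects++; }
    static long estimateSize() { return 42L; } // stand-in planning primitive
    static int split()         { return 4; }   // stand-in planning primitive

    /** Runs both planning primitives inside a single connection scope. */
    static void plan() {
        connect();
        try {
            estimateSize();
            split();
        } finally {
            disconnect();
        }
    }

    public static void main(String[] args) {
        plan();
        // prints "connects=1 disconnects=1": one connection for both primitives
        System.out.println("connects=" + connects + " disconnects=" + disconnects);
    }
}
```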
>>>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>>>> I insist it is a very, very basic concern for such an API. However
>>>>>>>>>> Beam doesn't embrace it and doesn't assume it, so building any API on top
>>>>>>>>>> of Beam is very hurtful today, and as a direct Beam user you hit the exact
>>>>>>>>>> same issues - check how IOs are implemented: the static utilities which
>>>>>>>>>> create volatile connections prevent reusing an existing connection in a
>>>>>>>>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>>>>
>>>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>>>
>>>>>>>>>> B. DoFn & SDF
>>>>>>>>>>
>>>>>>>>>> As a fn dev you expect the same from the Beam runtime: init();
>>>>>>>>>> try { while (...) process(); } finally { destroy(); } - and that it is
>>>>>>>>>> executed on the exact same instance, to be able to be stateful at that
>>>>>>>>>> level for expensive connections/operations/flow state handling.
>>>>>>>>>>
>>>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>>>> happen for each single instance, so 1M times in your example.
>>>>>>>>>>
>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>>>> more instances and requires to have a way more strict/explicit definition
>>>>>>>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>>>
>>>>>>>>>> If you take the JDBC example which was mentioned earlier: today,
>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>>>>>>>>> defined - and will not be with SDF - it must use a pool to be able to reuse
>>>>>>>>>> a connection instance to not kill performance. Now with SDF and the
>>>>>>>>>> split increase, how do you handle the pool size? Generally in batch you use
>>>>>>>>>> a single connection per thread to avoid consuming all database
>>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1, or 2. use a
>>>>>>>>>> slightly bigger pool, but multiplied by the number of beans you will likely
>>>>>>>>>> 2x or 3x the connection count and make the execution fail with "no more
>>>>>>>>>> connections available". If you picked 1 (a pool of 1), then you still have to
>>>>>>>>>> have a reliable teardown per pool instance (close() generally) to ensure you
>>>>>>>>>> release the pool and don't leak the connection information in the JVM. In
>>>>>>>>>> all cases you come back to the init()/destroy() lifecycle, even if you fake
>>>>>>>>>> getting connections with bundles.
>>>>>>>>>>
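[Editor's note: a plain-Java sketch of the pool-of-one pattern described above, with the teardown-time close() the email says must be reliable. The `Connection` type is a stand-in, not JDBC or Beam.]

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Sketch: a pool of one connection reused across bundles, with a close()
// that must run exactly once at teardown to avoid leaking the connection.
public class PoolOfOneSketch {
    static class Connection {
        boolean closed = false;
        void close() { closed = true; }
    }

    static class Pool implements AutoCloseable {
        private final Deque<Connection> idle = new ArrayDeque<>();
        private final Connection only = new Connection(); // pool size = 1
        Pool() { idle.push(only); }
        Connection borrow()        { return idle.pop(); }
        void release(Connection c) { idle.push(c); }
        @Override public void close() { only.close(); } // the @Teardown step
        boolean leaked() { return !only.closed; }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();
        for (int bundle = 0; bundle < 3; bundle++) { // several bundles, one connection
            Connection c = pool.borrow();
            pool.release(c);
        }
        pool.close();                                // reliable teardown
        System.out.println("leaked=" + pool.leaked()); // prints "leaked=false"
    }
}
```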
>>>>>>>>>> Just to make it obvious: the SDF mentions are just because SDF
>>>>>>>>>> amplifies all the current issues with the loose definition of the bean
>>>>>>>>>> lifecycles to an exponential level, nothing else.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>
>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <kirpichov@google.com
>>>>>>>>>> >:
>>>>>>>>>>
>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>>>
>>>>>>>>>>> (Would like to reiterate one more time, as the main author of
>>>>>>>>>>> most design documents related to SDF and of its implementation in the Java
>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>>>>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>>>>>>>
>>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of
>>>>>>>>>>>> transforms?
>>>>>>>>>>>>
>>>>>>>>>>>> On Feb 18, 2018 at 21:32, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's
>>>>>>>>>>>>> is appropriate? Many cases where cleanup is necessary, it is around an
>>>>>>>>>>>>> entire composite PTransform. I think there have been discussions/proposals
>>>>>>>>>>>>> around a more methodical "cleanup" option, but those haven't been
>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>>>
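[Editor's note: the write-temp/move/cleanup steps described above can be sketched with plain java.nio.file — a simplified single-shard simulation, not the actual FileIO code.]

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the write-to-temp / atomic-move / cleanup pattern: readers of the
// final destination never observe a partially written file.
public class TempFileSketch {
    static Path writeFinal(Path dir, String name, String contents) throws IOException {
        Path tmp = Files.createTempFile(dir, name, ".tmp");    // step 1: temp shard
        Files.writeString(tmp, contents);
        Path dest = dir.resolve(name);
        Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE); // step 2: atomic move
        // step 3: cleanup - the move consumed the temp file; leftovers from
        // failed attempts would be deleted here.
        return dest;
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("sketch");
        Path out = writeFinal(dir, "shard-0", "data");
        System.out.println(Files.readString(out)); // prints "data"
    }
}
```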
>>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait until the end of
>>>>>>>>>>>>> the window? In practice, you just want to wait until you know nobody will
>>>>>>>>>>>>> need the resource anymore.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would something like this be a better fit for your use case?
>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocate
>>>>>>>>>>>>>> (deserialization) but it doesn't matter here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>>>>>>>>>> process or startBundle, and, the last time Beam has the instance before it
>>>>>>>>>>>>>> is gc-ed and after the last finishBundle, it calls teardown.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn not
>>>>>>>>>>>>>> self-contained in order to implement basic transforms.
>>>>>>>>>>>>>>
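[Editor's note: the per-instance contract the email describes can be simulated in plain Java — new -> setup -> (startBundle -> process* -> finishBundle)* -> teardown -> gc. Names are hypothetical stand-ins, not the Beam DoFn API.]

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: drive one fn instance through the full lifecycle contract,
// recording every call so the ordering can be checked.
public class LifecycleContractSketch {
    static class Fn {
        final List<String> calls = new ArrayList<>();
        void setup()        { calls.add("setup"); }
        void startBundle()  { calls.add("startBundle"); }
        void process(int e) { calls.add("process"); }
        void finishBundle() { calls.add("finishBundle"); }
        void teardown()     { calls.add("teardown"); }
    }

    /** Runs one instance through the contract for the given bundles. */
    static Fn run(int bundles, int elementsPerBundle) {
        Fn fn = new Fn();           // "new"
        fn.setup();                 // always before any processing
        try {
            for (int b = 0; b < bundles; b++) {
                fn.startBundle();
                for (int e = 0; e < elementsPerBundle; e++) fn.process(e);
                fn.finishBundle();
            }
        } finally {
            fn.teardown();          // always after the last finishBundle
        }
        return fn;
    }

    public static void main(String[] args) {
        System.out.println(run(2, 3).calls);
    }
}
```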
>>>>>>>>>>>>>> On Feb 18, 2018 at 20:07, "Reuven Lax" <re...@google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Feb 18, 2018 at 19:28, "Ben Chambers" <
>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather
>>>>>>>>>>>>>>>> than focusing on the semantics of the existing methods -- which have been
>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it would be helpful to focus
>>>>>>>>>>>>>>>> more on the reason you are looking for something with different semantics.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying
>>>>>>>>>>>>>>>> to do):
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When the DoFn is logically no longer needed, i.e. when the batch
>>>>>>>>>>>>>>>> is done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
>>>>>>>>>>>>>>>> from changing resource IDs)
>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>>>>>>
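[Editor's note: Ben's steps (a)-(f) above can be sketched as a resource-token flow in plain Java. All names are hypothetical; this is a simulation of the idea, not a Beam pipeline.]

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Sketch of the resource-ID pattern: resources become data flowing through
// the pipeline, so a final stage can verify every initialized resource was
// freed - this is what makes the cleanup checkpointable.
public class ResourceTokenSketch {
    static Set<String> live = new LinkedHashSet<>();

    static String init(String id) { live.add(id); return id; }    // step (c)
    static String use(String id)  { return id; }                  // step (d)
    static void free(String id)   { live.remove(id); }            // step (f)

    public static void main(String[] args) {
        // step (a): deterministic resource IDs flowing as data
        String[] ids = {"res-0", "res-1", "res-2"};
        for (String id : ids) free(use(init(id)));
        System.out.println("leaked=" + live.size()); // prints "leaked=0"
    }
}
```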
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>>>>>>>>>>>>> industrialize some APIs on top of Beam.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but it
>>>>>>>>>>>>>>>> means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>>>>> dislike it a lot at scale and as a user, since you can't put any unified
>>>>>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard to integrate or to
>>>>>>>>>>>>>>>> use to build higher-level libraries or software.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is why I tried not to start the workaround discussions
>>>>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm not sure
>>>>>>>>>>>>>>>>> which other method you speak about. Concretely, if you make it really
>>>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users can't
>>>>>>>>>>>>>>>>> use it to clean anything, but if you make it "it can happen but it is
>>>>>>>>>>>>>>>>> unexpected and means something happened" then it is fine to have a manual -
>>>>>>>>>>>>>>>>> or auto if fancy - recovery procedure. This is where it makes all the
>>>>>>>>>>>>>>>>> difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It is
>>>>>>>>>>>>>>>>>>> also often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it"
>>>>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>>>>> by users (like I did).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On 18 Feb 2018 at 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>>>> klk@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>>>>>>>>>> since it is a best effort thing. Not releasing the connection makes you
>>>>>>>>>>>>>>>>>>>>>>>> pay a lot - aws ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>>>>>>>>>> concurrent limit.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not called then nothing else can be
>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS service are you thinking of
>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but
>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require some closing exchanges which are not
>>>>>>>>>>>>>>>>>>>>>>> only "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services
>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly in a pipeline startup and closing them at the end.
>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines and money. You can say it can
>>>>>>>>>>>>>>>>>>>>>>> be done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>>>>>>>>>> is bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> What prevents us from enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system?
>>>>>>>>>>>>>>>>>>>>>>> Nothing technically. Why do you push to not handle it? Is it due to some
>>>>>>>>>>>>>>>>>>>>>>> legacy code on Dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will he get a different
>>>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs he
>>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impacting for the whole product that it must
>>>>>>>>>>>>>>>>>>>>>>> be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big
>>>>>>>>>>>>>>>>>>>>>>> data world, but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>>>> do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing
>>>>>>>>>>>>>>>>>>>>>>> - in normal IT conditions - the execution of teardown. Then we see if we
>>>>>>>>>>>>>>>>>>>>>>> can handle it, and only if there is a technical reason we can't do we make
>>>>>>>>>>>>>>>>>>>>>>> it experimental/unsupported in the API. I know Spark and Flink can; any
>>>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing Beam)
>>>>>>>>>>>>>>>>>>>>>>> is fully unhandled and your overall system is uncontrolled. The only case
>>>>>>>>>>>>>>>>>>>>>>> where this is not true is when the software is always owned by a vendor
>>>>>>>>>>>>>>>>>>>>>>> and never installed on a customer environment. In that case it belongs to
>>>>>>>>>>>>>>>>>>>>>>> the vendor to handle the Beam API and not to Beam to adjust its API for a
>>>>>>>>>>>>>>>>>>>>>>> vendor - otherwise all features unsupported by one runner should be made
>>>>>>>>>>>>>>>>>>>>>>> optional, right?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
On Mon, Feb 19, 2018 at 1:51 PM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
>
> On 19 Feb 2018 at 21:28, "Reuven Lax" <re...@google.com> wrote:
>
> How do you call teardown? There are cases in which the Java code gets no
> indication that the restart is happening (e.g. cases where the machine
> itself is taken down)
>
>
> This is a bug; 0-downtime maintenance is very doable in 2018 ;). Crashes
> are bugs, and kill -9 to shut down is a bug too. For other cases, worst
> case, we can call shutdown from a hook.
>

What you say here is simply not true.

There are many scenarios in which workers shutdown with no opportunity for
any sort of shutdown hook. Sometimes the entire machine gets shutdown, and
not even the OS will have much of a chance to do anything. At scale this
will happen with some regularity, and a distributed system that assumes
this will not happen is a poor distributed system.
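The shutdown-hook argument and this objection can both be seen in a few lines of plain Java (a sketch with illustrative names, not Beam code): a registered hook covers a normal exit or a SIGTERM, but a `kill -9`, kernel panic, or power loss ends the process before any hook thread can run, which is exactly why hook-based cleanup is only best-effort.

```java
// Sketch: why a JVM shutdown hook cannot be a hard guarantee. The hook runs
// on normal termination or SIGTERM, but never on SIGKILL, an OS crash, or
// hardware failure. Class and method names are illustrative.
public class TeardownHookDemo {
    static volatile boolean cleaned = false;

    static void teardown() {
        cleaned = true; // stands in for closing connections, releasing VMs, etc.
    }

    public static void main(String[] args) {
        // Fires on normal exit or SIGTERM...
        Runtime.getRuntime().addShutdownHook(new Thread(TeardownHookDemo::teardown));
        // ...but 'kill -9 <pid>' of this JVM would skip the hook entirely,
        // so anything keyed on it can leak.
    }
}
```

So the hook narrows the window in which teardown is missed, but it cannot close it.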


>
>
>
> On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Restarting doesn't mean you don't call teardown. Except for a bug, there
>> is - technically - no reason for it to happen.
>>
>> On 19 Feb 2018 at 21:14, "Reuven Lax" <re...@google.com> wrote:
>>
>>> Workers restarting is not a bug; it's standard and often expected.
>>>
>>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> Nothing, as mentioned it is a bug, so recovery is a bug-recovery
>>>> (procedure)
>>>>
>>>> On 19 Feb 2018 at 19:42, "Eugene Kirpichov" <ki...@google.com>
>>>> wrote:
>>>>
>>>>> So what would you like to happen if there is a crash? The DoFn
>>>>> instance no longer exists because the JVM it ran on no longer exists. What
>>>>> should Teardown be called on?
>>>>>
>>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> This is what I want, and not 999999 teardowns for 1000000 setups until
>>>>>> there is an unexpected crash (= a bug).
>>>>>>
>>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> @Reuven: in practice it is created by a pool of 256 but leads to
>>>>>>>>>> the same pattern; the teardown is just an "if (iCreatedThem) releaseThem();"
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>>>> worker.
>>>>>>>>>
>>>>>>>>
>>>>>>>> No, it was the other way around: in this case on AWS you can get 256
>>>>>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>>>>>> the distribution you allocate to some fn the role of owning the instance
>>>>>>>> lookup and releasing.
>>>>>>>>
>>>>>>>
>>>>>>> I still don't understand. Let's be more precise. If you write the
>>>>>>> following code:
>>>>>>>
>>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>>>
>>>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>>>> The runner might decide to create a million instances of this class across
>>>>>>> your worker pool, which means that you will get a million Setup and
>>>>>>> Teardown calls.
>>>>>>>
>>>>>>>
>>>>>>>> Anyway this was just an example of an external resource you must
>>>>>>>> release. The real topic is that Beam should define ASAP a guaranteed
>>>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> @Eugene:
>>>>>>>>>> 1. wait logic is about passing the value which is not always
>>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Concretely beam exposes a portable API (included in the SDK
>>>>>>>>>> core). This API defines a *container* API and therefore implies bean
>>>>>>>>>> lifecycles. I'll not detail them all but just use the sources and dofn (not
>>>>>>>>>> sdf) to illustrate the idea I'm trying to develop.
>>>>>>>>>>
>>>>>>>>>> A. Source
>>>>>>>>>>
>>>>>>>>>> A source computes a partition plan with 2 primitives:
>>>>>>>>>> estimateSize and split. As an user you can expect both to be called on the
>>>>>>>>>> same bean instance to avoid to pay the same connection cost(s) twice.
>>>>>>>>>> Concretely:
>>>>>>>>>>
>>>>>>>>>> connect()
>>>>>>>>>> try {
>>>>>>>>>>   estimateSize()
>>>>>>>>>>   split()
>>>>>>>>>> } finally {
>>>>>>>>>>   disconnect()
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>>>
>>>>>>>>>> connect()
>>>>>>>>>> try {
>>>>>>>>>>   estimateSize()
>>>>>>>>>> } finally {
>>>>>>>>>>   disconnect()
>>>>>>>>>> }
>>>>>>>>>> connect()
>>>>>>>>>> try {
>>>>>>>>>>   split()
>>>>>>>>>> } finally {
>>>>>>>>>>   disconnect()
>>>>>>>>>> }
>>>>>>>>>>
>>>>>>>>>> + a workaround with an internal estimate size since this
>>>>>>>>>> primitive is often called in split but you dont want to connect twice in
>>>>>>>>>> the second phase.
>>>>>>>>>>
>>>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>>>> I insist it is a very, very basic concern for such an API. However
>>>>>>>>>> Beam doesn't embrace it and doesn't assume it, so building any API on top
>>>>>>>>>> of Beam is very hurtful today, and direct Beam users hit the exact
>>>>>>>>>> same issues - check how IOs are implemented: the static utilities which
>>>>>>>>>> create volatile connections prevent reusing an existing connection in a
>>>>>>>>>> single method (https://github.com/apache/bea
>>>>>>>>>> m/blob/master/sdks/java/io/elasticsearch/src/main/java/org/
>>>>>>>>>> apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>>>>
>>>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>>>
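The single connect/disconnect lifecycle asked for above can be sketched in plain Java (the names `PlanningConnection`, `estimateSize` and `split` are illustrative, not the actual Source API): both planning primitives run on one live connection, opened and closed exactly once.

```java
// Sketch of the guaranteed lifecycle the source API is asked to provide:
// one connect/disconnect pair wrapping both planning primitives, instead
// of one pair per call. All names here are illustrative.
import java.util.Collections;
import java.util.List;
import java.util.function.Supplier;

public class SourcePlanning {
    interface PlanningConnection extends AutoCloseable {
        long estimateSize();
        List<String> split(int desiredShards);
        @Override
        void close(); // narrows away AutoCloseable's checked exception
    }

    // With a guaranteed lifecycle, the runtime could do this once:
    static List<String> plan(Supplier<PlanningConnection> connector, int shards) {
        try (PlanningConnection c = connector.get()) { // connect()
            // Both primitives reuse the same live connection:
            if (c.estimateSize() == 0) {
                return Collections.emptyList(); // nothing to read, no shards
            }
            return c.split(shards);
        } // disconnect(), guaranteed by try-with-resources
    }
}
```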
>>>>>>>>>> B. DoFn & SDF
>>>>>>>>>>
>>>>>>>>>> As a fn dev you expect the same from the Beam runtime: init();
>>>>>>>>>> try { while (...) process(); } finally { destroy(); }, and that it is
>>>>>>>>>> executed on the exact same instance, to be able to be stateful at that
>>>>>>>>>> level for expensive connections/operations/flow state handling.
>>>>>>>>>>
>>>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>>>> happen for each single instance, so 1M times in your example.
>>>>>>>>>>
>>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>>>> more instances and requires to have a way more strict/explicit definition
>>>>>>>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>>>
>>>>>>>>>> If you take the JDBC example which was mentioned earlier: today,
>>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>>>>>>>>> defined - and will not be with SDF - it must use a pool to be able to
>>>>>>>>>> reuse a connection instance so as not to kill performance. Now with SDF
>>>>>>>>>> and the split increase, how do you handle the pool size? Generally in
>>>>>>>>>> batch you use a single connection per thread to avoid consuming all
>>>>>>>>>> database connections. With a pool you have 2 choices: 1. use a pool of 1,
>>>>>>>>>> 2. use a pool a bit larger, but multiplied by the number of beans you will
>>>>>>>>>> likely x2 or x3 the connection count and make the execution fail with "no
>>>>>>>>>> more connection available". If you picked 1 (pool of #1), then you still
>>>>>>>>>> have to have a reliable teardown per pool instance (close() generally) to
>>>>>>>>>> ensure you release the pool and don't leak the connection information in
>>>>>>>>>> the JVM. In all cases you come back to the init()/destroy() lifecycle,
>>>>>>>>>> even if you fake getting connections with bundles.
>>>>>>>>>>
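The "pool of 1" situation above can be sketched in a few lines of plain Java (illustrative names, not JdbcIO code): the per-instance pool hands out one reusable connection, and without a reliable teardown its `close()` is never guaranteed to run, which is exactly the leak being described.

```java
// Sketch of a per-instance "pool of 1": one lazily opened, reusable
// connection whose release depends entirely on close() being called.
// The Object field stands in for a real JDBC Connection.
public class PoolOfOne implements AutoCloseable {
    private Object connection;
    private boolean closed = false;

    Object borrow() {
        if (connection == null) {
            connection = new Object(); // open the single connection on demand
        }
        return connection;             // every borrow reuses the same one
    }

    @Override
    public void close() {              // must be called by a reliable teardown
        connection = null;             // otherwise the connection leaks
        closed = true;
    }

    boolean isClosed() { return closed; }
}
```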
>>>>>>>>>> Just to make it obvious: SDF mentions are just cause SDF imply
>>>>>>>>>> all the current issues with the loose definition of the bean lifecycles at
>>>>>>>>>> an exponential level, nothing else.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>
>>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <kirpichov@google.com
>>>>>>>>>> >:
>>>>>>>>>>
>>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>>>
>>>>>>>>>>> (Would like to reiterate one more time, as the main author of
>>>>>>>>>>> most design documents related to SDF and of its implementation in the Java
>>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>>>>>>>>>> understanding is that SDF could be a way to unify it and clean the API.
>>>>>>>>>>>>
>>>>>>>>>>>> Otherwise how do we normalize - with a single API - the lifecycle
>>>>>>>>>>>> of transforms?
>>>>>>>>>>>>
>>>>>>>>>>>> On 18 Feb 2018 at 21:32, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFns
>>>>>>>>>>>>> is appropriate? In many cases where cleanup is necessary, it is around an
>>>>>>>>>>>>> entire composite PTransform. I think there have been discussions/proposals
>>>>>>>>>>>>> around a more methodical "cleanup" option, but those haven't been
>>>>>>>>>>>>> implemented, to the best of my knowledge.
>>>>>>>>>>>>>
>>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>>>
>>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>>>>
>>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
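The three steps above can be sketched with plain `java.nio.file` calls (an illustrative sketch, not the actual FileIO implementation): write to a temporary file, move it into the final destination, and sweep up what's left.

```java
// Sketch of the write-temp / bulk-move / cleanup pattern described above.
// The class name and directory layout are illustrative.
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempFileCommit {
    // Writes data to <name>.tmp in tmpDir, then moves it into finalDir as <name>.
    public static Path commit(Path tmpDir, Path finalDir, String name, byte[] data) {
        try {
            Files.createDirectories(tmpDir);
            Files.createDirectories(finalDir);
            Path tmp = tmpDir.resolve(name + ".tmp"); // 1. write a temporary file
            Files.write(tmp, data);
            Path dest = finalDir.resolve(name);       // 2. move to the destination
            Files.move(tmp, dest, StandardCopyOption.REPLACE_EXISTING);
            return dest;                              // 3. the temp file is gone;
            // a real sink would also sweep stale *.tmp files from failed attempts.
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static boolean demo() {
        try {
            Path base = Files.createTempDirectory("commit-demo");
            Path out = commit(base.resolve("tmp"), base.resolve("final"),
                    "part-0", "hello".getBytes());
            return Files.exists(out)
                    && !Files.exists(base.resolve("tmp").resolve("part-0.tmp"));
        } catch (IOException e) {
            return false;
        }
    }
}
```

Step 3 is where the missing "transform is done" hook bites: if the pipeline dies between steps 1 and 2, nothing sweeps the leftover `.tmp` files.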
>>>>>>>>>>>>>
>>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
>>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait until the end of
>>>>>>>>>>>>> the window? In practice, you just want to wait until you know nobody will
>>>>>>>>>>>>> need the resource anymore.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Would something like this be a better fit for your use case?
>>>>>>>>>>>>> If not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In practice, "new" is often an unsafe allocation
>>>>>>>>>>>>>> (deserialization), but it doesn't matter here.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>>>>>>>>>> process or startBundle, and the last time Beam has the instance, before it
>>>>>>>>>>>>>> is gc-ed and after the last finishBundle, it calls teardown.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn
>>>>>>>>>>>>>> not self-contained to implement basic transforms.
>>>>>>>>>>>>>>
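The per-instance contract being requested can be sketched in plain Java (illustrative names, not the Beam runner code): for every instance the runtime creates, `setup()` runs before any element and `teardown()` runs exactly once when the instance is retired, even on failure.

```java
// Sketch of the requested per-instance lifecycle: one setup before any
// processing, one matching teardown when the instance is retired.
public class FnLifecycle {
    int setups = 0, processed = 0, teardowns = 0;

    void setup()    { setups++; }     // e.g. open an expensive connection
    void process()  { processed++; }  // per-element work
    void teardown() { teardowns++; }  // e.g. close the connection

    // What the runtime is asked to guarantee for each instance it creates:
    void runInstance(int elements) {
        setup();
        try {
            for (int i = 0; i < elements; i++) {
                process();
            }
        } finally {
            teardown(); // matches the setup even if process() threw
        }
    }
}
```

Under this contract, 1M instances mean 1M setup/teardown pairs; what it rules out is a setup whose teardown silently never happens.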
>>>>>>>>>>>>>> On 18 Feb 2018 at 20:07, "Reuven Lax" <re...@google.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On 18 Feb 2018 at 19:28, "Ben Chambers" <
>>>>>>>>>>>>>>>> bchambers@apache.org> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather
>>>>>>>>>>>>>>>> than focusing on the semantics of the existing methods -- which have been
>>>>>>>>>>>>>>>> noted to meet many existing use cases -- it would be helpful to focus more
>>>>>>>>>>>>>>>> on the reason you are looking for something with different semantics.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying
>>>>>>>>>>>>>>>> to do):
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the
>>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries
>>>>>>>>>>>>>>>> from changing resource IDs)
>>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>>>>>>
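The resource-token pattern in steps (a)-(f) can be sketched in plain Java (illustrative names, not a Beam API): resources travel through the pipeline as values carrying their own ID and cleanup, so a later step can free them deterministically instead of relying on a DoFn teardown.

```java
// Sketch of steps (a)-(f): tokens represent resources as data, and a
// dedicated later step releases them. All names are illustrative.
import java.util.ArrayList;
import java.util.List;

public class ResourceTokens {
    static class Token {
        final String id;
        boolean released = false;
        Token(String id) { this.id = id; }   // (a)+(c) allocate under an ID
        void release() { released = true; }  // (f) the step that frees it
    }

    // (d) downstream segments use the tokens; once their output is known
    // to be complete - (e) deterministic input - every token is freed.
    static List<Token> runAndRelease(int count) {
        List<Token> tokens = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            tokens.add(new Token("resource-" + i));
        }
        for (Token t : tokens) {
            t.release();
        }
        return tokens;
    }
}
```

Because each token records what was allocated, a retry or sweep step can always reconstruct what still needs cleaning up.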
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but
>>>>>>>>>>>>>>>> that means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you can't put any unified
>>>>>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard to integrate or to
>>>>>>>>>>>>>>>> use to build higher-level libraries or software.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> This is why i tried to not start the workaround discussions
>>>>>>>>>>>>>>>> and just stay at API level.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not
>>>>>>>>>>>>>>>>> sure which other method you are speaking about. Concretely, if you make it
>>>>>>>>>>>>>>>>> really unreliable - this is what "best effort" sounds like to me - then
>>>>>>>>>>>>>>>>> users can't use it to clean anything, but if you make it "can happen but it
>>>>>>>>>>>>>>>>> is unexpected and means something happened" then it is fine to have a
>>>>>>>>>>>>>>>>> manual - or automatic if fancy - recovery procedure. This is where it makes
>>>>>>>>>>>>>>>>> all the difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It is
>>>>>>>>>>>>>>>>>>> also often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it"
>>>>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>>>>> by users (like I did).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe
>>>>>>>>>>>>>>>>>>>> such behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau
>>>>>>>>>>>>>>>>>>>>>>> <rm...@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>>>>>>>>>> since it is a best effort thing. Not releasing the connection makes you
>>>>>>>>>>>>>>>>>>>>>>>> pay a lot - aws ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>>>>>>>>>> concurrent limit.
>>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If
>>>>>>>>>>>>>>>>>>>>>>> things die so badly that @Teardown is not called then nothing else can be
>>>>>>>>>>>>>>>>>>>>>>> called to close the connection either. What AWS service are you thinking of
>>>>>>>>>>>>>>>>>>>>>>> that stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless but
>>>>>>>>>>>>>>>>>>>>>>> some (proprietary) protocols require some closing exchanges which are not
>>>>>>>>>>>>>>>>>>>>>>> only "im leaving".
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> For aws i was thinking about starting some services
>>>>>>>>>>>>>>>>>>>>>>> - machines - on the fly in a pipeline startup and closing them at the end.
>>>>>>>>>>>>>>>>>>>>>>> If teardown is not called you leak machines and money. You can say it can
>>>>>>>>>>>>>>>>>>>>>>> be done another way...as the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky but if beam can't handle its
>>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>>>>>>>>>> is bound to some particular IO.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>>>> interstellar crash case which can't be handled by any human system? Nothing
>>>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, he will get a different
>>>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs he
>>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impacting for the whole product that it must
>>>>>>>>>>>>>>>>>>>>>>> be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in big
>>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>>>> do it wrong before doing right ;).
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>>>>>> in normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>>>>>>>>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through java
>>>>>>>>>>>>>>>>>>>>>>> shutdown hooks otherwise your environment (beam enclosing software) is
>>>>>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. Only case where it
>>>>>>>>>>>>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>>>>>>>>>> installed on customer environment. In this case it belongs to the vendor to
>>>>>>>>>>>>>>>>>>>>>>> handle beam API and not to beam to adjust its API for a vendor - otherwise
>>>>>>>>>>>>>>>>>>>>>>> all unsupported features by one runner should be made optional right?
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> All state is not about network, even in distributed
>>>>>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 19 févr. 2018 21:28, "Reuven Lax" <re...@google.com> a écrit :

How do you call teardown? There are cases in which the Java code gets no
indication that the restart is happening (e.g. cases where the machine
itself is taken down)


This is a bug; zero-downtime maintenance is very doable in 2018 ;). Crashes
are bugs, and a kill -9 to shut down is a bug too. Other cases can call
shutdown through a hook in the worst case.
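Concretely, a shutdown hook covers the orderly cases - normal exit and a plain kill (SIGTERM) - while SIGKILL and hardware failure remain uncoverable. A minimal sketch; the `teardownAll()` helper is hypothetical, standing in for whatever cleanup the runner would drive:

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class TeardownHook {
    static final AtomicBoolean torndown = new AtomicBoolean(false);

    // Hypothetical cleanup standing in for runner-driven teardown:
    // close connections, stop machines started by the job, etc.
    static void teardownAll() {
        torndown.set(true);
    }

    public static void main(String[] args) {
        // Normal exit and SIGTERM both run registered shutdown hooks;
        // kill -9 (SIGKILL) does not - that is the truly unrecoverable case.
        Runtime.getRuntime().addShutdownHook(new Thread(TeardownHook::teardownAll));
        // ... pipeline work would run here ...
    }
}
```

The hook covers orderly shutdown; it is exactly the SIGKILL and hardware-failure cases discussed in this thread that no hook can cover.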



On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Restarting doesn't mean you don't call teardown. Except for a bug there is
> no reason - technically - for teardown to be skipped.
>
> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>
>> Workers restarting is not a bug, it's standard and often expected.
>>
>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> Nothing, as mentioned it is a bug so recovery is a bug recovery
>>> (procedure)
>>>
>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>>> écrit :
>>>
>>>> So what would you like to happen if there is a crash? The DoFn instance
>>>> no longer exists because the JVM it ran on no longer exists. What should
>>>> Teardown be called on?
>>>>
>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> This is what I want, and not 999999 teardowns for 1000000 setups until
>>>>> there is an unexpected crash (= a bug).
>>>>>
>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> @Reuven: in practice it is created by a pool of 256 but leads to the
>>>>>>>>> same pattern, the teardown is just a "if (iCreatedThem) releaseThem();"
>>>>>>>>>
>>>>>>>>
>>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>>> worker.
>>>>>>>>
>>>>>>>
>>>>>>> Nope, it was the other way around: in this case on AWS you can get 256
>>>>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>>>>> the distribution you allocate to some fn the role of owning the instance
>>>>>>> lookup and release.
>>>>>>>
>>>>>>
>>>>>> I still don't understand. Let's be more precise. If you write the
>>>>>> following code:
>>>>>>
>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>>
>>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>>> The runner might decided to create a million instances of this class across
>>>>>> your worker pool, which means that you will get a million Setup and
>>>>>> Teardown calls.
>>>>>>
>>>>>>
>>>>>>> Anyway this was just an example of an external resource you must
>>>>>>> release. The real topic is that beam should define asap a guaranteed
>>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> @Eugene:
>>>>>>>>> 1. wait logic is about passing the value which is not always
>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Concretely beam exposes a portable API (included in the SDK core).
>>>>>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>>>>>> I'll not detail them all but just use the sources and dofn (not sdf) to
>>>>>>>>> illustrate the idea I'm trying to develop.
>>>>>>>>>
>>>>>>>>> A. Source
>>>>>>>>>
>>>>>>>>> A source computes a partition plan with 2 primitives: estimateSize
>>>>>>>>> and split. As a user you can expect both to be called on the same bean
>>>>>>>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>>>>
>>>>>>>>> connect()
>>>>>>>>> try {
>>>>>>>>>   estimateSize()
>>>>>>>>>   split()
>>>>>>>>> } finally {
>>>>>>>>>   disconnect()
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>>
>>>>>>>>> connect()
>>>>>>>>> try {
>>>>>>>>>   estimateSize()
>>>>>>>>> } finally {
>>>>>>>>>   disconnect()
>>>>>>>>> }
>>>>>>>>> connect()
>>>>>>>>> try {
>>>>>>>>>   split()
>>>>>>>>> } finally {
>>>>>>>>>   disconnect()
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> + a workaround with an internal estimate size since this primitive
>>>>>>>>> is often called in split but you don't want to connect twice in the second
>>>>>>>>> phase.
>>>>>>>>>
>>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>>> I insist it is a very, very basic concern for such an API. However
>>>>>>>>> beam doesn't embrace it and doesn't assume it, so building any API on top
>>>>>>>>> of beam is very hurtful today, and as a direct beam user you hit the exact
>>>>>>>>> same issues - check how IOs are implemented: the static utilities which
>>>>>>>>> create volatile connections prevent reusing an existing connection in a
>>>>>>>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>>>
>>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>>
>>>>>>>>> B. DoFn & SDF
>>>>>>>>>
>>>>>>>>> As a fn dev you expect the same from the beam runtime: init(); try
>>>>>>>>> { while (...) process(); } finally { destroy(); } and that it is executed
>>>>>>>>> on the exact same instance to be able to be stateful at that level for
>>>>>>>>> expensive connections/operations/flow state handling.
>>>>>>>>>
>>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>>> happen for each single instance, so 1M times in your example.
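The init/process/destroy contract described above, sketched as a plain-Java driver - the names are illustrative, not Beam's actual SPI:

```java
import java.util.List;

// Illustrative lifecycle contract - not Beam's actual SPI: each instance
// gets exactly one init() before any element and exactly one destroy()
// after the last one, on the same instance, even if processing fails.
interface LifecycleFn<T> {
    void init();
    void process(T element);
    void destroy();
}

final class LifecycleDriver {
    static <T> void run(LifecycleFn<T> fn, List<T> elements) {
        fn.init();
        try {
            for (T e : elements) {
                fn.process(e);
            }
        } finally {
            // Guaranteed on the same instance that saw init().
            fn.destroy();
        }
    }
}
```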
>>>>>>>>>
>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>>> more instances and requires to have a way more strict/explicit definition
>>>>>>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>>
>>>>>>>>> Take the JDBC example which was mentioned earlier. Today,
>>>>>>>>> because of the teardown issue it uses bundles. Since bundle size is not
>>>>>>>>> defined - and will not be with SDF - it must use a pool to be able to
>>>>>>>>> reuse a connection instance to avoid killing performance. Now with SDF and
>>>>>>>>> the split increase, how do you handle the pool size? Generally in batch you
>>>>>>>>> use a single connection per thread to avoid consuming all database
>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use a
>>>>>>>>> somewhat bigger pool, but multiplied by the number of beans you will likely
>>>>>>>>> x2 or x3 the connection count and make the execution fail with "no more
>>>>>>>>> connection available". If you picked 1 (pool of #1), then you still have to
>>>>>>>>> have a reliable teardown per pool instance (close() generally) to ensure
>>>>>>>>> you release the pool and don't leak the connection information in the JVM.
>>>>>>>>> In all cases you come back to the init()/destroy() lifecycle even if you
>>>>>>>>> fake it by scoping connections to bundles.
>>>>>>>>>
>>>>>>>>> Just to make it obvious: the SDF mentions are just because SDF
>>>>>>>>> implies all the current issues with the loose definition of the bean
>>>>>>>>> lifecycles at an exponential level, nothing else.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Romain Manni-Bucau
>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>
>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>
>>>>>>>>> :
>>>>>>>>>
>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>>
>>>>>>>>>> (Would like to reiterate one more time, as the main author of
>>>>>>>>>> most design documents related to SDF and of its implementation in the Java
>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>>>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>>>>>>
>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of
>>>>>>>>>>> transforms?
>>>>>>>>>>>
>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>>>> écrit :
>>>>>>>>>>>
>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>>>>>> appropriate? In many cases where cleanup is necessary, it is around an entire
>>>>>>>>>>>> composite PTransform. I think there have been discussions/proposals around
>>>>>>>>>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>>>>>>>>>> the best of my knowledge.
>>>>>>>>>>>>
>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>>
>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>>>
>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>>
>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
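A minimal local sketch of those three steps, with `java.nio.file` standing in for a distributed filesystem - names, layout and shard count are illustrative, not FileIO's actual code:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

public class TempThenCommit {
    // Step 1: every worker writes its shard to a temporary file.
    public static List<Path> writeShards(Path tmpDir, int shards) throws IOException {
        List<Path> temps = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            Path t = Files.createTempFile(tmpDir, "shard-" + i + "-", ".tmp");
            Files.writeString(t, "data-" + i);
            temps.add(t);
        }
        return temps;
    }

    // Step 2: one worker moves all completed temp files to the final
    // destination, so readers never observe a partial result set.
    public static void commit(List<Path> temps, Path dest) throws IOException {
        Files.createDirectories(dest);
        for (Path t : temps) {
            String finalName = t.getFileName().toString().replace(".tmp", "");
            Files.move(t, dest.resolve(finalName), StandardCopyOption.ATOMIC_MOVE);
        }
        // Step 3: cleanup means deleting whatever *.tmp files remain after a
        // failure - exactly the step that leaks when no reliable hook runs it.
    }
}
```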
>>>>>>>>>>>>
>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait until the end of
>>>>>>>>>>>> the window? In practice, you just want to wait until you know nobody will
>>>>>>>>>>>> need the resource anymore.
>>>>>>>>>>>>
>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>>>> its lifecycle. Caricaturing a bit: "new" and garbage collection.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In practice, new is often an unsafe allocate (deserialization)
>>>>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I want is any "new" to be followed by setup before any
>>>>>>>>>>>>> process or startBundle, and the last time beam has the instance before it
>>>>>>>>>>>>> is gc-ed, after the last finishBundle, it calls teardown.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn
>>>>>>>>>>>>> not self-contained to implement basic transforms.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>>> écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>>>>>> to meet many existing use cases -- it would be helpful to focus more on
>>>>>>>>>>>>>>> the reason you are looking for something with different semantics.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>>>>> do):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the
>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
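A rough sketch of that cleanup idea - resource objects carrying their own cleanup logic, plus a tracker that releases them once nothing downstream needs them. All names are hypothetical: this API was only discussed, never implemented in Beam.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical "resource object": carries its own cleanup logic.
interface ManagedResource {
    void cleanup();
}

final class ResourceTracker {
    private final Deque<ManagedResource> live = new ArrayDeque<>();

    // A ParDo that initializes a resource would also register it here.
    void register(ManagedResource r) {
        live.push(r);
    }

    // Invoked once the pipeline (or window) has advanced past every
    // consumer of the registered resources: at batch shutdown, or
    // incrementally during a streaming pipeline.
    void releaseAll() {
        while (!live.isEmpty()) {
            live.pop().cleanup();
        }
    }
}
```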
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>>>>>>>>>>>> industrialize some api on top of beam.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but
>>>>>>>>>>>>>>> it means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>>>> dislike it a lot at scale and as a user since you can't put any unified
>>>>>>>>>>>>>>> practice on top of beam; it also makes beam very hard to integrate or to
>>>>>>>>>>>>>>> use to build higher level libraries or software.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is why I tried to not start the workaround discussions
>>>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee, sadly, so I'm not
>>>>>>>>>>>>>>>> sure which other method you speak about. Concretely, if you make it really
>>>>>>>>>>>>>>>> unreliable - this is what best effort sounds like to me - then users can't
>>>>>>>>>>>>>>>> use it to clean anything, but if you make it "can fail to happen but it is
>>>>>>>>>>>>>>>> unexpected and means something happened" then it is fine to have a manual
>>>>>>>>>>>>>>>> - or auto if fancy - recovery procedure. This is where it makes all the
>>>>>>>>>>>>>>>> difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Agreed Eugene, except that "best effort" doesn't only mean
>>>>>>>>>>>>>>>>>> that. It is also often used to say "at will" and this is what triggered
>>>>>>>>>>>>>>>>>> this thread.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it"
>>>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>>>> by users (like I did).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>>>>>>>>> since it is a best effort thing. Not releasing the connection makes you
>>>>>>>>>>>>>>>>>>>>>>> pay a lot - aws ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>>>>>>>>> concurrent limit.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>>>>>>>>>>>>>>> "im leaving".
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For aws i was thinking about starting some services -
>>>>>>>>>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky but if beam can't handle its
>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>>>>>>>>> is bound to some particular IO.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>>> interstellar crash case which can't be handled by any human system? Nothing
>>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, he will get a different
>>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs he
>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impacting for the whole product that it must
>>>>>>>>>>>>>>>>>>>>>> be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in big
>>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>>> do it wrong before doing right ;).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent to guarantee
>>>>>>>>>>>>>>>>>>>>>> - in the normal IT conditions - the execution of teardown. Then we see if
>>>>>>>>>>>>>>>>>>>>>> we can handle it and only if there is a technical reason we cant we make it
>>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing Beam)
>>>>>>>>>>>>>>>>>>>>>> is fully unhandled and your overall system is uncontrolled. The only case
>>>>>>>>>>>>>>>>>>>>>> where this is not true is when the software is always owned by a vendor
>>>>>>>>>>>>>>>>>>>>>> and never installed on a customer environment. In that case it belongs to
>>>>>>>>>>>>>>>>>>>>>> the vendor to handle the Beam API, and not to Beam to adjust its API for a
>>>>>>>>>>>>>>>>>>>>>> vendor - otherwise all features unsupported by one runner should be made
>>>>>>>>>>>>>>>>>>>>>> optional, right?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
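The shutdown-hook mechanism mentioned above can be illustrated with plain Java (the class and method names are illustrative, not Beam API). A hook registered this way runs on normal exit and on a polite kill (SIGTERM), but not on SIGKILL or a hard JVM crash:

```java
public class ShutdownHookDemo {

    // Registers cleanup logic as a JVM shutdown hook. Returns true when the
    // hook was accepted (i.e. the JVM is not already shutting down).
    static boolean registerCleanup(Runnable cleanup) {
        try {
            Runtime.getRuntime().addShutdownHook(new Thread(cleanup, "cleanup-hook"));
            return true;
        } catch (IllegalStateException alreadyShuttingDown) {
            return false;
        }
    }

    public static void main(String[] args) {
        boolean ok = registerCleanup(() -> System.out.println("releasing resources"));
        System.out.println("hook registered: " + ok);
        // Normal exit (or SIGTERM) runs the hook; SIGKILL or a JVM crash skips it.
    }
}
```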
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
How do you call teardown? There are cases in which the Java code gets no
indication that the restart is happening (e.g. cases where the machine
itself is taken down)

On Mon, Feb 19, 2018, 12:24 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Restarting doesn't mean you don't call teardown. Except for a bug, there is
> - technically - no reason it happens.
>
> Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :
>
>> Workers restarting is not a bug, it's standard and often expected.
>>
>> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> Nothing, as mentioned it is a bug, so recovery is a bug recovery
>>> (procedure)
>>>
>>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>>> écrit :
>>>
>>>> So what would you like to happen if there is a crash? The DoFn instance
>>>> no longer exists because the JVM it ran on no longer exists. What should
>>>> Teardown be called on?
>>>>
>>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> This is what I want, and not 999999 teardowns for 1000000 setups until
>>>>> there is an unexpected crash (= a bug).
>>>>>
>>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> @Reuven: in practice it is created by a pool of 256, but that leads
>>>>>>>>> to the same pattern; the teardown is just a "if (iCreatedThem)
>>>>>>>>> releaseThem();"
>>>>>>>>>
>>>>>>>>
>>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>>> worker.
>>>>>>>>
>>>>>>>
>>>>>>> Nope, it was the other way around: in this case on AWS you can get 256
>>>>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>>>>> the distribution you allocate to some fn the role of owning the instance
>>>>>>> lookup and releasing.
>>>>>>>
>>>>>>
>>>>>> I still don't understand. Let's be more precise. If you write the
>>>>>> following code:
>>>>>>
>>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>>
>>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>>> The runner might decide to create a million instances of this class across
>>>>>> your worker pool, which means that you will get a million Setup and
>>>>>> Teardown calls.
>>>>>>
>>>>>>
>>>>>>> Anyway, this was just an example of an external resource you must
>>>>>>> release. The real topic is that Beam should define asap a guaranteed
>>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> @Eugene:
>>>>>>>>> 1. wait logic is about passing the value which is not always
>>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Concretely beam exposes a portable API (included in the SDK core).
>>>>>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>>>>>> I'll not detail them all but just use the sources and dofn (not sdf) to
>>>>>>>>> illustrate the idea I'm trying to develop.
>>>>>>>>>
>>>>>>>>> A. Source
>>>>>>>>>
>>>>>>>>> A source computes a partition plan with 2 primitives: estimateSize
>>>>>>>>> and split. As a user you can expect both to be called on the same bean
>>>>>>>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>>>>
>>>>>>>>> connect()
>>>>>>>>> try {
>>>>>>>>>   estimateSize()
>>>>>>>>>   split()
>>>>>>>>> } finally {
>>>>>>>>>   disconnect()
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>>
>>>>>>>>> connect()
>>>>>>>>> try {
>>>>>>>>>   estimateSize()
>>>>>>>>> } finally {
>>>>>>>>>   disconnect()
>>>>>>>>> }
>>>>>>>>> connect()
>>>>>>>>> try {
>>>>>>>>>   split()
>>>>>>>>> } finally {
>>>>>>>>>   disconnect()
>>>>>>>>> }
>>>>>>>>>
>>>>>>>>> + a workaround with an internal estimate size, since this primitive
>>>>>>>>> is often called in split but you don't want to connect twice in the second
>>>>>>>>> phase.
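That workaround can be made concrete with a toy source (all names here are hypothetical, not the actual Beam Source API): each primitive wraps its own connect()/disconnect(), and estimateSize() caches its result so that split(), which needs the size, does not pay the connection cost a second time:

```java
// Toy version of the workaround above: per-primitive connect()/disconnect()
// plus a cached internal estimate so split() does not reconnect to re-estimate.
public class SourceLifecycleDemo {
    int connects = 0;          // counts connection setups, for illustration
    private Long cachedSize;

    private void connect()    { connects++; }
    private void disconnect() { }

    long estimateSize() {
        if (cachedSize != null) return cachedSize;  // internal estimate reuse
        connect();
        try {
            cachedSize = 1000L;                     // stand-in for querying the backend
            return cachedSize;
        } finally {
            disconnect();
        }
    }

    int split(long desiredBundleSize) {
        connect();
        try {
            long size = estimateSize();             // cached: no extra connect
            return (int) Math.max(1, size / desiredBundleSize);
        } finally {
            disconnect();
        }
    }

    public static void main(String[] args) {
        SourceLifecycleDemo source = new SourceLifecycleDemo();
        source.estimateSize();                      // first connect/disconnect
        int bundles = source.split(100);            // second connect, no third
        System.out.println(bundles + " bundles after " + source.connects + " connects");
    }
}
```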
>>>>>>>>>
>>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>>> I insist it is a very, very basic concern for such an API. However
>>>>>>>>> Beam doesn't embrace it and doesn't assume it, so building any API on top
>>>>>>>>> of Beam is very hurtful today, and for direct Beam users you hit the exact
>>>>>>>>> same issues - check how IOs are implemented: the static utilities which
>>>>>>>>> create volatile connections prevent reusing an existing connection in a
>>>>>>>>> single method (
>>>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862
>>>>>>>>> ).
>>>>>>>>>
>>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>>
>>>>>>>>> B. DoFn & SDF
>>>>>>>>>
>>>>>>>>> As a fn dev you expect the same from the Beam runtime: init(); try
>>>>>>>>> { while (...) process(); } finally { destroy(); }, and that it is executed
>>>>>>>>> on the exact same instance, to be able to be stateful at that level for
>>>>>>>>> expensive connections/operations/flow state handling.
>>>>>>>>>
>>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>>> happen for each single instance, so 1M times for your example.
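The per-instance contract described above can be sketched as a plain-Java harness (the names MyFn, setup, teardown, runBundle are illustrative, not Beam API); the point is that the "runner" side pairs every setup with a teardown via try/finally, even when processing throws:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the lifecycle contract: setup() ... process()* ... teardown(),
// with teardown guaranteed by the harness even if an element fails.
public class LifecycleDemo {
    static final AtomicInteger setups = new AtomicInteger();
    static final AtomicInteger teardowns = new AtomicInteger();

    static class MyFn {
        void setup() { setups.incrementAndGet(); }        // e.g. open a connection
        void process(String element) {
            if (element.startsWith("boom")) throw new IllegalStateException(element);
        }
        void teardown() { teardowns.incrementAndGet(); }  // e.g. close the connection
    }

    // The "runner" side: it owns the instance, so it owns the lifecycle.
    static void runBundle(List<String> elements) {
        MyFn fn = new MyFn();
        fn.setup();
        try {
            for (String e : elements) fn.process(e);
        } catch (RuntimeException ignored) {
            // an element failure must not skip teardown
        } finally {
            fn.teardown();
        }
    }

    public static void main(String[] args) {
        runBundle(List.of("a", "b"));
        runBundle(List.of("boom"));
        System.out.println(setups.get() + " setups, " + teardowns.get() + " teardowns");
    }
}
```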
>>>>>>>>>
>>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>>> more instances and requires a way more strict/explicit definition of the
>>>>>>>>> exact lifecycle and of which instance does what. Since Beam handles the
>>>>>>>>> full lifecycle of the bean instances, it must provide init/destroy hooks
>>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>>
>>>>>>>>> If you take the JDBC example which was mentioned earlier: today,
>>>>>>>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>>>>>>>> defined - and will not be with SDF - it must use a pool to be able to
>>>>>>>>> reuse a connection instance to not hurt performance. Now with SDF and the
>>>>>>>>> split increase, how do you handle the pool size? Generally in batch you
>>>>>>>>> use a single connection per thread to avoid consuming all database
>>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use a
>>>>>>>>> pool a bit higher, but multiplied by the number of beans you will likely
>>>>>>>>> x2 or x3 the connection count and make the execution fail with "no more
>>>>>>>>> connection available". If you picked 1 (pool of #1), then you still have
>>>>>>>>> to have a reliable teardown by pool instance (close() generally) to ensure
>>>>>>>>> you release the pool and don't leak the connection information in the JVM.
>>>>>>>>> In all cases you come back to the init()/destroy() lifecycle, even if you
>>>>>>>>> fake getting connections with bundles.
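The pool-of-1 option can be sketched like this (a toy Connection type, not java.sql.Connection, and hypothetical names throughout); the point is that a reliable teardown-style close() is what lets the pool release its single connection instead of leaking it:

```java
// Toy sketch of option 1 above: a pool of exactly one "connection",
// created lazily on first use and released by a teardown-equivalent close().
public class PoolOfOneDemo {
    static class Connection {
        boolean open = true;
        void close() { open = false; }
    }

    static class SingleConnectionPool implements AutoCloseable {
        private Connection connection;

        Connection borrow() {                 // lazy init, always the same instance
            if (connection == null) connection = new Connection();
            return connection;
        }

        @Override
        public void close() {                 // the @Teardown-equivalent
            if (connection != null) connection.close();
        }
    }

    public static void main(String[] args) {
        SingleConnectionPool pool = new SingleConnectionPool();
        try {
            Connection c = pool.borrow();
            // ... run statements with c ...
        } finally {
            pool.close();                     // without a guaranteed teardown, this leaks
        }
    }
}
```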
>>>>>>>>>
>>>>>>>>> Just to make it obvious: the SDF mentions are just because SDF
>>>>>>>>> implies all the current issues with the loose definition of the bean
>>>>>>>>> lifecycles at an exponential level, nothing else.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Romain Manni-Bucau
>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>
>>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>
>>>>>>>>> :
>>>>>>>>>
>>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>>
>>>>>>>>>> (Would like to reiterate one more time, as the main author of
>>>>>>>>>> most design documents related to SDF and of its implementation in the Java
>>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>>>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>>>>>>
>>>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of
>>>>>>>>>>> transforms?
>>>>>>>>>>>
>>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>>>> écrit :
>>>>>>>>>>>
>>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>>>>>> appropriate? Many cases where cleanup is necessary, it is around an entire
>>>>>>>>>>>> composite PTransform. I think there have been discussions/proposals around
>>>>>>>>>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>>>>>>>>>> the best of my knowledge.
>>>>>>>>>>>>
>>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>>
>>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>>>
>>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>>
>>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>>
>>>>>>>>>>>> In streaming this is less straightforward -- do you want to
>>>>>>>>>>>> wait until the end of the pipeline? Or do you want to wait until the end of
>>>>>>>>>>>> the window? In practice, you just want to wait until you know nobody will
>>>>>>>>>>>> need the resource anymore.
>>>>>>>>>>>>
>>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>>
>>>>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Yes, 1M. Let's try to explain, simplifying the overall
>>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>>>>
>>>>>>>>>>>>> In practice, "new" is often an unsafe allocate (deserialization)
>>>>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>>>>>>>>> process or startBundle, and, the last time Beam has the instance before it
>>>>>>>>>>>>> is gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>>>>>>>>
>>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn
>>>>>>>>>>>>> not self-contained in order to implement basic transforms.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>>> écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>>>>>> to meet many existing use cases -- it would be helpful to focus more on
>>>>>>>>>>>>>>> the reason you are looking for something with different semantics.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>>>>> do):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the
>>>>>>>>>>>>>>> batch is done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
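The a-f pattern above can be simulated with plain Java (toy types, no Beam; all names hypothetical): resource IDs flow as data, and the final step frees exactly the IDs it received, which is what makes the cleanup checkable:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy simulation of the resource-ID pattern above: steps (a/c) create
// resources keyed by deterministic IDs, step (d) uses them, step (f)
// frees exactly the IDs that flowed through as data.
public class ResourceCleanupDemo {
    static final Set<String> live = new HashSet<>();

    static List<String> initResources(int n) {           // steps a + c
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            String id = "resource-" + i;                 // deterministic ID
            live.add(id);
            ids.add(id);
        }
        return ids;
    }

    static void use(List<String> ids) {                  // step d
        for (String id : ids) {
            if (!live.contains(id)) throw new IllegalStateException(id + " not live");
        }
    }

    static void free(List<String> ids) {                 // step f
        ids.forEach(live::remove);
    }

    public static void main(String[] args) {
        List<String> ids = initResources(3);
        use(ids);
        free(ids);
        System.out.println("leaked: " + live.size());    // 0
    }
}
```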
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I need that, but generic and not case by case, to
>>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but
>>>>>>>>>>>>>>> that means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you can't put any unified
>>>>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard to integrate or to
>>>>>>>>>>>>>>> use to build higher-level libraries or software.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> This is why I tried not to start the workaround discussions
>>>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee, sadly, so I'm not
>>>>>>>>>>>>>>>> sure which other method you speak about. Concretely, if you make it really
>>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users
>>>>>>>>>>>>>>>> can't use it to clean anything, but if you make it "can happen but it is
>>>>>>>>>>>>>>>> unexpected and means something happened" then it is fine to have a manual
>>>>>>>>>>>>>>>> - or automatic if fancy - recovery procedure. This is where it makes all
>>>>>>>>>>>>>>>> the difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It is
>>>>>>>>>>>>>>>>>> also often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it"
>>>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>>>> by users (like I did).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>>>>>>>>> since it is a best-effort thing. Not releasing the connection makes you
>>>>>>>>>>>>>>>>>>>>>>> pay a lot - AWS ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>>>>>>>>> concurrency limit.
>>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>>>>>>>>>>>>>>> "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services -
>>>>>>>>>>>>>>>>>>>>>> machines - on the fly at pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>>>>>> done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>>>>>>>>> is bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system?
>>>>>>>>>>>>>>>>>>>>>> Nothing technically. Why do you push to not handle it? Is it due to some
>>>>>>>>>>>>>>>>>>>>>> legacy code on Dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not
>>>>>>>>>>>>>>>>>>>> called then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will he get a different
>>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs he
>>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impacting for the whole product that it must
>>>>>>>>>>>>>>>>>>>>>> be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big
>>>>>>>>>>>>>>>>>>>>>> data world, but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>>> do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing
>>>>>>>>>>>>>>>>>>>>>> - in normal IT conditions - the execution of teardown. Then we see if we
>>>>>>>>>>>>>>>>>>>>>> can handle it, and only if there is a technical reason we can't do we make
>>>>>>>>>>>>>>>>>>>>>> it experimental/unsupported in the API. I know Spark and Flink can; any
>>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing Beam)
>>>>>>>>>>>>>>>>>>>>>> is fully unhandled and your overall system is uncontrolled. The only case
>>>>>>>>>>>>>>>>>>>>>> where this is not true is when the software is always owned by a vendor
>>>>>>>>>>>>>>>>>>>>>> and never installed on a customer environment. In that case it belongs to
>>>>>>>>>>>>>>>>>>>>>> the vendor to handle the Beam API, and not to Beam to adjust its API for a
>>>>>>>>>>>>>>>>>>>>>> vendor - otherwise all features unsupported by one runner should be made
>>>>>>>>>>>>>>>>>>>>>> optional, right?
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Restarting doesn't mean you don't call teardown. Except for a bug, there is
- technically - no reason it happens.

Le 19 févr. 2018 21:14, "Reuven Lax" <re...@google.com> a écrit :

> Workers restarting is not a bug, it's standard and often expected.
>
> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Nothing, as mentioned it is a bug, so recovery is a bug recovery
>> (procedure)
>>
>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>> écrit :
>>
>>> So what would you like to happen if there is a crash? The DoFn instance
>>> no longer exists because the JVM it ran on no longer exists. What should
>>> Teardown be called on?
>>>
>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> This is what I want, and not 999999 teardowns for 1000000 setups until
>>>> there is an unexpected crash (= a bug).
>>>>
>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> @Reuven: in practice it is created by a pool of 256, but that leads
>>>>>>>> to the same pattern; the teardown is just a "if (iCreatedThem)
>>>>>>>> releaseThem();"
>>>>>>>>
>>>>>>>
>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>> worker.
>>>>>>>
>>>>>>
>>>>>> Nope, it was the other way around: in this case on AWS you can get 256
>>>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>>>> the distribution you allocate to some fn the role of owning the instance
>>>>>> lookup and releasing.
>>>>>>
>>>>>
>>>>> I still don't understand. Let's be more precise. If you write the
>>>>> following code:
>>>>>
>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>
>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>> The runner might decided to create a million instances of this class across
>>>>> your worker pool, which means that you will get a million Setup and
>>>>> Teardown calls.
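To make the per-instance point above concrete, here is a minimal plain-Java sketch (MyFn, setup(), process() and teardown() are illustrative stand-ins, not the Beam API): however many instances the runner decides to create, the per-instance contract means exactly one setup and one teardown each.

```java
import java.util.ArrayList;
import java.util.List;

class LifecycleDemo {
    // Illustrative stand-in for a DoFn with @Setup/@Teardown; not Beam API.
    static class MyFn {
        static int setups = 0;
        static int teardowns = 0;
        void setup()    { setups++; }     // e.g. open a connection here
        void process()  { /* per-element work */ }
        void teardown() { teardowns++; }  // e.g. close the connection here
    }

    public static void main(String[] args) {
        int instances = 1_000_000; // chosen by the runner, not the user
        List<MyFn> fns = new ArrayList<>();
        for (int i = 0; i < instances; i++) {
            MyFn fn = new MyFn();
            fn.setup();
            fn.process();
            fns.add(fn);
        }
        // Each created instance owes exactly one matching teardown.
        for (MyFn fn : fns) {
            fn.teardown();
        }
        System.out.println(MyFn.setups + " setups, " + MyFn.teardowns + " teardowns");
    }
}
```

Whatever `instances` the runner picks, the counts stay paired; the disputed question in this thread is only whether the runtime is obliged to deliver the second count.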
>>>>>
>>>>>
>>>>>> Anyway this was just an example of an external resource you must
>>>>>> release. The real topic is that beam should define asap a guaranteed
>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> @Eugene:
>>>>>>>> 1. wait logic is about passing the value which is not always
>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>
>>>>>>>>
>>>>>>>> Concretely beam exposes a portable API (included in the SDK core).
>>>>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>>>>> I'll not detail them all but just use the sources and dofn (not sdf) to
>>>>>>>> illustrate the idea I'm trying to develop.
>>>>>>>>
>>>>>>>> A. Source
>>>>>>>>
>>>>>>>> A source computes a partition plan with 2 primitives: estimateSize
>>>>>>>> and split. As a user you can expect both to be called on the same bean
>>>>>>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>>>
>>>>>>>> connect()
>>>>>>>> try {
>>>>>>>>   estimateSize()
>>>>>>>>   split()
>>>>>>>> } finally {
>>>>>>>>   disconnect()
>>>>>>>> }
>>>>>>>>
>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>
>>>>>>>> connect()
>>>>>>>> try {
>>>>>>>>   estimateSize()
>>>>>>>> } finally {
>>>>>>>>   disconnect()
>>>>>>>> }
>>>>>>>> connect()
>>>>>>>> try {
>>>>>>>>   split()
>>>>>>>> } finally {
>>>>>>>>   disconnect()
>>>>>>>> }
>>>>>>>>
>>>>>>>> + a workaround with an internal estimated size, since this primitive
>>>>>>>> is often called in split but you don't want to connect twice in the
>>>>>>>> second phase.
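The two shapes quoted above can be made runnable in plain Java (connect/estimateSize/split are hypothetical stand-ins, not the Beam Source API): with a guaranteed init/destroy lifecycle one connection suffices; without it, each primitive must defensively pay its own connection.

```java
class SourceLifecycleSketch {
    static int connections = 0;

    static void connect()      { connections++; } // pay the connection cost
    static void disconnect()   { }
    static long estimateSize() { return 42L; }    // would use the connection
    static void split()        { }                // would use the connection

    // What a guaranteed lifecycle allows: one connection spans both calls.
    static void withLifecycle() {
        connect();
        try {
            estimateSize();
            split();
        } finally {
            disconnect();
        }
    }

    // What the current contract forces: reconnect for each primitive.
    static void withoutLifecycle() {
        connect();
        try { estimateSize(); } finally { disconnect(); }
        connect();
        try { split(); } finally { disconnect(); }
    }

    public static void main(String[] args) {
        withLifecycle();
        int afterFirst = connections;               // 1 connection
        withoutLifecycle();
        int afterSecond = connections - afterFirst; // 2 connections
        System.out.println("with lifecycle: " + afterFirst
            + " connection(s), without: " + afterSecond);
    }
}
```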
>>>>>>>>
>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>> I insist it is a very basic concern for such an API. However beam
>>>>>>>> doesn't embrace it and doesn't assume it, so building any API on top of
>>>>>>>> beam is very painful today, and direct beam users hit the exact same
>>>>>>>> issues - check how IOs are implemented: the static utilities which
>>>>>>>> create volatile connections prevent reusing an existing connection in a
>>>>>>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>>
>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>
>>>>>>>> B. DoFn & SDF
>>>>>>>>
>>>>>>>> As a fn dev you expect the same from the beam runtime: init(); try
>>>>>>>> { while (...) process(); } finally { destroy(); }, and that it is
>>>>>>>> executed on the exact same instance, to be able to be stateful at that
>>>>>>>> level for expensive connections/operations/flow state handling.
>>>>>>>>
>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>> happen for each single instance, so 1M times in your example.
>>>>>>>>
>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>>> more instances and requires to have a way more strict/explicit definition
>>>>>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>>
>>>>>>>> Take the JDBC example which was mentioned earlier. Today, because of
>>>>>>>> the teardown issue, it uses bundles. Since bundle size is not defined -
>>>>>>>> and will not be with SDF - it must use a pool to be able to reuse a
>>>>>>>> connection instance without killing performance. Now with SDF and the
>>>>>>>> split increase, how do you handle the pool size? Generally in batch you
>>>>>>>> use a single connection per thread to avoid consuming all database
>>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use
>>>>>>>> a somewhat bigger pool, but multiplied by the number of beans you will
>>>>>>>> likely 2x or 3x the connection count and make the execution fail with
>>>>>>>> "no more connection available". If you picked 1 (a pool of 1), then you
>>>>>>>> still have to have a reliable teardown per pool instance (close()
>>>>>>>> generally) to ensure you release the pool and don't leak the connection
>>>>>>>> information in the JVM. In all cases you come back to the
>>>>>>>> init()/destroy() lifecycle, even if you fake it with bundles.
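A hedged sketch of the pool-of-1 pattern described above (plain Java; ConnectionPool and borrow() are invented names, not JdbcIO code): bundles can share the single pooled connection, but only a reliably invoked close() releases it; skip that step and the connection leaks.

```java
import java.util.concurrent.atomic.AtomicInteger;

class PoolSketch {
    static final AtomicInteger openConnections = new AtomicInteger();

    // Hypothetical single-connection pool; stands in for a real JDBC pool.
    static class ConnectionPool implements AutoCloseable {
        private boolean open;
        Object borrow() {
            if (!open) { open = true; openConnections.incrementAndGet(); }
            return new Object(); // pretend this is a java.sql.Connection
        }
        @Override public void close() {
            if (open) { open = false; openConnections.decrementAndGet(); }
        }
    }

    public static void main(String[] args) {
        ConnectionPool pool = new ConnectionPool();
        try {
            for (int bundle = 0; bundle < 5; bundle++) {
                pool.borrow(); // bundles reuse the single pooled connection
            }
        } finally {
            pool.close(); // the step a skipped teardown would lose
        }
        System.out.println("leaked connections: " + openConnections.get());
    }
}
```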
>>>>>>>>
>>>>>>>> Just to make it obvious: the SDF mentions are only because SDF implies
>>>>>>>> all the current issues with the loose definition of the bean lifecycles
>>>>>>>> at an exponential level, nothing else.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Romain Manni-Bucau
>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>
>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>>
>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>
>>>>>>>>> (Would like to reiterate one more time, as the main author of most
>>>>>>>>> design documents related to SDF and of its implementation in the Java
>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>>>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>>>>>
>>>>>>>>>> Otherwise how do you normalize - with a single api - the lifecycle of
>>>>>>>>>> transforms?
>>>>>>>>>>
>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>>> écrit :
>>>>>>>>>>
>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>>>>> appropriate? In many cases where cleanup is necessary, it is around an
>>>>>>>>>>> entire composite PTransform. I think there have been discussions/proposals
>>>>>>>>>>> around a more methodical "cleanup" option, but those haven't been
>>>>>>>>>>> implemented, to the best of my knowledge.
>>>>>>>>>>>
>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>
>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>> seeing partial/incomplete results in the final destination).
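The three steps above can be sketched with plain java.nio file operations (paths, names and shard count are illustrative, not FileIO internals): write temporary shards, copy them all at once, then clean up.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

class FileSinkSketch {
    static long finalCount; // exposed so the result can be inspected

    public static void main(String[] args) throws IOException {
        Path tmpDir = Files.createTempDirectory("shards");
        Path finalDir = Files.createTempDirectory("final");
        int shards = 4; // N temporary shard files (illustrative)

        // 1. Write each shard to a temporary file.
        List<Path> tmpFiles = new ArrayList<>();
        for (int i = 0; i < shards; i++) {
            tmpFiles.add(Files.write(
                tmpDir.resolve("shard-" + i + ".tmp"), ("data-" + i).getBytes()));
        }

        // 2. Only once all shards are complete, copy them to the final
        //    destination, so readers never observe a partial result set.
        for (Path tmp : tmpFiles) {
            Files.copy(tmp, finalDir.resolve(
                tmp.getFileName().toString().replace(".tmp", ".out")));
        }

        // 3. Clean up all the temporary files.
        for (Path tmp : tmpFiles) {
            Files.delete(tmp);
        }
        Files.delete(tmpDir);

        try (java.util.stream.Stream<Path> files = Files.list(finalDir)) {
            finalCount = files.count();
        }
        System.out.println("final files: " + finalCount
            + ", tmp dir exists: " + Files.exists(tmpDir));
    }
}
```

Step 3 is exactly the kind of cleanup the thread is about: if the process dies between steps 2 and 3, the temporary files stay behind unless something with stronger guarantees runs it.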
>>>>>>>>>>>
>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>
>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>
>>>>>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>>>>>> the resource anymore.
>>>>>>>>>>>
>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>> incrementally during a streaming pipeline, etc.
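One hypothetical shape for the cleanup API described above (Resource, CleanupRegistry and releaseAll are invented names for illustration; nothing like this exists in Beam today): a transform emits resource objects carrying their own cleanup logic, and the framework releases them once no part of the pipeline can still need them.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class CleanupApiSketch {
    // Hypothetical: a resource that knows how to release itself.
    interface Resource {
        String id();
        void cleanup();
    }

    // Hypothetical registry: records emitted resources and releases them
    // once the pipeline (or window) that needed them has advanced past them.
    static class CleanupRegistry {
        static int cleaned = 0;
        private final Deque<Resource> pending = new ArrayDeque<>();

        void register(Resource r) { pending.add(r); }

        // Called when the framework knows nobody needs the resources anymore
        // (pipeline shutdown in batch, watermark advance in streaming, ...).
        void releaseAll() {
            while (!pending.isEmpty()) {
                pending.poll().cleanup();
            }
        }
    }

    public static void main(String[] args) {
        CleanupRegistry registry = new CleanupRegistry();
        for (int i = 0; i < 3; i++) {
            final String id = "temp-resource-" + i;
            registry.register(new Resource() {
                public String id() { return id; }
                public void cleanup() { CleanupRegistry.cleaned++; }
            });
        }
        registry.releaseAll();
        System.out.println("cleaned: " + CleanupRegistry.cleaned);
    }
}
```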
>>>>>>>>>>>
>>>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, 1M. Let me try to explain, simplifying the overall
>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>>>
>>>>>>>>>>>> In practice, new is often an unsafe allocate (deserialization)
>>>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>>>
>>>>>>>>>>>> What I want is for any "new" to be followed by setup before any
>>>>>>>>>>>> process or startBundle, and, the last time beam has the instance before it
>>>>>>>>>>>> is gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>>>>>>>
>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>> This way there is no need to combine fns in a way making a fn not
>>>>>>>>>>>> self contained to implement basic transforms.
>>>>>>>>>>>>
>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>> écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>>>>> to meet many existing use cases -- it would be more helpful to focus on
>>>>>>>>>>>>>> the reason you are looking for something with different semantics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>>>> do):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the batch
>>>>>>>>>>>>>> is done or the stream is stopped (manually or by a jvm shutdown)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I need that, but generic and not case by case, to industrialize
>>>>>>>>>>>>>> some api on top of beam.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>>>>>>>> means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you can't put any
>>>>>>>>>>>>>> unified practice on top of beam; it also makes beam very hard to
>>>>>>>>>>>>>> integrate or to use to build higher level libraries or software.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is why I tried to not start the workaround discussions
>>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure
>>>>>>>>>>>>>>> which other method you speak about. Concretely, if you make it really
>>>>>>>>>>>>>>> unreliable - this is what best effort sounds like to me - then users can't
>>>>>>>>>>>>>>> use it to clean anything; but if you make it "can happen but it is
>>>>>>>>>>>>>>> unexpected and means something happened", then it is fine to have a manual
>>>>>>>>>>>>>>> - or automatic if fancy - recovery procedure. This is where it makes all
>>>>>>>>>>>>>>> the difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Agree Eugene, except that "best effort" means just that. It is
>>>>>>>>>>>>>>>>> also often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it",
>>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>>> by users (as I did).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>>>>>>>> since it is a best effort thing. Not releasing the connection makes you
>>>>>>>>>>>>>>>>>>>>>> pay a lot - aws ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>>>>>>>> concurrency limits.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>>>>>>>>>>>>>> "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For aws I was thinking about starting some services -
>>>>>>>>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>>>>> done another way...as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if beam can't handle its
>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>>>>>>>> is bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>> interstellar crash case which can't be handled by any human system? Nothing
>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>>>>>> then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, he will get a different
>>>>>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs he
>>>>>>>>>>>>>>>>>>>>> composes use, so this is so impacting for the whole product that it must
>>>>>>>>>>>>>>>>>>>>> be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in big
>>>>>>>>>>>>>>>>>>>>> data world but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>> do it wrong before doing right ;).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>>>> in normal IT conditions - the execution of teardown. Then we see if we
>>>>>>>>>>>>>>>>>>>>> can handle it, and only if there is a technical reason we can't do we
>>>>>>>>>>>>>>>>>>>>> make it experimental/unsupported in the api. I know spark and flink can;
>>>>>>>>>>>>>>>>>>>>> any unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through java
>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (beam enclosing software) is
>>>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. The only case
>>>>>>>>>>>>>>>>>>>>> where it is not true is when the software is always owned by a vendor and
>>>>>>>>>>>>>>>>>>>>> never installed on a customer environment. In this case it belongs to the
>>>>>>>>>>>>>>>>>>>>> vendor to handle the beam API, and not to beam to adjust its API for a
>>>>>>>>>>>>>>>>>>>>> vendor - otherwise all features unsupported by one runner should be made
>>>>>>>>>>>>>>>>>>>>> optional, right?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> All state is not about network, even in distributed
>>>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 19 févr. 2018 21:24, "Eugene Kirpichov" <ki...@google.com> a écrit :

Okay, so then this is exactly how Teardown works already, as we've
discussed above - no change needed (except perhaps a clarification in docs,
as also suggested above - feel free to send a PR). Do you agree with this
much?


This is all I asked: remove this "best effort" mention, which is misleading
because it should be everywhere or nowhere by design.



I'm not sure I understand what's left to discuss in this thread. Teardown
is for best-effort cleanup of stuff that lives in the process/thread (eg
connections). Wait is for guaranteed global cleanup of global stuff.


Nothing left if the javadoc can be fixed - just dropping the mention or
expanding it to make the cases explicit.



Do you have a concrete example that a) can not be supported with either of
these two, but b) can be supported with a hypothetical modification to
Beam's API or semantics that is theoretically possible to support? E.g.
"Integrating with ACME DataStorage requires initialization $x and cleanup
$y. Here's why it can't be done via @Teardown: ... Here's why it can't be
done via Wait: ... Here's an API I propose: ... Here's how the ACME
DataStorageIO would look like with that API: ... Here's how a hypothetical
runner might implement the API: ..." - I realize that this is a lot to ask,
but I feel that, after so much frustrating going in circles in this thread,
this is the level of rigor required to stop going in circles.

On Mon, Feb 19, 2018 at 12:14 PM Reuven Lax <re...@google.com> wrote:

> Workers restarting is not a bug, it's standard and often expected.
>
> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Nothing, as mentioned it is a bug, so recovery is a bug recovery
>> (procedure)
>>
>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>> écrit :
>>
>>> So what would you like to happen if there is a crash? The DoFn instance
>>> no longer exists because the JVM it ran on no longer exists. What should
>>> Teardown be called on?
>>>
>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> This is what I want, and not 999999 teardowns for 1000000 setups until
>>>> there is an unexpected crash (= a bug).
>>>>
>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> @Reuven: in practice it is created by a pool of 256, but it leads to
>>>>>>>> the same pattern; the teardown is just a "if (iCreatedThem) releaseThem();"
>>>>>>>>
>>>>>>>
>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>> worker.
>>>>>>>
>>>>>>
>>>>>> No, it was the other way around: in this case on AWS you can get 256
>>>>>> instances at once but not 512 (which would be 2x256). So when you
>>>>>> compute the distribution you allocate to some fn the role of owning the
>>>>>> instance lookup and releasing.
>>>>>>
>>>>>
>>>>> I still don't understand. Let's be more precise. If you write the
>>>>> following code:
>>>>>
>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>
>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>> The runner might decided to create a million instances of this class across
>>>>> your worker pool, which means that you will get a million Setup and
>>>>> Teardown calls.
>>>>>
>>>>>
>>>>>> Anyway this was just an example of an external resource you must
>>>>>> release. The real topic is that beam should define asap a guaranteed
>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> @Eugene:
>>>>>>>> 1. wait logic is about passing the value which is not always
>>>>>>>> possible (like 15% of cases from my raw estimate)
>>>>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>>>>
>>>>>>>>
>>>>>>>> Concretely beam exposes a portable API (included in the SDK core).
>>>>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>>>>> I'll not detail them all but just use the sources and dofn (not sdf) to
>>>>>>>> illustrate the idea I'm trying to develop.
>>>>>>>>
>>>>>>>> A. Source
>>>>>>>>
>>>>>>>> A source computes a partition plan with 2 primitives: estimateSize
>>>>>>>> and split. As a user you can expect both to be called on the same bean
>>>>>>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>>>
>>>>>>>> connect()
>>>>>>>> try {
>>>>>>>>   estimateSize()
>>>>>>>>   split()
>>>>>>>> } finally {
>>>>>>>>   disconnect()
>>>>>>>> }
>>>>>>>>
>>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>>
>>>>>>>> connect()
>>>>>>>> try {
>>>>>>>>   estimateSize()
>>>>>>>> } finally {
>>>>>>>>   disconnect()
>>>>>>>> }
>>>>>>>> connect()
>>>>>>>> try {
>>>>>>>>   split()
>>>>>>>> } finally {
>>>>>>>>   disconnect()
>>>>>>>> }
>>>>>>>>
>>>>>>>> + a workaround with an internal estimated size, since this primitive
>>>>>>>> is often called in split but you don't want to connect twice in the
>>>>>>>> second phase.
>>>>>>>>
>>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>>> I insist it is a very basic concern for such an API. However beam
>>>>>>>> doesn't embrace it and doesn't assume it, so building any API on top of
>>>>>>>> beam is very painful today, and direct beam users hit the exact same
>>>>>>>> issues - check how IOs are implemented: the static utilities which
>>>>>>>> create volatile connections prevent reusing an existing connection in a
>>>>>>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>>>
>>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>>
>>>>>>>> B. DoFn & SDF
>>>>>>>>
>>>>>>>> As a fn dev you expect the same from the beam runtime: init(); try
>>>>>>>> { while (...) process(); } finally { destroy(); } and that it is executed
>>>>>>>> on the exact same instance to be able to be stateful at that level for
>>>>>>>> expensive connections/operations/flow state handling.
>>>>>>>>
>>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>>> happen for each single instance, so 1M times in your example.
>>>>>>>>
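The per-instance sequence above, sketched in plain Java; setup/process/teardown mirror Beam's @Setup/@ProcessElement/@TearDown hooks, but the sketch deliberately has no Beam dependency:

```java
import java.util.List;

// Sketch of the lifecycle a fn dev expects for every single instance:
// setup once, process many times, teardown exactly once at the end.
public class LifecycleDemo {
    private final StringBuilder log = new StringBuilder();

    void setup()           { log.append("setup;"); }
    void process(String e) { log.append("process(").append(e).append(");"); }
    void teardown()        { log.append("teardown;"); }

    public String run(List<String> elements) {
        setup();
        try {
            for (String e : elements) {
                process(e);
            }
        } finally {
            // The guarantee being asked for: always paired with setup().
            teardown();
        }
        return log.toString();
    }
}
```

Multiplied by 1M instances, the claim is simply that every setup() eventually gets its matching teardown(), crash cases aside.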
>>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>>> generalisation of both cases (source and dofn). Therefore it creates
>>>>>>>> way more instances and requires a far more strict/explicit definition
>>>>>>>> of the exact lifecycle and of which instance does what. Since Beam
>>>>>>>> handles the full lifecycle of the bean instances it must provide
>>>>>>>> init/destroy hooks (setup/teardown) which can be stateful.
>>>>>>>>
>>>>>>>> Take the JDBC example which was mentioned earlier. Today, because of
>>>>>>>> the teardown issue it uses bundles. Since bundle size is not defined -
>>>>>>>> and will not be with SDF - it must use a pool to be able to reuse a
>>>>>>>> connection instance without destroying performance. Now with SDF and
>>>>>>>> the split increase, how do you handle the pool size? Generally in
>>>>>>>> batch you use a single connection per thread to avoid consuming all
>>>>>>>> database connections. With a pool you have 2 choices: 1. use a pool of
>>>>>>>> 1, 2. use a pool a bit bigger - but multiplied by the number of beans
>>>>>>>> you will likely x2 or x3 the connection count and make the execution
>>>>>>>> fail with "no more connection available". If you picked 1 (a pool of
>>>>>>>> 1), then you still have to have a reliable teardown per pool instance
>>>>>>>> (close(), generally) to ensure you release the pool and don't leak the
>>>>>>>> connection information in the JVM. In all cases you come back to the
>>>>>>>> init()/destroy() lifecycle, even if you fake it by scoping connections
>>>>>>>> to bundles.
>>>>>>>>
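With a pool of 1, the "reliable teardown per pool instance" requirement boils down to an idempotent close guard per instance; a sketch (the inner Connection is a stand-in for java.sql.Connection, not the real thing):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of a pool-of-one holder: one connection per DoFn instance,
// released at most once no matter how many times teardown runs.
public class SingleConnectionHolder {
    static class Connection {          // stand-in for a real driver connection
        boolean open = true;
        void close() { open = false; }
    }

    private Connection connection;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    public void setup() { connection = new Connection(); }

    public Connection get() { return connection; }

    // Safe to call from a teardown hook, a shutdown hook, or both.
    public void teardown() {
        if (closed.compareAndSet(false, true) && connection != null) {
            connection.close();
        }
    }
}
```

The guard makes double invocation harmless, but nothing here can compensate for teardown never being invoked at all - which is the point of the thread.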
>>>>>>>> Just to make it obvious: the SDF mentions are just because SDF
>>>>>>>> implies all the current issues with the loose definition of the bean
>>>>>>>> lifecycles at an exponential level, nothing else.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Romain Manni-Bucau
>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>
>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>>
>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>
>>>>>>>>> (Would like to reiterate one more time, as the main author of most
>>>>>>>>> design documents related to SDF and of its implementation in the Java
>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>>>>>>>> understanding is that SDF could be a way to unify it and clean the
>>>>>>>>>> API.
>>>>>>>>>>
>>>>>>>>>> Otherwise, how do you normalize - with a single API - the lifecycle
>>>>>>>>>> of transforms?
>>>>>>>>>>
>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>>> écrit :
>>>>>>>>>>
>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>>>>> appropriate? In many cases where cleanup is necessary, it is around
>>>>>>>>>>> an entire composite PTransform. I think there have been
>>>>>>>>>>> discussions/proposals around a more methodical "cleanup" option,
>>>>>>>>>>> but those haven't been implemented, to the best of my knowledge.
>>>>>>>>>>>
>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>
>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>>
>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>
>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>
>>>>>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>>>>>> the resource anymore.
>>>>>>>>>>>
>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>
>>>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a
>>>>>>>>>>>> worker - has its lifecycle. Caricaturally: "new" and garbage
>>>>>>>>>>>> collection.
>>>>>>>>>>>>
>>>>>>>>>>>> In practice, "new" is often an unsafe allocation (deserialization),
>>>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>>>
>>>>>>>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>>>>>>>> process or startBundle, and for teardown to be called the last
>>>>>>>>>>>> time Beam holds the instance, after the last finishBundle and
>>>>>>>>>>>> before it is gc-ed.
>>>>>>>>>>>>
>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn
>>>>>>>>>>>> not self-contained just to implement basic transforms.
>>>>>>>>>>>>
>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>> écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>>>> focusing on the semantics of the existing methods - which have
>>>>>>>>>>>>>> been noted to meet many existing use cases - it would be helpful
>>>>>>>>>>>>>> to focus more on the reason you are looking for something with
>>>>>>>>>>>>>> different semantics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>>>> do):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the batch
>>>>>>>>>>>>>> is done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>>>>
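Ben's steps (a)-(f) can be sketched with plain Java collections standing in for PCollections - purely illustrative, since a real pipeline would express each step as a transform with Beam's deterministic-input requirement between them:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Sketch of the resource-token pattern: resources travel through the
// "pipeline" as data, so the final stage knows exactly what to free.
public class ResourceTokenDemo {
    static final Set<String> OPEN = new HashSet<>();

    static String init(String id) { OPEN.add(id); return id; }  // (c) initialize resource
    static String use(String id)  { return id; }                // (d) use it, emit "done" token
    static void free(String id)   { OPEN.remove(id); }          // (f) cleanup, driven by tokens

    // Returns how many resources are still open after the run (0 = no leak).
    public static int run(List<String> work) {
        List<String> doneTokens = new ArrayList<>();
        for (String w : work) {
            doneTokens.add(use(init("resource-" + w)));         // (a) generate resource IDs
        }
        for (String token : doneTokens) {
            free(token);
        }
        return OPEN.size();
    }
}
```

Because cleanup is keyed by tokens in the data rather than by instance teardown, it works regardless of how many workers or DoFn instances produced the resources.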
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I need that, but generic and not case by case, in order to
>>>>>>>>>>>>>> industrialize some API on top of Beam.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but
>>>>>>>>>>>>>> that means each transform is different in its lifecycle
>>>>>>>>>>>>>> handling. I dislike that a lot at scale and as a user, since you
>>>>>>>>>>>>>> can't put any unified practice on top of Beam; it also makes
>>>>>>>>>>>>>> Beam very hard to integrate or to use to build higher-level
>>>>>>>>>>>>>> libraries or software.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is why I tried to not start the workaround discussions and
>>>>>>>>>>>>>> just stay at the API level.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sadly, FinishBundle has the exact same guarantee, so I'm not
>>>>>>>>>>>>>>> sure which other method you mean. Concretely, if you make it
>>>>>>>>>>>>>>> really unreliable - this is what "best effort" sounds like to
>>>>>>>>>>>>>>> me - then users can't use it to clean anything, but if you make
>>>>>>>>>>>>>>> it "it can fail to happen, but that is unexpected and means
>>>>>>>>>>>>>>> something went wrong" then it is fine to have a manual - or
>>>>>>>>>>>>>>> automatic, if fancy - recovery procedure. This is where it
>>>>>>>>>>>>>>> makes all the difference and impacts the developers and ops
>>>>>>>>>>>>>>> (all users, basically).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Agree, Eugene, except that "best effort" means just that. It
>>>>>>>>>>>>>>>>> is also often used to say "at will", and this is what
>>>>>>>>>>>>>>>>> triggered this thread.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it",
>>>>>>>>>>>>>>>>> but "best effort" is too open and can be perceived very badly
>>>>>>>>>>>>>>>>> and wrongly by users (as I did).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their
>>>>>>>>>>>>>>>>>>>>>> size is not controlled. Using teardown doesn't allow you
>>>>>>>>>>>>>>>>>>>>>> to release the connection, since it is a best-effort
>>>>>>>>>>>>>>>>>>>>>> thing. Not releasing the connection makes you pay a lot
>>>>>>>>>>>>>>>>>>>>>> - AWS ;) - or prevents you from launching other
>>>>>>>>>>>>>>>>>>>>>> processing - concurrency limits.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require closing exchanges which
>>>>>>>>>>>>>>>>>>>>> are more than just "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services -
>>>>>>>>>>>>>>>>>>>>> machines - on the fly at pipeline startup and shutting
>>>>>>>>>>>>>>>>>>>>> them down at the end. If teardown is not called you leak
>>>>>>>>>>>>>>>>>>>>> machines and money. You can say it can be done another
>>>>>>>>>>>>>>>>>>>>> way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for
>>>>>>>>>>>>>>>>>>>>> generic pipelines and stays bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any
>>>>>>>>>>>>>>>>>>>>> human system? Nothing, technically. So why push to not
>>>>>>>>>>>>>>>>>>>>> handle it? Is it due to some legacy code in Dataflow, or
>>>>>>>>>>>>>>>>>>>>> something else?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>>>>>> then it is a bug, and we are done :).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will he get a
>>>>>>>>>>>>>>>>>>>>> different behavior in prod? Also don't forget the user
>>>>>>>>>>>>>>>>>>>>> doesn't know what the IOs he composes use, so this is so
>>>>>>>>>>>>>>>>>>>>> impactful for the whole product that it must be handled
>>>>>>>>>>>>>>>>>>>>> IMHO.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big
>>>>>>>>>>>>>>>>>>>>> data world, but that is not a reason to ignore what
>>>>>>>>>>>>>>>>>>>>> people have done for years and to do it wrong before
>>>>>>>>>>>>>>>>>>>>> doing it right ;).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>>>> under normal IT conditions - the execution of teardown.
>>>>>>>>>>>>>>>>>>>>> Then we see if we can handle it, and only if there is a
>>>>>>>>>>>>>>>>>>>>> technical reason we can't do we make it
>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know Spark and
>>>>>>>>>>>>>>>>>>>>> Flink can; any blocker I'm not aware of for other
>>>>>>>>>>>>>>>>>>>>> runners?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software
>>>>>>>>>>>>>>>>>>>>> enclosing Beam) is fully unhandled and your overall
>>>>>>>>>>>>>>>>>>>>> system is uncontrolled. The only case where this is not
>>>>>>>>>>>>>>>>>>>>> true is when the software is always owned by a vendor and
>>>>>>>>>>>>>>>>>>>>> never installed in a customer environment. In that case
>>>>>>>>>>>>>>>>>>>>> it belongs to the vendor to honor the Beam API, not to
>>>>>>>>>>>>>>>>>>>>> Beam to adjust its API for a vendor - otherwise all
>>>>>>>>>>>>>>>>>>>>> features unsupported by one runner should be made
>>>>>>>>>>>>>>>>>>>>> optional, right?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit, defined
>>>>>>>>>>>>>>>>>>>>> lifecycle.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
Okay, so then this is exactly how Teardown works already, as we've
discussed above - no change needed (except perhaps a clarification in docs,
as also suggested above - feel free to send a PR). Do you agree with this
much?

I'm not sure I understand what's left to discuss in this thread. Teardown
is for best-effort cleanup of stuff that lives in the process/thread (eg
connections). Wait is for guaranteed global cleanup of global stuff.

Do you have a concrete example that a) can not be supported with either of
these two, but b) can be supported with a hypothetical modification to
Beam's API or semantics that is theoretically possible to support? E.g.
"Integrating with ACME DataStorage requires initialization $x and cleanup
$y. Here's why it can't be done via @Teardown: ... Here's why it can't be
done via Wait: ... Here's an API I propose: ... Here's how the ACME
DataStorageIO would look like with that API: ... Here's how a hypothetical
runner might implement the API: ..." - I realize that this is a lot to ask,
but I feel that, after so much frustrating going in circles in this thread,
this is the level of rigor required to stop.

On Mon, Feb 19, 2018 at 12:14 PM Reuven Lax <re...@google.com> wrote:

> Workers restarting is not a bug; it's standard and often expected.
>
> On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Nothing; as mentioned, it is a bug, so recovery is a bug-recovery
>> (procedure)
>>
>> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
>> écrit :
>>
>>> So what would you like to happen if there is a crash? The DoFn instance
>>> no longer exists because the JVM it ran on no longer exists. What should
>>> Teardown be called on?
>>>
>>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> This is what I want: not 999,999 teardowns for 1,000,000 setups unless
>>>> there is an unexpected crash (= a bug).
>>>>
>>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> @Reuven: in practice it is created in pools of 256, but that leads to
>>>>>>>> the same pattern; the teardown is just an "if (iCreatedThem)
>>>>>>>> releaseThem();"
>>>>>>>>
>>>>>>>
>>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>>> worker.
>>>>>>>
>>>>>>
>>>>>> Nope, it was the other way around: in this case, on AWS you can get 256
>>>>>> instances at once but not 512 (which would be 2x256). So when you
>>>>>> compute the distribution you allocate to some fn the role of owning the
>>>>>> instance lookup and release.
>>>>>>
>>>>>
>>>>> I still don't understand. Let's be more precise. If you write the
>>>>> following code:
>>>>>
>>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>>
>>>>> There is no way to control how many instances of MyDoFn are created.
>>>>> The runner might decide to create a million instances of this class
>>>>> across your worker pool, which means that you will get a million Setup
>>>>> and Teardown calls.
>>>>>
>>>>>
>>>>>> Anyway, this was just an example of an external resource you must
>>>>>> release. The real topic is that Beam should define, ASAP, a guaranteed
>>>>>> generic lifecycle to let users embrace its programming model.
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> @Eugene:
>>>>>>>> 1. the Wait logic is about passing a value along, which is not always
>>>>>>>> possible (in like 15% of cases, from my raw estimate)
>>>>>>>> 2. SDF: I'll try to detail why I mention SDF more here
>>>>>>>>
>>>>>>>>
>>>>>>>> Concretely, Beam exposes a portable API (included in the SDK core).
>>>>>>>> This API defines a *container* API and therefore implies bean
>>>>>>>> lifecycles. I'll not detail them all, but just use sources and dofns
>>>>>>>> (not SDF) to illustrate the idea I'm trying to develop.
>>>>>>>>
>>>>>>>>
>>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>>
>>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>>
>>>>>>>>> (Would like to reiterate one more time, as the main author of most
>>>>>>>>> design documents related to SDF and of its implementation in the Java
>>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>>>>>>>> understanding is that SDF could be a way to unify it and clean the
>>>>>>>>>> API.
>>>>>>>>>>
>>>>>>>>>> Otherwise, how do you normalize - with a single API - the lifecycle
>>>>>>>>>> of transforms?
>>>>>>>>>>
>>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>>> écrit :
>>>>>>>>>>
>>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>>>>> appropriate? In many cases where cleanup is necessary, it is around
>>>>>>>>>>> an entire composite PTransform. I think there have been
>>>>>>>>>>> discussions/proposals around a more methodical "cleanup" option,
>>>>>>>>>>> but those haven't been implemented, to the best of my knowledge.
>>>>>>>>>>>
>>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>>
>>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>>> seeing partial/incomplete results in the final destination).
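The three steps above can be sketched in plain Java with `java.nio.file` (no Beam types; all class and method names here are illustrative, not the actual FileIO implementation):

```java
import java.io.IOException;
import java.nio.file.*;
import java.util.ArrayList;
import java.util.List;

// Sketch of the temp-write / bulk-move / cleanup pattern described above.
public class TempFileSink {
    // Step 1: each shard writes to its own temporary file.
    public static List<Path> writeShards(Path tmpDir, List<String> shards) throws IOException {
        List<Path> tmpFiles = new ArrayList<>();
        for (int i = 0; i < shards.size(); i++) {
            Path p = tmpDir.resolve("shard-" + i + ".tmp");
            Files.write(p, shards.get(i).getBytes());
            tmpFiles.add(p);
        }
        return tmpFiles;
    }

    // Step 2: runs once, after all shards are known to be complete.
    public static void finalizeShards(List<Path> tmpFiles, Path destDir) throws IOException {
        for (Path tmp : tmpFiles) {
            String name = tmp.getFileName().toString().replace(".tmp", "");
            Files.move(tmp, destDir.resolve(name), StandardCopyOption.ATOMIC_MOVE);
        }
    }

    // Step 3: cleanup is idempotent so a retry cannot fail on missing files.
    public static void cleanup(List<Path> tmpFiles) throws IOException {
        for (Path tmp : tmpFiles) {
            Files.deleteIfExists(tmp);
        }
    }
}
```

The point of the email stands out in the sketch: steps 2 and 3 belong to the transform as a whole, not to any one DoFn instance.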
>>>>>>>>>>>
>>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>>> runs on one worker.
>>>>>>>>>>>
>>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough.
>>>>>>>>>>> We need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>>>
>>>>>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>>>>>> the resource anymore.
>>>>>>>>>>>
>>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>>
>>>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>>>
>>>>>>>>>>>> In practice, "new" is often an unsafe allocate (deserialization),
>>>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>>>
>>>>>>>>>>>> What I want is for any "new" to be followed by setup before any
>>>>>>>>>>>> process or startBundle, and, the last time beam has the instance before it is
>>>>>>>>>>>> gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>>>>>>>
>>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn not self
>>>>>>>>>>>> contained to implement basic transforms.
>>>>>>>>>>>>
>>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>>> écrit :
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>>>>> to meet many existing use cases -- it would be helpful to focus more
>>>>>>>>>>>>>> on the reason you are looking for something with different semantics.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>>>> do):
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the batch
>>>>>>>>>>>>>> is done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2. Finalize some resources that are used within some region
>>>>>>>>>>>>>> of the pipeline. While, the DoFn lifecycle methods are not a good fit for
>>>>>>>>>>>>>> this (they are focused on managing resources within the DoFn), you could
>>>>>>>>>>>>>> model this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and
>>>>>>>>>>>>>> eventually output the fact they're done
>>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>>> ensuring everything is actually cleaned up.
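Steps (a) and (b) above can be sketched in plain Java: derive the resource ID deterministically from the input element, so a retried bundle regenerates the *same* ID instead of allocating (and leaking) a second resource. This is an illustrative sketch, not a Beam API:

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Deterministic resource IDs: the same element always maps to the same ID.
public class ResourceIds {
    public static String idFor(String element) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] h = md.digest(element.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder("res-");
            for (int i = 0; i < 8; i++) {
                sb.append(String.format("%02x", h[i] & 0xff)); // first 8 bytes as hex
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new AssertionError(e); // SHA-256 is mandatory in every JDK
        }
    }
}
```

Because the ID is a pure function of the element, the "Require Deterministic Input" steps can checkpoint which resources exist without any out-of-band bookkeeping.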
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I need that, but generic and not case by case, to industrialize
>>>>>>>>>>>>>> some API on top of Beam.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>>>>>>>> means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you can't put any unified
>>>>>>>>>>>>>> practice on top of beam; it also makes beam very hard to integrate or to
>>>>>>>>>>>>>> use to build higher level libraries or software.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> This is why I tried to not start the workaround discussions
>>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except
>>>>>>>>>>>>>>>> in various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure
>>>>>>>>>>>>>>> which other method you speak of. Concretely, if you make it really
>>>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users can't use it
>>>>>>>>>>>>>>> to clean anything, but if you make it "can happen but it is unexpected and
>>>>>>>>>>>>>>> means something happened" then it is fine to have a manual - or automatic if
>>>>>>>>>>>>>>> fancy - recovery procedure. This is where it makes all the difference and
>>>>>>>>>>>>>>> impacts the developers and ops (all users basically).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Agreed Eugene, except that "best effort" means just that. It is
>>>>>>>>>>>>>>>>> also often used to say "at will", and this is what triggered this thread.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it",
>>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>>> by users (as I did).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g.
>>>>>>>>>>>>>>>>>>>>>> "I'm trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not controlled.
>>>>>>>>>>>>>>>>>>>>>> Using teardown doesn't allow you to release the connection since it is a
>>>>>>>>>>>>>>>>>>>>>> best effort thing. Not releasing the connection makes you pay a lot - aws
>>>>>>>>>>>>>>>>>>>>>> ;) - or prevents you from launching other processing - concurrency limits.
>>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>>>>>>>>>>>>>> "I'm leaving".
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> For aws I was thinking about starting some services -
>>>>>>>>>>>>>>>>>>>>> machines - on the fly at pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>>>>> done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if beam can't handle its
>>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and is
>>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> What prevents us from enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system? Nothing
>>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>>>>>> then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will he get a different behavior
>>>>>>>>>>>>>>>>>>>>> in prod? Also don't forget the user doesn't know what the IOs he composes use,
>>>>>>>>>>>>>>>>>>>>> so this is so impactful for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big
>>>>>>>>>>>>>>>>>>>>> data world, but it is not a reason to ignore what people did for years and
>>>>>>>>>>>>>>>>>>>>> do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>>>> in normal IT conditions - the execution of teardown. Then we see if we
>>>>>>>>>>>>>>>>>>>>> can handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through java
>>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (beam enclosing software) is
>>>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. The only case where it
>>>>>>>>>>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>>>>>>>> installed on a customer environment. In this case it belongs to the vendor to
>>>>>>>>>>>>>>>>>>>>> handle the beam API, and not to beam to adjust its API for a vendor - otherwise
>>>>>>>>>>>>>>>>>>>>> all features unsupported by one runner should be made optional, right?
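The technical note above (a kill, i.e. SIGTERM but not SIGKILL, runs registered JVM shutdown hooks) can be shown with a minimal sketch; the hook name and wrapper are illustrative, not a Beam API:

```java
// Minimal sketch: an enclosing application can register teardown work as a
// JVM shutdown hook so that a plain kill still gets a chance to run it.
public class TeardownHook {
    public static Thread register(Runnable teardown) {
        Thread hook = new Thread(teardown, "teardown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }
}
```

Note this is exactly a best-effort fallback: hooks do not run on SIGKILL, OOM-killer terminations, or hardware failure, which is the other side of the argument in this thread.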
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
Workers restarting is not a bug; it's standard and often expected.

On Mon, Feb 19, 2018, 12:03 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Nothing, as mentioned it is a bug, so recovery is a bug recovery
> (procedure)
>
> Le 19 févr. 2018 19:42, "Eugene Kirpichov" <ki...@google.com> a
> écrit :
>
>> So what would you like to happen if there is a crash? The DoFn instance
>> no longer exists because the JVM it ran on no longer exists. What should
>> Teardown be called on?
>>
>> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> This is what I want, and not 999,999 teardowns for 1,000,000 setups unless
>>> there is an unexpected crash (= a bug).
>>>
>>> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>>>
>>>>
>>>>
>>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> @Reuven: in practice it is created by a pool of 256, but it leads to the
>>>>>>> same pattern; the teardown is just an "if (iCreatedThem) releaseThem();"
>>>>>>>
>>>>>>
>>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>>> worker.
>>>>>>
>>>>>
>>>>> No, it was the other way around: in this case on AWS you can get 256
>>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>>> the distribution you assign to some fn the role of owning the instance
>>>>> lookup and releasing.
>>>>>
>>>>
>>>> I still don't understand. Let's be more precise. If you write the
>>>> following code:
>>>>
>>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>>
>>>> There is no way to control how many instances of MyDoFn are created.
>>>> The runner might decided to create a million instances of this class across
>>>> your worker pool, which means that you will get a million Setup and
>>>> Teardown calls.
>>>>
>>>>
>>>>> Anyway this was just an example of an external resource you must
>>>>> release. The real topic is that beam should define asap a guaranteed generic
>>>>> lifecycle to let users embrace its programming model.
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>>> @Eugene:
>>>>>>> 1. the Wait logic is about passing a value, which is not always
>>>>>>> possible (in maybe 15% of cases, from my raw estimate)
>>>>>>> 2. sdf: I'll try to detail why I mention SDF more here
>>>>>>>
>>>>>>>
>>>>>>> Concretely beam exposes a portable API (included in the SDK core).
>>>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>>>> I'll not detail them all but just use the sources and dofn (not sdf) to
>>>>>>> illustrate the idea I'm trying to develop.
>>>>>>>
>>>>>>> A. Source
>>>>>>>
>>>>>>> A source computes a partition plan with 2 primitives: estimateSize
>>>>>>> and split. As a user you can expect both to be called on the same bean
>>>>>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>>
>>>>>>> connect()
>>>>>>> try {
>>>>>>>   estimateSize()
>>>>>>>   split()
>>>>>>> } finally {
>>>>>>>   disconnect()
>>>>>>> }
>>>>>>>
>>>>>>> this is not guaranteed by the API so you must do:
>>>>>>>
>>>>>>> connect()
>>>>>>> try {
>>>>>>>   estimateSize()
>>>>>>> } finally {
>>>>>>>   disconnect()
>>>>>>> }
>>>>>>> connect()
>>>>>>> try {
>>>>>>>   split()
>>>>>>> } finally {
>>>>>>>   disconnect()
>>>>>>> }
>>>>>>>
>>>>>>> plus a workaround with an internally cached estimated size, since this
>>>>>>> primitive is often called in split but you don't want to connect twice in
>>>>>>> the second phase.
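The workaround described above can be sketched in plain Java: compute the estimated size once per connection and cache it, so split() does not force a second connect. The connection and the numbers are simulated; all names are illustrative, not Beam's Source API:

```java
// Sketch of a source that caches its size estimate across estimateSize()
// and split(), paying the connection cost only once.
public class CachingSource {
    private Long cachedSize;   // filled on first use of a connection
    private int connectCount;  // exposed for the example only

    private long querySizeOverConnection() {
        connectCount++;        // stands in for connect() + query + disconnect()
        return 1024L;
    }

    public long estimateSize() {
        if (cachedSize == null) {
            cachedSize = querySizeOverConnection();
        }
        return cachedSize;
    }

    public int split(long desiredBundleSize) {
        // split() reuses the cached estimate instead of reconnecting
        return (int) Math.max(1, estimateSize() / desiredBundleSize);
    }

    public int getConnectCount() { return connectCount; }
}
```

With a guaranteed init/destroy lifecycle this caching would live naturally in the bean; without it, every source author reinvents this pattern.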
>>>>>>>
>>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>>> I insist this is a very, very basic concern for such an API. However beam
>>>>>>> doesn't embrace it and doesn't assume it, so building any API on top of
>>>>>>> beam is very painful today, and as a direct beam user you hit the exact same
>>>>>>> issues - check how IOs are implemented: the static utilities which create
>>>>>>> volatile connections prevent reusing an existing connection in a single
>>>>>>> method (
>>>>>>> https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862
>>>>>>> ).
>>>>>>>
>>>>>>> Same logic applies to the reader which is then created.
>>>>>>>
>>>>>>> B. DoFn & SDF
>>>>>>>
>>>>>>> As a fn dev you expect the same from the beam runtime: init(); try {
>>>>>>> while (...) process(); } finally { destroy(); } and that it is executed on
>>>>>>> the exact same instance, to be able to be stateful at that level for
>>>>>>> expensive connections/operations/flow state handling.
>>>>>>>
>>>>>>> As you mentioned with the million example, this sequence should
>>>>>>> happen for each single instance, so 1M times for your example.
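The contract being asked for above can be modeled in a few lines of plain Java (this is a toy model of the expected ordering, not Beam's actual runner code):

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of the lifecycle contract: for one instance, setup runs before
// any processing, and teardown runs exactly once at the end on the same
// instance, even if processing throws.
public class LifecycleRunner {
    public static List<String> run(List<String> elements) {
        List<String> calls = new ArrayList<>();
        calls.add("setup");
        try {
            for (String e : elements) {
                calls.add("process(" + e + ")");
            }
        } finally {
            calls.add("teardown"); // the guarantee under discussion
        }
        return calls;
    }
}
```

The whole debate in this thread is whether the final `teardown` line is a guarantee (a missed call is a runner bug) or a best-effort courtesy.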
>>>>>>>
>>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>>>> more instances and requires a way more strict/explicit definition
>>>>>>> of the exact lifecycle and of which instance does what. Since beam handles the
>>>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>>>> (setup/teardown) which can be stateful.
>>>>>>>
>>>>>>> If you take the JDBC example which was mentioned earlier: today,
>>>>>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>>>>>> defined - and will not be with SDF - it must use a pool to be able to reuse a
>>>>>>> connection instance so as to not kill performance. Now with SDF and the
>>>>>>> split increase, how do you handle the pool size? Generally in batch you use
>>>>>>> a single connection per thread to avoid consuming all database
>>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1; 2. use a
>>>>>>> pool a bit larger, but multiplied by the number of beans you will likely x2
>>>>>>> or x3 the connection count and make the execution fail with "no more
>>>>>>> connection available". If you picked 1 (pool of #1), then you still have to
>>>>>>> have a reliable teardown per pool instance (close() generally) to ensure you
>>>>>>> release the pool and don't leak the connection information in the JVM. In
>>>>>>> all cases you come back to the init()/destroy() lifecycle even if you fake
>>>>>>> getting connections with bundles.
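The "pool of 1" option above can be sketched as a per-instance holder whose single reliable close() is what @Teardown would map to. The connection is simulated and the names are illustrative, not JdbcIO's actual code:

```java
// Sketch of a pool-of-1 connection holder: one connection per fn instance,
// opened lazily, released by exactly one close() call.
public class SingleConnectionHolder implements AutoCloseable {
    private String connection;  // stands in for a real JDBC Connection
    private boolean closed;

    public String get() {
        if (closed) throw new IllegalStateException("holder closed");
        if (connection == null) {
            connection = "conn-1"; // real code: DriverManager.getConnection(...)
        }
        return connection;
    }

    @Override
    public void close() { // if this is never called, the connection leaks
        connection = null;
        closed = true;
    }
}
```

This is precisely why the reliability of teardown matters: close() must be called once and reliably, or the pooled connection outlives the worker.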
>>>>>>>
>>>>>>> Just to make it obvious: SDF mentions are just because SDF implies all
>>>>>>> the current issues with the loose definition of the bean lifecycles at an
>>>>>>> exponential level, nothing else.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Romain Manni-Bucau
>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>
>>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>
>>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>>
>>>>>>>> (Would like to reiterate one more time, as the main author of most
>>>>>>>> design documents related to SDF and of its implementation in the Java
>>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>>>>
>>>>>>>>> Otherwise how to normalize - single api -  lifecycle of transforms?
>>>>>>>>>
>>>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>> écrit :
>>>>>>>>>
>>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>>>> appropriate? In many cases where cleanup is necessary, it is around an entire
>>>>>>>>>> composite PTransform. I think there have been discussions/proposals around
>>>>>>>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>>>>>>>> the best of my knowledge.
>>>>>>>>>>
>>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>>> copy to put them in the final destination.
>>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>>
>>>>>>>>>> (This is often desirable because it minimizes the chance of
>>>>>>>>>> seeing partial/incomplete results in the final destination).
>>>>>>>>>>
>>>>>>>>>> In the above, you'd want step 1 to execute on many workers,
>>>>>>>>>> likely using a ParDo (say N different workers).
>>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>>> runs on one worker.
>>>>>>>>>>
>>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We
>>>>>>>>>> need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>>
>>>>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>>>>> the resource anymore.
>>>>>>>>>>
>>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>>
>>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>>
>>>>>>>>>>> In practice, "new" is often an unsafe allocate (deserialization),
>>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>>
>>>>>>>>>>> What I want is for any "new" to be followed by setup before any
>>>>>>>>>>> process or startBundle, and, the last time beam has the instance before it is
>>>>>>>>>>> gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>>>>>>
>>>>>>>>>>> It is as simple as that.
>>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn not self
>>>>>>>>>>> contained to implement basic transforms.
>>>>>>>>>>>
>>>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a
>>>>>>>>>>> écrit :
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>>>> to meet many existing use cases -- it would be helpful to focus more
>>>>>>>>>>>>> on the reason you are looking for something with different semantics.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>>> do):
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the batch is
>>>>>>>>>>>>> done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>>
>>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2. Finalize some resources that are used within some region of
>>>>>>>>>>>>> the pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that
>>>>>>>>>>>>> stores information about resources)
>>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>>>>>>>> output the fact they're done
>>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>>
>>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>>> ensuring everything is actually cleaned up.
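The resource-token flow in steps (a)-(f) above can be sketched outside Beam as a plain-Java simulation (class and method names here are illustrative assumptions, not Beam API; the point is only that deterministic IDs make a retried cleanup find exactly what was created):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Minimal simulation of the resource-ID pattern: because the IDs are
// deterministic, a retried "free" stage computes the same IDs that the
// "init" stage created, so nothing leaks even across retries.
public class ResourceTokenFlow {
    static final Set<String> LIVE = new HashSet<>();          // stand-in for an external system

    static List<String> generateIds(List<String> elements) {  // (a) deterministic IDs
        List<String> ids = new ArrayList<>();
        for (String e : elements) ids.add("res-" + e);        // no randomness: retry-safe
        return ids;
    }

    static void init(List<String> ids) { LIVE.addAll(ids); }    // (c) allocate resources
    static List<String> use(List<String> ids) { return ids; }   // (d) downstream work
    static void free(List<String> ids) { LIVE.removeAll(ids); } // (f) cleanup

    public static void main(String[] args) {
        List<String> ids = generateIds(Arrays.asList("a", "b"));
        init(ids);
        List<String> done = use(ids);
        free(done);
        free(generateIds(Arrays.asList("a", "b")));           // a retried cleanup is a no-op
        System.out.println(LIVE.isEmpty());                   // true: everything cleaned up
    }
}
```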
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I need that, but generic and not case by case, to industrialize
>>>>>>>>>>>>> some API on top of Beam.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>>>>>>> means each transform is different in its lifecycle handling. I
>>>>>>>>>>>>> dislike that a lot at scale and as a user, since you can't put any
>>>>>>>>>>>>> unified practice on top of Beam; it also makes Beam very hard to
>>>>>>>>>>>>> integrate or to use to build higher-level libraries or software.
>>>>>>>>>>>>>
>>>>>>>>>>>>> This is why I tried not to start the workaround discussions
>>>>>>>>>>>>> and just stay at the API level.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure
>>>>>>>>>>>>>> which other method you speak about. Concretely, if you make it
>>>>>>>>>>>>>> really unreliable - this is what "best effort" sounds like to me -
>>>>>>>>>>>>>> then users can't use it to clean anything, but if you make it "can
>>>>>>>>>>>>>> happen, but it is unexpected and means something happened" then it
>>>>>>>>>>>>>> is fine to have a manual - or automatic if fancy - recovery
>>>>>>>>>>>>>> procedure. This is where it makes all the difference and impacts
>>>>>>>>>>>>>> the developers and ops (all users basically).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Agreed Eugene, except that "best effort" doesn't only mean that.
>>>>>>>>>>>>>>>> It is also often used to say "at will", and this is what triggered this thread.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it",
>>>>>>>>>>>>>>>> but "best effort" is too open and can be very badly and wrongly perceived
>>>>>>>>>>>>>>>> by users (as I perceived it).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in
>>>>>>>>>>>>>>>>> the example situation you have (intergalactic crash), and in a number of
>>>>>>>>>>>>>>>>> more common cases: eg in case the worker container has crashed (eg user
>>>>>>>>>>>>>>>>> code in a different thread called a C library over JNI and it segfaulted),
>>>>>>>>>>>>>>>>> JVM bug, crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Feb 18, 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>>> klk@google.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not controlled.
>>>>>>>>>>>>>>>>>>>>> Using teardown doesn't allow you to release the connection since it is a
>>>>>>>>>>>>>>>>>>>>> best-effort thing. Not releasing the connection makes you pay a lot - AWS
>>>>>>>>>>>>>>>>>>>>> ;) - or prevents you from launching other processing - concurrency limits.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>>>>>>>>> (proprietary) protocols require closing exchanges which are more than just
>>>>>>>>>>>>>>>>>>>> "I'm leaving".
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services -
>>>>>>>>>>>>>>>>>>>> machines - on the fly at pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>>>> done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and is
>>>>>>>>>>>>>>>>>>>> bound to some particular IOs.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>>> interstellar-crash case, which can't be handled by any human system? Nothing,
>>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>>>> code in Dataflow or something else?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this
>>>>>>>>>>>>>>>>>>> way (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>>>>> then it is a bug, and we are done :).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Also, what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>>>>>> does it, so if a user uses the RI in tests, will he get a different behavior
>>>>>>>>>>>>>>>>>>>> in prod? Also don't forget the user doesn't know what the IOs he composes use,
>>>>>>>>>>>>>>>>>>>> so this is so impactful for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in the big data
>>>>>>>>>>>>>>>>>>>> world, but it is not a reason to ignore what people did for years and to do
>>>>>>>>>>>>>>>>>>>> it wrong before doing it right ;).
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>>> under normal IT conditions - the execution of teardown. Then we see if we
>>>>>>>>>>>>>>>>>>>> can handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>>>>>>>>>>> experimental/unsupported in the API. I know Spark and Flink can; any
>>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java
>>>>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing Beam) is
>>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. The only case where
>>>>>>>>>>>>>>>>>>>> this is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>>>>>>> installed on a customer environment. In that case it belongs to the vendor to
>>>>>>>>>>>>>>>>>>>> handle the Beam API, and not to Beam to adjust its API for a vendor -
>>>>>>>>>>>>>>>>>>>> otherwise all features unsupported by one runner should be made optional, right?
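The technical note above about kills and shutdown hooks can be demonstrated in a few lines of plain Java (no Beam dependency; the "teardown" here is just an illustrative stand-in). A SIGTERM (`kill`) triggers registered shutdown hooks; only SIGKILL (`kill -9`) and hard crashes bypass them:

```java
// A normal exit or a SIGTERM runs registered shutdown hooks, so an enclosing
// runner could invoke teardown logic from one. Only kill -9 or a JVM crash
// skips the hook entirely.
public class TeardownHook {
    public static void main(String[] args) {
        Thread teardown = new Thread(() -> System.out.println("teardown called"));
        Runtime.getRuntime().addShutdownHook(teardown);
        System.out.println("pipeline work here");
        // On normal exit or SIGTERM the JVM runs the hook before terminating.
    }
}
```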
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Nothing; as mentioned it is a bug, so recovery is a bug-recovery procedure.

On Feb 19, 2018 19:42, "Eugene Kirpichov" <ki...@google.com> wrote:

> So what would you like to happen if there is a crash? The DoFn instance no
> longer exists because the JVM it ran on no longer exists. What should
> Teardown be called on?
>
> On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> This is what I want, and not 999,999 teardowns for 1,000,000 setups until
>> there is an unexpected crash (= a bug).
>>
>> On Feb 19, 2018 18:57, "Reuven Lax" <re...@google.com> wrote:
>>
>>>
>>>
>>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> @Reuven: in practice it is created by pools of 256, but that leads to the
>>>>>> same pattern; the teardown is just an "if (iCreatedThem) releaseThem();"
>>>>>>
>>>>>
>>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>>> worker.
>>>>>
>>>>
>>>> No, it was the other way around: in this case on AWS you can get 256
>>>> instances at once but not 512 (which would be 2x256). So when you compute
>>>> the distribution you allocate to some fn the role of owning the instance
>>>> lookup and release.
>>>>
>>>
>>> I still don't understand. Let's be more precise. If you write the
>>> following code:
>>>
>>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>>
>>> There is no way to control how many instances of MyDoFn are created. The
>>> runner might decide to create a million instances of this class across
>>> your worker pool, which means that you will get a million Setup and
>>> Teardown calls.
>>>
>>>
>>>> Anyway this was just an example of an external resource you must
>>>> release. The real topic is that Beam should define, asap, a guaranteed generic
>>>> lifecycle to let users embrace its programming model.
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>>> @Eugene:
>>>>>> 1. the Wait logic is about passing the value, which is not always possible
>>>>>> (in maybe 15% of cases, from my raw estimate)
>>>>>> 2. SDF: I'll try to detail below why I mention SDF so much
>>>>>>
>>>>>>
>>>>>> Concretely, Beam exposes a portable API (included in the SDK core).
>>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>>> I won't detail them all, but will just use sources and DoFn (not SDF) to
>>>>>> illustrate the idea I'm trying to develop.
>>>>>>
>>>>>> A. Source
>>>>>>
>>>>>> A source computes a partition plan with 2 primitives: estimateSize
>>>>>> and split. As a user you would expect both to be called on the same bean
>>>>>> instance, to avoid paying the same connection cost(s) twice. Concretely:
>>>>>>
>>>>>> connect()
>>>>>> try {
>>>>>>   estimateSize()
>>>>>>   split()
>>>>>> } finally {
>>>>>>   disconnect()
>>>>>> }
>>>>>>
>>>>>> this is not guaranteed by the API so you must do:
>>>>>>
>>>>>> connect()
>>>>>> try {
>>>>>>   estimateSize()
>>>>>> } finally {
>>>>>>   disconnect()
>>>>>> }
>>>>>> connect()
>>>>>> try {
>>>>>>   split()
>>>>>> } finally {
>>>>>>   disconnect()
>>>>>> }
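To make the cost of the two shapes above concrete, here is a plain-Java sketch (the `connect`/`disconnect` stand-ins are assumptions for illustration, not a Beam or JDBC API) that simply counts connection cycles:

```java
// Stand-in connection bookkeeping showing why an unguaranteed lifecycle
// forces two connect/disconnect cycles where a guaranteed init/destroy
// lifecycle would need only one.
public class SourceLifecycleCost {
    static int connects = 0;

    static void connect() { connects++; }
    static void disconnect() { }
    static long estimateSize() { return 42L; }  // dummy primitive
    static void split() { }                     // dummy primitive

    // What the API guarantees today: each primitive brackets its own connection.
    static void withoutGuarantee() {
        connect();
        try { estimateSize(); } finally { disconnect(); }
        connect();
        try { split(); } finally { disconnect(); }
    }

    // What a guaranteed init/destroy lifecycle would allow: one connection for both.
    static void withGuarantee() {
        connect();
        try { estimateSize(); split(); } finally { disconnect(); }
    }

    public static void main(String[] args) {
        withoutGuarantee();
        int costWithout = connects;
        connects = 0;
        withGuarantee();
        System.out.println(costWithout + " vs " + connects); // prints "2 vs 1"
    }
}
```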
>>>>>>
>>>>>> + a workaround with an internal estimate size, since this primitive is
>>>>>> often called in split, but you don't want to connect twice in the second
>>>>>> phase.
>>>>>>
>>>>>> Why do you need that? Simply because you want to define an API to
>>>>>> implement sources which initializes the source bean and destroys it.
>>>>>> I insist this is a very basic concern for such an API. However Beam
>>>>>> doesn't embrace it and doesn't assume it, so building any API on top of
>>>>>> Beam is very painful today, and as a direct Beam user you hit the exact
>>>>>> same issues - check how IOs are implemented: the static utilities create
>>>>>> volatile connections, preventing reuse of an existing connection within a
>>>>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>>>
>>>>>> Same logic applies to the reader which is then created.
>>>>>>
>>>>>> B. DoFn & SDF
>>>>>>
>>>>>> As a fn dev you expect the same from the Beam runtime: init(); try {
>>>>>> while (...) process(); } finally { destroy(); }, and that it is executed on
>>>>>> the exact same instance, to be able to be stateful at that level for
>>>>>> expensive connections/operations/flow-state handling.
>>>>>>
>>>>>> As you mentioned with the million example, this sequence should
>>>>>> happen for each single instance, so 1M times in your example.
>>>>>>
>>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>>> generalisation of both cases (source and DoFn). Therefore it creates way
>>>>>> more instances and requires a much stricter/more explicit definition
>>>>>> of the exact lifecycle and of which instance does what. Since Beam handles
>>>>>> the full lifecycle of the bean instances, it must provide init/destroy
>>>>>> hooks (setup/teardown) which can be stateful.
>>>>>>
>>>>>> Take the JDBC example which was mentioned earlier. Today,
>>>>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>>>>> defined - and will not be with SDF - it must use a pool to be able to reuse
>>>>>> a connection instance so as not to kill performance. Now, with SDF and the
>>>>>> increase in splits, how do you handle the pool size? Generally in batch you
>>>>>> use a single connection per thread to avoid consuming all database
>>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1; 2. use a
>>>>>> slightly larger pool, but multiplied by the number of beans you will likely
>>>>>> x2 or x3 the connection count and make the execution fail with "no more
>>>>>> connections available". If you picked 1 (a pool of #1), then you still need
>>>>>> a reliable teardown per pool instance (close() generally) to ensure you
>>>>>> release the pool and don't leak the connection information in the JVM. In
>>>>>> all cases you come back to the init()/destroy() lifecycle, even if you fake
>>>>>> getting connections with bundles.
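A minimal sketch of the pool-of-one shape described above (plain Java with hypothetical names, not the JdbcIO implementation; the point is only that close() must be guaranteed for the pool not to leak):

```java
import java.util.concurrent.atomic.AtomicInteger;

// A pool of exactly one "connection" per fn instance, reused across bundles.
// If teardown (close) is guaranteed, open/close counts balance; if teardown
// is skipped, the connection leaks for the lifetime of the JVM.
public class PoolOfOne implements AutoCloseable {
    static final AtomicInteger OPEN = new AtomicInteger(); // connections currently held
    private boolean connected;

    Object borrow() {                 // lazily open the single pooled connection
        if (!connected) { OPEN.incrementAndGet(); connected = true; }
        return new Object();          // stand-in for the real connection handle
    }

    @Override public void close() {   // the teardown equivalent
        if (connected) { OPEN.decrementAndGet(); connected = false; }
    }

    public static void main(String[] args) {
        try (PoolOfOne pool = new PoolOfOne()) {
            for (int bundle = 0; bundle < 3; bundle++) {
                pool.borrow();        // the same connection serves every bundle
            }
        }
        System.out.println(OPEN.get()); // 0, but only because close() was reliably called
    }
}
```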
>>>>>>
>>>>>> Just to make it obvious: the SDF mentions are only because SDF implies
>>>>>> all the current issues with the loose definition of the bean lifecycles
>>>>>> at an exponential level, nothing else.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>
>>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>
>>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>>> and I believe it should become the canonical way to do that.
>>>>>>>
>>>>>>> (Would like to reiterate one more time, as the main author of most
>>>>>>> design documents related to SDF and of its implementation in the Java
>>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>>>>>> understanding is that SDF could be a way to unify it and clean up the API.
>>>>>>>>
>>>>>>>> Otherwise, how do we normalize - with a single API - the lifecycle of transforms?
>>>>>>>>
>>>>>>>> On Feb 18, 2018 21:32, "Ben Chambers" <bc...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Are you sure that focusing on the cleanup of specific DoFns is
>>>>>>>>> appropriate? In many cases where cleanup is necessary, it is needed around
>>>>>>>>> an entire composite PTransform. I think there have been discussions/proposals
>>>>>>>>> around a more methodical "cleanup" option, but those haven't been
>>>>>>>>> implemented, to the best of my knowledge.
>>>>>>>>>
>>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk
>>>>>>>>> copy to put them in the final destination.
>>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>>
>>>>>>>>> (This is often desirable because it minimizes the chance of seeing
>>>>>>>>> partial/incomplete results in the final destination).
>>>>>>>>>
>>>>>>>>> In the above, you'd want step 1 to execute on many workers, likely
>>>>>>>>> using a ParDo (say N different workers).
>>>>>>>>> The move step should only happen once, so on one worker. This
>>>>>>>>> means it will be a different DoFn, likely with some stuff done to ensure it
>>>>>>>>> runs on one worker.
>>>>>>>>>
>>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We
>>>>>>>>> need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>>
>>>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>>>> the resource anymore.
>>>>>>>>>
>>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>>> incrementally during a streaming pipeline, etc.
>>>>>>>>>
>>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall
>>>>>>>>>> execution. Each instance - one fn, so likely in a thread of a worker - has
>>>>>>>>>> its lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>>>
>>>>>>>>>> In practice, "new" is often an unsafe allocate (deserialization),
>>>>>>>>>> but it doesn't matter here.
>>>>>>>>>>
>>>>>>>>>> What I want is for any "new" to be followed by setup before any
>>>>>>>>>> process or startBundle, and, the last time Beam has the instance before it
>>>>>>>>>> is gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>>>>>
>>>>>>>>>> It is as simple as that.
>>>>>>>>>> This way there is no need to combine fns in a way that makes a fn
>>>>>>>>>> not self-contained to implement basic transforms.
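The requested contract - setup after "new", teardown exactly once before the instance is dropped - can be stated as a tiny harness (plain Java with illustrative names, not the Beam API):

```java
import java.util.ArrayList;
import java.util.List;

// Encodes the requested per-instance contract: new -> setup ->
// (startBundle -> process* -> finishBundle)* -> teardown, with teardown
// invoked exactly once before the instance is released.
public class LifecycleContract {
    final List<String> calls = new ArrayList<>();

    void setup()        { calls.add("setup"); }
    void startBundle()  { calls.add("startBundle"); }
    void process()      { calls.add("process"); }
    void finishBundle() { calls.add("finishBundle"); }
    void teardown()     { calls.add("teardown"); }

    // A runner honoring the contract, even when a bundle throws.
    static List<String> run(LifecycleContract fn, int bundles) {
        fn.setup();
        try {
            for (int b = 0; b < bundles; b++) {
                fn.startBundle();
                try { fn.process(); } finally { fn.finishBundle(); }
            }
        } finally {
            fn.teardown(); // guaranteed unless the JVM itself dies
        }
        return fn.calls;
    }

    public static void main(String[] args) {
        List<String> calls = run(new LifecycleContract(), 2);
        System.out.println(calls.get(0) + " ... " + calls.get(calls.size() - 1));
        // prints "setup ... teardown"
    }
}
```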
>>>>>>>>>>
>>>>>>>>>> On Feb 18, 2018 20:07, "Reuven Lax" <re...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Feb 18, 2018 19:28, "Ben Chambers" <bc...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>>> to meet many existing use cases -- it would be helpful to focus on more
>>>>>>>>>>>> on the reason you are looking for something with different semantics.
>>>>>>>>>>>>
>>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to
>>>>>>>>>>>> do):
>>>>>>>>>>>>
>>>>>>>>>>>> 1. Clean-up some external, global resource, that was
>>>>>>>>>>>> initialized once during the startup of the pipeline. If this is the case,
>>>>>>>>>>>> how are you ensuring it was really only initialized once (and not once per
>>>>>>>>>>>> worker, per thread, per instance, etc.)? How do you know when the pipeline
>>>>>>>>>>>> should release it? If the answer is "when it reaches step X", then what
>>>>>>>>>>>> about a streaming pipeline?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> When the DoFn is no longer needed logically, i.e. when the batch is
>>>>>>>>>>>> done or the stream is stopped (manually or by a JVM shutdown).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>>
>>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 2. Finalize some resources that are used within some region of
>>>>>>>>>>>> the pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>>>>>>>> information about resources)
>>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>>>>>>> output the fact they're done
>>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>>
>>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I need that, but generic and not case by case, to industrialize
>>>>>>>>>>>> some API on top of Beam.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>>>>>> means each transform is different in its lifecycle handling. I
>>>>>>>>>>>> dislike that a lot at scale and as a user, since you can't put any unified
>>>>>>>>>>>> practice on top of Beam; it also makes Beam very hard to integrate or to
>>>>>>>>>>>> use to build higher-level libraries or software.
>>>>>>>>>>>>
>>>>>>>>>>>> This is why I tried not to start the workaround discussions and
>>>>>>>>>>>> just stay at the API level.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -- Ben
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The main point for the user is, you *will* see
>>>>>>>>>>>>>> non-preventable situations where it couldn't be called - it's not just
>>>>>>>>>>>>>> intergalactic crashes - so if the logic is very important (e.g. cleaning up
>>>>>>>>>>>>>> a large amount of temporary files, shutting down a large number of VMs you
>>>>>>>>>>>>>> started etc), you have to express it using one of the other methods that
>>>>>>>>>>>>>> have stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure
>>>>>>>>>>>>> which other method you speak about. Concretely, if you make it really
>>>>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users can't
>>>>>>>>>>>>> use it to clean anything, but if you make it "can happen, but it is
>>>>>>>>>>>>> unexpected and means something happened" then it is fine to have a manual -
>>>>>>>>>>>>> or automatic if fancy - recovery procedure. This is where it makes all the
>>>>>>>>>>>>> difference and impacts the developers and ops (all users basically).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Agreed Eugene, except that "best effort" doesn't only mean that. It is
>>>>>>>>>>>>>>> also often used to say "at will", and this is what triggered this thread.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it", but
>>>>>>>>>>>>>>> "best effort" is too open and can be very badly and wrongly perceived by
>>>>>>>>>>>>>>> users (as I perceived it).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <
>>>>>>>>>>>>>>>>>>> klk@google.com> a écrit :
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since size is not controlled.
>>>>>>>>>>>>>>>>>>>> Using teardown doesnt allow you to release the connection since it is a
>>>>>>>>>>>>>>>>>>>> best effort thing. Not releasing the connection makes you pay a lot - aws
>>>>>>>>>>>>>>>>>>>> ;) - or prevents you from launching other processings - concurrent limit.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things
>>>>>>>>>>>>>>>>>>> die so badly that @Teardown is not called then nothing else can be called
>>>>>>>>>>>>>>>>>>> to close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>>>>>>>>>>>> "im leaving".
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> For aws i was thinking about starting some services -
>>>>>>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I dont want to be picky but if beam cant handle its
>>>>>>>>>>>>>>>>>>> components lifecycle it cant be used at scale for generic pipelines and is
>>>>>>>>>>>>>>>>>>> bound to some particular IO.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>>> interstellar crash case which cant be handled by any human system? Nothing
>>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>>>> then it is a bug and we are done :).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Also what does it mean for the users? Direct runner does
>>>>>>>>>>>>>>>>>>> it so if a user uses the RI in test, he will get a different behavior in
>>>>>>>>>>>>>>>>>>> prod? Also dont forget the user doesnt know what the IOs he composes use so
>>>>>>>>>>>>>>>>>>> this is so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I understand the portability culture is new in big data
>>>>>>>>>>>>>>>>>>> world but it is not a reason to ignore what people did for years and do it
>>>>>>>>>>>>>>>>>>> wrong before doing right ;).
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing -
>>>>>>>>>>>>>>>>>>> in the normal IT conditions - the execution of teardown. Then we see if we
>>>>>>>>>>>>>>>>>>> can handle it and only if there is a technical reason we cant we make it
>>>>>>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through java
>>>>>>>>>>>>>>>>>>> shutdown hooks otherwise your environment (beam enclosing software) is
>>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. Only case where it
>>>>>>>>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>>>>>> installed on customer environment. In this case it belongs to the vendor to
>>>>>>>>>>>>>>>>>>> handle beam API and not to beam to adjust its API for a vendor - otherwise
>>>>>>>>>>>>>>>>>>> all unsupported features by one runner should be made optional right?
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> All state is not about network, even in distributed
>>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit and defined lifecycle.
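The shutdown-hook behavior invoked in the technical note above can be sketched with plain JVM APIs (no Beam involved): hooks run on normal exit and on SIGTERM ("kill"), but not on SIGKILL ("kill -9") or a hard JVM crash, which is exactly the boundary this thread is arguing about.

```java
// Sketch of the JVM shutdown-hook mechanics referenced above.
public class ShutdownHookDemo {
    public static void main(String[] args) {
        Thread cleanup = new Thread(() -> {
            // release external resources here (close connections, stop machines)
            System.out.println("cleanup ran");
        });
        // Runs on normal exit and SIGTERM; NOT on SIGKILL or a JVM crash.
        Runtime.getRuntime().addShutdownHook(cleanup);

        // removeShutdownHook returns true only if the hook was registered,
        // so this doubles as a check that registration worked.
        boolean wasRegistered = Runtime.getRuntime().removeShutdownHook(cleanup);
        System.out.println(wasRegistered);
    }
}
```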
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>
>>>>>
>>>>
>>>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
So what would you like to happen if there is a crash? The DoFn instance no
longer exists because the JVM it ran on no longer exists. What should
Teardown be called on?

On Mon, Feb 19, 2018, 10:20 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> This is what i want and not 999999 teardowns for 1000000 setups until
> there is an unexpected crash (= a bug).
>
> Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :
>
>>
>>
>> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>>
>>>
>>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>>
>>>>
>>>>
>>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> @Reuven: in practice it is created by a pool of 256 but leads to the
>>>>> same pattern, the teardown is just a "if (iCreatedThem) releaseThem();"
>>>>>
>>>>
>>>> How do you control "256?" Even if you have a pool of 256 workers,
>>>> nothing in Beam guarantees how many threads and DoFns are created per
>>>> worker. In theory the runner might decide to create 1000 threads on each
>>>> worker.
>>>>
>>>
>>> Nope, it was the other way around, in this case on AWS you can get 256
>>> instances at once but not 512 (which will be 2x256). So when you compute
>>> the distribution you allocate to some fn the role to own the instance
>>> lookup and releasing.
>>>
>>
>> I still don't understand. Let's be more precise. If you write the
>> following code:
>>
>>    pCollection.apply(ParDo.of(new MyDoFn()));
>>
>> There is no way to control how many instances of MyDoFn are created. The
>> runner might decide to create a million instances of this class across
>> your worker pool, which means that you will get a million Setup and
>> Teardown calls.
>>
>>
>>> Anyway this was just an example of an external resource you must
>>> release. Real topic is that beam should define asap a guaranteed generic
>>> lifecycle to let user embrace its programming model.
>>>
>>>
>>>>
>>>>
>>>>
>>>>> @Eugene:
>>>>> 1. wait logic is about passing the value which is not always possible
>>>>> (like 15% of cases from my raw estimate)
>>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>>
>>>>>
>>>>> Concretely beam exposes a portable API (included in the SDK core).
>>>>> This API defines a *container* API and therefore implies bean lifecycles.
>>>>> I'll not detail them all but just use the sources and dofn (not sdf) to
>>>>> illustrate the idea I'm trying to develop.
>>>>>
>>>>> A. Source
>>>>>
>>>>> A source computes a partition plan with 2 primitives: estimateSize and
>>>>> split. As an user you can expect both to be called on the same bean
>>>>> instance to avoid to pay the same connection cost(s) twice. Concretely:
>>>>>
>>>>> connect()
>>>>> try {
>>>>>   estimateSize()
>>>>>   split()
>>>>> } finally {
>>>>>   disconnect()
>>>>> }
>>>>>
>>>>> this is not guaranteed by the API so you must do:
>>>>>
>>>>> connect()
>>>>> try {
>>>>>   estimateSize()
>>>>> } finally {
>>>>>   disconnect()
>>>>> }
>>>>> connect()
>>>>> try {
>>>>>   split()
>>>>> } finally {
>>>>>   disconnect()
>>>>> }
>>>>>
>>>>> + a workaround with an internal estimate size since this primitive is
>>>>> often called in split but you dont want to connect twice in the second
>>>>> phase.
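The workaround described above can be sketched as follows (the names here are illustrative, not the actual Beam Source API): the size computed during estimateSize() is memoized so split() can reuse it without a second remote call, since the API does not guarantee both primitives run against the same connected instance.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical source illustrating the "internal estimate size" workaround.
// connect/disconnect/computeSize are stand-ins, not Beam methods.
class CachingSource {
    int connections = 0;       // times connect() was called
    int sizeComputations = 0;  // times the (pretend) remote size call ran
    private Long cachedSize;   // memoized across lifecycle calls

    void connect() { connections++; }
    void disconnect() { /* release the connection */ }

    private long computeSize() { sizeComputations++; return 1024L; }

    long estimateSize() {
        connect();
        try {
            if (cachedSize == null) cachedSize = computeSize();
            return cachedSize;
        } finally {
            disconnect();
        }
    }

    List<CachingSource> split(long desiredBundleSize) {
        connect();
        try {
            // reuse the cached size instead of re-connecting to compute it
            if (cachedSize == null) cachedSize = computeSize();
            List<CachingSource> shards = new ArrayList<>();
            for (long i = 0; i < cachedSize / desiredBundleSize; i++) {
                shards.add(new CachingSource());
            }
            return shards;
        } finally {
            disconnect();
        }
    }
}
```

With a guaranteed connect/try/finally lifecycle around both primitives, the memoization and the second connect would be unnecessary.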
>>>>>
>>>>> Why do you need that? Simply cause you want to define an API to
>>>>> implement sources which initializes the source bean and destroys it.
>>>>> I insist it is a very very basic concern for such an API. However beam
>>>>> doesn't embrace it and doesn't assume it so building any API on top of
>>>>> beam is very hurtful today and for direct beam users you hit the exact same
>>>>> issues - check how IOs are implemented, the static utilities which create
>>>>> volatile connections preventing the reuse of an existing connection in a single
>>>>> method (
>>>>> https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862
>>>>> ).
>>>>>
>>>>> Same logic applies to the reader which is then created.
>>>>>
>>>>> B. DoFn & SDF
>>>>>
>>>>> As a fn dev you expect the same from the beam runtime: init(); try {
>>>>> while (...) process(); } finally { destroy(); } and that it is executed on
>>>>> the exact same instance to be able to be stateful at that level for
>>>>> expensive connections/operations/flow state handling.
>>>>>
>>>>> As you mentioned with the million example, this sequence should
>>>>> happen for each single instance so 1M times for your example.
>>>>>
>>>>> Now why did I mention SDF several times? Because SDF is a
>>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>>> more instances and requires to have a way more strict/explicit definition
>>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>>> (setup/teardown) which can be stateful.
>>>>>
>>>>> If you take the JDBC example which was mentioned earlier. Today,
>>>>> because of the teardown issue it uses bundles. Since bundles size is not
>>>>> defined - and will not with SDF, it must use a pool to be able to reuse a
>>>>> connection instance to not hurt performance. Now with the SDF and the
>>>>> split increase, how do you handle the pool size? Generally in batch you use
>>>>> a single connection per thread to avoid to consume all database
>>>>> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use a
>>>>> pool a bit higher but, multiplied by the number of beans, you will likely x2
>>>>> or x3 the connection count and make the execution fail with "no more
>>>>> connection available". If you picked 1 (pool of #1), then you still have to
>>>>> have a reliable teardown by pool instance (close() generally) to ensure you
>>>>> release the pool and don't leak the connection information in the JVM. In
>>>>> all case you come back to the init()/destroy() lifecycle even if you fake
>>>>> to get connections with bundles.
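The JDBC-style pattern above can be sketched as a reference-counted shared connection (entirely hypothetical names): every fn instance acquires it in setup() and releases it in teardown(); only the first acquire opens and only the last release closes. This is exactly the pattern that silently leaks if teardown is skipped.

```java
// Hypothetical reference-counted shared connection for the JDBC-style case.
class SharedConnection {
    private static int refs = 0;
    static int opens = 0;   // how many times the real connection was opened
    static int closes = 0;  // how many times it was closed

    // Called from setup(): first acquirer opens the real connection.
    static synchronized void acquire() {
        if (refs++ == 0) {
            opens++; // open the real JDBC connection here
        }
    }

    // Called from teardown(): last releaser closes the real connection.
    // If teardown is never invoked, the connection leaks in the JVM.
    static synchronized void release() {
        if (refs > 0 && --refs == 0) {
            closes++; // close the real JDBC connection here
        }
    }
}
```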
>>>>>
>>>>> Just to make it obvious: SDF mentions are just because SDF implies all the
>>>>> current issues with the loose definition of the bean lifecycles at an
>>>>> exponential level, nothing else.
>>>>>
>>>>>
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>>> and I believe it should become the canonical way to do that.
>>>>>>
>>>>>> (Would like to reiterate one more time, as the main author of most
>>>>>> design documents related to SDF and of its implementation in the Java
>>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>>
>>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>>
>>>>>>> Otherwise how to normalize - single api -  lifecycle of transforms?
>>>>>>>
>>>>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a
>>>>>>> écrit :
>>>>>>>
>>>>>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>>>>>> appropriate? In many cases where cleanup is necessary, it is around an entire
>>>>>>>> composite PTransform. I think there have been discussions/proposals around
>>>>>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>>>>>> the best of my knowledge.
>>>>>>>>
>>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>>> 2. When all temporary files are complete, attempt to do a bulk copy
>>>>>>>> to put them in the final destination.
>>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>>
>>>>>>>> (This is often desirable because it minimizes the chance of seeing
>>>>>>>> partial/incomplete results in the final destination).
>>>>>>>>
>>>>>>>> In the above, you'd want step 1 to execute on many workers, likely
>>>>>>>> using a ParDo (say N different workers).
>>>>>>>> The move step should only happen once, so on one worker. This means
>>>>>>>> it will be a different DoFn, likely with some stuff done to ensure it runs
>>>>>>>> on one worker.
>>>>>>>>
>>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We
>>>>>>>> need an API for a PTransform to schedule some cleanup work for when the
>>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>>
>>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>>> the resource anymore.
>>>>>>>>
>>>>>>>> This led to some discussions around a "cleanup" API, where you
>>>>>>>> could have a transform that output resource objects. Each resource object
>>>>>>>> would have logic for cleaning it up. And there would be something that
>>>>>>>> indicated what parts of the pipeline needed that resource, and what kind of
>>>>>>>> temporal lifetime those objects had. As soon as that part of the pipeline
>>>>>>>> had advanced far enough that it would no longer need the resources, they
>>>>>>>> would get cleaned up. This can be done at pipeline shutdown, or
>>>>>>>> incrementally during a streaming pipeline, etc.
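A minimal sketch of such a cleanup token (entirely hypothetical, no such API exists in Beam): the resource travels through the pipeline as data, carrying its own cleanup logic so whichever stage learns the region is done can free it.

```java
// Hypothetical "cleanup token": a resource handle emitted as pipeline data,
// carrying its own cleanup logic so a downstream stage can free it once
// its region of the pipeline no longer needs it.
class ResourceToken {
    final String id;                 // e.g. a temp file path or VM id
    private final Runnable cleanup;  // how to free this resource
    private boolean cleaned = false;

    ResourceToken(String id, Runnable cleanup) {
        this.id = id;
        this.cleanup = cleanup;
    }

    // Idempotent: safe to call from a retried cleanup stage.
    synchronized void cleanUp() {
        if (!cleaned) {
            cleanup.run();
            cleaned = true;
        }
    }

    synchronized boolean isCleaned() { return cleaned; }
}
```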
>>>>>>>>
>>>>>>>> Would something like this be a better fit for your use case? If
>>>>>>>> not, why is handling teardown within a single DoFn sufficient?
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Yes 1M. Let me try to explain, simplifying the overall execution.
>>>>>>>>> Each instance - one fn so likely in a thread of a worker - has its
>>>>>>>>> lifecycle. To caricature: "new" and garbage collection.
>>>>>>>>>
>>>>>>>>> In practice, new is often an unsafe allocate (deserialization) but
>>>>>>>>> it doesnt matter here.
>>>>>>>>>
>>>>>>>>> What i want is any "new" to have a following setup before any
>>>>>>>>> process or startBundle, and the last time beam has the instance before it is
>>>>>>>>> gc-ed and after the last finishBundle it calls teardown.
>>>>>>>>>
>>>>>>>>> It is as simple as it.
>>>>>>>>> This way no need to combine fn in a way making a fn not self
>>>>>>>>> contained to implement basic transforms.
>>>>>>>>>
>>>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>>>> écrit :
>>>>>>>>>>>
>>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>>>>>> to meet many existing use cases -- it would be helpful to focus more
>>>>>>>>>>> on the reason you are looking for something with different semantics.
>>>>>>>>>>>
>>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>>>>>>
>>>>>>>>>>> 1. Clean-up some external, global resource, that was initialized
>>>>>>>>>>> once during the startup of the pipeline. If this is the case, how are you
>>>>>>>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>>>>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>>>>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>>>>>>>> streaming pipeline?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> When the dofn is no longer needed logically, ie when the batch is
>>>>>>>>>>> done or the stream is stopped (manually or by a jvm shutdown)
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I'm really not following what this means.
>>>>>>>>>>
>>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each
>>>>>>>>>> worker is running 1000 threads (each running a copy of the same DoFn). How
>>>>>>>>>> many cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when
>>>>>>>>>> do you want it called? When the entire pipeline is shut down? When an
>>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 2. Finalize some resources that are used within some region of
>>>>>>>>>>> the pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>>>>>>> information about resources)
>>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>>> changing resource IDs)
>>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>>>>>> output the fact they're done
>>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>>
>>>>>>>>>>> By making the use of the resource part of the data it is
>>>>>>>>>>> possible to "checkpoint" which resources may be in use or have been
>>>>>>>>>>> finished by using the require deterministic input. This is important to
>>>>>>>>>>> ensuring everything is actually cleaned up.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I need that but generic and not case by case, to industrialize
>>>>>>>>>>> some api on top of beam.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> 3. Some other use case that I may be missing? If it is this
>>>>>>>>>>> case, could you elaborate on what you are trying to accomplish? That would
>>>>>>>>>>> help me understand both the problems with existing options and possibly
>>>>>>>>>>> what could be done to help.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> I understand there are workarounds for almost all cases but it means
>>>>>>>>>>> each transform is different in its lifecycle handling. I dislike it
>>>>>>>>>>> a lot at scale and as a user since you cant put any unified practice on
>>>>>>>>>>> top of beam; it also makes beam very hard to integrate or to use to build
>>>>>>>>>>> higher level libraries or softwares.
>>>>>>>>>>>
>>>>>>>>>>> This is why i tried to not start the workaround discussions and
>>>>>>>>>>> just stay at API level.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -- Ben
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>>> examples above.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure
>>>>>>>>>>>> which other method you speak about. Concretely if you make it really
>>>>>>>>>>>> unreliable - this is what best effort sounds to me - then users can use it
>>>>>>>>>>>> to clean anything but if you make it "can happen but it is unexpected and
>>>>>>>>>>>> means something happened" then it is fine to have a manual - or auto if
>>>>>>>>>>>> fancy - recovery procedure. This is where it makes all the difference and
>>>>>>>>>>>> impacts the developers, ops (all users basically).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It is also
>>>>>>>>>>>>>> often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it" but
>>>>>>>>>>>>>> "best effort" is too open and can be very badly and wrongly perceived by
>>>>>>>>>>>>>> users (like I did).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com>
>>>>>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since size is not controlled.
>>>>>>>>>>>>>>>>>>> Using teardown doesnt allow you to release the connection since it is a
>>>>>>>>>>>>>>>>>>> best effort thing. Not releasing the connection makes you pay a lot - aws
>>>>>>>>>>>>>>>>>>> ;) - or prevents you from launching other processings - concurrent limit.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die
>>>>>>>>>>>>>>>>>> so badly that @Teardown is not called then nothing else can be called to
>>>>>>>>>>>>>>>>>> close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>>>>>>>>>>> "im leaving".
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> For aws i was thinking about starting some services -
>>>>>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I dont want to be picky but if beam cant handle its
>>>>>>>>>>>>>>>>>> components lifecycle it cant be used at scale for generic pipelines and is
>>>>>>>>>>>>>>>>>> bound to some particular IO.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>>> interstellar crash case which cant be handled by any human system? Nothing
>>>>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>>> then it is a bug and we are done :).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Also what does it mean for the users? Direct runner does
>>>>>>>>>>>>>>>>>> it so if a user uses the RI in test, he will get a different behavior in
>>>>>>>>>>>>>>>>>> prod? Also dont forget the user doesnt know what the IOs he composes use so
>>>>>>>>>>>>>>>>>> this is so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> I understand the portability culture is new in big data
>>>>>>>>>>>>>>>>>> world but it is not a reason to ignore what people did for years and do it
>>>>>>>>>>>>>>>>>> wrong before doing right ;).
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>>>>>>>>>>>> the normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>>>>>>>>>> handle it and only if there is a technical reason we cant we make it
>>>>>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Technical note: even a kill should go through java
>>>>>>>>>>>>>>>>>> shutdown hooks otherwise your environment (beam enclosing software) is
>>>>>>>>>>>>>>>>>> fully unhandled and your overall system is uncontrolled. Only case where it
>>>>>>>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>>>>> installed on customer environment. In this case it belongs to the vendor to
>>>>>>>>>>>>>>>>>> handle beam API and not to beam to adjust its API for a vendor - otherwise
>>>>>>>>>>>>>>>>>> all unsupported features by one runner should be made optional right?
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> All state is not about network, even in distributed
>>>>>>>>>>>>>>>>>> systems so this is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>
>>>>
>>>
>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
This is what i want and not 999999 teardowns for 1000000 setups until there
is an unexpected crash (= a bug).

Le 19 févr. 2018 18:57, "Reuven Lax" <re...@google.com> a écrit :

>
>
> On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <rmannibucau@gmail.com
> > wrote:
>
>>
>>
>> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>>
>>>
>>>
>>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> @Reuven: in practise it is created by pool of 256 but leads to the same
>>>> pattern, the teardown is just a "if (iCreatedThem) releaseThem();"
>>>>
>>>
>>> How do you control "256?" Even if you have a pool of 256 workers,
>>> nothing in Beam guarantees how many threads and DoFns are created per
>>> worker. In theory the runner might decide to create 1000 threads on each
>>> worker.
>>>
>>
>> No, it was the other way around: in this case on AWS you can get 256
>> instances at once but not 512 (which would be 2x256). So when you compute
>> the distribution you allocate to some fn the role of owning the instance
>> lookup and release.
>>
>
> I still don't understand. Let's be more precise. If you write the
> following code:
>
>    pCollection.apply(ParDo.of(new MyDoFn()));
>
> There is no way to control how many instances of MyDoFn are created. The
> runner might decided to create a million instances of this class across
> your worker pool, which means that you will get a million Setup and
> Teardown calls.
>
>
>> Anyway this was just an example of an external resource you must release.
>> Real topic is that beam should define asap a guaranteed generic lifecycle
>> to let user embrace its programming model.
>>
>>
>>>
>>>
>>>
>>>> @Eugene:
>>>> 1. the wait logic is about passing the value downstream, which is not
>>>> always possible (in roughly 15% of cases, from my rough estimate)
>>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>>
>>>>
>>>> Concretely beam exposes a portable API (included in the SDK core). This
>>>> API defines a *container* API and therefore implies bean lifecycles. I'll
>>>> not detail them all but just use the sources and dofn (not sdf) to
>>>> illustrate the idea I'm trying to develop.
>>>>
>>>> A. Source
>>>>
>>>> A source computes a partition plan with 2 primitives: estimateSize and
>>>> split. As a user you can expect both to be called on the same bean
>>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>>
>>>> connect()
>>>> try {
>>>>   estimateSize()
>>>>   split()
>>>> } finally {
>>>>   disconnect()
>>>> }
>>>>
>>>> this is not guaranteed by the API so you must do:
>>>>
>>>> connect()
>>>> try {
>>>>   estimateSize()
>>>> } finally {
>>>>   disconnect()
>>>> }
>>>> connect()
>>>> try {
>>>>   split()
>>>> } finally {
>>>>   disconnect()
>>>> }
>>>>
>>>> + a workaround caching the size estimate internally, since this primitive
>>>> is often called in split but you don't want to connect twice in the second
>>>> phase.
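That defensive pattern can be written out as a self-contained Java sketch. All names here (`Connection`, `DefensiveSource`, `querySize`) are illustrative stand-ins, not Beam's actual Source API:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the defensive pattern forced when no open()/close() lifecycle
// is guaranteed around the two primitives: each one opens and closes its
// own connection, and split() reuses a cached size estimate to avoid
// paying the connection cost twice.
public class DefensiveSource {
    // stand-in for an expensive external connection
    static class Connection implements AutoCloseable {
        long querySize() { return 1000L; }
        @Override public void close() { /* disconnect() */ }
    }

    static Connection connect() { return new Connection(); }

    private Long cachedSize; // workaround: remember the estimate across calls

    public long estimateSize() {
        try (Connection c = connect()) {
            cachedSize = c.querySize();
            return cachedSize;
        }
    }

    public List<Long> split(int shards) {
        try (Connection c = connect()) {
            long size = (cachedSize != null) ? cachedSize : c.querySize();
            List<Long> plan = new ArrayList<>();
            for (int i = 0; i < shards; i++) {
                plan.add(size / shards); // naive even partitioning
            }
            return plan;
        }
    }
}
```

With a guaranteed open/close lifecycle around both primitives, the cache and the double connect/disconnect would disappear.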
>>>>
>>>> Why do you need that? Simply because you want to define an API to
>>>> implement sources which initializes the source bean and destroys it.
>>>> I insist this is a very, very basic concern for such an API. However beam
>>>> doesn't embrace it and doesn't assume it, so building any API on top of
>>>> beam is very painful today, and as a direct beam user you hit the exact
>>>> same issues - check how IOs are implemented: the static utilities create
>>>> volatile connections, preventing reuse of an existing connection within a
>>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>>
>>>> Same logic applies to the reader which is then created.
>>>>
>>>> B. DoFn & SDF
>>>>
>>>> As a fn dev you expect the same from the beam runtime: init(); try {
>>>> while (...) process(); } finally { destroy(); } and that it is executed on
>>>> the exact same instance to be able to be stateful at that level for
>>>> expensive connections/operations/flow state handling.
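The init(); try { while (...) process(); } finally { destroy(); } expectation can be sketched as a plain Java harness. The method names mirror Beam's @Setup/@ProcessElement/@Teardown, but this is an illustrative harness, not a runner and not the Beam SDK:

```java
import java.util.List;

// Sketch of the lifecycle contract argued for in this thread: setup() once
// before any processing on an instance, teardown() once after the last
// element, on the SAME instance, even when processing throws.
public class LifecycleDemo {
    // stand-in for an expensive per-instance resource (connection, client...)
    static class ExpensiveClient implements AutoCloseable {
        int sent;
        void send(String element) { sent++; }
        @Override public void close() { /* release the real resource here */ }
    }

    static class MyFn {
        ExpensiveClient client;
        void setup() { client = new ExpensiveClient(); }       // like @Setup
        void process(String element) { client.send(element); } // like @ProcessElement
        void teardown() { client.close(); }                    // like @Teardown
    }

    // The runner-side guarantee under discussion: teardown is always reached
    // for an instance that was set up, short of a hard JVM/machine crash.
    static int run(List<String> elements) {
        MyFn fn = new MyFn();
        fn.setup();
        try {
            for (String e : elements) {
                fn.process(e);
            }
            return fn.client.sent;
        } finally {
            fn.teardown();
        }
    }
}
```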
>>>>
>>>> As you mentioned with the million example, this sequence should happen
>>>> for each single instance, so 1M times in your example.
>>>>
>>>> Now why did I mention SDF several times? Because SDF is a
>>>> generalisation of both cases (source and dofn). Therefore it creates way
>>>> more instances and requires to have a way more strict/explicit definition
>>>> of the exact lifecycle and which instance does what. Since beam handles the
>>>> full lifecycle of the bean instances it must provide init/destroy hooks
>>>> (setup/teardown) which can be stateful.
>>>>
>>>> If you take the JDBC example which was mentioned earlier: today,
>>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>>> defined - and will not be with SDF - it must use a pool to be able to reuse
>>>> a connection instance without wrecking performance. Now with SDF and the
>>>> split increase, how do you handle the pool size? Generally in batch you use
>>>> a single connection per thread to avoid consuming all database
>>>> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use a
>>>> pool a bit bigger, but multiplied by the number of beans you will likely x2
>>>> or x3 the connection count and make the execution fail with "no more
>>>> connections available". If you picked 1 (pool of #1), then you still have
>>>> to have a reliable teardown per pool instance (close() generally) to ensure
>>>> you release the pool and don't leak the connection information in the JVM.
>>>> In all cases you come back to the init()/destroy() lifecycle, even if you
>>>> hide getting connections behind bundles.
>>>>
>>>> Just to make it obvious: the SDF mentions are only because SDF implies all
>>>> the current issues with the loose definition of the bean lifecycle at an
>>>> exponential level, nothing else.
>>>>
>>>>
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> | Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>
>>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>>> and I believe it should become the canonical way to do that.
>>>>>
>>>>> (Would like to reiterate one more time, as the main author of most
>>>>> design documents related to SDF and of its implementation in the Java
>>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>>
>>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>>
>>>>>> Otherwise how to normalize - single api -  lifecycle of transforms?
>>>>>>
>>>>>> On 18 Feb 2018 21:32, "Ben Chambers" <bc...@apache.org> wrote:
>>>>>>
>>>>>>> Are you sure that focusing on the cleanup of specific DoFns is
>>>>>>> appropriate? In many cases where cleanup is necessary, it is needed around an entire
>>>>>>> composite PTransform. I think there have been discussions/proposals around
>>>>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>>>>> the best of my knowledge.
>>>>>>>
>>>>>>> For instance, consider the steps of a FileIO:
>>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>>> 2. When all temporary files are complete, attempt to do a bulk copy
>>>>>>> to put them in the final destination.
>>>>>>> 3. Cleanup all the temporary files.
>>>>>>>
>>>>>>> (This is often desirable because it minimizes the chance of seeing
>>>>>>> partial/incomplete results in the final destination).
>>>>>>>
>>>>>>> In the above, you'd want step 1 to execute on many workers, likely
>>>>>>> using a ParDo (say N different workers).
>>>>>>> The move step should only happen once, so on one worker. This means
>>>>>>> it will be a different DoFn, likely with some stuff done to ensure it runs
>>>>>>> on one worker.
>>>>>>>
>>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We
>>>>>>> need an API for a PTransform to schedule some cleanup work for when the
>>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>>
>>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>>> the resource anymore.
>>>>>>>
>>>>>>> This led to some discussions around a "cleanup" API, where you could
>>>>>>> have a transform that output resource objects. Each resource object would
>>>>>>> have logic for cleaning it up. And there would be something that indicated
>>>>>>> what parts of the pipeline needed that resource, and what kind of temporal
>>>>>>> lifetime those objects had. As soon as that part of the pipeline had
>>>>>>> advanced far enough that it would no longer need the resources, they would
>>>>>>> get cleaned up. This can be done at pipeline shutdown, or incrementally
>>>>>>> during a streaming pipeline, etc.
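The resource-token idea can be made concrete in plain Java. This is an illustrative sketch of the concept, not an actual Beam API:

```java
// Sketch of the ref-counted "cleanup API" idea: a resource token carries
// its own cleanup logic and runs it as soon as the last pipeline part that
// declared a need for the resource reports completion - whether that is at
// pipeline shutdown or incrementally during a streaming pipeline.
public class RefCountedResource {
    private final Runnable cleanup;
    private int remainingUsers;
    private boolean cleaned;

    public RefCountedResource(int users, Runnable cleanup) {
        this.remainingUsers = users;
        this.cleanup = cleanup;
    }

    // Called by each consuming pipeline segment when it no longer needs the
    // resource; the last caller triggers the cleanup exactly once.
    public synchronized void done() {
        if (--remainingUsers == 0 && !cleaned) {
            cleaned = true;
            cleanup.run();
        }
    }

    public synchronized boolean isCleaned() {
        return cleaned;
    }
}
```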
>>>>>>>
>>>>>>> Would something like this be a better fit for your use case? If not,
>>>>>>> why is handling teardown within a single DoFn sufficient?
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall execution.
>>>>>>>> Each instance - one fn, so likely in a thread of a worker - has its
>>>>>>>> lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>>
>>>>>>>> In practice, "new" is often an unsafe allocation (deserialization), but
>>>>>>>> it doesn't matter here.
>>>>>>>>
>>>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>>>> process or startBundle, and for beam to call teardown the last time it
>>>>>>>> has the instance, after the last finishBundle and before it is gc-ed.
>>>>>>>>
>>>>>>>> It is as simple as that.
>>>>>>>> This way there is no need to combine fns in a way that makes a fn not
>>>>>>>> self-contained to implement basic transforms.
>>>>>>>>
>>>>>>>> On 18 Feb 2018 20:07, "Reuven Lax" <re...@google.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 18 Feb 2018 19:28, "Ben Chambers" <bc...@apache.org> wrote:
>>>>>>>>>>
>>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>>> focusing on the semantics of the existing methods -- which have been
>>>>>>>>>> noted to meet many existing use cases -- it would be helpful to focus
>>>>>>>>>> more on the reason you are looking for something with different semantics.
>>>>>>>>>>
>>>>>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>>>>>
>>>>>>>>>> 1. Clean-up some external, global resource, that was initialized
>>>>>>>>>> once during the startup of the pipeline. If this is the case, how are you
>>>>>>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>>>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>>>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>>>>>>> streaming pipeline?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> When the dofn is no longer needed logically, i.e. when the batch is
>>>>>>>>>> done or the stream is stopped (manually or by a JVM shutdown)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I'm really not following what this means.
>>>>>>>>>
>>>>>>>>> Let's say that a pipeline is running 1000 workers, and each worker
>>>>>>>>> is running 1000 threads (each running a copy of the same DoFn). How many
>>>>>>>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>>>>>>>> you want it called? When the entire pipeline is shut down? When an
>>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>>> about to start back up)? Something else?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 2. Finalize some resources that are used within some region of
>>>>>>>>>> the pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>>>>>> information about resources)
>>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>>> changing resource IDs)
>>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>>>>> output the fact they're done
>>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>>
>>>>>>>>>> By making the use of the resource part of the data it is possible
>>>>>>>>>> to "checkpoint" which resources may be in use or have been finished by
>>>>>>>>>> using the require deterministic input. This is important to ensuring
>>>>>>>>>> everything is actually cleaned up.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I need that, but generic and not case by case, to industrialize
>>>>>>>>>> some API on top of beam.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> 3. Some other use case that I may be missing? If it is this case,
>>>>>>>>>> could you elaborate on what you are trying to accomplish? That would help
>>>>>>>>>> me understand both the problems with existing options and possibly what
>>>>>>>>>> could be done to help.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>>>> means each transform is different in its lifecycle handling. I dislike
>>>>>>>>>> that a lot at scale and as a user, since you can't put any unified
>>>>>>>>>> practice on top of beam; it also makes beam very hard to integrate or
>>>>>>>>>> to use to build higher-level libraries or software.
>>>>>>>>>>
>>>>>>>>>> This is why I tried to not start the workaround discussions and
>>>>>>>>>> just stay at the API level.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -- Ben
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>
>>>>>>>>>>>> "Machine state" is overly low-level because many of the
>>>>>>>>>>>> possible reasons can happen on a perfectly fine machine.
>>>>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>>> examples above.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Sounds ok to me
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Sadly FinishBundle has the exact same guarantee, so I'm not sure
>>>>>>>>>>> which other method you mean. Concretely, if you make it really
>>>>>>>>>>> unreliable - which is what "best effort" sounds like to me - then
>>>>>>>>>>> users can't use it to clean up anything; but if you make it "can fail
>>>>>>>>>>> to happen, but that is unexpected and means something went wrong",
>>>>>>>>>>> then it is fine to have a manual - or automatic if fancy - recovery
>>>>>>>>>>> procedure. This is where it makes all the difference and impacts the
>>>>>>>>>>> developers and ops (all users, basically).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Agree, Eugene, except that "best effort" can mean exactly that. It
>>>>>>>>>>>>> is also often used to say "at will", and this is what triggered this
>>>>>>>>>>>>> thread.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I'm fine using "except if the machine state prevents it", but
>>>>>>>>>>>>> "best effort" is too open and can be very badly and wrongly
>>>>>>>>>>>>> perceived by users (like I did).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>>
>>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <kl...@google.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>>>> since it is a best-effort thing. Not releasing the connection makes you
>>>>>>>>>>>>>>>>>> pay a lot - aws ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>>>> concurrency limits.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die
>>>>>>>>>>>>>>>>> so badly that @Teardown is not called then nothing else can be called to
>>>>>>>>>>>>>>>>> close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>>>>>> (proprietary) protocols require closing exchanges which are more than
>>>>>>>>>>>>>>>>> just "I'm leaving".
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> For aws i was thinking about starting some services -
>>>>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I don't want to be picky, but if beam can't handle its
>>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines
>>>>>>>>>>>>>>>>> and is bound to some particular IOs.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system?
>>>>>>>>>>>>>>>>> Nothing, technically. Why do you push to not handle it? Is it due to
>>>>>>>>>>>>>>>>> some legacy code on dataflow, or something else?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called
>>>>>>>>>>>>>>> then it is a bug and we are done :).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Also what does it mean for the users? The direct runner does
>>>>>>>>>>>>>>>>> it, so if a user uses the RI in tests, will he get a different
>>>>>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what the IOs
>>>>>>>>>>>>>>>>> he composes use, so this has so much impact on the whole product that
>>>>>>>>>>>>>>>>> it must be handled IMHO.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> I understand the portability culture is new in big data
>>>>>>>>>>>>>>>>> world but it is not a reason to ignore what people did for years and do it
>>>>>>>>>>>>>>>>> wrong before doing right ;).
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> My proposal is to list what can prevent us from guaranteeing
>>>>>>>>>>>>>>>>> - in normal IT conditions - the execution of teardown. Then we see if
>>>>>>>>>>>>>>>>> we can handle it, and only if there is a technical reason we can't do
>>>>>>>>>>>>>>>>> we make it experimental/unsupported in the api. I know spark and flink
>>>>>>>>>>>>>>>>> can; any unknown blocker for other runners?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Technical note: even a kill should go through java
>>>>>>>>>>>>>>>>> shutdown hooks, otherwise your environment (the software enclosing
>>>>>>>>>>>>>>>>> beam) is fully unhandled and your overall system is uncontrolled. The
>>>>>>>>>>>>>>>>> only case where this is not true is when the software is always owned
>>>>>>>>>>>>>>>>> by a vendor and never installed on a customer environment. In that
>>>>>>>>>>>>>>>>> case it belongs to the vendor to handle the beam API, and not to beam
>>>>>>>>>>>>>>>>> to adjust its API for a vendor - otherwise all features unsupported by
>>>>>>>>>>>>>>>>> one runner should be made optional, right?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> All state is not about network, even in distributed
>>>>>>>>>>>>>>>>> systems so this is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>
>>>
>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
On Mon, Feb 19, 2018 at 7:11 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
>
> 2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:
>
>>
>>
>> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>> @Reuven: in practise it is created by pool of 256 but leads to the same
>>> pattern, the teardown is just a "if (iCreatedThem) releaseThem();"
>>>
>>
>> How do you control "256?" Even if you have a pool of 256 workers, nothing
>> in Beam guarantees how many threads and DoFns are created per worker. In
>> theory the runner might decide to create 1000 threads on each worker.
>>
>
> No, it was the other way around: in this case on AWS you can get 256
> instances at once but not 512 (which would be 2x256). So when you compute
> the distribution you allocate to some fn the role of owning the instance
> lookup and release.
>

I still don't understand. Let's be more precise. If you write the following
code:

   pCollection.apply(ParDo.of(new MyDoFn()));

There is no way to control how many instances of MyDoFn are created. The
runner might decided to create a million instances of this class across
your worker pool, which means that you will get a million Setup and
Teardown calls.
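What that means mechanically (plain Java, an illustrative harness rather than real runner code):

```java
import java.util.concurrent.atomic.AtomicInteger;

// However many MyDoFn instances a runner decides to create, the contract
// under discussion is one setup and one teardown per instance: N instances
// mean N setup calls and N teardown calls.
public class InstanceCountDemo {
    public static final AtomicInteger setups = new AtomicInteger();
    public static final AtomicInteger teardowns = new AtomicInteger();

    static class MyDoFn {
        void setup() { setups.incrementAndGet(); }
        void process(int element) { /* work */ }
        void teardown() { teardowns.incrementAndGet(); }
    }

    // The runner, not the user, chooses how many instances exist.
    public static void simulateRunner(int instances) {
        for (int i = 0; i < instances; i++) {
            MyDoFn fn = new MyDoFn();
            fn.setup();
            try {
                fn.process(i);
            } finally {
                fn.teardown();
            }
        }
    }
}
```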


> Anyway this was just an example of an external resource you must release.
> Real topic is that beam should define asap a guaranteed generic lifecycle
> to let user embrace its programming model.
>
>
>>
>>
>>
>>> @Eugene:
>>> 1. the wait logic is about passing the value downstream, which is not
>>> always possible (in roughly 15% of cases, from my rough estimate)
>>> 2. sdf: i'll try to detail why i mention SDF more here
>>>
>>>
>>> Concretely beam exposes a portable API (included in the SDK core). This
>>> API defines a *container* API and therefore implies bean lifecycles. I'll
>>> not detail them all but just use the sources and dofn (not sdf) to
>>> illustrate the idea I'm trying to develop.
>>>
>>> A. Source
>>>
>>> A source computes a partition plan with 2 primitives: estimateSize and
>>> split. As a user you can expect both to be called on the same bean
>>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>>
>>> connect()
>>> try {
>>>   estimateSize()
>>>   split()
>>> } finally {
>>>   disconnect()
>>> }
>>>
>>> this is not guaranteed by the API so you must do:
>>>
>>> connect()
>>> try {
>>>   estimateSize()
>>> } finally {
>>>   disconnect()
>>> }
>>> connect()
>>> try {
>>>   split()
>>> } finally {
>>>   disconnect()
>>> }
>>>
>>> + a workaround caching the size estimate internally, since this primitive
>>> is often called in split but you don't want to connect twice in the second
>>> phase.
>>>
>>> Why do you need that? Simply because you want to define an API to
>>> implement sources which initializes the source bean and destroys it.
>>> I insist this is a very, very basic concern for such an API. However beam
>>> doesn't embrace it and doesn't assume it, so building any API on top of
>>> beam is very painful today, and as a direct beam user you hit the exact
>>> same issues - check how IOs are implemented: the static utilities create
>>> volatile connections, preventing reuse of an existing connection within a
>>> single method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>>
>>> Same logic applies to the reader which is then created.
>>>
>>> B. DoFn & SDF
>>>
>>> As a fn dev you expect the same from the beam runtime: init(); try {
>>> while (...) process(); } finally { destroy(); } and that it is executed on
>>> the exact same instance to be able to be stateful at that level for
>>> expensive connections/operations/flow state handling.
>>>
>>> As you mentioned with the million example, this sequence should happen
>>> for each single instance, so 1M times in your example.
>>>
>>> Now why did I mention SDF several times? Because SDF is a generalisation
>>> of both cases (source and dofn). Therefore it creates way more instances
>>> and requires to have a way more strict/explicit definition of the exact
>>> lifecycle and which instance does what. Since beam handles the full
>>> lifecycle of the bean instances it must provide init/destroy hooks
>>> (setup/teardown) which can be stateful.
>>>
>>> If you take the JDBC example which was mentioned earlier: today,
>>> because of the teardown issue, it uses bundles. Since bundle size is not
>>> defined - and will not be with SDF - it must use a pool to be able to reuse
>>> a connection instance without wrecking performance. Now with SDF and the
>>> split increase, how do you handle the pool size? Generally in batch you use
>>> a single connection per thread to avoid consuming all database
>>> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use a
>>> pool a bit bigger, but multiplied by the number of beans you will likely x2
>>> or x3 the connection count and make the execution fail with "no more
>>> connections available". If you picked 1 (pool of #1), then you still have
>>> to have a reliable teardown per pool instance (close() generally) to ensure
>>> you release the pool and don't leak the connection information in the JVM.
>>> In all cases you come back to the init()/destroy() lifecycle, even if you
>>> hide getting connections behind bundles.
>>>
>>> Just to make it obvious: the SDF mentions are only because SDF implies all
>>> the current issues with the loose definition of the bean lifecycle at an
>>> exponential level, nothing else.
>>>
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>
>>>> The kind of whole-transform lifecycle you're mentioning can be
>>>> accomplished using the Wait transform as I suggested in the thread above,
>>>> and I believe it should become the canonical way to do that.
>>>>
>>>> (Would like to reiterate one more time, as the main author of most
>>>> design documents related to SDF and of its implementation in the Java
>>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>>> cleanup - I'm very confused as to why it keeps coming up)
>>>>
>>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <rm...@gmail.com>
>>>> wrote:
>>>>
>>>>> I kind of agree except transforms lack a lifecycle too. My
>>>>> understanding is that sdf could be a way to unify it and clean the api.
>>>>>
>>>>> Otherwise how to normalize - single api -  lifecycle of transforms?
>>>>>
>>>>> On 18 Feb 2018 21:32, "Ben Chambers" <bc...@apache.org> wrote:
>>>>>
>>>>>> Are you sure that focusing on the cleanup of specific DoFns is
>>>>>> appropriate? In many cases where cleanup is necessary, it is needed around an entire
>>>>>> composite PTransform. I think there have been discussions/proposals around
>>>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>>>> the best of my knowledge.
>>>>>>
>>>>>> For instance, consider the steps of a FileIO:
>>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>>> 2. When all temporary files are complete, attempt to do a bulk copy
>>>>>> to put them in the final destination.
>>>>>> 3. Cleanup all the temporary files.
>>>>>>
>>>>>> (This is often desirable because it minimizes the chance of seeing
>>>>>> partial/incomplete results in the final destination).
>>>>>>
>>>>>> In the above, you'd want step 1 to execute on many workers, likely
>>>>>> using a ParDo (say N different workers).
>>>>>> The move step should only happen once, so on one worker. This means
>>>>>> it will be a different DoFn, likely with some stuff done to ensure it runs
>>>>>> on one worker.
>>>>>>
>>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We
>>>>>> need an API for a PTransform to schedule some cleanup work for when the
>>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>>> leaving files around that have failed to import into BigQuery.
>>>>>>
>>>>>> In streaming this is less straightforward -- do you want to wait
>>>>>> until the end of the pipeline? Or do you want to wait until the end of the
>>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>>> the resource anymore.
>>>>>>
>>>>>> This led to some discussions around a "cleanup" API, where you could
>>>>>> have a transform that output resource objects. Each resource object would
>>>>>> have logic for cleaning it up. And there would be something that indicated
>>>>>> what parts of the pipeline needed that resource, and what kind of temporal
>>>>>> lifetime those objects had. As soon as that part of the pipeline had
>>>>>> advanced far enough that it would no longer need the resources, they would
>>>>>> get cleaned up. This can be done at pipeline shutdown, or incrementally
>>>>>> during a streaming pipeline, etc.
>>>>>>
>>>>>> Would something like this be a better fit for your use case? If not,
>>>>>> why is handling teardown within a single DoFn sufficient?
>>>>>>
>>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> Yes, 1M. Let me try to explain by simplifying the overall execution.
>>>>>>> Each instance - one fn, so likely in a thread of a worker - has its
>>>>>>> lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>>
>>>>>>> In practice, "new" is often an unsafe allocate (deserialization), but
>>>>>>> that doesn't matter here.
>>>>>>>
>>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>>> process or startBundle, and, the last time Beam has the instance before
>>>>>>> it is gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>>
>>>>>>> It is as simple as that.
>>>>>>> This way there is no need to combine fns in a way that makes a fn not
>>>>>>> self-contained in order to implement basic transforms.
>>>>>>>
>>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a
>>>>>>>>> écrit :
>>>>>>>>>
>>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>>> focusing on the semantics of the existing methods -- which have been
>>>>>>>>> noted to meet many existing use cases -- it would be helpful to focus
>>>>>>>>> more on the reason you are looking for something with different
>>>>>>>>> semantics.
>>>>>>>>>
>>>>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>>>>
>>>>>>>>> 1. Clean-up some external, global resource, that was initialized
>>>>>>>>> once during the startup of the pipeline. If this is the case, how are you
>>>>>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>>>>>> streaming pipeline?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> When the DoFn is no longer needed logically, i.e. when the batch is
>>>>>>>>> done or the stream is stopped (manually or by a JVM shutdown).
>>>>>>>>>
>>>>>>>>
>>>>>>>> I'm really not following what this means.
>>>>>>>>
>>>>>>>> Let's say that a pipeline is running 1000 workers, and each worker
>>>>>>>> is running 1000 threads (each running a copy of the same DoFn). How many
>>>>>>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>>>>>>> you want it called? When the entire pipeline is shut down? When an
>>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>>> about to start back up)? Something else?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 2. Finalize some resources that are used within some region of the
>>>>>>>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>>>>> information about resources)
>>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>>> changing resource IDs)
>>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>>>> output the fact they're done
>>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>>
>>>>>>>>> By making the use of the resource part of the data it is possible
>>>>>>>>> to "checkpoint" which resources may be in use or have been finished by
>>>>>>>>> using the require deterministic input. This is important to ensuring
>>>>>>>>> everything is actually cleaned up.
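[Editor's sketch: the resource-token idea in steps (a)-(f) above can be shown in plain Java. All names here are invented for illustration; Beam has no such built-in API. The key property is that resource IDs are deterministic data, so a retry regenerates the same IDs and the final stage frees exactly the resources that were initialized.]

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Resources tracked as data flowing through the pipeline stages.
public class ResourceTokens {
    // Stands in for real external resources (files, VMs, connections).
    static final Set<String> LIVE = new HashSet<>();

    // (a) generate deterministic resource IDs.
    static List<String> generateIds(int n) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < n; i++) ids.add("res-" + i);
        return ids;
    }

    // (c) initialize the resources named by the IDs.
    static void init(List<String> ids) { LIVE.addAll(ids); }

    // (d) pipeline segments use the resources and emit "done" tokens.
    static List<String> use(List<String> ids) { return ids; }

    // (f) free exactly the resources whose "done" tokens arrived.
    static void free(List<String> doneIds) { LIVE.removeAll(doneIds); }
}
```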
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I need that, but generic and not case by case, in order to
>>>>>>>>> industrialize an API on top of Beam.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> 3. Some other use case that I may be missing? If it is this case,
>>>>>>>>> could you elaborate on what you are trying to accomplish? That would help
>>>>>>>>> me understand both the problems with existing options and possibly what
>>>>>>>>> could be done to help.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>>> means each transform is different in its lifecycle handling. I dislike
>>>>>>>>> that a lot at scale and as a user, since you can't put any unified
>>>>>>>>> practice on top of Beam; it also makes Beam very hard to integrate or
>>>>>>>>> to use to build higher-level libraries or software.
>>>>>>>>>
>>>>>>>>> This is why I tried not to start the workaround discussions and just
>>>>>>>>> stay at the API level.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -- Ben
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <kirpichov@google.com
>>>>>>>>>> >:
>>>>>>>>>>
>>>>>>>>>>> "Machine state" is overly low-level because many of the possible
>>>>>>>>>>> reasons can happen on a perfectly fine machine.
>>>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>>> examples above.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Sounds ok to me
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>>> pass-by-reference).
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm not sure
>>>>>>>>>> which other method you speak about. Concretely, if you make it really
>>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then
>>>>>>>>>> users can't use it to clean anything; but if you make it "may not
>>>>>>>>>> happen, but that is unexpected and means something went wrong", then
>>>>>>>>>> it is fine to have a manual - or automatic, if fancy - recovery
>>>>>>>>>> procedure. This is where it makes all the difference and impacts the
>>>>>>>>>> developers and ops (all users, basically).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Agree Eugene except that "best effort" means that. It is also
>>>>>>>>>>>> often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>>
>>>>>>>>>>>> I'm fine using "except if the machine state prevents it" but
>>>>>>>>>>>> "best effort" is too open and can be very badly and wrongly perceived by
>>>>>>>>>>>> users (like I did).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>>
>>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>>
>>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com>
>>>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is
>>>>>>>>>>>>>>>>> not controlled. Using teardown doesn't let you release the
>>>>>>>>>>>>>>>>> connection, since it is a best-effort thing. Not releasing the
>>>>>>>>>>>>>>>>> connection makes you pay a lot - AWS ;) - or prevents you from
>>>>>>>>>>>>>>>>> launching other processing - concurrency limits.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die
>>>>>>>>>>>>>>>> so badly that @Teardown is not called then nothing else can be called to
>>>>>>>>>>>>>>>> close the connection either. What AWS service are you thinking of that
>>>>>>>>>>>>>>>> stays open for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>>>>> (proprietary) protocols require closing exchanges which are not
>>>>>>>>>>>>>>>> only "I'm leaving".
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> For AWS I was thinking about starting some services - machines
>>>>>>>>>>>>>>>> - on the fly at pipeline startup and closing them at the end.
>>>>>>>>>>>>>>>> If teardown is not called you leak machines and money. You can
>>>>>>>>>>>>>>>> say it can be done another way... as can the full pipeline ;).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic
>>>>>>>>>>>>>>>> pipelines and is bound to some particular IOs.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> What prevents us from enforcing teardown - ignoring the
>>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human
>>>>>>>>>>>>>>>> system? Nothing, technically. Why do you push to not handle it?
>>>>>>>>>>>>>>>> Is it due to some legacy code in Dataflow or something else?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called then
>>>>>>>>>>>>>> it is a bug and we are done :).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Also, what does it mean for the users? The direct runner
>>>>>>>>>>>>>>>> does it, so if a user uses the reference implementation in
>>>>>>>>>>>>>>>> tests, will he get a different behavior in prod? Also don't
>>>>>>>>>>>>>>>> forget the user doesn't know what the IOs he composes use, so
>>>>>>>>>>>>>>>> this impacts the whole product so much that it must be handled
>>>>>>>>>>>>>>>> IMHO.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> I understand the portability culture is new in the big data
>>>>>>>>>>>>>>>> world, but that is not a reason to ignore what people did for
>>>>>>>>>>>>>>>> years and do it wrong before doing it right ;).
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - under
>>>>>>>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see
>>>>>>>>>>>>>>>> if we can handle it, and only if there is a technical reason
>>>>>>>>>>>>>>>> we can't do we make it experimental/unsupported in the API. I
>>>>>>>>>>>>>>>> know Spark and Flink can; any unknown blocker for other
>>>>>>>>>>>>>>>> runners?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Technical note: even a kill should go through Java shutdown
>>>>>>>>>>>>>>>> hooks, otherwise your environment (the software enclosing
>>>>>>>>>>>>>>>> Beam) is fully unhandled and your overall system is
>>>>>>>>>>>>>>>> uncontrolled. The only case where that is not true is when the
>>>>>>>>>>>>>>>> software is always owned by a vendor and never installed on a
>>>>>>>>>>>>>>>> customer environment. In that case it belongs to the vendor to
>>>>>>>>>>>>>>>> handle the Beam API, and not to Beam to adjust its API for a
>>>>>>>>>>>>>>>> vendor - otherwise all features unsupported by one runner
>>>>>>>>>>>>>>>> should be made optional, right?
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined
>>>>>>>>>>>>>>>> lifecycle.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>
>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2018-02-19 15:57 GMT+01:00 Reuven Lax <re...@google.com>:

>
>
> On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <
> rmannibucau@gmail.com> wrote:
>
>> @Reuven: in practice they are created by pools of 256, but it leads to the
>> same pattern; the teardown is just an "if (iCreatedThem) releaseThem();"
>>
>
> How do you control "256?" Even if you have a pool of 256 workers, nothing
> in Beam guarantees how many threads and DoFns are created per worker. In
> theory the runner might decide to create 1000 threads on each worker.
>

No, it was the other way around: in this case on AWS you can get 256
instances at once but not 512 (which would be 2x256). So when you compute
the distribution, you allocate to some fn the role of owning the instance
lookup and release.

Anyway, this was just an example of an external resource you must release.
The real topic is that Beam should define ASAP a guaranteed generic
lifecycle to let users embrace its programming model.


>
>
>
>> @Eugene:
>> 1. the Wait logic is about passing the value, which is not always possible
>> (in maybe 15% of cases, from my rough estimate)
>> 2. SDF: I'll try to detail why I mention SDF more here
>>
>>
>> Concretely, Beam exposes a portable API (included in the SDK core). This
>> API defines a *container* API and therefore implies bean lifecycles. I'll
>> not detail them all but just use sources and DoFn (not SDF) to
>> illustrate the idea I'm trying to develop.
>>
>> A. Source
>>
>> A source computes a partition plan with 2 primitives: estimateSize and
>> split. As a user you can expect both to be called on the same bean
>> instance to avoid paying the same connection cost(s) twice. Concretely:
>>
>> connect()
>> try {
>>   estimateSize()
>>   split()
>> } finally {
>>   disconnect()
>> }
>>
>> this is not guaranteed by the API so you must do:
>>
>> connect()
>> try {
>>   estimateSize()
>> } finally {
>>   disconnect()
>> }
>> connect()
>> try {
>>   split()
>> } finally {
>>   disconnect()
>> }
>>
>> ...plus a workaround with an internal estimated size, since this primitive
>> is often called in split but you don't want to connect twice in the second
>> phase.
>>
>> Why do you need that? Simply because you want to define an API for
>> implementing sources which initializes the source bean and destroys it.
>> I insist it is a very, very basic concern for such an API. However Beam
>> doesn't embrace it and doesn't assume it, so building any API on top of
>> Beam is very painful today, and direct Beam users hit the exact same
>> issues - check how IOs are implemented: the static utilities which create
>> volatile connections prevent reusing an existing connection in a single
>> method (https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862).
>>
>> The same logic applies to the reader, which is created afterwards.
>>
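[Editor's sketch: the source contract argued for above - one connect() covering estimateSize() and split(), guaranteed to end with a disconnect - can be shown as self-contained Java. FakeConnection and the hook names setup/teardown are illustrative; this is not Beam's actual Source API.]

```java
import java.util.Arrays;
import java.util.List;

// A source bean whose two planning primitives share one connection,
// opened by an init hook and released by a guaranteed destroy hook.
public class ConnectedSource {
    static int connects = 0, disconnects = 0;   // observability for the sketch

    static class FakeConnection {
        FakeConnection() { connects++; }
        long size() { return 100; }
        List<String> partitions() { return Arrays.asList("p0", "p1"); }
        void close() { disconnects++; }
    }

    private FakeConnection conn;

    void setup()         { conn = new FakeConnection(); } // init hook
    long estimateSize()  { return conn.size(); }          // reuses conn
    List<String> split() { return conn.partitions(); }    // reuses conn
    void teardown()      { conn.close(); }                // destroy hook

    // The lifecycle the runner would enforce: one connect, one disconnect.
    static List<String> plan() {
        ConnectedSource s = new ConnectedSource();
        s.setup();
        try {
            s.estimateSize();
            return s.split();
        } finally {
            s.teardown();   // exactly one disconnect per setup
        }
    }
}
```

Without the guaranteed pairing, each primitive has to open and close its own connection, which is the double-connect pattern the email shows above.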
>> B. DoFn & SDF
>>
>> As a fn dev you expect the same from the Beam runtime: init(); try {
>> while (...) process(); } finally { destroy(); } and that it is executed on
>> the exact same instance, to be able to be stateful at that level for
>> expensive connections/operations/flow state handling.
>>
>> As you mentioned with the million example, this sequence should happen
>> for each single instance, so 1M times in your example.
>>
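[Editor's sketch: the init(); try { process(); } finally { destroy(); } contract per instance, as plain self-contained Java. The counter stands in for a real external resource; the names are illustrative, not Beam's DoFn API.]

```java
import java.util.concurrent.atomic.AtomicInteger;

// Every instance that gets init() eventually gets destroy(),
// whatever happens in process().
public class FnLifecycle {
    static final AtomicInteger OPEN = new AtomicInteger();

    void init()               { OPEN.incrementAndGet(); } // e.g. open a connection
    void process(int element) { /* per-element work */ }
    void destroy()            { OPEN.decrementAndGet(); } // release it, exactly once

    static void runInstance(int[] elements) {
        FnLifecycle fn = new FnLifecycle();
        fn.init();
        try {
            for (int e : elements) fn.process(e);
        } finally {
            fn.destroy();  // paired with init() even if process() throws
        }
    }
}
```

With 1M instances, this pairing is what keeps the open-resource count at zero after the run, which is exactly the guarantee under discussion.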
>> Now why did I mention SDF several times? Because SDF is a generalisation
>> of both cases (source and dofn). Therefore it creates way more instances
>> and requires a way more strict/explicit definition of the exact
>> lifecycle and of which instance does what. Since Beam handles the full
>> lifecycle of the bean instances it must provide init/destroy hooks
>> (setup/teardown) which can be stateful.
>>
>> Take the JDBC example which was mentioned earlier. Today, because of the
>> teardown issue, it uses bundles. Since bundle size is not defined - and
>> will not be with SDF - it must use a pool to be able to reuse a connection
>> instance and not degrade performance. Now with SDF and the split increase,
>> how do you handle the pool size? Generally in batch you use a single
>> connection per thread to avoid consuming all database connections. With a
>> pool you have 2 choices: 1. use a pool of 1; 2. use a slightly larger
>> pool, but multiplied by the number of beans you will likely x2 or x3 the
>> connection count and make the execution fail with "no more connections
>> available". If you picked 1 (pool of #1), then you still have to have a
>> reliable teardown per pool instance (close(), generally) to ensure you
>> release the pool and don't leak the connection information in the JVM. In
>> all cases you come back to the init()/destroy() lifecycle, even if you
>> pretend to get connections through bundles.
>>
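[Editor's sketch: the pool-of-1 case in self-contained Java. FakeJdbcConnection stands in for a real JDBC connection, and the leak counter shows why close() must be guaranteed; the names are invented for illustration.]

```java
import java.util.concurrent.atomic.AtomicInteger;

// A per-instance "pool" holding a single connection, reused across
// bundles, which must be closed reliably or the connection leaks.
public class PoolOfOne {
    static final AtomicInteger LEAKED = new AtomicInteger();

    static class FakeJdbcConnection implements AutoCloseable {
        FakeJdbcConnection() { LEAKED.incrementAndGet(); }
        void execute(String sql) { /* run the statement */ }
        @Override public void close() { LEAKED.decrementAndGet(); }
    }

    private FakeJdbcConnection conn;          // the whole "pool"

    FakeJdbcConnection borrow() {
        if (conn == null) conn = new FakeJdbcConnection();
        return conn;                          // reused across bundles
    }

    void close() {                            // the teardown that must run
        if (conn != null) { conn.close(); conn = null; }
    }

    static void processBundles(int bundles) {
        PoolOfOne pool = new PoolOfOne();
        try {
            for (int b = 0; b < bundles; b++) pool.borrow().execute("INSERT ...");
        } finally {
            pool.close();                     // without this, LEAKED stays > 0
        }
    }
}
```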
>> Just to make it obvious: the SDF mentions are only because SDF implies all
>> the current issues with the loose definition of the bean lifecycles at an
>> exponential level, nothing else.
>>
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> | Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>
>>> The kind of whole-transform lifecycle you're mentioning can be
>>> accomplished using the Wait transform as I suggested in the thread above,
>>> and I believe it should become the canonical way to do that.
>>>
>>> (Would like to reiterate one more time, as the main author of most
>>> design documents related to SDF and of its implementation in the Java
>>> direct and dataflow runner that SDF is fully unrelated to the topic of
>>> cleanup - I'm very confused as to why it keeps coming up)
>>>
>>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <rm...@gmail.com>
>>> wrote:
>>>
>>>> I kind of agree, except transforms lack a lifecycle too. My
>>>> understanding is that SDF could be a way to unify it and clean up the
>>>> API.
>>>>
>>>> Otherwise, how do we normalize - with a single API - the lifecycle of
>>>> transforms?
>>>>
>>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a écrit :
>>>>
>>>>> Are you sure that focusing on the cleanup of specific DoFns is
>>>>> appropriate? In many cases where cleanup is necessary, it is around an
>>>>> entire composite PTransform. I think there have been
>>>>> discussions/proposals around a more methodical "cleanup" option, but
>>>>> those haven't been implemented, to the best of my knowledge.
>>>>>
>>>>> For instance, consider the steps of a FileIO:
>>>>> 1. Write to a bunch (N shards) of temporary files
>>>>> 2. When all temporary files are complete, attempt to do a bulk copy to
>>>>> put them in the final destination.
>>>>> 3. Cleanup all the temporary files.
>>>>>
>>>>> (This is often desirable because it minimizes the chance of seeing
>>>>> partial/incomplete results in the final destination).
>>>>>
>>>>> In the above, you'd want step 1 to execute on many workers, likely
>>>>> using a ParDo (say N different workers).
>>>>> The move step should only happen once, so on one worker. This means it
>>>>> will be a different DoFn, likely with some stuff done to ensure it runs on
>>>>> one worker.
>>>>>
>>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We need
>>>>> an API for a PTransform to schedule some cleanup work for when the
>>>>> transform is "done". In batch this is relatively straightforward, but
>>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>>> leaving files around that have failed to import into BigQuery.
>>>>>
>>>>> In streaming this is less straightforward -- do you want to wait until
>>>>> the end of the pipeline? Or do you want to wait until the end of the
>>>>> window? In practice, you just want to wait until you know nobody will need
>>>>> the resource anymore.
>>>>>
>>>>> This led to some discussions around a "cleanup" API, where you could
>>>>> have a transform that output resource objects. Each resource object would
>>>>> have logic for cleaning it up. And there would be something that indicated
>>>>> what parts of the pipeline needed that resource, and what kind of temporal
>>>>> lifetime those objects had. As soon as that part of the pipeline had
>>>>> advanced far enough that it would no longer need the resources, they would
>>>>> get cleaned up. This can be done at pipeline shutdown, or incrementally
>>>>> during a streaming pipeline, etc.
>>>>>
>>>>> Would something like this be a better fit for your use case? If not,
>>>>> why is handling teardown within a single DoFn sufficient?
>>>>>
>>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> Yes, 1M. Let me try to explain by simplifying the overall execution.
>>>>>> Each instance - one fn, so likely in a thread of a worker - has its
>>>>>> lifecycle. Caricaturally: "new" and garbage collection.
>>>>>>
>>>>>> In practice, "new" is often an unsafe allocate (deserialization), but
>>>>>> that doesn't matter here.
>>>>>>
>>>>>> What I want is for any "new" to be followed by a setup before any
>>>>>> process or startBundle, and, the last time Beam has the instance before
>>>>>> it is gc-ed and after the last finishBundle, for it to call teardown.
>>>>>>
>>>>>> It is as simple as that.
>>>>>> This way there is no need to combine fns in a way that makes a fn not
>>>>>> self-contained in order to implement basic transforms.
>>>>>>
>>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a
>>>>>>>> écrit :
>>>>>>>>
>>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>>> focusing on the semantics of the existing methods -- which have been
>>>>>>>> noted to meet many existing use cases -- it would be helpful to focus
>>>>>>>> more on the reason you are looking for something with different
>>>>>>>> semantics.
>>>>>>>>
>>>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>>>
>>>>>>>> 1. Clean-up some external, global resource, that was initialized
>>>>>>>> once during the startup of the pipeline. If this is the case, how are you
>>>>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>>>>> streaming pipeline?
>>>>>>>>
>>>>>>>>
>>>>>>>> When the DoFn is no longer needed logically, i.e. when the batch is
>>>>>>>> done or the stream is stopped (manually or by a JVM shutdown).
>>>>>>>>
>>>>>>>
>>>>>>> I'm really not following what this means.
>>>>>>>
>>>>>>> Let's say that a pipeline is running 1000 workers, and each worker
>>>>>>> is running 1000 threads (each running a copy of the same DoFn). How many
>>>>>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>>>>>> you want it called? When the entire pipeline is shut down? When an
>>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>>> about to start back up)? Something else?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 2. Finalize some resources that are used within some region of the
>>>>>>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>>>> information about resources)
>>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>>> changing resource IDs)
>>>>>>>>    c) ParDo that initializes the resources
>>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>>> output the fact they're done
>>>>>>>>    e) "Require Deterministic Input"
>>>>>>>>    f) ParDo that frees the resources
>>>>>>>>
>>>>>>>> By making the use of the resource part of the data it is possible
>>>>>>>> to "checkpoint" which resources may be in use or have been finished by
>>>>>>>> using the require deterministic input. This is important to ensuring
>>>>>>>> everything is actually cleaned up.
>>>>>>>>
>>>>>>>>
>>>>>>>> I need that, but generic and not case by case, in order to
>>>>>>>> industrialize an API on top of Beam.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> 3. Some other use case that I may be missing? If it is this case,
>>>>>>>> could you elaborate on what you are trying to accomplish? That would help
>>>>>>>> me understand both the problems with existing options and possibly what
>>>>>>>> could be done to help.
>>>>>>>>
>>>>>>>>
>>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>>> means each transform is different in its lifecycle handling. I dislike
>>>>>>>> that a lot at scale and as a user, since you can't put any unified
>>>>>>>> practice on top of Beam; it also makes Beam very hard to integrate or
>>>>>>>> to use to build higher-level libraries or software.
>>>>>>>>
>>>>>>>> This is why I tried not to start the workaround discussions and just
>>>>>>>> stay at the API level.
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> -- Ben
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>
>>>>>>>>> :
>>>>>>>>>
>>>>>>>>>> "Machine state" is overly low-level because many of the possible
>>>>>>>>>> reasons can happen on a perfectly fine machine.
>>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>>> examples above.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sounds ok to me
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>>> pass-by-reference).
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm not sure
>>>>>>>>> which other method you speak about. Concretely, if you make it really
>>>>>>>>> unreliable - this is what "best effort" sounds like to me - then users
>>>>>>>>> can't use it to clean anything; but if you make it "may not happen,
>>>>>>>>> but that is unexpected and means something went wrong", then it is
>>>>>>>>> fine to have a manual - or automatic, if fancy - recovery procedure.
>>>>>>>>> This is where it makes all the difference and impacts the developers
>>>>>>>>> and ops (all users, basically).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Agree Eugene except that "best effort" means that. It is also
>>>>>>>>>>> often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>>
>>>>>>>>>>> I'm fine using "except if the machine state prevents it" but
>>>>>>>>>>> "best effort" is too open and can be very badly and wrongly perceived by
>>>>>>>>>>> users (like I did).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>>
>>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>
>>>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>>
>>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com>
>>>>>>>>>>>>>>> a écrit :
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Take a simple example of a transform requiring a
>>>>>>>>>>>>>>>> connection. Using bundles is a perf killer since their size is not
>>>>>>>>>>>>>>>> controlled. Using teardown doesn't allow you to release the connection
>>>>>>>>>>>>>>>> since it is a best-effort thing. Not releasing the connection makes you pay
>>>>>>>>>>>>>>>> a lot - aws ;) - or prevents you from launching other processing -
>>>>>>>>>>>>>>>> concurrency limits.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die so
>>>>>>>>>>>>>>> badly that @Teardown is not called then nothing else can be called to close
>>>>>>>>>>>>>>> the connection either. What AWS service are you thinking of that stays open
>>>>>>>>>>>>>>> for a long time when everything at the other end has died?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>>>>> (proprietary) protocols requires some closing exchanges which are not only
>>>>>>>>>>>>>>> "im leaving".
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> For aws i was thinking about starting some services -
>>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I don't want to be picky, but if beam can't handle its
>>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>>> is bound to some particular IOs.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system? Nothing
>>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>>> code in dataflow or something else?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called then
>>>>>>>>>>>>> it is a bug and we are done :).
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, what does it mean for the users? The direct runner does
>>>>>>>>>>>>>>> it, so if a user uses the RI in tests, will he get a different behavior in
>>>>>>>>>>>>>>> prod? Also don't forget the user doesn't know what the IOs he composes use,
>>>>>>>>>>>>>>> so this is so impactful for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I understand the portability culture is new in the big data
>>>>>>>>>>>>>>> world, but that is not a reason to ignore what people did for years and do
>>>>>>>>>>>>>>> it wrong before doing it right ;).
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Technical note: even a kill should go through java shutdown
>>>>>>>>>>>>>>> hooks, otherwise your environment (the software enclosing beam) is fully
>>>>>>>>>>>>>>> unhandled and your overall system is uncontrolled. The only case where this
>>>>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>>> installed on a customer environment. In that case it belongs to the vendor
>>>>>>>>>>>>>>> to handle the beam API, and not to beam to adjust its API for a vendor -
>>>>>>>>>>>>>>> otherwise all features unsupported by one runner should be made optional,
>>>>>>>>>>>>>>> right?
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>
>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
On Mon, Feb 19, 2018 at 12:35 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

> @Reuven: in practice it is created by a pool of 256 but leads to the same
> pattern; the teardown is just a "if (iCreatedThem) releaseThem();"
>

How do you control "256"? Even if you have a pool of 256 workers, nothing
in Beam guarantees how many threads and DoFns are created per worker. In
theory the runner might decide to create 1000 threads on each worker.



> @Eugene:
> 1. the Wait logic relies on passing the value downstream, which is not
> always possible (roughly 15% of cases, from my rough estimate)
> 2. sdf: I'll try to detail why I mention SDF more here
>
>
> Concretely beam exposes a portable API (included in the SDK core). This
> API defines a *container* API and therefore implies bean lifecycles. I'll
> not detail them all but just use the sources and dofn (not sdf) to
> illustrate the idea I'm trying to develop.
>
> A. Source
>
> A source computes a partition plan with 2 primitives: estimateSize and
> split. As a user you can expect both to be called on the same bean
> instance to avoid paying the same connection cost(s) twice. Concretely:
>
> connect()
> try {
>   estimateSize()
>   split()
> } finally {
>   disconnect()
> }
>
> this is not guaranteed by the API so you must do:
>
> connect()
> try {
>   estimateSize()
> } finally {
>   disconnect()
> }
> connect()
> try {
>   split()
> } finally {
>   disconnect()
> }
>
> + a workaround with an internal estimated size, since this primitive is
> often called in split but you don't want to connect twice in the second
> phase.
>
> Why do you need that? Simply because you want to define an API to implement
> sources which initializes the source bean and destroys it.
> I insist this is a very, very basic concern for such an API. However beam
> doesn't embrace it and doesn't assume it, so building any API on top of
> beam is very painful today, and as a direct beam user you hit the exact same
> issues - check how IOs are implemented: the static utilities which create
> volatile connections, preventing reuse of an existing connection in a
> single method (https://github.com/apache/beam/blob/master/sdks/java/io/
> elasticsearch/src/main/java/org/apache/beam/sdk/io/
> elasticsearch/ElasticsearchIO.java#L862).
>
> Same logic applies to the reader which is then created.
>
> B. DoFn & SDF
>
> As a fn dev you expect the same from the beam runtime: init(); try { while
> (...) process(); } finally { destroy(); } - and that it is executed on the
> exact same instance, so you can be stateful at that level for expensive
> connections/operations/flow-state handling.
>
> As you mentioned with the million example, this sequence should happen
> for each single instance, so 1M times in your example.
>
> Now why did I mention SDF several times? Because SDF is a generalisation
> of both cases (source and dofn). Therefore it creates way more instances
> and requires a far stricter/more explicit definition of the exact
> lifecycle and of which instance does what. Since beam handles the full
> lifecycle of the bean instances, it must provide init/destroy hooks
> (setup/teardown) which can be stateful.
>
> If you take the JDBC example which was mentioned earlier: today, because
> of the teardown issue, it uses bundles. Since bundle size is not defined -
> and will not be with SDF - it must use a pool to be able to reuse a
> connection instance so as not to kill performance. Now, with SDF and the
> increase in splits, how do you handle the pool size? Generally in batch you
> use a single connection per thread to avoid consuming all database
> connections. With a pool you have 2 choices: 1. use a pool of 1, 2. use a
> pool a bit bigger, but multiplied by the number of beans you will likely
> double or triple the connection count and make the execution fail with "no
> more connections available". If you picked option 1 (pool of 1), then you
> still have to have a reliable teardown per pool instance (close()
> generally) to ensure you release the pool and don't leak the connection
> information in the JVM. In all cases you come back to the init()/destroy()
> lifecycle, even if you fake it by tying connections to bundles.
>
> Just to make it obvious: the SDF mentions are just because SDF implies all
> the current issues with the loose definition of the bean lifecycle at an
> exponential scale, nothing else.
>
>
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> 2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>
>> The kind of whole-transform lifecycle you're mentioning can be
>> accomplished using the Wait transform as I suggested in the thread above,
>> and I believe it should become the canonical way to do that.
>>
>> (Would like to reiterate one more time, as the main author of most design
>> documents related to SDF and of its implementation in the Java direct and
>> dataflow runner that SDF is fully unrelated to the topic of cleanup - I'm
>> very confused as to why it keeps coming up)
>>
>> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> I kind of agree except transforms lack a lifecycle too. My understanding
>>> is that sdf could be a way to unify it and clean the api.
>>>
>>> Otherwise how to normalize - single api -  lifecycle of transforms?
>>>
>>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a écrit :
>>>
>>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>>> appropriate? Many cases where cleanup is necessary, it is around an entire
>>>> composite PTransform. I think there have been discussions/proposals around
>>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>>> the best of my knowledge.
>>>>
>>>> For instance, consider the steps of a FileIO:
>>>> 1. Write to a bunch (N shards) of temporary files
>>>> 2. When all temporary files are complete, attempt to do a bulk copy to
>>>> put them in the final destination.
>>>> 3. Cleanup all the temporary files.
>>>>
>>>> (This is often desirable because it minimizes the chance of seeing
>>>> partial/incomplete results in the final destination).
>>>>
>>>> In the above, you'd want step 1 to execute on many workers, likely
>>>> using a ParDo (say N different workers).
>>>> The move step should only happen once, so on one worker. This means it
>>>> will be a different DoFn, likely with some stuff done to ensure it runs on
>>>> one worker.
>>>>
>>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We need
>>>> an API for a PTransform to schedule some cleanup work for when the
>>>> transform is "done". In batch this is relatively straightforward, but
>>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>>> leaving files around that have failed to import into BigQuery.
>>>>
>>>> In streaming this is less straightforward -- do you want to wait until
>>>> the end of the pipeline? Or do you want to wait until the end of the
>>>> window? In practice, you just want to wait until you know nobody will need
>>>> the resource anymore.
>>>>
>>>> This led to some discussions around a "cleanup" API, where you could
>>>> have a transform that output resource objects. Each resource object would
>>>> have logic for cleaning it up. And there would be something that indicated
>>>> what parts of the pipeline needed that resource, and what kind of temporal
>>>> lifetime those objects had. As soon as that part of the pipeline had
>>>> advanced far enough that it would no longer need the resources, they would
>>>> get cleaned up. This can be done at pipeline shutdown, or incrementally
>>>> during a streaming pipeline, etc.
>>>>
>>>> Would something like this be a better fit for your use case? If not,
>>>> why is handling teardown within a single DoFn sufficient?
>>>>
>>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Yes, 1M. Let me try to explain, simplifying the overall execution.
>>>>> Each instance - one fn, so likely in a thread of a worker - has its
>>>>> lifecycle. Caricaturally: "new" and garbage collection.
>>>>>
>>>>> In practice, "new" is often an unsafe allocation (deserialization), but
>>>>> that doesn't matter here.
>>>>>
>>>>> What I want is for any "new" to be followed by setup before any process
>>>>> or startBundle, and for beam to call teardown the last time it holds the
>>>>> instance - after the last finishBundle and before it is gc-ed.
>>>>>
>>>>> It is as simple as that.
>>>>> This way there is no need to combine fns in a way that makes a fn not
>>>>> self-contained in order to implement basic transforms.
>>>>>
>>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a
>>>>>>> écrit :
>>>>>>>
>>>>>>> It feels like this thread may be a bit off-track. Rather than
>>>>>>> focusing on the semantics of the existing methods -- which have been noted
>>>>>>> to meet many existing use cases -- it would be helpful to focus more
>>>>>>> on the reason you are looking for something with different semantics.
>>>>>>>
>>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>>
>>>>>>> 1. Clean-up some external, global resource, that was initialized
>>>>>>> once during the startup of the pipeline. If this is the case, how are you
>>>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>>>> streaming pipeline?
>>>>>>>
>>>>>>>
>>>>>>> When the dofn is logically no longer needed, i.e. when the batch is
>>>>>>> done or the stream is stopped (manually or by a jvm shutdown)
>>>>>>>
>>>>>>
>>>>>> I'm really not following what this means.
>>>>>>
>>>>>> Let's say that a pipeline is running 1000 workers, and each worker is
>>>>>> running 1000 threads (each running a copy of the same DoFn). How many
>>>>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>>>>> you want it called? When the entire pipeline is shut down? When an
>>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>>> about to start back up)? Something else?
>>>>>>
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 2. Finalize some resources that are used within some region of the
>>>>>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>>> information about resources)
>>>>>>>    b) "Require Deterministic Input" (to prevent retries from
>>>>>>> changing resource IDs)
>>>>>>>    c) ParDo that initializes the resources
>>>>>>>    d) Pipeline segments that use the resources, and eventually
>>>>>>> output the fact they're done
>>>>>>>    e) "Require Deterministic Input"
>>>>>>>    f) ParDo that frees the resources
>>>>>>>
>>>>>>> By making the use of the resource part of the data it is possible to
>>>>>>> "checkpoint" which resources may be in use or have been finished by using
>>>>>>> the require deterministic input. This is important to ensuring everything
>>>>>>> is actually cleaned up.
>>>>>>>
>>>>>>>
>>>>>>> I need that, but generic and not case by case, to industrialize
>>>>>>> some api on top of beam.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> 3. Some other use case that I may be missing? If it is this case,
>>>>>>> could you elaborate on what you are trying to accomplish? That would help
>>>>>>> me understand both the problems with existing options and possibly what
>>>>>>> could be done to help.
>>>>>>>
>>>>>>>
>>>>>>> I understand there are workarounds for almost all cases, but that
>>>>>>> means each transform is different in its lifecycle handling. I dislike
>>>>>>> that a lot at scale and as a user, since you can't put any unified
>>>>>>> practice on top of beam; it also makes beam very hard to integrate or to
>>>>>>> use to build higher-level libraries or software.
>>>>>>>
>>>>>>> This is why I tried to not start the workaround discussion and just
>>>>>>> stay at the API level.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> -- Ben
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>>
>>>>>>>>> "Machine state" is overly low-level because many of the possible
>>>>>>>>> reasons can happen on a perfectly fine machine.
>>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>>> examples above.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Sounds ok to me
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>>> pass-by-reference).
>>>>>>>>>
>>>>>>>>
>>>>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure which
>>>>>>>> other method you speak about. Concretely, if you make it really unreliable -
>>>>>>>> this is what "best effort" sounds like to me - then users can't use it to
>>>>>>>> clean anything; but if you make it "can fail to happen, but that is
>>>>>>>> unexpected and means something went wrong", then it is fine to have a manual
>>>>>>>> - or automatic if fancy - recovery procedure. This is where it makes all the
>>>>>>>> difference and impacts the developers and ops (all users, basically).
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> Agree Eugene except that "best effort" means that. It is also
>>>>>>>>>> often used to say "at will" and this is what triggered this thread.
>>>>>>>>>>
>>>>>>>>>> I'm fine using "except if the machine state prevents it" but
>>>>>>>>>> "best effort" is too open and can be very badly and wrongly perceived by
>>>>>>>>>> users (like I did).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Romain Manni-Bucau
>>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>>
>>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <kirpichov@google.com
>>>>>>>>>> >:
>>>>>>>>>>
>>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>>
>>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a
>>>>>>>>>>>>>> écrit :
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm
>>>>>>>>>>>>>>> trying to write an IO for system $x and it requires the following
>>>>>>>>>>>>>>> initialization and the following cleanup logic and the following processing
>>>>>>>>>>>>>>> in between") I'll be better able to help you.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>>>>>>> Using bundles is a perf killer since their size is not controlled. Using
>>>>>>>>>>>>>>> teardown doesn't allow you to release the connection since it is a
>>>>>>>>>>>>>>> best-effort thing. Not releasing the connection makes you pay a lot - aws
>>>>>>>>>>>>>>> ;) - or prevents you from launching other processing - concurrency limits.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die so
>>>>>>>>>>>>>> badly that @Teardown is not called then nothing else can be called to close
>>>>>>>>>>>>>> the connection either. What AWS service are you thinking of that stays open
>>>>>>>>>>>>>> for a long time when everything at the other end has died?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>>>> (proprietary) protocols requires some closing exchanges which are not only
>>>>>>>>>>>>>> "im leaving".
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> For aws i was thinking about starting some services -
>>>>>>>>>>>>>> machines - on the fly in a pipeline startup and closing them at the end. If
>>>>>>>>>>>>>> teardown is not called you leak machines and money. You can say it can be
>>>>>>>>>>>>>> done another way...as the full pipeline ;).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I don't want to be picky, but if beam can't handle its
>>>>>>>>>>>>>> components' lifecycle it can't be used at scale for generic pipelines and
>>>>>>>>>>>>>> is bound to some particular IOs.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>>>>> interstellar crash case, which can't be handled by any human system? Nothing
>>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>>>> code in dataflow or something else?
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not called then
>>>>>>>>>>>> it is a bug and we are done :).
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also, what does it mean for the users? The direct runner does
>>>>>>>>>>>>>> it, so if a user uses the RI in tests, will he get a different behavior in
>>>>>>>>>>>>>> prod? Also don't forget the user doesn't know what the IOs he composes use,
>>>>>>>>>>>>>> so this is so impactful for the whole product that it must be handled IMHO.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I understand the portability culture is new in the big data
>>>>>>>>>>>>>> world, but that is not a reason to ignore what people did for years and do
>>>>>>>>>>>>>> it wrong before doing it right ;).
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Technical note: even a kill should go through java shutdown
>>>>>>>>>>>>>> hooks, otherwise your environment (the software enclosing beam) is fully
>>>>>>>>>>>>>> unhandled and your overall system is uncontrolled. The only case where this
>>>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>>>> installed on a customer environment. In that case it belongs to the vendor
>>>>>>>>>>>>>> to handle the beam API, and not to beam to adjust its API for a vendor -
>>>>>>>>>>>>>> otherwise all features unsupported by one runner should be made optional,
>>>>>>>>>>>>>> right?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Not all state is about the network, even in distributed
>>>>>>>>>>>>>> systems, so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
@Reuven: in practice it is created by a pool of 256 but leads to the same
pattern; the teardown is just a "if (iCreatedThem) releaseThem();"
@Eugene:
1. the Wait logic relies on passing the value downstream, which is not always
possible (roughly 15% of cases, from my rough estimate)
2. sdf: I'll try to detail why I mention SDF more here


Concretely beam exposes a portable API (included in the SDK core). This API
defines a *container* API and therefore implies bean lifecycles. I'll not
detail them all but just use the sources and dofn (not sdf) to illustrate
the idea I'm trying to develop.

A. Source

A source computes a partition plan with 2 primitives: estimateSize and
split. As a user you can expect both to be called on the same bean
instance to avoid paying the same connection cost(s) twice. Concretely:

connect()
try {
  estimateSize()
  split()
} finally {
  disconnect()
}

this is not guaranteed by the API so you must do:

connect()
try {
  estimateSize()
} finally {
  disconnect()
}
connect()
try {
  split()
} finally {
  disconnect()
}

+ a workaround with an internal estimated size, since this primitive is often
called in split but you don't want to connect twice in the second phase.

Why do you need that? Simply because you want to define an API to implement
sources which initializes the source bean and destroys it.
I insist this is a very, very basic concern for such an API. However beam
doesn't embrace it and doesn't assume it, so building any API on top of
beam is very painful today, and as a direct beam user you hit the exact same
issues - check how IOs are implemented: the static utilities which create
volatile connections, preventing reuse of an existing connection in a single
method (
https://github.com/apache/beam/blob/master/sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java#L862
).

Same logic applies to the reader which is then created.
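
To make the cost concrete, here is a minimal sketch (plain Java, with a
hypothetical DemoSource type standing in for a source bean - this is not the
real Beam Source API) that counts connect() calls for the two sequences
above:

```java
// Hypothetical source bean: counts connect() calls to compare both lifecycles.
class DemoSource {
    int connects = 0;

    void connect() { connects++; }      // expensive: pays the connection cost
    void disconnect() { /* release the (fake) connection */ }
    long estimateSize() { return 42L; } // would use the open connection
    void split() { /* would compute the partition plan over the connection */ }

    // Lifecycle the API could guarantee: one connection spans both primitives.
    static int guaranteedLifecycle() {
        DemoSource source = new DemoSource();
        source.connect();
        try {
            source.estimateSize();
            source.split();
        } finally {
            source.disconnect();
        }
        return source.connects;
    }

    // Lifecycle forced today: each primitive pays the connection cost again.
    static int workaroundLifecycle() {
        DemoSource source = new DemoSource();
        source.connect();
        try {
            source.estimateSize();
        } finally {
            source.disconnect();
        }
        source.connect();
        try {
            source.split();
        } finally {
            source.disconnect();
        }
        return source.connects;
    }
}
```

Counting the calls, the guaranteed lifecycle connects once while the
workaround connects twice - exactly the doubled cost described above.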

B. DoFn & SDF

As a fn dev you expect the same from the beam runtime: init(); try { while
(...) process(); } finally { destroy(); } - and that it is executed on the
exact same instance, so you can be stateful at that level for expensive
connections/operations/flow-state handling.

As you mentioned with the million example, this sequence should happen for
each single instance, so 1M times in your example.
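
The contract argued for can be sketched like this (plain Java; ConnectedFn is
a hypothetical stand-in, not the real DoFn API - only the per-instance
setup/process*/teardown ordering matters here):

```java
// Hypothetical fn holding an expensive resource across process() calls.
class ConnectedFn {
    private StringBuilder connection; // stands in for an expensive connection

    void setup() { connection = new StringBuilder("open"); }

    void process(String element) {
        // Reuses the connection opened in setup(): no per-element cost.
        connection.append(':').append(element);
    }

    String teardown() {
        String finalState = connection.toString();
        connection = null; // release the resource
        return finalState;
    }

    // The contract: setup, then all process() calls, then a reliable teardown,
    // all executed on the SAME instance.
    static String runLifecycle(String... elements) {
        ConnectedFn fn = new ConnectedFn();
        fn.setup();
        String state;
        try {
            for (String element : elements) {
                fn.process(element);
            }
        } finally {
            state = fn.teardown(); // reliable, not best-effort
        }
        return state;
    }
}
```

The whole point is that the state built in setup() stays usable through every
process() call and is released exactly once.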

Now why did I mention SDF several times? Because SDF is a generalisation of
both cases (source and dofn). Therefore it creates way more instances and
requires a far stricter/more explicit definition of the exact
lifecycle and of which instance does what. Since beam handles the full
lifecycle of the bean instances, it must provide init/destroy hooks
(setup/teardown) which can be stateful.

Take the JDBC example mentioned earlier. Today, because of the teardown
issue, it uses bundles. Since bundle size is not defined - and will not be
with SDF - it must use a pool to be able to reuse a connection instance and
not hurt performance. Now, with SDF and the increase in splits, how do you
handle the pool size? Generally in batch you use a single connection per
thread to avoid consuming all database connections. With a pool you have 2
choices: 1. use a pool of 1; 2. use a slightly bigger pool, but multiplied
by the number of beans you will likely x2 or x3 the connection count and
make the execution fail with "no more connection available". If you picked
1 (a pool of 1), then you still have to have a reliable teardown per pool
instance (close() generally) to ensure you release the pool and don't leak
the connection information in the JVM. In all cases you come back to the
init()/destroy() lifecycle, even if you fake getting connections with
bundles.
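A sketch of the "pool of 1" choice described above (plain Java with a fake connection standing in for JDBC; all names hypothetical): the connection is opened once per instance in setup, reused across bundles, and released in teardown - so if teardown is not guaranteed, close() never runs and the connection leaks.

```java
public class PoolOfOneSketch {
    // Tracks leaked connections: should be back to 0 after teardown.
    static int openConnections = 0;

    // Stand-in for a JDBC connection; a real one would talk to a database.
    static class FakeConnection implements AutoCloseable {
        FakeConnection() { openConnections++; }
        String query(String sql) { return "row-for:" + sql; }
        public void close() { openConnections--; }
    }

    static class JdbcLikeFn {
        private FakeConnection connection; // the "pool of 1"

        void setup() { connection = new FakeConnection(); }

        // Every bundle reuses the same connection instead of opening one
        // per bundle, which is the performance motivation for the pool.
        String processBundle(String sql) { return connection.query(sql); }

        void teardown() { connection.close(); } // the reliable release point
    }

    public static void main(String[] args) {
        JdbcLikeFn fn = new JdbcLikeFn();
        fn.setup();
        String r1 = fn.processBundle("select 1");
        String r2 = fn.processBundle("select 2");
        fn.teardown();
        System.out.println(r1 + " " + r2 + " open=" + openConnections);
    }
}
```

If the runtime skips teardown, nothing else in this design ever calls close(), which is exactly the leak (connections, AWS resources, money) the thread is about.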

Just to make it obvious: the SDF mentions are only because SDF amplifies
all the current issues caused by the loose definition of the bean
lifecycle, nothing else.



Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-18 22:32 GMT+01:00 Eugene Kirpichov <ki...@google.com>:

> The kind of whole-transform lifecycle you're mentioning can be
> accomplished using the Wait transform as I suggested in the thread above,
> and I believe it should become the canonical way to do that.
>
> (Would like to reiterate one more time, as the main author of most design
> documents related to SDF and of its implementation in the Java direct and
> dataflow runner that SDF is fully unrelated to the topic of cleanup - I'm
> very confused as to why it keeps coming up)
>
> On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> I kind of agree except transforms lack a lifecycle too. My understanding
>> is that sdf could be a way to unify it and clean the api.
>>
>> Otherwise how to normalize - single api -  lifecycle of transforms?
>>
>> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a écrit :
>>
>>> Are you sure that focusing on the cleanup of specific DoFn's is
>>> appropriate? Many cases where cleanup is necessary, it is around an entire
>>> composite PTransform. I think there have been discussions/proposals around
>>> a more methodical "cleanup" option, but those haven't been implemented, to
>>> the best of my knowledge.
>>>
>>> For instance, consider the steps of a FileIO:
>>> 1. Write to a bunch (N shards) of temporary files
>>> 2. When all temporary files are complete, attempt to do a bulk copy to
>>> put them in the final destination.
>>> 3. Cleanup all the temporary files.
>>>
>>> (This is often desirable because it minimizes the chance of seeing
>>> partial/incomplete results in the final destination).
>>>
>>> In the above, you'd want step 1 to execute on many workers, likely using
>>> a ParDo (say N different workers).
>>> The move step should only happen once, so on one worker. This means it
>>> will be a different DoFn, likely with some stuff done to ensure it runs on
>>> one worker.
>>>
>>> In such a case, cleanup / @TearDown of the DoFn is not enough. We need
>>> an API for a PTransform to schedule some cleanup work for when the
>>> transform is "done". In batch this is relatively straightforward, but
>>> doesn't exist. This is the source of some problems, such as BigQuery sink
>>> leaving files around that have failed to import into BigQuery.
>>>
>>> In streaming this is less straightforward -- do you want to wait until
>>> the end of the pipeline? Or do you want to wait until the end of the
>>> window? In practice, you just want to wait until you know nobody will need
>>> the resource anymore.
>>>
>>> This led to some discussions around a "cleanup" API, where you could
>>> have a transform that output resource objects. Each resource object would
>>> have logic for cleaning it up. And there would be something that indicated
>>> what parts of the pipeline needed that resource, and what kind of temporal
>>> lifetime those objects had. As soon as that part of the pipeline had
>>> advanced far enough that it would no longer need the resources, they would
>>> get cleaned up. This can be done at pipeline shutdown, or incrementally
>>> during a streaming pipeline, etc.
>>>
>>> Would something like this be a better fit for your use case? If not, why
>>> is handling teardown within a single DoFn sufficient?
>>>
>>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> Yes, 1M. Let me try to explain by simplifying the overall execution. Each
>>>> instance - one fn, so likely in a thread of a worker - has its lifecycle.
>>>> Caricaturally: "new" and garbage collection.
>>>>
>>>> In practice, "new" is often an unsafe allocate (deserialization) but it
>>>> doesn't matter here.
>>>>
>>>> What I want is for any "new" to be followed by a setup before any process
>>>> or startBundle, and, the last time Beam has the instance before it is
>>>> gc-ed and after the last finishBundle, for it to call teardown.
>>>>
>>>> It is as simple as that.
>>>> This way there is no need to combine fns in a way that makes a fn not
>>>> self-contained to implement basic transforms.
>>>>
>>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>>
>>>>>
>>>>>
>>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>>
>>>>>>
>>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a
>>>>>> écrit :
>>>>>>
>>>>>> It feels like this thread may be a bit off-track. Rather than focusing
>>>>>> on the semantics of the existing methods - which have been noted to
>>>>>> meet many existing use cases - it would be helpful to focus more on the
>>>>>> reason you are looking for something with different semantics.
>>>>>>
>>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>>
>>>>>> 1. Clean-up some external, global resource, that was initialized once
>>>>>> during the startup of the pipeline. If this is the case, how are you
>>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>>> streaming pipeline?
>>>>>>
>>>>>>
>>>>>> When the dofn is no more needed logically ie when the batch is done
>>>>>> or stream is stopped (manually or by a jvm shutdown)
>>>>>>
>>>>>
>>>>> I'm really not following what this means.
>>>>>
>>>>> Let's say that a pipeline is running 1000 workers, and each worker is
>>>>> running 1000 threads (each running a copy of the same DoFn). How many
>>>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>>>> you want it called? When the entire pipeline is shut down? When an
>>>>> individual worker is about to shut down (which may be temporary - may be
>>>>> about to start back up)? Something else?
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> 2. Finalize some resources that are used within some region of the
>>>>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>>> information about resources)
>>>>>>    b) "Require Deterministic Input" (to prevent retries from changing
>>>>>> resource IDs)
>>>>>>    c) ParDo that initializes the resources
>>>>>>    d) Pipeline segments that use the resources, and eventually output
>>>>>> the fact they're done
>>>>>>    e) "Require Deterministic Input"
>>>>>>    f) ParDo that frees the resources
>>>>>>
>>>>>> By making the use of the resource part of the data it is possible to
>>>>>> "checkpoint" which resources may be in use or have been finished by using
>>>>>> the require deterministic input. This is important to ensuring everything
>>>>>> is actually cleaned up.
>>>>>>
>>>>>>
>>>>>> I need that, but generic and not case by case, to industrialize some
>>>>>> APIs on top of Beam.
>>>>>>
>>>>>>
>>>>>>
>>>>>> 3. Some other use case that I may be missing? If it is this case,
>>>>>> could you elaborate on what you are trying to accomplish? That would help
>>>>>> me understand both the problems with existing options and possibly what
>>>>>> could be done to help.
>>>>>>
>>>>>>
>>>>>> I understand there are workarounds for almost all cases, but it means
>>>>>> each transform is different in its lifecycle handling. I dislike that a
>>>>>> lot at scale and as a user, since you can't put any unified practice on
>>>>>> top of Beam; it also makes Beam very hard to integrate or to use to
>>>>>> build higher-level libraries or software.
>>>>>>
>>>>>> This is why I tried not to start the workaround discussions and just
>>>>>> stay at the API level.
>>>>>>
>>>>>>
>>>>>>
>>>>>> -- Ben
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>
>>>>>>>> "Machine state" is overly low-level because many of the possible
>>>>>>>> reasons can happen on a perfectly fine machine.
>>>>>>>> If you'd like to rephrase it to "it will be called except in
>>>>>>>> various situations where it's logically impossible or impractical to
>>>>>>>> guarantee that it's called", that's fine. Or you can list some of the
>>>>>>>> examples above.
>>>>>>>>
>>>>>>>
>>>>>>> Sounds ok to me
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>>> pass-by-reference).
>>>>>>>>
>>>>>>>
>>>>>>> FinishBundle sadly has the exact same guarantee, so I'm not sure which
>>>>>>> other method you speak of. Concretely, if you make it really unreliable -
>>>>>>> this is what "best effort" sounds like to me - then users can't use it to
>>>>>>> clean anything; but if you make it "can happen, but it is unexpected and
>>>>>>> means something happened", then it is fine to have a manual - or auto if
>>>>>>> fancy - recovery procedure. This is where it makes all the difference and
>>>>>>> impacts the developers and ops (all users basically).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Agree Eugene except that "best effort" means that. It is also
>>>>>>>>> often used to say "at will" and this is what triggered this thread.
>>>>>>>>>
>>>>>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>>>>>> (like I did).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Romain Manni-Bucau
>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>
>>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>
>>>>>>>>> :
>>>>>>>>>
>>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>>
>>>>>>>>>> "Best effort" is the commonly used term to describe such
>>>>>>>>>> behavior. Please feel free to file bugs for cases where you observed a
>>>>>>>>>> runner not call Teardown in a situation where it was possible to call it
>>>>>>>>>> but the runner made insufficient effort.
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <
>>>>>>>>>>> kirpichov@google.com>:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a
>>>>>>>>>>>>> écrit :
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying
>>>>>>>>>>>>>> to write an IO for system $x and it requires the following initialization
>>>>>>>>>>>>>> and the following cleanup logic and the following processing in between")
>>>>>>>>>>>>>> I'll be better able to help you.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>>>>>> Using bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>>>>>>>> doesnt allow you to release the connection since it is a best effort thing.
>>>>>>>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>>>>>>>> to launch other processings - concurrent limit.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die so
>>>>>>>>>>>>> badly that @Teardown is not called then nothing else can be called to close
>>>>>>>>>>>>> the connection either. What AWS service are you thinking of that stays open
>>>>>>>>>>>>> for a long time when everything at the other end has died?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>>>>>> (proprietary) protocols require closing exchanges which are not
>>>>>>>>>>>>> only "I'm leaving".
>>>>>>>>>>>>>
>>>>>>>>>>>>> For aws i was thinking about starting some services - machines
>>>>>>>>>>>>> - on the fly in a pipeline startup and closing them at the end. If teardown
>>>>>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>>>>>> another way...as the full pipeline ;).
>>>>>>>>>>>>>
>>>>>>>>>>>>> I don't want to be picky, but if Beam can't handle its
>>>>>>>>>>>>> components' lifecycle, it can't be used at scale for generic
>>>>>>>>>>>>> pipelines and ends up bound to some particular IO.
>>>>>>>>>>>>>
>>>>>>>>>>>>> What prevents enforcing teardown - ignoring the interstellar
>>>>>>>>>>>>> crash case, which can't be handled by any human system? Nothing
>>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some
>>>>>>>>>>>>> legacy code in Dataflow or something else?
>>>>>>>>>>>>>
>>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Remove "best effort" from the javadoc. If it is not call then it
>>>>>>>>>>> is a bug and we are done :).
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Also, what does it mean for the users? The direct runner does
>>>>>>>>>>>>> it, so if a user uses the RI in tests, will he get a different
>>>>>>>>>>>>> behavior in prod? Also don't forget the user doesn't know what
>>>>>>>>>>>>> the IOs he composes use, so this is so impactful for the whole
>>>>>>>>>>>>> product that it must be handled IMHO.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I understand the portability culture is new in big data world
>>>>>>>>>>>>> but it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>>>>>> before doing right ;).
>>>>>>>>>>>>>
>>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - under
>>>>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if
>>>>>>>>>>>>> we can handle it, and only if there is a technical reason we
>>>>>>>>>>>>> can't do we make it experimental/unsupported in the API. I know
>>>>>>>>>>>>> Spark and Flink can; any unknown blocker for other runners?
>>>>>>>>>>>>>
>>>>>>>>>>>>> Technical note: even a kill should go through Java shutdown
>>>>>>>>>>>>> hooks, otherwise your environment (the Beam-enclosing software)
>>>>>>>>>>>>> is fully unhandled and your overall system is uncontrolled. The
>>>>>>>>>>>>> only case where that is not true is when the software is always
>>>>>>>>>>>>> owned by a vendor and never installed on a customer environment.
>>>>>>>>>>>>> In that case it belongs to the vendor to handle the Beam API, and
>>>>>>>>>>>>> not to Beam to adjust its API for a vendor - otherwise all
>>>>>>>>>>>>> features unsupported by one runner should be made optional, right?
>>>>>>>>>>>>>
>>>>>>>>>>>>> All state is not about network, even in distributed systems so
>>>>>>>>>>>>> this is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Kenn
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
The kind of whole-transform lifecycle you're mentioning can be accomplished
using the Wait transform as I suggested in the thread above, and I believe
it should become the canonical way to do that.

(Would like to reiterate one more time, as the main author of most design
documents related to SDF and of its implementation in the Java direct and
dataflow runner that SDF is fully unrelated to the topic of cleanup - I'm
very confused as to why it keeps coming up)

On Sun, Feb 18, 2018, 1:15 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> I kind of agree except transforms lack a lifecycle too. My understanding
> is that sdf could be a way to unify it and clean the api.
>
> Otherwise how to normalize - single api -  lifecycle of transforms?
>
> Le 18 févr. 2018 21:32, "Ben Chambers" <bc...@apache.org> a écrit :
>
>> Are you sure that focusing on the cleanup of specific DoFn's is
>> appropriate? Many cases where cleanup is necessary, it is around an entire
>> composite PTransform. I think there have been discussions/proposals around
>> a more methodical "cleanup" option, but those haven't been implemented, to
>> the best of my knowledge.
>>
>> For instance, consider the steps of a FileIO:
>> 1. Write to a bunch (N shards) of temporary files
>> 2. When all temporary files are complete, attempt to do a bulk copy to
>> put them in the final destination.
>> 3. Cleanup all the temporary files.
>>
>> (This is often desirable because it minimizes the chance of seeing
>> partial/incomplete results in the final destination).
>>
>> In the above, you'd want step 1 to execute on many workers, likely using
>> a ParDo (say N different workers).
>> The move step should only happen once, so on one worker. This means it
>> will be a different DoFn, likely with some stuff done to ensure it runs on
>> one worker.
>>
>> In such a case, cleanup / @TearDown of the DoFn is not enough. We need an
>> API for a PTransform to schedule some cleanup work for when the transform
>> is "done". In batch this is relatively straightforward, but doesn't exist.
>> This is the source of some problems, such as BigQuery sink leaving files
>> around that have failed to import into BigQuery.
>>
>> In streaming this is less straightforward -- do you want to wait until
>> the end of the pipeline? Or do you want to wait until the end of the
>> window? In practice, you just want to wait until you know nobody will need
>> the resource anymore.
>>
>> This led to some discussions around a "cleanup" API, where you could have
>> a transform that output resource objects. Each resource object would have
>> logic for cleaning it up. And there would be something that indicated what
>> parts of the pipeline needed that resource, and what kind of temporal
>> lifetime those objects had. As soon as that part of the pipeline had
>> advanced far enough that it would no longer need the resources, they would
>> get cleaned up. This can be done at pipeline shutdown, or incrementally
>> during a streaming pipeline, etc.
>>
>> Would something like this be a better fit for your use case? If not, why
>> is handling teardown within a single DoFn sufficient?
>>
>> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>> Yes 1M. Lets try to explain you simplifying the overall execution. Each
>>> instance - one fn so likely in a thread of a worker - has its lifecycle.
>>> Caricaturally: "new" and garbage collection.
>>>
>>> In practise, new is often an unsafe allocate (deserialization) but it
>>> doesnt matter here.
>>>
>>> What i want is any "new" to have a following setup before any process or
>>> stattbundle and the last time beam has the instance before it is gc-ed and
>>> after last finishbundle it calls teardown.
>>>
>>> It is as simple as it.
>>> This way no need to comibe fn in a way making a fn not self contained to
>>> implement basic transforms.
>>>
>>> Le 18 févr. 2018 20:07, "Reuven Lax" <re...@google.com> a écrit :
>>>
>>>>
>>>>
>>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>>
>>>>>
>>>>> Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a
>>>>> écrit :
>>>>>
>>>>> It feels like his thread may be a bit off-track. Rather than focusing
>>>>> on the semantics of the existing methods -- which have been noted to be
>>>>> meet many existing use cases -- it would be helpful to focus on more on the
>>>>> reason you are looking for something with different semantics.
>>>>>
>>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>>
>>>>> 1. Clean-up some external, global resource, that was initialized once
>>>>> during the startup of the pipeline. If this is the case, how are you
>>>>> ensuring it was really only initialized once (and not once per worker, per
>>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>>> release it? If the answer is "when it reaches step X", then what about a
>>>>> streaming pipeline?
>>>>>
>>>>>
>>>>> When the dofn is no more needed logically ie when the batch is done or
>>>>> stream is stopped (manually or by a jvm shutdown)
>>>>>
>>>>
>>>> I'm really not following what this means.
>>>>
>>>> Let's say that a pipeline is running 1000 workers, and each worker is
>>>> running 1000 threads (each running a copy of the same DoFn). How many
>>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>>> you want it called? When the entire pipeline is shut down? When an
>>>> individual worker is about to shut down (which may be temporary - may be
>>>> about to start back up)? Something else?
>>>>
>>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>> 2. Finalize some resources that are used within some region of the
>>>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>>> (they are focused on managing resources within the DoFn), you could model
>>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>>> information about resources)
>>>>>    b) "Require Deterministic Input" (to prevent retries from changing
>>>>> resource IDs)
>>>>>    c) ParDo that initializes the resources
>>>>>    d) Pipeline segments that use the resources, and eventually output
>>>>> the fact they're done
>>>>>    e) "Require Deterministic Input"
>>>>>    f) ParDo that frees the resources
>>>>>
>>>>> By making the use of the resource part of the data it is possible to
>>>>> "checkpoint" which resources may be in use or have been finished by using
>>>>> the require deterministic input. This is important to ensuring everything
>>>>> is actually cleaned up.
>>>>>
>>>>>
>>>>> I nees that but generic and not case by case to industrialize some api
>>>>> on top of beam.
>>>>>
>>>>>
>>>>>
>>>>> 3. Some other use case that I may be missing? If it is this case,
>>>>> could you elaborate on what you are trying to accomplish? That would help
>>>>> me understand both the problems with existing options and possibly what
>>>>> could be done to help.
>>>>>
>>>>>
>>>>> I understand there are sorkaround for almost all cases but means each
>>>>> transform is different in its lifecycle handling  except i dislike it a lot
>>>>> at a scale and as a user since you cant put any unified practise on top of
>>>>> beam, it also makes beam very hard to integrate or to use to build higher
>>>>> level libraries or softwares.
>>>>>
>>>>> This is why i tried to not start the workaround discussions and just
>>>>> stay at API level.
>>>>>
>>>>>
>>>>>
>>>>> -- Ben
>>>>>
>>>>>
>>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>
>>>>>>> "Machine state" is overly low-level because many of the possible
>>>>>>> reasons can happen on a perfectly fine machine.
>>>>>>> If you'd like to rephrase it to "it will be called except in various
>>>>>>> situations where it's logically impossible or impractical to guarantee that
>>>>>>> it's called", that's fine. Or you can list some of the examples above.
>>>>>>>
>>>>>>
>>>>>> Sounds ok to me
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>>> etc), you have to express it using one of the other methods that have
>>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>>> pass-by-reference).
>>>>>>>
>>>>>>
>>>>>> FinishBundle has the exact same guarantee sadly so not which which
>>>>>> other method you speak about. Concretely if you make it really unreliable -
>>>>>> this is what best effort sounds to me - then users can use it to clean
>>>>>> anything but if you make it "can happen but it is unexpected and means
>>>>>> something happent" then it is fine to have a manual - or auto if fancy -
>>>>>> recovery procedure. This is where it makes all the difference and impacts
>>>>>> the developpers, ops (all users basically).
>>>>>>
>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> Agree Eugene except that "best effort" means that. It is also often
>>>>>>>> used to say "at will" and this is what triggered this thread.
>>>>>>>>
>>>>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>>>>> (like I did).
>>>>>>>>
>>>>>>>>
>>>>>>>> Romain Manni-Bucau
>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>
>>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>>
>>>>>>>>> It will not be called if it's impossible to call it: in the
>>>>>>>>> example situation you have (intergalactic crash), and in a number of more
>>>>>>>>> common cases: eg in case the worker container has crashed (eg user code in
>>>>>>>>> a different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>>
>>>>>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>>>>>> call Teardown in a situation where it was possible to call it but the
>>>>>>>>> runner made insufficient effort.
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <kirpichov@google.com
>>>>>>>>>> >:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a
>>>>>>>>>>>> écrit :
>>>>>>>>>>>>
>>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying
>>>>>>>>>>>>> to write an IO for system $x and it requires the following initialization
>>>>>>>>>>>>> and the following cleanup logic and the following processing in between")
>>>>>>>>>>>>> I'll be better able to help you.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>>>>> Using bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>>>>>>> doesnt allow you to release the connection since it is a best effort thing.
>>>>>>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>>>>>>> to launch other processings - concurrent limit.
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> For this example @Teardown is an exact fit. If things die so
>>>>>>>>>>>> badly that @Teardown is not called then nothing else can be called to close
>>>>>>>>>>>> the connection either. What AWS service are you thinking of that stays open
>>>>>>>>>>>> for a long time when everything at the other end has died?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>>> (proprietary) protocols requires some closing exchanges which are not only
>>>>>>>>>>>> "im leaving".
>>>>>>>>>>>>
>>>>>>>>>>>> For aws i was thinking about starting some services - machines
>>>>>>>>>>>> - on the fly in a pipeline startup and closing them at the end. If teardown
>>>>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>>>>> another way...as the full pipeline ;).
>>>>>>>>>>>>
>>>>>>>>>>>> I dont want to be picky but if beam cant handle its components
>>>>>>>>>>>> lifecycle it can be used at scale for generic pipelines and if bound to
>>>>>>>>>>>> some particular IO.
>>>>>>>>>>>>
>>>>>>>>>>>> What does prevent to enforce teardown - ignoring the
>>>>>>>>>>>> interstellar crash case which cant be handled by any human system? Nothing
>>>>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>>
>>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Remove "best effort" from the javadoc. If it is not call then it
>>>>>>>>>> is a bug and we are done :).
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>> Also what does it mean for the users? Direct runner does it so
>>>>>>>>>>>> if a user udes the RI in test, he will get a different behavior in prod?
>>>>>>>>>>>> Also dont forget the user doesnt know what the IOs he composes use so this
>>>>>>>>>>>> is so impacting for the whole product than he must be handled IMHO.
>>>>>>>>>>>>
>>>>>>>>>>>> I understand the portability culture is new in big data world
>>>>>>>>>>>> but it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>>>>> before doing right ;).
>>>>>>>>>>>>
>>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>>
>>>>>>>>>>>> Technical note: even a kill should go through java shutdown
>>>>>>>>>>>> hooks, otherwise your environment (beam enclosing software) is fully
>>>>>>>>>>>> unhandled and your overall system is uncontrolled. The only case where this
>>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>>> installed on a customer environment. In this case it belongs to the vendor
>>>>>>>>>>>> to handle the beam API and not to beam to adjust its API for a vendor -
>>>>>>>>>>>> otherwise all features unsupported by one runner should be made optional, right?
>>>>>>>>>>>>
>>>>>>>>>>>> Not all state is about the network, even in distributed systems,
>>>>>>>>>>>> so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Kenn
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>
>>>>>
>>>>

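The shutdown-hook point made in the quoted message can be sketched in plain Java; this is a minimal illustration (not Beam code, and `registerCleanupHook` is an illustrative name, not a real API):

```java
// Sketch: a JVM shutdown hook runs cleanup on normal termination and on
// SIGTERM ("kill"), but NOT on SIGKILL, a segfault, or a JVM crash -
// which is exactly the boundary any teardown contract has to live with.
public class ShutdownHookExample {

    /** Registers a cleanup action as a shutdown hook and returns the hook thread. */
    public static Thread registerCleanupHook(Runnable cleanup) {
        Thread hook = new Thread(cleanup, "teardown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        registerCleanupHook(() -> System.out.println("closing connections..."));
        // ... pipeline work would run here; the hook fires when the JVM exits ...
    }
}
```

This is also why a `kill -9` or hardware failure is outside what any teardown contract can promise: the hook simply never runs.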
Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
I kind of agree, except transforms lack a lifecycle too. My understanding is
that SDF could be a way to unify it and clean up the API.

Otherwise, how do we normalize - with a single API - the lifecycle of transforms?

On 18 Feb 2018 21:32, "Ben Chambers" <bc...@apache.org> wrote:

> Are you sure that focusing on the cleanup of specific DoFns is
> appropriate? In many cases where cleanup is necessary, it is around an entire
> composite PTransform. I think there have been discussions/proposals around
> a more methodical "cleanup" option, but those haven't been implemented, to
> the best of my knowledge.
>
> For instance, consider the steps of a FileIO:
> 1. Write to a bunch (N shards) of temporary files
> 2. When all temporary files are complete, attempt to do a bulk copy to put
> them in the final destination.
> 3. Cleanup all the temporary files.
>
> (This is often desirable because it minimizes the chance of seeing
> partial/incomplete results in the final destination).
>
> In the above, you'd want step 1 to execute on many workers, likely using a
> ParDo (say N different workers).
> The move step should only happen once, so on one worker. This means it
> will be a different DoFn, likely with some stuff done to ensure it runs on
> one worker.
>
> In such a case, cleanup / @TearDown of the DoFn is not enough. We need an
> API for a PTransform to schedule some cleanup work for when the transform
> is "done". In batch this is relatively straightforward, but doesn't exist.
> This is the source of some problems, such as BigQuery sink leaving files
> around that have failed to import into BigQuery.
>
> In streaming this is less straightforward -- do you want to wait until the
> end of the pipeline? Or do you want to wait until the end of the window? In
> practice, you just want to wait until you know nobody will need the
> resource anymore.
>
> This led to some discussions around a "cleanup" API, where you could have
> a transform that output resource objects. Each resource object would have
> logic for cleaning it up. And there would be something that indicated what
> parts of the pipeline needed that resource, and what kind of temporal
> lifetime those objects had. As soon as that part of the pipeline had
> advanced far enough that it would no longer need the resources, they would
> get cleaned up. This can be done at pipeline shutdown, or incrementally
> during a streaming pipeline, etc.
>
> Would something like this be a better fit for your use case? If not, why
> is handling teardown within a single DoFn sufficient?
>
> On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Yes, 1M. Let's try to explain by simplifying the overall execution. Each
>> instance - one fn, so likely in a thread of a worker - has its lifecycle.
>> Caricaturally: "new" and garbage collection.
>>
>> In practice, "new" is often an unsafe allocate (deserialization) but it
>> doesn't matter here.
>>
>> What I want is: any "new" is followed by a setup before any process or
>> startBundle, and the last time beam holds the instance - before it is gc-ed
>> and after the last finishBundle - it calls teardown.
>>
>> It is as simple as that.
>> This way there is no need to combine fns in a way that makes a fn not
>> self-contained to implement basic transforms.
>>
>> On 18 Feb 2018 20:07, "Reuven Lax" <re...@google.com> wrote:
>>
>>>
>>>
>>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> On 18 Feb 2018 19:28, "Ben Chambers" <bc...@apache.org> wrote:
>>>>
>>>> It feels like this thread may be a bit off-track. Rather than focusing
>>>> on the semantics of the existing methods -- which have been noted to
>>>> meet many existing use cases -- it would be helpful to focus more on the
>>>> reason you are looking for something with different semantics.
>>>>
>>>> Some possibilities (I'm not sure which one you are trying to do):
>>>>
>>>> 1. Clean-up some external, global resource, that was initialized once
>>>> during the startup of the pipeline. If this is the case, how are you
>>>> ensuring it was really only initialized once (and not once per worker, per
>>>> thread, per instance, etc.)? How do you know when the pipeline should
>>>> release it? If the answer is "when it reaches step X", then what about a
>>>> streaming pipeline?
>>>>
>>>>
>>>> When the dofn is no more needed logically ie when the batch is done or
>>>> stream is stopped (manually or by a jvm shutdown)
>>>>
>>>
>>> I'm really not following what this means.
>>>
>>> Let's say that a pipeline is running 1000 workers, and each worker is
>>> running 1000 threads (each running a copy of the same DoFn). How many
>>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>>> you want it called? When the entire pipeline is shut down? When an
>>> individual worker is about to shut down (which may be temporary - may be
>>> about to start back up)? Something else?
>>>
>>>
>>>
>>>>
>>>>
>>>>
>>>> 2. Finalize some resources that are used within some region of the
>>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>>> (they are focused on managing resources within the DoFn), you could model
>>>> this on how FileIO finalizes the files that it produced. For instance:
>>>>    a) ParDo generates "resource IDs" (or some token that stores
>>>> information about resources)
>>>>    b) "Require Deterministic Input" (to prevent retries from changing
>>>> resource IDs)
>>>>    c) ParDo that initializes the resources
>>>>    d) Pipeline segments that use the resources, and eventually output
>>>> the fact they're done
>>>>    e) "Require Deterministic Input"
>>>>    f) ParDo that frees the resources
>>>>
>>>> By making the use of the resource part of the data it is possible to
>>>> "checkpoint" which resources may be in use or have been finished by using
>>>> the require deterministic input. This is important to ensuring everything
>>>> is actually cleaned up.
>>>>
>>>>
>>>> I need that, but generic and not case by case, to industrialize some api
>>>> on top of beam.
>>>>
>>>>
>>>>
>>>> 3. Some other use case that I may be missing? If it is this case, could
>>>> you elaborate on what you are trying to accomplish? That would help me
>>>> understand both the problems with existing options and possibly what could
>>>> be done to help.
>>>>
>>>>
>>>> I understand there are workarounds for almost all cases but that means
>>>> each transform is different in its lifecycle handling, which I dislike a
>>>> lot at scale and as a user since you can't put any unified practice on top
>>>> of beam; it also makes beam very hard to integrate or to use to build
>>>> higher level libraries or software.
>>>>
>>>> This is why i tried to not start the workaround discussions and just
>>>> stay at API level.
>>>>
>>>>
>>>>
>>>> -- Ben
>>>>
>>>>
>>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>> "Machine state" is overly low-level because many of the possible
>>>>>> reasons can happen on a perfectly fine machine.
>>>>>> If you'd like to rephrase it to "it will be called except in various
>>>>>> situations where it's logically impossible or impractical to guarantee that
>>>>>> it's called", that's fine. Or you can list some of the examples above.
>>>>>>
>>>>>
>>>>> Sounds ok to me
>>>>>
>>>>>
>>>>>>
>>>>>> The main point for the user is, you *will* see non-preventable
>>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>>> etc), you have to express it using one of the other methods that have
>>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>>> pass-by-reference).
>>>>>>
>>>>>
>>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure which
>>>>> other method you speak about. Concretely, if you make it really unreliable -
>>>>> this is what "best effort" sounds like to me - then users can't use it to
>>>>> clean anything, but if you make it "can happen but it is unexpected and
>>>>> means something happened" then it is fine to have a manual - or auto if
>>>>> fancy - recovery procedure. This is where it makes all the difference and
>>>>> impacts the developers and ops (all users basically).
>>>>>
>>>>>
>>>>>>
>>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> Agree Eugene except that "best effort" means that. It is also often
>>>>>>> used to say "at will" and this is what triggered this thread.
>>>>>>>
>>>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>>>> (like I did).
>>>>>>>
>>>>>>>
>>>>>>> Romain Manni-Bucau
>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>
>>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>
>>>>>>>> It will not be called if it's impossible to call it: in the example
>>>>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>>
>>>>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>>>>> call Teardown in a situation where it was possible to call it but the
>>>>>>>> runner made insufficient effort.
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>
>>>>>>>>> :
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <kl...@google.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying
>>>>>>>>>>>> to write an IO for system $x and it requires the following initialization
>>>>>>>>>>>> and the following cleanup logic and the following processing in between")
>>>>>>>>>>>> I'll be better able to help you.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>>>> Using bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>>>>>> doesn't allow you to release the connection since it is a best effort thing.
>>>>>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>>>>>> from launching other processing - concurrent limit.
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> For this example @Teardown is an exact fit. If things die so
>>>>>>>>>>> badly that @Teardown is not called then nothing else can be called to close
>>>>>>>>>>> the connection either. What AWS service are you thinking of that stays open
>>>>>>>>>>> for a long time when everything at the other end has died?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>>> (proprietary) protocols require closing exchanges which are more than
>>>>>>>>>>> just "I'm leaving".
>>>>>>>>>>>
>>>>>>>>>>> For aws i was thinking about starting some services - machines -
>>>>>>>>>>> on the fly in a pipeline startup and closing them at the end. If teardown
>>>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>>>> another way...as the full pipeline ;).
>>>>>>>>>>>
>>>>>>>>>>> I don't want to be picky but if beam can't handle its components'
>>>>>>>>>>> lifecycle it can't be used at scale for generic pipelines and ends up
>>>>>>>>>>> bound to some particular IO.
>>>>>>>>>>>
>>>>>>>>>>> What prevents enforcing teardown - ignoring the
>>>>>>>>>>> interstellar crash case which can't be handled by any human system? Nothing
>>>>>>>>>>> technically. Why do you push not to handle it? Is it due to some legacy
>>>>>>>>>>> code on dataflow or something else?
>>>>>>>>>>>
>>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Remove "best effort" from the javadoc. If it is not called then it
>>>>>>>>> is a bug and we are done :).
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Also what does it mean for the users? The direct runner does it,
>>>>>>>>>>> so if a user uses the RI in tests, he will get a different behavior in prod?
>>>>>>>>>>> Also don't forget the user doesn't know what the IOs he composes use, so this
>>>>>>>>>>> is so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>>>
>>>>>>>>>>> I understand the portability culture is new in big data world
>>>>>>>>>>> but it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>>>> before doing right ;).
>>>>>>>>>>>
>>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>>
>>>>>>>>>>> Technical note: even a kill should go through java shutdown
>>>>>>>>>>> hooks, otherwise your environment (beam enclosing software) is fully
>>>>>>>>>>> unhandled and your overall system is uncontrolled. The only case where this
>>>>>>>>>>> is not true is when the software is always owned by a vendor and never
>>>>>>>>>>> installed on a customer environment. In this case it belongs to the vendor
>>>>>>>>>>> to handle the beam API and not to beam to adjust its API for a vendor -
>>>>>>>>>>> otherwise all features unsupported by one runner should be made optional, right?
>>>>>>>>>>>
>>>>>>>>>>> Not all state is about the network, even in distributed systems,
>>>>>>>>>>> so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Kenn
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>
>>>>
>>>

Re: @TearDown guarantees

Posted by Ben Chambers <bc...@apache.org>.
Are you sure that focusing on the cleanup of specific DoFns is
appropriate? In many cases where cleanup is necessary, it is around an entire
composite PTransform. I think there have been discussions/proposals around
a more methodical "cleanup" option, but those haven't been implemented, to
the best of my knowledge.

For instance, consider the steps of a FileIO:
1. Write to a bunch (N shards) of temporary files
2. When all temporary files are complete, attempt to do a bulk copy to put
them in the final destination.
3. Cleanup all the temporary files.

(This is often desirable because it minimizes the chance of seeing
partial/incomplete results in the final destination).
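The three steps above can be sketched with plain java.nio file operations; this is a model of the write-temp-then-move pattern, not Beam's actual FileIO implementation, and `writeAtomically` is an illustrative name:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Sketch of the pattern: readers never see partial output because content
// lands in the final destination only via a single atomic move.
public class AtomicWriteExample {

    /** Writes content to a temp file next to dest, then moves it into place. */
    public static void writeAtomically(Path dest, byte[] content) throws IOException {
        Path tmp = Files.createTempFile(dest.getParent(), "shard-", ".tmp");
        try {
            Files.write(tmp, content);                             // 1. write to a temporary file
            Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE); // 2. move to the final destination
        } finally {
            Files.deleteIfExists(tmp);                             // 3. clean up leftovers on failure
        }
    }
}
```

The `finally` block is exactly the cleanup that has no home when it must instead run once for a whole composite transform rather than per call.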

In the above, you'd want step 1 to execute on many workers, likely using a
ParDo (say N different workers).
The move step should only happen once, so on one worker. This means it will
be a different DoFn, likely with some stuff done to ensure it runs on one
worker.

In such a case, cleanup / @TearDown of the DoFn is not enough. We need an
API for a PTransform to schedule some cleanup work for when the transform
is "done". In batch this is relatively straightforward, but doesn't exist.
This is the source of some problems, such as BigQuery sink leaving files
around that have failed to import into BigQuery.

In streaming this is less straightforward -- do you want to wait until the
end of the pipeline? Or do you want to wait until the end of the window? In
practice, you just want to wait until you know nobody will need the
resource anymore.

This led to some discussions around a "cleanup" API, where you could have a
transform that output resource objects. Each resource object would have
logic for cleaning it up. And there would be something that indicated what
parts of the pipeline needed that resource, and what kind of temporal
lifetime those objects had. As soon as that part of the pipeline had
advanced far enough that it would no longer need the resources, they would
get cleaned up. This can be done at pipeline shutdown, or incrementally
during a streaming pipeline, etc.
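One way the described semantics could look is sketched below. Note this cleanup API does not exist in Beam; every name here (`Resource`, `register`, `release`) is hypothetical, and only the refcount-until-unneeded idea comes from the paragraph above:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: each resource carries its own cleanup logic and is
// freed as soon as the last part of the pipeline that needs it releases it,
// which can happen incrementally, long before pipeline shutdown.
public class ResourceCleanupSketch {

    /** A resource that knows how to clean itself up. */
    public interface Resource {
        void cleanup();
    }

    private final Map<Resource, AtomicInteger> refCounts = new ConcurrentHashMap<>();

    /** Declares that `users` parts of the pipeline will need this resource. */
    public void register(Resource r, int users) {
        refCounts.put(r, new AtomicInteger(users));
    }

    /** Called when one user is done; cleans up once nobody needs it anymore. */
    public void release(Resource r) {
        AtomicInteger count = refCounts.get(r);
        if (count != null && count.decrementAndGet() == 0) {
            refCounts.remove(r);
            r.cleanup();
        }
    }
}
```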

Would something like this be a better fit for your use case? If not, why is
handling teardown within a single DoFn sufficient?

On Sun, Feb 18, 2018 at 11:53 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Yes 1M. Lets try to explain you simplifying the overall execution. Each
> instance - one fn so likely in a thread of a worker - has its lifecycle.
> Caricaturally: "new" and garbage collection.
>
> In practise, new is often an unsafe allocate (deserialization) but it
> doesnt matter here.
>
> What i want is any "new" to have a following setup before any process or
> stattbundle and the last time beam has the instance before it is gc-ed and
> after last finishbundle it calls teardown.
>
> It is as simple as it.
> This way no need to comibe fn in a way making a fn not self contained to
> implement basic transforms.
>
> On 18 Feb 2018 20:07, "Reuven Lax" <re...@google.com> wrote:
>
>>
>>
>> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>>
>>>
>>> On 18 Feb 2018 19:28, "Ben Chambers" <bc...@apache.org> wrote:
>>>
>>> It feels like this thread may be a bit off-track. Rather than focusing on
>>> the semantics of the existing methods -- which have been noted to meet
>>> many existing use cases -- it would be helpful to focus more on the
>>> reason you are looking for something with different semantics.
>>>
>>> Some possibilities (I'm not sure which one you are trying to do):
>>>
>>> 1. Clean-up some external, global resource, that was initialized once
>>> during the startup of the pipeline. If this is the case, how are you
>>> ensuring it was really only initialized once (and not once per worker, per
>>> thread, per instance, etc.)? How do you know when the pipeline should
>>> release it? If the answer is "when it reaches step X", then what about a
>>> streaming pipeline?
>>>
>>>
>>> When the dofn is no more needed logically ie when the batch is done or
>>> stream is stopped (manually or by a jvm shutdown)
>>>
>>
>> I'm really not following what this means.
>>
>> Let's say that a pipeline is running 1000 workers, and each worker is
>> running 1000 threads (each running a copy of the same DoFn). How many
>> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
>> you want it called? When the entire pipeline is shut down? When an
>> individual worker is about to shut down (which may be temporary - may be
>> about to start back up)? Something else?
>>
>>
>>
>>>
>>>
>>>
>>> 2. Finalize some resources that are used within some region of the
>>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>>> (they are focused on managing resources within the DoFn), you could model
>>> this on how FileIO finalizes the files that it produced. For instance:
>>>    a) ParDo generates "resource IDs" (or some token that stores
>>> information about resources)
>>>    b) "Require Deterministic Input" (to prevent retries from changing
>>> resource IDs)
>>>    c) ParDo that initializes the resources
>>>    d) Pipeline segments that use the resources, and eventually output
>>> the fact they're done
>>>    e) "Require Deterministic Input"
>>>    f) ParDo that frees the resources
>>>
>>> By making the use of the resource part of the data it is possible to
>>> "checkpoint" which resources may be in use or have been finished by using
>>> the require deterministic input. This is important to ensuring everything
>>> is actually cleaned up.
>>>
>>>
>>> I need that, but generic and not case by case, to industrialize some api
>>> on top of beam.
>>>
>>>
>>>
>>> 3. Some other use case that I may be missing? If it is this case, could
>>> you elaborate on what you are trying to accomplish? That would help me
>>> understand both the problems with existing options and possibly what could
>>> be done to help.
>>>
>>>
>>> I understand there are workarounds for almost all cases but that means
>>> each transform is different in its lifecycle handling, which I dislike a
>>> lot at scale and as a user since you can't put any unified practice on top
>>> of beam; it also makes beam very hard to integrate or to use to build
>>> higher level libraries or software.
>>>
>>> This is why i tried to not start the workaround discussions and just
>>> stay at API level.
>>>
>>>
>>>
>>> -- Ben
>>>
>>>
>>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>
>>>>> "Machine state" is overly low-level because many of the possible
>>>>> reasons can happen on a perfectly fine machine.
>>>>> If you'd like to rephrase it to "it will be called except in various
>>>>> situations where it's logically impossible or impractical to guarantee that
>>>>> it's called", that's fine. Or you can list some of the examples above.
>>>>>
>>>>
>>>> Sounds ok to me
>>>>
>>>>
>>>>>
>>>>> The main point for the user is, you *will* see non-preventable
>>>>> situations where it couldn't be called - it's not just intergalactic
>>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>>> amount of temporary files, shutting down a large number of VMs you started
>>>>> etc), you have to express it using one of the other methods that have
>>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>>> pass-by-reference).
>>>>>
>>>>
>>>> FinishBundle has the exact same guarantee sadly, so I'm not sure which
>>>> other method you speak about. Concretely, if you make it really unreliable -
>>>> this is what "best effort" sounds like to me - then users can't use it to
>>>> clean anything, but if you make it "can happen but it is unexpected and
>>>> means something happened" then it is fine to have a manual - or auto if
>>>> fancy - recovery procedure. This is where it makes all the difference and
>>>> impacts the developers and ops (all users basically).
>>>>
>>>>
>>>>>
>>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> Agree Eugene except that "best effort" means that. It is also often
>>>>>> used to say "at will" and this is what triggered this thread.
>>>>>>
>>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>>> (like I did).
>>>>>>
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>
>>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>
>>>>>>> It will not be called if it's impossible to call it: in the example
>>>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>>
>>>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>>>> call Teardown in a situation where it was possible to call it but the
>>>>>>> runner made insufficient effort.
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <kl...@google.com> wrote:
>>>>>>>>>>
>>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>>>>> be better able to help you.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>>> Using bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>>>>> doesn't allow you to release the connection since it is a best effort thing.
>>>>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>>>>> from launching other processing - concurrent limit.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> For this example @Teardown is an exact fit. If things die so
>>>>>>>>>> badly that @Teardown is not called then nothing else can be called to close
>>>>>>>>>> the connection either. What AWS service are you thinking of that stays open
>>>>>>>>>> for a long time when everything at the other end has died?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>>> (proprietary) protocols require closing exchanges which are more than
>>>>>>>>>> just "I'm leaving".
>>>>>>>>>>
>>>>>>>>>> For aws i was thinking about starting some services - machines -
>>>>>>>>>> on the fly in a pipeline startup and closing them at the end. If teardown
>>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>>> another way...as the full pipeline ;).
>>>>>>>>>>
>>>>>>>>>> I don't want to be picky but if beam can't handle its components'
>>>>>>>>>> lifecycle it can't be used at scale for generic pipelines and ends up
>>>>>>>>>> bound to some particular IO.
>>>>>>>>>>
>>>>>>>>>> What prevents enforcing teardown - ignoring the interstellar
>>>>>>>>>> crash case which can't be handled by any human system? Nothing technically.
>>>>>>>>>> Why do you push not to handle it? Is it due to some legacy code on dataflow
>>>>>>>>>> or something else?
>>>>>>>>>>
>>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Remove "best effort" from the javadoc. If it is not called then it is
>>>>>>>> a bug and we are done :).
>>>>>>>>
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Also what does it mean for the users? The direct runner does it, so
>>>>>>>>>> if a user uses the RI in tests, he will get a different behavior in prod?
>>>>>>>>>> Also don't forget the user doesn't know what the IOs he composes use, so
>>>>>>>>>> this is so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>>
>>>>>>>>>> I understand the portability culture is new in big data world but
>>>>>>>>>> it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>>> before doing right ;).
>>>>>>>>>>
>>>>>>>>>> My proposal is to list what can prevent guaranteeing - in
>>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>>> handle it, and only if there is a technical reason we can't do we make it
>>>>>>>>>> experimental/unsupported in the api. I know spark and flink can; any
>>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>>
>>>>>>>>>> Technical note: even a kill should go through java shutdown hooks,
>>>>>>>>>> otherwise your environment (beam enclosing software) is fully unhandled and
>>>>>>>>>> your overall system is uncontrolled. The only case where this is not true is
>>>>>>>>>> when the software is always owned by a vendor and never installed on a
>>>>>>>>>> customer environment. In this case it belongs to the vendor to handle the
>>>>>>>>>> beam API and not to beam to adjust its API for a vendor - otherwise all
>>>>>>>>>> features unsupported by one runner should be made optional, right?
>>>>>>>>>>
>>>>>>>>>> Not all state is about the network, even in distributed systems,
>>>>>>>>>> so it is key to have an explicit and defined lifecycle.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Kenn
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>
>>>
>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Yes, 1M. Let me try to explain by simplifying the overall execution. Each
instance - one fn, so likely in a thread of a worker - has its lifecycle.
Caricaturally: "new" and garbage collection.

In practice, "new" is often an unsafe allocation (deserialization), but that
doesn't matter here.

What I want is for any "new" to be followed by a setup before any
processElement or startBundle, and for teardown to be called the last time
Beam holds the instance, after the last finishBundle and before it is GC-ed.

It is as simple as that.
This way there is no need to combine fns in a way that makes a fn not
self-contained just to implement basic transforms.
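The lifecycle described above can be sketched in plain Java. These classes are illustrative stand-ins, not Beam's actual DoFn machinery; the method names merely mirror Beam's lifecycle annotations:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the contract argued for: every instance that is created gets
// setup() before any bundle work, and teardown() exactly once after the
// last finishBundle(), before the instance is released.
public class LifecycleSketch {
    static class Fn {
        final List<String> calls = new ArrayList<>();
        void setup()        { calls.add("setup"); }        // open connection here
        void startBundle()  { calls.add("startBundle"); }
        void process(int e) { calls.add("process:" + e); }
        void finishBundle() { calls.add("finishBundle"); }
        void teardown()     { calls.add("teardown"); }     // close connection here
    }

    // What a runner honoring the contract would do for one instance.
    static Fn runInstance(int[][] bundles) {
        Fn fn = new Fn();               // "new" (or unsafe deserialization)
        fn.setup();
        try {
            for (int[] bundle : bundles) {
                fn.startBundle();
                for (int e : bundle) fn.process(e);
                fn.finishBundle();
            }
        } finally {
            fn.teardown();              // last call before the instance is GC-ed
        }
        return fn;
    }
}
```

The `finally` is the crux of the debate: a runner that honors the contract calls teardown on every path it controls, leaving only hard crashes uncovered.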

On 18 Feb 2018 20:07, "Reuven Lax" <re...@google.com> wrote:

>
>
> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
> rmannibucau@gmail.com> wrote:
>
>>
>>
>> On 18 Feb 2018 19:28, "Ben Chambers" <bc...@apache.org> wrote:
>>
>> It feels like this thread may be a bit off-track. Rather than focusing on
>> the semantics of the existing methods -- which have been noted to meet
>> many existing use cases -- it would be helpful to focus more on the
>> reason you are looking for something with different semantics.
>>
>> Some possibilities (I'm not sure which one you are trying to do):
>>
>> 1. Clean-up some external, global resource, that was initialized once
>> during the startup of the pipeline. If this is the case, how are you
>> ensuring it was really only initialized once (and not once per worker, per
>> thread, per instance, etc.)? How do you know when the pipeline should
>> release it? If the answer is "when it reaches step X", then what about a
>> streaming pipeline?
>>
>>
>> When the DoFn is logically no longer needed, i.e. when the batch is done or
>> the stream is stopped (manually or by a JVM shutdown)
>>
>
> I'm really not following what this means.
>
> Let's say that a pipeline is running 1000 workers, and each worker is
> running 1000 threads (each running a copy of the same DoFn). How many
> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
> you want it called? When the entire pipeline is shut down? When an
> individual worker is about to shut down (which may be temporary - may be
> about to start back up)? Something else?
>
>
>
>>
>>
>>
>> 2. Finalize some resources that are used within some region of the
>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>> (they are focused on managing resources within the DoFn), you could model
>> this on how FileIO finalizes the files that it produced. For instance:
>>    a) ParDo generates "resource IDs" (or some token that stores
>> information about resources)
>>    b) "Require Deterministic Input" (to prevent retries from changing
>> resource IDs)
>>    c) ParDo that initializes the resources
>>    d) Pipeline segments that use the resources, and eventually output the
>> fact they're done
>>    e) "Require Deterministic Input"
>>    f) ParDo that frees the resources
>>
>> By making the use of the resource part of the data it is possible to
>> "checkpoint" which resources may be in use or have been finished by using
>> the require deterministic input. This is important to ensuring everything
>> is actually cleaned up.
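The a) to f) steps quoted above can be sketched, very roughly, in plain Java. This is a hypothetical stand-in, not a real Beam pipeline: sets and lists stand in for PCollections, and the names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Sketch of the pattern Ben describes: make the resource itself part of the
// data, so "create" and "free" are ordinary transforms and cleanup can be
// checkpointed like any other output.
public class ResourceIdSketch {
    // a) generate deterministic resource IDs from the input
    static List<String> generateIds(List<String> inputs) {
        return inputs.stream().map(in -> "res-" + in).collect(Collectors.toList());
    }
    // c) initialize the resources (a set stands in for AWS machines etc.)
    static Set<String> initialize(List<String> ids) { return new HashSet<>(ids); }
    // d) use the resources, emitting "done" tokens as data
    static List<String> use(Set<String> live) { return new ArrayList<>(live); }
    // f) free every resource whose "done" token arrived
    static Set<String> free(Set<String> live, List<String> done) {
        done.forEach(live::remove);
        return live;   // anything left here is a leak you can detect and retry
    }

    static boolean demo(List<String> inputs) {
        Set<String> live = initialize(generateIds(inputs));
        List<String> done = use(live);
        return free(live, done).isEmpty();
    }
}
```

Because the "done" tokens flow as data, the steps b) and e) ("Require Deterministic Input") guarantee that retries replay the same IDs, so nothing is leaked silently.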
>>
>>
>> I need that, but generic and not case by case, to industrialize some API on
>> top of Beam.
>>
>>
>>
>> 3. Some other use case that I may be missing? If it is this case, could
>> you elaborate on what you are trying to accomplish? That would help me
>> understand both the problems with existing options and possibly what could
>> be done to help.
>>
>>
>> I understand there are workarounds for almost all cases, but that means each
>> transform is different in its lifecycle handling. I dislike that a lot at
>> scale and as a user, since you can't put any unified practice on top of
>> Beam; it also makes Beam very hard to integrate, or to use to build
>> higher-level libraries or software.
>>
>> This is why I tried not to start the workaround discussions and just stay
>> at the API level.
>>
>>
>>
>> -- Ben
>>
>>
>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>
>>>> "Machine state" is overly low-level because many of the possible
>>>> reasons can happen on a perfectly fine machine.
>>>> If you'd like to rephrase it to "it will be called except in various
>>>> situations where it's logically impossible or impractical to guarantee that
>>>> it's called", that's fine. Or you can list some of the examples above.
>>>>
>>>
>>> Sounds ok to me
>>>
>>>
>>>>
>>>> The main point for the user is, you *will* see non-preventable
>>>> situations where it couldn't be called - it's not just intergalactic
>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>> amount of temporary files, shutting down a large number of VMs you started
>>>> etc), you have to express it using one of the other methods that have
>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>> pass-by-reference).
>>>>
>>>
>>> FinishBundle sadly has the exact same guarantee, so I'm not sure which
>>> other method you mean. Concretely, if you make it really unreliable - which
>>> is what "best effort" sounds like to me - then users can't use it to clean
>>> anything, but if you make it "it can fail to happen, but that is unexpected
>>> and means something went wrong", then it is fine to have a manual - or
>>> automatic, if fancy - recovery procedure. This is where it makes all the
>>> difference and impacts the developers and ops (all users, basically).
>>>
>>>
>>>>
>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Agree Eugene except that "best effort" means that. It is also often
>>>>> used to say "at will" and this is what triggered this thread.
>>>>>
>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>> (like I did).
>>>>>
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>> It will not be called if it's impossible to call it: in the example
>>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>
>>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>>> call Teardown in a situation where it was possible to call it but the
>>>>>> runner made insufficient effort.
>>>>>>
>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <kl...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>>>> be better able to help you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>> Using bundles is a perf killer since their size is not controlled. Using
>>>>>>>>>> teardown doesn't allow you to release the connection, since it is a best
>>>>>>>>>> effort thing. Not releasing the connection makes you pay a lot - AWS ;) -
>>>>>>>>>> or prevents you from launching other processing - concurrency limit.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>>>> a long time when everything at the other end has died?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>> "I'm leaving".
>>>>>>>>>
>>>>>>>>> For AWS I was thinking about starting some services - machines -
>>>>>>>>> on the fly at pipeline startup and closing them at the end. If teardown
>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>> another way... as can the full pipeline ;).
>>>>>>>>>
>>>>>>>>> I don't want to be picky, but if Beam can't handle its components'
>>>>>>>>> lifecycle, it can't be used at scale for generic pipelines and ends up
>>>>>>>>> bound to some particular IO.
>>>>>>>>>
>>>>>>>>> What prevents enforcing teardown - ignoring the interstellar
>>>>>>>>> crash case, which can't be handled by any human system? Nothing,
>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>> code in Dataflow or something else?
>>>>>>>>>
>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>
>>>>>>>
>>>>>>> Remove "best effort" from the javadoc. If it is not called then it is
>>>>>>> a bug and we are done :).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Also, what does it mean for the users? The direct runner does it, so if
>>>>>>>>> a user uses the RI in tests, he will get a different behavior in prod? Also,
>>>>>>>>> don't forget the user doesn't know what the IOs he composes use, so this is
>>>>>>>>> so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>
>>>>>>>>> I understand the portability culture is new in the big data world, but
>>>>>>>>> it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>> before doing it right ;).
>>>>>>>>>
>>>>>>>>> My proposal is to list what can prevent us from guaranteeing - under
>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>> handle it, and only if there is a technical reason we can't, do we make it
>>>>>>>>> experimental/unsupported in the API. I know Spark and Flink can; any
>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>
>>>>>>>>> Technical note: even a kill should go through Java shutdown hooks,
>>>>>>>>> otherwise your environment (the software enclosing Beam) is fully unhandled
>>>>>>>>> and your overall system is uncontrolled. The only case where that is not
>>>>>>>>> true is when the software is always owned by a vendor and never installed
>>>>>>>>> on a customer environment. In that case it belongs to the vendor to handle
>>>>>>>>> the Beam API, and not to Beam to adjust its API for one vendor - otherwise
>>>>>>>>> all features unsupported by one runner should be made optional, right?
>>>>>>>>>
>>>>>>>>> Not all state is about the network, even in distributed systems, so
>>>>>>>>> it is key to have an explicit and defined lifecycle.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
On Sun, Feb 18, 2018 at 11:07 AM, Reuven Lax <re...@google.com> wrote:

>
>
> On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <
> rmannibucau@gmail.com> wrote:
>
>>
>>
>> On 18 Feb 2018 19:28, "Ben Chambers" <bc...@apache.org> wrote:
>>
>> It feels like this thread may be a bit off-track. Rather than focusing on
>> the semantics of the existing methods -- which have been noted to meet
>> many existing use cases -- it would be helpful to focus more on the
>> reason you are looking for something with different semantics.
>>
>> Some possibilities (I'm not sure which one you are trying to do):
>>
>> 1. Clean-up some external, global resource, that was initialized once
>> during the startup of the pipeline. If this is the case, how are you
>> ensuring it was really only initialized once (and not once per worker, per
>> thread, per instance, etc.)? How do you know when the pipeline should
>> release it? If the answer is "when it reaches step X", then what about a
>> streaming pipeline?
>>
>>
>> When the DoFn is logically no longer needed, i.e. when the batch is done or
>> the stream is stopped (manually or by a JVM shutdown)
>>
>
> I'm really not following what this means.
>
> Let's say that a pipeline is running 1000 workers, and each worker is
> running 1000 threads (each running a copy of the same DoFn). How many
> cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
> you want it called? When the entire pipeline is shut down? When an
> individual worker is about to shut down (which may be temporary - may be
> about to start back up)? Something else?
>

Maybe you can explain the use case a bit more to me. Most resources I'm
aware of that are "sticky" and need cleanup despite worker crashes (e.g.
creating a VM), are also not resources you want to be creating and
destroying millions of times.


>
>
>
>>
>>
>>
>> 2. Finalize some resources that are used within some region of the
>> pipeline. While, the DoFn lifecycle methods are not a good fit for this
>> (they are focused on managing resources within the DoFn), you could model
>> this on how FileIO finalizes the files that it produced. For instance:
>>    a) ParDo generates "resource IDs" (or some token that stores
>> information about resources)
>>    b) "Require Deterministic Input" (to prevent retries from changing
>> resource IDs)
>>    c) ParDo that initializes the resources
>>    d) Pipeline segments that use the resources, and eventually output the
>> fact they're done
>>    e) "Require Deterministic Input"
>>    f) ParDo that frees the resources
>>
>> By making the use of the resource part of the data it is possible to
>> "checkpoint" which resources may be in use or have been finished by using
>> the require deterministic input. This is important to ensuring everything
>> is actually cleaned up.
>>
>>
>> I need that, but generic and not case by case, to industrialize some API on
>> top of Beam.
>>
>>
>>
>> 3. Some other use case that I may be missing? If it is this case, could
>> you elaborate on what you are trying to accomplish? That would help me
>> understand both the problems with existing options and possibly what could
>> be done to help.
>>
>>
>> I understand there are workarounds for almost all cases, but that means each
>> transform is different in its lifecycle handling. I dislike that a lot at
>> scale and as a user, since you can't put any unified practice on top of
>> Beam; it also makes Beam very hard to integrate, or to use to build
>> higher-level libraries or software.
>>
>> This is why I tried not to start the workaround discussions and just stay
>> at the API level.
>>
>>
>>
>> -- Ben
>>
>>
>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>
>>>> "Machine state" is overly low-level because many of the possible
>>>> reasons can happen on a perfectly fine machine.
>>>> If you'd like to rephrase it to "it will be called except in various
>>>> situations where it's logically impossible or impractical to guarantee that
>>>> it's called", that's fine. Or you can list some of the examples above.
>>>>
>>>
>>> Sounds ok to me
>>>
>>>
>>>>
>>>> The main point for the user is, you *will* see non-preventable
>>>> situations where it couldn't be called - it's not just intergalactic
>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>> amount of temporary files, shutting down a large number of VMs you started
>>>> etc), you have to express it using one of the other methods that have
>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>> pass-by-reference).
>>>>
>>>
>>> FinishBundle sadly has the exact same guarantee, so I'm not sure which
>>> other method you mean. Concretely, if you make it really unreliable - which
>>> is what "best effort" sounds like to me - then users can't use it to clean
>>> anything, but if you make it "it can fail to happen, but that is unexpected
>>> and means something went wrong", then it is fine to have a manual - or
>>> automatic, if fancy - recovery procedure. This is where it makes all the
>>> difference and impacts the developers and ops (all users, basically).
>>>
>>>
>>>>
>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Agree Eugene except that "best effort" means that. It is also often
>>>>> used to say "at will" and this is what triggered this thread.
>>>>>
>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>> (like I did).
>>>>>
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>> It will not be called if it's impossible to call it: in the example
>>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>
>>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>>> call Teardown in a situation where it was possible to call it but the
>>>>>> runner made insufficient effort.
>>>>>>
>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <kl...@google.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>>>> be better able to help you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>> Using bundles is a perf killer since their size is not controlled. Using
>>>>>>>>>> teardown doesn't allow you to release the connection, since it is a best
>>>>>>>>>> effort thing. Not releasing the connection makes you pay a lot - AWS ;) -
>>>>>>>>>> or prevents you from launching other processing - concurrency limit.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>>>> a long time when everything at the other end has died?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>>> "I'm leaving".
>>>>>>>>>
>>>>>>>>> For AWS I was thinking about starting some services - machines -
>>>>>>>>> on the fly at pipeline startup and closing them at the end. If teardown
>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>> another way... as can the full pipeline ;).
>>>>>>>>>
>>>>>>>>> I don't want to be picky, but if Beam can't handle its components'
>>>>>>>>> lifecycle, it can't be used at scale for generic pipelines and ends up
>>>>>>>>> bound to some particular IO.
>>>>>>>>>
>>>>>>>>> What prevents enforcing teardown - ignoring the interstellar
>>>>>>>>> crash case, which can't be handled by any human system? Nothing,
>>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>>> code in Dataflow or something else?
>>>>>>>>>
>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>
>>>>>>>
>>>>>>> Remove "best effort" from the javadoc. If it is not called then it is
>>>>>>> a bug and we are done :).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Also, what does it mean for the users? The direct runner does it, so if
>>>>>>>>> a user uses the RI in tests, he will get a different behavior in prod? Also,
>>>>>>>>> don't forget the user doesn't know what the IOs he composes use, so this is
>>>>>>>>> so impacting for the whole product that it must be handled IMHO.
>>>>>>>>>
>>>>>>>>> I understand the portability culture is new in the big data world, but
>>>>>>>>> it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>> before doing it right ;).
>>>>>>>>>
>>>>>>>>> My proposal is to list what can prevent us from guaranteeing - under
>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>> handle it, and only if there is a technical reason we can't, do we make it
>>>>>>>>> experimental/unsupported in the API. I know Spark and Flink can; any
>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>
>>>>>>>>> Technical note: even a kill should go through Java shutdown hooks,
>>>>>>>>> otherwise your environment (the software enclosing Beam) is fully unhandled
>>>>>>>>> and your overall system is uncontrolled. The only case where that is not
>>>>>>>>> true is when the software is always owned by a vendor and never installed
>>>>>>>>> on a customer environment. In that case it belongs to the vendor to handle
>>>>>>>>> the Beam API, and not to Beam to adjust its API for one vendor - otherwise
>>>>>>>>> all features unsupported by one runner should be made optional, right?
>>>>>>>>>
>>>>>>>>> Not all state is about the network, even in distributed systems, so
>>>>>>>>> it is key to have an explicit and defined lifecycle.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
On Sun, Feb 18, 2018 at 10:50 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
>
> On 18 Feb 2018 19:28, "Ben Chambers" <bc...@apache.org> wrote:
>
> It feels like this thread may be a bit off-track. Rather than focusing on
> the semantics of the existing methods -- which have been noted to meet
> many existing use cases -- it would be helpful to focus more on the
> reason you are looking for something with different semantics.
>
> Some possibilities (I'm not sure which one you are trying to do):
>
> 1. Clean-up some external, global resource, that was initialized once
> during the startup of the pipeline. If this is the case, how are you
> ensuring it was really only initialized once (and not once per worker, per
> thread, per instance, etc.)? How do you know when the pipeline should
> release it? If the answer is "when it reaches step X", then what about a
> streaming pipeline?
>
>
> When the DoFn is logically no longer needed, i.e. when the batch is done or
> the stream is stopped (manually or by a JVM shutdown)
>

I'm really not following what this means.

Let's say that a pipeline is running 1000 workers, and each worker is
running 1000 threads (each running a copy of the same DoFn). How many
cleanups do you want (do you want 1000 * 1000 = 1M cleanups) and when do
you want it called? When the entire pipeline is shut down? When an
individual worker is about to shut down (which may be temporary - may be
about to start back up)? Something else?



>
>
>
> 2. Finalize some resources that are used within some region of the
> pipeline. While, the DoFn lifecycle methods are not a good fit for this
> (they are focused on managing resources within the DoFn), you could model
> this on how FileIO finalizes the files that it produced. For instance:
>    a) ParDo generates "resource IDs" (or some token that stores
> information about resources)
>    b) "Require Deterministic Input" (to prevent retries from changing
> resource IDs)
>    c) ParDo that initializes the resources
>    d) Pipeline segments that use the resources, and eventually output the
> fact they're done
>    e) "Require Deterministic Input"
>    f) ParDo that frees the resources
>
> By making the use of the resource part of the data it is possible to
> "checkpoint" which resources may be in use or have been finished by using
> the require deterministic input. This is important to ensuring everything
> is actually cleaned up.
>
>
> I need that, but generic and not case by case, to industrialize some API on
> top of Beam.
>
>
>
> 3. Some other use case that I may be missing? If it is this case, could
> you elaborate on what you are trying to accomplish? That would help me
> understand both the problems with existing options and possibly what could
> be done to help.
>
>
> I understand there are workarounds for almost all cases, but that means each
> transform is different in its lifecycle handling. I dislike that a lot at
> scale and as a user, since you can't put any unified practice on top of
> Beam; it also makes Beam very hard to integrate, or to use to build
> higher-level libraries or software.
>
> This is why I tried not to start the workaround discussions and just stay
> at the API level.
>
>
>
> -- Ben
>
>
> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>
>>> "Machine state" is overly low-level because many of the possible reasons
>>> can happen on a perfectly fine machine.
>>> If you'd like to rephrase it to "it will be called except in various
>>> situations where it's logically impossible or impractical to guarantee that
>>> it's called", that's fine. Or you can list some of the examples above.
>>>
>>
>> Sounds ok to me
>>
>>
>>>
>>> The main point for the user is, you *will* see non-preventable
>>> situations where it couldn't be called - it's not just intergalactic
>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>> amount of temporary files, shutting down a large number of VMs you started
>>> etc), you have to express it using one of the other methods that have
>>> stricter guarantees (which obviously come at a cost, e.g. no
>>> pass-by-reference).
>>>
>>
>> FinishBundle sadly has the exact same guarantee, so I'm not sure which
>> other method you mean. Concretely, if you make it really unreliable - which
>> is what "best effort" sounds like to me - then users can't use it to clean
>> anything, but if you make it "it can fail to happen, but that is unexpected
>> and means something went wrong", then it is fine to have a manual - or
>> automatic, if fancy - recovery procedure. This is where it makes all the
>> difference and impacts the developers and ops (all users, basically).
>>
>>
>>>
>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> Agree Eugene except that "best effort" means that. It is also often
>>>> used to say "at will" and this is what triggered this thread.
>>>>
>>>> I'm fine using "except if the machine state prevents it" but "best
>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>> (like I did).
>>>>
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> | Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>
>>>>> It will not be called if it's impossible to call it: in the example
>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>> crash due to user code OOM, in case the worker has lost network
>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>
>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>> call Teardown in a situation where it was possible to call it but the
>>>>> runner made insufficient effort.
>>>>>
>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On 18 Feb 2018 00:23, "Kenneth Knowles" <kl...@google.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>
>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>>> be better able to help you.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>> Using bundles is a perf killer since their size is not controlled. Using
>>>>>>>>> teardown doesn't allow you to release the connection, since it is a best
>>>>>>>>> effort thing. Not releasing the connection makes you pay a lot - AWS ;) -
>>>>>>>>> or prevents you from launching other processing - concurrency limit.
>>>>>>>>>
>>>>>>>>
>>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>>> a long time when everything at the other end has died?
>>>>>>>>
>>>>>>>>
>>>>>>>> You assume connections are kind of stateless, but some
>>>>>>>> (proprietary) protocols require some closing exchanges which are not only
>>>>>>>> "I'm leaving".
>>>>>>>>
>>>>>>>> For AWS I was thinking about starting some services - machines -
>>>>>>>> on the fly at pipeline startup and closing them at the end. If teardown
>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>> another way... as can the full pipeline ;).
>>>>>>>>
>>>>>>>> I don't want to be picky, but if Beam can't handle its components'
>>>>>>>> lifecycle, it can't be used at scale for generic pipelines and ends up
>>>>>>>> bound to some particular IO.
>>>>>>>>
>>>>>>>> What prevents enforcing teardown - ignoring the interstellar
>>>>>>>> crash case, which can't be handled by any human system? Nothing,
>>>>>>>> technically. Why do you push to not handle it? Is it due to some legacy
>>>>>>>> code in Dataflow or something else?
>>>>>>>>
>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>
>>>>>>
>>>>>> Remove "best effort" from the javadoc. If it is not call then it is a
>>>>>> bug and we are done :).
>>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Also what does it mean for the users? Direct runner does it so if a
>>>>>>>> user udes the RI in test, he will get a different behavior in prod? Also
>>>>>>>> dont forget the user doesnt know what the IOs he composes use so this is so
>>>>>>>> impacting for the whole product than he must be handled IMHO.
>>>>>>>>
>>>>>>>> I understand the portability culture is new in big data world but
>>>>>>>> it is not a reason to ignore what people did for years and do it wrong
>>>>>>>> before doing right ;).
>>>>>>>>
>>>>>>>> My proposal is to list what can prevent to guarantee - in the
>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>> handle it and only if there is a technical reason we cant we make it
>>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>>> unknown blocker for other runners?
>>>>>>>>
>>>>>>>> Technical note: even a kill should go through java shutdown hooks
>>>>>>>> otherwise your environment (beam enclosing software) is fully unhandled and
>>>>>>>> your overall system is uncontrolled. Only case where it is not true is when
>>>>>>>> the software is always owned by a vendor and never installed on customer
>>>>>>>> environment. In this case it belongd to the vendor to handle beam API and
>>>>>>>> not to beam to adjust its API for a vendor - otherwise all unsupported
>>>>>>>> features by one runner should be made optional right?
>>>>>>>>
>>>>>>>> All state is not about network, even in distributed systems so this
>>>>>>>> is key to have an explicit and defined lifecycle.
>>>>>>>>
>>>>>>>>
>>>>>>>> Kenn
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 18 févr. 2018 19:28, "Ben Chambers" <bc...@apache.org> a écrit :

It feels like this thread may be a bit off-track. Rather than focusing on
the semantics of the existing methods -- which have been noted to meet
many existing use cases -- it would be helpful to focus more on the
reason you are looking for something with different semantics.

Some possibilities (I'm not sure which one you are trying to do):

1. Clean-up some external, global resource, that was initialized once
during the startup of the pipeline. If this is the case, how are you
ensuring it was really only initialized once (and not once per worker, per
thread, per instance, etc.)? How do you know when the pipeline should
release it? If the answer is "when it reaches step X", then what about a
streaming pipeline?


When the DoFn is no longer needed logically, i.e. when the batch is done or
the stream is stopped (manually or by a JVM shutdown)



2. Finalize some resources that are used within some region of the
pipeline. While the DoFn lifecycle methods are not a good fit for this
(they are focused on managing resources within the DoFn), you could model
this on how FileIO finalizes the files that it produced. For instance:
   a) ParDo generates "resource IDs" (or some token that stores information
about resources)
   b) "Require Deterministic Input" (to prevent retries from changing
resource IDs)
   c) ParDo that initializes the resources
   d) Pipeline segments that use the resources, and eventually output the
fact they're done
   e) "Require Deterministic Input"
   f) ParDo that frees the resources

By making the use of the resource part of the data it is possible to
"checkpoint" which resources may be in use or have been finished by using
the require deterministic input. This is important to ensuring everything
is actually cleaned up.


I need that, but generic and not case by case, to industrialize some APIs on
top of Beam.



3. Some other use case that I may be missing? If it is this case, could you
elaborate on what you are trying to accomplish? That would help me
understand both the problems with existing options and possibly what could
be done to help.


I understand there are workarounds for almost all cases, but that means each
transform is different in its lifecycle handling. I dislike that a lot
at scale and as a user, since you can't put any unified practice on top of
Beam; it also makes Beam very hard to integrate, or to use to build
higher-level libraries or software.

This is why I tried not to start the workaround discussions and just stay
at the API level.



-- Ben


On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>
>> "Machine state" is overly low-level because many of the possible reasons
>> can happen on a perfectly fine machine.
>> If you'd like to rephrase it to "it will be called except in various
>> situations where it's logically impossible or impractical to guarantee that
>> it's called", that's fine. Or you can list some of the examples above.
>>
>
> Sounds ok to me
>
>
>>
>> The main point for the user is, you *will* see non-preventable situations
>> where it couldn't be called - it's not just intergalactic crashes - so if
>> the logic is very important (e.g. cleaning up a large amount of temporary
>> files, shutting down a large number of VMs you started etc), you have to
>> express it using one of the other methods that have stricter guarantees
>> (which obviously come at a cost, e.g. no pass-by-reference).
>>
>
> FinishBundle has the exact same guarantee sadly, so I'm not sure which
> other method you speak about. Concretely, if you make it really unreliable
> - this is what "best effort" sounds like to me - then users can't use it to
> clean anything, but if you make it "can happen but it is unexpected and
> means something happened" then it is fine to have a manual - or auto if
> fancy - recovery procedure. This is where it makes all the difference and
> impacts the developers and ops (all users basically).
>
>
>>
>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> Agree Eugene except that "best effort" means that. It is also often used
>>> to say "at will" and this is what triggered this thread.
>>>
>>> I'm fine using "except if the machine state prevents it" but "best
>>> effort" is too open and can be very badly and wrongly perceived by users
>>> (like I did).
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>
>>>> It will not be called if it's impossible to call it: in the example
>>>> situation you have (intergalactic crash), and in a number of more common
>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>> crash due to user code OOM, in case the worker has lost network
>>>> connectivity (then it may be called but it won't be able to do anything
>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>> by the underlying cluster manager without notice or if the worker was too
>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>> (which happens quite often at scale), and in many other conditions.
>>>>
>>>> "Best effort" is the commonly used term to describe such behavior.
>>>> Please feel free to file bugs for cases where you observed a runner not
>>>> call Teardown in a situation where it was possible to call it but the
>>>> runner made insufficient effort.
>>>>
>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <rm...@gmail.com>
>>>> wrote:
>>>>
>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a écrit :
>>>>>>>
>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>> be better able to help you.
>>>>>>>>
>>>>>>>>
>>>>>>>> Take a simple example of a transform requiring a connection. Using
>>>>>>>> bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>> doesnt allow you to release the connection since it is a best effort thing.
>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>> to launch other processings - concurrent limit.
>>>>>>>>
>>>>>>>
>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>> a long time when everything at the other end has died?
>>>>>>>
>>>>>>>
>>>>>>> You assume connections are kind of stateless but some (proprietary)
>>>>>>> protocols requires some closing exchanges which are not only "im leaving".
>>>>>>>
>>>>>>> For aws i was thinking about starting some services - machines - on
>>>>>>> the fly in a pipeline startup and closing them at the end. If teardown is
>>>>>>> not called you leak machines and money. You can say it can be done another
>>>>>>> way...as the full pipeline ;).
>>>>>>>
>>>>>>> I dont want to be picky but if beam cant handle its components
>>>>>>> lifecycle it can be used at scale for generic pipelines and if bound to
>>>>>>> some particular IO.
>>>>>>>
>>>>>>> What does prevent to enforce teardown - ignoring the interstellar
>>>>>>> crash case which cant be handled by any human system? Nothing technically.
>>>>>>> Why do you push to not handle it? Is it due to some legacy code on dataflow
>>>>>>> or something else?
>>>>>>>
>>>>>> Teardown *is* already documented and implemented this way
>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>
>>>>>
>>>>> Remove "best effort" from the javadoc. If it is not call then it is a
>>>>> bug and we are done :).
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> Also what does it mean for the users? Direct runner does it so if a
>>>>>>> user udes the RI in test, he will get a different behavior in prod? Also
>>>>>>> dont forget the user doesnt know what the IOs he composes use so this is so
>>>>>>> impacting for the whole product than he must be handled IMHO.
>>>>>>>
>>>>>>> I understand the portability culture is new in big data world but it
>>>>>>> is not a reason to ignore what people did for years and do it wrong before
>>>>>>> doing right ;).
>>>>>>>
>>>>>>> My proposal is to list what can prevent to guarantee - in the normal
>>>>>>> IT conditions - the execution of teardown. Then we see if we can handle it
>>>>>>> and only if there is a technical reason we cant we make it
>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>> unknown blocker for other runners?
>>>>>>>
>>>>>>> Technical note: even a kill should go through java shutdown hooks
>>>>>>> otherwise your environment (beam enclosing software) is fully unhandled and
>>>>>>> your overall system is uncontrolled. Only case where it is not true is when
>>>>>>> the software is always owned by a vendor and never installed on customer
>>>>>>> environment. In this case it belongd to the vendor to handle beam API and
>>>>>>> not to beam to adjust its API for a vendor - otherwise all unsupported
>>>>>>> features by one runner should be made optional right?
>>>>>>>
>>>>>>> All state is not about network, even in distributed systems so this
>>>>>>> is key to have an explicit and defined lifecycle.
>>>>>>>
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>>
>>>>>>>
>>>

Re: @TearDown guarantees

Posted by Ben Chambers <bc...@apache.org>.
It feels like this thread may be a bit off-track. Rather than focusing on
the semantics of the existing methods -- which have been noted to meet
many existing use cases -- it would be helpful to focus more on the
reason you are looking for something with different semantics.

Some possibilities (I'm not sure which one you are trying to do):

1. Clean-up some external, global resource, that was initialized once
during the startup of the pipeline. If this is the case, how are you
ensuring it was really only initialized once (and not once per worker, per
thread, per instance, etc.)? How do you know when the pipeline should
release it? If the answer is "when it reaches step X", then what about a
streaming pipeline?

2. Finalize some resources that are used within some region of the
pipeline. While the DoFn lifecycle methods are not a good fit for this
(they are focused on managing resources within the DoFn), you could model
this on how FileIO finalizes the files that it produced. For instance:
   a) ParDo generates "resource IDs" (or some token that stores information
about resources)
   b) "Require Deterministic Input" (to prevent retries from changing
resource IDs)
   c) ParDo that initializes the resources
   d) Pipeline segments that use the resources, and eventually output the
fact they're done
   e) "Require Deterministic Input"
   f) ParDo that frees the resources

By making the use of the resource part of the data it is possible to
"checkpoint" which resources may be in use or have been finished by using
the require deterministic input. This is important to ensuring everything
is actually cleaned up.

3. Some other use case that I may be missing? If it is this case, could you
elaborate on what you are trying to accomplish? That would help me
understand both the problems with existing options and possibly what could
be done to help.
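A minimal sketch of pattern (2), using plain Java collections in place of PCollections, since a real implementation would depend on the Beam SDK; all class and method names here are illustrative, not Beam APIs:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;

// Models steps (a)-(f): resource tokens flow through the "pipeline" as
// data, so the final stage frees exactly the resources that were created.
class ResourceTokenPattern {

    static final Set<String> LIVE_RESOURCES = new HashSet<>();

    // (a)+(c): generate deterministic resource IDs and initialize them.
    static List<String> initialize(List<String> elements) {
        List<String> tokens = new ArrayList<>();
        for (String e : elements) {
            String token = "resource-" + e; // deterministic ID, step (b)
            LIVE_RESOURCES.add(token);      // acquire the resource
            tokens.add(token);
        }
        return tokens;
    }

    // (d): pipeline segment that uses each resource and outputs the
    // fact that it is done, as data.
    static List<String> use(List<String> tokens, Function<String, String> work) {
        List<String> done = new ArrayList<>();
        for (String t : tokens) {
            work.apply(t);
            done.add(t);
        }
        return done;
    }

    // (f): free the resources named by the tokens received downstream.
    static void release(List<String> doneTokens) {
        for (String t : doneTokens) {
            LIVE_RESOURCES.remove(t);
        }
    }

    public static void main(String[] args) {
        List<String> tokens = initialize(List.of("a", "b", "c"));
        List<String> done = use(tokens, t -> t.toUpperCase());
        release(done);
        System.out.println("leaked=" + LIVE_RESOURCES.size()); // leaked=0
    }
}
```

Because the tokens are ordinary pipeline data, a runner that retries or checkpoints bundles still delivers every "done" token to the releasing stage, which is what makes the cleanup reliable to the extent the pipeline itself is.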

-- Ben


On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>
>> "Machine state" is overly low-level because many of the possible reasons
>> can happen on a perfectly fine machine.
>> If you'd like to rephrase it to "it will be called except in various
>> situations where it's logically impossible or impractical to guarantee that
>> it's called", that's fine. Or you can list some of the examples above.
>>
>
> Sounds ok to me
>
>
>>
>> The main point for the user is, you *will* see non-preventable situations
>> where it couldn't be called - it's not just intergalactic crashes - so if
>> the logic is very important (e.g. cleaning up a large amount of temporary
>> files, shutting down a large number of VMs you started etc), you have to
>> express it using one of the other methods that have stricter guarantees
>> (which obviously come at a cost, e.g. no pass-by-reference).
>>
>
> FinishBundle has the exact same guarantee sadly, so I'm not sure which
> other method you speak about. Concretely, if you make it really unreliable
> - this is what "best effort" sounds like to me - then users can't use it to
> clean anything, but if you make it "can happen but it is unexpected and
> means something happened" then it is fine to have a manual - or auto if
> fancy - recovery procedure. This is where it makes all the difference and
> impacts the developers and ops (all users basically).
>
>
>>
>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> Agree Eugene except that "best effort" means that. It is also often used
>>> to say "at will" and this is what triggered this thread.
>>>
>>> I'm fine using "except if the machine state prevents it" but "best
>>> effort" is too open and can be very badly and wrongly perceived by users
>>> (like I did).
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>
>>>> It will not be called if it's impossible to call it: in the example
>>>> situation you have (intergalactic crash), and in a number of more common
>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>> crash due to user code OOM, in case the worker has lost network
>>>> connectivity (then it may be called but it won't be able to do anything
>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>> by the underlying cluster manager without notice or if the worker was too
>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>> (which happens quite often at scale), and in many other conditions.
>>>>
>>>> "Best effort" is the commonly used term to describe such behavior.
>>>> Please feel free to file bugs for cases where you observed a runner not
>>>> call Teardown in a situation where it was possible to call it but the
>>>> runner made insufficient effort.
>>>>
>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <rm...@gmail.com>
>>>> wrote:
>>>>
>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>>
>>>>>>
>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a écrit :
>>>>>>>
>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>> be better able to help you.
>>>>>>>>
>>>>>>>>
>>>>>>>> Take a simple example of a transform requiring a connection. Using
>>>>>>>> bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>> doesnt allow you to release the connection since it is a best effort thing.
>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>> to launch other processings - concurrent limit.
>>>>>>>>
>>>>>>>
>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>> a long time when everything at the other end has died?
>>>>>>>
>>>>>>>
>>>>>>> You assume connections are kind of stateless but some (proprietary)
>>>>>>> protocols requires some closing exchanges which are not only "im leaving".
>>>>>>>
>>>>>>> For aws i was thinking about starting some services - machines - on
>>>>>>> the fly in a pipeline startup and closing them at the end. If teardown is
>>>>>>> not called you leak machines and money. You can say it can be done another
>>>>>>> way...as the full pipeline ;).
>>>>>>>
>>>>>>> I dont want to be picky but if beam cant handle its components
>>>>>>> lifecycle it can be used at scale for generic pipelines and if bound to
>>>>>>> some particular IO.
>>>>>>>
>>>>>>> What does prevent to enforce teardown - ignoring the interstellar
>>>>>>> crash case which cant be handled by any human system? Nothing technically.
>>>>>>> Why do you push to not handle it? Is it due to some legacy code on dataflow
>>>>>>> or something else?
>>>>>>>
>>>>>> Teardown *is* already documented and implemented this way
>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>
>>>>>
>>>>> Remove "best effort" from the javadoc. If it is not call then it is a
>>>>> bug and we are done :).
>>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>> Also what does it mean for the users? Direct runner does it so if a
>>>>>>> user udes the RI in test, he will get a different behavior in prod? Also
>>>>>>> dont forget the user doesnt know what the IOs he composes use so this is so
>>>>>>> impacting for the whole product than he must be handled IMHO.
>>>>>>>
>>>>>>> I understand the portability culture is new in big data world but it
>>>>>>> is not a reason to ignore what people did for years and do it wrong before
>>>>>>> doing right ;).
>>>>>>>
>>>>>>> My proposal is to list what can prevent to guarantee - in the normal
>>>>>>> IT conditions - the execution of teardown. Then we see if we can handle it
>>>>>>> and only if there is a technical reason we cant we make it
>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>> unknown blocker for other runners?
>>>>>>>
>>>>>>> Technical note: even a kill should go through java shutdown hooks
>>>>>>> otherwise your environment (beam enclosing software) is fully unhandled and
>>>>>>> your overall system is uncontrolled. Only case where it is not true is when
>>>>>>> the software is always owned by a vendor and never installed on customer
>>>>>>> environment. In this case it belongd to the vendor to handle beam API and
>>>>>>> not to beam to adjust its API for a vendor - otherwise all unsupported
>>>>>>> features by one runner should be made optional right?
>>>>>>>
>>>>>>> All state is not about network, even in distributed systems so this
>>>>>>> is key to have an explicit and defined lifecycle.
>>>>>>>
>>>>>>>
>>>>>>> Kenn
>>>>>>>
>>>>>>>
>>>>>>>
>>>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
On Sun, Feb 18, 2018 at 10:25 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> 2018-02-18 19:19 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>
>> FinishBundle has a stronger guarantee: if the pipeline succeeded, then it
>> has been called for every succeeded bundle, and succeeded bundles together
>> cover the entire input PCollection. Of course, it may not have been called
>> for failed bundles.
>> To anticipate a possible objection "why not also keep retrying Teardown
>> until it succeeds" - because if Teardown wasn't called on a DoFn instance,
>> it's because the instance no longer exists and there's nothing to call it
>> on.
>>
>> Please take a look at implementations of WriteFiles and BigQueryIO.read()
>> and write() to see how cleanup of heavyweight resources (large number of
>> temp files, temporary BigQuery datasets) can be achieved reliably to the
>> extent possible.
>>
>
> Do you mean passing state across the fns and having a fn responsible for
> the cleanup? Kind of making the teardown a processElement? This is a nice
> workaround but it is not always possible, as mentioned. Ismael even has a
> nice case where this just fails and teardown would work - it was with AWS,
> not a BigQuery bug, but same design.
>
I don't remember this case and would appreciate being reminded what it is.

But in general, yes, there unfortunately exist systems designed in a way
that reliably achieving cleanup when interacting with them from a
fault-tolerant distributed system like Beam is simply impossible. We should
consider on a case-by-case basis whether any given system is like this, and
what to do if it is.



>
>
>>
>> On Sun, Feb 18, 2018 at 9:56 AM Romain Manni-Bucau <rm...@gmail.com>
>> wrote:
>>
>>> 2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>
>>>> "Machine state" is overly low-level because many of the possible
>>>> reasons can happen on a perfectly fine machine.
>>>> If you'd like to rephrase it to "it will be called except in various
>>>> situations where it's logically impossible or impractical to guarantee that
>>>> it's called", that's fine. Or you can list some of the examples above.
>>>>
>>>
>>> Sounds ok to me
>>>
>>>
>>>>
>>>> The main point for the user is, you *will* see non-preventable
>>>> situations where it couldn't be called - it's not just intergalactic
>>>> crashes - so if the logic is very important (e.g. cleaning up a large
>>>> amount of temporary files, shutting down a large number of VMs you started
>>>> etc), you have to express it using one of the other methods that have
>>>> stricter guarantees (which obviously come at a cost, e.g. no
>>>> pass-by-reference).
>>>>
>>>
>>> FinishBundle has the exact same guarantee sadly so not which which other
>>> method you speak about. Concretely if you make it really unreliable - this
>>> is what best effort sounds to me - then users can use it to clean anything
>>> but if you make it "can happen but it is unexpected and means something
>>> happent" then it is fine to have a manual - or auto if fancy - recovery
>>> procedure. This is where it makes all the difference and impacts the
>>> developpers, ops (all users basically).
>>>
>>>
>>>>
>>>> On Sun, Feb 18, 2018 at 9:16 AM Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Agree Eugene except that "best effort" means that. It is also often
>>>>> used to say "at will" and this is what triggered this thread.
>>>>>
>>>>> I'm fine using "except if the machine state prevents it" but "best
>>>>> effort" is too open and can be very badly and wrongly perceived by users
>>>>> (like I did).
>>>>>
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> | Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>> 2018-02-18 18:13 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>
>>>>>> It will not be called if it's impossible to call it: in the example
>>>>>> situation you have (intergalactic crash), and in a number of more common
>>>>>> cases: eg in case the worker container has crashed (eg user code in a
>>>>>> different thread called a C library over JNI and it segfaulted), JVM bug,
>>>>>> crash due to user code OOM, in case the worker has lost network
>>>>>> connectivity (then it may be called but it won't be able to do anything
>>>>>> useful), in case this is running on a preemptible VM and it was preempted
>>>>>> by the underlying cluster manager without notice or if the worker was too
>>>>>> busy with other stuff (eg calling other Teardown functions) until the
>>>>>> preemption timeout elapsed, in case the underlying hardware simply failed
>>>>>> (which happens quite often at scale), and in many other conditions.
>>>>>>
>>>>>> "Best effort" is the commonly used term to describe such behavior.
>>>>>> Please feel free to file bugs for cases where you observed a runner not
>>>>>> call Teardown in a situation where it was possible to call it but the
>>>>>> runner made insufficient effort.
>>>>>>
>>>>>> On Sun, Feb 18, 2018, 9:02 AM Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> 2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a
>>>>>>>>> écrit :
>>>>>>>>>
>>>>>>>>> On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <
>>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>> If you give an example of a high-level need (e.g. "I'm trying to
>>>>>>>>>> write an IO for system $x and it requires the following initialization and
>>>>>>>>>> the following cleanup logic and the following processing in between") I'll
>>>>>>>>>> be better able to help you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Take a simple example of a transform requiring a connection.
>>>>>>>>>> Using bundles is a perf killer since size is not controlled. Using teardown
>>>>>>>>>> doesnt allow you to release the connection since it is a best effort thing.
>>>>>>>>>> Not releasing the connection makes you pay a lot - aws ;) - or prevents you
>>>>>>>>>> to launch other processings - concurrent limit.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> For this example @Teardown is an exact fit. If things die so badly
>>>>>>>>> that @Teardown is not called then nothing else can be called to close the
>>>>>>>>> connection either. What AWS service are you thinking of that stays open for
>>>>>>>>> a long time when everything at the other end has died?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> You assume connections are kind of stateless but some
>>>>>>>>> (proprietary) protocols requires some closing exchanges which are not only
>>>>>>>>> "im leaving".
>>>>>>>>>
>>>>>>>>> For aws i was thinking about starting some services - machines -
>>>>>>>>> on the fly in a pipeline startup and closing them at the end. If teardown
>>>>>>>>> is not called you leak machines and money. You can say it can be done
>>>>>>>>> another way...as the full pipeline ;).
>>>>>>>>>
>>>>>>>>> I dont want to be picky but if beam cant handle its components
>>>>>>>>> lifecycle it can be used at scale for generic pipelines and if bound to
>>>>>>>>> some particular IO.
>>>>>>>>>
>>>>>>>>> What does prevent to enforce teardown - ignoring the interstellar
>>>>>>>>> crash case which cant be handled by any human system? Nothing technically.
>>>>>>>>> Why do you push to not handle it? Is it due to some legacy code on dataflow
>>>>>>>>> or something else?
>>>>>>>>>
>>>>>>>> Teardown *is* already documented and implemented this way
>>>>>>>> (best-effort). So I'm not sure what kind of change you're asking for.
>>>>>>>>
>>>>>>>
>>>>>>> Remove "best effort" from the javadoc. If it is not call then it is
>>>>>>> a bug and we are done :).
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Also what does it mean for the users? Direct runner does it so if
>>>>>>>>> a user udes the RI in test, he will get a different behavior in prod? Also
>>>>>>>>> dont forget the user doesnt know what the IOs he composes use so this is so
>>>>>>>>> impacting for the whole product than he must be handled IMHO.
>>>>>>>>>
>>>>>>>>> I understand the portability culture is new in big data world but
>>>>>>>>> it is not a reason to ignore what people did for years and do it wrong
>>>>>>>>> before doing right ;).
>>>>>>>>>
>>>>>>>>> My proposal is to list what can prevent to guarantee - in the
>>>>>>>>> normal IT conditions - the execution of teardown. Then we see if we can
>>>>>>>>> handle it and only if there is a technical reason we cant we make it
>>>>>>>>> experimental/unsupported in the api. I know spark and flink can, any
>>>>>>>>> unknown blocker for other runners?
>>>>>>>>>
>>>>>>>>> Technical note: even a kill should go through Java shutdown hooks,
>>>>>>>>> otherwise your environment (the software enclosing Beam) is fully unhandled
>>>>>>>>> and your overall system is uncontrolled. The only case where this is not
>>>>>>>>> true is when the software is always owned by a vendor and never installed
>>>>>>>>> on a customer environment. In that case it belongs to the vendor to handle
>>>>>>>>> the Beam API, and not to Beam to adjust its API for a vendor - otherwise all
>>>>>>>>> features unsupported by one runner should be made optional, right?
>>>>>>>>>
>>>>>>>>> Not all state is about the network, even in distributed systems, so
>>>>>>>>> it is key to have an explicit and defined lifecycle.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Kenn
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>
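The shutdown-hook mechanism mentioned in the technical note above can be sketched in plain Java (a minimal illustration, not Beam code; the printed messages are made up):

```java
public class ShutdownHookSketch {
    public static void main(String[] args) {
        // A shutdown hook runs on normal exit and on SIGTERM (a plain `kill`),
        // but NOT on `kill -9`, a native segfault, or a hard JVM crash --
        // which is exactly why teardown can only ever be best-effort.
        Runtime.getRuntime().addShutdownHook(
                new Thread(() -> System.out.println("releasing resources")));
        System.out.println("pipeline work done");
        // On return from main, the JVM runs the registered hook before exiting.
    }
}
```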

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2018-02-18 19:19 GMT+01:00 Eugene Kirpichov <ki...@google.com>:

> FinishBundle has a stronger guarantee: if the pipeline succeeded, then it
> has been called for every succeeded bundle, and succeeded bundles together
> cover the entire input PCollection. Of course, it may not have been called
> for failed bundles.
> To anticipate a possible objection "why not also keep retrying Teardown
> until it succeeds" - because if Teardown wasn't called on a DoFn instance,
> it's because the instance no longer exists and there's nothing to call it
> on.
>
> Please take a look at implementations of WriteFiles and BigQueryIO.read()
> and write() to see how cleanup of heavyweight resources (large number of
> temp files, temporary BigQuery datasets) can be achieved reliably to the
> extent possible.
>

Do you mean passing state across the fns and having a fn responsible for the
cleanup? Kind of making the teardown a processElement? This is a nice
workaround but it is not always possible, as mentioned. Ismael even has a
nice case where this just fails and teardown would work - it was with AWS, not
a BigQuery bug, but the same design.
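The pattern under discussion - emitting resource ids as ordinary data and deleting them in a later, retryable step - can be sketched outside of Beam like this (a toy illustration with a hypothetical in-memory "service"; not the actual WriteFiles/BigQueryIO code):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class CleanupAsAStep {
    // Hypothetical stand-in for a remote service that leases resources.
    static final Set<String> liveResources = new TreeSet<>();

    // The "work" fn: acquires a per-element resource and emits its id as data.
    static String process(String input) {
        String id = "tmp-" + input;
        liveResources.add(id);
        return id;
    }

    // The "cleanup" fn: a separate downstream step fed by the ids above.
    static void cleanup(List<String> ids) {
        ids.forEach(liveResources::remove);
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (String in : List.of("a", "b", "c")) {
            ids.add(process(in));
        }
        // Because the ids travel as pipeline data, a runner can retry this
        // step even after the worker that created the resources is gone --
        // unlike @Teardown, which needs the original DoFn instance.
        cleanup(ids);
        System.out.println("leaked: " + liveResources.size());
    }
}
```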



Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
FinishBundle has a stronger guarantee: if the pipeline succeeded, then it
has been called for every succeeded bundle, and succeeded bundles together
cover the entire input PCollection. Of course, it may not have been called
for failed bundles.
To anticipate a possible objection "why not also keep retrying Teardown
until it succeeds" - because if Teardown wasn't called on a DoFn instance,
it's because the instance no longer exists and there's nothing to call it
on.

Please take a look at implementations of WriteFiles and BigQueryIO.read()
and write() to see how cleanup of heavyweight resources (large number of
temp files, temporary BigQuery datasets) can be achieved reliably to the
extent possible.
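The relative strength of the two guarantees can be made concrete with a toy harness (plain Java, not runner code; the method names merely mirror the DoFn annotations):

```java
import java.util.List;

public class LifecycleOrder {
    static final StringBuilder log = new StringBuilder();

    static void setup()        { log.append("setup;"); }
    static void startBundle()  { log.append("start;"); }
    static void process(int e) { log.append("p").append(e).append(";"); }
    // Called for every bundle that succeeds; succeeded bundles cover the input.
    static void finishBundle() { log.append("finish;"); }
    // Best-effort only: skipped entirely if the worker disappears.
    static void teardown()     { log.append("teardown;"); }

    public static void main(String[] args) {
        setup();
        for (List<Integer> bundle : List.of(List.of(1, 2), List.of(3))) {
            startBundle();
            bundle.forEach(LifecycleOrder::process);
            finishBundle();  // a bundle counts as succeeded once this returns
        }
        teardown();
        System.out.println(log);
    }
}
```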


Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2018-02-18 18:36 GMT+01:00 Eugene Kirpichov <ki...@google.com>:

> "Machine state" is overly low-level because many of the possible reasons
> can happen on a perfectly fine machine.
> If you'd like to rephrase it to "it will be called except in various
> situations where it's logically impossible or impractical to guarantee that
> it's called", that's fine. Or you can list some of the examples above.
>

Sounds ok to me


>
> The main point for the user is, you *will* see non-preventable situations
> where it couldn't be called - it's not just intergalactic crashes - so if
> the logic is very important (e.g. cleaning up a large amount of temporary
> files, shutting down a large number of VMs you started etc), you have to
> express it using one of the other methods that have stricter guarantees
> (which obviously come at a cost, e.g. no pass-by-reference).
>

FinishBundle has the exact same guarantee, sadly, so I'm not sure which other
method you speak about. Concretely, if you make it really unreliable - this
is what "best effort" sounds like to me - then users can't use it to clean
anything, but if you make it "can happen, but it is unexpected and means
something happened" then it is fine to have a manual - or automatic, if fancy -
recovery procedure. This is where it makes all the difference and impacts the
developers and ops (all users, basically).



Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
"Machine state" is overly low-level because many of the possible reasons
can happen on a perfectly fine machine.
If you'd like to rephrase it to "it will be called except in various
situations where it's logically impossible or impractical to guarantee that
it's called", that's fine. Or you can list some of the examples above.

The main point for the user is, you *will* see non-preventable situations
where it couldn't be called - it's not just intergalactic crashes - so if
the logic is very important (e.g. cleaning up a large amount of temporary
files, shutting down a large number of VMs you started etc), you have to
express it using one of the other methods that have stricter guarantees
(which obviously come at a cost, e.g. no pass-by-reference).


Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Agreed, Eugene, except that "best effort" means just that. It is also often
used to mean "at will", and this is what triggered this thread.

I'm fine with using "except if the machine state prevents it", but "best
effort" is too open-ended and can be badly and wrongly perceived by users
(as it was by me).


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
It will not be called if it's impossible to call it: in the example
situation you have (intergalactic crash), and in a number of more common
cases: eg in case the worker container has crashed (eg user code in a
different thread called a C library over JNI and it segfaulted), JVM bug,
crash due to user code OOM, in case the worker has lost network
connectivity (then it may be called but it won't be able to do anything
useful), in case this is running on a preemptible VM and it was preempted
by the underlying cluster manager without notice or if the worker was too
busy with other stuff (eg calling other Teardown functions) until the
preemption timeout elapsed, in case the underlying hardware simply failed
(which happens quite often at scale), and in many other conditions.

"Best effort" is the commonly used term to describe such behavior. Please
feel free to file bugs for cases where you observed a runner not call
Teardown in a situation where it was possible to call it but the runner
made insufficient effort.


Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2018-02-18 18:00 GMT+01:00 Eugene Kirpichov <ki...@google.com>:

> Teardown *is* already documented and implemented this way (best-effort).
> So I'm not sure what kind of change you're asking for.
>

Remove "best effort" from the javadoc. If it is not called, then it is a bug
and we are done :).



Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
On Sun, Feb 18, 2018, 2:06 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
> What does prevent to enforce teardown - ignoring the interstellar crash
> case which cant be handled by any human system? Nothing technically. Why do
> you push to not handle it? Is it due to some legacy code on dataflow or
> something else?
>
Teardown *is* already documented and implemented this way (best-effort). So
I'm not sure what kind of change you're asking for.



Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 18 févr. 2018 00:23, "Kenneth Knowles" <kl...@google.com> a écrit :


For this example @Teardown is an exact fit. If things die so badly that
@Teardown is not called then nothing else can be called to close the
connection either. What AWS service are you thinking of that stays open for
a long time when everything at the other end has died?


You assume connections are kind of stateless, but some (proprietary)
protocols require closing exchanges which are more than just "I'm leaving".

For AWS I was thinking about starting some services - machines - on the fly
at pipeline startup and closing them at the end. If teardown is not called
you leak machines and money. You can say it can be done another way... as
can the full pipeline ;).

I don't want to be picky, but if Beam can't handle its components' lifecycle
it can't be used at scale for generic pipelines and is bound to some
particular IOs.

What prevents enforcing teardown - ignoring the interstellar crash case,
which no human system can handle? Nothing, technically. Why do you push to
not handle it? Is it due to some legacy code in Dataflow, or something else?

Also, what does it mean for the users? The direct runner does it, so if a
user uses the RI in tests, will he get a different behavior in prod? Also,
don't forget the user doesn't know what the IOs he composes use, so this is
so impacting for the whole product that it must be handled IMHO.

I understand the portability culture is new in the big data world, but that
is not a reason to ignore what people did for years and do it wrong before
doing it right ;).

My proposal is to list what can prevent guaranteeing - under normal IT
conditions - the execution of teardown. Then we see if we can handle it,
and only if there is a technical reason we can't do we make it
experimental/unsupported in the API. I know Spark and Flink can; any
unknown blocker for other runners?

Technical note: even a kill should go through Java shutdown hooks, otherwise
your environment (the software enclosing Beam) is fully unhandled and your
overall system is uncontrolled. The only case where this is not true is when
the software is always owned by a vendor and never installed in a customer
environment. In that case it belongs to the vendor to handle the Beam API,
and not to Beam to adjust its API for a vendor - otherwise all features
unsupported by one runner should be made optional, right?
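
The shutdown-hook point can be sketched in plain Java. Note this is a sketch
under the assumption that the process terminates normally (System.exit or
SIGTERM); SIGKILL and hard JVM crashes bypass hooks entirely, which is
exactly the "machine state prevents it" class of failures discussed above.

```java
// Illustrates "a kill goes through Java shutdown hooks": on normal JVM
// termination (System.exit, SIGTERM) registered hooks run, so an enclosing
// framework can trigger teardown logic there. This does NOT cover kill -9
// (SIGKILL) or hard crashes - no user code runs in those cases.
public class TeardownHook {

    // Registers a hook that would run teardown logic; returns the thread so
    // a caller can verify registration or unregister it later.
    static Thread register(Runnable teardown) {
        Thread hook = new Thread(teardown, "beam-teardown-hook");
        Runtime.getRuntime().addShutdownHook(hook);
        return hook;
    }

    public static void main(String[] args) {
        register(() -> System.out.println("teardown executed"));
        // On normal exit the hook fires before the JVM terminates.
    }
}
```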

Not all state is about the network, even in distributed systems, so it is
key to have an explicit and defined lifecycle.


Kenn

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
On Sat, Feb 17, 2018 at 3:09 PM, Romain Manni-Bucau <rm...@gmail.com>
wrote:
>
> If you give an example of a high-level need (e.g. "I'm trying to write an
> IO for system $x and it requires the following initialization and the
> following cleanup logic and the following processing in between") I'll be
> better able to help you.
>
>
> Take a simple example of a transform requiring a connection. Using bundles
> is a perf killer since size is not controlled. Using teardown doesnt allow
> you to release the connection since it is a best effort thing. Not
> releasing the connection makes you pay a lot - aws ;) - or prevents you to
> launch other processings - concurrent limit.
>

For this example @Teardown is an exact fit. If things die so badly that
@Teardown is not called then nothing else can be called to close the
connection either. What AWS service are you thinking of that stays open for
a long time when everything at the other end has died?

Kenn
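
Kenn's point - that @Teardown is the right place to close a connection
whenever the JVM survives - can be sketched with a plain-Java stand-in for
the DoFn lifecycle. The class and method names below are hypothetical
illustrations, not the actual Beam API.

```java
// A plain-Java stand-in for the DoFn lifecycle: a worker harness that
// reaches teardown() whenever the JVM itself survives, by pairing it with
// setup() in a try/finally.
import java.util.Arrays;
import java.util.List;

public class LifecycleSketch {

    // Models a DoFn that owns an expensive resource (e.g. a connection).
    static class ConnectionFn {
        boolean connected = false;
        int processed = 0;

        void setup()               { connected = true;  } // like @Setup
        void process(String e)     { processed++;       } // like @ProcessElement
        void teardown()            { connected = false; } // like @Teardown
    }

    // A minimal "runner": teardown is reached even if processing throws,
    // mirroring the guarantee being discussed in this thread.
    static ConnectionFn run(List<String> elements) {
        ConnectionFn fn = new ConnectionFn();
        fn.setup();
        try {
            for (String e : elements) {
                fn.process(e);
            }
        } finally {
            fn.teardown(); // skipped only if the JVM itself dies
        }
        return fn;
    }

    public static void main(String[] args) {
        ConnectionFn fn = run(Arrays.asList("a", "b", "c"));
        System.out.println("processed=" + fn.processed + " connected=" + fn.connected);
    }
}
```

The try/finally is the whole argument in miniature: nothing short of the JVM
dying prevents the finally block, which is why "it was possible to call it"
is the dividing line between a bug and a genuine impossibility.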

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 17 févr. 2018 22:31, "Eugene Kirpichov" <ki...@google.com> a écrit :


If you give an example of a high-level need (e.g. "I'm trying to write an
IO for system $x and it requires the following initialization and the
following cleanup logic and the following processing in between") I'll be
better able to help you.


Take a simple example of a transform requiring a connection. Using bundles
is a perf killer since their size is not controlled. Using teardown doesn't
allow you to release the connection since it is a best-effort thing. Not
releasing the connection makes you pay a lot - AWS ;) - or prevents you
from launching other processing - concurrency limits.

This is a trivial and common case where a clear instance lifecycle is
required to have a well-behaved and portable IO - or a transform for
distributed locks, for instance, though that case is more complex.

There are APIs requiring acquire and release calls, and setup/teardown are
the best place to do it - but then they must be a must.

If bundle size were no longer runner-dependent - I don't think that is
possible with the coming SDF, but it explains why bundles are not an option
- bundles could become an option, since the user could set them big enough
to ignore the other side effects (concurrency, perf, etc.). That would also
allow getting rid of the maxSize configs, which are quite weird for a batch
solution (chunking/commit-interval has been native for more than 30 years
in most batch products, no?). It may need to be fixed later by enriching
the SDF API once supported by all runners; let's maybe ignore it for this
thread.
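
For reference, the chunking/commit-interval pattern referred to above
amounts to: buffer elements, commit every N records, and commit the
remainder at the end. A minimal sketch (plain Java, no Beam):

```java
// Sketch of the classic batch commit-interval: buffer elements and commit
// every `interval` records, plus a final commit for any remainder - the
// behavior described as native in traditional batch products.
import java.util.ArrayList;
import java.util.List;

public class CommitInterval {

    // Returns how many commits processing `elements` records takes with the
    // given commit interval.
    static int commits(int elements, int interval) {
        List<Integer> buffer = new ArrayList<>();
        int commits = 0;
        for (int i = 0; i < elements; i++) {
            buffer.add(i);
            if (buffer.size() == interval) {
                buffer.clear(); // commit the full chunk
                commits++;
            }
        }
        if (!buffer.isEmpty()) { // commit the final partial chunk
            buffer.clear();
            commits++;
        }
        return commits;
    }

    public static void main(String[] args) {
        System.out.println(commits(10, 3)); // 10 records, interval 3: 4 commits
    }
}
```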



>
> Le 17 févr. 2018 21:11, "Jean-Baptiste Onofré" <jb...@nanthrax.net> a écrit :
>
>> I agree, it's a decent assumption.
>>
>> Regards
>> JB
>>
>> On 02/17/2018 05:59 PM, Romain Manni-Bucau wrote:
>> > Assuming a Pipeline.run(); the corresponding sequence:
>> >
>> > WorkerStartFn();
>> > WorkerEndFn();
>> >
>> > So a single instance of the fn for the full pipeline execution.
>> >
>> > Le 17 févr. 2018 17:42, "Reuven Lax" <relax@google.com
>> > <ma...@google.com>> a écrit :
>> >
>> >     " and a transform is by design bound to an execution"
>> >
>> >     What do you mean by execution?
>> >
>> >     On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com
>> >     <ma...@gmail.com>> wrote:
>> >
>> >
>> >
>> >         Le 16 févr. 2018 22:41, "Reuven Lax" <relax@google.com
>> >         <ma...@google.com>> a écrit :
>> >
>> >             Kenn is correct. Allowing Fn reuse across bundles was a
>> major, major
>> >             performance improvement. Profiling on the old Dataflow SDKs
>> >             consistently showed Java serialization being the number one
>> >             performance bottleneck for streaming pipelines, and Beam
>> fixed this.
>> >
>> >
>> >         Sorry but this doesnt help me much to understand. Let me try to
>> explain.
>> >         I read it as "we were slow somehow around serialization so a
>> quick fix
>> >         was caching".
>> >
>> >         It is not to be picky but i had a lot of remote ejb over rmi
>> super fast
>> >         setup do java serialization is slower than alternative
>> serialization,
>> >         right, but doesnt justify caching most of the time.
>> >
>> >         My main interrogation is: isnt beam which is designed to be
>> slow in the
>> >         way it designed the dofn/transform and therefore serializes way
>> more
>> >         than it requires - you never care to serialize the full
>> transform and
>> >         can in 95% do a writeReplace which is light and fast compared
>> to the
>> >         default.
>> >
>> >         If so the cache is an implementation workaround and not a fix.
>> >
>> >         Hope my view is clearer on it.
>> >
>> >
>> >
>> >             Romain - can you state precisely what you want? I do think
>> there is
>> >             still a gap - IMO there's a place for a longer-lived per-fn
>> >             container; evidence for this is that people still often
>> need to use
>> >             statics to store things. However I'm not sure if this is
>> what you're
>> >             looking for.
>> >
>> >
>> >         Yes. I build a framework on top of beam and must be able to
>> provide a
>> >         lifecycle clear and reliable. The bare minimum for any user is
>> >         start-exec-stop and a transform is by design bound to an
>> execution
>> >         (stream or batch).
>> >
>> >         Bundles are not an option as explained cause not bound to the
>> execution
>> >         but an uncontrolled subpart. You can see it as a beam internal
>> until
>> >         runners unify this definition. And in any case it is closer to
>> a chunk
>> >         notion than a lifecycle one.
>> >
>> >         So setup and teardown must be symmetric.
>> >
>> >         Note that a dofn instance owns a config so is bound to an
>> execution.
>> >
>> >         This all lead to the nees of a reliable teardown.
>> >
>> >         Caching can be neat bit requires it own api like passivation
>> one of ejbs.
>> >
>> >
>> >
>> >             Reuven
>> >
>> >             On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <
>> klk@google.com
>> >             <ma...@google.com>> wrote:
>> >
>> >                 On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau
>> >                 <rmannibucau@gmail.com <ma...@gmail.com>>
>> wrote:
>> >
>> >                     The serialization of fn being once per bundle, the
>> perf
>> >                     impact is only huge if there is a bug somewhere
>> else, even
>> >                     java serialization is negligeable on big config
>> compared to
>> >                     any small pipeline (seconds vs minutes).
>> >
>> >
>> >                 Profiling is clear that this is a huge performance
>> impact. One
>> >                 of the most important backwards-incompatible changes we
>> made for
>> >                 Beam 2.0.0 was to allow Fn reuse across bundles.
>> >
>> >                 When we used a DoFn only for one bundle, there was no
>> @Teardown
>> >                 because it has ~no use. You do everything in
>> @FinishBundle. So
>> >                 for whatever use case you are working on, if your
>> pipeline
>> >                 performs well enough doing it per bundle, you can put
>> it in
>> >                 @FinishBundle. Of course it still might not get called
>> because
>> >                 that is a logical impossibility - you just know that
>> for a given
>> >                 element the element will be retried if @FinishBundle
>> fails.
>> >
>> >                 If you have cleanup logic that absolutely must get
>> executed,
>> >                 then you need to build a composite PTransform around it
>> so it
>> >                 will be retried until cleanup succeeds. In Beam's sinks
>> you can
>> >                 find many examples.
>> >
>> >                 Kenn
>> >
>> >
>> >
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbonofre@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>
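
Kenn's suggestion in the quoted thread - build a composite around cleanup so
it is retried until it succeeds - can be sketched outside Beam as follows.
The helper below is a hypothetical illustration, not a Beam API.

```java
// Plain-Java sketch of "retry cleanup until it succeeds" (the composite
// PTransform idea from the quoted thread, without Beam): the cleanup action
// is re-invoked until it completes normally, up to a bound.
import java.util.concurrent.Callable;

public class RetriedCleanup {

    // Runs `cleanup` until it returns normally, at most maxAttempts times.
    // Returns the number of attempts used; rethrows the last failure if all
    // attempts fail.
    static int cleanupWithRetries(Callable<Void> cleanup, int maxAttempts) throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                cleanup.call();
                return attempt;
            } catch (Exception e) {
                last = e; // a real runner would also back off and log here
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] failures = {2}; // fail twice, then succeed
        int attempts = cleanupWithRetries(() -> {
            if (failures[0]-- > 0) throw new IllegalStateException("transient");
            return null;
        }, 5);
        System.out.println("attempts=" + attempts);
    }
}
```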

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
On Sat, Feb 17, 2018 at 1:10 PM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> You phrased it right Eugene - thanks for that.
>
> However the solution is not functional I think - hope I missed something.
> With distribution etc you cant use by reference param passing, therefore no
> way to clean up the internal states of another fn. So i kind of feel back
> to the original need :(.
>
Correct - in a fault-tolerant distributed data processing system you can't
have one PTransform talk to another PTransform as if it was an in-memory
Java object. If you want to talk to an in-memory non-distributed Java
object, you have to keep it contained to a single block of your own Java
code (e.g. a single ProcessElement call) and use try-with-resources or
something.
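
To make that concrete, here is a minimal plain-Java sketch (not Beam API; the Connection class and its methods are invented for illustration) of a resource whose whole lifetime is contained in one block, the way a non-distributed object would be inside a single ProcessElement call:

```java
import java.util.ArrayList;
import java.util.List;

public class ScopedResourceExample {
    // Stand-in for a costly, non-distributed resource (e.g. a client
    // connection). Because it is AutoCloseable, try-with-resources
    // guarantees cleanup without any cross-transform teardown.
    static class Connection implements AutoCloseable {
        final List<String> sent = new ArrayList<>();
        void send(String s) { sent.add(s); }
        @Override public void close() { sent.clear(); } // cleanup always runs
    }

    // Analogous to the body of a single ProcessElement call: the
    // resource is opened, used, and closed entirely within this block,
    // even if send() were to throw.
    static int processElement(String element) {
        try (Connection c = new Connection()) {
            c.send(element);
            return c.sent.size();
        }
    }

    public static void main(String[] args) {
        System.out.println(processElement("hello")); // prints 1
    }
}
```

The point is that no other transform ever needs to clean up this object, because its scope never escapes the block.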

If you give an example of a high-level need (e.g. "I'm trying to write an
IO for system $x and it requires the following initialization and the
following cleanup logic and the following processing in between") I'll be
better able to help you.


>
> Also interested in a probably stupid question: why
> teardown/setup/startbundle/finishbundle are in the api if it is not usable?
> Dont we want as a portable lib fix that?
>
I'm not following this: they are usable, are successfully used in many IO
connectors, and their advertised behavior is implemented by all runners
(though with somewhat different performance characteristics), including
when using the portability framework. If you feel that this is not the
case, please file a JIRA with a code example that behaves not as you
expected in a specific runner.


>
> Once again i see to technical blocker - checked flink/spark/direct runners
> - to make it usable so why not simplifying user lifes? Anything important i
> miss?
>
> Le 17 févr. 2018 21:11, "Jean-Baptiste Onofré" <jb...@nanthrax.net> a écrit :
>
>> I agree, it's a decent assumption.
>>
>> Regards
>> JB
>>
>> On 02/17/2018 05:59 PM, Romain Manni-Bucau wrote:
>> > Assuming a Pipeline.run(); the corresponding sequence:
>> >
>> > WorkerStartFn();
>> > WorkerEndFn();
>> >
>> > So a single instance of the fn for the full pipeline execution.
>> >
>> > Le 17 févr. 2018 17:42, "Reuven Lax" <relax@google.com
>> > <ma...@google.com>> a écrit :
>> >
>> >     " and a transform is by design bound to an execution"
>> >
>> >     What do you mean by execution?
>> >
>> >     On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com
>> >     <ma...@gmail.com>> wrote:
>> >
>> >
>> >
>> >         Le 16 févr. 2018 22:41, "Reuven Lax" <relax@google.com
>> >         <ma...@google.com>> a écrit :
>> >
>> >             Kenn is correct. Allowing Fn reuse across bundles was a
>> major, major
>> >             performance improvement. Profiling on the old Dataflow SDKs
>> >             consistently showed Java serialization being the number one
>> >             performance bottleneck for streaming pipelines, and Beam
>> fixed this.
>> >
>> >
>> >         Sorry but this doesnt help me much to understand. Let me try to
>> explain.
>> >         I read it as "we were slow somehow around serialization so a
>> quick fix
>> >         was caching".
>> >
>> >         It is not to be picky but i had a lot of remote ejb over rmi
>> super fast
>> >         setup do java serialization is slower than alternative
>> serialization,
>> >         right, but doesnt justify caching most of the time.
>> >
>> >         My main interrogation is: isnt beam which is designed to be
>> slow in the
>> >         way it designed the dofn/transform and therefore serializes way
>> more
>> >         than it requires - you never care to serialize the full
>> transform and
>> >         can in 95% do a writeReplace which is light and fast compared
>> to the
>> >         default.
>> >
>> >         If so the cache is an implementation workaround and not a fix.
>> >
>> >         Hope my view is clearer on it.
>> >
>> >
>> >
>> >             Romain - can you state precisely what you want? I do think
>> there is
>> >             still a gap - IMO there's a place for a longer-lived per-fn
>> >             container; evidence for this is that people still often
>> need to use
>> >             statics to store things. However I'm not sure if this is
>> what you're
>> >             looking for.
>> >
>> >
>> >         Yes. I build a framework on top of beam and must be able to
>> provide a
>> >         lifecycle clear and reliable. The bare minimum for any user is
>> >         start-exec-stop and a transform is by design bound to an
>> execution
>> >         (stream or batch).
>> >
>> >         Bundles are not an option as explained cause not bound to the
>> execution
>> >         but an uncontrolled subpart. You can see it as a beam internal
>> until
>> >         runners unify this definition. And in any case it is closer to
>> a chunk
>> >         notion than a lifecycle one.
>> >
>> >         So setup and teardown must be symmetric.
>> >
>> >         Note that a dofn instance owns a config so is bound to an
>> execution.
>> >
>> >         This all lead to the nees of a reliable teardown.
>> >
>> >         Caching can be neat bit requires it own api like passivation
>> one of ejbs.
>> >
>> >
>> >
>> >             Reuven
>> >
>> >             On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <
>> klk@google.com
>> >             <ma...@google.com>> wrote:
>> >
>> >                 On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau
>> >                 <rmannibucau@gmail.com <ma...@gmail.com>>
>> wrote:
>> >
>> >                     The serialization of fn being once per bundle, the
>> perf
>> >                     impact is only huge if there is a bug somewhere
>> else, even
>> >                     java serialization is negligeable on big config
>> compared to
>> >                     any small pipeline (seconds vs minutes).
>> >
>> >
>> >                 Profiling is clear that this is a huge performance
>> impact. One
>> >                 of the most important backwards-incompatible changes we
>> made for
>> >                 Beam 2.0.0 was to allow Fn reuse across bundles.
>> >
>> >                 When we used a DoFn only for one bundle, there was no
>> @Teardown
>> >                 because it has ~no use. You do everything in
>> @FinishBundle. So
>> >                 for whatever use case you are working on, if your
>> pipeline
>> >                 performs well enough doing it per bundle, you can put
>> it in
>> >                 @FinishBundle. Of course it still might not get called
>> because
>> >                 that is a logical impossibility - you just know that
>> for a given
>> >                 element the element will be retried if @FinishBundle
>> fails.
>> >
>> >                 If you have cleanup logic that absolutely must get
>> executed,
>> >                 then you need to build a composite PTransform around it
>> so it
>> >                 will be retried until cleanup succeeds. In Beam's sinks
>> you can
>> >                 find many examples.
>> >
>> >                 Kenn
>> >
>> >
>> >
>> >
>>
>> --
>> Jean-Baptiste Onofré
>> jbonofre@apache.org
>> http://blog.nanthrax.net
>> Talend - http://www.talend.com
>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
You phrased it right Eugene - thanks for that.

However the solution is not functional I think - hope I missed something.
With distribution etc. you can't use pass-by-reference parameters, therefore
there is no way to clean up the internal state of another fn. So I kind of
fall back to the original need :(.

Also interested in a probably stupid question: why are
teardown/setup/startbundle/finishbundle in the API if they are not usable?
Don't we want, as a portable lib, to fix that?

Once again I see no technical blocker - I checked the Flink/Spark/direct
runners - to making it usable, so why not simplify users' lives? Anything
important I miss?

On 17 Feb 2018 21:11, "Jean-Baptiste Onofré" <jb...@nanthrax.net> wrote:

> I agree, it's a decent assumption.
>
> Regards
> JB
>
> On 02/17/2018 05:59 PM, Romain Manni-Bucau wrote:
> > Assuming a Pipeline.run(); the corresponding sequence:
> >
> > WorkerStartFn();
> > WorkerEndFn();
> >
> > So a single instance of the fn for the full pipeline execution.
> >
> > Le 17 févr. 2018 17:42, "Reuven Lax" <relax@google.com
> > <ma...@google.com>> a écrit :
> >
> >     " and a transform is by design bound to an execution"
> >
> >     What do you mean by execution?
> >
> >     On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <
> rmannibucau@gmail.com
> >     <ma...@gmail.com>> wrote:
> >
> >
> >
> >         Le 16 févr. 2018 22:41, "Reuven Lax" <relax@google.com
> >         <ma...@google.com>> a écrit :
> >
> >             Kenn is correct. Allowing Fn reuse across bundles was a
> major, major
> >             performance improvement. Profiling on the old Dataflow SDKs
> >             consistently showed Java serialization being the number one
> >             performance bottleneck for streaming pipelines, and Beam
> fixed this.
> >
> >
> >         Sorry but this doesnt help me much to understand. Let me try to
> explain.
> >         I read it as "we were slow somehow around serialization so a
> quick fix
> >         was caching".
> >
> >         It is not to be picky but i had a lot of remote ejb over rmi
> super fast
> >         setup do java serialization is slower than alternative
> serialization,
> >         right, but doesnt justify caching most of the time.
> >
> >         My main interrogation is: isnt beam which is designed to be slow
> in the
> >         way it designed the dofn/transform and therefore serializes way
> more
> >         than it requires - you never care to serialize the full
> transform and
> >         can in 95% do a writeReplace which is light and fast compared to
> the
> >         default.
> >
> >         If so the cache is an implementation workaround and not a fix.
> >
> >         Hope my view is clearer on it.
> >
> >
> >
> >             Romain - can you state precisely what you want? I do think
> there is
> >             still a gap - IMO there's a place for a longer-lived per-fn
> >             container; evidence for this is that people still often need
> to use
> >             statics to store things. However I'm not sure if this is
> what you're
> >             looking for.
> >
> >
> >         Yes. I build a framework on top of beam and must be able to
> provide a
> >         lifecycle clear and reliable. The bare minimum for any user is
> >         start-exec-stop and a transform is by design bound to an
> execution
> >         (stream or batch).
> >
> >         Bundles are not an option as explained cause not bound to the
> execution
> >         but an uncontrolled subpart. You can see it as a beam internal
> until
> >         runners unify this definition. And in any case it is closer to a
> chunk
> >         notion than a lifecycle one.
> >
> >         So setup and teardown must be symmetric.
> >
> >         Note that a dofn instance owns a config so is bound to an
> execution.
> >
> >         This all lead to the nees of a reliable teardown.
> >
> >         Caching can be neat bit requires it own api like passivation one
> of ejbs.
> >
> >
> >
> >             Reuven
> >
> >             On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <
> klk@google.com
> >             <ma...@google.com>> wrote:
> >
> >                 On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau
> >                 <rmannibucau@gmail.com <ma...@gmail.com>>
> wrote:
> >
> >                     The serialization of fn being once per bundle, the
> perf
> >                     impact is only huge if there is a bug somewhere
> else, even
> >                     java serialization is negligeable on big config
> compared to
> >                     any small pipeline (seconds vs minutes).
> >
> >
> >                 Profiling is clear that this is a huge performance
> impact. One
> >                 of the most important backwards-incompatible changes we
> made for
> >                 Beam 2.0.0 was to allow Fn reuse across bundles.
> >
> >                 When we used a DoFn only for one bundle, there was no
> @Teardown
> >                 because it has ~no use. You do everything in
> @FinishBundle. So
> >                 for whatever use case you are working on, if your
> pipeline
> >                 performs well enough doing it per bundle, you can put it
> in
> >                 @FinishBundle. Of course it still might not get called
> because
> >                 that is a logical impossibility - you just know that for
> a given
> >                 element the element will be retried if @FinishBundle
> fails.
> >
> >                 If you have cleanup logic that absolutely must get
> executed,
> >                 then you need to build a composite PTransform around it
> so it
> >                 will be retried until cleanup succeeds. In Beam's sinks
> you can
> >                 find many examples.
> >
> >                 Kenn
> >
> >
> >
> >
>
> --
> Jean-Baptiste Onofré
> jbonofre@apache.org
> http://blog.nanthrax.net
> Talend - http://www.talend.com
>

Re: @TearDown guarantees

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
I agree, it's a decent assumption.

Regards
JB

On 02/17/2018 05:59 PM, Romain Manni-Bucau wrote:
> Assuming a Pipeline.run(); the corresponding sequence:
> 
> WorkerStartFn();
> WorkerEndFn();
> 
> So a single instance of the fn for the full pipeline execution.
> 
> Le 17 févr. 2018 17:42, "Reuven Lax" <relax@google.com
> <ma...@google.com>> a écrit :
> 
>     " and a transform is by design bound to an execution"
> 
>     What do you mean by execution?
> 
>     On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <rmannibucau@gmail.com
>     <ma...@gmail.com>> wrote:
> 
> 
> 
>         Le 16 févr. 2018 22:41, "Reuven Lax" <relax@google.com
>         <ma...@google.com>> a écrit :
> 
>             Kenn is correct. Allowing Fn reuse across bundles was a major, major
>             performance improvement. Profiling on the old Dataflow SDKs
>             consistently showed Java serialization being the number one
>             performance bottleneck for streaming pipelines, and Beam fixed this.
> 
> 
>         Sorry but this doesnt help me much to understand. Let me try to explain.
>         I read it as "we were slow somehow around serialization so a quick fix
>         was caching".
> 
>         It is not to be picky but i had a lot of remote ejb over rmi super fast
>         setup do java serialization is slower than alternative serialization,
>         right, but doesnt justify caching most of the time.
> 
>         My main interrogation is: isnt beam which is designed to be slow in the
>         way it designed the dofn/transform and therefore serializes way more
>         than it requires - you never care to serialize the full transform and
>         can in 95% do a writeReplace which is light and fast compared to the
>         default.
> 
>         If so the cache is an implementation workaround and not a fix.
> 
>         Hope my view is clearer on it.
> 
> 
> 
>             Romain - can you state precisely what you want? I do think there is
>             still a gap - IMO there's a place for a longer-lived per-fn
>             container; evidence for this is that people still often need to use
>             statics to store things. However I'm not sure if this is what you're
>             looking for.
> 
> 
>         Yes. I build a framework on top of beam and must be able to provide a
>         lifecycle clear and reliable. The bare minimum for any user is
>         start-exec-stop and a transform is by design bound to an execution
>         (stream or batch).
> 
>         Bundles are not an option as explained cause not bound to the execution
>         but an uncontrolled subpart. You can see it as a beam internal until
>         runners unify this definition. And in any case it is closer to a chunk
>         notion than a lifecycle one.
> 
>         So setup and teardown must be symmetric.
> 
>         Note that a dofn instance owns a config so is bound to an execution.
> 
>         This all lead to the nees of a reliable teardown.
> 
>         Caching can be neat bit requires it own api like passivation one of ejbs.
> 
> 
> 
>             Reuven
> 
>             On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <klk@google.com
>             <ma...@google.com>> wrote:
> 
>                 On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau
>                 <rmannibucau@gmail.com <ma...@gmail.com>> wrote:
> 
>                     The serialization of fn being once per bundle, the perf
>                     impact is only huge if there is a bug somewhere else, even
>                     java serialization is negligeable on big config compared to
>                     any small pipeline (seconds vs minutes).
> 
>                  
>                 Profiling is clear that this is a huge performance impact. One
>                 of the most important backwards-incompatible changes we made for
>                 Beam 2.0.0 was to allow Fn reuse across bundles.
> 
>                 When we used a DoFn only for one bundle, there was no @Teardown
>                 because it has ~no use. You do everything in @FinishBundle. So
>                 for whatever use case you are working on, if your pipeline
>                 performs well enough doing it per bundle, you can put it in
>                 @FinishBundle. Of course it still might not get called because
>                 that is a logical impossibility - you just know that for a given
>                 element the element will be retried if @FinishBundle fails.
> 
>                 If you have cleanup logic that absolutely must get executed,
>                 then you need to build a composite PTransform around it so it
>                 will be retried until cleanup succeeds. In Beam's sinks you can
>                 find many examples.
> 
>                 Kenn
> 
> 
> 
> 

-- 
Jean-Baptiste Onofré
jbonofre@apache.org
http://blog.nanthrax.net
Talend - http://www.talend.com

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
Actually the initialization should be treated using the Wait transform too.

So basically the pattern is just:
input.apply(Wait.on(...initialization result...))
  .apply(...your processing...)
  .apply(Wait.on(...finalization result...))

where initialization and finalization results can be computed using
arbitrary PTransforms.

On Sat, Feb 17, 2018 at 11:11 AM Eugene Kirpichov <ki...@google.com>
wrote:

> "Single instance of the fn for the full pipeline execution", if taken
> literally, is incompatible:
> - with parallelization: requiring a single instance rules out multiple
> parallel/distributed instances
> - with fault tolerance: what if the worker running this "single instance"
> crashes or becomes a zombie - then, obviously, we'll need to create another
> instance
> - with infinite collections: "full pipeline execution" is moot. More
> likely than not, you'd want this per window rather than truly globally
> Also, you probably want this sort of scoping at the level of arbitrary
> PTransforms, not DoFn's: what if at some point you need to refactor the
> DoFn into a more complex transform?
>
> But I think I understand what you mean and at the core, it's a legitimate
> need. Please correct me if this is wrong: you want to be able to write
> per-window initialization/finalization code - "code that will run before
> this PTransform starts processing any elements in window W", and "code that
> will run at a point when it's guaranteed that this PTransform will never be
> asked to process any elements in window W".
>
> We can flip this to be instead about PCollections: "code that will run
> before any PTransform can observe any elements from this collection in
> window W" and "code that will run at a point when no PTransform can ever
> observe any elements from this collection in window W".
>
> We've faced this need multiple times in the past in various forms and we
> know how to address it. This need is faced e.g. by BigQueryIO.read() and
> write(), which need to create import/export jobs ("WorkerStartFn") and
> clean up temporary files once they are done ("WorkerEndFn"). However, this
> is a separate need from the one satisfied by @Setup/@Teardown or the bundle
> methods.
>
> - @ProcessElement is the only semantically necessary primitive. What you
> want can be achieved without either of Start/FinishBundle or
> Setup/Teardown, in a way I describe below.
> - @Start/FinishBundle is a semantically no-op optimization that allows you
> to amortize the cost of processing multiple elements that can be processed
> together (batching), by letting you know what is the scope over which you
> are and aren't allowed to amortize.
> - @Setup/@Teardown is a semantically no-op optimization that allows you to
> share costly long-lived resources between bundles running in the same
> thread.
>
> What you want can be achieved using side inputs and the Wait transform,
> e.g.:
>
> PCollection<Foo> rawInput = ...;
> PCollectionView<Void> initResult = ...apply initialization transform...
> .apply(View.asSingleton());
> PCollection<Foo> protectedInput =
> rawInput.apply(ParDo.of(...identity...).withSideInputs(initResult));
> // protectedInput has the property that, by the time you process it, the
> initialization transform has already run.
> PCollection<Bar> rawOutput = protectedInput.apply(...your processing...);
> PCollection<Void> finalizationResult = ...apply finalization transform,
> possibly using "rawOutput"...;
> PCollection<Bar> finalizedOutput =
> rawOutput.apply(Wait.on(finalizationResult));
> // finalizedOutput has the property that, by the time you process it, the
> finalization transform has already run
>
>
>
> On Sat, Feb 17, 2018 at 8:59 AM Romain Manni-Bucau <rm...@gmail.com>
> wrote:
>
>> Assuming a Pipeline.run(); the corresponding sequence:
>>
>> WorkerStartFn();
>> WorkerEndFn();
>>
>> So a single instance of the fn for the full pipeline execution.
>>
>> Le 17 févr. 2018 17:42, "Reuven Lax" <re...@google.com> a écrit :
>>
>>> " and a transform is by design bound to an execution"
>>>
>>> What do you mean by execution?
>>>
>>> On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>>
>>>>
>>>> Le 16 févr. 2018 22:41, "Reuven Lax" <re...@google.com> a écrit :
>>>>
>>>> Kenn is correct. Allowing Fn reuse across bundles was a major, major
>>>> performance improvement. Profiling on the old Dataflow SDKs consistently
>>>> showed Java serialization being the number one performance bottleneck for
>>>> streaming pipelines, and Beam fixed this.
>>>>
>>>>
>>>> Sorry but this doesnt help me much to understand. Let me try to
>>>> explain. I read it as "we were slow somehow around serialization so a quick
>>>> fix was caching".
>>>>
>>>> It is not to be picky but i had a lot of remote ejb over rmi super fast
>>>> setup do java serialization is slower than alternative serialization,
>>>> right, but doesnt justify caching most of the time.
>>>>
>>>> My main interrogation is: isnt beam which is designed to be slow in the
>>>> way it designed the dofn/transform and therefore serializes way more than
>>>> it requires - you never care to serialize the full transform and can in 95%
>>>> do a writeReplace which is light and fast compared to the default.
>>>>
>>>> If so the cache is an implementation workaround and not a fix.
>>>>
>>>> Hope my view is clearer on it.
>>>>
>>>>
>>>>
>>>> Romain - can you state precisely what you want? I do think there is
>>>> still a gap - IMO there's a place for a longer-lived per-fn container;
>>>> evidence for this is that people still often need to use statics to store
>>>> things. However I'm not sure if this is what you're looking for.
>>>>
>>>>
>>>> Yes. I build a framework on top of beam and must be able to provide a
>>>> lifecycle clear and reliable. The bare minimum for any user is
>>>> start-exec-stop and a transform is by design bound to an execution (stream
>>>> or batch).
>>>>
>>>> Bundles are not an option as explained cause not bound to the execution
>>>> but an uncontrolled subpart. You can see it as a beam internal until
>>>> runners unify this definition. And in any case it is closer to a chunk
>>>> notion than a lifecycle one.
>>>>
>>>> So setup and teardown must be symmetric.
>>>>
>>>> Note that a dofn instance owns a config so is bound to an execution.
>>>>
>>>> This all lead to the nees of a reliable teardown.
>>>>
>>>> Caching can be neat bit requires it own api like passivation one of
>>>> ejbs.
>>>>
>>>>
>>>>
>>>> Reuven
>>>>
>>>> On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <kl...@google.com>
>>>> wrote:
>>>>
>>>>> On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>> The serialization of fn being once per bundle, the perf impact is
>>>>>> only huge if there is a bug somewhere else, even java serialization is
>>>>>> negligeable on big config compared to any small pipeline (seconds vs
>>>>>> minutes).
>>>>>>
>>>>>
>>>>> Profiling is clear that this is a huge performance impact. One of the
>>>>> most important backwards-incompatible changes we made for Beam 2.0.0 was to
>>>>> allow Fn reuse across bundles.
>>>>>
>>>>> When we used a DoFn only for one bundle, there was no @Teardown
>>>>> because it has ~no use. You do everything in @FinishBundle. So for whatever
>>>>> use case you are working on, if your pipeline performs well enough doing it
>>>>> per bundle, you can put it in @FinishBundle. Of course it still might not
>>>>> get called because that is a logical impossibility - you just know that for
>>>>> a given element the element will be retried if @FinishBundle fails.
>>>>>
>>>>> If you have cleanup logic that absolutely must get executed, then you
>>>>> need to build a composite PTransform around it so it will be retried until
>>>>> cleanup succeeds. In Beam's sinks you can find many examples.
>>>>>
>>>>> Kenn
>>>>>
>>>>>
>>>>
>>>>
>>>

Re: @TearDown guarantees

Posted by Eugene Kirpichov <ki...@google.com>.
"Single instance of the fn for the full pipeline execution", if taken
literally, is incompatible:
- with parallelization: requiring a single instance rules out multiple
parallel/distributed instances
- with fault tolerance: what if the worker running this "single instance"
crashes or becomes a zombie - then, obviously, we'll need to create another
instance
- with infinite collections: "full pipeline execution" is moot. More likely
than not, you'd want this per window rather than truly globally
Also, you probably want this sort of scoping at the level of arbitrary
PTransforms, not DoFn's: what if at some point you need to refactor the
DoFn into a more complex transform?

But I think I understand what you mean and at the core, it's a legitimate
need. Please correct me if this is wrong: you want to be able to write
per-window initialization/finalization code - "code that will run before
this PTransform starts processing any elements in window W", and "code that
will run at a point when it's guaranteed that this PTransform will never be
asked to process any elements in window W".

We can flip this to be instead about PCollections: "code that will run
before any PTransform can observe any elements from this collection in
window W" and "code that will run at a point when no PTransform can ever
observe any elements from this collection in window W".

We've faced this need multiple times in the past in various forms and we
know how to address it. This need is faced e.g. by BigQueryIO.read() and
write(), which need to create import/export jobs ("WorkerStartFn") and
clean up temporary files once they are done ("WorkerEndFn"). However, this
is a separate need from the one satisfied by @Setup/@Teardown or the bundle
methods.

- @ProcessElement is the only semantically necessary primitive. What you
want can be achieved without either of Start/FinishBundle or
Setup/Teardown, in a way I describe below.
- @Start/FinishBundle is a semantically no-op optimization that allows you
to amortize the cost of processing multiple elements that can be processed
together (batching), by letting you know what is the scope over which you
are and aren't allowed to amortize.
- @Setup/@Teardown is a semantically no-op optimization that allows you to
share costly long-lived resources between bundles running in the same
thread.
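
As a plain-Java sketch of the amortization these bullets describe (not Beam API; CostlyResource and the counter are invented for illustration), reusing one fn-scoped resource across many bundles means the construction cost is paid once, not once per bundle:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class FnReuseSketch {
    static final AtomicInteger constructions = new AtomicInteger();

    // Stand-in for an expensive resource acquired in @Setup and
    // released (best-effort) in @Teardown.
    static class CostlyResource {
        CostlyResource() { constructions.incrementAndGet(); }
        int process(int x) { return x + 1; }
        void release() { /* cleanup; may not run if the worker dies */ }
    }

    // Simulates one fn instance being reused across many bundles.
    static int runWorker(int bundles, int elementsPerBundle) {
        CostlyResource r = new CostlyResource(); // "@Setup": once per instance
        int processed = 0;
        for (int b = 0; b < bundles; b++) {      // all bundles share the resource
            for (int e = 0; e < elementsPerBundle; e++) {
                r.process(e);
                processed++;
            }
        }
        r.release();                             // "@Teardown": once per instance
        return processed;
    }

    public static void main(String[] args) {
        int processed = runWorker(100, 10);
        System.out.println(processed);           // 1000 elements processed
        System.out.println(constructions.get()); // 1 construction, not 100
    }
}
```

Had the resource been rebuilt per bundle (the pre-2.0.0 behavior), the construction count would be 100 here, which is the serialization/setup overhead the profiling surfaced.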

What you want can be achieved using side inputs and the Wait transform,
e.g.:

PCollection<Foo> rawInput = ...;
PCollectionView<Void> initResult = ...apply initialization transform...
.apply(View.asSingleton());
PCollection<Foo> protectedInput =
rawInput.apply(ParDo.of(...identity...).withSideInputs(initResult));
// protectedInput has the property that, by the time you process it, the
initialization transform has already run.
PCollection<Bar> rawOutput = protectedInput.apply(...your processing...);
PCollection<Void> finalizationResult = ...apply finalization transform,
possibly using "rawOutput"...;
PCollection<Bar> finalizedOutput =
rawOutput.apply(Wait.on(finalizationResult));
// finalizedOutput has the property that, by the time you process it, the
finalization transform has already run



On Sat, Feb 17, 2018 at 8:59 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

> Assuming a Pipeline.run(); the corresponding sequence:
>
> WorkerStartFn();
> WorkerEndFn();
>
> So a single instance of the fn for the full pipeline execution.
>
> Le 17 févr. 2018 17:42, "Reuven Lax" <re...@google.com> a écrit :
>
>> " and a transform is by design bound to an execution"
>>
>> What do you mean by execution?
>>
>> On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>>
>>>
>>> Le 16 févr. 2018 22:41, "Reuven Lax" <re...@google.com> a écrit :
>>>
>>> Kenn is correct. Allowing Fn reuse across bundles was a major, major
>>> performance improvement. Profiling on the old Dataflow SDKs consistently
>>> showed Java serialization being the number one performance bottleneck for
>>> streaming pipelines, and Beam fixed this.
>>>
>>>
>>> Sorry but this doesnt help me much to understand. Let me try to explain.
>>> I read it as "we were slow somehow around serialization so a quick fix was
>>> caching".
>>>
>>> It is not to be picky but i had a lot of remote ejb over rmi super fast
>>> setup do java serialization is slower than alternative serialization,
>>> right, but doesnt justify caching most of the time.
>>>
>>> My main interrogation is: isnt beam which is designed to be slow in the
>>> way it designed the dofn/transform and therefore serializes way more than
>>> it requires - you never care to serialize the full transform and can in 95%
>>> do a writeReplace which is light and fast compared to the default.
>>>
>>> If so the cache is an implementation workaround and not a fix.
>>>
>>> Hope my view is clearer on it.
>>>
>>>
>>>
>>> Romain - can you state precisely what you want? I do think there is
>>> still a gap - IMO there's a place for a longer-lived per-fn container;
>>> evidence for this is that people still often need to use statics to store
>>> things. However I'm not sure if this is what you're looking for.
>>>
>>>
>>> Yes. I build a framework on top of beam and must be able to provide a
>>> lifecycle clear and reliable. The bare minimum for any user is
>>> start-exec-stop and a transform is by design bound to an execution (stream
>>> or batch).
>>>
>>> Bundles are not an option as explained cause not bound to the execution
>>> but an uncontrolled subpart. You can see it as a beam internal until
>>> runners unify this definition. And in any case it is closer to a chunk
>>> notion than a lifecycle one.
>>>
>>> So setup and teardown must be symmetric.
>>>
>>> Note that a dofn instance owns a config so is bound to an execution.
>>>
>>> This all lead to the nees of a reliable teardown.
>>>
>>> Caching can be neat bit requires it own api like passivation one of ejbs.
>>>
>>>
>>>
>>> Reuven
>>>
>>> On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <kl...@google.com> wrote:
>>>
>>>> On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>> The serialization of fn being once per bundle, the perf impact is only
>>>>> huge if there is a bug somewhere else, even java serialization is
>>>>> negligeable on big config compared to any small pipeline (seconds vs
>>>>> minutes).
>>>>>
>>>>
>>>> Profiling is clear that this is a huge performance impact. One of the
>>>> most important backwards-incompatible changes we made for Beam 2.0.0 was to
>>>> allow Fn reuse across bundles.
>>>>
>>>> When we used a DoFn only for one bundle, there was no @Teardown because
>>>> it has ~no use. You do everything in @FinishBundle. So for whatever use
>>>> case you are working on, if your pipeline performs well enough doing it per
>>>> bundle, you can put it in @FinishBundle. Of course it still might not get
>>>> called because that is a logical impossibility - you just know that for a
>>>> given element the element will be retried if @FinishBundle fails.
>>>>
>>>> If you have cleanup logic that absolutely must get executed, then you
>>>> need to build a composite PTransform around it so it will be retried until
>>>> cleanup succeeds. In Beam's sinks you can find many examples.
>>>>
>>>> Kenn
>>>>
>>>>
>>>
>>>
>>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Assuming a Pipeline.run(), the corresponding sequence would be:

WorkerStartFn();
WorkerEndFn();

So a single instance of the fn for the full pipeline execution.
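
That contract - one fn instance per pipeline execution, with a symmetric, guaranteed teardown - can be sketched in plain Java. WorkerStartFn/WorkerEndFn above map to setup()/teardown() below; all names here are hypothetical, this is not the Beam API:

```java
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the contract described above: one fn instance per execution,
// teardown guaranteed to run exactly once even when processing fails.
public class LifecycleSketch {
    static final AtomicInteger setups = new AtomicInteger();
    static final AtomicInteger teardowns = new AtomicInteger();

    interface Fn<T> {
        default void setup() {}
        void process(T element);
        default void teardown() {}
    }

    // The runner drives setup once, all elements, then teardown in a
    // finally block so it runs even if an element throws.
    static <T> void execute(Fn<T> fn, List<T> elements) {
        fn.setup();
        try {
            for (T e : elements) {
                fn.process(e);
            }
        } finally {
            fn.teardown(); // symmetric with setup(): exactly one call per execution
        }
    }

    public static void main(String[] args) {
        Fn<String> fn = new Fn<String>() {
            public void setup() { setups.incrementAndGet(); }
            public void process(String s) { if (s.isEmpty()) throw new IllegalArgumentException(); }
            public void teardown() { teardowns.incrementAndGet(); }
        };
        try {
            execute(fn, List.of("a", "", "b"));
        } catch (IllegalArgumentException expected) {
            // the element failed, but teardown still ran
        }
        System.out.println(setups.get() + " " + teardowns.get()); // prints "1 1"
    }
}
```

The try/finally is the whole point: the teardown call is part of the driver's contract, not best effort.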

Le 17 févr. 2018 17:42, "Reuven Lax" <re...@google.com> a écrit :

> " and a transform is by design bound to an execution"
>
> What do you mean by execution?
>
> On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <
> rmannibucau@gmail.com> wrote:
>
>>
>>
>> Le 16 févr. 2018 22:41, "Reuven Lax" <re...@google.com> a écrit :
>>
>> Kenn is correct. Allowing Fn reuse across bundles was a major, major
>> performance improvement. Profiling on the old Dataflow SDKs consistently
>> showed Java serialization being the number one performance bottleneck for
>> streaming pipelines, and Beam fixed this.
>>
>>
>> Sorry but this doesnt help me much to understand. Let me try to explain.
>> I read it as "we were slow somehow around serialization so a quick fix was
>> caching".
>>
>> It is not to be picky but i had a lot of remote ejb over rmi super fast
>> setup do java serialization is slower than alternative serialization,
>> right, but doesnt justify caching most of the time.
>>
>> My main interrogation is: isnt beam which is designed to be slow in the
>> way it designed the dofn/transform and therefore serializes way more than
>> it requires - you never care to serialize the full transform and can in 95%
>> do a writeReplace which is light and fast compared to the default.
>>
>> If so the cache is an implementation workaround and not a fix.
>>
>> Hope my view is clearer on it.
>>
>>
>>
>> Romain - can you state precisely what you want? I do think there is still
>> a gap - IMO there's a place for a longer-lived per-fn container; evidence
>> for this is that people still often need to use statics to store things.
>> However I'm not sure if this is what you're looking for.
>>
>>
>> Yes. I build a framework on top of beam and must be able to provide a
>> lifecycle clear and reliable. The bare minimum for any user is
>> start-exec-stop and a transform is by design bound to an execution (stream
>> or batch).
>>
>> Bundles are not an option as explained cause not bound to the execution
>> but an uncontrolled subpart. You can see it as a beam internal until
>> runners unify this definition. And in any case it is closer to a chunk
>> notion than a lifecycle one.
>>
>> So setup and teardown must be symmetric.
>>
>> Note that a dofn instance owns a config so is bound to an execution.
>>
>> This all lead to the nees of a reliable teardown.
>>
>> Caching can be neat bit requires it own api like passivation one of ejbs.
>>
>>
>>
>> Reuven
>>
>> On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <kl...@google.com> wrote:
>>
>>> On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>>
>>>> The serialization of fn being once per bundle, the perf impact is only
>>>> huge if there is a bug somewhere else, even java serialization is
>>>> negligeable on big config compared to any small pipeline (seconds vs
>>>> minutes).
>>>>
>>>
>>> Profiling is clear that this is a huge performance impact. One of the
>>> most important backwards-incompatible changes we made for Beam 2.0.0 was to
>>> allow Fn reuse across bundles.
>>>
>>> When we used a DoFn only for one bundle, there was no @Teardown because
>>> it has ~no use. You do everything in @FinishBundle. So for whatever use
>>> case you are working on, if your pipeline performs well enough doing it per
>>> bundle, you can put it in @FinishBundle. Of course it still might not get
>>> called because that is a logical impossibility - you just know that for a
>>> given element the element will be retried if @FinishBundle fails.
>>>
>>> If you have cleanup logic that absolutely must get executed, then you
>>> need to build a composite PTransform around it so it will be retried until
>>> cleanup succeeds. In Beam's sinks you can find many examples.
>>>
>>> Kenn
>>>
>>>
>>
>>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
" and a transform is by design bound to an execution"

What do you mean by execution?

On Sat, Feb 17, 2018 at 12:50 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
>
> Le 16 févr. 2018 22:41, "Reuven Lax" <re...@google.com> a écrit :
>
> Kenn is correct. Allowing Fn reuse across bundles was a major, major
> performance improvement. Profiling on the old Dataflow SDKs consistently
> showed Java serialization being the number one performance bottleneck for
> streaming pipelines, and Beam fixed this.
>
>
> Sorry but this doesnt help me much to understand. Let me try to explain. I
> read it as "we were slow somehow around serialization so a quick fix was
> caching".
>
> It is not to be picky but i had a lot of remote ejb over rmi super fast
> setup do java serialization is slower than alternative serialization,
> right, but doesnt justify caching most of the time.
>
> My main interrogation is: isnt beam which is designed to be slow in the
> way it designed the dofn/transform and therefore serializes way more than
> it requires - you never care to serialize the full transform and can in 95%
> do a writeReplace which is light and fast compared to the default.
>
> If so the cache is an implementation workaround and not a fix.
>
> Hope my view is clearer on it.
>
>
>
> Romain - can you state precisely what you want? I do think there is still
> a gap - IMO there's a place for a longer-lived per-fn container; evidence
> for this is that people still often need to use statics to store things.
> However I'm not sure if this is what you're looking for.
>
>
> Yes. I build a framework on top of beam and must be able to provide a
> lifecycle clear and reliable. The bare minimum for any user is
> start-exec-stop and a transform is by design bound to an execution (stream
> or batch).
>
> Bundles are not an option as explained cause not bound to the execution
> but an uncontrolled subpart. You can see it as a beam internal until
> runners unify this definition. And in any case it is closer to a chunk
> notion than a lifecycle one.
>
> So setup and teardown must be symmetric.
>
> Note that a dofn instance owns a config so is bound to an execution.
>
> This all lead to the nees of a reliable teardown.
>
> Caching can be neat bit requires it own api like passivation one of ejbs.
>
>
>
> Reuven
>
> On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <kl...@google.com> wrote:
>
>> On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>>
>>> The serialization of fn being once per bundle, the perf impact is only
>>> huge if there is a bug somewhere else, even java serialization is
>>> negligeable on big config compared to any small pipeline (seconds vs
>>> minutes).
>>>
>>
>> Profiling is clear that this is a huge performance impact. One of the
>> most important backwards-incompatible changes we made for Beam 2.0.0 was to
>> allow Fn reuse across bundles.
>>
>> When we used a DoFn only for one bundle, there was no @Teardown because
>> it has ~no use. You do everything in @FinishBundle. So for whatever use
>> case you are working on, if your pipeline performs well enough doing it per
>> bundle, you can put it in @FinishBundle. Of course it still might not get
>> called because that is a logical impossibility - you just know that for a
>> given element the element will be retried if @FinishBundle fails.
>>
>> If you have cleanup logic that absolutely must get executed, then you
>> need to build a composite PTransform around it so it will be retried until
>> cleanup succeeds. In Beam's sinks you can find many examples.
>>
>> Kenn
>>
>>
>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 16 févr. 2018 22:41, "Reuven Lax" <re...@google.com> a écrit :

Kenn is correct. Allowing Fn reuse across bundles was a major, major
performance improvement. Profiling on the old Dataflow SDKs consistently
showed Java serialization being the number one performance bottleneck for
streaming pipelines, and Beam fixed this.


Sorry, but this doesn't help me much to understand. Let me try to explain. I
read it as "we were slow somehow around serialization so a quick fix was
caching".

Not to be picky, but I have had plenty of very fast remote-EJB-over-RMI
setups, so yes, Java serialization is slower than alternative serializations,
but that alone doesn't justify caching most of the time.

My main question is: isn't Beam designed to be slow in the way it designed
the DoFn/transform, and therefore serializing far more than it needs to? You
never need to serialize the full transform and can, in 95% of cases, provide
a writeReplace which is light and fast compared to the default.

If so the cache is an implementation workaround and not a fix.

Hope my view is clearer on it.
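
To make the writeReplace point concrete, here is a plain-Java sketch (class names are illustrative, not Beam's): the heavy fn ships a small serialization proxy holding only its config, instead of its full object graph:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamException;
import java.io.Serializable;

// Sketch of the writeReplace optimization: instead of letting default
// Java serialization walk a heavy object graph, the fn serializes a
// small proxy carrying only its configuration; the proxy rebuilds the
// fn on the receiving side via readResolve.
public class WriteReplaceSketch {
    static class HeavyFn implements Serializable {
        final String config;                          // the only state worth shipping
        transient byte[] scratch = new byte[1 << 20]; // heavy runtime-only state

        HeavyFn(String config) { this.config = config; }

        // Called by the serialization machinery: ship the light proxy instead.
        private Object writeReplace() throws ObjectStreamException {
            return new Proxy(config);
        }
    }

    static class Proxy implements Serializable {
        final String config;
        Proxy(String config) { this.config = config; }

        // Called on deserialization: rebuild the real fn from its config.
        private Object readResolve() throws ObjectStreamException {
            return new HeavyFn(config);
        }
    }

    // Serialize and deserialize, returning the surviving config.
    static String roundTripConfig(String config) {
        try {
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
                oos.writeObject(new HeavyFn(config));
            }
            try (ObjectInputStream ois =
                     new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()))) {
                return ((HeavyFn) ois.readObject()).config;
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTripConfig("jdbc:example")); // prints "jdbc:example"
    }
}
```

Only the proxy crosses the wire; the heavy runtime state is rebuilt locally.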



Romain - can you state precisely what you want? I do think there is still a
gap - IMO there's a place for a longer-lived per-fn container; evidence for
this is that people still often need to use statics to store things.
However I'm not sure if this is what you're looking for.


Yes. I build a framework on top of Beam and must be able to provide a clear
and reliable lifecycle. The bare minimum for any user is start-exec-stop, and
a transform is by design bound to an execution (stream or batch).

Bundles are not an option, as explained, because they are not bound to the
execution but to an uncontrolled subpart of it. You can see them as a Beam
internal until runners unify this definition. And in any case a bundle is
closer to a chunk notion than a lifecycle one.

So setup and teardown must be symmetric.

Note that a DoFn instance owns a config, so it is bound to an execution.

This all leads to the need for a reliable teardown.

Caching can be neat, but it requires its own API, like the passivation one of EJBs.
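
An EJB-passivation-style API for pooled fn instances could look like this. The hook names postBorrowFromCache/preReturnToCache come from an earlier message in this thread; nothing here exists in Beam, it only illustrates the shape of such an API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical sketch: if a runner pools fn instances across executions,
// it notifies the instance when it leaves and re-enters the pool, so the
// user can release and re-acquire execution-scoped resources.
public class PassivationSketch {
    interface PooledFn {
        void postBorrowFromCache(); // re-acquire connections, caches, etc.
        void preReturnToCache();    // release execution-scoped resources
    }

    static class Pool {
        private final Deque<PooledFn> idle = new ArrayDeque<>();

        PooledFn borrow(Supplier<PooledFn> factory) {
            PooledFn fn = idle.isEmpty() ? factory.get() : idle.pop();
            fn.postBorrowFromCache();
            return fn;
        }

        void release(PooledFn fn) {
            fn.preReturnToCache();
            idle.push(fn);
        }
    }

    // Borrow, return, then borrow again: the second borrow reuses the
    // pooled instance (the factory is never called a second time).
    static List<String> demo() {
        List<String> events = new ArrayList<>();
        Pool pool = new Pool();
        PooledFn fn = pool.borrow(() -> new PooledFn() {
            public void postBorrowFromCache() { events.add("borrow"); }
            public void preReturnToCache() { events.add("return"); }
        });
        pool.release(fn);
        pool.borrow(() -> null); // factory unused: instance comes from the pool
        return events;
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "[borrow, return, borrow]"
    }
}
```

With such hooks, caching stays an optimization the user can cooperate with instead of a leak in the lifecycle.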



Reuven

On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <kl...@google.com> wrote:

> On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <rmannibucau@gmail.com
> > wrote:
>>
>> The serialization of fn being once per bundle, the perf impact is only
>> huge if there is a bug somewhere else, even java serialization is
>> negligeable on big config compared to any small pipeline (seconds vs
>> minutes).
>>
>
> Profiling is clear that this is a huge performance impact. One of the most
> important backwards-incompatible changes we made for Beam 2.0.0 was to
> allow Fn reuse across bundles.
>
> When we used a DoFn only for one bundle, there was no @Teardown because it
> has ~no use. You do everything in @FinishBundle. So for whatever use case
> you are working on, if your pipeline performs well enough doing it per
> bundle, you can put it in @FinishBundle. Of course it still might not get
> called because that is a logical impossibility - you just know that for a
> given element the element will be retried if @FinishBundle fails.
>
> If you have cleanup logic that absolutely must get executed, then you need
> to build a composite PTransform around it so it will be retried until
> cleanup succeeds. In Beam's sinks you can find many examples.
>
> Kenn
>
>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
Kenn is correct. Allowing Fn reuse across bundles was a major, major
performance improvement. Profiling on the old Dataflow SDKs consistently
showed Java serialization being the number one performance bottleneck for
streaming pipelines, and Beam fixed this.

Romain - can you state precisely what you want? I do think there is still a
gap - IMO there's a place for a longer-lived per-fn container; evidence for
this is that people still often need to use statics to store things.
However I'm not sure if this is what you're looking for.

Reuven
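
The "statics to store things" workaround mentioned above typically looks like a reference-counted static holder, so the last teardown on a worker closes the shared resource. A plain-Java sketch - ExpensiveClient is an illustrative stand-in, and acquire/release would be called from @Setup/@Teardown:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of the static-holder workaround: fn instances come and go, so
// users park an expensive shared resource in a static, reference-counted
// slot shared by all instances in the same JVM worker.
public class StaticHolderSketch {
    static class ExpensiveClient {
        static final AtomicInteger opened = new AtomicInteger();
        static final AtomicInteger closed = new AtomicInteger();
        ExpensiveClient() { opened.incrementAndGet(); }
        void close() { closed.incrementAndGet(); }
    }

    private static ExpensiveClient shared;
    private static int refCount;

    static synchronized ExpensiveClient acquire() { // called from @Setup
        if (refCount++ == 0) {
            shared = new ExpensiveClient();          // first instance opens it
        }
        return shared;
    }

    static synchronized void release() {            // called from @Teardown
        if (--refCount == 0) {
            shared.close();                          // last instance closes it
            shared = null;
        }
    }

    public static void main(String[] args) {
        acquire(); acquire(); // two fn instances on the same worker
        release(); release(); // only the last release actually closes
        System.out.println(ExpensiveClient.opened.get() + " "
            + ExpensiveClient.closed.get()); // prints "1 1"
    }
}
```

Note the pattern still depends on teardown (and thus release) being called reliably, which is exactly the gap under discussion.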

On Fri, Feb 16, 2018 at 1:33 PM, Kenneth Knowles <kl...@google.com> wrote:

> On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <rmannibucau@gmail.com
> > wrote:
>>
>> The serialization of fn being once per bundle, the perf impact is only
>> huge if there is a bug somewhere else, even java serialization is
>> negligeable on big config compared to any small pipeline (seconds vs
>> minutes).
>>
>
> Profiling is clear that this is a huge performance impact. One of the most
> important backwards-incompatible changes we made for Beam 2.0.0 was to
> allow Fn reuse across bundles.
>
> When we used a DoFn only for one bundle, there was no @Teardown because it
> has ~no use. You do everything in @FinishBundle. So for whatever use case
> you are working on, if your pipeline performs well enough doing it per
> bundle, you can put it in @FinishBundle. Of course it still might not get
> called because that is a logical impossibility - you just know that for a
> given element the element will be retried if @FinishBundle fails.
>
> If you have cleanup logic that absolutely must get executed, then you need
> to build a composite PTransform around it so it will be retried until
> cleanup succeeds. In Beam's sinks you can find many examples.
>
> Kenn
>
>

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
On Fri, Feb 16, 2018 at 1:00 PM, Romain Manni-Bucau <rm...@gmail.com>
wrote:
>
> The serialization of fn being once per bundle, the perf impact is only
> huge if there is a bug somewhere else, even java serialization is
> negligeable on big config compared to any small pipeline (seconds vs
> minutes).
>

Profiling is clear that this is a huge performance impact. One of the most
important backwards-incompatible changes we made for Beam 2.0.0 was to
allow Fn reuse across bundles.

When we used a DoFn only for one bundle, there was no @Teardown because it
had ~no use: you did everything in @FinishBundle. So for whatever use case
you are working on, if your pipeline performs well enough doing it per
bundle, you can put it in @FinishBundle. Of course it still might not get
called - guaranteeing that is a logical impossibility - but you do know that
a given element will be retried if @FinishBundle fails.

If you have cleanup logic that absolutely must get executed, then you need
to build a composite PTransform around it so it will be retried until
cleanup succeeds. In Beam's sinks you can find many examples.

Kenn
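
The retried-cleanup property described above can be illustrated with a plain-Java driver. The retry loop stands in for the runner re-executing the (composite) transform that performs the cleanup; it is not Beam code:

```java
import java.util.function.BooleanSupplier;

// Sketch: cleanup that must happen is modeled as a step that is retried
// until it succeeds, rather than as a best-effort teardown hook.
public class RetriedCleanupSketch {
    // Returns the number of attempts the cleanup took to succeed.
    static int runWithRetries(BooleanSupplier cleanup, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            if (cleanup.getAsBoolean()) {
                return attempt; // cleanup finally succeeded
            }
        }
        throw new IllegalStateException("cleanup never succeeded");
    }

    // A cleanup that fails twice (e.g. transient backend errors), then succeeds.
    static int demo() {
        int[] failures = {2};
        return runWithRetries(() -> failures[0]-- <= 0, 10);
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "3": two failures, then success
    }
}
```

This is the semantic difference between the two hooks: @Teardown is at-most-once and best effort, while a cleanup transform is effectively retried-until-success.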

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Le 16 févr. 2018 19:28, "Kenneth Knowles" <kl...@google.com> a écrit :

On Fri, Feb 16, 2018 at 9:39 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:
>
> 2018-02-16 18:18 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>
>> Which runner's bundling are you concerned with? It sounds like the Flink
>> runner?
>>
>
> Flink, Spark, DirectRunner, DataFlow at least (others would be good but
> are out of scope)
>

AFAIK bundling logic/perf is satisfactory on Dataflow, DirectRunner (for
testing, so generates medium-sized local bundles) and SparkRunner (one
bundle per microbatch when streaming). So what issue did you notice there?


There is no place to clear execution caches and free pipeline-specific data
and resources.

This can't be done in bundles because it can hurt performance or, more
viciously, trip some connection-frequency limit of the backend.

Beam can't help here and should embrace these user constraints IMHO: runners
MUST - uppercase as in specs - call teardown per execution.

The serialization of the fn being once per bundle, the perf impact is only
huge if there is a bug somewhere else; even Java serialization is negligible
on a big config compared to any small pipeline (seconds vs minutes).

So no real perf issue - happy to check a real case if you can share one. A
severe security issue and a user issue both lead to a fix, which should be
in 2.4, no?


IIRC at some point the FlinkRunner had 1 element bundles in streaming.
Obviously if that is still the case it has to be fixed.

Kenn

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
On Fri, Feb 16, 2018 at 9:39 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:
>
> 2018-02-16 18:18 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>
>> Which runner's bundling are you concerned with? It sounds like the Flink
>> runner?
>>
>
> Flink, Spark, DirectRunner, DataFlow at least (others would be good but
> are out of scope)
>

AFAIK bundling logic/perf is satisfactory on Dataflow, DirectRunner (for
testing, so generates medium-sized local bundles) and SparkRunner (one
bundle per microbatch when streaming). So what issue did you notice there?

IIRC at some point the FlinkRunner had 1 element bundles in streaming.
Obviously if that is still the case it has to be fixed.

Kenn
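
Why bundle size matters for per-bundle work can be shown with a plain-Java simulation of a buffering fn (not the Beam API; the annotation names in the comments refer to Beam's):

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: a fn that buffers writes and flushes once per bundle pays one
// flush per bundle. A runner producing 1-element bundles therefore
// degenerates to one flush per element.
public class BundleFlushSketch {
    static int flushes;

    static class BatchingFn {
        private final List<String> buffer = new ArrayList<>();
        void processElement(String e) { buffer.add(e); }    // @ProcessElement: buffer the write
        void finishBundle() { buffer.clear(); flushes++; }  // @FinishBundle: flush buffered work
    }

    // Drive `elements` through the fn in bundles of `bundleSize` and
    // count how many flushes that costs.
    static int flushesFor(int elements, int bundleSize) {
        flushes = 0;
        BatchingFn fn = new BatchingFn();
        for (int i = 0; i < elements; i++) {
            fn.processElement("e" + i);
            if ((i + 1) % bundleSize == 0) fn.finishBundle();
        }
        if (elements % bundleSize != 0) fn.finishBundle(); // flush the partial last bundle
        return flushes;
    }

    public static void main(String[] args) {
        System.out.println(flushesFor(1_000_000, 1_000)); // prints "1000"
        System.out.println(flushesFor(1_000_000, 1));     // prints "1000000": the 1-element-bundle pathology
    }
}
```

This is the earlier "1 million calls in Flink and 1 call in Spark" complaint in miniature: @FinishBundle's cost is entirely a function of how the runner chooses bundles.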

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
So do I get it right that a leak of the Dataflow implementation impacts the
API? It also sounds like this perf issue is due to blind serialization
instead of modeling what is serialized - nothing should be slow enough in
the serialization at that level; do you have more details on that particular
point?
It also means you accept leaking particular instances' data like passwords
etc. (all the @AutoValue builder ones, typically) since you don't call - or
don't reliably call - a post-execution hook, which should get solved ASAP.

@Thomas: I understand your update was to align the Dataflow behavior with
the API, but actually the opposite should be done: align the Dataflow impl
with the API. If we disagree that tearDown is [1;1] - I'm fine with that -
then teardown is not really usable for users and we miss such an API.
"the fact that we leave the runner discretion on when it can call teardown
does not make this poorly-defined; it means that users should not depend on
teardown being called for correct behavior, and *this has always been true
and will continue to be true*."
This is not really the case; you say it yourself: "[...] does not make this
poorly-defined [...] it means that users should not depend on teardown".
That literally means @TearDown is not part of the API. Once again, I'm fine
with that, but this kind of API is needed.
"*this has always been true and will continue to be true*"
Not really, either, since it was not clear before and was runner-dependent,
so users can depend on it.

With both statements, I think it should just get fixed and made reliable,
which is technically possible IMHO, instead of creating a new API which
would turn teardown into a cache hook - an implementation detail which
shouldn't surface in the API.

An @AfterExecution hook? @FinishBundle runs once a bundle finishes, so it is
not a "finally" for the DoFn with regard to the execution.

Side note: the success-callback hook which has been discussed N times
doesn't match the need, which is really per instance (= accessible from that
particular instance and not globally) in both success and failure cases.


2018-02-16 18:18 GMT+01:00 Kenneth Knowles <kl...@google.com>:

> Which runner's bundling are you concerned with? It sounds like the Flink
> runner?
>

Flink, Spark, DirectRunner, DataFlow at least (others would be good but are
out of scope)


>
> Kenn
>
>
> On Fri, Feb 16, 2018 at 9:04 AM, Romain Manni-Bucau <rmannibucau@gmail.com
> > wrote:
>
>>
>> 2018-02-16 17:59 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>>
>>> What I am hearing is this:
>>>
>>>  - @FinishBundle does what you want (a reliable "flush" call) but your
>>> runner is not doing a good job of bundling
>>>
>>
>> Nop, finishbundle is defined but not a bundle. Typically for 1 million
>> rows I'll get 1 million calls in flink and 1 call in spark (today) so this
>> is not a way to call a final task to release dofn internal instances or do
>> some one time auditing.
>>
>>
>>>  - @Teardown has well-defined semantics and they are not what you want
>>>
>>
>> "
>> Note that calls to the annotated method are best effort, and may not
>> occur for arbitrary reasons"
>>
>> is not really "well-defined" and is also a breaking change compared to
>> the < 2.3.x (x >= 1) .
>>
>>
>>> So you are hoping for something that is called less frequently but is
>>> still mandatory.
>>>
>>> Just trying to establish the basics to start over and get this on track
>>> to solving the real problem.
>>>
>>
>> Concretely I need a well defined lifecycle for any DoFn executed in beam
>> and today there is no such a thing making it impossible to develop
>> correctly transforms/fn on an user side.
>>
>>
>>>
>>> Kenn
>>>
>>>
>>> On Fri, Feb 16, 2018 at 8:51 AM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> finish bundle is well defined and must be called, right, not at the end
>>>> so you still miss teardown as a user. Bundles are defined by the runner and
>>>> you can have 100000 bundles per batch (even more for a stream ;)) so you
>>>> dont want to release your resources or handle you execution auditing in it,
>>>> you want it at the end so in tear down.
>>>>
>>>> So yes we must have teardown reliable somehow.
>>>>
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> | Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>> 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>
>>>>> +1 I think @FinishBundle is the right thing to look at here.
>>>>>
>>>>> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
>>>>> wrote:
>>>>>
>>>>>> Hi Romain
>>>>>>
>>>>>> Is it not @FinishBundle your solution ?
>>>>>>
>>>>>> Regards
>>>>>> JB
>>>>>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com>
>>>>>> a écrit:
>>>>>>>
>>>>>>> I see Reuven, so it is actually a broken contract for end users more
>>>>>>> than a bug. Concretely a user must have a way to execute code once the
>>>>>>> teardown is no more used and a teardown is populated by the user in the
>>>>>>> context of an execution.
>>>>>>> It means that if the environment wants to pool (cache) the instances
>>>>>>> it must provide a postBorrowFromCache and preReturnToCache to let the user
>>>>>>> handle that - we'll get back to EJB and passivation ;).
>>>>>>>
>>>>>>> Personally I think it is fine to cache the instances for the
>>>>>>> duration of an execution but not accross execution. Concretely if you check
>>>>>>> out the API it should just not be possible for a runner since the lifecycle
>>>>>>> is not covered and the fact teardown can not be called today is an
>>>>>>> implementation bug/leak surfacing the API.
>>>>>>>
>>>>>>> So I see 2 options:
>>>>>>>
>>>>>>> 1. make it mandatory and get rid of the caching - which shouldnt
>>>>>>> help much in current state in terms of perf
>>>>>>> 2. keep teardown a final release object (which is not that useful
>>>>>>> cause of the end of the sentence) and add a clean cache lifecycle
>>>>>>> management
>>>>>>>
>>>>>>> tempted to say 1 is saner short terms, in particular cause beam is
>>>>>>> 2.x and users already use it this way.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Romain Manni-Bucau
>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>
>>>>>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>>
>>>>>>>> So the concern is that @TearDown might not be called?
>>>>>>>>
>>>>>>>> Let's understand the reason for @TearDown. The runner is free to
>>>>>>>> cache the DoFn object across many invocations, and indeed in streaming this
>>>>>>>> is often a critical optimization. However if the runner does decide to
>>>>>>>> destroy the DoFn object (e.g. because it's being evicted from cache), often
>>>>>>>> users need a callback to tear down associated resources (file handles, RPC
>>>>>>>> connections, etc.).
>>>>>>>>
>>>>>>>> Now @TearDown isn't guaranteed to be called for a simple reason:
>>>>>>>> the runner might never tear down the DoFn object! The runner might well
>>>>>>>> decide to cache the object forever, in which case there is never .a time to
>>>>>>>> call @TearDown. There is no violation of semantics here.
>>>>>>>>
>>>>>>>> Also, the point about not calling teardown if the JVM crashes might
>>>>>>>> well sound implicit with no need to mention it. However empirically users
>>>>>>>> do misunderstand even this, so it's worth mentioning.
>>>>>>>>
>>>>>>>> Reuven
>>>>>>>>
>>>>>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi guys,
>>>>>>>>>
>>>>>>>>> I'm a bit concerned of this PR  https://github.com/apache/beam
>>>>>>>>> /pull/4637
>>>>>>>>>
>>>>>>>>> I understand the intent but I'd like to share how I see it and why
>>>>>>>>> it is an issue for me:
>>>>>>>>>
>>>>>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try
>>>>>>>>> to preallocate some memory for instance to free it in case of OOME and try
>>>>>>>>> to recover but it never prooved useful and got dropped recently. This is a
>>>>>>>>> good example you can't do anything if there is a cataclism and therefore
>>>>>>>>> any framework or lib will not be blamed for it
>>>>>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>>>>>> leads users to not use the API or the projet :(.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> These two points lead to say that if the JVM crashes it is ok to
>>>>>>>>> not call teardown and it is even implicit in any programming environment so
>>>>>>>>> no need to mention it. However that a runner doesn't call teardown is a bug
>>>>>>>>> and not a feature or something intended because it can have a huge impact
>>>>>>>>> on the user flow.
>>>>>>>>>
>>>>>>>>> The user workarounds are to use custom threads with timeouts to
>>>>>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>>>>>> buggy API, if you remove the contract guarantee.
>>>>>>>>>
>>>>>>>>> To make it obvious: substring(from, to): will substring the
>>>>>>>>> current string between from and to...or not. Would you use the function?
>>>>>>>>>
>>>>>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>>>>>> user and let runners document they don't respect - yet - the API when
>>>>>>>>> relevant.
>>>>>>>>>
>>>>>>>>> wdyt?
>>>>>>>>>
>>>>>>>>> Romain Manni-Bucau
>>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>
>>>
>>
>

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
Which runner's bundling are you concerned with? It sounds like the Flink
runner?

Kenn


On Fri, Feb 16, 2018 at 9:04 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
> 2018-02-16 17:59 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>
>> What I am hearing is this:
>>
>>  - @FinishBundle does what you want (a reliable "flush" call) but your
>> runner is not doing a good job of bundling
>>
>
> Nop, finishbundle is defined but not a bundle. Typically for 1 million
> rows I'll get 1 million calls in flink and 1 call in spark (today) so this
> is not a way to call a final task to release dofn internal instances or do
> some one time auditing.
>
>
>>  - @Teardown has well-defined semantics and they are not what you want
>>
>
> "
> Note that calls to the annotated method are best effort, and may not occur
> for arbitrary reasons"
>
> is not really "well-defined" and is also a breaking change compared to the
> < 2.3.x (x >= 1) .
>
>
>> So you are hoping for something that is called less frequently but is
>> still mandatory.
>>
>> Just trying to establish the basics to start over and get this on track
>> to solving the real problem.
>>
>
> Concretely I need a well defined lifecycle for any DoFn executed in beam
> and today there is no such a thing making it impossible to develop
> correctly transforms/fn on an user side.
>
>
>>
>> Kenn
>>
>>
>> On Fri, Feb 16, 2018 at 8:51 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>> finish bundle is well defined and must be called, right, not at the end
>>> so you still miss teardown as a user. Bundles are defined by the runner and
>>> you can have 100000 bundles per batch (even more for a stream ;)) so you
>>> dont want to release your resources or handle you execution auditing in it,
>>> you want it at the end so in tear down.
>>>
>>> So yes we must have teardown reliable somehow.
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:
>>>
>>>> +1 I think @FinishBundle is the right thing to look at here.
>>>>
>>>> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
>>>> wrote:
>>>>
>>>>> Hi Romain
>>>>>
>>>>> Is it not @FinishBundle your solution ?
>>>>>
>>>>> Regards
>>>>> JB
>>>>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com>
>>>>> a écrit:
>>>>>>
>>>>>> I see Reuven, so it is actually a broken contract for end users more
>>>>>> than a bug. Concretely a user must have a way to execute code once the
>>>>>> teardown is no more used and a teardown is populated by the user in the
>>>>>> context of an execution.
>>>>>> It means that if the environment wants to pool (cache) the instances
>>>>>> it must provide a postBorrowFromCache and preReturnToCache to let the user
>>>>>> handle that - we'll get back to EJB and passivation ;).
>>>>>>
>>>>>> Personally I think it is fine to cache the instances for the duration
>>>>>> of an execution but not accross execution. Concretely if you check out the
>>>>>> API it should just not be possible for a runner since the lifecycle is not
>>>>>> covered and the fact teardown can not be called today is an implementation
>>>>>> bug/leak surfacing the API.
>>>>>>
>>>>>> So I see 2 options:
>>>>>>
>>>>>> 1. make it mandatory and get rid of the caching - which shouldnt help
>>>>>> much in current state in terms of perf
>>>>>> 2. keep teardown a final release object (which is not that useful
>>>>>> cause of the end of the sentence) and add a clean cache lifecycle
>>>>>> management
>>>>>>
>>>>>> tempted to say 1 is saner short terms, in particular cause beam is
>>>>>> 2.x and users already use it this way.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>
>>>>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>
>>>>>>> So the concern is that @TearDown might not be called?
>>>>>>>
>>>>>>> Let's understand the reason for @TearDown. The runner is free to
>>>>>>> cache the DoFn object across many invocations, and indeed in streaming this
>>>>>>> is often a critical optimization. However if the runner does decide to
>>>>>>> destroy the DoFn object (e.g. because it's being evicted from cache), often
>>>>>>> users need a callback to tear down associated resources (file handles, RPC
>>>>>>> connections, etc.).
>>>>>>>
>>>>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>>>>> runner might never tear down the DoFn object! The runner might well decide
>>>>>>> to cache the object forever, in which case there is never .a time to call
>>>>>>> @TearDown. There is no violation of semantics here.
>>>>>>>
>>>>>>> Also, the point about not calling teardown if the JVM crashes might
>>>>>>> well sound implicit with no need to mention it. However empirically users
>>>>>>> do misunderstand even this, so it's worth mentioning.
>>>>>>>
>>>>>>> Reuven
>>>>>>>
>>>>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> I'm a bit concerned of this PR  https://github.com/apache/beam
>>>>>>>> /pull/4637
>>>>>>>>
>>>>>>>> I understand the intent but I'd like to share how I see it and why
>>>>>>>> it is an issue for me:
>>>>>>>>
>>>>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>>>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>>>>>> recover but it never prooved useful and got dropped recently. This is a
>>>>>>>> good example you can't do anything if there is a cataclism and therefore
>>>>>>>> any framework or lib will not be blamed for it
>>>>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>>>>> leads users to not use the API or the projet :(.
>>>>>>>>
>>>>>>>>
>>>>>>>> These two points lead to say that if the JVM crashes it is ok to
>>>>>>>> not call teardown and it is even implicit in any programming environment so
>>>>>>>> no need to mention it. However that a runner doesn't call teardown is a bug
>>>>>>>> and not a feature or something intended because it can have a huge impact
>>>>>>>> on the user flow.
>>>>>>>>
>>>>>>>> The user workarounds are to use custom threads with timeouts to
>>>>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>>>>> buggy API, if you remove the contract guarantee.
>>>>>>>>
>>>>>>>> To make it obvious: substring(from, to): will substring the current
>>>>>>>> string between from and to...or not. Would you use the function?
>>>>>>>>
>>>>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>>>>> user and let runners document they don't respect - yet - the API when
>>>>>>>> relevant.
>>>>>>>>
>>>>>>>> wdyt?
>>>>>>>>
>>>>>>>> Romain Manni-Bucau
>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>

Re: @TearDown guarantees

Posted by Thomas Groh <tg...@google.com>.
I'll note as well that you don't need a well-defined DoFn lifecycle method
- you just want less granular bundling, which is a different requirement.

Teardown has well-defined interactions with the rest of the DoFn methods,
and well-defined constraints on what the runner is permitted to do when it
calls Teardown. The fact that we leave the runner discretion over when it
can call teardown does not make this poorly defined; it means that users
should not depend on teardown being called for correct behavior, and *this
has always been true and will continue to be true*.
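Thomas's distinction can be sketched in plain Java (a standalone simulation,
not Beam SDK code - the method names mirror the DoFn annotations but the
classes here are assumptions of this sketch): correctness-critical flushing
lives in finishBundle, which runs once per bundle, while teardown only
releases reusable resources and may legitimately never run.

```java
// Standalone sketch: flush in finishBundle (per-bundle, reliable), release
// resources in teardown (best effort). Not the Beam SDK; names only mirror
// the @Setup/@ProcessElement/@FinishBundle/@Teardown annotations.
import java.util.ArrayList;
import java.util.List;

class BufferingFn {
    final List<String> buffer = new ArrayList<>();
    final List<String> flushed = new ArrayList<>(); // stands in for a sink
    boolean connectionOpen;

    void setup() { connectionOpen = true; }          // acquire reusable resource

    void processElement(String element) { buffer.add(element); }

    void finishBundle() {                            // guaranteed flush point
        flushed.addAll(buffer);
        buffer.clear();
    }

    void teardown() { connectionOpen = false; }      // may never be invoked
}

class CachingRunner {
    // A runner may process many bundles and then keep the instance cached
    // forever, never invoking teardown - yet no data is lost, because every
    // bundle was flushed by finishBundle.
    static BufferingFn run(List<List<String>> bundles) {
        BufferingFn fn = new BufferingFn();
        fn.setup();
        for (List<String> bundle : bundles) {
            for (String e : bundle) fn.processElement(e);
            fn.finishBundle();
        }
        return fn; // teardown intentionally not called: instance stays cached
    }
}
```

Even though the simulated runner never tears the instance down, every
element still reaches the sink - which is the behavior users should build
against.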


On Fri, Feb 16, 2018 at 9:04 AM Romain Manni-Bucau <rm...@gmail.com>
wrote:

>
> 2018-02-16 17:59 GMT+01:00 Kenneth Knowles <kl...@google.com>:
>
>> What I am hearing is this:
>>
>>  - @FinishBundle does what you want (a reliable "flush" call) but your
>> runner is not doing a good job of bundling
>>
>
> Nop, finishbundle is defined but not a bundle. Typically for 1 million
> rows I'll get 1 million calls in flink and 1 call in spark (today) so this
> is not a way to call a final task to release dofn internal instances or do
> some one time auditing.
>
>
>>  - @Teardown has well-defined semantics and they are not what you want
>>
>
> "
> Note that calls to the annotated method are best effort, and may not occur
> for arbitrary reasons"
>
> is not really "well-defined" and is also a breaking change compared to the
> < 2.3.x (x >= 1) .
>
>
>> So you are hoping for something that is called less frequently but is
>> still mandatory.
>>
>> Just trying to establish the basics to start over and get this on track
>> to solving the real problem.
>>
>
> Concretely I need a well defined lifecycle for any DoFn executed in beam
> and today there is no such a thing making it impossible to develop
> correctly transforms/fn on an user side.
>
>
>>
>> Kenn
>>
>>
>> On Fri, Feb 16, 2018 at 8:51 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>> finish bundle is well defined and must be called, right, not at the end
>>> so you still miss teardown as a user. Bundles are defined by the runner and
>>> you can have 100000 bundles per batch (even more for a stream ;)) so you
>>> dont want to release your resources or handle you execution auditing in it,
>>> you want it at the end so in tear down.
>>>
>>> So yes we must have teardown reliable somehow.
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:
>>>
>>>> +1 I think @FinishBundle is the right thing to look at here.
>>>>
>>>> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
>>>> wrote:
>>>>
>>>>> Hi Romain
>>>>>
>>>>> Is it not @FinishBundle your solution ?
>>>>>
>>>>> Regards
>>>>> JB
>>>>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com>
>>>>> a écrit:
>>>>>>
>>>>>> I see Reuven, so it is actually a broken contract for end users more
>>>>>> than a bug. Concretely a user must have a way to execute code once the
>>>>>> teardown is no more used and a teardown is populated by the user in the
>>>>>> context of an execution.
>>>>>> It means that if the environment wants to pool (cache) the instances
>>>>>> it must provide a postBorrowFromCache and preReturnToCache to let the user
>>>>>> handle that - we'll get back to EJB and passivation ;).
>>>>>>
>>>>>> Personally I think it is fine to cache the instances for the duration
>>>>>> of an execution but not accross execution. Concretely if you check out the
>>>>>> API it should just not be possible for a runner since the lifecycle is not
>>>>>> covered and the fact teardown can not be called today is an implementation
>>>>>> bug/leak surfacing the API.
>>>>>>
>>>>>> So I see 2 options:
>>>>>>
>>>>>> 1. make it mandatory and get rid of the caching - which shouldnt help
>>>>>> much in current state in terms of perf
>>>>>> 2. keep teardown a final release object (which is not that useful
>>>>>> cause of the end of the sentence) and add a clean cache lifecycle
>>>>>> management
>>>>>>
>>>>>> tempted to say 1 is saner short terms, in particular cause beam is
>>>>>> 2.x and users already use it this way.
>>>>>>
>>>>>>
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>
>>>>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>>
>>>>>>> So the concern is that @TearDown might not be called?
>>>>>>>
>>>>>>> Let's understand the reason for @TearDown. The runner is free to
>>>>>>> cache the DoFn object across many invocations, and indeed in streaming this
>>>>>>> is often a critical optimization. However if the runner does decide to
>>>>>>> destroy the DoFn object (e.g. because it's being evicted from cache), often
>>>>>>> users need a callback to tear down associated resources (file handles, RPC
>>>>>>> connections, etc.).
>>>>>>>
>>>>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>>>>> runner might never tear down the DoFn object! The runner might well decide
>>>>>>> to cache the object forever, in which case there is never .a time to call
>>>>>>> @TearDown. There is no violation of semantics here.
>>>>>>>
>>>>>>> Also, the point about not calling teardown if the JVM crashes might
>>>>>>> well sound implicit with no need to mention it. However empirically users
>>>>>>> do misunderstand even this, so it's worth mentioning.
>>>>>>>
>>>>>>> Reuven
>>>>>>>
>>>>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi guys,
>>>>>>>>
>>>>>>>> I'm a bit concerned of this PR
>>>>>>>> https://github.com/apache/beam/pull/4637
>>>>>>>>
>>>>>>>> I understand the intent but I'd like to share how I see it and why
>>>>>>>> it is an issue for me:
>>>>>>>>
>>>>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>>>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>>>>>> recover but it never prooved useful and got dropped recently. This is a
>>>>>>>> good example you can't do anything if there is a cataclism and therefore
>>>>>>>> any framework or lib will not be blamed for it
>>>>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>>>>> leads users to not use the API or the projet :(.
>>>>>>>>
>>>>>>>>
>>>>>>>> These two points lead to say that if the JVM crashes it is ok to
>>>>>>>> not call teardown and it is even implicit in any programming environment so
>>>>>>>> no need to mention it. However that a runner doesn't call teardown is a bug
>>>>>>>> and not a feature or something intended because it can have a huge impact
>>>>>>>> on the user flow.
>>>>>>>>
>>>>>>>> The user workarounds are to use custom threads with timeouts to
>>>>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>>>>> buggy API, if you remove the contract guarantee.
>>>>>>>>
>>>>>>>> To make it obvious: substring(from, to): will substring the current
>>>>>>>> string between from and to...or not. Would you use the function?
>>>>>>>>
>>>>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>>>>> user and let runners document they don't respect - yet - the API when
>>>>>>>> relevant.
>>>>>>>>
>>>>>>>> wdyt?
>>>>>>>>
>>>>>>>> Romain Manni-Bucau
>>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>
>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
2018-02-16 17:59 GMT+01:00 Kenneth Knowles <kl...@google.com>:

> What I am hearing is this:
>
>  - @FinishBundle does what you want (a reliable "flush" call) but your
> runner is not doing a good job of bundling
>

Nope, finishbundle is well defined, but a bundle itself is not. Typically
for 1 million rows I'll get 1 million calls in Flink and 1 call in Spark
(today), so it is not a way to run a final task that releases DoFn-internal
resources or does some one-time auditing.


>  - @Teardown has well-defined semantics and they are not what you want
>

"
Note that calls to the annotated method are best effort, and may not occur
for arbitrary reasons"

is not really "well-defined", and it is also a breaking change compared to
< 2.3.x (x >= 1).


> So you are hoping for something that is called less frequently but is
> still mandatory.
>
> Just trying to establish the basics to start over and get this on track to
> solving the real problem.
>

Concretely, I need a well-defined lifecycle for any DoFn executed in Beam,
and today there is no such thing, which makes it impossible to develop
transforms/fns correctly on the user side.
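The bundling point above can be made concrete with a small standalone
simulation (plain Java, not runner code - the per-element vs. single-bundle
split is an assumption standing in for the Flink/Spark behavior described):
bundle size is entirely runner-chosen, so the number of finishBundle calls
for the same input varies wildly, which is why finishBundle cannot serve as
a one-time "end of execution" hook.

```java
// Standalone sketch: the same 1000-element input yields 1000 finishBundle
// calls under per-element bundling (the Flink-like case described above)
// but a single call when the runner puts everything in one bundle (the
// Spark-like case). Not Beam code; a simulation of the bundling contract.
class CountingFn {
    int finishBundleCalls = 0;
    void processElement(int element) { /* per-element work */ }
    void finishBundle() { finishBundleCalls++; }
}

class BundlingDemo {
    static int finishBundleCallsFor(int totalElements, int elementsPerBundle) {
        CountingFn fn = new CountingFn();
        int inBundle = 0;
        for (int i = 0; i < totalElements; i++) {
            fn.processElement(i);
            if (++inBundle == elementsPerBundle) {
                fn.finishBundle(); // runner closes the bundle
                inBundle = 0;
            }
        }
        if (inBundle > 0) fn.finishBundle(); // close a trailing partial bundle
        return fn.finishBundleCalls;
    }
}
```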


>
> Kenn
>
>
> On Fri, Feb 16, 2018 at 8:51 AM, Romain Manni-Bucau <rmannibucau@gmail.com
> > wrote:
>
>> finish bundle is well defined and must be called, right, not at the end
>> so you still miss teardown as a user. Bundles are defined by the runner and
>> you can have 100000 bundles per batch (even more for a stream ;)) so you
>> dont want to release your resources or handle you execution auditing in it,
>> you want it at the end so in tear down.
>>
>> So yes we must have teardown reliable somehow.
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> | Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:
>>
>>> +1 I think @FinishBundle is the right thing to look at here.
>>>
>>> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
>>> wrote:
>>>
>>>> Hi Romain
>>>>
>>>> Is it not @FinishBundle your solution ?
>>>>
>>>> Regards
>>>> JB
>>>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com>
>>>> a écrit:
>>>>>
>>>>> I see Reuven, so it is actually a broken contract for end users more
>>>>> than a bug. Concretely a user must have a way to execute code once the
>>>>> teardown is no more used and a teardown is populated by the user in the
>>>>> context of an execution.
>>>>> It means that if the environment wants to pool (cache) the instances
>>>>> it must provide a postBorrowFromCache and preReturnToCache to let the user
>>>>> handle that - we'll get back to EJB and passivation ;).
>>>>>
>>>>> Personally I think it is fine to cache the instances for the duration
>>>>> of an execution but not accross execution. Concretely if you check out the
>>>>> API it should just not be possible for a runner since the lifecycle is not
>>>>> covered and the fact teardown can not be called today is an implementation
>>>>> bug/leak surfacing the API.
>>>>>
>>>>> So I see 2 options:
>>>>>
>>>>> 1. make it mandatory and get rid of the caching - which shouldnt help
>>>>> much in current state in terms of perf
>>>>> 2. keep teardown a final release object (which is not that useful
>>>>> cause of the end of the sentence) and add a clean cache lifecycle
>>>>> management
>>>>>
>>>>> tempted to say 1 is saner short terms, in particular cause beam is 2.x
>>>>> and users already use it this way.
>>>>>
>>>>>
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>>
>>>>>> So the concern is that @TearDown might not be called?
>>>>>>
>>>>>> Let's understand the reason for @TearDown. The runner is free to
>>>>>> cache the DoFn object across many invocations, and indeed in streaming this
>>>>>> is often a critical optimization. However if the runner does decide to
>>>>>> destroy the DoFn object (e.g. because it's being evicted from cache), often
>>>>>> users need a callback to tear down associated resources (file handles, RPC
>>>>>> connections, etc.).
>>>>>>
>>>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>>>> runner might never tear down the DoFn object! The runner might well decide
>>>>>> to cache the object forever, in which case there is never .a time to call
>>>>>> @TearDown. There is no violation of semantics here.
>>>>>>
>>>>>> Also, the point about not calling teardown if the JVM crashes might
>>>>>> well sound implicit with no need to mention it. However empirically users
>>>>>> do misunderstand even this, so it's worth mentioning.
>>>>>>
>>>>>> Reuven
>>>>>>
>>>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>>>> rmannibucau@gmail.com> wrote:
>>>>>>
>>>>>>> Hi guys,
>>>>>>>
>>>>>>> I'm a bit concerned of this PR  https://github.com/apache/beam
>>>>>>> /pull/4637
>>>>>>>
>>>>>>> I understand the intent but I'd like to share how I see it and why
>>>>>>> it is an issue for me:
>>>>>>>
>>>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>>>>> recover but it never prooved useful and got dropped recently. This is a
>>>>>>> good example you can't do anything if there is a cataclism and therefore
>>>>>>> any framework or lib will not be blamed for it
>>>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>>>> leads users to not use the API or the projet :(.
>>>>>>>
>>>>>>>
>>>>>>> These two points lead to say that if the JVM crashes it is ok to not
>>>>>>> call teardown and it is even implicit in any programming environment so no
>>>>>>> need to mention it. However that a runner doesn't call teardown is a bug
>>>>>>> and not a feature or something intended because it can have a huge impact
>>>>>>> on the user flow.
>>>>>>>
>>>>>>> The user workarounds are to use custom threads with timeouts to
>>>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>>>> buggy API, if you remove the contract guarantee.
>>>>>>>
>>>>>>> To make it obvious: substring(from, to): will substring the current
>>>>>>> string between from and to...or not. Would you use the function?
>>>>>>>
>>>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>>>> user and let runners document they don't respect - yet - the API when
>>>>>>> relevant.
>>>>>>>
>>>>>>> wdyt?
>>>>>>>
>>>>>>> Romain Manni-Bucau
>>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>>
>>>>>>
>>>>>>
>>>>>
>>
>

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
What I am hearing is this:

 - @FinishBundle does what you want (a reliable "flush" call) but your
runner is not doing a good job of bundling
 - @Teardown has well-defined semantics and they are not what you want

So you are hoping for something that is called less frequently but is still
mandatory.

Just trying to establish the basics to start over and get this on track to
solving the real problem.

Kenn


On Fri, Feb 16, 2018 at 8:51 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

> finish bundle is well defined and must be called, right, not at the end so
> you still miss teardown as a user. Bundles are defined by the runner and
> you can have 100000 bundles per batch (even more for a stream ;)) so you
> dont want to release your resources or handle you execution auditing in it,
> you want it at the end so in tear down.
>
> So yes we must have teardown reliable somehow.
>
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> 2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:
>
>> +1 I think @FinishBundle is the right thing to look at here.
>>
>> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
>> wrote:
>>
>>> Hi Romain
>>>
>>> Is it not @FinishBundle your solution ?
>>>
>>> Regards
>>> JB
>>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com> a
>>> écrit:
>>>>
>>>> I see Reuven, so it is actually a broken contract for end users more
>>>> than a bug. Concretely a user must have a way to execute code once the
>>>> teardown is no more used and a teardown is populated by the user in the
>>>> context of an execution.
>>>> It means that if the environment wants to pool (cache) the instances it
>>>> must provide a postBorrowFromCache and preReturnToCache to let the user
>>>> handle that - we'll get back to EJB and passivation ;).
>>>>
>>>> Personally I think it is fine to cache the instances for the duration
>>>> of an execution but not accross execution. Concretely if you check out the
>>>> API it should just not be possible for a runner since the lifecycle is not
>>>> covered and the fact teardown can not be called today is an implementation
>>>> bug/leak surfacing the API.
>>>>
>>>> So I see 2 options:
>>>>
>>>> 1. make it mandatory and get rid of the caching - which shouldnt help
>>>> much in current state in terms of perf
>>>> 2. keep teardown a final release object (which is not that useful cause
>>>> of the end of the sentence) and add a clean cache lifecycle management
>>>>
>>>> tempted to say 1 is saner short terms, in particular cause beam is 2.x
>>>> and users already use it this way.
>>>>
>>>>
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> |  Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>>
>>>>> So the concern is that @TearDown might not be called?
>>>>>
>>>>> Let's understand the reason for @TearDown. The runner is free to cache
>>>>> the DoFn object across many invocations, and indeed in streaming this is
>>>>> often a critical optimization. However if the runner does decide to destroy
>>>>> the DoFn object (e.g. because it's being evicted from cache), often users
>>>>> need a callback to tear down associated resources (file handles, RPC
>>>>> connections, etc.).
>>>>>
>>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>>> runner might never tear down the DoFn object! The runner might well decide
>>>>> to cache the object forever, in which case there is never .a time to call
>>>>> @TearDown. There is no violation of semantics here.
>>>>>
>>>>> Also, the point about not calling teardown if the JVM crashes might
>>>>> well sound implicit with no need to mention it. However empirically users
>>>>> do misunderstand even this, so it's worth mentioning.
>>>>>
>>>>> Reuven
>>>>>
>>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>>> rmannibucau@gmail.com> wrote:
>>>>>
>>>>>> Hi guys,
>>>>>>
>>>>>> I'm a bit concerned of this PR  https://github.com/apache/beam
>>>>>> /pull/4637
>>>>>>
>>>>>> I understand the intent but I'd like to share how I see it and why it
>>>>>> is an issue for me:
>>>>>>
>>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>>>> recover but it never prooved useful and got dropped recently. This is a
>>>>>> good example you can't do anything if there is a cataclism and therefore
>>>>>> any framework or lib will not be blamed for it
>>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>>> leads users to not use the API or the projet :(.
>>>>>>
>>>>>>
>>>>>> These two points lead to say that if the JVM crashes it is ok to not
>>>>>> call teardown and it is even implicit in any programming environment so no
>>>>>> need to mention it. However that a runner doesn't call teardown is a bug
>>>>>> and not a feature or something intended because it can have a huge impact
>>>>>> on the user flow.
>>>>>>
>>>>>> The user workarounds are to use custom threads with timeouts to
>>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>>> buggy API, if you remove the contract guarantee.
>>>>>>
>>>>>> To make it obvious: substring(from, to): will substring the current
>>>>>> string between from and to...or not. Would you use the function?
>>>>>>
>>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>>> user and let runners document they don't respect - yet - the API when
>>>>>> relevant.
>>>>>>
>>>>>> wdyt?
>>>>>>
>>>>>> Romain Manni-Bucau
>>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>>
>>>>>
>>>>>
>>>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
Finish bundle is well defined and must be called, right, but not at the
end, so as a user you still miss teardown. Bundles are defined by the
runner and you can have 100000 bundles per batch (even more for a stream
;)), so you don't want to release your resources or handle your execution
auditing in them; you want that at the end, so in teardown.

So yes, we must have teardown reliable somehow.
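The runner-side behavior being debated can be sketched as a standalone
simulation (plain Java, not Beam internals - the bounded cache is an
assumption modeling the caching Reuven describes later in the thread):
teardown fires only when the runner evicts a DoFn instance from its cache,
so an instance that stays cached never sees teardown at all.

```java
// Standalone sketch of runner-side DoFn caching: teardown is invoked only
// on eviction from a bounded cache. An instance the runner keeps cached is
// never torn down - the exact situation the thread is arguing about.
import java.util.ArrayDeque;
import java.util.Deque;

class TrackedFn {
    boolean tornDown = false;
    void teardown() { tornDown = true; }
}

class FnCache {
    private final int capacity;
    private final Deque<TrackedFn> cached = new ArrayDeque<>();

    FnCache(int capacity) { this.capacity = capacity; }

    TrackedFn acquire() {
        TrackedFn fn = new TrackedFn();
        cached.addLast(fn);
        if (cached.size() > capacity) {
            cached.removeFirst().teardown(); // fires only on eviction
        }
        return fn;
    }
}
```

With a capacity of 1, acquiring a second instance evicts (and tears down)
the first, while the second - still cached - never has teardown called.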


Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>

2018-02-16 17:43 GMT+01:00 Reuven Lax <re...@google.com>:

> +1 I think @FinishBundle is the right thing to look at here.
>
> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
>> Hi Romain
>>
>> Is it not @FinishBundle your solution ?
>>
>> Regards
>> JB
>> Le 16 févr. 2018, à 17:06, Romain Manni-Bucau <rm...@gmail.com> a
>> écrit:
>>>
>>> I see Reuven, so it is actually a broken contract for end users more
>>> than a bug. Concretely a user must have a way to execute code once the
>>> teardown is no more used and a teardown is populated by the user in the
>>> context of an execution.
>>> It means that if the environment wants to pool (cache) the instances it
>>> must provide a postBorrowFromCache and preReturnToCache to let the user
>>> handle that - we'll get back to EJB and passivation ;).
>>>
>>> Personally I think it is fine to cache the instances for the duration of
>>> an execution but not accross execution. Concretely if you check out the API
>>> it should just not be possible for a runner since the lifecycle is not
>>> covered and the fact teardown can not be called today is an implementation
>>> bug/leak surfacing the API.
>>>
>>> So I see 2 options:
>>>
>>> 1. make it mandatory and get rid of the caching - which shouldnt help
>>> much in current state in terms of perf
>>> 2. keep teardown a final release object (which is not that useful cause
>>> of the end of the sentence) and add a clean cache lifecycle management
>>>
>>> tempted to say 1 is saner short terms, in particular cause beam is 2.x
>>> and users already use it this way.
>>>
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> |  Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>
>>>> So the concern is that @TearDown might not be called?
>>>>
>>>> Let's understand the reason for @TearDown. The runner is free to cache
>>>> the DoFn object across many invocations, and indeed in streaming this is
>>>> often a critical optimization. However if the runner does decide to destroy
>>>> the DoFn object (e.g. because it's being evicted from cache), often users
>>>> need a callback to tear down associated resources (file handles, RPC
>>>> connections, etc.).
>>>>
>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>> runner might never tear down the DoFn object! The runner might well decide
>>>> to cache the object forever, in which case there is never .a time to call
>>>> @TearDown. There is no violation of semantics here.
>>>>
>>>> Also, the point about not calling teardown if the JVM crashes might
>>>> well sound implicit with no need to mention it. However empirically users
>>>> do misunderstand even this, so it's worth mentioning.
>>>>
>>>> Reuven
>>>>
>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I'm a bit concerned about this PR:
>>>>> https://github.com/apache/beam/pull/4637
>>>>>
>>>>> I understand the intent but I'd like to share how I see it and why it
>>>>> is an issue for me:
>>>>>
>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>>> recover but it never proved useful and got dropped recently. This is a
>>>>> good example you can't do anything if there is a cataclysm and therefore
>>>>> any framework or lib will not be blamed for it
>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>> leads users to not use the API or the project :(.
>>>>>
>>>>>
>>>>> These two points lead to say that if the JVM crashes it is ok to not
>>>>> call teardown and it is even implicit in any programming environment so no
>>>>> need to mention it. However that a runner doesn't call teardown is a bug
>>>>> and not a feature or something intended because it can have a huge impact
>>>>> on the user flow.
>>>>>
>>>>> The user workarounds are to use custom threads with timeouts to
>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>> buggy API, if you remove the contract guarantee.
>>>>>
>>>>> To make it obvious: substring(from, to): will substring the current
>>>>> string between from and to...or not. Would you use the function?
>>>>>
>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>> user and let runners document they don't respect - yet - the API when
>>>>> relevant.
>>>>>
>>>>> wdyt?
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>
>>>>
>>>

Re: @TearDown guarantees

Posted by Thomas Groh <tg...@google.com>.
Given that I'm the original author of both the @Setup and @Teardown methods
and the PR under discussion, I thought I'd drop in to give a bit of
history and my thoughts on the issue.

Originally (Dataflow 1.x), the spec required a Runner to deserialize a new
instance of a DoFn for every Bundle. For runners with small bundles (such
as the Dataflow Runner in streaming mode), this can be a significant cost,
and enabling DoFn reuse enables significant performance increases, as
Reuven mentioned.

However, for IOs and other transforms which have to perform expensive
execution-time initialization, DoFn reuse isn't enough by itself to enable
effective resource reuse - instead, we introduced the @Setup and @Teardown
methods, which exist to manage long-lived resources for a DoFn, such as
external connections. However, these methods are bound to the lifetime of
the DoFn, not to the lifetime of any specific elements.

As a result, there are two families of methods on a DoFn: element-related
and instance-related methods. @StartBundle, @ProcessElement,
and @FinishBundle are all the former - they are guaranteed to be called
exactly once per input element, as observed by the Pipeline (exactly one
_logical_ invocation, one or more physical invocations). @Setup
and @Teardown are unrelated to this lifecycle, but because we need to be
able to successfully call @Setup before performing any processing method,
we can trivially guarantee that Setup will have been called before
processing any elements.

For any behavior that is required for the correct functioning of a
pipeline, such as flushing buffered side effects (writes to an external
system, etc), use of @Teardown is *never appropriate* - because it is
unrelated to the processing of elements, if that buffer is lost for any
reason, the elements that produced it will never be reprocessed by the
Pipeline (it is permanently lost, which is not the case for anything
performed in @FinishBundle).
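The distinction above can be sketched with a small plain-Java model. This is not the Beam API - the class and method names below are illustrative stand-ins for the annotated DoFn methods - but it shows why output is safe when flushing happens per bundle, regardless of whether teardown ever runs:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified model of a DoFn that buffers writes and flushes them once per
// bundle. Method names mirror the Beam annotations (@StartBundle,
// @ProcessElement, @FinishBundle, @Teardown) but this is a sketch only.
public class BundleLifecycleDemo {

    static class BufferingFn {
        final List<String> externalSink;   // stands in for an external system
        private List<String> buffer;

        BufferingFn(List<String> externalSink) { this.externalSink = externalSink; }

        void startBundle() { buffer = new ArrayList<>(); }

        void processElement(String element) { buffer.add(element); }

        // Guaranteed once per bundle: the right place to flush side effects.
        void finishBundle() { externalSink.addAll(buffer); buffer = null; }

        // Best effort only: release long-lived resources here, never flush data.
        void teardown() { }
    }

    // A toy "runner": processes every bundle, then may or may not call
    // teardown before discarding the instance. The output is identical
    // either way, because each bundle was flushed in finishBundle.
    public static List<String> run(List<List<String>> bundles, boolean callTeardown) {
        List<String> sink = new ArrayList<>();
        BufferingFn fn = new BufferingFn(sink);
        for (List<String> bundle : bundles) {
            fn.startBundle();
            for (String element : bundle) fn.processElement(element);
            fn.finishBundle();
        }
        if (callTeardown) fn.teardown();   // teardown runs [0, 1] times
        return sink;
    }

    public static void main(String[] args) {
        List<List<String>> bundles = List.of(List.of("a", "b"), List.of("c"));
        System.out.println(run(bundles, true).equals(run(bundles, false)));  // true
    }
}
```

The sink contents do not depend on `callTeardown`, which is exactly the property that makes skipping teardown safe for correctness (though not for resource release).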

For long-lived resources, those will be bound to the life of the DoFn. In
general the runner is required to tear down an instance if it is going to
continue execution but discard that instance of the DoFn. However, that's
not the only way a runner can discard a DoFn - for example, it can
reclaim the container the DoFn executes on, or kill the JVM and restart it,
or just never discard it and use it forever, and still be well-behaved.

The additional documentation exists to make it obvious that performing
anything based on elements within Teardown is extremely unsafe and is a
good way to produce inconsistent results or lose data.
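The failure mode that warning describes can be shown with a minimal counter-example (again illustrative names, not the Beam API): a DoFn-like object that flushes its buffer in teardown silently loses data whenever the runner discards it without a teardown call, which the contract permits.

```java
import java.util.ArrayList;
import java.util.List;

// Anti-pattern sketch: flushing element-derived state in teardown instead
// of finishBundle. Because teardown runs [0, 1] times, skipping it (e.g.
// the worker is reclaimed) silently drops the buffered elements.
public class TeardownAntiPatternDemo {

    static class FlushInTeardownFn {
        final List<String> externalSink = new ArrayList<>();
        private final List<String> buffer = new ArrayList<>();

        void processElement(String element) { buffer.add(element); }

        // WRONG: element-derived output must be flushed in finishBundle instead.
        void teardown() { externalSink.addAll(buffer); buffer.clear(); }
    }

    public static List<String> run(List<String> elements, boolean teardownCalled) {
        FlushInTeardownFn fn = new FlushInTeardownFn();
        for (String element : elements) fn.processElement(element);
        if (teardownCalled) fn.teardown();   // the runner may legally skip this
        return fn.externalSink;
    }

    public static void main(String[] args) {
        System.out.println(run(List.of("a", "b"), true));   // [a, b]
        System.out.println(run(List.of("a", "b"), false));  // [] - data lost
    }
}
```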

I'll note as well that this is behavior that has always been the case -
StartBundle, ProcessElement, and FinishBundle will always be called
logically once per element, but Teardown was always called [0, 1] times per
DoFn instance, and this is an update only to the documentation and not to the
actual behavior of any runner.



On Fri, Feb 16, 2018 at 8:44 AM Reuven Lax <re...@google.com> wrote:

> +1 I think @FinishBundle is the right thing to look at here.
>
> On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net>
> wrote:
>
>> Hi Romain
>>
>> Is it not @FinishBundle your solution ?
>>
>> Regards
>> JB
>> On 16 Feb 2018 at 17:06, Romain Manni-Bucau <rm...@gmail.com> wrote:
>>>
>>> I see Reuven, so it is actually a broken contract for end users more
>>> than a bug. Concretely a user must have a way to execute code once the
>>> teardown is no more used and a teardown is populated by the user in the
>>> context of an execution.
>>> It means that if the environment wants to pool (cache) the instances it
>>> must provide a postBorrowFromCache and preReturnToCache to let the user
>>> handle that - we'll get back to EJB and passivation ;).
>>>
>>> Personally I think it is fine to cache the instances for the duration of
>>> an execution but not across executions. Concretely if you check out the API
>>> it should just not be possible for a runner since the lifecycle is not
>>> covered and the fact teardown can not be called today is an implementation
>>> bug/leak surfacing the API.
>>>
>>> So I see 2 options:
>>>
>>> 1. make it mandatory and get rid of the caching - which shouldn't help
>>> much in current state in terms of perf
>>> 2. keep teardown a final release object (which is not that useful cause
>>> of the end of the sentence) and add a clean cache lifecycle management
>>>
>>> tempted to say 1 is saner short terms, in particular cause beam is 2.x
>>> and users already use it this way.
>>>
>>>
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> |  Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>>
>>>> So the concern is that @TearDown might not be called?
>>>>
>>>> Let's understand the reason for @TearDown. The runner is free to cache
>>>> the DoFn object across many invocations, and indeed in streaming this is
>>>> often a critical optimization. However if the runner does decide to destroy
>>>> the DoFn object (e.g. because it's being evicted from cache), often users
>>>> need a callback to tear down associated resources (file handles, RPC
>>>> connections, etc.).
>>>>
>>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>>> runner might never tear down the DoFn object! The runner might well decide
>>>> to cache the object forever, in which case there is never a time to call
>>>> @TearDown. There is no violation of semantics here.
>>>>
>>>> Also, the point about not calling teardown if the JVM crashes might
>>>> well sound implicit with no need to mention it. However empirically users
>>>> do misunderstand even this, so it's worth mentioning.
>>>>
>>>> Reuven
>>>>
>>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>>> rmannibucau@gmail.com> wrote:
>>>>
>>>>> Hi guys,
>>>>>
>>>>> I'm a bit concerned about this PR
>>>>> https://github.com/apache/beam/pull/4637
>>>>>
>>>>> I understand the intent but I'd like to share how I see it and why it
>>>>> is an issue for me:
>>>>>
>>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>>> recover but it never proved useful and got dropped recently. This is a
>>>>> good example you can't do anything if there is a cataclysm and therefore
>>>>> any framework or lib will not be blamed for it
>>>>> 2. if you expose an API, its behavior must be well defined. In the
>>>>> case of a portable library like Beam it is even more important otherwise it
>>>>> leads users to not use the API or the project :(.
>>>>>
>>>>>
>>>>> These two points lead to say that if the JVM crashes it is ok to not
>>>>> call teardown and it is even implicit in any programming environment so no
>>>>> need to mention it. However that a runner doesn't call teardown is a bug
>>>>> and not a feature or something intended because it can have a huge impact
>>>>> on the user flow.
>>>>>
>>>>> The user workarounds are to use custom threads with timeouts to
>>>>> execute the actions or things like that, all bad solutions to replace a
>>>>> buggy API, if you remove the contract guarantee.
>>>>>
>>>>> To make it obvious: substring(from, to): will substring the current
>>>>> string between from and to...or not. Would you use the function?
>>>>>
>>>>> What I ask is to add in the javadoc that the contract enforces the
>>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>>> user and let runners document they don't respect - yet - the API when
>>>>> relevant.
>>>>>
>>>>> wdyt?
>>>>>
>>>>> Romain Manni-Bucau
>>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>>> <http://rmannibucau.wordpress.com> |  Github
>>>>> <https://github.com/rmannibucau> | LinkedIn
>>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>>
>>>>
>>>>
>>>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
+1 I think @FinishBundle is the right thing to look at here.

On Fri, Feb 16, 2018, 8:41 AM Jean-Baptiste Onofré <jb...@nanthrax.net> wrote:

> Hi Romain
>
> Is it not @FinishBundle your solution ?
>
> Regards
> JB
> On 16 Feb 2018 at 17:06, Romain Manni-Bucau <rm...@gmail.com> wrote:
>>
>> I see Reuven, so it is actually a broken contract for end users more than
>> a bug. Concretely a user must have a way to execute code once the teardown
>> is no more used and a teardown is populated by the user in the context of
>> an execution.
>> It means that if the environment wants to pool (cache) the instances it
>> must provide a postBorrowFromCache and preReturnToCache to let the user
>> handle that - we'll get back to EJB and passivation ;).
>>
>> Personally I think it is fine to cache the instances for the duration of
>> an execution but not across executions. Concretely if you check out the API
>> it should just not be possible for a runner since the lifecycle is not
>> covered and the fact teardown can not be called today is an implementation
>> bug/leak surfacing the API.
>>
>> So I see 2 options:
>>
>> 1. make it mandatory and get rid of the caching - which shouldn't help
>> much in current state in terms of perf
>> 2. keep teardown a final release object (which is not that useful cause
>> of the end of the sentence) and add a clean cache lifecycle management
>>
>> tempted to say 1 is saner short terms, in particular cause beam is 2.x
>> and users already use it this way.
>>
>>
>>
>> Romain Manni-Bucau
>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>> <https://rmannibucau.metawerx.net/> | Old Blog
>> <http://rmannibucau.wordpress.com> |  Github
>> <https://github.com/rmannibucau> | LinkedIn
>> <https://www.linkedin.com/in/rmannibucau> | Book
>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>
>> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>>
>>> So the concern is that @TearDown might not be called?
>>>
>>> Let's understand the reason for @TearDown. The runner is free to cache
>>> the DoFn object across many invocations, and indeed in streaming this is
>>> often a critical optimization. However if the runner does decide to destroy
>>> the DoFn object (e.g. because it's being evicted from cache), often users
>>> need a callback to tear down associated resources (file handles, RPC
>>> connections, etc.).
>>>
>>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>>> runner might never tear down the DoFn object! The runner might well decide
>>> to cache the object forever, in which case there is never a time to call
>>> @TearDown. There is no violation of semantics here.
>>>
>>> Also, the point about not calling teardown if the JVM crashes might well
>>> sound implicit with no need to mention it. However empirically users do
>>> misunderstand even this, so it's worth mentioning.
>>>
>>> Reuven
>>>
>>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>>> rmannibucau@gmail.com> wrote:
>>>
>>>> Hi guys,
>>>>
>>>> I'm a bit concerned about this PR
>>>> https://github.com/apache/beam/pull/4637
>>>>
>>>> I understand the intent but I'd like to share how I see it and why it
>>>> is an issue for me:
>>>>
>>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>>> preallocate some memory for instance to free it in case of OOME and try to
>>>> recover but it never proved useful and got dropped recently. This is a
>>>> good example you can't do anything if there is a cataclysm and therefore
>>>> any framework or lib will not be blamed for it
>>>> 2. if you expose an API, its behavior must be well defined. In the case
>>>> of a portable library like Beam it is even more important otherwise it
>>>> leads users to not use the API or the project :(.
>>>>
>>>>
>>>> These two points lead to say that if the JVM crashes it is ok to not
>>>> call teardown and it is even implicit in any programming environment so no
>>>> need to mention it. However that a runner doesn't call teardown is a bug
>>>> and not a feature or something intended because it can have a huge impact
>>>> on the user flow.
>>>>
>>>> The user workarounds are to use custom threads with timeouts to execute
>>>> the actions or things like that, all bad solutions to replace a buggy API,
>>>> if you remove the contract guarantee.
>>>>
>>>> To make it obvious: substring(from, to): will substring the current
>>>> string between from and to...or not. Would you use the function?
>>>>
>>>> What I ask is to add in the javadoc that the contract enforces the
>>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>>> the runner to do so. This way the not portable behavior is where it belongs
>>>> to, in the vendor specific code. It leads to a reliable API for the end
>>>> user and let runners document they don't respect - yet - the API when
>>>> relevant.
>>>>
>>>> wdyt?
>>>>
>>>> Romain Manni-Bucau
>>>> @rmannibucau <https://twitter.com/rmannibucau> |   Blog
>>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>>> <http://rmannibucau.wordpress.com> |  Github
>>>> <https://github.com/rmannibucau> | LinkedIn
>>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>>
>>>
>>>
>>

Re: @TearDown guarantees

Posted by Jean-Baptiste Onofré <jb...@nanthrax.net>.
Hi Romain

Is it not @FinishBundle your solution ?

Regards
JB

On 16 Feb 2018 at 17:06, Romain Manni-Bucau <rm...@gmail.com> wrote:
>I see Reuven, so it is actually a broken contract for end users more
>than a
>bug. Concretely a user must have a way to execute code once the
>teardown is
>no more used and a teardown is populated by the user in the context of
>an
>execution.
>It means that if the environment wants to pool (cache) the instances it
>must provide a postBorrowFromCache and preReturnToCache to let the user
>handle that - we'll get back to EJB and passivation ;).
>
>Personally I think it is fine to cache the instances for the duration
>of an
>execution but not across executions. Concretely if you check out the
>API it
>should just not be possible for a runner since the lifecycle is not
>covered
>and the fact teardown can not be called today is an implementation
>bug/leak
>surfacing the API.
>
>So I see 2 options:
>
>1. make it mandatory and get rid of the caching - which shouldn't help
>much
>in current state in terms of perf
>2. keep teardown a final release object (which is not that useful cause
>of
>the end of the sentence) and add a clean cache lifecycle management
>
>tempted to say 1 is saner short terms, in particular cause beam is 2.x
>and
>users already use it this way.
>
>
>
>Romain Manni-Bucau
>@rmannibucau <https://twitter.com/rmannibucau> |  Blog
><https://rmannibucau.metawerx.net/> | Old Blog
><http://rmannibucau.wordpress.com> | Github
><https://github.com/rmannibucau> |
>LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
><https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
>2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>
>> So the concern is that @TearDown might not be called?
>>
>> Let's understand the reason for @TearDown. The runner is free to
>cache the
>> DoFn object across many invocations, and indeed in streaming this is
>often
>> a critical optimization. However if the runner does decide to destroy
>the
>> DoFn object (e.g. because it's being evicted from cache), often users
>need
>> a callback to tear down associated resources (file handles, RPC
>> connections, etc.).
>>
>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>> runner might never tear down the DoFn object! The runner might well
>decide
>> to cache the object forever, in which case there is never a time to
>call
>> @TearDown. There is no violation of semantics here.
>>
>> Also, the point about not calling teardown if the JVM crashes might
>well
>> sound implicit with no need to mention it. However empirically users
>do
>> misunderstand even this, so it's worth mentioning.
>>
>> Reuven
>>
>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau
><rmannibucau@gmail.com
>> > wrote:
>>
>>> Hi guys,
>>>
>>> I'm a bit concerned about this PR
>https://github.com/apache/beam/pull/4637
>>>
>>> I understand the intent but I'd like to share how I see it and why
>it is
>>> an issue for me:
>>>
>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>> preallocate some memory for instance to free it in case of OOME and
>try to
>>> recover but it never proved useful and got dropped recently. This
>is a
>>> good example you can't do anything if there is a cataclysm and
>therefore
>>> any framework or lib will not be blamed for it
>>> 2. if you expose an API, its behavior must be well defined. In the
>case
>>> of a portable library like Beam it is even more important otherwise
>it
>>> leads users to not use the API or the project :(.
>>>
>>>
>>> These two points lead to say that if the JVM crashes it is ok to not
>call
>>> teardown and it is even implicit in any programming environment so
>no need
>>> to mention it. However that a runner doesn't call teardown is a bug
>and not
>>> a feature or something intended because it can have a huge impact on
>the
>>> user flow.
>>>
>>> The user workarounds are to use custom threads with timeouts to
>execute
>>> the actions or things like that, all bad solutions to replace a
>buggy API,
>>> if you remove the contract guarantee.
>>>
>>> To make it obvious: substring(from, to): will substring the current
>>> string between from and to...or not. Would you use the function?
>>>
>>> What I ask is to add in the javadoc that the contract enforces the
>runner
>>> to call that. Which means the core doesn't guarantee it but imposes
>the
>>> runner to do so. This way the not portable behavior is where it
>belongs to,
>>> in the vendor specific code. It leads to a reliable API for the end
>user
>>> and let runners document they don't respect - yet - the API when
>relevant.
>>>
>>> wdyt?
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>>
><https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>
>>

Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
On Fri, Feb 16, 2018 at 8:06 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

> I see Reuven, so it is actually a broken contract for end users more than
> a bug. Concretely a user must have a way to execute code once the teardown
> is no more used and a teardown is populated by the user in the context of
> an execution.
>

I don't understand what contract you think is broken. @FinishBundle has a
contract - it's guaranteed to be called even in the case of JVM crashes (in
that case the records will be reprocessed). @TearDown exists for resource
cleanup, so does not have this contract.

It means that if the environment wants to pool (cache) the instances it
> must provide a postBorrowFromCache and preReturnToCache to let the user
> handle that - we'll get back to EJB and passivation ;).
>

Let's not focus on caching. The runner might immediately reuse the DoFn (if
there is more data available).


>
> Personally I think it is fine to cache the instances for the duration of
> an execution but not across executions. Concretely if you check out the API
> it should just not be possible for a runner since the lifecycle is not
> covered and the fact teardown can not be called today is an implementation
> bug/leak surfacing the API.
>

Across execution of what? records/bundles?  Being able to reuse DoFns
across bundles is actually critical to performance of streaming, and there
is data to back that.


> So I see 2 options:
>
> 1. make it mandatory and get rid of the caching - which shouldn't help much
> in current state in terms of perf
>

Why do you think caching doesn't help? It is the number one optimization
for streaming pipelines. There are a number of extremely high-volume
streaming pipelines that would not be able to run affordably without
caching.
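Reuven's point about reuse being the key optimization can be illustrated with a toy model: if constructing an instance implies an expensive setup step (e.g. opening an RPC connection), a fresh instance per bundle pays that cost on every bundle, while a cached instance pays it once. The names here are illustrative, not the Beam API:

```java
import java.util.concurrent.atomic.AtomicInteger;

// Toy model of why DoFn reuse matters for streaming pipelines with many
// small bundles: count how many "expensive setups" each strategy performs.
public class ReuseCostDemo {

    static class ConnectingFn {
        ConnectingFn(AtomicInteger setupCount) {
            setupCount.incrementAndGet();  // stands in for an expensive @Setup
        }
        void processBundle() { /* per-bundle work */ }
    }

    // Dataflow 1.x style: a new instance is created for every bundle.
    public static int setupsWithoutReuse(int bundles) {
        AtomicInteger setups = new AtomicInteger();
        for (int i = 0; i < bundles; i++) {
            new ConnectingFn(setups).processBundle();
        }
        return setups.get();
    }

    // Cached instance reused across bundles: setup happens exactly once.
    public static int setupsWithReuse(int bundles) {
        AtomicInteger setups = new AtomicInteger();
        ConnectingFn fn = new ConnectingFn(setups);
        for (int i = 0; i < bundles; i++) {
            fn.processBundle();
        }
        return setups.get();
    }

    public static void main(String[] args) {
        System.out.println(setupsWithoutReuse(1000));  // 1000
        System.out.println(setupsWithReuse(1000));     // 1
    }
}
```

With small streaming bundles the setup cost dominates, which is why runners cache instances - and why the instance lifecycle (and hence @Teardown) cannot be tied to any particular bundle.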

2. keep teardown a final release object (which is not that useful cause of
> the end of the sentence) and add a clean cache lifecycle management
>
> tempted to say 1 is saner short terms, in particular cause beam is 2.x and
> users already use it this way.
>
>
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>
>> So the concern is that @TearDown might not be called?
>>
>> Let's understand the reason for @TearDown. The runner is free to cache
>> the DoFn object across many invocations, and indeed in streaming this is
>> often a critical optimization. However if the runner does decide to destroy
>> the DoFn object (e.g. because it's being evicted from cache), often users
>> need a callback to tear down associated resources (file handles, RPC
>> connections, etc.).
>>
>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>> runner might never tear down the DoFn object! The runner might well decide
>> to cache the object forever, in which case there is never a time to call
>> @TearDown. There is no violation of semantics here.
>>
>> Also, the point about not calling teardown if the JVM crashes might well
>> sound implicit with no need to mention it. However empirically users do
>> misunderstand even this, so it's worth mentioning.
>>
>> Reuven
>>
>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I'm a bit concerned about this PR https://github.com/apache/beam/pull/4637
>>>
>>> I understand the intent but I'd like to share how I see it and why it is
>>> an issue for me:
>>>
>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>> preallocate some memory for instance to free it in case of OOME and try to
>>> recover but it never proved useful and got dropped recently. This is a
>>> good example you can't do anything if there is a cataclysm and therefore
>>> any framework or lib will not be blamed for it
>>> 2. if you expose an API, its behavior must be well defined. In the case
>>> of a portable library like Beam it is even more important otherwise it
>>> leads users to not use the API or the project :(.
>>>
>>>
>>> These two points lead to say that if the JVM crashes it is ok to not
>>> call teardown and it is even implicit in any programming environment so no
>>> need to mention it. However that a runner doesn't call teardown is a bug
>>> and not a feature or something intended because it can have a huge impact
>>> on the user flow.
>>>
>>> The user workarounds are to use custom threads with timeouts to execute
>>> the actions or things like that, all bad solutions to replace a buggy API,
>>> if you remove the contract guarantee.
>>>
>>> To make it obvious: substring(from, to): will substring the current
>>> string between from and to...or not. Would you use the function?
>>>
>>> What I ask is to add in the javadoc that the contract enforces the
>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>> the runner to do so. This way the not portable behavior is where it belongs
>>> to, in the vendor specific code. It leads to a reliable API for the end
>>> user and let runners document they don't respect - yet - the API when
>>> relevant.
>>>
>>> wdyt?
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>
>>
>

Re: @TearDown guarantees

Posted by Kenneth Knowles <kl...@google.com>.
It sounds like you just want @FinishBundle

On Fri, Feb 16, 2018 at 8:06 AM, Romain Manni-Bucau <rm...@gmail.com>
wrote:

> I see Reuven, so it is actually a broken contract for end users more than
> a bug. Concretely a user must have a way to execute code once the teardown
> is no more used and a teardown is populated by the user in the context of
> an execution.
> It means that if the environment wants to pool (cache) the instances it
> must provide a postBorrowFromCache and preReturnToCache to let the user
> handle that - we'll get back to EJB and passivation ;).
>
> Personally I think it is fine to cache the instances for the duration of
> an execution but not across executions. Concretely if you check out the API
> it should just not be possible for a runner since the lifecycle is not
> covered and the fact teardown can not be called today is an implementation
> bug/leak surfacing the API.
>
> So I see 2 options:
>
> 1. make it mandatory and get rid of the caching - which shouldn't help much
> in current state in terms of perf
> 2. keep teardown a final release object (which is not that useful cause of
> the end of the sentence) and add a clean cache lifecycle management
>
> tempted to say 1 is saner short terms, in particular cause beam is 2.x and
> users already use it this way.
>
>
>
> Romain Manni-Bucau
> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
> <https://rmannibucau.metawerx.net/> | Old Blog
> <http://rmannibucau.wordpress.com> | Github
> <https://github.com/rmannibucau> | LinkedIn
> <https://www.linkedin.com/in/rmannibucau> | Book
> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>
> 2018-02-16 16:53 GMT+01:00 Reuven Lax <re...@google.com>:
>
>> So the concern is that @TearDown might not be called?
>>
>> Let's understand the reason for @TearDown. The runner is free to cache
>> the DoFn object across many invocations, and indeed in streaming this is
>> often a critical optimization. However if the runner does decide to destroy
>> the DoFn object (e.g. because it's being evicted from cache), often users
>> need a callback to tear down associated resources (file handles, RPC
>> connections, etc.).
>>
>> Now @TearDown isn't guaranteed to be called for a simple reason: the
>> runner might never tear down the DoFn object! The runner might well decide
>> to cache the object forever, in which case there is never a time to call
>> @TearDown. There is no violation of semantics here.
>>
>> Also, the point about not calling teardown if the JVM crashes might well
>> sound implicit with no need to mention it. However empirically users do
>> misunderstand even this, so it's worth mentioning.
>>
>> Reuven
>>
>> On Fri, Feb 16, 2018 at 2:11 AM, Romain Manni-Bucau <
>> rmannibucau@gmail.com> wrote:
>>
>>> Hi guys,
>>>
>>> I'm a bit concerned about this PR https://github.com/apache/beam/pull/4637
>>>
>>> I understand the intent but I'd like to share how I see it and why it is
>>> an issue for me:
>>>
>>> 1. you can't help if the JVM crash in any case. Tomcat had a try to
>>> preallocate some memory for instance to free it in case of OOME and try to
>>> recover but it never proved useful and got dropped recently. This is a
>>> good example you can't do anything if there is a cataclysm and therefore
>>> any framework or lib will not be blamed for it
>>> 2. if you expose an API, its behavior must be well defined. In the case
>>> of a portable library like Beam it is even more important otherwise it
>>> leads users to not use the API or the project :(.
>>>
>>>
>>> These two points lead to say that if the JVM crashes it is ok to not
>>> call teardown and it is even implicit in any programming environment so no
>>> need to mention it. However that a runner doesn't call teardown is a bug
>>> and not a feature or something intended because it can have a huge impact
>>> on the user flow.
>>>
>>> The user workarounds are to use custom threads with timeouts to execute
>>> the actions or things like that, all bad solutions to replace a buggy API,
>>> if you remove the contract guarantee.
>>>
>>> To make it obvious: substring(from, to): will substring the current
>>> string between from and to...or not. Would you use the function?
>>>
>>> What I ask is to add in the javadoc that the contract enforces the
>>> runner to call that. Which means the core doesn't guarantee it but imposes
>>> the runner to do so. This way the not portable behavior is where it belongs
>>> to, in the vendor specific code. It leads to a reliable API for the end
>>> user and let runners document they don't respect - yet - the API when
>>> relevant.
>>>
>>> wdyt?
>>>
>>> Romain Manni-Bucau
>>> @rmannibucau <https://twitter.com/rmannibucau> |  Blog
>>> <https://rmannibucau.metawerx.net/> | Old Blog
>>> <http://rmannibucau.wordpress.com> | Github
>>> <https://github.com/rmannibucau> | LinkedIn
>>> <https://www.linkedin.com/in/rmannibucau> | Book
>>> <https://www.packtpub.com/application-development/java-ee-8-high-performance>
>>>
>>
>>
>

Re: @TearDown guarantees

Posted by Romain Manni-Bucau <rm...@gmail.com>.
I see, Reuven, so it is actually a broken contract for end users more than a
bug. Concretely, a user must have a way to execute code once the instance is
no longer used, and teardown logic is written by the user in the context of
an execution.
It means that if the environment wants to pool (cache) the instances, it
must provide postBorrowFromCache and preReturnToCache hooks to let the user
handle that - we'll get back to EJB and passivation ;).

Personally, I think it is fine to cache the instances for the duration of an
execution but not across executions. Concretely, if you look at the API, that
should just not be possible for a runner, since that lifecycle is not covered,
and the fact that teardown may not be called today is an implementation
bug/leak surfacing in the API.

So I see 2 options:

1. make teardown mandatory and get rid of the caching - which shouldn't help
much in its current state in terms of perf anyway
2. keep teardown as a final release callback (which is not that useful, per
the previous point) and add clean cache lifecycle management hooks

I'm tempted to say 1 is saner short term, in particular because Beam is 2.x
and users already use it this way.
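To make option 2 concrete, here is a minimal plain-Java sketch of what such
cache-lifecycle hooks could look like. The hook names (postBorrowFromCache,
preReturnToCache) come from this mail and are hypothetical - they are not
existing Beam APIs, and this class does not depend on Beam at all:

```java
// Hypothetical sketch: explicit cache-lifecycle hooks so user code can react
// when an instance is borrowed from or returned to the runner's cache.
public class PooledFnSketch {
    private boolean active = false;

    // Called by the runner after taking the instance out of its cache.
    public void postBorrowFromCache() { active = true; }

    // Called by the runner before putting the instance back into its cache.
    public void preReturnToCache() { active = false; }

    // Final release, guaranteed by the contract asked for above; safe to
    // call even if the instance was never borrowed.
    public void teardown() { active = false; }

    public boolean isActive() { return active; }

    public static void main(String[] args) {
        PooledFnSketch fn = new PooledFnSketch();
        fn.postBorrowFromCache();
        System.out.println("borrowed: active=" + fn.isActive());
        fn.preReturnToCache();
        System.out.println("returned: active=" + fn.isActive());
        fn.teardown();
    }
}
```

With hooks like these, per-execution resources could be acquired/released on
borrow/return, while teardown stays the one guaranteed final release point.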



Romain Manni-Bucau
@rmannibucau <https://twitter.com/rmannibucau> |  Blog
<https://rmannibucau.metawerx.net/> | Old Blog
<http://rmannibucau.wordpress.com> | Github <https://github.com/rmannibucau> |
LinkedIn <https://www.linkedin.com/in/rmannibucau> | Book
<https://www.packtpub.com/application-development/java-ee-8-high-performance>


Re: @TearDown guarantees

Posted by Reuven Lax <re...@google.com>.
So the concern is that @TearDown might not be called?

Let's understand the reason for @TearDown. The runner is free to cache the
DoFn object across many invocations, and indeed in streaming this is often
a critical optimization. However if the runner does decide to destroy the
DoFn object (e.g. because it's being evicted from cache), often users need
a callback to tear down associated resources (file handles, RPC
connections, etc.).

Now @TearDown isn't guaranteed to be called for a simple reason: the runner
might never tear down the DoFn object! The runner might well decide to
cache the object forever, in which case there is never a time to call
@TearDown. There is no violation of semantics here.
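The caching behavior described above can be sketched in plain Java (no Beam
dependency; the class and method names here are illustrative toys, not Beam
APIs): a runner-side instance cache keeps DoFn objects alive, and teardown
only runs when an instance is actually evicted.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class RunnerCacheSketch {
    // Stand-in for a user DoFn holding resources (file handles, RPC
    // connections, ...) that its teardown would release.
    static class FakeDoFn {
        boolean tornDown = false;
        void teardown() { tornDown = true; }
    }

    // Toy instance cache with fixed capacity; only eviction triggers teardown.
    static class DoFnCache {
        private final int capacity;
        private final Deque<FakeDoFn> pool = new ArrayDeque<>();
        DoFnCache(int capacity) { this.capacity = capacity; }
        void release(FakeDoFn fn) {
            if (pool.size() < capacity) {
                pool.push(fn);   // kept alive for reuse: teardown NOT called
            } else {
                fn.teardown();   // evicted: teardown called
            }
        }
    }

    public static void main(String[] args) {
        DoFnCache cache = new DoFnCache(1);
        FakeDoFn a = new FakeDoFn();
        FakeDoFn b = new FakeDoFn();
        cache.release(a); // cached (potentially for the worker's lifetime)
        cache.release(b); // cache full, so b is torn down
        System.out.println("a tornDown=" + a.tornDown + " b tornDown=" + b.tornDown);
    }
}
```

Instance `a` never sees teardown as long as it stays cached, which is exactly
the "cache the object forever" case: no eviction, no callback, no violation.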

Also, the point about not calling teardown if the JVM crashes might well
sound implicit with no need to mention it. However empirically users do
misunderstand even this, so it's worth mentioning.

Reuven
