You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@flink.apache.org by Yangze Guo <ka...@gmail.com> on 2020/04/01 09:22:39 UTC

[VOTE] FLIP-108: Add GPU support in Flink

Hi everyone,

I'd like to start the vote of FLIP-108 [1], which adds GPU support in
Flink. This FLIP is discussed in the thread[2].

The vote will be open for at least 72 hours. Unless there is an objection,
I will try to close it by April 4, 2020 10:00 UTC if we have received
sufficient votes.

[1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
[2] http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html

Best,
Yangze Guo

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Stephan Ewen <se...@apache.org>.
+1

On Thu, Apr 16, 2020 at 4:17 AM Yangze Guo <ka...@gmail.com> wrote:

> Hi Aljoscha,
>
> Thanks for your advice. +1 to align the config pattern.
>
> I also agree that we need to move the long discussion to the [DISCUSS]
> thread. Sorry if it bothers you.
>
> Best,
> Yangze Guo
>
> On Thu, Apr 16, 2020 at 7:52 AM Becket Qin <be...@gmail.com> wrote:
> >
> > I agree with Aljoscha. It is important to keep API the same style. And we
> > probably should move the long discussion to the [DISCUSS] thread.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Wed, Apr 15, 2020 at 11:27 PM Aljoscha Krettek <al...@apache.org>
> > wrote:
> >
> > > Is the only really new method on the public APIs
> > > getExternalResourceInfos(..) on the RuntimeContext? I'm generally quite
> > > skeptical about adding anything to that interface but the method seems
> ok.
> > >
> > > Side note for the configuration keys: the pattern is similar to metrics
> > > configuration. There we have "metrics.reporters" = <list of reporter
> > > names> and then metrics.reporter.<foobazzle>... Your proposal is
> > > slightly different in that it uses "external-resource.list". Keeping
> > > this in line with metrics configuration would suggest to use
> > > "external-resources", and then "external-resource.<foobazzle>...". What
> > > do you think?
> > >
> > > Also, why is there this long discussion in a [VOTE] thread?
> > >
> > > Best,
> > > Aljoscha
> > >
> > > On 15.04.20 10:32, Yangze Guo wrote:
> > > > Thanks for the explanation. I do not have a strong opinion regarding
> > > > this interface. So, if it is better from your perspective, I'm +1 for
> > > > this. I just saying it may not help a lot regarding the type-safe.
> > > >
> > > > Regarding the bounded wildcard type, yes, it's the implementation
> > > > detail. If it won't make a difference for user, I'm also +1 for not
> > > > using bounded wildcard type there.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Wed, Apr 15, 2020 at 4:23 PM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >>
> > > >> I think <T extends ExternalResourceInfo> Set<T>
> > > >> getExternalResourceInfos(String resourceName, Class<T>
> > > >> externalResourceType) is not less flexible than the other API since
> you
> > > can
> > > >> always pass in ExternalResourceInfo.class as the second argument.
> > > >>
> > > >> The benefit I see for the user is that he does not have to do the
> > > >> instanceof checks and type casts himself. This is admittedly not a
> big
> > > deal
> > > >> but still a better API imo.
> > > >>
> > > >> I think the interface of the Driver and what is returned by the
> > > >> RuntimeContext don't have to have the same type because you can
> cast it
> > > or
> > > >> repack it. If the current implementation simply stores what the
> Driver
> > > >> returns and RuntimeContext returns this map, then it might seem that
> > > there
> > > >> is a connection. But this should be an implementation detail rather
> > > than a
> > > >> necessity.
> > > >>
> > > >> Maybe we could also pull in someone from the SDK team to give us his
> > > >> opinion on the user facing API.
> > > >>
> > > >> Cheers,
> > > >> Till
> > > >>
> > > >> On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <
> tonysong820@gmail.com>
> > > wrote:
> > > >>
> > > >>>>
> > > >>>> I agree that such an interface won't give compile time checks but
> I
> > > think
> > > >>>> that it could be easier to use from a user's perspective because
> > > there is
> > > >>>> no explicit casting required.
> > > >>>> public interface RuntimeContext {
> > > >>>>      <T extends ExternalResourceInfo> Set<T>
> > > >>> getExternalResourceInfos(String
> > > >>>> resourceName, Class<T> externalResourceType);
> > > >>>> }
> > > >>>
> > > >>>
> > > >>> I'm not sure how less efforts is required from users to pass in a
> > > >>> `externalResourceType` compared to do an explicit type casting.
> > > >>> A potential side effect of passing in a `externalResourceType` is
> > > that, it
> > > >>> requires user (e.g. the operator) to know which specific type
> should be
> > > >>> returned in advance, which may limit the flexibility.
> > > >>>
> > > >>> E.g., we might have an operator that can work with multiple
> different
> > > >>> implementations of `ExternalResourceInfo`. It may decide its
> behavior
> > > based
> > > >>> on the actually type returned by `getExternalResourceInfos` at
> runtime.
> > > >>>
> > > >>>
> > > >>> Thank you~
> > > >>>
> > > >>> Xintong Song
> > > >>>
> > > >>>
> > > >>>
> > > >>> On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com>
> wrote:
> > > >>>
> > > >>>> @Till
> > > >>>> If we add "Class<T> externalResourceType" param, what if there are
> > > >>>> multiple subtypes in the ExternalResourceInfos set of one external
> > > >>>> resource? It seems user has to set the T to ExternalResourceInfo
> and
> > > >>>> the mechanism is useless at this case.
> > > >>>>
> > > >>>> Best,
> > > >>>> Yangze Guo
> > > >>>>
> > > >>>> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <
> trohrmann@apache.org>
> > > >>>> wrote:
> > > >>>>>
> > > >>>>> Ok, if there can be multiple resources of the same type then we
> > > >>>> definitely
> > > >>>>> need the name as a differentiator.
> > > >>>>>
> > > >>>>> I agree that such an interface won't give compile time checks
> but I
> > > >>> think
> > > >>>>> that it could be easier to use from a user's perspective because
> > > there
> > > >>> is
> > > >>>>> no explicit casting required.
> > > >>>>>
> > > >>>>> public interface RuntimeContext {
> > > >>>>>      <T extends ExternalResourceInfo> Set<T>
> > > >>>> getExternalResourceInfos(String
> > > >>>>> resourceName, Class<T> externalResourceType);
> > > >>>>> }
> > > >>>>>
> > > >>>>> One minor note: I think the value of the returned map does not
> need
> > > to
> > > >>>> use
> > > >>>>> a bounded wildcard type because for the user it won't make a
> > > >>> difference.
> > > >>>>>
> > > >>>>> Cheers,
> > > >>>>> Till
> > > >>>>>
> > > >>>>> On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > >>>>>
> > > >>>>>> Hi Till,
> > > >>>>>>
> > > >>>>>>> ExternalResourceDriver could return a Set<? extends
> > > >>>>>> ExternalResourceInfo>.
> > > >>>>>> It sounds good.
> > > >>>>>>
> > > >>>>>>> then one could make the interface type-safe by changing it to
> > > >>>>>>> public interface RuntimeContext {
> > > >>>>>>>     <T extends ExternalResourceInfo> Set<T>
> > > >>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
> > > >>>>>>> }
> > > >>>>>> I think it may not help.
> > > >>>>>> - I think the assumption of "there is always only one resource
> of a
> > > >>>>>> specific type" is too strong. The external resource framework
> should
> > > >>>>>> only assume it gets a set of ExternalResourceInfo from the
> driver.
> > > >>> The
> > > >>>>>> concrete implementation is given by user. So, if we give such an
> > > >>>>>> assumption, it would hurt the flexibility. There could be
> multiple
> > > >>>>>> types in the returned externalResourceInfo set. There could
> also be
> > > >>>>>> different types returned from different driver implementation or
> > > >>>>>> version. The contract about the return type between Driver and
> > > >>>>>> Operator should be guaranteed by user.
> > > >>>>>> - Since the Drivers are loaded dynamically in runtime, if there
> is a
> > > >>>>>> type mismatch, the job would fail in runtime instead of in
> compile
> > > >>>>>> time, no matter the type extraction is done by Operator or Flink
> > > >>> core.
> > > >>>>>> This interface would not gain benefits for type safety.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Yangze Guo
> > > >>>>>>
> > > >>>>>> On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <
> trohrmann@apache.org
> > > >
> > > >>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>> Thanks for updating the FLIP, Yangze.
> > > >>>>>>>
> > > >>>>>>> If ExternalResourceInfo is a marker interface, then
> > > >>>>>> ExternalResourceDriver
> > > >>>>>>> could return a Set<? extends ExternalResourceInfo>. This makes
> is a
> > > >>>> bit
> > > >>>>>>> nicer for the implementor because he can use the concrete
> subtype.
> > > >>>>>>>
> > > >>>>>>> If we assume that users will always cast the
> ExternalResourceInfo
> > > >>>>>> instance
> > > >>>>>>> into the concrete subtype and if we assume that there is always
> > > >>> only
> > > >>>> one
> > > >>>>>>> resource of a specific type, then one could make the interface
> > > >>>> type-safe
> > > >>>>>> by
> > > >>>>>>> changing it to
> > > >>>>>>>
> > > >>>>>>> public interface RuntimeContext {
> > > >>>>>>>      <T extends ExternalResourceInfo> Set<T>
> > > >>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
> > > >>>>>>> }
> > > >>>>>>>
> > > >>>>>>> If we want to support multiple GPU resources, then one would
> need
> > > >>> to
> > > >>>> use
> > > >>>>>>> the name of the respective resource as well.
> > > >>>>>>>
> > > >>>>>>> Cheers,
> > > >>>>>>> Till
> > > >>>>>>>
> > > >>>>>>> On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
> > > >>> tonysong820@gmail.com>
> > > >>>>>> wrote:
> > > >>>>>>>
> > > >>>>>>>> Thanks for updating the FLIP, Yangze.
> > > >>>>>>>> The latest FLIP looks good to me.
> > > >>>>>>>>
> > > >>>>>>>> nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo`
> is
> > > >>>> out of
> > > >>>>>>>> sync.
> > > >>>>>>>>
> > > >>>>>>>>> Retrieve the information of the external resources according
> to
> > > >>>> the
> > > >>>>>>>>> resourceProfile.
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> Thank you~
> > > >>>>>>>>
> > > >>>>>>>> Xintong Song
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>>
> > > >>>>>>>> On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
> > > >>> becket.qin@gmail.com>
> > > >>>>>> wrote:
> > > >>>>>>>>
> > > >>>>>>>>> Good feedback form Xintong. The latest FLIP looks good to me.
> > > >>>>>>>>>
> > > >>>>>>>>> Thanks,
> > > >>>>>>>>>
> > > >>>>>>>>> Jiangjie (Becket) Qin
> > > >>>>>>>>>
> > > >>>>>>>>> On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <
> karmagyz@gmail.com
> > > >>>>
> > > >>>>>> wrote:
> > > >>>>>>>>>
> > > >>>>>>>>>> Hi there,
> > > >>>>>>>>>> I've updated the FLIP accordingly. Please take a look. If
> you
> > > >>>> have
> > > >>>>>> any
> > > >>>>>>>>>> further concerns please let me know.
> > > >>>>>>>>>>
> > > >>>>>>>>>> Best,
> > > >>>>>>>>>> Yangze Guo
> > > >>>>>>>>>>
> > > >>>>>>>>>> On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
> > > >>> karmagyz@gmail.com
> > > >>>>>
> > > >>>>>> wrote:
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Thanks for the feedback, Xintong.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> - Should we have a factory interface for
> > > >>>>>> `ExternalResourceDriver`,
> > > >>>>>>>> that
> > > >>>>>>>>>>> takes the configuration and returns a driver instance?
> > > >>>>>> Otherwise, if
> > > >>>>>>>> we
> > > >>>>>>>>>> are
> > > >>>>>>>>>>> creating the driver instance with reflection, we kind of
> > > >>>>>> implicitly
> > > >>>>>>>>>>> requires the driver to have a public non-argument
> > > >>>> constructor.
> > > >>>>>> If we
> > > >>>>>>>>>>> decided to go with this approach, then we will not need
> > > >>>>>>>>>>> `ExternalResourceDriver#open`.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> True, we could have an `ExternalResourceDriverFactory`,
> > > >>> like
> > > >>>>>>>>>>> interface ExternalResourceDriverFactory {
> > > >>>>>>>>>>>      ExternalResourceDriver fromConfiguration(Configuration
> > > >>>>>> config);
> > > >>>>>>>>>>> }
> > > >>>>>>>>>>> Regarding the configuration, the user should provide
> > > >>>>>>>>>>> "external-resource.{resourceName}.driver-factory.class"
> > > >>>> instead.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> - Not sure about the necessity of
> > > >>>>>> `ExternalResourceDriver#close`. I
> > > >>>>>>>>> would
> > > >>>>>>>>>>> suggest to avoid introduce more interfaces if not
> > > >>> absolutely
> > > >>>>>>>> necessary.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> I add `ExternalResourceDriver#close` in case user needs to
> > > >>>> clean
> > > >>>>>> up
> > > >>>>>>>>>>> internal states and any other resources. It's true that it
> > > >>>> may
> > > >>>>>> not
> > > >>>>>>>>>>> absolutely necessary for our GPUDriver. From my side, I'm
> > > >>> ok
> > > >>>> to
> > > >>>>>>>> remove
> > > >>>>>>>>>>> it.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should not
> > > >>>> take
> > > >>>>>>>>>>> `ResourceProfile` as argument. This exposes more
> > > >>> information
> > > >>>>>> than it
> > > >>>>>>>>>> needs.
> > > >>>>>>>>>>> In addition, it requires the runtime/core to understand how
> > > >>>> to
> > > >>>>>>>> properly
> > > >>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
> > > >>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`, which
> > > >>>> is an
> > > >>>>>>>>>> abstract
> > > >>>>>>>>>>> class. Runtime/core has to known which implementation of
> > > >>>>>> `Resource`
> > > >>>>>>>> to
> > > >>>>>>>>>> use.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> True, at the moment, I think the amount of the resource is
> > > >>>>>> enough for
> > > >>>>>>>>>>> the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > >>>>>> future, if
> > > >>>>>>>>>>> the fine-grained external resource management is supported,
> > > >>>> the
> > > >>>>>>>> amount
> > > >>>>>>>>>>> of the resource seems to be enough either. If we want to
> > > >>>> leverage
> > > >>>>>>>> some
> > > >>>>>>>>>>> external resources which could not be measured by a single
> > > >>>> long
> > > >>>>>>>> value,
> > > >>>>>>>>>>> we might enrich this. But I'd like to keep it out of the
> > > >>>> scope of
> > > >>>>>>>> this
> > > >>>>>>>>>>> FLIP.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> - Do we really need `ExternalResourceInfo#getInformation`?
> > > >>> I
> > > >>>>>> think it
> > > >>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
> > > >>> empty
> > > >>>>>>>>> interface.
> > > >>>>>>>>>>> User can define their own `ExternalResourceInfo`
> > > >>>> implementation
> > > >>>>>> and
> > > >>>>>>>> how
> > > >>>>>>>>>> it
> > > >>>>>>>>>>> is used by the operator user codes.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Sounds good.
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> Best,
> > > >>>>>>>>>>> Yangze Guo
> > > >>>>>>>>>>>
> > > >>>>>>>>>>> On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > >>>>>> tonysong820@gmail.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Sorry to pull this back. I have some concerns about the
> > > >>>> recent
> > > >>>>>>>>> updated
> > > >>>>>>>>>>>> interface details.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> - Should we have a factory interface for
> > > >>>>>> `ExternalResourceDriver`,
> > > >>>>>>>>> that
> > > >>>>>>>>>>>> takes the configuration and returns a driver instance?
> > > >>>>>> Otherwise,
> > > >>>>>>>> if
> > > >>>>>>>>>> we are
> > > >>>>>>>>>>>> creating the driver instance with reflection, we kind of
> > > >>>>>> implicitly
> > > >>>>>>>>>>>> requires the driver to have a public non-argument
> > > >>>> constructor.
> > > >>>>>> If
> > > >>>>>>>> we
> > > >>>>>>>>>>>> decided to go with this approach, then we will not need
> > > >>>>>>>>>>>> `ExternalResourceDriver#open`.
> > > >>>>>>>>>>>> - Not sure about the necessity of
> > > >>>>>> `ExternalResourceDriver#close`. I
> > > >>>>>>>>>> would
> > > >>>>>>>>>>>> suggest to avoid introduce more interfaces if not
> > > >>>> absolutely
> > > >>>>>>>>> necessary.
> > > >>>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should
> > > >>> not
> > > >>>> take
> > > >>>>>>>>>>>> `ResourceProfile` as argument. This exposes more
> > > >>>> information
> > > >>>>>> than
> > > >>>>>>>> it
> > > >>>>>>>>>> needs.
> > > >>>>>>>>>>>> In addition, it requires the runtime/core to understand
> > > >>>> how to
> > > >>>>>>>>> properly
> > > >>>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
> > > >>>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`,
> > > >>>> which is
> > > >>>>>> an
> > > >>>>>>>>>> abstract
> > > >>>>>>>>>>>> class. Runtime/core has to known which implementation of
> > > >>>>>> `Resource`
> > > >>>>>>>>> to
> > > >>>>>>>>>> use.
> > > >>>>>>>>>>>> - Do we really need
> > > >>> `ExternalResourceInfo#getInformation`?
> > > >>>> I
> > > >>>>>> think
> > > >>>>>>>> it
> > > >>>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
> > > >>>> empty
> > > >>>>>>>>>> interface.
> > > >>>>>>>>>>>> User can define their own `ExternalResourceInfo`
> > > >>>>>> implementation and
> > > >>>>>>>>>> how it
> > > >>>>>>>>>>>> is used by the operator user codes.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Thank you~
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Xintong Song
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > >>>>>> becket.qin@gmail.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>>> +1
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Thanks for driving this effort, Ynagze. The latest FLIP
> > > >>>> wiki
> > > >>>>>>>> looks
> > > >>>>>>>>>> good to
> > > >>>>>>>>>>>>> me.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Jiangjie (Becket) Qin
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > >>>>>> karmagyz@gmail.com>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Edit: RuntimeContext interface
> > > >>>>>>>>>>>>>> From: Map<String, Set<ExternalResourceInfo>>
> > > >>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
> > > >>>>>>>>>>>>>> To: Map<String, Set<ExternalResourceInfo>>
> > > >>>>>>>>>> getExternalResourceInfo();
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>> Yangze Guo
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > >>>>>> karmagyz@gmail.com
> > > >>>>>>>>>
> > > >>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Hi, there
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> I have updated the FLIP, mainly target to make it
> > > >>>> more
> > > >>>>>>>> detailed
> > > >>>>>>>>>> and
> > > >>>>>>>>>>>>>>> clear. The general design is not changed, but there
> > > >>>> are
> > > >>>>>> still
> > > >>>>>>>>>> some
> > > >>>>>>>>>>>>>>> changes need to be notified here:
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> - Change the `ExternalResourceDriver` from abstract
> > > >>>>>> class to
> > > >>>>>>>>>>>>>>> interface, since it does not have an abstract
> > > >>>>>> implementation.
> > > >>>>>>>>>> Add life
> > > >>>>>>>>>>>>>>> cycle method `open` and `close`.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> - Specify the method added to the RuntimeContext
> > > >>> from
> > > >>>>>> which
> > > >>>>>>>>> user
> > > >>>>>>>>>> get
> > > >>>>>>>>>>>>>>> the information of external resources.
> > > >>>>>>>>>>>>>>>          Map<String, Set<ExternalResourceInfo>>
> > > >>>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> - Add "String getInformation()" method to
> > > >>>>>>>>> `ExternalResourceInfo`
> > > >>>>>>>>>>>>>> interface.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> - Treat adding external resource info to
> > > >>>> RestAPI/WebUI
> > > >>>>>> as a
> > > >>>>>>>>>> future
> > > >>>>>>>>>>>>> work.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> If you have any new concerns after that change,
> > > >>>> please
> > > >>>>>>>>> mentioned
> > > >>>>>>>>>> here.
> > > >>>>>>>>>>>>>>> Sorry for disturbing you.
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>> Yangze Guo
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > >>>>>>>>> danrtsey.wy@gmail.com>
> > > >>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Thanks Yangze for the efforts to support GPU
> > > >>>> extended
> > > >>>>>>>>>> resources.
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> +1 for this FLIP
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>> Yang
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>> Till Rohrmann <tr...@apache.org>
> > > >>> 于2020年4月2日周四
> > > >>>>>>>> 下午11:10写道:
> > > >>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Thanks for driving this effort Yangze.
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> +1
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> Cheers,
> > > >>>>>>>>>>>>>>>>> Till
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > >>>>>>>>>>>>> felixzhengcb@gmail.com
> > > >>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Thanks Yangze for driving the initial CPU
> > > >>>> support!
> > > >>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>> Xintong Song <to...@gmail.com>
> > > >>>> 于2020年4月1日周三
> > > >>>>>>>>>> 下午6:36写道:
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Thanks Yangze, the FLIP looks good to me.
> > > >>>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Thank you~
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> Xintong Song
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > >>>>>>>>>> karmagyz@gmail.com>
> > > >>>>>>>>>>>>>> wrote:
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Hi everyone,
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> I'd like to start the vote of FLIP-108
> > > >>> [1],
> > > >>>>>> which
> > > >>>>>>>>> adds
> > > >>>>>>>>>> GPU
> > > >>>>>>>>>>>>>> support in
> > > >>>>>>>>>>>>>>>>>>>> Flink. This FLIP is discussed in the
> > > >>>> thread[2].
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> The vote will be open for at least 72
> > > >>>> hours.
> > > >>>>>> Unless
> > > >>>>>>>>>> there is
> > > >>>>>>>>>>>>> an
> > > >>>>>>>>>>>>>>>>>>> objection,
> > > >>>>>>>>>>>>>>>>>>>> I will try to close it by April 4, 2020
> > > >>>> 10:00
> > > >>>>>> UTC
> > > >>>>>>>> if
> > > >>>>>>>>>> we have
> > > >>>>>>>>>>>>>> received
> > > >>>>>>>>>>>>>>>>>>>> sufficient votes.
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> [1]
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > >>>>>>>>>>>>>>>>>>>> [2]
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>> Best,
> > > >>>>>>>>>>>>>>>>>>>> Yangze Guo
> > > >>>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>>
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>
> > > >>>>>>>>>
> > > >>>>>>>>
> > > >>>>>>
> > > >>>>
> > > >>>
> > >
> > >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Hi Aljoscha,

Thanks for your advice. +1 to align the config pattern.

I also agree that we need to move the long discussion to the [DISCUSS]
thread. Sorry if it bothers you.

Best,
Yangze Guo

On Thu, Apr 16, 2020 at 7:52 AM Becket Qin <be...@gmail.com> wrote:
>
> I agree with Aljoscha. It is important to keep API the same style. And we
> probably should move the long discussion to the [DISCUSS] thread.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Wed, Apr 15, 2020 at 11:27 PM Aljoscha Krettek <al...@apache.org>
> wrote:
>
> > Is the only really new method on the public APIs
> > getExternalResourceInfos(..) on the RuntimeContext? I'm generally quite
> > skeptical about adding anything to that interface but the method seems ok.
> >
> > Side note for the configuration keys: the pattern is similar to metrics
> > configuration. There we have "metrics.reporters" = <list of reporter
> > names> and then metrics.reporter.<foobazzle>... Your proposal is
> > slightly different in that it uses "external-resource.list". Keeping
> > this in line with metrics configuration would suggest to use
> > "external-resources", and then "external-resource.<foobazzle>...". What
> > do you think?
> >
> > Also, why is there this long discussion in a [VOTE] thread?
> >
> > Best,
> > Aljoscha
> >
> > On 15.04.20 10:32, Yangze Guo wrote:
> > > Thanks for the explanation. I do not have a strong opinion regarding
> > > this interface. So, if it is better from your perspective, I'm +1 for
> > > this. I just saying it may not help a lot regarding the type-safe.
> > >
> > > Regarding the bounded wildcard type, yes, it's the implementation
> > > detail. If it won't make a difference for user, I'm also +1 for not
> > > using bounded wildcard type there.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 15, 2020 at 4:23 PM Till Rohrmann <tr...@apache.org>
> > wrote:
> > >>
> > >> I think <T extends ExternalResourceInfo> Set<T>
> > >> getExternalResourceInfos(String resourceName, Class<T>
> > >> externalResourceType) is not less flexible than the other API since you
> > can
> > >> always pass in ExternalResourceInfo.class as the second argument.
> > >>
> > >> The benefit I see for the user is that he does not have to do the
> > >> instanceof checks and type casts himself. This is admittedly not a big
> > deal
> > >> but still a better API imo.
> > >>
> > >> I think the interface of the Driver and what is returned by the
> > >> RuntimeContext don't have to have the same type because you can cast it
> > or
> > >> repack it. If the current implementation simply stores what the Driver
> > >> returns and RuntimeContext returns this map, then it might seem that
> > there
> > >> is a connection. But this should be an implementation detail rather
> > than a
> > >> necessity.
> > >>
> > >> Maybe we could also pull in someone from the SDK team to give us his
> > >> opinion on the user facing API.
> > >>
> > >> Cheers,
> > >> Till
> > >>
> > >> On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <to...@gmail.com>
> > wrote:
> > >>
> > >>>>
> > >>>> I agree that such an interface won't give compile time checks but I
> > think
> > >>>> that it could be easier to use from a user's perspective because
> > there is
> > >>>> no explicit casting required.
> > >>>> public interface RuntimeContext {
> > >>>>      <T extends ExternalResourceInfo> Set<T>
> > >>> getExternalResourceInfos(String
> > >>>> resourceName, Class<T> externalResourceType);
> > >>>> }
> > >>>
> > >>>
> > >>> I'm not sure how less efforts is required from users to pass in a
> > >>> `externalResourceType` compared to do an explicit type casting.
> > >>> A potential side effect of passing in a `externalResourceType` is
> > that, it
> > >>> requires user (e.g. the operator) to know which specific type should be
> > >>> returned in advance, which may limit the flexibility.
> > >>>
> > >>> E.g., we might have an operator that can work with multiple different
> > >>> implementations of `ExternalResourceInfo`. It may decide its behavior
> > based
> > >>> on the actually type returned by `getExternalResourceInfos` at runtime.
> > >>>
> > >>>
> > >>> Thank you~
> > >>>
> > >>> Xintong Song
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:
> > >>>
> > >>>> @Till
> > >>>> If we add "Class<T> externalResourceType" param, what if there are
> > >>>> multiple subtypes in the ExternalResourceInfos set of one external
> > >>>> resource? It seems user has to set the T to ExternalResourceInfo and
> > >>>> the mechanism is useless at this case.
> > >>>>
> > >>>> Best,
> > >>>> Yangze Guo
> > >>>>
> > >>>> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> > >>>> wrote:
> > >>>>>
> > >>>>> Ok, if there can be multiple resources of the same type then we
> > >>>> definitely
> > >>>>> need the name as a differentiator.
> > >>>>>
> > >>>>> I agree that such an interface won't give compile time checks but I
> > >>> think
> > >>>>> that it could be easier to use from a user's perspective because
> > there
> > >>> is
> > >>>>> no explicit casting required.
> > >>>>>
> > >>>>> public interface RuntimeContext {
> > >>>>>      <T extends ExternalResourceInfo> Set<T>
> > >>>> getExternalResourceInfos(String
> > >>>>> resourceName, Class<T> externalResourceType);
> > >>>>> }
> > >>>>>
> > >>>>> One minor note: I think the value of the returned map does not need
> > to
> > >>>> use
> > >>>>> a bounded wildcard type because for the user it won't make a
> > >>> difference.
> > >>>>>
> > >>>>> Cheers,
> > >>>>> Till
> > >>>>>
> > >>>>> On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com>
> > wrote:
> > >>>>>
> > >>>>>> Hi Till,
> > >>>>>>
> > >>>>>>> ExternalResourceDriver could return a Set<? extends
> > >>>>>> ExternalResourceInfo>.
> > >>>>>> It sounds good.
> > >>>>>>
> > >>>>>>> then one could make the interface type-safe by changing it to
> > >>>>>>> public interface RuntimeContext {
> > >>>>>>>     <T extends ExternalResourceInfo> Set<T>
> > >>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
> > >>>>>>> }
> > >>>>>> I think it may not help.
> > >>>>>> - I think the assumption of "there is always only one resource of a
> > >>>>>> specific type" is too strong. The external resource framework should
> > >>>>>> only assume it gets a set of ExternalResourceInfo from the driver.
> > >>> The
> > >>>>>> concrete implementation is given by user. So, if we give such an
> > >>>>>> assumption, it would hurt the flexibility. There could be multiple
> > >>>>>> types in the returned externalResourceInfo set. There could also be
> > >>>>>> different types returned from different driver implementation or
> > >>>>>> version. The contract about the return type between Driver and
> > >>>>>> Operator should be guaranteed by user.
> > >>>>>> - Since the Drivers are loaded dynamically in runtime, if there is a
> > >>>>>> type mismatch, the job would fail in runtime instead of in compile
> > >>>>>> time, no matter the type extraction is done by Operator or Flink
> > >>> core.
> > >>>>>> This interface would not gain benefits for type safety.
> > >>>>>>
> > >>>>>> Best,
> > >>>>>> Yangze Guo
> > >>>>>>
> > >>>>>> On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <trohrmann@apache.org
> > >
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>> Thanks for updating the FLIP, Yangze.
> > >>>>>>>
> > >>>>>>> If ExternalResourceInfo is a marker interface, then
> > >>>>>> ExternalResourceDriver
> > >>>>>>> could return a Set<? extends ExternalResourceInfo>. This makes is a
> > >>>> bit
> > >>>>>>> nicer for the implementor because he can use the concrete subtype.
> > >>>>>>>
> > >>>>>>> If we assume that users will always cast the ExternalResourceInfo
> > >>>>>> instance
> > >>>>>>> into the concrete subtype and if we assume that there is always
> > >>> only
> > >>>> one
> > >>>>>>> resource of a specific type, then one could make the interface
> > >>>> type-safe
> > >>>>>> by
> > >>>>>>> changing it to
> > >>>>>>>
> > >>>>>>> public interface RuntimeContext {
> > >>>>>>>      <T extends ExternalResourceInfo> Set<T>
> > >>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
> > >>>>>>> }
> > >>>>>>>
> > >>>>>>> If we want to support multiple GPU resources, then one would need
> > >>> to
> > >>>> use
> > >>>>>>> the name of the respective resource as well.
> > >>>>>>>
> > >>>>>>> Cheers,
> > >>>>>>> Till
> > >>>>>>>
> > >>>>>>> On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
> > >>> tonysong820@gmail.com>
> > >>>>>> wrote:
> > >>>>>>>
> > >>>>>>>> Thanks for updating the FLIP, Yangze.
> > >>>>>>>> The latest FLIP looks good to me.
> > >>>>>>>>
> > >>>>>>>> nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
> > >>>> out of
> > >>>>>>>> sync.
> > >>>>>>>>
> > >>>>>>>>> Retrieve the information of the external resources according to
> > >>>> the
> > >>>>>>>>> resourceProfile.
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> Thank you~
> > >>>>>>>>
> > >>>>>>>> Xintong Song
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>>
> > >>>>>>>> On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
> > >>> becket.qin@gmail.com>
> > >>>>>> wrote:
> > >>>>>>>>
> > >>>>>>>>> Good feedback form Xintong. The latest FLIP looks good to me.
> > >>>>>>>>>
> > >>>>>>>>> Thanks,
> > >>>>>>>>>
> > >>>>>>>>> Jiangjie (Becket) Qin
> > >>>>>>>>>
> > >>>>>>>>> On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <karmagyz@gmail.com
> > >>>>
> > >>>>>> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> Hi there,
> > >>>>>>>>>> I've updated the FLIP accordingly. Please take a look. If you
> > >>>> have
> > >>>>>> any
> > >>>>>>>>>> further concerns please let me know.
> > >>>>>>>>>>
> > >>>>>>>>>> Best,
> > >>>>>>>>>> Yangze Guo
> > >>>>>>>>>>
> > >>>>>>>>>> On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
> > >>> karmagyz@gmail.com
> > >>>>>
> > >>>>>> wrote:
> > >>>>>>>>>>>
> > >>>>>>>>>>> Thanks for the feedback, Xintong.
> > >>>>>>>>>>>
> > >>>>>>>>>>> - Should we have a factory interface for
> > >>>>>> `ExternalResourceDriver`,
> > >>>>>>>> that
> > >>>>>>>>>>> takes the configuration and returns a driver instance?
> > >>>>>> Otherwise, if
> > >>>>>>>> we
> > >>>>>>>>>> are
> > >>>>>>>>>>> creating the driver instance with reflection, we kind of
> > >>>>>> implicitly
> > >>>>>>>>>>> requires the driver to have a public non-argument
> > >>>> constructor.
> > >>>>>> If we
> > >>>>>>>>>>> decided to go with this approach, then we will not need
> > >>>>>>>>>>> `ExternalResourceDriver#open`.
> > >>>>>>>>>>>
> > >>>>>>>>>>> True, we could have an `ExternalResourceDriverFactory`,
> > >>> like
> > >>>>>>>>>>> interface ExternalResourceDriverFactory {
> > >>>>>>>>>>>      ExternalResourceDriver fromConfiguration(Configuration
> > >>>>>> config);
> > >>>>>>>>>>> }
> > >>>>>>>>>>> Regarding the configuration, the user should provide
> > >>>>>>>>>>> "external-resource.{resourceName}.driver-factory.class"
> > >>>> instead.
> > >>>>>>>>>>>
> > >>>>>>>>>>> - Not sure about the necessity of
> > >>>>>> `ExternalResourceDriver#close`. I
> > >>>>>>>>> would
> > >>>>>>>>>>> suggest to avoid introduce more interfaces if not
> > >>> absolutely
> > >>>>>>>> necessary.
> > >>>>>>>>>>>
> > >>>>>>>>>>> I add `ExternalResourceDriver#close` in case user needs to
> > >>>> clean
> > >>>>>> up
> > >>>>>>>>>>> internal states and any other resources. It's true that it
> > >>>> may
> > >>>>>> not
> > >>>>>>>>>>> absolutely necessary for our GPUDriver. From my side, I'm
> > >>> ok
> > >>>> to
> > >>>>>>>> remove
> > >>>>>>>>>>> it.
> > >>>>>>>>>>>
> > >>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should not
> > >>>> take
> > >>>>>>>>>>> `ResourceProfile` as argument. This exposes more
> > >>> information
> > >>>>>> than it
> > >>>>>>>>>> needs.
> > >>>>>>>>>>> In addition, it requires the runtime/core to understand how
> > >>>> to
> > >>>>>>>> properly
> > >>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
> > >>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`, which
> > >>>> is an
> > >>>>>>>>>> abstract
> > >>>>>>>>>>> class. Runtime/core has to known which implementation of
> > >>>>>> `Resource`
> > >>>>>>>> to
> > >>>>>>>>>> use.
> > >>>>>>>>>>>
> > >>>>>>>>>>> True, at the moment, I think the amount of the resource is
> > >>>>>> enough for
> > >>>>>>>>>>> the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > >>>>>> future, if
> > >>>>>>>>>>> the fine-grained external resource management is supported,
> > >>>> the
> > >>>>>>>> amount
> > >>>>>>>>>>> of the resource seems to be enough either. If we want to
> > >>>> leverage
> > >>>>>>>> some
> > >>>>>>>>>>> external resources which could not be measured by a single
> > >>>> long
> > >>>>>>>> value,
> > >>>>>>>>>>> we might enrich this. But I'd like to keep it out of the
> > >>>> scope of
> > >>>>>>>> this
> > >>>>>>>>>>> FLIP.
> > >>>>>>>>>>>
> > >>>>>>>>>>> - Do we really need `ExternalResourceInfo#getInformation`?
> > >>> I
> > >>>>>> think it
> > >>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
> > >>> empty
> > >>>>>>>>> interface.
> > >>>>>>>>>>> User can define their own `ExternalResourceInfo`
> > >>>> implementation
> > >>>>>> and
> > >>>>>>>> how
> > >>>>>>>>>> it
> > >>>>>>>>>>> is used by the operator user codes.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Sounds good.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Best,
> > >>>>>>>>>>> Yangze Guo
> > >>>>>>>>>>>
> > >>>>>>>>>>> On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > >>>>>> tonysong820@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Sorry to pull this back. I have some concerns about the
> > >>>> recent
> > >>>>>>>>> updated
> > >>>>>>>>>>>> interface details.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> - Should we have a factory interface for
> > >>>>>> `ExternalResourceDriver`,
> > >>>>>>>>> that
> > >>>>>>>>>>>> takes the configuration and returns a driver instance?
> > >>>>>> Otherwise,
> > >>>>>>>> if
> > >>>>>>>>>> we are
> > >>>>>>>>>>>> creating the driver instance with reflection, we kind of
> > >>>>>> implicitly
> > >>>>>>>>>>>> requires the driver to have a public non-argument
> > >>>> constructor.
> > >>>>>> If
> > >>>>>>>> we
> > >>>>>>>>>>>> decided to go with this approach, then we will not need
> > >>>>>>>>>>>> `ExternalResourceDriver#open`.
> > >>>>>>>>>>>> - Not sure about the necessity of
> > >>>>>> `ExternalResourceDriver#close`. I
> > >>>>>>>>>> would
> > >>>>>>>>>>>> suggest to avoid introduce more interfaces if not
> > >>>> absolutely
> > >>>>>>>>> necessary.
> > >>>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should
> > >>> not
> > >>>> take
> > >>>>>>>>>>>> `ResourceProfile` as argument. This exposes more
> > >>>> information
> > >>>>>> than
> > >>>>>>>> it
> > >>>>>>>>>> needs.
> > >>>>>>>>>>>> In addition, it requires the runtime/core to understand
> > >>>> how to
> > >>>>>>>>> properly
> > >>>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
> > >>>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`,
> > >>>> which is
> > >>>>>> an
> > >>>>>>>>>> abstract
> > >>>>>>>>>>>> class. Runtime/core has to known which implementation of
> > >>>>>> `Resource`
> > >>>>>>>>> to
> > >>>>>>>>>> use.
> > >>>>>>>>>>>> - Do we really need
> > >>> `ExternalResourceInfo#getInformation`?
> > >>>> I
> > >>>>>> think
> > >>>>>>>> it
> > >>>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
> > >>>> empty
> > >>>>>>>>>> interface.
> > >>>>>>>>>>>> User can define their own `ExternalResourceInfo`
> > >>>>>> implementation and
> > >>>>>>>>>> how it
> > >>>>>>>>>>>> is used by the operator user codes.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Thank you~
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> Xintong Song
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > >>>>>> becket.qin@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>
> > >>>>>>>>>>>>> +1
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Thanks for driving this effort, Ynagze. The latest FLIP
> > >>>> wiki
> > >>>>>>>> looks
> > >>>>>>>>>> good to
> > >>>>>>>>>>>>> me.
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> Jiangjie (Becket) Qin
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>> On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > >>>>>> karmagyz@gmail.com>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Edit: RuntimeContext interface
> > >>>>>>>>>>>>>> From: Map<String, Set<ExternalResourceInfo>>
> > >>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
> > >>>>>>>>>>>>>> To: Map<String, Set<ExternalResourceInfo>>
> > >>>>>>>>>> getExternalResourceInfo();
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>> Yangze Guo
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > >>>>>> karmagyz@gmail.com
> > >>>>>>>>>
> > >>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Hi, there
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> I have updated the FLIP, mainly target to make it
> > >>>> more
> > >>>>>>>> detailed
> > >>>>>>>>>> and
> > >>>>>>>>>>>>>>> clear. The general design is not changed, but there
> > >>>> are
> > >>>>>> still
> > >>>>>>>>>> some
> > >>>>>>>>>>>>>>> changes need to be notified here:
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> - Change the `ExternalResourceDriver` from abstract
> > >>>>>> class to
> > >>>>>>>>>>>>>>> interface, since it does not have an abstract
> > >>>>>> implementation.
> > >>>>>>>>>> Add life
> > >>>>>>>>>>>>>>> cycle method `open` and `close`.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> - Specify the method added to the RuntimeContext
> > >>> from
> > >>>>>> which
> > >>>>>>>>> user
> > >>>>>>>>>> get
> > >>>>>>>>>>>>>>> the information of external resources.
> > >>>>>>>>>>>>>>>          Map<String, Set<ExternalResourceInfo>>
> > >>>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> - Add "String getInformation()" method to
> > >>>>>>>>> `ExternalResourceInfo`
> > >>>>>>>>>>>>>> interface.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> - Treat adding external resource info to
> > >>>> RestAPI/WebUI
> > >>>>>> as a
> > >>>>>>>>>> future
> > >>>>>>>>>>>>> work.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> If you have any new concerns after that change,
> > >>>> please
> > >>>>>>>>> mentioned
> > >>>>>>>>>> here.
> > >>>>>>>>>>>>>>> Sorry for disturbing you.
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>> Yangze Guo
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > >>>>>>>>> danrtsey.wy@gmail.com>
> > >>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Thanks Yangze for the efforts to support GPU
> > >>>> extended
> > >>>>>>>>>> resources.
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> +1 for this FLIP
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>> Yang
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>> Till Rohrmann <tr...@apache.org>
> > >>> 于2020年4月2日周四
> > >>>>>>>> 下午11:10写道:
> > >>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Thanks for driving this effort Yangze.
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> +1
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> Cheers,
> > >>>>>>>>>>>>>>>>> Till
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > >>>>>>>>>>>>> felixzhengcb@gmail.com
> > >>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Thanks Yangze for driving the initial CPU
> > >>>> support!
> > >>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>> Xintong Song <to...@gmail.com>
> > >>>> 于2020年4月1日周三
> > >>>>>>>>>> 下午6:36写道:
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thanks Yangze, the FLIP looks good to me.
> > >>>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Thank you~
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> Xintong Song
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > >>>>>>>>>> karmagyz@gmail.com>
> > >>>>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Hi everyone,
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> I'd like to start the vote of FLIP-108
> > >>> [1],
> > >>>>>> which
> > >>>>>>>>> adds
> > >>>>>>>>>> GPU
> > >>>>>>>>>>>>>> support in
> > >>>>>>>>>>>>>>>>>>>> Flink. This FLIP is discussed in the
> > >>>> thread[2].
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> The vote will be open for at least 72
> > >>>> hours.
> > >>>>>> Unless
> > >>>>>>>>>> there is
> > >>>>>>>>>>>>> an
> > >>>>>>>>>>>>>>>>>>> objection,
> > >>>>>>>>>>>>>>>>>>>> I will try to close it by April 4, 2020
> > >>>> 10:00
> > >>>>>> UTC
> > >>>>>>>> if
> > >>>>>>>>>> we have
> > >>>>>>>>>>>>>> received
> > >>>>>>>>>>>>>>>>>>>> sufficient votes.
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> [1]
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > >>>>>>>>>>>>>>>>>>>> [2]
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>> Best,
> > >>>>>>>>>>>>>>>>>>>> Yangze Guo
> > >>>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>>>>
> > >>>>>>>>>>>>>>
> > >>>>>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>
> > >>>>>>
> > >>>>
> > >>>
> >
> >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Becket Qin <be...@gmail.com>.
I agree with Aljoscha. It is important to keep API the same style. And we
probably should move the long discussion to the [DISCUSS] thread.

Thanks,

Jiangjie (Becket) Qin

On Wed, Apr 15, 2020 at 11:27 PM Aljoscha Krettek <al...@apache.org>
wrote:

> Is the only really new method on the public APIs
> getExternalResourceInfos(..) on the RuntimeContext? I'm generally quite
> skeptical about adding anything to that interface but the method seems ok.
>
> Side note for the configuration keys: the pattern is similar to metrics
> configuration. There we have "metrics.reporters" = <list of reporter
> names> and then metrics.reporter.<foobazzle>... Your proposal is
> slightly different in that it uses "external-resource.list". Keeping
> this in line with metrics configuration would suggest to use
> "external-resources", and then "external-resource.<foobazzle>...". What
> do you think?
>
> Also, why is there this long discussion in a [VOTE] thread?
>
> Best,
> Aljoscha
>
> On 15.04.20 10:32, Yangze Guo wrote:
> > Thanks for the explanation. I do not have a strong opinion regarding
> > this interface. So, if it is better from your perspective, I'm +1 for
> > this. I just saying it may not help a lot regarding the type-safe.
> >
> > Regarding the bounded wildcard type, yes, it's the implementation
> > detail. If it won't make a difference for user, I'm also +1 for not
> > using bounded wildcard type there.
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Apr 15, 2020 at 4:23 PM Till Rohrmann <tr...@apache.org>
> wrote:
> >>
> >> I think <T extends ExternalResourceInfo> Set<T>
> >> getExternalResourceInfos(String resourceName, Class<T>
> >> externalResourceType) is not less flexible than the other API since you
> can
> >> always pass in ExternalResourceInfo.class as the second argument.
> >>
> >> The benefit I see for the user is that he does not have to do the
> >> instanceof checks and type casts himself. This is admittedly not a big
> deal
> >> but still a better API imo.
> >>
> >> I think the interface of the Driver and what is returned by the
> >> RuntimeContext don't have to have the same type because you can cast it
> or
> >> repack it. If the current implementation simply stores what the Driver
> >> returns and RuntimeContext returns this map, then it might seem that
> there
> >> is a connection. But this should be an implementation detail rather
> than a
> >> necessity.
> >>
> >> Maybe we could also pull in someone from the SDK team to give us his
> >> opinion on the user facing API.
> >>
> >> Cheers,
> >> Till
> >>
> >> On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <to...@gmail.com>
> wrote:
> >>
> >>>>
> >>>> I agree that such an interface won't give compile time checks but I
> think
> >>>> that it could be easier to use from a user's perspective because
> there is
> >>>> no explicit casting required.
> >>>> public interface RuntimeContext {
> >>>>      <T extends ExternalResourceInfo> Set<T>
> >>> getExternalResourceInfos(String
> >>>> resourceName, Class<T> externalResourceType);
> >>>> }
> >>>
> >>>
> >>> I'm not sure how less efforts is required from users to pass in a
> >>> `externalResourceType` compared to do an explicit type casting.
> >>> A potential side effect of passing in a `externalResourceType` is
> that, it
> >>> requires user (e.g. the operator) to know which specific type should be
> >>> returned in advance, which may limit the flexibility.
> >>>
> >>> E.g., we might have an operator that can work with multiple different
> >>> implementations of `ExternalResourceInfo`. It may decide its behavior
> based
> >>> on the actually type returned by `getExternalResourceInfos` at runtime.
> >>>
> >>>
> >>> Thank you~
> >>>
> >>> Xintong Song
> >>>
> >>>
> >>>
> >>> On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:
> >>>
> >>>> @Till
> >>>> If we add "Class<T> externalResourceType" param, what if there are
> >>>> multiple subtypes in the ExternalResourceInfos set of one external
> >>>> resource? It seems user has to set the T to ExternalResourceInfo and
> >>>> the mechanism is useless at this case.
> >>>>
> >>>> Best,
> >>>> Yangze Guo
> >>>>
> >>>> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> >>>> wrote:
> >>>>>
> >>>>> Ok, if there can be multiple resources of the same type then we
> >>>> definitely
> >>>>> need the name as a differentiator.
> >>>>>
> >>>>> I agree that such an interface won't give compile time checks but I
> >>> think
> >>>>> that it could be easier to use from a user's perspective because
> there
> >>> is
> >>>>> no explicit casting required.
> >>>>>
> >>>>> public interface RuntimeContext {
> >>>>>      <T extends ExternalResourceInfo> Set<T>
> >>>> getExternalResourceInfos(String
> >>>>> resourceName, Class<T> externalResourceType);
> >>>>> }
> >>>>>
> >>>>> One minor note: I think the value of the returned map does not need
> to
> >>>> use
> >>>>> a bounded wildcard type because for the user it won't make a
> >>> difference.
> >>>>>
> >>>>> Cheers,
> >>>>> Till
> >>>>>
> >>>>> On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com>
> wrote:
> >>>>>
> >>>>>> Hi Till,
> >>>>>>
> >>>>>>> ExternalResourceDriver could return a Set<? extends
> >>>>>> ExternalResourceInfo>.
> >>>>>> It sounds good.
> >>>>>>
> >>>>>>> then one could make the interface type-safe by changing it to
> >>>>>>> public interface RuntimeContext {
> >>>>>>>     <T extends ExternalResourceInfo> Set<T>
> >>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
> >>>>>>> }
> >>>>>> I think it may not help.
> >>>>>> - I think the assumption of "there is always only one resource of a
> >>>>>> specific type" is too strong. The external resource framework should
> >>>>>> only assume it gets a set of ExternalResourceInfo from the driver.
> >>> The
> >>>>>> concrete implementation is given by user. So, if we give such an
> >>>>>> assumption, it would hurt the flexibility. There could be multiple
> >>>>>> types in the returned externalResourceInfo set. There could also be
> >>>>>> different types returned from different driver implementation or
> >>>>>> version. The contract about the return type between Driver and
> >>>>>> Operator should be guaranteed by user.
> >>>>>> - Since the Drivers are loaded dynamically in runtime, if there is a
> >>>>>> type mismatch, the job would fail in runtime instead of in compile
> >>>>>> time, no matter the type extraction is done by Operator or Flink
> >>> core.
> >>>>>> This interface would not gain benefits for type safety.
> >>>>>>
> >>>>>> Best,
> >>>>>> Yangze Guo
> >>>>>>
> >>>>>> On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <trohrmann@apache.org
> >
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> Thanks for updating the FLIP, Yangze.
> >>>>>>>
> >>>>>>> If ExternalResourceInfo is a marker interface, then
> >>>>>> ExternalResourceDriver
> >>>>>>> could return a Set<? extends ExternalResourceInfo>. This makes is a
> >>>> bit
> >>>>>>> nicer for the implementor because he can use the concrete subtype.
> >>>>>>>
> >>>>>>> If we assume that users will always cast the ExternalResourceInfo
> >>>>>> instance
> >>>>>>> into the concrete subtype and if we assume that there is always
> >>> only
> >>>> one
> >>>>>>> resource of a specific type, then one could make the interface
> >>>> type-safe
> >>>>>> by
> >>>>>>> changing it to
> >>>>>>>
> >>>>>>> public interface RuntimeContext {
> >>>>>>>      <T extends ExternalResourceInfo> Set<T>
> >>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
> >>>>>>> }
> >>>>>>>
> >>>>>>> If we want to support multiple GPU resources, then one would need
> >>> to
> >>>> use
> >>>>>>> the name of the respective resource as well.
> >>>>>>>
> >>>>>>> Cheers,
> >>>>>>> Till
> >>>>>>>
> >>>>>>> On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
> >>> tonysong820@gmail.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> Thanks for updating the FLIP, Yangze.
> >>>>>>>> The latest FLIP looks good to me.
> >>>>>>>>
> >>>>>>>> nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
> >>>> out of
> >>>>>>>> sync.
> >>>>>>>>
> >>>>>>>>> Retrieve the information of the external resources according to
> >>>> the
> >>>>>>>>> resourceProfile.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Thank you~
> >>>>>>>>
> >>>>>>>> Xintong Song
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
> >>> becket.qin@gmail.com>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> Good feedback form Xintong. The latest FLIP looks good to me.
> >>>>>>>>>
> >>>>>>>>> Thanks,
> >>>>>>>>>
> >>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>
> >>>>>>>>> On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <karmagyz@gmail.com
> >>>>
> >>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi there,
> >>>>>>>>>> I've updated the FLIP accordingly. Please take a look. If you
> >>>> have
> >>>>>> any
> >>>>>>>>>> further concerns please let me know.
> >>>>>>>>>>
> >>>>>>>>>> Best,
> >>>>>>>>>> Yangze Guo
> >>>>>>>>>>
> >>>>>>>>>> On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
> >>> karmagyz@gmail.com
> >>>>>
> >>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>> Thanks for the feedback, Xintong.
> >>>>>>>>>>>
> >>>>>>>>>>> - Should we have a factory interface for
> >>>>>> `ExternalResourceDriver`,
> >>>>>>>> that
> >>>>>>>>>>> takes the configuration and returns a driver instance?
> >>>>>> Otherwise, if
> >>>>>>>> we
> >>>>>>>>>> are
> >>>>>>>>>>> creating the driver instance with reflection, we kind of
> >>>>>> implicitly
> >>>>>>>>>>> requires the driver to have a public non-argument
> >>>> constructor.
> >>>>>> If we
> >>>>>>>>>>> decided to go with this approach, then we will not need
> >>>>>>>>>>> `ExternalResourceDriver#open`.
> >>>>>>>>>>>
> >>>>>>>>>>> True, we could have an `ExternalResourceDriverFactory`,
> >>> like
> >>>>>>>>>>> interface ExternalResourceDriverFactory {
> >>>>>>>>>>>      ExternalResourceDriver fromConfiguration(Configuration
> >>>>>> config);
> >>>>>>>>>>> }
> >>>>>>>>>>> Regarding the configuration, the user should provide
> >>>>>>>>>>> "external-resource.{resourceName}.driver-factory.class"
> >>>> instead.
> >>>>>>>>>>>
> >>>>>>>>>>> - Not sure about the necessity of
> >>>>>> `ExternalResourceDriver#close`. I
> >>>>>>>>> would
> >>>>>>>>>>> suggest to avoid introduce more interfaces if not
> >>> absolutely
> >>>>>>>> necessary.
> >>>>>>>>>>>
> >>>>>>>>>>> I add `ExternalResourceDriver#close` in case user needs to
> >>>> clean
> >>>>>> up
> >>>>>>>>>>> internal states and any other resources. It's true that it
> >>>> may
> >>>>>> not
> >>>>>>>>>>> absolutely necessary for our GPUDriver. From my side, I'm
> >>> ok
> >>>> to
> >>>>>>>> remove
> >>>>>>>>>>> it.
> >>>>>>>>>>>
> >>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should not
> >>>> take
> >>>>>>>>>>> `ResourceProfile` as argument. This exposes more
> >>> information
> >>>>>> than it
> >>>>>>>>>> needs.
> >>>>>>>>>>> In addition, it requires the runtime/core to understand how
> >>>> to
> >>>>>>>> properly
> >>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
> >>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`, which
> >>>> is an
> >>>>>>>>>> abstract
> >>>>>>>>>>> class. Runtime/core has to known which implementation of
> >>>>>> `Resource`
> >>>>>>>> to
> >>>>>>>>>> use.
> >>>>>>>>>>>
> >>>>>>>>>>> True, at the moment, I think the amount of the resource is
> >>>>>> enough for
> >>>>>>>>>>> the `ExternalResourceDriver#retrieveResourceInfo`. In the
> >>>>>> future, if
> >>>>>>>>>>> the fine-grained external resource management is supported,
> >>>> the
> >>>>>>>> amount
> >>>>>>>>>>> of the resource seems to be enough either. If we want to
> >>>> leverage
> >>>>>>>> some
> >>>>>>>>>>> external resources which could not be measured by a single
> >>>> long
> >>>>>>>> value,
> >>>>>>>>>>> we might enrich this. But I'd like to keep it out of the
> >>>> scope of
> >>>>>>>> this
> >>>>>>>>>>> FLIP.
> >>>>>>>>>>>
> >>>>>>>>>>> - Do we really need `ExternalResourceInfo#getInformation`?
> >>> I
> >>>>>> think it
> >>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
> >>> empty
> >>>>>>>>> interface.
> >>>>>>>>>>> User can define their own `ExternalResourceInfo`
> >>>> implementation
> >>>>>> and
> >>>>>>>> how
> >>>>>>>>>> it
> >>>>>>>>>>> is used by the operator user codes.
> >>>>>>>>>>>
> >>>>>>>>>>> Sounds good.
> >>>>>>>>>>>
> >>>>>>>>>>> Best,
> >>>>>>>>>>> Yangze Guo
> >>>>>>>>>>>
> >>>>>>>>>>> On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> >>>>>> tonysong820@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>> Sorry to pull this back. I have some concerns about the
> >>>> recent
> >>>>>>>>> updated
> >>>>>>>>>>>> interface details.
> >>>>>>>>>>>>
> >>>>>>>>>>>> - Should we have a factory interface for
> >>>>>> `ExternalResourceDriver`,
> >>>>>>>>> that
> >>>>>>>>>>>> takes the configuration and returns a driver instance?
> >>>>>> Otherwise,
> >>>>>>>> if
> >>>>>>>>>> we are
> >>>>>>>>>>>> creating the driver instance with reflection, we kind of
> >>>>>> implicitly
> >>>>>>>>>>>> requires the driver to have a public non-argument
> >>>> constructor.
> >>>>>> If
> >>>>>>>> we
> >>>>>>>>>>>> decided to go with this approach, then we will not need
> >>>>>>>>>>>> `ExternalResourceDriver#open`.
> >>>>>>>>>>>> - Not sure about the necessity of
> >>>>>> `ExternalResourceDriver#close`. I
> >>>>>>>>>> would
> >>>>>>>>>>>> suggest to avoid introduce more interfaces if not
> >>>> absolutely
> >>>>>>>>> necessary.
> >>>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should
> >>> not
> >>>> take
> >>>>>>>>>>>> `ResourceProfile` as argument. This exposes more
> >>>> information
> >>>>>> than
> >>>>>>>> it
> >>>>>>>>>> needs.
> >>>>>>>>>>>> In addition, it requires the runtime/core to understand
> >>>> how to
> >>>>>>>>> properly
> >>>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
> >>>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`,
> >>>> which is
> >>>>>> an
> >>>>>>>>>> abstract
> >>>>>>>>>>>> class. Runtime/core has to known which implementation of
> >>>>>> `Resource`
> >>>>>>>>> to
> >>>>>>>>>> use.
> >>>>>>>>>>>> - Do we really need
> >>> `ExternalResourceInfo#getInformation`?
> >>>> I
> >>>>>> think
> >>>>>>>> it
> >>>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
> >>>> empty
> >>>>>>>>>> interface.
> >>>>>>>>>>>> User can define their own `ExternalResourceInfo`
> >>>>>> implementation and
> >>>>>>>>>> how it
> >>>>>>>>>>>> is used by the operator user codes.
> >>>>>>>>>>>>
> >>>>>>>>>>>> Thank you~
> >>>>>>>>>>>>
> >>>>>>>>>>>> Xintong Song
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> >>>>>> becket.qin@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> +1
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for driving this effort, Ynagze. The latest FLIP
> >>>> wiki
> >>>>>>>> looks
> >>>>>>>>>> good to
> >>>>>>>>>>>>> me.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Jiangjie (Becket) Qin
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> >>>>>> karmagyz@gmail.com>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> Edit: RuntimeContext interface
> >>>>>>>>>>>>>> From: Map<String, Set<ExternalResourceInfo>>
> >>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
> >>>>>>>>>>>>>> To: Map<String, Set<ExternalResourceInfo>>
> >>>>>>>>>> getExternalResourceInfo();
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>> Yangze Guo
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> >>>>>> karmagyz@gmail.com
> >>>>>>>>>
> >>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hi, there
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I have updated the FLIP, mainly target to make it
> >>>> more
> >>>>>>>> detailed
> >>>>>>>>>> and
> >>>>>>>>>>>>>>> clear. The general design is not changed, but there
> >>>> are
> >>>>>> still
> >>>>>>>>>> some
> >>>>>>>>>>>>>>> changes need to be notified here:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Change the `ExternalResourceDriver` from abstract
> >>>>>> class to
> >>>>>>>>>>>>>>> interface, since it does not have an abstract
> >>>>>> implementation.
> >>>>>>>>>> Add life
> >>>>>>>>>>>>>>> cycle method `open` and `close`.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Specify the method added to the RuntimeContext
> >>> from
> >>>>>> which
> >>>>>>>>> user
> >>>>>>>>>> get
> >>>>>>>>>>>>>>> the information of external resources.
> >>>>>>>>>>>>>>>          Map<String, Set<ExternalResourceInfo>>
> >>>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Add "String getInformation()" method to
> >>>>>>>>> `ExternalResourceInfo`
> >>>>>>>>>>>>>> interface.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> - Treat adding external resource info to
> >>>> RestAPI/WebUI
> >>>>>> as a
> >>>>>>>>>> future
> >>>>>>>>>>>>> work.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> If you have any new concerns after that change,
> >>>> please
> >>>>>>>>> mentioned
> >>>>>>>>>> here.
> >>>>>>>>>>>>>>> Sorry for disturbing you.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>> Yangze Guo
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> >>>>>>>>> danrtsey.wy@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Thanks Yangze for the efforts to support GPU
> >>>> extended
> >>>>>>>>>> resources.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> +1 for this FLIP
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>> Yang
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Till Rohrmann <tr...@apache.org>
> >>> 于2020年4月2日周四
> >>>>>>>> 下午11:10写道:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Thanks for driving this effort Yangze.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> +1
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Cheers,
> >>>>>>>>>>>>>>>>> Till
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> >>>>>>>>>>>>> felixzhengcb@gmail.com
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Thanks Yangze for driving the initial CPU
> >>>> support!
> >>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> Xintong Song <to...@gmail.com>
> >>>> 于2020年4月1日周三
> >>>>>>>>>> 下午6:36写道:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thanks Yangze, the FLIP looks good to me.
> >>>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Thank you~
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Xintong Song
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> >>>>>>>>>> karmagyz@gmail.com>
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Hi everyone,
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I'd like to start the vote of FLIP-108
> >>> [1],
> >>>>>> which
> >>>>>>>>> adds
> >>>>>>>>>> GPU
> >>>>>>>>>>>>>> support in
> >>>>>>>>>>>>>>>>>>>> Flink. This FLIP is discussed in the
> >>>> thread[2].
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> The vote will be open for at least 72
> >>>> hours.
> >>>>>> Unless
> >>>>>>>>>> there is
> >>>>>>>>>>>>> an
> >>>>>>>>>>>>>>>>>>> objection,
> >>>>>>>>>>>>>>>>>>>> I will try to close it by April 4, 2020
> >>>> 10:00
> >>>>>> UTC
> >>>>>>>> if
> >>>>>>>>>> we have
> >>>>>>>>>>>>>> received
> >>>>>>>>>>>>>>>>>>>> sufficient votes.
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> [1]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> >>>>>>>>>>>>>>>>>>>> [2]
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> Best,
> >>>>>>>>>>>>>>>>>>>> Yangze Guo
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>
> >>>
>
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Aljoscha Krettek <al...@apache.org>.
Is the only really new method on the public APIs 
getExternalResourceInfos(..) on the RuntimeContext? I'm generally quite 
skeptical about adding anything to that interface but the method seems ok.

Side note for the configuration keys: the pattern is similar to metrics 
configuration. There we have "metrics.reporters" = <list of reporter 
names> and then metrics.reporter.<foobazzle>... Your proposal is 
slightly different in that it uses "external-resource.list". Keeping 
this in line with metrics configuration would suggest to use 
"external-resources", and then "external-resource.<foobazzle>...". What 
do you think?

Also, why is there this long discussion in a [VOTE] thread?

Best,
Aljoscha

On 15.04.20 10:32, Yangze Guo wrote:
> Thanks for the explanation. I do not have a strong opinion regarding
> this interface. So, if it is better from your perspective, I'm +1 for
> this. I just saying it may not help a lot regarding the type-safe.
> 
> Regarding the bounded wildcard type, yes, it's the implementation
> detail. If it won't make a difference for user, I'm also +1 for not
> using bounded wildcard type there.
> 
> Best,
> Yangze Guo
> 
> On Wed, Apr 15, 2020 at 4:23 PM Till Rohrmann <tr...@apache.org> wrote:
>>
>> I think <T extends ExternalResourceInfo> Set<T>
>> getExternalResourceInfos(String resourceName, Class<T>
>> externalResourceType) is not less flexible than the other API since you can
>> always pass in ExternalResourceInfo.class as the second argument.
>>
>> The benefit I see for the user is that he does not have to do the
>> instanceof checks and type casts himself. This is admittedly not a big deal
>> but still a better API imo.
>>
>> I think the interface of the Driver and what is returned by the
>> RuntimeContext don't have to have the same type because you can cast it or
>> repack it. If the current implementation simply stores what the Driver
>> returns and RuntimeContext returns this map, then it might seem that there
>> is a connection. But this should be an implementation detail rather than a
>> necessity.
>>
>> Maybe we could also pull in someone from the SDK team to give us his
>> opinion on the user facing API.
>>
>> Cheers,
>> Till
>>
>> On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <to...@gmail.com> wrote:
>>
>>>>
>>>> I agree that such an interface won't give compile time checks but I think
>>>> that it could be easier to use from a user's perspective because there is
>>>> no explicit casting required.
>>>> public interface RuntimeContext {
>>>>      <T extends ExternalResourceInfo> Set<T>
>>> getExternalResourceInfos(String
>>>> resourceName, Class<T> externalResourceType);
>>>> }
>>>
>>>
>>> I'm not sure how less efforts is required from users to pass in a
>>> `externalResourceType` compared to do an explicit type casting.
>>> A potential side effect of passing in a `externalResourceType` is that, it
>>> requires user (e.g. the operator) to know which specific type should be
>>> returned in advance, which may limit the flexibility.
>>>
>>> E.g., we might have an operator that can work with multiple different
>>> implementations of `ExternalResourceInfo`. It may decide its behavior based
>>> on the actually type returned by `getExternalResourceInfos` at runtime.
>>>
>>>
>>> Thank you~
>>>
>>> Xintong Song
>>>
>>>
>>>
>>> On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:
>>>
>>>> @Till
>>>> If we add "Class<T> externalResourceType" param, what if there are
>>>> multiple subtypes in the ExternalResourceInfos set of one external
>>>> resource? It seems user has to set the T to ExternalResourceInfo and
>>>> the mechanism is useless at this case.
>>>>
>>>> Best,
>>>> Yangze Guo
>>>>
>>>> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
>>>> wrote:
>>>>>
>>>>> Ok, if there can be multiple resources of the same type then we
>>>> definitely
>>>>> need the name as a differentiator.
>>>>>
>>>>> I agree that such an interface won't give compile time checks but I
>>> think
>>>>> that it could be easier to use from a user's perspective because there
>>> is
>>>>> no explicit casting required.
>>>>>
>>>>> public interface RuntimeContext {
>>>>>      <T extends ExternalResourceInfo> Set<T>
>>>> getExternalResourceInfos(String
>>>>> resourceName, Class<T> externalResourceType);
>>>>> }
>>>>>
>>>>> One minor note: I think the value of the returned map does not need to
>>>> use
>>>>> a bounded wildcard type because for the user it won't make a
>>> difference.
>>>>>
>>>>> Cheers,
>>>>> Till
>>>>>
>>>>> On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
>>>>>
>>>>>> Hi Till,
>>>>>>
>>>>>>> ExternalResourceDriver could return a Set<? extends
>>>>>> ExternalResourceInfo>.
>>>>>> It sounds good.
>>>>>>
>>>>>>> then one could make the interface type-safe by changing it to
>>>>>>> public interface RuntimeContext {
>>>>>>>     <T extends ExternalResourceInfo> Set<T>
>>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
>>>>>>> }
>>>>>> I think it may not help.
>>>>>> - I think the assumption of "there is always only one resource of a
>>>>>> specific type" is too strong. The external resource framework should
>>>>>> only assume it gets a set of ExternalResourceInfo from the driver.
>>> The
>>>>>> concrete implementation is given by user. So, if we give such an
>>>>>> assumption, it would hurt the flexibility. There could be multiple
>>>>>> types in the returned externalResourceInfo set. There could also be
>>>>>> different types returned from different driver implementation or
>>>>>> version. The contract about the return type between Driver and
>>>>>> Operator should be guaranteed by user.
>>>>>> - Since the Drivers are loaded dynamically in runtime, if there is a
>>>>>> type mismatch, the job would fail in runtime instead of in compile
>>>>>> time, no matter the type extraction is done by Operator or Flink
>>> core.
>>>>>> This interface would not gain benefits for type safety.
>>>>>>
>>>>>> Best,
>>>>>> Yangze Guo
>>>>>>
>>>>>> On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
>>>>>> wrote:
>>>>>>>
>>>>>>> Thanks for updating the FLIP, Yangze.
>>>>>>>
>>>>>>> If ExternalResourceInfo is a marker interface, then
>>>>>> ExternalResourceDriver
>>>>>>> could return a Set<? extends ExternalResourceInfo>. This makes is a
>>>> bit
>>>>>>> nicer for the implementor because he can use the concrete subtype.
>>>>>>>
>>>>>>> If we assume that users will always cast the ExternalResourceInfo
>>>>>> instance
>>>>>>> into the concrete subtype and if we assume that there is always
>>> only
>>>> one
>>>>>>> resource of a specific type, then one could make the interface
>>>> type-safe
>>>>>> by
>>>>>>> changing it to
>>>>>>>
>>>>>>> public interface RuntimeContext {
>>>>>>>      <T extends ExternalResourceInfo> Set<T>
>>>>>>> getExternalResourceInfo(Class<T> externalResourceType);
>>>>>>> }
>>>>>>>
>>>>>>> If we want to support multiple GPU resources, then one would need
>>> to
>>>> use
>>>>>>> the name of the respective resource as well.
>>>>>>>
>>>>>>> Cheers,
>>>>>>> Till
>>>>>>>
>>>>>>> On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
>>> tonysong820@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>>> Thanks for updating the FLIP, Yangze.
>>>>>>>> The latest FLIP looks good to me.
>>>>>>>>
>>>>>>>> nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
>>>> out of
>>>>>>>> sync.
>>>>>>>>
>>>>>>>>> Retrieve the information of the external resources according to
>>>> the
>>>>>>>>> resourceProfile.
>>>>>>>>
>>>>>>>>
>>>>>>>> Thank you~
>>>>>>>>
>>>>>>>> Xintong Song
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
>>> becket.qin@gmail.com>
>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Good feedback form Xintong. The latest FLIP looks good to me.
>>>>>>>>>
>>>>>>>>> Thanks,
>>>>>>>>>
>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>
>>>>>>>>> On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <karmagyz@gmail.com
>>>>
>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi there,
>>>>>>>>>> I've updated the FLIP accordingly. Please take a look. If you
>>>> have
>>>>>> any
>>>>>>>>>> further concerns please let me know.
>>>>>>>>>>
>>>>>>>>>> Best,
>>>>>>>>>> Yangze Guo
>>>>>>>>>>
>>>>>>>>>> On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
>>> karmagyz@gmail.com
>>>>>
>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> Thanks for the feedback, Xintong.
>>>>>>>>>>>
>>>>>>>>>>> - Should we have a factory interface for
>>>>>> `ExternalResourceDriver`,
>>>>>>>> that
>>>>>>>>>>> takes the configuration and returns a driver instance?
>>>>>> Otherwise, if
>>>>>>>> we
>>>>>>>>>> are
>>>>>>>>>>> creating the driver instance with reflection, we kind of
>>>>>> implicitly
>>>>>>>>>>> requires the driver to have a public non-argument
>>>> constructor.
>>>>>> If we
>>>>>>>>>>> decided to go with this approach, then we will not need
>>>>>>>>>>> `ExternalResourceDriver#open`.
>>>>>>>>>>>
>>>>>>>>>>> True, we could have an `ExternalResourceDriverFactory`,
>>> like
>>>>>>>>>>> interface ExternalResourceDriverFactory {
>>>>>>>>>>>      ExternalResourceDriver fromConfiguration(Configuration
>>>>>> config);
>>>>>>>>>>> }
>>>>>>>>>>> Regarding the configuration, the user should provide
>>>>>>>>>>> "external-resource.{resourceName}.driver-factory.class"
>>>> instead.
>>>>>>>>>>>
>>>>>>>>>>> - Not sure about the necessity of
>>>>>> `ExternalResourceDriver#close`. I
>>>>>>>>> would
>>>>>>>>>>> suggest to avoid introduce more interfaces if not
>>> absolutely
>>>>>>>> necessary.
>>>>>>>>>>>
>>>>>>>>>>> I add `ExternalResourceDriver#close` in case user needs to
>>>> clean
>>>>>> up
>>>>>>>>>>> internal states and any other resources. It's true that it
>>>> may
>>>>>> not
>>>>>>>>>>> absolutely necessary for our GPUDriver. From my side, I'm
>>> ok
>>>> to
>>>>>>>> remove
>>>>>>>>>>> it.
>>>>>>>>>>>
>>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should not
>>>> take
>>>>>>>>>>> `ResourceProfile` as argument. This exposes more
>>> information
>>>>>> than it
>>>>>>>>>> needs.
>>>>>>>>>>> In addition, it requires the runtime/core to understand how
>>>> to
>>>>>>>> properly
>>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
>>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`, which
>>>> is an
>>>>>>>>>> abstract
>>>>>>>>>>> class. Runtime/core has to known which implementation of
>>>>>> `Resource`
>>>>>>>> to
>>>>>>>>>> use.
>>>>>>>>>>>
>>>>>>>>>>> True, at the moment, I think the amount of the resource is
>>>>>> enough for
>>>>>>>>>>> the `ExternalResourceDriver#retrieveResourceInfo`. In the
>>>>>> future, if
>>>>>>>>>>> the fine-grained external resource management is supported,
>>>> the
>>>>>>>> amount
>>>>>>>>>>> of the resource seems to be enough either. If we want to
>>>> leverage
>>>>>>>> some
>>>>>>>>>>> external resources which could not be measured by a single
>>>> long
>>>>>>>> value,
>>>>>>>>>>> we might enrich this. But I'd like to keep it out of the
>>>> scope of
>>>>>>>> this
>>>>>>>>>>> FLIP.
>>>>>>>>>>>
>>>>>>>>>>> - Do we really need `ExternalResourceInfo#getInformation`?
>>> I
>>>>>> think it
>>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
>>> empty
>>>>>>>>> interface.
>>>>>>>>>>> User can define their own `ExternalResourceInfo`
>>>> implementation
>>>>>> and
>>>>>>>> how
>>>>>>>>>> it
>>>>>>>>>>> is used by the operator user codes.
>>>>>>>>>>>
>>>>>>>>>>> Sounds good.
>>>>>>>>>>>
>>>>>>>>>>> Best,
>>>>>>>>>>> Yangze Guo
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
>>>>>> tonysong820@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> Sorry to pull this back. I have some concerns about the
>>>> recent
>>>>>>>>> updated
>>>>>>>>>>>> interface details.
>>>>>>>>>>>>
>>>>>>>>>>>> - Should we have a factory interface for
>>>>>> `ExternalResourceDriver`,
>>>>>>>>> that
>>>>>>>>>>>> takes the configuration and returns a driver instance?
>>>>>> Otherwise,
>>>>>>>> if
>>>>>>>>>> we are
>>>>>>>>>>>> creating the driver instance with reflection, we kind of
>>>>>> implicitly
>>>>>>>>>>>> requires the driver to have a public non-argument
>>>> constructor.
>>>>>> If
>>>>>>>> we
>>>>>>>>>>>> decided to go with this approach, then we will not need
>>>>>>>>>>>> `ExternalResourceDriver#open`.
>>>>>>>>>>>> - Not sure about the necessity of
>>>>>> `ExternalResourceDriver#close`. I
>>>>>>>>>> would
>>>>>>>>>>>> suggest to avoid introduce more interfaces if not
>>>> absolutely
>>>>>>>>> necessary.
>>>>>>>>>>>> - `ExternalResourceDriver#retrieveResourceInfo` should
>>> not
>>>> take
>>>>>>>>>>>> `ResourceProfile` as argument. This exposes more
>>>> information
>>>>>> than
>>>>>>>> it
>>>>>>>>>> needs.
>>>>>>>>>>>> In addition, it requires the runtime/core to understand
>>>> how to
>>>>>>>>> properly
>>>>>>>>>>>> wrap the external resource into `ResourceProfile`. E.g.,
>>>>>>>>>>>> `ResourceProfile#extendedResources` takes `Resource`,
>>>> which is
>>>>>> an
>>>>>>>>>> abstract
>>>>>>>>>>>> class. Runtime/core has to known which implementation of
>>>>>> `Resource`
>>>>>>>>> to
>>>>>>>>>> use.
>>>>>>>>>>>> - Do we really need
>>> `ExternalResourceInfo#getInformation`?
>>>> I
>>>>>> think
>>>>>>>> it
>>>>>>>>>>>> should be good enough to make `ExternalResourceInfo` an
>>>> empty
>>>>>>>>>> interface.
>>>>>>>>>>>> User can define their own `ExternalResourceInfo`
>>>>>> implementation and
>>>>>>>>>> how it
>>>>>>>>>>>> is used by the operator user codes.
>>>>>>>>>>>>
>>>>>>>>>>>> Thank you~
>>>>>>>>>>>>
>>>>>>>>>>>> Xintong Song
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
>>>>>> becket.qin@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> +1
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for driving this effort, Ynagze. The latest FLIP
>>>> wiki
>>>>>>>> looks
>>>>>>>>>> good to
>>>>>>>>>>>>> me.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Jiangjie (Becket) Qin
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
>>>>>> karmagyz@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Edit: RuntimeContext interface
>>>>>>>>>>>>>> From: Map<String, Set<ExternalResourceInfo>>
>>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
>>>>>>>>>>>>>> To: Map<String, Set<ExternalResourceInfo>>
>>>>>>>>>> getExternalResourceInfo();
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>> Yangze Guo
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
>>>>>> karmagyz@gmail.com
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi, there
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I have updated the FLIP, mainly target to make it
>>>> more
>>>>>>>> detailed
>>>>>>>>>> and
>>>>>>>>>>>>>>> clear. The general design is not changed, but there
>>>> are
>>>>>> still
>>>>>>>>>> some
>>>>>>>>>>>>>>> changes need to be notified here:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Change the `ExternalResourceDriver` from abstract
>>>>>> class to
>>>>>>>>>>>>>>> interface, since it does not have an abstract
>>>>>> implementation.
>>>>>>>>>> Add life
>>>>>>>>>>>>>>> cycle method `open` and `close`.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Specify the method added to the RuntimeContext
>>> from
>>>>>> which
>>>>>>>>> user
>>>>>>>>>> get
>>>>>>>>>>>>>>> the information of external resources.
>>>>>>>>>>>>>>>          Map<String, Set<ExternalResourceInfo>>
>>>>>>>>>>>>>>> getExternalResourceInfo(ResourceSpec resourceSpec);
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Add "String getInformation()" method to
>>>>>>>>> `ExternalResourceInfo`
>>>>>>>>>>>>>> interface.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> - Treat adding external resource info to
>>>> RestAPI/WebUI
>>>>>> as a
>>>>>>>>>> future
>>>>>>>>>>>>> work.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> If you have any new concerns after that change,
>>>> please
>>>>>>>>> mentioned
>>>>>>>>>> here.
>>>>>>>>>>>>>>> Sorry for disturbing you.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>> Yangze Guo
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
>>>>>>>>> danrtsey.wy@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Thanks Yangze for the efforts to support GPU
>>>> extended
>>>>>>>>>> resources.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> +1 for this FLIP
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>> Yang
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Till Rohrmann <tr...@apache.org>
>>> 于2020年4月2日周四
>>>>>>>> 下午11:10写道:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Thanks for driving this effort Yangze.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> +1
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> Cheers,
>>>>>>>>>>>>>>>>> Till
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
>>>>>>>>>>>>> felixzhengcb@gmail.com
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Thanks Yangze for driving the initial CPU
>>>> support!
>>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Xintong Song <to...@gmail.com>
>>>> 于2020年4月1日周三
>>>>>>>>>> 下午6:36写道:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thanks Yangze, the FLIP looks good to me.
>>>>>>>>>>>>>>>>>>> +1 (non-binding) from my side.
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Thank you~
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> Xintong Song
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
>>>>>>>>>> karmagyz@gmail.com>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Hi everyone,
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> I'd like to start the vote of FLIP-108
>>> [1],
>>>>>> which
>>>>>>>>> adds
>>>>>>>>>> GPU
>>>>>>>>>>>>>> support in
>>>>>>>>>>>>>>>>>>>> Flink. This FLIP is discussed in the
>>>> thread[2].
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> The vote will be open for at least 72
>>>> hours.
>>>>>> Unless
>>>>>>>>>> there is
>>>>>>>>>>>>> an
>>>>>>>>>>>>>>>>>>> objection,
>>>>>>>>>>>>>>>>>>>> I will try to close it by April 4, 2020
>>>> 10:00
>>>>>> UTC
>>>>>>>> if
>>>>>>>>>> we have
>>>>>>>>>>>>>> received
>>>>>>>>>>>>>>>>>>>> sufficient votes.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [1]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
>>>>>>>>>>>>>>>>>>>> [2]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Best,
>>>>>>>>>>>>>>>>>>>> Yangze Guo
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>
>>>


Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Xintong Song <to...@gmail.com>.
True, the user can always pass in ExternalResourceInfo.class to retain the
flexibility.
As long as the flexibility is not harmed, I'm ok with both. It's probably
better to do the type checking and exception handling for users.

Thank you~

Xintong Song



On Wed, Apr 15, 2020 at 4:23 PM Till Rohrmann <tr...@apache.org> wrote:

> I think <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfos(String resourceName, Class<T>
> externalResourceType) is not less flexible than the other API since you can
> always pass in ExternalResourceInfo.class as the second argument.
>
> The benefit I see for the user is that he does not have to do the
> instanceof checks and type casts himself. This is admittedly not a big deal
> but still a better API imo.
>
> I think the interface of the Driver and what is returned by the
> RuntimeContext don't have to have the same type because you can cast it or
> repack it. If the current implementation simply stores what the Driver
> returns and RuntimeContext returns this map, then it might seem that there
> is a connection. But this should be an implementation detail rather than a
> necessity.
>
> Maybe we could also pull in someone from the SDK team to give us his
> opinion on the user facing API.
>
> Cheers,
> Till
>
> On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <to...@gmail.com>
> wrote:
>
> > >
> > > I agree that such an interface won't give compile time checks but I
> think
> > > that it could be easier to use from a user's perspective because there
> is
> > > no explicit casting required.
> > > public interface RuntimeContext {
> > >     <T extends ExternalResourceInfo> Set<T>
> > getExternalResourceInfos(String
> > > resourceName, Class<T> externalResourceType);
> > > }
> >
> >
> > I'm not sure how less efforts is required from users to pass in a
> > `externalResourceType` compared to do an explicit type casting.
> > A potential side effect of passing in a `externalResourceType` is that,
> it
> > requires user (e.g. the operator) to know which specific type should be
> > returned in advance, which may limit the flexibility.
> >
> > E.g., we might have an operator that can work with multiple different
> > implementations of `ExternalResourceInfo`. It may decide its behavior
> based
> > on the actually type returned by `getExternalResourceInfos` at runtime.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > @Till
> > > If we add "Class<T> externalResourceType" param, what if there are
> > > multiple subtypes in the ExternalResourceInfos set of one external
> > > resource? It seems user has to set the T to ExternalResourceInfo and
> > > the mechanism is useless at this case.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >
> > > > Ok, if there can be multiple resources of the same type then we
> > > definitely
> > > > need the name as a differentiator.
> > > >
> > > > I agree that such an interface won't give compile time checks but I
> > think
> > > > that it could be easier to use from a user's perspective because
> there
> > is
> > > > no explicit casting required.
> > > >
> > > > public interface RuntimeContext {
> > > >     <T extends ExternalResourceInfo> Set<T>
> > > getExternalResourceInfos(String
> > > > resourceName, Class<T> externalResourceType);
> > > > }
> > > >
> > > > One minor note: I think the value of the returned map does not need
> to
> > > use
> > > > a bounded wildcard type because for the user it won't make a
> > difference.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com>
> wrote:
> > > >
> > > > > Hi Till,
> > > > >
> > > > > > ExternalResourceDriver could return a Set<? extends
> > > > > ExternalResourceInfo>.
> > > > > It sounds good.
> > > > >
> > > > > > then one could make the interface type-safe by changing it to
> > > > > > public interface RuntimeContext {
> > > > > >    <T extends ExternalResourceInfo> Set<T>
> > > > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > > > }
> > > > > I think it may not help.
> > > > > - I think the assumption of "there is always only one resource of a
> > > > > specific type" is too strong. The external resource framework
> should
> > > > > only assume it gets a set of ExternalResourceInfo from the driver.
> > The
> > > > > concrete implementation is given by user. So, if we give such an
> > > > > assumption, it would hurt the flexibility. There could be multiple
> > > > > types in the returned externalResourceInfo set. There could also be
> > > > > different types returned from different driver implementation or
> > > > > version. The contract about the return type between Driver and
> > > > > Operator should be guaranteed by user.
> > > > > - Since the Drivers are loaded dynamically in runtime, if there is
> a
> > > > > type mismatch, the job would fail in runtime instead of in compile
> > > > > time, no matter the type extraction is done by Operator or Flink
> > core.
> > > > > This interface would not gain benefits for type safety.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <
> trohrmann@apache.org>
> > > > > wrote:
> > > > > >
> > > > > > Thanks for updating the FLIP, Yangze.
> > > > > >
> > > > > > If ExternalResourceInfo is a marker interface, then
> > > > > ExternalResourceDriver
> > > > > > could return a Set<? extends ExternalResourceInfo>. This makes
> is a
> > > bit
> > > > > > nicer for the implementor because he can use the concrete
> subtype.
> > > > > >
> > > > > > If we assume that users will always cast the ExternalResourceInfo
> > > > > instance
> > > > > > into the concrete subtype and if we assume that there is always
> > only
> > > one
> > > > > > resource of a specific type, then one could make the interface
> > > type-safe
> > > > > by
> > > > > > changing it to
> > > > > >
> > > > > > public interface RuntimeContext {
> > > > > >     <T extends ExternalResourceInfo> Set<T>
> > > > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > > > }
> > > > > >
> > > > > > If we want to support multiple GPU resources, then one would need
> > to
> > > use
> > > > > > the name of the respective resource as well.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
> > tonysong820@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > > The latest FLIP looks good to me.
> > > > > > >
> > > > > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo`
> is
> > > out of
> > > > > > > sync.
> > > > > > >
> > > > > > > > Retrieve the information of the external resources according
> to
> > > the
> > > > > > > > resourceProfile.
> > > > > > >
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
> > becket.qin@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <
> karmagyz@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi there,
> > > > > > > > > I've updated the FLIP accordingly. Please take a look. If
> you
> > > have
> > > > > any
> > > > > > > > > further concerns please let me know.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
> > karmagyz@gmail.com
> > > >
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Thanks for the feedback, Xintong.
> > > > > > > > > >
> > > > > > > > > > - Should we have a factory interface for
> > > > > `ExternalResourceDriver`,
> > > > > > > that
> > > > > > > > > > takes the configuration and returns a driver instance?
> > > > > Otherwise, if
> > > > > > > we
> > > > > > > > > are
> > > > > > > > > > creating the driver instance with reflection, we kind of
> > > > > implicitly
> > > > > > > > > > requires the driver to have a public non-argument
> > > constructor.
> > > > > If we
> > > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > >
> > > > > > > > > > True, we could have an `ExternalResourceDriverFactory`,
> > like
> > > > > > > > > > interface ExternalResourceDriverFactory {
> > > > > > > > > >     ExternalResourceDriver
> fromConfiguration(Configuration
> > > > > config);
> > > > > > > > > > }
> > > > > > > > > > Regarding the configuration, the user should provide
> > > > > > > > > > "external-resource.{resourceName}.driver-factory.class"
> > > instead.
> > > > > > > > > >
> > > > > > > > > > - Not sure about the necessity of
> > > > > `ExternalResourceDriver#close`. I
> > > > > > > > would
> > > > > > > > > > suggest to avoid introduce more interfaces if not
> > absolutely
> > > > > > > necessary.
> > > > > > > > > >
> > > > > > > > > > I add `ExternalResourceDriver#close` in case user needs
> to
> > > clean
> > > > > up
> > > > > > > > > > internal states and any other resources. It's true that
> it
> > > may
> > > > > not
> > > > > > > > > > absolutely necessary for our GPUDriver. From my side, I'm
> > ok
> > > to
> > > > > > > remove
> > > > > > > > > > it.
> > > > > > > > > >
> > > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should
> not
> > > take
> > > > > > > > > > `ResourceProfile` as argument. This exposes more
> > information
> > > > > than it
> > > > > > > > > needs.
> > > > > > > > > > In addition, it requires the runtime/core to understand
> how
> > > to
> > > > > > > properly
> > > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`,
> which
> > > is an
> > > > > > > > > abstract
> > > > > > > > > > class. Runtime/core has to known which implementation of
> > > > > `Resource`
> > > > > > > to
> > > > > > > > > use.
> > > > > > > > > >
> > > > > > > > > > True, at the moment, I think the amount of the resource
> is
> > > > > enough for
> > > > > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > > > future, if
> > > > > > > > > > the fine-grained external resource management is
> supported,
> > > the
> > > > > > > amount
> > > > > > > > > > of the resource seems to be enough either. If we want to
> > > leverage
> > > > > > > some
> > > > > > > > > > external resources which could not be measured by a
> single
> > > long
> > > > > > > value,
> > > > > > > > > > we might enrich this. But I'd like to keep it out of the
> > > scope of
> > > > > > > this
> > > > > > > > > > FLIP.
> > > > > > > > > >
> > > > > > > > > > - Do we really need
> `ExternalResourceInfo#getInformation`?
> > I
> > > > > think it
> > > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> > empty
> > > > > > > > interface.
> > > > > > > > > > User can define their own `ExternalResourceInfo`
> > > implementation
> > > > > and
> > > > > > > how
> > > > > > > > > it
> > > > > > > > > > is used by the operator user codes.
> > > > > > > > > >
> > > > > > > > > > Sounds good.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > > > tonysong820@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Sorry to pull this back. I have some concerns about the
> > > recent
> > > > > > > > updated
> > > > > > > > > > > interface details.
> > > > > > > > > > >
> > > > > > > > > > > - Should we have a factory interface for
> > > > > `ExternalResourceDriver`,
> > > > > > > > that
> > > > > > > > > > > takes the configuration and returns a driver instance?
> > > > > Otherwise,
> > > > > > > if
> > > > > > > > > we are
> > > > > > > > > > > creating the driver instance with reflection, we kind
> of
> > > > > implicitly
> > > > > > > > > > > requires the driver to have a public non-argument
> > > constructor.
> > > > > If
> > > > > > > we
> > > > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > > > - Not sure about the necessity of
> > > > > `ExternalResourceDriver#close`. I
> > > > > > > > > would
> > > > > > > > > > > suggest to avoid introduce more interfaces if not
> > > absolutely
> > > > > > > > necessary.
> > > > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should
> > not
> > > take
> > > > > > > > > > > `ResourceProfile` as argument. This exposes more
> > > information
> > > > > than
> > > > > > > it
> > > > > > > > > needs.
> > > > > > > > > > > In addition, it requires the runtime/core to understand
> > > how to
> > > > > > > > properly
> > > > > > > > > > > wrap the external resource into `ResourceProfile`.
> E.g.,
> > > > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`,
> > > which is
> > > > > an
> > > > > > > > > abstract
> > > > > > > > > > > class. Runtime/core has to known which implementation
> of
> > > > > `Resource`
> > > > > > > > to
> > > > > > > > > use.
> > > > > > > > > > > - Do we really need
> > `ExternalResourceInfo#getInformation`?
> > > I
> > > > > think
> > > > > > > it
> > > > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> > > empty
> > > > > > > > > interface.
> > > > > > > > > > > User can define their own `ExternalResourceInfo`
> > > > > implementation and
> > > > > > > > > how it
> > > > > > > > > > > is used by the operator user codes.
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > > > becket.qin@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > +1
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for driving this effort, Ynagze. The latest
> FLIP
> > > wiki
> > > > > > > looks
> > > > > > > > > good to
> > > > > > > > > > > > me.
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > > > karmagyz@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > getExternalResourceInfo();
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > > > karmagyz@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, there
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have updated the FLIP, mainly target to make it
> > > more
> > > > > > > detailed
> > > > > > > > > and
> > > > > > > > > > > > > > clear. The general design is not changed, but
> there
> > > are
> > > > > still
> > > > > > > > > some
> > > > > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Change the `ExternalResourceDriver` from
> abstract
> > > > > class to
> > > > > > > > > > > > > > interface, since it does not have an abstract
> > > > > implementation.
> > > > > > > > > Add life
> > > > > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Specify the method added to the RuntimeContext
> > from
> > > > > which
> > > > > > > > user
> > > > > > > > > get
> > > > > > > > > > > > > > the information of external resources.
> > > > > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > > > getExternalResourceInfo(ResourceSpec
> resourceSpec);
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Add "String getInformation()" method to
> > > > > > > > `ExternalResourceInfo`
> > > > > > > > > > > > > interface.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Treat adding external resource info to
> > > RestAPI/WebUI
> > > > > as a
> > > > > > > > > future
> > > > > > > > > > > > work.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If you have any new concerns after that change,
> > > please
> > > > > > > > mentioned
> > > > > > > > > here.
> > > > > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > > > > danrtsey.wy@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks Yangze for the efforts to support GPU
> > > extended
> > > > > > > > > resources.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Yang
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Till Rohrmann <tr...@apache.org>
> > 于2020年4月2日周四
> > > > > > > 下午11:10写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng
> <
> > > > > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU
> > > support!
> > > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song <to...@gmail.com>
> > > 于2020年4月1日周三
> > > > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze
> Guo <
> > > > > > > > > karmagyz@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108
> > [1],
> > > > > which
> > > > > > > > adds
> > > > > > > > > GPU
> > > > > > > > > > > > > support in
> > > > > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the
> > > thread[2].
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The vote will be open for at least 72
> > > hours.
> > > > > Unless
> > > > > > > > > there is
> > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > > > > I will try to close it by April 4, 2020
> > > 10:00
> > > > > UTC
> > > > > > > if
> > > > > > > > > we have
> > > > > > > > > > > > > received
> > > > > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Thanks for the explanation. I do not have a strong opinion regarding
this interface. So, if it is better from your perspective, I'm +1 for
this. I just saying it may not help a lot regarding the type-safe.

Regarding the bounded wildcard type, yes, it's the implementation
detail. If it won't make a difference for user, I'm also +1 for not
using bounded wildcard type there.

Best,
Yangze Guo

On Wed, Apr 15, 2020 at 4:23 PM Till Rohrmann <tr...@apache.org> wrote:
>
> I think <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfos(String resourceName, Class<T>
> externalResourceType) is not less flexible than the other API since you can
> always pass in ExternalResourceInfo.class as the second argument.
>
> The benefit I see for the user is that he does not have to do the
> instanceof checks and type casts himself. This is admittedly not a big deal
> but still a better API imo.
>
> I think the interface of the Driver and what is returned by the
> RuntimeContext don't have to have the same type because you can cast it or
> repack it. If the current implementation simply stores what the Driver
> returns and RuntimeContext returns this map, then it might seem that there
> is a connection. But this should be an implementation detail rather than a
> necessity.
>
> Maybe we could also pull in someone from the SDK team to give us his
> opinion on the user facing API.
>
> Cheers,
> Till
>
> On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <to...@gmail.com> wrote:
>
> > >
> > > I agree that such an interface won't give compile time checks but I think
> > > that it could be easier to use from a user's perspective because there is
> > > no explicit casting required.
> > > public interface RuntimeContext {
> > >     <T extends ExternalResourceInfo> Set<T>
> > getExternalResourceInfos(String
> > > resourceName, Class<T> externalResourceType);
> > > }
> >
> >
> > I'm not sure how less efforts is required from users to pass in a
> > `externalResourceType` compared to do an explicit type casting.
> > A potential side effect of passing in a `externalResourceType` is that, it
> > requires user (e.g. the operator) to know which specific type should be
> > returned in advance, which may limit the flexibility.
> >
> > E.g., we might have an operator that can work with multiple different
> > implementations of `ExternalResourceInfo`. It may decide its behavior based
> > on the actually type returned by `getExternalResourceInfos` at runtime.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > @Till
> > > If we add "Class<T> externalResourceType" param, what if there are
> > > multiple subtypes in the ExternalResourceInfos set of one external
> > > resource? It seems user has to set the T to ExternalResourceInfo and
> > > the mechanism is useless at this case.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >
> > > > Ok, if there can be multiple resources of the same type then we
> > > definitely
> > > > need the name as a differentiator.
> > > >
> > > > I agree that such an interface won't give compile time checks but I
> > think
> > > > that it could be easier to use from a user's perspective because there
> > is
> > > > no explicit casting required.
> > > >
> > > > public interface RuntimeContext {
> > > >     <T extends ExternalResourceInfo> Set<T>
> > > getExternalResourceInfos(String
> > > > resourceName, Class<T> externalResourceType);
> > > > }
> > > >
> > > > One minor note: I think the value of the returned map does not need to
> > > use
> > > > a bounded wildcard type because for the user it won't make a
> > difference.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
> > > >
> > > > > Hi Till,
> > > > >
> > > > > > ExternalResourceDriver could return a Set<? extends
> > > > > ExternalResourceInfo>.
> > > > > It sounds good.
> > > > >
> > > > > > then one could make the interface type-safe by changing it to
> > > > > > public interface RuntimeContext {
> > > > > >    <T extends ExternalResourceInfo> Set<T>
> > > > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > > > }
> > > > > I think it may not help.
> > > > > - I think the assumption of "there is always only one resource of a
> > > > > specific type" is too strong. The external resource framework should
> > > > > only assume it gets a set of ExternalResourceInfo from the driver.
> > The
> > > > > concrete implementation is given by user. So, if we give such an
> > > > > assumption, it would hurt the flexibility. There could be multiple
> > > > > types in the returned externalResourceInfo set. There could also be
> > > > > different types returned from different driver implementation or
> > > > > version. The contract about the return type between Driver and
> > > > > Operator should be guaranteed by user.
> > > > > - Since the Drivers are loaded dynamically in runtime, if there is a
> > > > > type mismatch, the job would fail in runtime instead of in compile
> > > > > time, no matter the type extraction is done by Operator or Flink
> > core.
> > > > > This interface would not gain benefits for type safety.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> > > > > wrote:
> > > > > >
> > > > > > Thanks for updating the FLIP, Yangze.
> > > > > >
> > > > > > If ExternalResourceInfo is a marker interface, then
> > > > > ExternalResourceDriver
> > > > > > could return a Set<? extends ExternalResourceInfo>. This makes is a
> > > bit
> > > > > > nicer for the implementor because he can use the concrete subtype.
> > > > > >
> > > > > > If we assume that users will always cast the ExternalResourceInfo
> > > > > instance
> > > > > > into the concrete subtype and if we assume that there is always
> > only
> > > one
> > > > > > resource of a specific type, then one could make the interface
> > > type-safe
> > > > > by
> > > > > > changing it to
> > > > > >
> > > > > > public interface RuntimeContext {
> > > > > >     <T extends ExternalResourceInfo> Set<T>
> > > > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > > > }
> > > > > >
> > > > > > If we want to support multiple GPU resources, then one would need
> > to
> > > use
> > > > > > the name of the respective resource as well.
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
> > tonysong820@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > > The latest FLIP looks good to me.
> > > > > > >
> > > > > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
> > > out of
> > > > > > > sync.
> > > > > > >
> > > > > > > > Retrieve the information of the external resources according to
> > > the
> > > > > > > > resourceProfile.
> > > > > > >
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
> > becket.qin@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <karmagyz@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi there,
> > > > > > > > > I've updated the FLIP accordingly. Please take a look. If you
> > > have
> > > > > any
> > > > > > > > > further concerns please let me know.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
> > karmagyz@gmail.com
> > > >
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Thanks for the feedback, Xintong.
> > > > > > > > > >
> > > > > > > > > > - Should we have a factory interface for
> > > > > `ExternalResourceDriver`,
> > > > > > > that
> > > > > > > > > > takes the configuration and returns a driver instance?
> > > > > Otherwise, if
> > > > > > > we
> > > > > > > > > are
> > > > > > > > > > creating the driver instance with reflection, we kind of
> > > > > implicitly
> > > > > > > > > > requires the driver to have a public non-argument
> > > constructor.
> > > > > If we
> > > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > >
> > > > > > > > > > True, we could have an `ExternalResourceDriverFactory`,
> > like
> > > > > > > > > > interface ExternalResourceDriverFactory {
> > > > > > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> > > > > config);
> > > > > > > > > > }
> > > > > > > > > > Regarding the configuration, the user should provide
> > > > > > > > > > "external-resource.{resourceName}.driver-factory.class"
> > > instead.
> > > > > > > > > >
> > > > > > > > > > - Not sure about the necessity of
> > > > > `ExternalResourceDriver#close`. I
> > > > > > > > would
> > > > > > > > > > suggest to avoid introduce more interfaces if not
> > absolutely
> > > > > > > necessary.
> > > > > > > > > >
> > > > > > > > > > I add `ExternalResourceDriver#close` in case user needs to
> > > clean
> > > > > up
> > > > > > > > > > internal states and any other resources. It's true that it
> > > may
> > > > > not
> > > > > > > > > > absolutely necessary for our GPUDriver. From my side, I'm
> > ok
> > > to
> > > > > > > remove
> > > > > > > > > > it.
> > > > > > > > > >
> > > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not
> > > take
> > > > > > > > > > `ResourceProfile` as argument. This exposes more
> > information
> > > > > than it
> > > > > > > > > needs.
> > > > > > > > > > In addition, it requires the runtime/core to understand how
> > > to
> > > > > > > properly
> > > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which
> > > is an
> > > > > > > > > abstract
> > > > > > > > > > class. Runtime/core has to known which implementation of
> > > > > `Resource`
> > > > > > > to
> > > > > > > > > use.
> > > > > > > > > >
> > > > > > > > > > True, at the moment, I think the amount of the resource is
> > > > > enough for
> > > > > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > > > future, if
> > > > > > > > > > the fine-grained external resource management is supported,
> > > the
> > > > > > > amount
> > > > > > > > > > of the resource seems to be enough either. If we want to
> > > leverage
> > > > > > > some
> > > > > > > > > > external resources which could not be measured by a single
> > > long
> > > > > > > value,
> > > > > > > > > > we might enrich this. But I'd like to keep it out of the
> > > scope of
> > > > > > > this
> > > > > > > > > > FLIP.
> > > > > > > > > >
> > > > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`?
> > I
> > > > > think it
> > > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> > empty
> > > > > > > > interface.
> > > > > > > > > > User can define their own `ExternalResourceInfo`
> > > implementation
> > > > > and
> > > > > > > how
> > > > > > > > > it
> > > > > > > > > > is used by the operator user codes.
> > > > > > > > > >
> > > > > > > > > > Sounds good.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > > > tonysong820@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Sorry to pull this back. I have some concerns about the
> > > recent
> > > > > > > > updated
> > > > > > > > > > > interface details.
> > > > > > > > > > >
> > > > > > > > > > > - Should we have a factory interface for
> > > > > `ExternalResourceDriver`,
> > > > > > > > that
> > > > > > > > > > > takes the configuration and returns a driver instance?
> > > > > Otherwise,
> > > > > > > if
> > > > > > > > > we are
> > > > > > > > > > > creating the driver instance with reflection, we kind of
> > > > > implicitly
> > > > > > > > > > > requires the driver to have a public non-argument
> > > constructor.
> > > > > If
> > > > > > > we
> > > > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > > > - Not sure about the necessity of
> > > > > `ExternalResourceDriver#close`. I
> > > > > > > > > would
> > > > > > > > > > > suggest to avoid introduce more interfaces if not
> > > absolutely
> > > > > > > > necessary.
> > > > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should
> > not
> > > take
> > > > > > > > > > > `ResourceProfile` as argument. This exposes more
> > > information
> > > > > than
> > > > > > > it
> > > > > > > > > needs.
> > > > > > > > > > > In addition, it requires the runtime/core to understand
> > > how to
> > > > > > > > properly
> > > > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`,
> > > which is
> > > > > an
> > > > > > > > > abstract
> > > > > > > > > > > class. Runtime/core has to known which implementation of
> > > > > `Resource`
> > > > > > > > to
> > > > > > > > > use.
> > > > > > > > > > > - Do we really need
> > `ExternalResourceInfo#getInformation`?
> > > I
> > > > > think
> > > > > > > it
> > > > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> > > empty
> > > > > > > > > interface.
> > > > > > > > > > > User can define their own `ExternalResourceInfo`
> > > > > implementation and
> > > > > > > > > how it
> > > > > > > > > > > is used by the operator user codes.
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > > > becket.qin@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > +1
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP
> > > wiki
> > > > > > > looks
> > > > > > > > > good to
> > > > > > > > > > > > me.
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > > > karmagyz@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > getExternalResourceInfo();
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > > > karmagyz@gmail.com
> > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, there
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have updated the FLIP, mainly target to make it
> > > more
> > > > > > > detailed
> > > > > > > > > and
> > > > > > > > > > > > > > clear. The general design is not changed, but there
> > > are
> > > > > still
> > > > > > > > > some
> > > > > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> > > > > class to
> > > > > > > > > > > > > > interface, since it does not have an abstract
> > > > > implementation.
> > > > > > > > > Add life
> > > > > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Specify the method added to the RuntimeContext
> > from
> > > > > which
> > > > > > > > user
> > > > > > > > > get
> > > > > > > > > > > > > > the information of external resources.
> > > > > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Add "String getInformation()" method to
> > > > > > > > `ExternalResourceInfo`
> > > > > > > > > > > > > interface.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > - Treat adding external resource info to
> > > RestAPI/WebUI
> > > > > as a
> > > > > > > > > future
> > > > > > > > > > > > work.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If you have any new concerns after that change,
> > > please
> > > > > > > > mentioned
> > > > > > > > > here.
> > > > > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > > > > danrtsey.wy@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks Yangze for the efforts to support GPU
> > > extended
> > > > > > > > > resources.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Yang
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Till Rohrmann <tr...@apache.org>
> > 于2020年4月2日周四
> > > > > > > 下午11:10写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU
> > > support!
> > > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song <to...@gmail.com>
> > > 于2020年4月1日周三
> > > > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > > > > > karmagyz@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108
> > [1],
> > > > > which
> > > > > > > > adds
> > > > > > > > > GPU
> > > > > > > > > > > > > support in
> > > > > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the
> > > thread[2].
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > The vote will be open for at least 72
> > > hours.
> > > > > Unless
> > > > > > > > > there is
> > > > > > > > > > > > an
> > > > > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > > > > I will try to close it by April 4, 2020
> > > 10:00
> > > > > UTC
> > > > > > > if
> > > > > > > > > we have
> > > > > > > > > > > > > received
> > > > > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > >
> > >
> >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Till Rohrmann <tr...@apache.org>.
I think <T extends ExternalResourceInfo> Set<T>
getExternalResourceInfos(String resourceName, Class<T>
externalResourceType) is not less flexible than the other API since you can
always pass in ExternalResourceInfo.class as the second argument.

The benefit I see for the user is that he does not have to do the
instanceof checks and type casts himself. This is admittedly not a big deal
but still a better API imo.

I think the interface of the Driver and what is returned by the
RuntimeContext don't have to have the same type because you can cast it or
repack it. If the current implementation simply stores what the Driver
returns and RuntimeContext returns this map, then it might seem that there
is a connection. But this should be an implementation detail rather than a
necessity.

Maybe we could also pull in someone from the SDK team to give us his
opinion on the user facing API.

Cheers,
Till

On Wed, Apr 15, 2020 at 10:13 AM Xintong Song <to...@gmail.com> wrote:

> >
> > I agree that such an interface won't give compile time checks but I think
> > that it could be easier to use from a user's perspective because there is
> > no explicit casting required.
> > public interface RuntimeContext {
> >     <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfos(String
> > resourceName, Class<T> externalResourceType);
> > }
>
>
> I'm not sure how less efforts is required from users to pass in a
> `externalResourceType` compared to do an explicit type casting.
> A potential side effect of passing in a `externalResourceType` is that, it
> requires user (e.g. the operator) to know which specific type should be
> returned in advance, which may limit the flexibility.
>
> E.g., we might have an operator that can work with multiple different
> implementations of `ExternalResourceInfo`. It may decide its behavior based
> on the actually type returned by `getExternalResourceInfos` at runtime.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:
>
> > @Till
> > If we add "Class<T> externalResourceType" param, what if there are
> > multiple subtypes in the ExternalResourceInfos set of one external
> > resource? It seems user has to set the T to ExternalResourceInfo and
> > the mechanism is useless at this case.
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> > wrote:
> > >
> > > Ok, if there can be multiple resources of the same type then we
> > definitely
> > > need the name as a differentiator.
> > >
> > > I agree that such an interface won't give compile time checks but I
> think
> > > that it could be easier to use from a user's perspective because there
> is
> > > no explicit casting required.
> > >
> > > public interface RuntimeContext {
> > >     <T extends ExternalResourceInfo> Set<T>
> > getExternalResourceInfos(String
> > > resourceName, Class<T> externalResourceType);
> > > }
> > >
> > > One minor note: I think the value of the returned map does not need to
> > use
> > > a bounded wildcard type because for the user it won't make a
> difference.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
> > >
> > > > Hi Till,
> > > >
> > > > > ExternalResourceDriver could return a Set<? extends
> > > > ExternalResourceInfo>.
> > > > It sounds good.
> > > >
> > > > > then one could make the interface type-safe by changing it to
> > > > > public interface RuntimeContext {
> > > > >    <T extends ExternalResourceInfo> Set<T>
> > > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > > }
> > > > I think it may not help.
> > > > - I think the assumption of "there is always only one resource of a
> > > > specific type" is too strong. The external resource framework should
> > > > only assume it gets a set of ExternalResourceInfo from the driver.
> The
> > > > concrete implementation is given by user. So, if we give such an
> > > > assumption, it would hurt the flexibility. There could be multiple
> > > > types in the returned externalResourceInfo set. There could also be
> > > > different types returned from different driver implementation or
> > > > version. The contract about the return type between Driver and
> > > > Operator should be guaranteed by user.
> > > > - Since the Drivers are loaded dynamically in runtime, if there is a
> > > > type mismatch, the job would fail in runtime instead of in compile
> > > > time, no matter the type extraction is done by Operator or Flink
> core.
> > > > This interface would not gain benefits for type safety.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> > > > wrote:
> > > > >
> > > > > Thanks for updating the FLIP, Yangze.
> > > > >
> > > > > If ExternalResourceInfo is a marker interface, then
> > > > ExternalResourceDriver
> > > > > could return a Set<? extends ExternalResourceInfo>. This makes is a
> > bit
> > > > > nicer for the implementor because he can use the concrete subtype.
> > > > >
> > > > > If we assume that users will always cast the ExternalResourceInfo
> > > > instance
> > > > > into the concrete subtype and if we assume that there is always
> only
> > one
> > > > > resource of a specific type, then one could make the interface
> > type-safe
> > > > by
> > > > > changing it to
> > > > >
> > > > > public interface RuntimeContext {
> > > > >     <T extends ExternalResourceInfo> Set<T>
> > > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > > }
> > > > >
> > > > > If we want to support multiple GPU resources, then one would need
> to
> > use
> > > > > the name of the respective resource as well.
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <
> tonysong820@gmail.com>
> > > > wrote:
> > > > >
> > > > > > Thanks for updating the FLIP, Yangze.
> > > > > > The latest FLIP looks good to me.
> > > > > >
> > > > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
> > out of
> > > > > > sync.
> > > > > >
> > > > > > > Retrieve the information of the external resources according to
> > the
> > > > > > > resourceProfile.
> > > > > >
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <
> becket.qin@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <karmagyz@gmail.com
> >
> > > > wrote:
> > > > > > >
> > > > > > > > Hi there,
> > > > > > > > I've updated the FLIP accordingly. Please take a look. If you
> > have
> > > > any
> > > > > > > > further concerns please let me know.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <
> karmagyz@gmail.com
> > >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > Thanks for the feedback, Xintong.
> > > > > > > > >
> > > > > > > > > - Should we have a factory interface for
> > > > `ExternalResourceDriver`,
> > > > > > that
> > > > > > > > > takes the configuration and returns a driver instance?
> > > > Otherwise, if
> > > > > > we
> > > > > > > > are
> > > > > > > > > creating the driver instance with reflection, we kind of
> > > > implicitly
> > > > > > > > > requires the driver to have a public non-argument
> > constructor.
> > > > If we
> > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > >
> > > > > > > > > True, we could have an `ExternalResourceDriverFactory`,
> like
> > > > > > > > > interface ExternalResourceDriverFactory {
> > > > > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> > > > config);
> > > > > > > > > }
> > > > > > > > > Regarding the configuration, the user should provide
> > > > > > > > > "external-resource.{resourceName}.driver-factory.class"
> > instead.
> > > > > > > > >
> > > > > > > > > - Not sure about the necessity of
> > > > `ExternalResourceDriver#close`. I
> > > > > > > would
> > > > > > > > > suggest to avoid introduce more interfaces if not
> absolutely
> > > > > > necessary.
> > > > > > > > >
> > > > > > > > > I add `ExternalResourceDriver#close` in case user needs to
> > clean
> > > > up
> > > > > > > > > internal states and any other resources. It's true that it
> > may
> > > > not
> > > > > > > > > absolutely necessary for our GPUDriver. From my side, I'm
> ok
> > to
> > > > > > remove
> > > > > > > > > it.
> > > > > > > > >
> > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not
> > take
> > > > > > > > > `ResourceProfile` as argument. This exposes more
> information
> > > > than it
> > > > > > > > needs.
> > > > > > > > > In addition, it requires the runtime/core to understand how
> > to
> > > > > > properly
> > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which
> > is an
> > > > > > > > abstract
> > > > > > > > > class. Runtime/core has to known which implementation of
> > > > `Resource`
> > > > > > to
> > > > > > > > use.
> > > > > > > > >
> > > > > > > > > True, at the moment, I think the amount of the resource is
> > > > enough for
> > > > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > > future, if
> > > > > > > > > the fine-grained external resource management is supported,
> > the
> > > > > > amount
> > > > > > > > > of the resource seems to be enough either. If we want to
> > leverage
> > > > > > some
> > > > > > > > > external resources which could not be measured by a single
> > long
> > > > > > value,
> > > > > > > > > we might enrich this. But I'd like to keep it out of the
> > scope of
> > > > > > this
> > > > > > > > > FLIP.
> > > > > > > > >
> > > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`?
> I
> > > > think it
> > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> empty
> > > > > > > interface.
> > > > > > > > > User can define their own `ExternalResourceInfo`
> > implementation
> > > > and
> > > > > > how
> > > > > > > > it
> > > > > > > > > is used by the operator user codes.
> > > > > > > > >
> > > > > > > > > Sounds good.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > > tonysong820@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Sorry to pull this back. I have some concerns about the
> > recent
> > > > > > > updated
> > > > > > > > > > interface details.
> > > > > > > > > >
> > > > > > > > > > - Should we have a factory interface for
> > > > `ExternalResourceDriver`,
> > > > > > > that
> > > > > > > > > > takes the configuration and returns a driver instance?
> > > > Otherwise,
> > > > > > if
> > > > > > > > we are
> > > > > > > > > > creating the driver instance with reflection, we kind of
> > > > implicitly
> > > > > > > > > > requires the driver to have a public non-argument
> > constructor.
> > > > If
> > > > > > we
> > > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > > - Not sure about the necessity of
> > > > `ExternalResourceDriver#close`. I
> > > > > > > > would
> > > > > > > > > > suggest to avoid introduce more interfaces if not
> > absolutely
> > > > > > > necessary.
> > > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should
> not
> > take
> > > > > > > > > > `ResourceProfile` as argument. This exposes more
> > information
> > > > than
> > > > > > it
> > > > > > > > needs.
> > > > > > > > > > In addition, it requires the runtime/core to understand
> > how to
> > > > > > > properly
> > > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`,
> > which is
> > > > an
> > > > > > > > abstract
> > > > > > > > > > class. Runtime/core has to known which implementation of
> > > > `Resource`
> > > > > > > to
> > > > > > > > use.
> > > > > > > > > > - Do we really need
> `ExternalResourceInfo#getInformation`?
> > I
> > > > think
> > > > > > it
> > > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> > empty
> > > > > > > > interface.
> > > > > > > > > > User can define their own `ExternalResourceInfo`
> > > > implementation and
> > > > > > > > how it
> > > > > > > > > > is used by the operator user codes.
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > > becket.qin@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > +1
> > > > > > > > > > >
> > > > > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP
> > wiki
> > > > > > looks
> > > > > > > > good to
> > > > > > > > > > > me.
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > >
> > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > > karmagyz@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > getExternalResourceInfo();
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > > karmagyz@gmail.com
> > > > > > >
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Hi, there
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have updated the FLIP, mainly target to make it
> > more
> > > > > > detailed
> > > > > > > > and
> > > > > > > > > > > > > clear. The general design is not changed, but there
> > are
> > > > still
> > > > > > > > some
> > > > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> > > > class to
> > > > > > > > > > > > > interface, since it does not have an abstract
> > > > implementation.
> > > > > > > > Add life
> > > > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Specify the method added to the RuntimeContext
> from
> > > > which
> > > > > > > user
> > > > > > > > get
> > > > > > > > > > > > > the information of external resources.
> > > > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Add "String getInformation()" method to
> > > > > > > `ExternalResourceInfo`
> > > > > > > > > > > > interface.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Treat adding external resource info to
> > RestAPI/WebUI
> > > > as a
> > > > > > > > future
> > > > > > > > > > > work.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If you have any new concerns after that change,
> > please
> > > > > > > mentioned
> > > > > > > > here.
> > > > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > > > danrtsey.wy@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks Yangze for the efforts to support GPU
> > extended
> > > > > > > > resources.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yang
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Till Rohrmann <tr...@apache.org>
> 于2020年4月2日周四
> > > > > > 下午11:10写道:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > +1
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > Till
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU
> > support!
> > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song <to...@gmail.com>
> > 于2020年4月1日周三
> > > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > > > > karmagyz@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108
> [1],
> > > > which
> > > > > > > adds
> > > > > > > > GPU
> > > > > > > > > > > > support in
> > > > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the
> > thread[2].
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The vote will be open for at least 72
> > hours.
> > > > Unless
> > > > > > > > there is
> > > > > > > > > > > an
> > > > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > > > I will try to close it by April 4, 2020
> > 10:00
> > > > UTC
> > > > > > if
> > > > > > > > we have
> > > > > > > > > > > > received
> > > > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Xintong Song <to...@gmail.com>.
>
> I agree that such an interface won't give compile time checks but I think
> that it could be easier to use from a user's perspective because there is
> no explicit casting required.
> public interface RuntimeContext {
>     <T extends ExternalResourceInfo> Set<T> getExternalResourceInfos(String
> resourceName, Class<T> externalResourceType);
> }


I'm not sure how less efforts is required from users to pass in a
`externalResourceType` compared to do an explicit type casting.
A potential side effect of passing in a `externalResourceType` is that, it
requires user (e.g. the operator) to know which specific type should be
returned in advance, which may limit the flexibility.

E.g., we might have an operator that can work with multiple different
implementations of `ExternalResourceInfo`. It may decide its behavior based
on the actually type returned by `getExternalResourceInfos` at runtime.


Thank you~

Xintong Song



On Wed, Apr 15, 2020 at 4:09 PM Yangze Guo <ka...@gmail.com> wrote:

> @Till
> If we add "Class<T> externalResourceType" param, what if there are
> multiple subtypes in the ExternalResourceInfos set of one external
> resource? It seems user has to set the T to ExternalResourceInfo and
> the mechanism is useless at this case.
>
> Best,
> Yangze Guo
>
> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> wrote:
> >
> > Ok, if there can be multiple resources of the same type then we
> definitely
> > need the name as a differentiator.
> >
> > I agree that such an interface won't give compile time checks but I think
> > that it could be easier to use from a user's perspective because there is
> > no explicit casting required.
> >
> > public interface RuntimeContext {
> >     <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfos(String
> > resourceName, Class<T> externalResourceType);
> > }
> >
> > One minor note: I think the value of the returned map does not need to
> use
> > a bounded wildcard type because for the user it won't make a difference.
> >
> > Cheers,
> > Till
> >
> > On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > Hi Till,
> > >
> > > > ExternalResourceDriver could return a Set<? extends
> > > ExternalResourceInfo>.
> > > It sounds good.
> > >
> > > > then one could make the interface type-safe by changing it to
> > > > public interface RuntimeContext {
> > > >    <T extends ExternalResourceInfo> Set<T>
> > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > }
> > > I think it may not help.
> > > - I think the assumption of "there is always only one resource of a
> > > specific type" is too strong. The external resource framework should
> > > only assume it gets a set of ExternalResourceInfo from the driver. The
> > > concrete implementation is given by user. So, if we give such an
> > > assumption, it would hurt the flexibility. There could be multiple
> > > types in the returned externalResourceInfo set. There could also be
> > > different types returned from different driver implementation or
> > > version. The contract about the return type between Driver and
> > > Operator should be guaranteed by user.
> > > - Since the Drivers are loaded dynamically in runtime, if there is a
> > > type mismatch, the job would fail in runtime instead of in compile
> > > time, no matter the type extraction is done by Operator or Flink core.
> > > This interface would not gain benefits for type safety.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >
> > > > Thanks for updating the FLIP, Yangze.
> > > >
> > > > If ExternalResourceInfo is a marker interface, then
> > > ExternalResourceDriver
> > > > could return a Set<? extends ExternalResourceInfo>. This makes is a
> bit
> > > > nicer for the implementor because he can use the concrete subtype.
> > > >
> > > > If we assume that users will always cast the ExternalResourceInfo
> > > instance
> > > > into the concrete subtype and if we assume that there is always only
> one
> > > > resource of a specific type, then one could make the interface
> type-safe
> > > by
> > > > changing it to
> > > >
> > > > public interface RuntimeContext {
> > > >     <T extends ExternalResourceInfo> Set<T>
> > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > }
> > > >
> > > > If we want to support multiple GPU resources, then one would need to
> use
> > > > the name of the respective resource as well.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com>
> > > wrote:
> > > >
> > > > > Thanks for updating the FLIP, Yangze.
> > > > > The latest FLIP looks good to me.
> > > > >
> > > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
> out of
> > > > > sync.
> > > > >
> > > > > > Retrieve the information of the external resources according to
> the
> > > > > > resourceProfile.
> > > > >
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi there,
> > > > > > > I've updated the FLIP accordingly. Please take a look. If you
> have
> > > any
> > > > > > > further concerns please let me know.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <karmagyz@gmail.com
> >
> > > wrote:
> > > > > > > >
> > > > > > > > Thanks for the feedback, Xintong.
> > > > > > > >
> > > > > > > > - Should we have a factory interface for
> > > `ExternalResourceDriver`,
> > > > > that
> > > > > > > > takes the configuration and returns a driver instance?
> > > Otherwise, if
> > > > > we
> > > > > > > are
> > > > > > > > creating the driver instance with reflection, we kind of
> > > implicitly
> > > > > > > > requires the driver to have a public non-argument
> constructor.
> > > If we
> > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > >
> > > > > > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > > > > > interface ExternalResourceDriverFactory {
> > > > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> > > config);
> > > > > > > > }
> > > > > > > > Regarding the configuration, the user should provide
> > > > > > > > "external-resource.{resourceName}.driver-factory.class"
> instead.
> > > > > > > >
> > > > > > > > - Not sure about the necessity of
> > > `ExternalResourceDriver#close`. I
> > > > > > would
> > > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > > necessary.
> > > > > > > >
> > > > > > > > I add `ExternalResourceDriver#close` in case user needs to
> clean
> > > up
> > > > > > > > internal states and any other resources. It's true that it
> may
> > > not
> > > > > > > > absolutely necessary for our GPUDriver. From my side, I'm ok
> to
> > > > > remove
> > > > > > > > it.
> > > > > > > >
> > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not
> take
> > > > > > > > `ResourceProfile` as argument. This exposes more information
> > > than it
> > > > > > > needs.
> > > > > > > > In addition, it requires the runtime/core to understand how
> to
> > > > > properly
> > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which
> is an
> > > > > > > abstract
> > > > > > > > class. Runtime/core has to known which implementation of
> > > `Resource`
> > > > > to
> > > > > > > use.
> > > > > > > >
> > > > > > > > True, at the moment, I think the amount of the resource is
> > > enough for
> > > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > future, if
> > > > > > > > the fine-grained external resource management is supported,
> the
> > > > > amount
> > > > > > > > of the resource seems to be enough either. If we want to
> leverage
> > > > > some
> > > > > > > > external resources which could not be measured by a single
> long
> > > > > value,
> > > > > > > > we might enrich this. But I'd like to keep it out of the
> scope of
> > > > > this
> > > > > > > > FLIP.
> > > > > > > >
> > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> > > think it
> > > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > > interface.
> > > > > > > > User can define their own `ExternalResourceInfo`
> implementation
> > > and
> > > > > how
> > > > > > > it
> > > > > > > > is used by the operator user codes.
> > > > > > > >
> > > > > > > > Sounds good.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > tonysong820@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Sorry to pull this back. I have some concerns about the
> recent
> > > > > > updated
> > > > > > > > > interface details.
> > > > > > > > >
> > > > > > > > > - Should we have a factory interface for
> > > `ExternalResourceDriver`,
> > > > > > that
> > > > > > > > > takes the configuration and returns a driver instance?
> > > Otherwise,
> > > > > if
> > > > > > > we are
> > > > > > > > > creating the driver instance with reflection, we kind of
> > > implicitly
> > > > > > > > > requires the driver to have a public non-argument
> constructor.
> > > If
> > > > > we
> > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > - Not sure about the necessity of
> > > `ExternalResourceDriver#close`. I
> > > > > > > would
> > > > > > > > > suggest to avoid introduce more interfaces if not
> absolutely
> > > > > > necessary.
> > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not
> take
> > > > > > > > > `ResourceProfile` as argument. This exposes more
> information
> > > than
> > > > > it
> > > > > > > needs.
> > > > > > > > > In addition, it requires the runtime/core to understand
> how to
> > > > > > properly
> > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`,
> which is
> > > an
> > > > > > > abstract
> > > > > > > > > class. Runtime/core has to known which implementation of
> > > `Resource`
> > > > > > to
> > > > > > > use.
> > > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`?
> I
> > > think
> > > > > it
> > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> empty
> > > > > > > interface.
> > > > > > > > > User can define their own `ExternalResourceInfo`
> > > implementation and
> > > > > > > how it
> > > > > > > > > is used by the operator user codes.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > becket.qin@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP
> wiki
> > > > > looks
> > > > > > > good to
> > > > > > > > > > me.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > karmagyz@gmail.com>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > > getExternalResourceInfo();
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > karmagyz@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi, there
> > > > > > > > > > > >
> > > > > > > > > > > > I have updated the FLIP, mainly target to make it
> more
> > > > > detailed
> > > > > > > and
> > > > > > > > > > > > clear. The general design is not changed, but there
> are
> > > still
> > > > > > > some
> > > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > > >
> > > > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> > > class to
> > > > > > > > > > > > interface, since it does not have an abstract
> > > implementation.
> > > > > > > Add life
> > > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > > >
> > > > > > > > > > > > - Specify the method added to the RuntimeContext from
> > > which
> > > > > > user
> > > > > > > get
> > > > > > > > > > > > the information of external resources.
> > > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > >
> > > > > > > > > > > > - Add "String getInformation()" method to
> > > > > > `ExternalResourceInfo`
> > > > > > > > > > > interface.
> > > > > > > > > > > >
> > > > > > > > > > > > - Treat adding external resource info to
> RestAPI/WebUI
> > > as a
> > > > > > > future
> > > > > > > > > > work.
> > > > > > > > > > > >
> > > > > > > > > > > > If you have any new concerns after that change,
> please
> > > > > > mentioned
> > > > > > > here.
> > > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > > danrtsey.wy@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks Yangze for the efforts to support GPU
> extended
> > > > > > > resources.
> > > > > > > > > > > > >
> > > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yang
> > > > > > > > > > > > >
> > > > > > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> > > > > 下午11:10写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Till
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU
> support!
> > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song <to...@gmail.com>
> 于2020年4月1日周三
> > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > > > karmagyz@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1],
> > > which
> > > > > > adds
> > > > > > > GPU
> > > > > > > > > > > support in
> > > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the
> thread[2].
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The vote will be open for at least 72
> hours.
> > > Unless
> > > > > > > there is
> > > > > > > > > > an
> > > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > > I will try to close it by April 4, 2020
> 10:00
> > > UTC
> > > > > if
> > > > > > > we have
> > > > > > > > > > > received
> > > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
> One minor note: I think the value of the returned map does not need to use a bounded wildcard type because for the user it won't make a difference.

Since the Driver now use a bounded wildcard type, it seems we could
not hold the return value(Set<? extends ExternalResourceInfo>) with
Set<ExternalResourceInfo>. Am I right?

Best,
Yangze Guo

On Wed, Apr 15, 2020 at 4:08 PM Yangze Guo <ka...@gmail.com> wrote:
>
> @Till
> If we add "Class<T> externalResourceType" param, what if there are
> multiple subtypes in the ExternalResourceInfos set of one external
> resource? It seems user has to set the T to ExternalResourceInfo and
> the mechanism is useless at this case.
>
> Best,
> Yangze Guo
>
> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org> wrote:
> >
> > Ok, if there can be multiple resources of the same type then we definitely
> > need the name as a differentiator.
> >
> > I agree that such an interface won't give compile time checks but I think
> > that it could be easier to use from a user's perspective because there is
> > no explicit casting required.
> >
> > public interface RuntimeContext {
> >     <T extends ExternalResourceInfo> Set<T> getExternalResourceInfos(String
> > resourceName, Class<T> externalResourceType);
> > }
> >
> > One minor note: I think the value of the returned map does not need to use
> > a bounded wildcard type because for the user it won't make a difference.
> >
> > Cheers,
> > Till
> >
> > On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > Hi Till,
> > >
> > > > ExternalResourceDriver could return a Set<? extends
> > > ExternalResourceInfo>.
> > > It sounds good.
> > >
> > > > then one could make the interface type-safe by changing it to
> > > > public interface RuntimeContext {
> > > >    <T extends ExternalResourceInfo> Set<T>
> > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > }
> > > I think it may not help.
> > > - I think the assumption of "there is always only one resource of a
> > > specific type" is too strong. The external resource framework should
> > > only assume it gets a set of ExternalResourceInfo from the driver. The
> > > concrete implementation is given by user. So, if we give such an
> > > assumption, it would hurt the flexibility. There could be multiple
> > > types in the returned externalResourceInfo set. There could also be
> > > different types returned from different driver implementation or
> > > version. The contract about the return type between Driver and
> > > Operator should be guaranteed by user.
> > > - Since the Drivers are loaded dynamically in runtime, if there is a
> > > type mismatch, the job would fail in runtime instead of in compile
> > > time, no matter the type extraction is done by Operator or Flink core.
> > > This interface would not gain benefits for type safety.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >
> > > > Thanks for updating the FLIP, Yangze.
> > > >
> > > > If ExternalResourceInfo is a marker interface, then
> > > ExternalResourceDriver
> > > > could return a Set<? extends ExternalResourceInfo>. This makes is a bit
> > > > nicer for the implementor because he can use the concrete subtype.
> > > >
> > > > If we assume that users will always cast the ExternalResourceInfo
> > > instance
> > > > into the concrete subtype and if we assume that there is always only one
> > > > resource of a specific type, then one could make the interface type-safe
> > > by
> > > > changing it to
> > > >
> > > > public interface RuntimeContext {
> > > >     <T extends ExternalResourceInfo> Set<T>
> > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > }
> > > >
> > > > If we want to support multiple GPU resources, then one would need to use
> > > > the name of the respective resource as well.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com>
> > > wrote:
> > > >
> > > > > Thanks for updating the FLIP, Yangze.
> > > > > The latest FLIP looks good to me.
> > > > >
> > > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is out of
> > > > > sync.
> > > > >
> > > > > > Retrieve the information of the external resources according to the
> > > > > > resourceProfile.
> > > > >
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi there,
> > > > > > > I've updated the FLIP accordingly. Please take a look. If you have
> > > any
> > > > > > > further concerns please let me know.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > > > > > >
> > > > > > > > Thanks for the feedback, Xintong.
> > > > > > > >
> > > > > > > > - Should we have a factory interface for
> > > `ExternalResourceDriver`,
> > > > > that
> > > > > > > > takes the configuration and returns a driver instance?
> > > Otherwise, if
> > > > > we
> > > > > > > are
> > > > > > > > creating the driver instance with reflection, we kind of
> > > implicitly
> > > > > > > > requires the driver to have a public non-argument constructor.
> > > If we
> > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > >
> > > > > > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > > > > > interface ExternalResourceDriverFactory {
> > > > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> > > config);
> > > > > > > > }
> > > > > > > > Regarding the configuration, the user should provide
> > > > > > > > "external-resource.{resourceName}.driver-factory.class" instead.
> > > > > > > >
> > > > > > > > - Not sure about the necessity of
> > > `ExternalResourceDriver#close`. I
> > > > > > would
> > > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > > necessary.
> > > > > > > >
> > > > > > > > I add `ExternalResourceDriver#close` in case user needs to clean
> > > up
> > > > > > > > internal states and any other resources. It's true that it may
> > > not
> > > > > > > > absolutely necessary for our GPUDriver. From my side, I'm ok to
> > > > > remove
> > > > > > > > it.
> > > > > > > >
> > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > > > `ResourceProfile` as argument. This exposes more information
> > > than it
> > > > > > > needs.
> > > > > > > > In addition, it requires the runtime/core to understand how to
> > > > > properly
> > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > > > > > abstract
> > > > > > > > class. Runtime/core has to known which implementation of
> > > `Resource`
> > > > > to
> > > > > > > use.
> > > > > > > >
> > > > > > > > True, at the moment, I think the amount of the resource is
> > > enough for
> > > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > future, if
> > > > > > > > the fine-grained external resource management is supported, the
> > > > > amount
> > > > > > > > of the resource seems to be enough either. If we want to leverage
> > > > > some
> > > > > > > > external resources which could not be measured by a single long
> > > > > value,
> > > > > > > > we might enrich this. But I'd like to keep it out of the scope of
> > > > > this
> > > > > > > > FLIP.
> > > > > > > >
> > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> > > think it
> > > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > > interface.
> > > > > > > > User can define their own `ExternalResourceInfo` implementation
> > > and
> > > > > how
> > > > > > > it
> > > > > > > > is used by the operator user codes.
> > > > > > > >
> > > > > > > > Sounds good.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > tonysong820@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Sorry to pull this back. I have some concerns about the recent
> > > > > > updated
> > > > > > > > > interface details.
> > > > > > > > >
> > > > > > > > > - Should we have a factory interface for
> > > `ExternalResourceDriver`,
> > > > > > that
> > > > > > > > > takes the configuration and returns a driver instance?
> > > Otherwise,
> > > > > if
> > > > > > > we are
> > > > > > > > > creating the driver instance with reflection, we kind of
> > > implicitly
> > > > > > > > > requires the driver to have a public non-argument constructor.
> > > If
> > > > > we
> > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > - Not sure about the necessity of
> > > `ExternalResourceDriver#close`. I
> > > > > > > would
> > > > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > > > necessary.
> > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > > > > `ResourceProfile` as argument. This exposes more information
> > > than
> > > > > it
> > > > > > > needs.
> > > > > > > > > In addition, it requires the runtime/core to understand how to
> > > > > > properly
> > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is
> > > an
> > > > > > > abstract
> > > > > > > > > class. Runtime/core has to known which implementation of
> > > `Resource`
> > > > > > to
> > > > > > > use.
> > > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> > > think
> > > > > it
> > > > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > > > interface.
> > > > > > > > > User can define their own `ExternalResourceInfo`
> > > implementation and
> > > > > > > how it
> > > > > > > > > is used by the operator user codes.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > becket.qin@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki
> > > > > looks
> > > > > > > good to
> > > > > > > > > > me.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > karmagyz@gmail.com>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > > getExternalResourceInfo();
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > karmagyz@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi, there
> > > > > > > > > > > >
> > > > > > > > > > > > I have updated the FLIP, mainly target to make it more
> > > > > detailed
> > > > > > > and
> > > > > > > > > > > > clear. The general design is not changed, but there are
> > > still
> > > > > > > some
> > > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > > >
> > > > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> > > class to
> > > > > > > > > > > > interface, since it does not have an abstract
> > > implementation.
> > > > > > > Add life
> > > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > > >
> > > > > > > > > > > > - Specify the method added to the RuntimeContext from
> > > which
> > > > > > user
> > > > > > > get
> > > > > > > > > > > > the information of external resources.
> > > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > >
> > > > > > > > > > > > - Add "String getInformation()" method to
> > > > > > `ExternalResourceInfo`
> > > > > > > > > > > interface.
> > > > > > > > > > > >
> > > > > > > > > > > > - Treat adding external resource info to RestAPI/WebUI
> > > as a
> > > > > > > future
> > > > > > > > > > work.
> > > > > > > > > > > >
> > > > > > > > > > > > If you have any new concerns after that change, please
> > > > > > mentioned
> > > > > > > here.
> > > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > > danrtsey.wy@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks Yangze for the efforts to support GPU extended
> > > > > > > resources.
> > > > > > > > > > > > >
> > > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yang
> > > > > > > > > > > > >
> > > > > > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> > > > > 下午11:10写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Till
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > > > karmagyz@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1],
> > > which
> > > > > > adds
> > > > > > > GPU
> > > > > > > > > > > support in
> > > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The vote will be open for at least 72 hours.
> > > Unless
> > > > > > > there is
> > > > > > > > > > an
> > > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > > I will try to close it by April 4, 2020 10:00
> > > UTC
> > > > > if
> > > > > > > we have
> > > > > > > > > > > received
> > > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Till Rohrmann <tr...@apache.org>.
Is this something we expect to happen? If there is an external
resource implementation which shows this behaviour I assume that it is
quite hard to use for a user.

In this case, externalResourceType could also be set
to ExternalResourceInfo.class or any common super type. Then the API won't
give you any benefit and you would have to do the type casting yourself, of
course.

Cheers,
Till

On Wed, Apr 15, 2020 at 10:09 AM Yangze Guo <ka...@gmail.com> wrote:

> @Till
> If we add "Class<T> externalResourceType" param, what if there are
> multiple subtypes in the ExternalResourceInfos set of one external
> resource? It seems user has to set the T to ExternalResourceInfo and
> the mechanism is useless at this case.
>
> Best,
> Yangze Guo
>
> On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org>
> wrote:
> >
> > Ok, if there can be multiple resources of the same type then we
> definitely
> > need the name as a differentiator.
> >
> > I agree that such an interface won't give compile time checks but I think
> > that it could be easier to use from a user's perspective because there is
> > no explicit casting required.
> >
> > public interface RuntimeContext {
> >     <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfos(String
> > resourceName, Class<T> externalResourceType);
> > }
> >
> > One minor note: I think the value of the returned map does not need to
> use
> > a bounded wildcard type because for the user it won't make a difference.
> >
> > Cheers,
> > Till
> >
> > On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > Hi Till,
> > >
> > > > ExternalResourceDriver could return a Set<? extends
> > > ExternalResourceInfo>.
> > > It sounds good.
> > >
> > > > then one could make the interface type-safe by changing it to
> > > > public interface RuntimeContext {
> > > >    <T extends ExternalResourceInfo> Set<T>
> > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > }
> > > I think it may not help.
> > > - I think the assumption of "there is always only one resource of a
> > > specific type" is too strong. The external resource framework should
> > > only assume it gets a set of ExternalResourceInfo from the driver. The
> > > concrete implementation is given by user. So, if we give such an
> > > assumption, it would hurt the flexibility. There could be multiple
> > > types in the returned externalResourceInfo set. There could also be
> > > different types returned from different driver implementation or
> > > version. The contract about the return type between Driver and
> > > Operator should be guaranteed by user.
> > > - Since the Drivers are loaded dynamically in runtime, if there is a
> > > type mismatch, the job would fail in runtime instead of in compile
> > > time, no matter the type extraction is done by Operator or Flink core.
> > > This interface would not gain benefits for type safety.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> > > wrote:
> > > >
> > > > Thanks for updating the FLIP, Yangze.
> > > >
> > > > If ExternalResourceInfo is a marker interface, then
> > > ExternalResourceDriver
> > > > could return a Set<? extends ExternalResourceInfo>. This makes is a
> bit
> > > > nicer for the implementor because he can use the concrete subtype.
> > > >
> > > > If we assume that users will always cast the ExternalResourceInfo
> > > instance
> > > > into the concrete subtype and if we assume that there is always only
> one
> > > > resource of a specific type, then one could make the interface
> type-safe
> > > by
> > > > changing it to
> > > >
> > > > public interface RuntimeContext {
> > > >     <T extends ExternalResourceInfo> Set<T>
> > > > getExternalResourceInfo(Class<T> externalResourceType);
> > > > }
> > > >
> > > > If we want to support multiple GPU resources, then one would need to
> use
> > > > the name of the respective resource as well.
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com>
> > > wrote:
> > > >
> > > > > Thanks for updating the FLIP, Yangze.
> > > > > The latest FLIP looks good to me.
> > > > >
> > > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is
> out of
> > > > > sync.
> > > > >
> > > > > > Retrieve the information of the external resources according to
> the
> > > > > > resourceProfile.
> > > > >
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com>
> > > wrote:
> > > > >
> > > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Hi there,
> > > > > > > I've updated the FLIP accordingly. Please take a look. If you
> have
> > > any
> > > > > > > further concerns please let me know.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <karmagyz@gmail.com
> >
> > > wrote:
> > > > > > > >
> > > > > > > > Thanks for the feedback, Xintong.
> > > > > > > >
> > > > > > > > - Should we have a factory interface for
> > > `ExternalResourceDriver`,
> > > > > that
> > > > > > > > takes the configuration and returns a driver instance?
> > > Otherwise, if
> > > > > we
> > > > > > > are
> > > > > > > > creating the driver instance with reflection, we kind of
> > > implicitly
> > > > > > > > requires the driver to have a public non-argument
> constructor.
> > > If we
> > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > >
> > > > > > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > > > > > interface ExternalResourceDriverFactory {
> > > > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> > > config);
> > > > > > > > }
> > > > > > > > Regarding the configuration, the user should provide
> > > > > > > > "external-resource.{resourceName}.driver-factory.class"
> instead.
> > > > > > > >
> > > > > > > > - Not sure about the necessity of
> > > `ExternalResourceDriver#close`. I
> > > > > > would
> > > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > > necessary.
> > > > > > > >
> > > > > > > > I add `ExternalResourceDriver#close` in case user needs to
> clean
> > > up
> > > > > > > > internal states and any other resources. It's true that it
> may
> > > not
> > > > > > > > absolutely necessary for our GPUDriver. From my side, I'm ok
> to
> > > > > remove
> > > > > > > > it.
> > > > > > > >
> > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not
> take
> > > > > > > > `ResourceProfile` as argument. This exposes more information
> > > than it
> > > > > > > needs.
> > > > > > > > In addition, it requires the runtime/core to understand how
> to
> > > > > properly
> > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which
> is an
> > > > > > > abstract
> > > > > > > > class. Runtime/core has to known which implementation of
> > > `Resource`
> > > > > to
> > > > > > > use.
> > > > > > > >
> > > > > > > > True, at the moment, I think the amount of the resource is
> > > enough for
> > > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > > future, if
> > > > > > > > the fine-grained external resource management is supported,
> the
> > > > > amount
> > > > > > > > of the resource seems to be enough either. If we want to
> leverage
> > > > > some
> > > > > > > > external resources which could not be measured by a single
> long
> > > > > value,
> > > > > > > > we might enrich this. But I'd like to keep it out of the
> scope of
> > > > > this
> > > > > > > > FLIP.
> > > > > > > >
> > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> > > think it
> > > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > > interface.
> > > > > > > > User can define their own `ExternalResourceInfo`
> implementation
> > > and
> > > > > how
> > > > > > > it
> > > > > > > > is used by the operator user codes.
> > > > > > > >
> > > > > > > > Sounds good.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > > tonysong820@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Sorry to pull this back. I have some concerns about the
> recent
> > > > > > updated
> > > > > > > > > interface details.
> > > > > > > > >
> > > > > > > > > - Should we have a factory interface for
> > > `ExternalResourceDriver`,
> > > > > > that
> > > > > > > > > takes the configuration and returns a driver instance?
> > > Otherwise,
> > > > > if
> > > > > > > we are
> > > > > > > > > creating the driver instance with reflection, we kind of
> > > implicitly
> > > > > > > > > requires the driver to have a public non-argument
> constructor.
> > > If
> > > > > we
> > > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > > - Not sure about the necessity of
> > > `ExternalResourceDriver#close`. I
> > > > > > > would
> > > > > > > > > suggest to avoid introduce more interfaces if not
> absolutely
> > > > > > necessary.
> > > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not
> take
> > > > > > > > > `ResourceProfile` as argument. This exposes more
> information
> > > than
> > > > > it
> > > > > > > needs.
> > > > > > > > > In addition, it requires the runtime/core to understand
> how to
> > > > > > properly
> > > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > > `ResourceProfile#extendedResources` takes `Resource`,
> which is
> > > an
> > > > > > > abstract
> > > > > > > > > class. Runtime/core has to known which implementation of
> > > `Resource`
> > > > > > to
> > > > > > > use.
> > > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`?
> I
> > > think
> > > > > it
> > > > > > > > > should be good enough to make `ExternalResourceInfo` an
> empty
> > > > > > > interface.
> > > > > > > > > User can define their own `ExternalResourceInfo`
> > > implementation and
> > > > > > > how it
> > > > > > > > > is used by the operator user codes.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > > becket.qin@gmail.com>
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP
> wiki
> > > > > looks
> > > > > > > good to
> > > > > > > > > > me.
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > > karmagyz@gmail.com>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > > getExternalResourceInfo();
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > > karmagyz@gmail.com
> > > > > >
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi, there
> > > > > > > > > > > >
> > > > > > > > > > > > I have updated the FLIP, mainly target to make it
> more
> > > > > detailed
> > > > > > > and
> > > > > > > > > > > > clear. The general design is not changed, but there
> are
> > > still
> > > > > > > some
> > > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > > >
> > > > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> > > class to
> > > > > > > > > > > > interface, since it does not have an abstract
> > > implementation.
> > > > > > > Add life
> > > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > > >
> > > > > > > > > > > > - Specify the method added to the RuntimeContext from
> > > which
> > > > > > user
> > > > > > > get
> > > > > > > > > > > > the information of external resources.
> > > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > > >
> > > > > > > > > > > > - Add "String getInformation()" method to
> > > > > > `ExternalResourceInfo`
> > > > > > > > > > > interface.
> > > > > > > > > > > >
> > > > > > > > > > > > - Treat adding external resource info to
> RestAPI/WebUI
> > > as a
> > > > > > > future
> > > > > > > > > > work.
> > > > > > > > > > > >
> > > > > > > > > > > > If you have any new concerns after that change,
> please
> > > > > > mentioned
> > > > > > > here.
> > > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > > danrtsey.wy@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks Yangze for the efforts to support GPU
> extended
> > > > > > > resources.
> > > > > > > > > > > > >
> > > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yang
> > > > > > > > > > > > >
> > > > > > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> > > > > 下午11:10写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > +1
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > Till
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU
> support!
> > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song <to...@gmail.com>
> 于2020年4月1日周三
> > > > > > > 下午6:36写道:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > > > karmagyz@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1],
> > > which
> > > > > > adds
> > > > > > > GPU
> > > > > > > > > > > support in
> > > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the
> thread[2].
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The vote will be open for at least 72
> hours.
> > > Unless
> > > > > > > there is
> > > > > > > > > > an
> > > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > > I will try to close it by April 4, 2020
> 10:00
> > > UTC
> > > > > if
> > > > > > > we have
> > > > > > > > > > > received
> > > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
@Till
If we add "Class<T> externalResourceType" param, what if there are
multiple subtypes in the ExternalResourceInfos set of one external
resource? It seems user has to set the T to ExternalResourceInfo and
the mechanism is useless at this case.

Best,
Yangze Guo

On Wed, Apr 15, 2020 at 3:57 PM Till Rohrmann <tr...@apache.org> wrote:
>
> Ok, if there can be multiple resources of the same type then we definitely
> need the name as a differentiator.
>
> I agree that such an interface won't give compile time checks but I think
> that it could be easier to use from a user's perspective because there is
> no explicit casting required.
>
> public interface RuntimeContext {
>     <T extends ExternalResourceInfo> Set<T> getExternalResourceInfos(String
> resourceName, Class<T> externalResourceType);
> }
>
> One minor note: I think the value of the returned map does not need to use
> a bounded wildcard type because for the user it won't make a difference.
>
> Cheers,
> Till
>
> On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:
>
> > Hi Till,
> >
> > > ExternalResourceDriver could return a Set<? extends
> > ExternalResourceInfo>.
> > It sounds good.
> >
> > > then one could make the interface type-safe by changing it to
> > > public interface RuntimeContext {
> > >    <T extends ExternalResourceInfo> Set<T>
> > > getExternalResourceInfo(Class<T> externalResourceType);
> > > }
> > I think it may not help.
> > - I think the assumption of "there is always only one resource of a
> > specific type" is too strong. The external resource framework should
> > only assume it gets a set of ExternalResourceInfo from the driver. The
> > concrete implementation is given by user. So, if we give such an
> > assumption, it would hurt the flexibility. There could be multiple
> > types in the returned externalResourceInfo set. There could also be
> > different types returned from different driver implementation or
> > version. The contract about the return type between Driver and
> > Operator should be guaranteed by user.
> > - Since the Drivers are loaded dynamically in runtime, if there is a
> > type mismatch, the job would fail in runtime instead of in compile
> > time, no matter the type extraction is done by Operator or Flink core.
> > This interface would not gain benefits for type safety.
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> > wrote:
> > >
> > > Thanks for updating the FLIP, Yangze.
> > >
> > > If ExternalResourceInfo is a marker interface, then
> > ExternalResourceDriver
> > > could return a Set<? extends ExternalResourceInfo>. This makes is a bit
> > > nicer for the implementor because he can use the concrete subtype.
> > >
> > > If we assume that users will always cast the ExternalResourceInfo
> > instance
> > > into the concrete subtype and if we assume that there is always only one
> > > resource of a specific type, then one could make the interface type-safe
> > by
> > > changing it to
> > >
> > > public interface RuntimeContext {
> > >     <T extends ExternalResourceInfo> Set<T>
> > > getExternalResourceInfo(Class<T> externalResourceType);
> > > }
> > >
> > > If we want to support multiple GPU resources, then one would need to use
> > > the name of the respective resource as well.
> > >
> > > Cheers,
> > > Till
> > >
> > > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com>
> > wrote:
> > >
> > > > Thanks for updating the FLIP, Yangze.
> > > > The latest FLIP looks good to me.
> > > >
> > > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is out of
> > > > sync.
> > > >
> > > > > Retrieve the information of the external resources according to the
> > > > > resourceProfile.
> > > >
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com>
> > wrote:
> > > >
> > > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com>
> > wrote:
> > > > >
> > > > > > Hi there,
> > > > > > I've updated the FLIP accordingly. Please take a look. If you have
> > any
> > > > > > further concerns please let me know.
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com>
> > wrote:
> > > > > > >
> > > > > > > Thanks for the feedback, Xintong.
> > > > > > >
> > > > > > > - Should we have a factory interface for
> > `ExternalResourceDriver`,
> > > > that
> > > > > > > takes the configuration and returns a driver instance?
> > Otherwise, if
> > > > we
> > > > > > are
> > > > > > > creating the driver instance with reflection, we kind of
> > implicitly
> > > > > > > requires the driver to have a public non-argument constructor.
> > If we
> > > > > > > decided to go with this approach, then we will not need
> > > > > > > `ExternalResourceDriver#open`.
> > > > > > >
> > > > > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > > > > interface ExternalResourceDriverFactory {
> > > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> > config);
> > > > > > > }
> > > > > > > Regarding the configuration, the user should provide
> > > > > > > "external-resource.{resourceName}.driver-factory.class" instead.
> > > > > > >
> > > > > > > - Not sure about the necessity of
> > `ExternalResourceDriver#close`. I
> > > > > would
> > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > necessary.
> > > > > > >
> > > > > > > I add `ExternalResourceDriver#close` in case user needs to clean
> > up
> > > > > > > internal states and any other resources. It's true that it may
> > not
> > > > > > > absolutely necessary for our GPUDriver. From my side, I'm ok to
> > > > remove
> > > > > > > it.
> > > > > > >
> > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > > `ResourceProfile` as argument. This exposes more information
> > than it
> > > > > > needs.
> > > > > > > In addition, it requires the runtime/core to understand how to
> > > > properly
> > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > > > > abstract
> > > > > > > class. Runtime/core has to known which implementation of
> > `Resource`
> > > > to
> > > > > > use.
> > > > > > >
> > > > > > > True, at the moment, I think the amount of the resource is
> > enough for
> > > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> > future, if
> > > > > > > the fine-grained external resource management is supported, the
> > > > amount
> > > > > > > of the resource seems to be enough either. If we want to leverage
> > > > some
> > > > > > > external resources which could not be measured by a single long
> > > > value,
> > > > > > > we might enrich this. But I'd like to keep it out of the scope of
> > > > this
> > > > > > > FLIP.
> > > > > > >
> > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> > think it
> > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > interface.
> > > > > > > User can define their own `ExternalResourceInfo` implementation
> > and
> > > > how
> > > > > > it
> > > > > > > is used by the operator user codes.
> > > > > > >
> > > > > > > Sounds good.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> > tonysong820@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > Sorry to pull this back. I have some concerns about the recent
> > > > > updated
> > > > > > > > interface details.
> > > > > > > >
> > > > > > > > - Should we have a factory interface for
> > `ExternalResourceDriver`,
> > > > > that
> > > > > > > > takes the configuration and returns a driver instance?
> > Otherwise,
> > > > if
> > > > > > we are
> > > > > > > > creating the driver instance with reflection, we kind of
> > implicitly
> > > > > > > > requires the driver to have a public non-argument constructor.
> > If
> > > > we
> > > > > > > > decided to go with this approach, then we will not need
> > > > > > > > `ExternalResourceDriver#open`.
> > > > > > > > - Not sure about the necessity of
> > `ExternalResourceDriver#close`. I
> > > > > > would
> > > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > > necessary.
> > > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > > > `ResourceProfile` as argument. This exposes more information
> > than
> > > > it
> > > > > > needs.
> > > > > > > > In addition, it requires the runtime/core to understand how to
> > > > > properly
> > > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is
> > an
> > > > > > abstract
> > > > > > > > class. Runtime/core has to known which implementation of
> > `Resource`
> > > > > to
> > > > > > use.
> > > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> > think
> > > > it
> > > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > > interface.
> > > > > > > > User can define their own `ExternalResourceInfo`
> > implementation and
> > > > > > how it
> > > > > > > > is used by the operator user codes.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> > becket.qin@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki
> > > > looks
> > > > > > good to
> > > > > > > > > me.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > >
> > > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> > karmagyz@gmail.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > > getExternalResourceInfo();
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> > karmagyz@gmail.com
> > > > >
> > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi, there
> > > > > > > > > > >
> > > > > > > > > > > I have updated the FLIP, mainly target to make it more
> > > > detailed
> > > > > > and
> > > > > > > > > > > clear. The general design is not changed, but there are
> > still
> > > > > > some
> > > > > > > > > > > changes need to be notified here:
> > > > > > > > > > >
> > > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> > class to
> > > > > > > > > > > interface, since it does not have an abstract
> > implementation.
> > > > > > Add life
> > > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > > >
> > > > > > > > > > > - Specify the method added to the RuntimeContext from
> > which
> > > > > user
> > > > > > get
> > > > > > > > > > > the information of external resources.
> > > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > > >
> > > > > > > > > > > - Add "String getInformation()" method to
> > > > > `ExternalResourceInfo`
> > > > > > > > > > interface.
> > > > > > > > > > >
> > > > > > > > > > > - Treat adding external resource info to RestAPI/WebUI
> > as a
> > > > > > future
> > > > > > > > > work.
> > > > > > > > > > >
> > > > > > > > > > > If you have any new concerns after that change, please
> > > > > mentioned
> > > > > > here.
> > > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > > danrtsey.wy@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks Yangze for the efforts to support GPU extended
> > > > > > resources.
> > > > > > > > > > > >
> > > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yang
> > > > > > > > > > > >
> > > > > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> > > > 下午11:10写道:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > > >
> > > > > > > > > > > > > +1
> > > > > > > > > > > > >
> > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > Till
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> > > > > > 下午6:36写道:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > > karmagyz@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1],
> > which
> > > > > adds
> > > > > > GPU
> > > > > > > > > > support in
> > > > > > > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The vote will be open for at least 72 hours.
> > Unless
> > > > > > there is
> > > > > > > > > an
> > > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > > I will try to close it by April 4, 2020 10:00
> > UTC
> > > > if
> > > > > > we have
> > > > > > > > > > received
> > > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > > >
> > > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > > >
> > > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > > >
> > > >
> >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Till Rohrmann <tr...@apache.org>.
Ok, if there can be multiple resources of the same type then we definitely
need the name as a differentiator.

I agree that such an interface won't give compile time checks but I think
that it could be easier to use from a user's perspective because there is
no explicit casting required.

public interface RuntimeContext {
    <T extends ExternalResourceInfo> Set<T> getExternalResourceInfos(String
resourceName, Class<T> externalResourceType);
}

One minor note: I think the value of the returned map does not need to use
a bounded wildcard type because for the user it won't make a difference.

Cheers,
Till

On Wed, Apr 15, 2020 at 8:20 AM Yangze Guo <ka...@gmail.com> wrote:

> Hi Till,
>
> > ExternalResourceDriver could return a Set<? extends
> ExternalResourceInfo>.
> It sounds good.
>
> > then one could make the interface type-safe by changing it to
> > public interface RuntimeContext {
> >    <T extends ExternalResourceInfo> Set<T>
> > getExternalResourceInfo(Class<T> externalResourceType);
> > }
> I think it may not help.
> - I think the assumption of "there is always only one resource of a
> specific type" is too strong. The external resource framework should
> only assume it gets a set of ExternalResourceInfo from the driver. The
> concrete implementation is given by user. So, if we give such an
> assumption, it would hurt the flexibility. There could be multiple
> types in the returned externalResourceInfo set. There could also be
> different types returned from different driver implementation or
> version. The contract about the return type between Driver and
> Operator should be guaranteed by user.
> - Since the Drivers are loaded dynamically in runtime, if there is a
> type mismatch, the job would fail in runtime instead of in compile
> time, no matter the type extraction is done by Operator or Flink core.
> This interface would not gain benefits for type safety.
>
> Best,
> Yangze Guo
>
> On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org>
> wrote:
> >
> > Thanks for updating the FLIP, Yangze.
> >
> > If ExternalResourceInfo is a marker interface, then
> ExternalResourceDriver
> > could return a Set<? extends ExternalResourceInfo>. This makes is a bit
> > nicer for the implementor because he can use the concrete subtype.
> >
> > If we assume that users will always cast the ExternalResourceInfo
> instance
> > into the concrete subtype and if we assume that there is always only one
> > resource of a specific type, then one could make the interface type-safe
> by
> > changing it to
> >
> > public interface RuntimeContext {
> >     <T extends ExternalResourceInfo> Set<T>
> > getExternalResourceInfo(Class<T> externalResourceType);
> > }
> >
> > If we want to support multiple GPU resources, then one would need to use
> > the name of the respective resource as well.
> >
> > Cheers,
> > Till
> >
> > On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com>
> wrote:
> >
> > > Thanks for updating the FLIP, Yangze.
> > > The latest FLIP looks good to me.
> > >
> > > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is out of
> > > sync.
> > >
> > > > Retrieve the information of the external resources according to the
> > > > resourceProfile.
> > >
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com>
> wrote:
> > >
> > > > Good feedback form Xintong. The latest FLIP looks good to me.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com>
> wrote:
> > > >
> > > > > Hi there,
> > > > > I've updated the FLIP accordingly. Please take a look. If you have
> any
> > > > > further concerns please let me know.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com>
> wrote:
> > > > > >
> > > > > > Thanks for the feedback, Xintong.
> > > > > >
> > > > > > - Should we have a factory interface for
> `ExternalResourceDriver`,
> > > that
> > > > > > takes the configuration and returns a driver instance?
> Otherwise, if
> > > we
> > > > > are
> > > > > > creating the driver instance with reflection, we kind of
> implicitly
> > > > > > requires the driver to have a public non-argument constructor.
> If we
> > > > > > decided to go with this approach, then we will not need
> > > > > > `ExternalResourceDriver#open`.
> > > > > >
> > > > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > > > interface ExternalResourceDriverFactory {
> > > > > >     ExternalResourceDriver fromConfiguration(Configuration
> config);
> > > > > > }
> > > > > > Regarding the configuration, the user should provide
> > > > > > "external-resource.{resourceName}.driver-factory.class" instead.
> > > > > >
> > > > > > - Not sure about the necessity of
> `ExternalResourceDriver#close`. I
> > > > would
> > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > necessary.
> > > > > >
> > > > > > I add `ExternalResourceDriver#close` in case user needs to clean
> up
> > > > > > internal states and any other resources. It's true that it may
> not
> > > > > > absolutely necessary for our GPUDriver. From my side, I'm ok to
> > > remove
> > > > > > it.
> > > > > >
> > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > `ResourceProfile` as argument. This exposes more information
> than it
> > > > > needs.
> > > > > > In addition, it requires the runtime/core to understand how to
> > > properly
> > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > > > abstract
> > > > > > class. Runtime/core has to known which implementation of
> `Resource`
> > > to
> > > > > use.
> > > > > >
> > > > > > True, at the moment, I think the amount of the resource is
> enough for
> > > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the
> future, if
> > > > > > the fine-grained external resource management is supported, the
> > > amount
> > > > > > of the resource seems to be enough either. If we want to leverage
> > > some
> > > > > > external resources which could not be measured by a single long
> > > value,
> > > > > > we might enrich this. But I'd like to keep it out of the scope of
> > > this
> > > > > > FLIP.
> > > > > >
> > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> think it
> > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > interface.
> > > > > > User can define their own `ExternalResourceInfo` implementation
> and
> > > how
> > > > > it
> > > > > > is used by the operator user codes.
> > > > > >
> > > > > > Sounds good.
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <
> tonysong820@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > Sorry to pull this back. I have some concerns about the recent
> > > > updated
> > > > > > > interface details.
> > > > > > >
> > > > > > > - Should we have a factory interface for
> `ExternalResourceDriver`,
> > > > that
> > > > > > > takes the configuration and returns a driver instance?
> Otherwise,
> > > if
> > > > > we are
> > > > > > > creating the driver instance with reflection, we kind of
> implicitly
> > > > > > > requires the driver to have a public non-argument constructor.
> If
> > > we
> > > > > > > decided to go with this approach, then we will not need
> > > > > > > `ExternalResourceDriver#open`.
> > > > > > > - Not sure about the necessity of
> `ExternalResourceDriver#close`. I
> > > > > would
> > > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > > necessary.
> > > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > > `ResourceProfile` as argument. This exposes more information
> than
> > > it
> > > > > needs.
> > > > > > > In addition, it requires the runtime/core to understand how to
> > > > properly
> > > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is
> an
> > > > > abstract
> > > > > > > class. Runtime/core has to known which implementation of
> `Resource`
> > > > to
> > > > > use.
> > > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I
> think
> > > it
> > > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > > interface.
> > > > > > > User can define their own `ExternalResourceInfo`
> implementation and
> > > > > how it
> > > > > > > is used by the operator user codes.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <
> becket.qin@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki
> > > looks
> > > > > good to
> > > > > > > > me.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <
> karmagyz@gmail.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > > Edit: RuntimeContext interface
> > > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > > getExternalResourceInfo();
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <
> karmagyz@gmail.com
> > > >
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Hi, there
> > > > > > > > > >
> > > > > > > > > > I have updated the FLIP, mainly target to make it more
> > > detailed
> > > > > and
> > > > > > > > > > clear. The general design is not changed, but there are
> still
> > > > > some
> > > > > > > > > > changes need to be notified here:
> > > > > > > > > >
> > > > > > > > > > - Change the `ExternalResourceDriver` from abstract
> class to
> > > > > > > > > > interface, since it does not have an abstract
> implementation.
> > > > > Add life
> > > > > > > > > > cycle method `open` and `close`.
> > > > > > > > > >
> > > > > > > > > > - Specify the method added to the RuntimeContext from
> which
> > > > user
> > > > > get
> > > > > > > > > > the information of external resources.
> > > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > > >
> > > > > > > > > > - Add "String getInformation()" method to
> > > > `ExternalResourceInfo`
> > > > > > > > > interface.
> > > > > > > > > >
> > > > > > > > > > - Treat adding external resource info to RestAPI/WebUI
> as a
> > > > > future
> > > > > > > > work.
> > > > > > > > > >
> > > > > > > > > > If you have any new concerns after that change, please
> > > > mentioned
> > > > > here.
> > > > > > > > > > Sorry for disturbing you.
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > > danrtsey.wy@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > Thanks Yangze for the efforts to support GPU extended
> > > > > resources.
> > > > > > > > > > >
> > > > > > > > > > > +1 for this FLIP
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yang
> > > > > > > > > > >
> > > > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> > > 下午11:10写道:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > > >
> > > > > > > > > > > > +1
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Till
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > > felixzhengcb@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> > > > > 下午6:36写道:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > > karmagyz@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1],
> which
> > > > adds
> > > > > GPU
> > > > > > > > > support in
> > > > > > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The vote will be open for at least 72 hours.
> Unless
> > > > > there is
> > > > > > > > an
> > > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > > I will try to close it by April 4, 2020 10:00
> UTC
> > > if
> > > > > we have
> > > > > > > > > received
> > > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
> > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
> > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
> > >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Hi Till,

> ExternalResourceDriver could return a Set<? extends ExternalResourceInfo>.
It sounds good.

> then one could make the interface type-safe by changing it to
> public interface RuntimeContext {
>    <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfo(Class<T> externalResourceType);
> }
I think it may not help.
- I think the assumption of "there is always only one resource of a
specific type" is too strong. The external resource framework should
only assume it gets a set of ExternalResourceInfo from the driver. The
concrete implementation is given by user. So, if we give such an
assumption, it would hurt the flexibility. There could be multiple
types in the returned externalResourceInfo set. There could also be
different types returned from different driver implementation or
version. The contract about the return type between Driver and
Operator should be guaranteed by user.
- Since the Drivers are loaded dynamically in runtime, if there is a
type mismatch, the job would fail in runtime instead of in compile
time, no matter the type extraction is done by Operator or Flink core.
This interface would not gain benefits for type safety.

Best,
Yangze Guo

On Wed, Apr 15, 2020 at 1:38 AM Till Rohrmann <tr...@apache.org> wrote:
>
> Thanks for updating the FLIP, Yangze.
>
> If ExternalResourceInfo is a marker interface, then ExternalResourceDriver
> could return a Set<? extends ExternalResourceInfo>. This makes is a bit
> nicer for the implementor because he can use the concrete subtype.
>
> If we assume that users will always cast the ExternalResourceInfo instance
> into the concrete subtype and if we assume that there is always only one
> resource of a specific type, then one could make the interface type-safe by
> changing it to
>
> public interface RuntimeContext {
>     <T extends ExternalResourceInfo> Set<T>
> getExternalResourceInfo(Class<T> externalResourceType);
> }
>
> If we want to support multiple GPU resources, then one would need to use
> the name of the respective resource as well.
>
> Cheers,
> Till
>
> On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com> wrote:
>
> > Thanks for updating the FLIP, Yangze.
> > The latest FLIP looks good to me.
> >
> > nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is out of
> > sync.
> >
> > > Retrieve the information of the external resources according to the
> > > resourceProfile.
> >
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com> wrote:
> >
> > > Good feedback form Xintong. The latest FLIP looks good to me.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com> wrote:
> > >
> > > > Hi there,
> > > > I've updated the FLIP accordingly. Please take a look. If you have any
> > > > further concerns please let me know.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com> wrote:
> > > > >
> > > > > Thanks for the feedback, Xintong.
> > > > >
> > > > > - Should we have a factory interface for `ExternalResourceDriver`,
> > that
> > > > > takes the configuration and returns a driver instance? Otherwise, if
> > we
> > > > are
> > > > > creating the driver instance with reflection, we kind of implicitly
> > > > > requires the driver to have a public non-argument constructor. If we
> > > > > decided to go with this approach, then we will not need
> > > > > `ExternalResourceDriver#open`.
> > > > >
> > > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > > interface ExternalResourceDriverFactory {
> > > > >     ExternalResourceDriver fromConfiguration(Configuration config);
> > > > > }
> > > > > Regarding the configuration, the user should provide
> > > > > "external-resource.{resourceName}.driver-factory.class" instead.
> > > > >
> > > > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> > > would
> > > > > suggest to avoid introduce more interfaces if not absolutely
> > necessary.
> > > > >
> > > > > I add `ExternalResourceDriver#close` in case user needs to clean up
> > > > > internal states and any other resources. It's true that it may not
> > > > > absolutely necessary for our GPUDriver. From my side, I'm ok to
> > remove
> > > > > it.
> > > > >
> > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > `ResourceProfile` as argument. This exposes more information than it
> > > > needs.
> > > > > In addition, it requires the runtime/core to understand how to
> > properly
> > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > > abstract
> > > > > class. Runtime/core has to known which implementation of `Resource`
> > to
> > > > use.
> > > > >
> > > > > True, at the moment, I think the amount of the resource is enough for
> > > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the future, if
> > > > > the fine-grained external resource management is supported, the
> > amount
> > > > > of the resource seems to be enough either. If we want to leverage
> > some
> > > > > external resources which could not be measured by a single long
> > value,
> > > > > we might enrich this. But I'd like to keep it out of the scope of
> > this
> > > > > FLIP.
> > > > >
> > > > > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > interface.
> > > > > User can define their own `ExternalResourceInfo` implementation and
> > how
> > > > it
> > > > > is used by the operator user codes.
> > > > >
> > > > > Sounds good.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <to...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > Sorry to pull this back. I have some concerns about the recent
> > > updated
> > > > > > interface details.
> > > > > >
> > > > > > - Should we have a factory interface for `ExternalResourceDriver`,
> > > that
> > > > > > takes the configuration and returns a driver instance? Otherwise,
> > if
> > > > we are
> > > > > > creating the driver instance with reflection, we kind of implicitly
> > > > > > requires the driver to have a public non-argument constructor. If
> > we
> > > > > > decided to go with this approach, then we will not need
> > > > > > `ExternalResourceDriver#open`.
> > > > > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> > > > would
> > > > > > suggest to avoid introduce more interfaces if not absolutely
> > > necessary.
> > > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > > `ResourceProfile` as argument. This exposes more information than
> > it
> > > > needs.
> > > > > > In addition, it requires the runtime/core to understand how to
> > > properly
> > > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > > abstract
> > > > > > class. Runtime/core has to known which implementation of `Resource`
> > > to
> > > > use.
> > > > > > - Do we really need `ExternalResourceInfo#getInformation`? I think
> > it
> > > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > > interface.
> > > > > > User can define their own `ExternalResourceInfo` implementation and
> > > > how it
> > > > > > is used by the operator user codes.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki
> > looks
> > > > good to
> > > > > > > me.
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com>
> > > > wrote:
> > > > > > >
> > > > > > > > Edit: RuntimeContext interface
> > > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > > getExternalResourceInfo();
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <karmagyz@gmail.com
> > >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > Hi, there
> > > > > > > > >
> > > > > > > > > I have updated the FLIP, mainly target to make it more
> > detailed
> > > > and
> > > > > > > > > clear. The general design is not changed, but there are still
> > > > some
> > > > > > > > > changes need to be notified here:
> > > > > > > > >
> > > > > > > > > - Change the `ExternalResourceDriver` from abstract class to
> > > > > > > > > interface, since it does not have an abstract implementation.
> > > > Add life
> > > > > > > > > cycle method `open` and `close`.
> > > > > > > > >
> > > > > > > > > - Specify the method added to the RuntimeContext from which
> > > user
> > > > get
> > > > > > > > > the information of external resources.
> > > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > > >
> > > > > > > > > - Add "String getInformation()" method to
> > > `ExternalResourceInfo`
> > > > > > > > interface.
> > > > > > > > >
> > > > > > > > > - Treat adding external resource info to RestAPI/WebUI as a
> > > > future
> > > > > > > work.
> > > > > > > > >
> > > > > > > > > If you have any new concerns after that change, please
> > > mentioned
> > > > here.
> > > > > > > > > Sorry for disturbing you.
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > > danrtsey.wy@gmail.com>
> > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Thanks Yangze for the efforts to support GPU extended
> > > > resources.
> > > > > > > > > >
> > > > > > > > > > +1 for this FLIP
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yang
> > > > > > > > > >
> > > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> > 下午11:10写道:
> > > > > > > > > >
> > > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > > >
> > > > > > > > > > > +1
> > > > > > > > > > >
> > > > > > > > > > > Cheers,
> > > > > > > > > > > Till
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > > felixzhengcb@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> > > > 下午6:36写道:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you~
> > > > > > > > > > > > >
> > > > > > > > > > > > > Xintong Song
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > > karmagyz@gmail.com>
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1], which
> > > adds
> > > > GPU
> > > > > > > > support in
> > > > > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The vote will be open for at least 72 hours. Unless
> > > > there is
> > > > > > > an
> > > > > > > > > > > > > objection,
> > > > > > > > > > > > > > I will try to close it by April 4, 2020 10:00 UTC
> > if
> > > > we have
> > > > > > > > received
> > > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > [1]
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > > [2]
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
> > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
> > >
> >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Till Rohrmann <tr...@apache.org>.
Thanks for updating the FLIP, Yangze.

If ExternalResourceInfo is a marker interface, then ExternalResourceDriver
could return a Set<? extends ExternalResourceInfo>. This makes is a bit
nicer for the implementor because he can use the concrete subtype.

If we assume that users will always cast the ExternalResourceInfo instance
into the concrete subtype and if we assume that there is always only one
resource of a specific type, then one could make the interface type-safe by
changing it to

public interface RuntimeContext {
    <T extends ExternalResourceInfo> Set<T>
getExternalResourceInfo(Class<T> externalResourceType);
}

If we want to support multiple GPU resources, then one would need to use
the name of the respective resource as well.

Cheers,
Till

On Mon, Apr 13, 2020 at 4:19 AM Xintong Song <to...@gmail.com> wrote:

> Thanks for updating the FLIP, Yangze.
> The latest FLIP looks good to me.
>
> nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is out of
> sync.
>
> > Retrieve the information of the external resources according to the
> > resourceProfile.
>
>
> Thank you~
>
> Xintong Song
>
>
>
> On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com> wrote:
>
> > Good feedback form Xintong. The latest FLIP looks good to me.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > Hi there,
> > > I've updated the FLIP accordingly. Please take a look. If you have any
> > > further concerns please let me know.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com> wrote:
> > > >
> > > > Thanks for the feedback, Xintong.
> > > >
> > > > - Should we have a factory interface for `ExternalResourceDriver`,
> that
> > > > takes the configuration and returns a driver instance? Otherwise, if
> we
> > > are
> > > > creating the driver instance with reflection, we kind of implicitly
> > > > requires the driver to have a public non-argument constructor. If we
> > > > decided to go with this approach, then we will not need
> > > > `ExternalResourceDriver#open`.
> > > >
> > > > True, we could have an `ExternalResourceDriverFactory`, like
> > > > interface ExternalResourceDriverFactory {
> > > >     ExternalResourceDriver fromConfiguration(Configuration config);
> > > > }
> > > > Regarding the configuration, the user should provide
> > > > "external-resource.{resourceName}.driver-factory.class" instead.
> > > >
> > > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> > would
> > > > suggest to avoid introduce more interfaces if not absolutely
> necessary.
> > > >
> > > > I add `ExternalResourceDriver#close` in case user needs to clean up
> > > > internal states and any other resources. It's true that it may not
> > > > absolutely necessary for our GPUDriver. From my side, I'm ok to
> remove
> > > > it.
> > > >
> > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > `ResourceProfile` as argument. This exposes more information than it
> > > needs.
> > > > In addition, it requires the runtime/core to understand how to
> properly
> > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > abstract
> > > > class. Runtime/core has to known which implementation of `Resource`
> to
> > > use.
> > > >
> > > > True, at the moment, I think the amount of the resource is enough for
> > > > the `ExternalResourceDriver#retrieveResourceInfo`. In the future, if
> > > > the fine-grained external resource management is supported, the
> amount
> > > > of the resource seems to be enough either. If we want to leverage
> some
> > > > external resources which could not be measured by a single long
> value,
> > > > we might enrich this. But I'd like to keep it out of the scope of
> this
> > > > FLIP.
> > > >
> > > > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > > > should be good enough to make `ExternalResourceInfo` an empty
> > interface.
> > > > User can define their own `ExternalResourceInfo` implementation and
> how
> > > it
> > > > is used by the operator user codes.
> > > >
> > > > Sounds good.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <to...@gmail.com>
> > > wrote:
> > > > >
> > > > > Sorry to pull this back. I have some concerns about the recent
> > updated
> > > > > interface details.
> > > > >
> > > > > - Should we have a factory interface for `ExternalResourceDriver`,
> > that
> > > > > takes the configuration and returns a driver instance? Otherwise,
> if
> > > we are
> > > > > creating the driver instance with reflection, we kind of implicitly
> > > > > requires the driver to have a public non-argument constructor. If
> we
> > > > > decided to go with this approach, then we will not need
> > > > > `ExternalResourceDriver#open`.
> > > > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> > > would
> > > > > suggest to avoid introduce more interfaces if not absolutely
> > necessary.
> > > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > > `ResourceProfile` as argument. This exposes more information than
> it
> > > needs.
> > > > > In addition, it requires the runtime/core to understand how to
> > properly
> > > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > > abstract
> > > > > class. Runtime/core has to known which implementation of `Resource`
> > to
> > > use.
> > > > > - Do we really need `ExternalResourceInfo#getInformation`? I think
> it
> > > > > should be good enough to make `ExternalResourceInfo` an empty
> > > interface.
> > > > > User can define their own `ExternalResourceInfo` implementation and
> > > how it
> > > > > is used by the operator user codes.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com>
> > > wrote:
> > > > >
> > > > > > +1
> > > > > >
> > > > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki
> looks
> > > good to
> > > > > > me.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > > Edit: RuntimeContext interface
> > > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > > getExternalResourceInfo();
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <karmagyz@gmail.com
> >
> > > wrote:
> > > > > > > >
> > > > > > > > Hi, there
> > > > > > > >
> > > > > > > > I have updated the FLIP, mainly target to make it more
> detailed
> > > and
> > > > > > > > clear. The general design is not changed, but there are still
> > > some
> > > > > > > > changes need to be notified here:
> > > > > > > >
> > > > > > > > - Change the `ExternalResourceDriver` from abstract class to
> > > > > > > > interface, since it does not have an abstract implementation.
> > > Add life
> > > > > > > > cycle method `open` and `close`.
> > > > > > > >
> > > > > > > > - Specify the method added to the RuntimeContext from which
> > user
> > > get
> > > > > > > > the information of external resources.
> > > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > > >
> > > > > > > > - Add "String getInformation()" method to
> > `ExternalResourceInfo`
> > > > > > > interface.
> > > > > > > >
> > > > > > > > - Treat adding external resource info to RestAPI/WebUI as a
> > > future
> > > > > > work.
> > > > > > > >
> > > > > > > > If you have any new concerns after that change, please
> > mentioned
> > > here.
> > > > > > > > Sorry for disturbing you.
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> > danrtsey.wy@gmail.com>
> > > > > > wrote:
> > > > > > > > >
> > > > > > > > > Thanks Yangze for the efforts to support GPU extended
> > > resources.
> > > > > > > > >
> > > > > > > > > +1 for this FLIP
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yang
> > > > > > > > >
> > > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四
> 下午11:10写道:
> > > > > > > > >
> > > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > > >
> > > > > > > > > > +1
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Till
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > > felixzhengcb@gmail.com
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> > > 下午6:36写道:
> > > > > > > > > > >
> > > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you~
> > > > > > > > > > > >
> > > > > > > > > > > > Xintong Song
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > > karmagyz@gmail.com>
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1], which
> > adds
> > > GPU
> > > > > > > support in
> > > > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > > > >
> > > > > > > > > > > > > The vote will be open for at least 72 hours. Unless
> > > there is
> > > > > > an
> > > > > > > > > > > > objection,
> > > > > > > > > > > > > I will try to close it by April 4, 2020 10:00 UTC
> if
> > > we have
> > > > > > > received
> > > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > [1]
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > > [2]
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > > >
> > > > > > > > > > > > > Best,
> > > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > >
> > > > > >
> > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Xintong Song <to...@gmail.com>.
Thanks for updating the FLIP, Yangze.
The latest FLIP looks good to me.

nit: Javadoc of `ExternalResourceDriver#retrieveResourceInfo` is out of
sync.

> Retrieve the information of the external resources according to the
> resourceProfile.


Thank you~

Xintong Song



On Sat, Apr 11, 2020 at 11:04 AM Becket Qin <be...@gmail.com> wrote:

> Good feedback form Xintong. The latest FLIP looks good to me.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com> wrote:
>
> > Hi there,
> > I've updated the FLIP accordingly. Please take a look. If you have any
> > further concerns please let me know.
> >
> > Best,
> > Yangze Guo
> >
> > On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com> wrote:
> > >
> > > Thanks for the feedback, Xintong.
> > >
> > > - Should we have a factory interface for `ExternalResourceDriver`, that
> > > takes the configuration and returns a driver instance? Otherwise, if we
> > are
> > > creating the driver instance with reflection, we kind of implicitly
> > > requires the driver to have a public non-argument constructor. If we
> > > decided to go with this approach, then we will not need
> > > `ExternalResourceDriver#open`.
> > >
> > > True, we could have an `ExternalResourceDriverFactory`, like
> > > interface ExternalResourceDriverFactory {
> > >     ExternalResourceDriver fromConfiguration(Configuration config);
> > > }
> > > Regarding the configuration, the user should provide
> > > "external-resource.{resourceName}.driver-factory.class" instead.
> > >
> > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> would
> > > suggest to avoid introduce more interfaces if not absolutely necessary.
> > >
> > > I add `ExternalResourceDriver#close` in case user needs to clean up
> > > internal states and any other resources. It's true that it may not
> > > absolutely necessary for our GPUDriver. From my side, I'm ok to remove
> > > it.
> > >
> > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > `ResourceProfile` as argument. This exposes more information than it
> > needs.
> > > In addition, it requires the runtime/core to understand how to properly
> > > wrap the external resource into `ResourceProfile`. E.g.,
> > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > abstract
> > > class. Runtime/core has to known which implementation of `Resource` to
> > use.
> > >
> > > True, at the moment, I think the amount of the resource is enough for
> > > the `ExternalResourceDriver#retrieveResourceInfo`. In the future, if
> > > the fine-grained external resource management is supported, the amount
> > > of the resource seems to be enough either. If we want to leverage some
> > > external resources which could not be measured by a single long value,
> > > we might enrich this. But I'd like to keep it out of the scope of this
> > > FLIP.
> > >
> > > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > > should be good enough to make `ExternalResourceInfo` an empty
> interface.
> > > User can define their own `ExternalResourceInfo` implementation and how
> > it
> > > is used by the operator user codes.
> > >
> > > Sounds good.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <to...@gmail.com>
> > wrote:
> > > >
> > > > Sorry to pull this back. I have some concerns about the recent
> updated
> > > > interface details.
> > > >
> > > > - Should we have a factory interface for `ExternalResourceDriver`,
> that
> > > > takes the configuration and returns a driver instance? Otherwise, if
> > we are
> > > > creating the driver instance with reflection, we kind of implicitly
> > > > requires the driver to have a public non-argument constructor. If we
> > > > decided to go with this approach, then we will not need
> > > > `ExternalResourceDriver#open`.
> > > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> > would
> > > > suggest to avoid introduce more interfaces if not absolutely
> necessary.
> > > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > > `ResourceProfile` as argument. This exposes more information than it
> > needs.
> > > > In addition, it requires the runtime/core to understand how to
> properly
> > > > wrap the external resource into `ResourceProfile`. E.g.,
> > > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> > abstract
> > > > class. Runtime/core has to known which implementation of `Resource`
> to
> > use.
> > > > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > > > should be good enough to make `ExternalResourceInfo` an empty
> > interface.
> > > > User can define their own `ExternalResourceInfo` implementation and
> > how it
> > > > is used by the operator user codes.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com>
> > wrote:
> > > >
> > > > > +1
> > > > >
> > > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki looks
> > good to
> > > > > me.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com>
> > wrote:
> > > > >
> > > > > > Edit: RuntimeContext interface
> > > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > To: Map<String, Set<ExternalResourceInfo>>
> > getExternalResourceInfo();
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > >
> > > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com>
> > wrote:
> > > > > > >
> > > > > > > Hi, there
> > > > > > >
> > > > > > > I have updated the FLIP, mainly target to make it more detailed
> > and
> > > > > > > clear. The general design is not changed, but there are still
> > some
> > > > > > > changes need to be notified here:
> > > > > > >
> > > > > > > - Change the `ExternalResourceDriver` from abstract class to
> > > > > > > interface, since it does not have an abstract implementation.
> > Add life
> > > > > > > cycle method `open` and `close`.
> > > > > > >
> > > > > > > - Specify the method added to the RuntimeContext from which
> user
> > get
> > > > > > > the information of external resources.
> > > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > > >
> > > > > > > - Add "String getInformation()" method to
> `ExternalResourceInfo`
> > > > > > interface.
> > > > > > >
> > > > > > > - Treat adding external resource info to RestAPI/WebUI as a
> > future
> > > > > work.
> > > > > > >
> > > > > > > If you have any new concerns after that change, please
> mentioned
> > here.
> > > > > > > Sorry for disturbing you.
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <
> danrtsey.wy@gmail.com>
> > > > > wrote:
> > > > > > > >
> > > > > > > > Thanks Yangze for the efforts to support GPU extended
> > resources.
> > > > > > > >
> > > > > > > > +1 for this FLIP
> > > > > > > >
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yang
> > > > > > > >
> > > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> > > > > > > >
> > > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > > >
> > > > > > > > > +1
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > > Till
> > > > > > > > >
> > > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > > felixzhengcb@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> > 下午6:36写道:
> > > > > > > > > >
> > > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > > >
> > > > > > > > > > > Thank you~
> > > > > > > > > > >
> > > > > > > > > > > Xintong Song
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> > karmagyz@gmail.com>
> > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi everyone,
> > > > > > > > > > > >
> > > > > > > > > > > > I'd like to start the vote of FLIP-108 [1], which
> adds
> > GPU
> > > > > > support in
> > > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > > >
> > > > > > > > > > > > The vote will be open for at least 72 hours. Unless
> > there is
> > > > > an
> > > > > > > > > > > objection,
> > > > > > > > > > > > I will try to close it by April 4, 2020 10:00 UTC if
> > we have
> > > > > > received
> > > > > > > > > > > > sufficient votes.
> > > > > > > > > > > >
> > > > > > > > > > > > [1]
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > > [2]
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > > >
> > > > > > > > > > > > Best,
> > > > > > > > > > > > Yangze Guo
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > >
> > > > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Becket Qin <be...@gmail.com>.
Good feedback form Xintong. The latest FLIP looks good to me.

Thanks,

Jiangjie (Becket) Qin

On Sat, Apr 11, 2020 at 9:20 AM Yangze Guo <ka...@gmail.com> wrote:

> Hi there,
> I've updated the FLIP accordingly. Please take a look. If you have any
> further concerns please let me know.
>
> Best,
> Yangze Guo
>
> On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com> wrote:
> >
> > Thanks for the feedback, Xintong.
> >
> > - Should we have a factory interface for `ExternalResourceDriver`, that
> > takes the configuration and returns a driver instance? Otherwise, if we
> are
> > creating the driver instance with reflection, we kind of implicitly
> > requires the driver to have a public non-argument constructor. If we
> > decided to go with this approach, then we will not need
> > `ExternalResourceDriver#open`.
> >
> > True, we could have an `ExternalResourceDriverFactory`, like
> > interface ExternalResourceDriverFactory {
> >     ExternalResourceDriver fromConfiguration(Configuration config);
> > }
> > Regarding the configuration, the user should provide
> > "external-resource.{resourceName}.driver-factory.class" instead.
> >
> > - Not sure about the necessity of `ExternalResourceDriver#close`. I would
> > suggest to avoid introduce more interfaces if not absolutely necessary.
> >
> > I add `ExternalResourceDriver#close` in case user needs to clean up
> > internal states and any other resources. It's true that it may not
> > absolutely necessary for our GPUDriver. From my side, I'm ok to remove
> > it.
> >
> > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > `ResourceProfile` as argument. This exposes more information than it
> needs.
> > In addition, it requires the runtime/core to understand how to properly
> > wrap the external resource into `ResourceProfile`. E.g.,
> > `ResourceProfile#extendedResources` takes `Resource`, which is an
> abstract
> > class. Runtime/core has to known which implementation of `Resource` to
> use.
> >
> > True, at the moment, I think the amount of the resource is enough for
> > the `ExternalResourceDriver#retrieveResourceInfo`. In the future, if
> > the fine-grained external resource management is supported, the amount
> > of the resource seems to be enough either. If we want to leverage some
> > external resources which could not be measured by a single long value,
> > we might enrich this. But I'd like to keep it out of the scope of this
> > FLIP.
> >
> > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > should be good enough to make `ExternalResourceInfo` an empty interface.
> > User can define their own `ExternalResourceInfo` implementation and how
> it
> > is used by the operator user codes.
> >
> > Sounds good.
> >
> > Best,
> > Yangze Guo
> >
> > On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <to...@gmail.com>
> wrote:
> > >
> > > Sorry to pull this back. I have some concerns about the recent updated
> > > interface details.
> > >
> > > - Should we have a factory interface for `ExternalResourceDriver`, that
> > > takes the configuration and returns a driver instance? Otherwise, if
> we are
> > > creating the driver instance with reflection, we kind of implicitly
> > > requires the driver to have a public non-argument constructor. If we
> > > decided to go with this approach, then we will not need
> > > `ExternalResourceDriver#open`.
> > > - Not sure about the necessity of `ExternalResourceDriver#close`. I
> would
> > > suggest to avoid introduce more interfaces if not absolutely necessary.
> > > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > > `ResourceProfile` as argument. This exposes more information than it
> needs.
> > > In addition, it requires the runtime/core to understand how to properly
> > > wrap the external resource into `ResourceProfile`. E.g.,
> > > `ResourceProfile#extendedResources` takes `Resource`, which is an
> abstract
> > > class. Runtime/core has to known which implementation of `Resource` to
> use.
> > > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > > should be good enough to make `ExternalResourceInfo` an empty
> interface.
> > > User can define their own `ExternalResourceInfo` implementation and
> how it
> > > is used by the operator user codes.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com>
> wrote:
> > >
> > > > +1
> > > >
> > > > Thanks for driving this effort, Ynagze. The latest FLIP wiki looks
> good to
> > > > me.
> > > >
> > > > Cheers,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com>
> wrote:
> > > >
> > > > > Edit: RuntimeContext interface
> > > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > To: Map<String, Set<ExternalResourceInfo>>
> getExternalResourceInfo();
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > >
> > > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com>
> wrote:
> > > > > >
> > > > > > Hi, there
> > > > > >
> > > > > > I have updated the FLIP, mainly target to make it more detailed
> and
> > > > > > clear. The general design is not changed, but there are still
> some
> > > > > > changes need to be notified here:
> > > > > >
> > > > > > - Change the `ExternalResourceDriver` from abstract class to
> > > > > > interface, since it does not have an abstract implementation.
> Add life
> > > > > > cycle method `open` and `close`.
> > > > > >
> > > > > > - Specify the method added to the RuntimeContext from which user
> get
> > > > > > the information of external resources.
> > > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > > >
> > > > > > - Add "String getInformation()" method to `ExternalResourceInfo`
> > > > > interface.
> > > > > >
> > > > > > - Treat adding external resource info to RestAPI/WebUI as a
> future
> > > > work.
> > > > > >
> > > > > > If you have any new concerns after that change, please mentioned
> here.
> > > > > > Sorry for disturbing you.
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com>
> > > > wrote:
> > > > > > >
> > > > > > > Thanks Yangze for the efforts to support GPU extended
> resources.
> > > > > > >
> > > > > > > +1 for this FLIP
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Yang
> > > > > > >
> > > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> > > > > > >
> > > > > > > > Thanks for driving this effort Yangze.
> > > > > > > >
> > > > > > > > +1
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Till
> > > > > > > >
> > > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > > felixzhengcb@gmail.com
> > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三
> 下午6:36写道:
> > > > > > > > >
> > > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > > >
> > > > > > > > > > Thank you~
> > > > > > > > > >
> > > > > > > > > > Xintong Song
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <
> karmagyz@gmail.com>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi everyone,
> > > > > > > > > > >
> > > > > > > > > > > I'd like to start the vote of FLIP-108 [1], which adds
> GPU
> > > > > support in
> > > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > > >
> > > > > > > > > > > The vote will be open for at least 72 hours. Unless
> there is
> > > > an
> > > > > > > > > > objection,
> > > > > > > > > > > I will try to close it by April 4, 2020 10:00 UTC if
> we have
> > > > > received
> > > > > > > > > > > sufficient votes.
> > > > > > > > > > >
> > > > > > > > > > > [1]
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > > [2]
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > > >
> > > > > > > > > > > Best,
> > > > > > > > > > > Yangze Guo
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > >
> > > >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Hi there,
I've updated the FLIP accordingly. Please take a look. If you have any
further concerns please let me know.

Best,
Yangze Guo

On Fri, Apr 10, 2020 at 6:40 PM Yangze Guo <ka...@gmail.com> wrote:
>
> Thanks for the feedback, Xintong.
>
> - Should we have a factory interface for `ExternalResourceDriver`, that
> takes the configuration and returns a driver instance? Otherwise, if we are
> creating the driver instance with reflection, we kind of implicitly
> requires the driver to have a public non-argument constructor. If we
> decided to go with this approach, then we will not need
> `ExternalResourceDriver#open`.
>
> True, we could have an `ExternalResourceDriverFactory`, like
> interface ExternalResourceDriverFactory {
>     ExternalResourceDriver fromConfiguration(Configuration config);
> }
> Regarding the configuration, the user should provide
> "external-resource.{resourceName}.driver-factory.class" instead.
>
> - Not sure about the necessity of `ExternalResourceDriver#close`. I would
> suggest to avoid introduce more interfaces if not absolutely necessary.
>
> I add `ExternalResourceDriver#close` in case user needs to clean up
> internal states and any other resources. It's true that it may not
> absolutely necessary for our GPUDriver. From my side, I'm ok to remove
> it.
>
> - `ExternalResourceDriver#retrieveResourceInfo` should not take
> `ResourceProfile` as argument. This exposes more information than it needs.
> In addition, it requires the runtime/core to understand how to properly
> wrap the external resource into `ResourceProfile`. E.g.,
> `ResourceProfile#extendedResources` takes `Resource`, which is an abstract
> class. Runtime/core has to known which implementation of `Resource` to use.
>
> True, at the moment, I think the amount of the resource is enough for
> the `ExternalResourceDriver#retrieveResourceInfo`. In the future, if
> the fine-grained external resource management is supported, the amount
> of the resource seems to be enough either. If we want to leverage some
> external resources which could not be measured by a single long value,
> we might enrich this. But I'd like to keep it out of the scope of this
> FLIP.
>
> - Do we really need `ExternalResourceInfo#getInformation`? I think it
> should be good enough to make `ExternalResourceInfo` an empty interface.
> User can define their own `ExternalResourceInfo` implementation and how it
> is used by the operator user codes.
>
> Sounds good.
>
> Best,
> Yangze Guo
>
> On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <to...@gmail.com> wrote:
> >
> > Sorry to pull this back. I have some concerns about the recent updated
> > interface details.
> >
> > - Should we have a factory interface for `ExternalResourceDriver`, that
> > takes the configuration and returns a driver instance? Otherwise, if we are
> > creating the driver instance with reflection, we kind of implicitly
> > requires the driver to have a public non-argument constructor. If we
> > decided to go with this approach, then we will not need
> > `ExternalResourceDriver#open`.
> > - Not sure about the necessity of `ExternalResourceDriver#close`. I would
> > suggest to avoid introduce more interfaces if not absolutely necessary.
> > - `ExternalResourceDriver#retrieveResourceInfo` should not take
> > `ResourceProfile` as argument. This exposes more information than it needs.
> > In addition, it requires the runtime/core to understand how to properly
> > wrap the external resource into `ResourceProfile`. E.g.,
> > `ResourceProfile#extendedResources` takes `Resource`, which is an abstract
> > class. Runtime/core has to known which implementation of `Resource` to use.
> > - Do we really need `ExternalResourceInfo#getInformation`? I think it
> > should be good enough to make `ExternalResourceInfo` an empty interface.
> > User can define their own `ExternalResourceInfo` implementation and how it
> > is used by the operator user codes.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com> wrote:
> >
> > > +1
> > >
> > > Thanks for driving this effort, Ynagze. The latest FLIP wiki looks good to
> > > me.
> > >
> > > Cheers,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com> wrote:
> > >
> > > > Edit: RuntimeContext interface
> > > > From: Map<String, Set<ExternalResourceInfo>>
> > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > To: Map<String, Set<ExternalResourceInfo>> getExternalResourceInfo();
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > >
> > > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com> wrote:
> > > > >
> > > > > Hi, there
> > > > >
> > > > > I have updated the FLIP, mainly target to make it more detailed and
> > > > > clear. The general design is not changed, but there are still some
> > > > > changes need to be notified here:
> > > > >
> > > > > - Change the `ExternalResourceDriver` from abstract class to
> > > > > interface, since it does not have an abstract implementation. Add life
> > > > > cycle method `open` and `close`.
> > > > >
> > > > > - Specify the method added to the RuntimeContext from which user get
> > > > > the information of external resources.
> > > > >         Map<String, Set<ExternalResourceInfo>>
> > > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > > >
> > > > > - Add "String getInformation()" method to `ExternalResourceInfo`
> > > > interface.
> > > > >
> > > > > - Treat adding external resource info to RestAPI/WebUI as a future
> > > work.
> > > > >
> > > > > If you have any new concerns after that change, please mentioned here.
> > > > > Sorry for disturbing you.
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com>
> > > wrote:
> > > > > >
> > > > > > Thanks Yangze for the efforts to support GPU extended resources.
> > > > > >
> > > > > > +1 for this FLIP
> > > > > >
> > > > > >
> > > > > > Best,
> > > > > > Yang
> > > > > >
> > > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> > > > > >
> > > > > > > Thanks for driving this effort Yangze.
> > > > > > >
> > > > > > > +1
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Till
> > > > > > >
> > > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > > felixzhengcb@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > > +1 (non-binding) from my side.
> > > > > > > >
> > > > > > > >
> > > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> > > > > > > >
> > > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > > +1 (non-binding) from my side.
> > > > > > > > >
> > > > > > > > > Thank you~
> > > > > > > > >
> > > > > > > > > Xintong Song
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi everyone,
> > > > > > > > > >
> > > > > > > > > > I'd like to start the vote of FLIP-108 [1], which adds GPU
> > > > support in
> > > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > > >
> > > > > > > > > > The vote will be open for at least 72 hours. Unless there is
> > > an
> > > > > > > > > objection,
> > > > > > > > > > I will try to close it by April 4, 2020 10:00 UTC if we have
> > > > received
> > > > > > > > > > sufficient votes.
> > > > > > > > > >
> > > > > > > > > > [1]
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > > [2]
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > > >
> > > > > > > > > > Best,
> > > > > > > > > > Yangze Guo
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > >
> > >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Thanks for the feedback, Xintong.

- Should we have a factory interface for `ExternalResourceDriver`, that
takes the configuration and returns a driver instance? Otherwise, if we are
creating the driver instance with reflection, we kind of implicitly
requires the driver to have a public non-argument constructor. If we
decided to go with this approach, then we will not need
`ExternalResourceDriver#open`.

True, we could have an `ExternalResourceDriverFactory`, like
interface ExternalResourceDriverFactory {
    ExternalResourceDriver fromConfiguration(Configuration config);
}
Regarding the configuration, the user should provide
"external-resource.{resourceName}.driver-factory.class" instead.

- Not sure about the necessity of `ExternalResourceDriver#close`. I would
suggest to avoid introduce more interfaces if not absolutely necessary.

I add `ExternalResourceDriver#close` in case user needs to clean up
internal states and any other resources. It's true that it may not
absolutely necessary for our GPUDriver. From my side, I'm ok to remove
it.

- `ExternalResourceDriver#retrieveResourceInfo` should not take
`ResourceProfile` as argument. This exposes more information than it needs.
In addition, it requires the runtime/core to understand how to properly
wrap the external resource into `ResourceProfile`. E.g.,
`ResourceProfile#extendedResources` takes `Resource`, which is an abstract
class. Runtime/core has to known which implementation of `Resource` to use.

True, at the moment, I think the amount of the resource is enough for
the `ExternalResourceDriver#retrieveResourceInfo`. In the future, if
the fine-grained external resource management is supported, the amount
of the resource seems to be enough either. If we want to leverage some
external resources which could not be measured by a single long value,
we might enrich this. But I'd like to keep it out of the scope of this
FLIP.

- Do we really need `ExternalResourceInfo#getInformation`? I think it
should be good enough to make `ExternalResourceInfo` an empty interface.
User can define their own `ExternalResourceInfo` implementation and how it
is used by the operator user codes.

Sounds good.

Best,
Yangze Guo

On Fri, Apr 10, 2020 at 6:04 PM Xintong Song <to...@gmail.com> wrote:
>
> Sorry to pull this back. I have some concerns about the recent updated
> interface details.
>
> - Should we have a factory interface for `ExternalResourceDriver`, that
> takes the configuration and returns a driver instance? Otherwise, if we are
> creating the driver instance with reflection, we kind of implicitly
> requires the driver to have a public non-argument constructor. If we
> decided to go with this approach, then we will not need
> `ExternalResourceDriver#open`.
> - Not sure about the necessity of `ExternalResourceDriver#close`. I would
> suggest to avoid introduce more interfaces if not absolutely necessary.
> - `ExternalResourceDriver#retrieveResourceInfo` should not take
> `ResourceProfile` as argument. This exposes more information than it needs.
> In addition, it requires the runtime/core to understand how to properly
> wrap the external resource into `ResourceProfile`. E.g.,
> `ResourceProfile#extendedResources` takes `Resource`, which is an abstract
> class. Runtime/core has to known which implementation of `Resource` to use.
> - Do we really need `ExternalResourceInfo#getInformation`? I think it
> should be good enough to make `ExternalResourceInfo` an empty interface.
> User can define their own `ExternalResourceInfo` implementation and how it
> is used by the operator user codes.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com> wrote:
>
> > +1
> >
> > Thanks for driving this effort, Ynagze. The latest FLIP wiki looks good to
> > me.
> >
> > Cheers,
> >
> > Jiangjie (Becket) Qin
> >
> > On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > Edit: RuntimeContext interface
> > > From: Map<String, Set<ExternalResourceInfo>>
> > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > To: Map<String, Set<ExternalResourceInfo>> getExternalResourceInfo();
> > >
> > > Best,
> > > Yangze Guo
> > >
> > >
> > > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com> wrote:
> > > >
> > > > Hi, there
> > > >
> > > > I have updated the FLIP, mainly target to make it more detailed and
> > > > clear. The general design is not changed, but there are still some
> > > > changes need to be notified here:
> > > >
> > > > - Change the `ExternalResourceDriver` from abstract class to
> > > > interface, since it does not have an abstract implementation. Add life
> > > > cycle method `open` and `close`.
> > > >
> > > > - Specify the method added to the RuntimeContext from which user get
> > > > the information of external resources.
> > > >         Map<String, Set<ExternalResourceInfo>>
> > > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > > >
> > > > - Add "String getInformation()" method to `ExternalResourceInfo`
> > > interface.
> > > >
> > > > - Treat adding external resource info to RestAPI/WebUI as a future
> > work.
> > > >
> > > > If you have any new concerns after that change, please mentioned here.
> > > > Sorry for disturbing you.
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com>
> > wrote:
> > > > >
> > > > > Thanks Yangze for the efforts to support GPU extended resources.
> > > > >
> > > > > +1 for this FLIP
> > > > >
> > > > >
> > > > > Best,
> > > > > Yang
> > > > >
> > > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> > > > >
> > > > > > Thanks for driving this effort Yangze.
> > > > > >
> > > > > > +1
> > > > > >
> > > > > > Cheers,
> > > > > > Till
> > > > > >
> > > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> > felixzhengcb@gmail.com
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > > +1 (non-binding) from my side.
> > > > > > >
> > > > > > >
> > > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> > > > > > >
> > > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > > +1 (non-binding) from my side.
> > > > > > > >
> > > > > > > > Thank you~
> > > > > > > >
> > > > > > > > Xintong Song
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi everyone,
> > > > > > > > >
> > > > > > > > > I'd like to start the vote of FLIP-108 [1], which adds GPU
> > > support in
> > > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > > >
> > > > > > > > > The vote will be open for at least 72 hours. Unless there is
> > an
> > > > > > > > objection,
> > > > > > > > > I will try to close it by April 4, 2020 10:00 UTC if we have
> > > received
> > > > > > > > > sufficient votes.
> > > > > > > > >
> > > > > > > > > [1]
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > > [2]
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > > >
> > > > > > > > > Best,
> > > > > > > > > Yangze Guo
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > >
> >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Xintong Song <to...@gmail.com>.
Sorry to pull this back. I have some concerns about the recent updated
interface details.

- Should we have a factory interface for `ExternalResourceDriver`, that
takes the configuration and returns a driver instance? Otherwise, if we are
creating the driver instance with reflection, we kind of implicitly
requires the driver to have a public non-argument constructor. If we
decided to go with this approach, then we will not need
`ExternalResourceDriver#open`.
- Not sure about the necessity of `ExternalResourceDriver#close`. I would
suggest to avoid introduce more interfaces if not absolutely necessary.
- `ExternalResourceDriver#retrieveResourceInfo` should not take
`ResourceProfile` as argument. This exposes more information than it needs.
In addition, it requires the runtime/core to understand how to properly
wrap the external resource into `ResourceProfile`. E.g.,
`ResourceProfile#extendedResources` takes `Resource`, which is an abstract
class. Runtime/core has to known which implementation of `Resource` to use.
- Do we really need `ExternalResourceInfo#getInformation`? I think it
should be good enough to make `ExternalResourceInfo` an empty interface.
User can define their own `ExternalResourceInfo` implementation and how it
is used by the operator user codes.

Thank you~

Xintong Song



On Thu, Apr 9, 2020 at 2:25 PM Becket Qin <be...@gmail.com> wrote:

> +1
>
> Thanks for driving this effort, Ynagze. The latest FLIP wiki looks good to
> me.
>
> Cheers,
>
> Jiangjie (Becket) Qin
>
> On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com> wrote:
>
> > Edit: RuntimeContext interface
> > From: Map<String, Set<ExternalResourceInfo>>
> > getExternalResourceInfo(ResourceSpec resourceSpec);
> > To: Map<String, Set<ExternalResourceInfo>> getExternalResourceInfo();
> >
> > Best,
> > Yangze Guo
> >
> >
> > On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com> wrote:
> > >
> > > Hi, there
> > >
> > > I have updated the FLIP, mainly target to make it more detailed and
> > > clear. The general design is not changed, but there are still some
> > > changes need to be notified here:
> > >
> > > - Change the `ExternalResourceDriver` from abstract class to
> > > interface, since it does not have an abstract implementation. Add life
> > > cycle method `open` and `close`.
> > >
> > > - Specify the method added to the RuntimeContext from which user get
> > > the information of external resources.
> > >         Map<String, Set<ExternalResourceInfo>>
> > > getExternalResourceInfo(ResourceSpec resourceSpec);
> > >
> > > - Add "String getInformation()" method to `ExternalResourceInfo`
> > interface.
> > >
> > > - Treat adding external resource info to RestAPI/WebUI as a future
> work.
> > >
> > > If you have any new concerns after that change, please mentioned here.
> > > Sorry for disturbing you.
> > >
> > > Best,
> > > Yangze Guo
> > >
> > > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com>
> wrote:
> > > >
> > > > Thanks Yangze for the efforts to support GPU extended resources.
> > > >
> > > > +1 for this FLIP
> > > >
> > > >
> > > > Best,
> > > > Yang
> > > >
> > > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> > > >
> > > > > Thanks for driving this effort Yangze.
> > > > >
> > > > > +1
> > > > >
> > > > > Cheers,
> > > > > Till
> > > > >
> > > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <
> felixzhengcb@gmail.com
> > >
> > > > > wrote:
> > > > >
> > > > > > Thanks Yangze for driving the initial CPU support!
> > > > > > +1 (non-binding) from my side.
> > > > > >
> > > > > >
> > > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> > > > > >
> > > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > > +1 (non-binding) from my side.
> > > > > > >
> > > > > > > Thank you~
> > > > > > >
> > > > > > > Xintong Song
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com>
> > wrote:
> > > > > > >
> > > > > > > > Hi everyone,
> > > > > > > >
> > > > > > > > I'd like to start the vote of FLIP-108 [1], which adds GPU
> > support in
> > > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > > >
> > > > > > > > The vote will be open for at least 72 hours. Unless there is
> an
> > > > > > > objection,
> > > > > > > > I will try to close it by April 4, 2020 10:00 UTC if we have
> > received
> > > > > > > > sufficient votes.
> > > > > > > >
> > > > > > > > [1]
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > > [2]
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > > >
> > > > > > > > Best,
> > > > > > > > Yangze Guo
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Becket Qin <be...@gmail.com>.
+1

Thanks for driving this effort, Ynagze. The latest FLIP wiki looks good to
me.

Cheers,

Jiangjie (Becket) Qin

On Wed, Apr 8, 2020 at 4:10 PM Yangze Guo <ka...@gmail.com> wrote:

> Edit: RuntimeContext interface
> From: Map<String, Set<ExternalResourceInfo>>
> getExternalResourceInfo(ResourceSpec resourceSpec);
> To: Map<String, Set<ExternalResourceInfo>> getExternalResourceInfo();
>
> Best,
> Yangze Guo
>
>
> On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com> wrote:
> >
> > Hi, there
> >
> > I have updated the FLIP, mainly target to make it more detailed and
> > clear. The general design is not changed, but there are still some
> > changes need to be notified here:
> >
> > - Change the `ExternalResourceDriver` from abstract class to
> > interface, since it does not have an abstract implementation. Add life
> > cycle method `open` and `close`.
> >
> > - Specify the method added to the RuntimeContext from which user get
> > the information of external resources.
> >         Map<String, Set<ExternalResourceInfo>>
> > getExternalResourceInfo(ResourceSpec resourceSpec);
> >
> > - Add "String getInformation()" method to `ExternalResourceInfo`
> interface.
> >
> > - Treat adding external resource info to RestAPI/WebUI as a future work.
> >
> > If you have any new concerns after that change, please mentioned here.
> > Sorry for disturbing you.
> >
> > Best,
> > Yangze Guo
> >
> > On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com> wrote:
> > >
> > > Thanks Yangze for the efforts to support GPU extended resources.
> > >
> > > +1 for this FLIP
> > >
> > >
> > > Best,
> > > Yang
> > >
> > > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> > >
> > > > Thanks for driving this effort Yangze.
> > > >
> > > > +1
> > > >
> > > > Cheers,
> > > > Till
> > > >
> > > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <felixzhengcb@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Thanks Yangze for driving the initial CPU support!
> > > > > +1 (non-binding) from my side.
> > > > >
> > > > >
> > > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> > > > >
> > > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > > +1 (non-binding) from my side.
> > > > > >
> > > > > > Thank you~
> > > > > >
> > > > > > Xintong Song
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com>
> wrote:
> > > > > >
> > > > > > > Hi everyone,
> > > > > > >
> > > > > > > I'd like to start the vote of FLIP-108 [1], which adds GPU
> support in
> > > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > > >
> > > > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > > objection,
> > > > > > > I will try to close it by April 4, 2020 10:00 UTC if we have
> received
> > > > > > > sufficient votes.
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > >
> > > > >
> > > >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > > [2]
> > > > > > >
> > > > > >
> > > > >
> > > >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > > >
> > > > > > > Best,
> > > > > > > Yangze Guo
> > > > > > >
> > > > > >
> > > > >
> > > >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Edit: RuntimeContext interface
From: Map<String, Set<ExternalResourceInfo>>
getExternalResourceInfo(ResourceSpec resourceSpec);
To: Map<String, Set<ExternalResourceInfo>> getExternalResourceInfo();

Best,
Yangze Guo


On Wed, Apr 8, 2020 at 11:36 AM Yangze Guo <ka...@gmail.com> wrote:
>
> Hi, there
>
> I have updated the FLIP, mainly target to make it more detailed and
> clear. The general design is not changed, but there are still some
> changes need to be notified here:
>
> - Change the `ExternalResourceDriver` from abstract class to
> interface, since it does not have an abstract implementation. Add life
> cycle method `open` and `close`.
>
> - Specify the method added to the RuntimeContext from which user get
> the information of external resources.
>         Map<String, Set<ExternalResourceInfo>>
> getExternalResourceInfo(ResourceSpec resourceSpec);
>
> - Add "String getInformation()" method to `ExternalResourceInfo` interface.
>
> - Treat adding external resource info to RestAPI/WebUI as a future work.
>
> If you have any new concerns after that change, please mentioned here.
> Sorry for disturbing you.
>
> Best,
> Yangze Guo
>
> On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com> wrote:
> >
> > Thanks Yangze for the efforts to support GPU extended resources.
> >
> > +1 for this FLIP
> >
> >
> > Best,
> > Yang
> >
> > Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
> >
> > > Thanks for driving this effort Yangze.
> > >
> > > +1
> > >
> > > Cheers,
> > > Till
> > >
> > > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <fe...@gmail.com>
> > > wrote:
> > >
> > > > Thanks Yangze for driving the initial CPU support!
> > > > +1 (non-binding) from my side.
> > > >
> > > >
> > > > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> > > >
> > > > > Thanks Yangze, the FLIP looks good to me.
> > > > > +1 (non-binding) from my side.
> > > > >
> > > > > Thank you~
> > > > >
> > > > > Xintong Song
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com> wrote:
> > > > >
> > > > > > Hi everyone,
> > > > > >
> > > > > > I'd like to start the vote of FLIP-108 [1], which adds GPU support in
> > > > > > Flink. This FLIP is discussed in the thread[2].
> > > > > >
> > > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > > objection,
> > > > > > I will try to close it by April 4, 2020 10:00 UTC if we have received
> > > > > > sufficient votes.
> > > > > >
> > > > > > [1]
> > > > > >
> > > > >
> > > >
> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > > [2]
> > > > > >
> > > > >
> > > >
> > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > > >
> > > > > > Best,
> > > > > > Yangze Guo
> > > > > >
> > > > >
> > > >
> > >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yangze Guo <ka...@gmail.com>.
Hi, there

I have updated the FLIP, mainly target to make it more detailed and
clear. The general design is not changed, but there are still some
changes need to be notified here:

- Change the `ExternalResourceDriver` from abstract class to
interface, since it does not have an abstract implementation. Add life
cycle method `open` and `close`.

- Specify the method added to the RuntimeContext from which user get
the information of external resources.
        Map<String, Set<ExternalResourceInfo>>
getExternalResourceInfo(ResourceSpec resourceSpec);

- Add "String getInformation()" method to `ExternalResourceInfo` interface.

- Treat adding external resource info to RestAPI/WebUI as a future work.

If you have any new concerns after that change, please mentioned here.
Sorry for disturbing you.

Best,
Yangze Guo

On Wed, Apr 8, 2020 at 9:55 AM Yang Wang <da...@gmail.com> wrote:
>
> Thanks Yangze for the efforts to support GPU extended resources.
>
> +1 for this FLIP
>
>
> Best,
> Yang
>
> Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:
>
> > Thanks for driving this effort Yangze.
> >
> > +1
> >
> > Cheers,
> > Till
> >
> > On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <fe...@gmail.com>
> > wrote:
> >
> > > Thanks Yangze for driving the initial CPU support!
> > > +1 (non-binding) from my side.
> > >
> > >
> > > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> > >
> > > > Thanks Yangze, the FLIP looks good to me.
> > > > +1 (non-binding) from my side.
> > > >
> > > > Thank you~
> > > >
> > > > Xintong Song
> > > >
> > > >
> > > >
> > > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com> wrote:
> > > >
> > > > > Hi everyone,
> > > > >
> > > > > I'd like to start the vote of FLIP-108 [1], which adds GPU support in
> > > > > Flink. This FLIP is discussed in the thread[2].
> > > > >
> > > > > The vote will be open for at least 72 hours. Unless there is an
> > > > objection,
> > > > > I will try to close it by April 4, 2020 10:00 UTC if we have received
> > > > > sufficient votes.
> > > > >
> > > > > [1]
> > > > >
> > > >
> > >
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > > [2]
> > > > >
> > > >
> > >
> > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > > >
> > > > > Best,
> > > > > Yangze Guo
> > > > >
> > > >
> > >
> >

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Yang Wang <da...@gmail.com>.
Thanks Yangze for the efforts to support GPU extended resources.

+1 for this FLIP


Best,
Yang

Till Rohrmann <tr...@apache.org> 于2020年4月2日周四 下午11:10写道:

> Thanks for driving this effort Yangze.
>
> +1
>
> Cheers,
> Till
>
> On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <fe...@gmail.com>
> wrote:
>
> > Thanks Yangze for driving the initial CPU support!
> > +1 (non-binding) from my side.
> >
> >
> > Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
> >
> > > Thanks Yangze, the FLIP looks good to me.
> > > +1 (non-binding) from my side.
> > >
> > > Thank you~
> > >
> > > Xintong Song
> > >
> > >
> > >
> > > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com> wrote:
> > >
> > > > Hi everyone,
> > > >
> > > > I'd like to start the vote of FLIP-108 [1], which adds GPU support in
> > > > Flink. This FLIP is discussed in the thread[2].
> > > >
> > > > The vote will be open for at least 72 hours. Unless there is an
> > > objection,
> > > > I will try to close it by April 4, 2020 10:00 UTC if we have received
> > > > sufficient votes.
> > > >
> > > > [1]
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > > [2]
> > > >
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > > >
> > > > Best,
> > > > Yangze Guo
> > > >
> > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Till Rohrmann <tr...@apache.org>.
Thanks for driving this effort Yangze.

+1

Cheers,
Till

On Wed, Apr 1, 2020 at 12:41 PM Canbin Zheng <fe...@gmail.com> wrote:

> Thanks Yangze for driving the initial CPU support!
> +1 (non-binding) from my side.
>
>
> Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:
>
> > Thanks Yangze, the FLIP looks good to me.
> > +1 (non-binding) from my side.
> >
> > Thank you~
> >
> > Xintong Song
> >
> >
> >
> > On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com> wrote:
> >
> > > Hi everyone,
> > >
> > > I'd like to start the vote of FLIP-108 [1], which adds GPU support in
> > > Flink. This FLIP is discussed in the thread[2].
> > >
> > > The vote will be open for at least 72 hours. Unless there is an
> > objection,
> > > I will try to close it by April 4, 2020 10:00 UTC if we have received
> > > sufficient votes.
> > >
> > > [1]
> > >
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > > [2]
> > >
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> > >
> > > Best,
> > > Yangze Guo
> > >
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Canbin Zheng <fe...@gmail.com>.
Thanks Yangze for driving the initial CPU support!
+1 (non-binding) from my side.


Xintong Song <to...@gmail.com> 于2020年4月1日周三 下午6:36写道:

> Thanks Yangze, the FLIP looks good to me.
> +1 (non-binding) from my side.
>
> Thank you~
>
> Xintong Song
>
>
>
> On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > I'd like to start the vote of FLIP-108 [1], which adds GPU support in
> > Flink. This FLIP is discussed in the thread[2].
> >
> > The vote will be open for at least 72 hours. Unless there is an
> objection,
> > I will try to close it by April 4, 2020 10:00 UTC if we have received
> > sufficient votes.
> >
> > [1]
> >
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> > [2]
> >
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
> >
> > Best,
> > Yangze Guo
> >
>

Re: [VOTE] FLIP-108: Add GPU support in Flink

Posted by Xintong Song <to...@gmail.com>.
Thanks Yangze, the FLIP looks good to me.
+1 (non-binding) from my side.

Thank you~

Xintong Song



On Wed, Apr 1, 2020 at 5:22 PM Yangze Guo <ka...@gmail.com> wrote:

> Hi everyone,
>
> I'd like to start the vote of FLIP-108 [1], which adds GPU support in
> Flink. This FLIP is discussed in the thread[2].
>
> The vote will be open for at least 72 hours. Unless there is an objection,
> I will try to close it by April 4, 2020 10:00 UTC if we have received
> sufficient votes.
>
> [1]
> https://cwiki.apache.org/confluence/display/FLINK/FLIP-108%3A+Add+GPU+support+in+Flink
> [2]
> http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-FLIP-108-Add-GPU-support-in-Flink-td38286.html
>
> Best,
> Yangze Guo
>