Posted to dev@apex.apache.org by Pramod Immaneni <pr...@datatorrent.com> on 2016/06/07 21:27:30 UTC

Re: A proposal for Malhar

I wanted to close the loop on this discussion. In general, everyone seemed
to be in favor of this idea, with no serious objections. Folks had good
suggestions, such as documenting the capabilities of operators, coming up
with well-defined criteria for graduating operators, and deciding what to
do with existing operators that are not yet mature or are unused.

I am going to summarize the key points that resulted from the discussion
and would like to proceed with them.

   - Operators that do not yet provide the key platform capabilities that
   make an operator useful across different applications, such as
   reusability, static or dynamic partitioning, idempotency, and
   exactly-once processing, will still be accepted as long as they are
   functionally correct and have unit tests. They will go into a separate
   module.
   - The contrib module was suggested as the place for new contributions
   that don't yet have all the platform capabilities and are not yet mature.
   If there are no other suggestions, we will go with this one.
   - It was suggested that each operator's documentation list which of the
   platform capabilities above it currently provides. I will document a
   structure for this in the contribution guidelines.
   - Folks wanted to know what the criteria would be to graduate an
   operator to the big leagues :). I will kick off a separate thread for it,
   as I think it requires its own discussion, and hopefully we can come up
   with a set of guidelines for it.
   - David brought up the state of some of the existing operators, their
   retirement, and the layout of operators in Malhar in general and how it
   causes problems with development. I will ask him to lead the discussion
   on that.

Thanks

On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:

> The two ideas are not conflicting, but rather complementary.
>
> On the contrary, putting a new process in place for people trying to
> contribute while NOT addressing the old, unused, subpar operators in the
> repository is what is conflicting.
>
> Keep in mind that when people try to contribute, they always look at the
> existing operators already in the repository as examples and likely a model
> for their new operators.
>
> David
>
>
> On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com> wrote:
>
> > Yes, there are two conflicting threads now. The original thread was to
> > open up a way for contributors to submit code in a dir (contrib?) as
> > long as the license part is taken care of.
> >
> > On the thread of removing non-used operators -> How do we know what is
> > being used?
> >
> > Thks,
> > Amol
> >
> >
> > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sa...@datatorrent.com>
> > wrote:
> >
> > > +1 for removing the not-used operators.
> > >
> > > So we are creating a process for operator writers who don't want to
> > > understand the platform, yet want to contribute? How big is that set?
> > > If we tell the app user that here is code which has not passed the
> > > whole checklist, will they be ready to use it in production?
> > >
> > > This thread has 2 conflicting forces: reduce the operators, and make
> > > it easy to add more operators.
> > >
> > >
> > >
> > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
> > > wrote:
> > >
> > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
> > > > wrote:
> > > >
> > > > > Pramod,
> > > > >
> > > > > By that logic I would say let's put all partitionable operators
> > > > > into one folder, non-partitionable operators in another and so on...
> > > > >
> > > >
> > > > Remember the original goal of making it easier for new members to
> > > > contribute and managing those contributions to maturity. It is not a
> > > > functional level separation.
> > > >
> > > >
> > > > > When I look at hadoop code I see these annotations being used at
> > > > > class level and not at package/folder level.
> > > >
> > > >
> > > > I had a typo in my email, I meant to say "think of this like a
> > > > folder..." as an analogy and not literally.
> > > >
> > > > Thanks
> > > >
> > > >
> > > > > Thanks
> > > > >
> > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
> > > > > wrote:
> > > > >
> > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Can same goal not be achieved by using
> > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > > > annotation?
> > > > > > >
> > > > > >
> > > > > > I think it is important to localize the additions in one place so
> > > > > > that it becomes clearer to users about the maturity level of these,
> > > > > > easier for developers to track them towards the path to maturity and
> > > > > > also provides a clearer directive for committers and contributors on
> > > > > > acceptance of new submissions. Relying on the annotations alone makes
> > > > > > them spread all over the place and adds an additional layer of
> > > > > > difficulty in identification not just for users but also for
> > > > > > developers who want to find such operators and improve them. This of
> > > > > > this like a folder level annotation where everything under this
> > > > > > folder is unstable or evolving.
> > > > > >
> > > > > > Thanks
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Malhar in its current state, has way too many operators that fall
> > > > > > > > > > > in the "non-production quality" category. We should make it
> > > > > > > > > > > obvious to users that which operators are up to par, and which
> > > > > > > > > > > operators are not, and maybe even remove those that are likely
> > > > > > > > > > > not ever used in a real use case.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I am ambivalent about revisiting older operators and doing this
> > > > > > > > > > exercise as this can cause unnecessary tensions. My original
> > > > > > > > > > intent is for contributions going forward.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > IMO it is important to address this as well. Operators outside the
> > > > > > > > > play area should be of well known quality.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > I think this is important, and I don't anticipate much tension if
> > > > > > > > we establish clear criteria.
> > > > > > > > It's not helpful if we let the old subpar operators stay and put up
> > > > > > > > the bars for new operators.
> > > > > > > >
> > > > > > > > David
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: A proposal for Malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
I would suggest we go through the operators in those packages individually
and grade them into 3 buckets: those that meet the level we expect from
operators (there could be few of them), those that are potentially useful
but need additional work, and those that we don't think would be useful.
The ones in the first bucket can remain in place, the second set can be
moved to misc, and the third set can be moved to misc and deprecated.
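
As a rough illustration of the third bucket (this is not from the thread;
all class and enum names below are hypothetical), deprecation in Java is
usually expressed with the standard @Deprecated annotation together with a
@deprecated javadoc tag. A minimal sketch, assuming plain Java with no Apex
dependencies:

```java
public class RetirementSketch {

    /** Hypothetical names for the three grading buckets described above. */
    enum Bucket { MEETS_EXPECTATIONS, NEEDS_WORK, NOT_USEFUL }

    /** Hypothetical operator that meets expectations and stays where it is. */
    static class MatureOperator { }

    /**
     * Hypothetical operator from the third bucket.
     *
     * @deprecated No longer supported; retained only for existing users.
     */
    @Deprecated
    static class LegacyOperator { }

    /** Maps a bucket to the action proposed for it. */
    static String actionFor(Bucket bucket) {
        switch (bucket) {
            case MEETS_EXPECTATIONS:
                return "remain in place";
            case NEEDS_WORK:
                return "move to misc";
            default:
                return "move to misc and deprecate";
        }
    }

    /** True if the class carries the standard @Deprecated annotation. */
    static boolean isDeprecated(Class<?> type) {
        return type.isAnnotationPresent(Deprecated.class);
    }

    public static void main(String[] args) {
        for (Bucket bucket : Bucket.values()) {
            System.out.println(bucket + " -> " + actionFor(bucket));
        }
        System.out.println("LegacyOperator deprecated? "
                + isDeprecated(LegacyOperator.class));
        System.out.println("MatureOperator deprecated? "
                + isDeprecated(MatureOperator.class));
    }
}
```

Because @Deprecated is retained at runtime, a check like isDeprecated could
also be used by tooling to list retired operators, though whether that is
worth doing is a separate question.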

Thanks

On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason we would like to retire operators in Malhar
> is that some of them were written a long time ago, before Apache
> incubation, and they do not pertain to real use cases, are not up to par
> in code quality, have no potential for improvement, and are probably
> completely unused by anybody.
>
> We do not want contributors to use them as a model for their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Neither scenario is beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
> supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David
>

Re: A proposal for Malhar

Posted by Jaikit Jilka <jj...@leadferret.com>.
Hello,

I think there is some problem with reading the properties file. I tried adding the same properties manually in the console before launching the application. It ran, but the input operator threw java.sql.SQLFeatureNotSupportedException. But when I tried running the application again without adding the properties manually in the console, it failed. What might be the reason for this behavior?

Thank You,

Jaikit Jilka

----- Original Message -----
From: "Lakshmi Velineni" <la...@datatorrent.com>
To: "users" <us...@apex.apache.org>
Sent: Thursday, July 28, 2016 11:03:00 AM
Subject: Re: A proposal for Malhar

Hi,

I created a shared Google Sheet to track the various details of the
operators. Currently, the sheet contains information about operators under
lib/algo only. The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
I will update the sheet soon with lib/math too.

Thanks
Lakshmi Prasanna

On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:

> Hi Lakshmi,
>
> Thanks for volunteering.
>
> I think Pramod's suggestion of putting the operators into 3 buckets and
> Siyuan's suggestion of starting a shared Google Sheet that tracks
> individual operators are both good, with the exception that lib/streamquery
> is one unit and we probably do not need to look at individual operators
> under it.
>
> If we don't have any objection in the community, let's start the process.
>
> David
>
> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <lakshmi@datatorrent.com>
> wrote:
>
>> I am interested to work on this.
>>
>> Regards,
>> Lakshmi prasanna
>>
>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>> wrote:
>>
>> > Why not have a shared Google Sheet with a list of operators and what we
>> > want to do with each of them?
>> > I think it's case by case.
>> > But retiring unused or obsolete operators is important, and we should do
>> > it sooner rather than later.
>> >
>> > Regards,
>> > Siyuan
>> >
>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>> wrote:
>> >
>> >>
>> >> My vote is to do 2&3
>> >>
>> >> Thks
>> >> Amol
>> >>
>> >>
>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >> VKottapalli@directv.com> wrote:
>> >>
>> >>> +1 for deprecating the packages listed below.
>> >>>
>> >>> -----Original Message-----
>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>
>> >>> +1
>> >>>
>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>> > Hi all,
>> >>> >
>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>> >>> >
>> >>> > As stated before, the reason why we would like to retire operators in
>> >>> > Malhar is because some of them were written a long time ago before
>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>> >>> > up to par in code quality, have no potential for improvement, and
>> >>> > probably completely unused by anybody.
>> >>> >
>> >>> > We do not want contributors to use them as a model of their
>> >>> > contribution, or users to use them thinking they are of quality, and
>> >>> > then hit a wall.
>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>> >>> >
>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>> >>> > *lib/math*, and *lib/streamquery*.
>> >>> >
>> >>> > I'm adding this thread to the users list. Please speak up if you are
>> >>> > using any operator in these 3 packages. We would like to hear from you.
>> >>> >
>> >>> > These are the options I can think of for retiring those operators:
>> >>> >
>> >>> > 1) Completely remove them from the malhar repository.
>> >>> > 2) Move them from malhar-library into a separate artifact called
>> >>> > malhar-misc
>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>> >>> > longer supported
>> >>> >
>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>> >
>> >>> > David
>> >>> >

Re: A proposal for Malhar

Posted by Timothy Farkas <ti...@gmail.com>.
+1 for options 2 and 3

On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
VKottapalli@directv.com> wrote:

> +1 for deprecating the packages listed below.
>
> -----Original Message-----
> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> Sent: Tuesday, July 12, 2016 12:01 PM
>
> +1
>
> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:
>
> > Hi all,
> >
> > I would like to renew the discussion of retiring operators in Malhar.
> >
> > As stated before, the reason why we would like to retire operators in
> > Malhar is because some of them were written a long time ago before
> > Apache incubation, and they do not pertain to real use cases, are not
> > up to par in code quality, have no potential for improvement, and
> > probably completely unused by anybody.
> >
> > We do not want contributors to use them as a model of their
> > contribution, or users to use them thinking they are of quality, and
> > then hit a wall. Both scenarios are not beneficial to the reputation
> > of Apex.
> >
> > The initial 3 packages that we would like to target are *lib/algo*,
> > *lib/math*, and *lib/streamquery*.
> >
> > I'm adding this thread to the users list. Please speak up if you are
> > using any operator in these 3 packages. We would like to hear from you.
> >
> > These are the options I can think of for retiring those operators:
> >
> > 1) Completely remove them from the malhar repository.
> > 2) Move them from malhar-library into a separate artifact called
> > malhar-misc
> > 3) Mark them deprecated and add to their javadoc that they are no
> > longer supported
> >
> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >
> > David
> >
> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> > <pr...@datatorrent.com>
> > wrote:
> >
> >> I wanted to close the loop on this discussion. In general everyone
> >> seemed to be favorable to this idea with no serious objections. Folks
> >> had good suggestions like documenting capabilities of operators, come
> >> up well defined criteria for graduation of operators and what those
> >> criteria may be and what to do with existing operators that may not
> >> yet be mature or unused.
> >>
> >> I am going to summarize the key points that resulted from the
> >> discussion and would like to proceed with them.
> >>
> >>    - Operators that do not yet provide the key platform capabilities to
> >>    make an operator useful across different applications such as reusability,
> >>    partitioning static or dynamic, idempotency, exactly once will still be
> >>    accepted as long as they are functionally correct, have unit tests and will
> >>    go into a separate module.
> >>    - Contrib module was suggested as a place where new contributions go in
> >>    that don't yet have all the platform capabilities and are not yet mature.
> >>    If there are no other suggestions we will go with this one.
> >>    - It was suggested the operators documentation list those platform
> >>    capabilities it currently provides from the list above. I will document a
> >>    structure for this in the contribution guidelines.
> >>    - Folks wanted to know what would be the criteria to graduate an
> >>    operator to the big leagues :). I will kick-off a separate thread for it as
> >>    I think it requires its own discussion and hopefully we can come up with a
> >>    set of guidelines for it.
> >>    - David brought up state of some of the existing operators and their
> >>    retirement and the layout of operators in Malhar in general and how it
> >>    causes problems with development. I will ask him to lead the discussion on
> >>    that.
> >>
> >> Thanks
> >>
> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
> >>
> >> > The two ideas are not conflicting, but rather complementing.
> >> >
> >> > On the contrary, putting a new process for people trying to
> >> > contribute while NOT addressing the old unused subpar operators in
> >> > the repository is what is conflicting.
> >> >
> >> > Keep in mind that when people try to contribute, they always look
> >> > at the existing operators already in the repository as examples and
> >> > likely a model for their new operators.
> >> >
> >> > David
> >> >
> >> >
> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
> >> wrote:
> >> >
> >> > > Yes there are two conflicting threads now. The original thread
> >> > > was to open up a way for contributors to submit code in a dir
> >> > > (contrib?) as long as license part is taken care of.
> >> > >
> >> > > On the thread of removing non-used operators -> How do we know
> >> > > what is being used?
> >> > >
> >> > > Thks,
> >> > > Amol
> >> > >
> >> > >
> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >> sandesh@datatorrent.com>
> >> > > wrote:
> >> > >
> >> > > > +1 for removing the not-used operators.
> >> > > >
> >> > > > So we are creating a process for operator writers who don't
> >> > > > want to understand the platform, yet want to contribute? How
> >> > > > big is that set?
> >> > > > If we tell the app-user, here is the code which has not passed
> >> > > > all the checklist, will they be ready to use that in production?
> >> > > >
> >> > > > This thread has 2 conflicting forces, reduce the operators and
> >> > > > make it easy to add more operators.
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> >> > pramod@datatorrent.com>
> >> > > > wrote:
> >> > > >
> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> >> > > gaurav.gopi123@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Pramod,
> >> > > > > >
> >> > > > > > By that logic I would say let's put all partitionable
> >> > > > > > operators
> >> > into
> >> > > > one
> >> > > > > > folder, non-partitionable operators in another and so on...
> >> > > > > >
> >> > > > >
> >> > > > > Remember the original goal of making it easier for new
> >> > > > > members to contribute and managing those contributions to
> >> > > > > maturity. It is
> >> not a
> >> > > > > functional level separation.
> >> > > > >
> >> > > > >
> >> > > > > > When I look at hadoop code I see these annotations being
> >> > > > > > used at
> >> > > class
> >> > > > > > level and not at package/folder level.
> >> > > > >
> >> > > > >
> >> > > > > I had a typo in my email, I meant to say "think of this like
> >> > > > > a
> >> > > folder..."
> >> > > > > as an analogy and not literally.
> >> > > > >
> >> > > > > Thanks
> >> > > > >
> >> > > > >
> >> > > > > > Thanks
> >> > > > > >
> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> >> > > > pramod@datatorrent.com
> >> > > > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> >> > > > > gaurav.gopi123@gmail.com>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Can same goal not be achieved by using
> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
> >> > > > /
> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Uns
> >> > > > > > > > table
> >> > > > > > annotation?
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > > I think it is important to localize the additions in one
> >> place so
> >> > > > that
> >> > > > > it
> >> > > > > > > becomes clearer to users about the maturity level of
> >> > > > > > > these,
> >> > easier
> >> > > > for
> >> > > > > > > developers to track them towards the path to maturity and
> >> > > > > > > also
> >> > > > > provides a
> >> > > > > > > clearer directive for committers and contributors on
> >> acceptance
> >> > of
> >> > > > new
> >> > > > > > > submissions. Relying on the annotations alone makes them
> >> spread
> >> > all
> >> > > > > over
> >> > > > > > > the place and adds an additional layer of difficulty in
> >> > > > identification
> >> > > > > > not
> >> > > > > > > just for users but also for developers who want to find
> >> > > > > > > such
> >> > > > operators
> >> > > > > > and
> >> > > > > > > improve them. This of this like a folder level annotation
> >> where
> >> > > > > > everything
> >> > > > > > > under this folder is unstable or evolving.
> >> > > > > > >
> >> > > > > > > Thanks
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> >> > > david@datatorrent.com
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Malhar in its current state, has way too many
> >> > > > > > > > > > > > operators that fall in the "non-production quality"
> >> > > > > > > > > > > > category. We should make it obvious to users that
> >> > > > > > > > > > > > which operators are up to par, and which operators
> >> > > > > > > > > > > > are not, and maybe even remove those that are
> >> > > > > > > > > > > > likely not ever used in a real use case.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > I am ambivalent about revisiting older operators
> >> > > > > > > > > > > and doing this exercise as this can cause
> >> > > > > > > > > > > unnecessary tensions. My original intent is for
> >> > > > > > > > > > > contributions going forward.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > IMO it is important to address this as well.
> >> > > > > > > > > > Operators outside the play area should be of well
> >> > > > > > > > > > known quality.
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > I think this is important, and I don't anticipate
> >> > > > > > > > > much tension if we establish clear criteria.
> >> > > > > > > > > It's not helpful if we let the old subpar operators
> >> > > > > > > > > stay and put up the bars for new operators.
> >> > > > > > > > >
> >> > > > > > > > > David
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

Re: A proposal for Malhar

Posted by Chinmay Kolhatkar <ch...@datatorrent.com>.
+1. This is a really good starting point to clean up Malhar.

On Wed, Jul 13, 2016 at 3:06 AM, David Yan <da...@datatorrent.com> wrote:

> Hi Lakshmi,
>
> Thanks for volunteering.
>
> I think Pramod's suggestion of putting the operators into 3 buckets and
> Siyuan's suggestion of starting a shared Google Sheet that tracks
> individual operators are both good, with the exception that lib/streamquery
> is one unit and we probably do not need to look at individual operators
> under it.
>
> If we don't have any objection in the community, let's start the process.
>
> David
>
> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> I am interested to work on this.
>>
>> Regards,
>> Lakshmi prasanna
>>
>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>> wrote:
>>
>> > Why not have a shared google sheet with a list of operators and options
>> > that we want to do with it.
>> > I think it's case by case.
>> > But retire unused or obsolete operators is important and we should do it
>> > sooner rather than later.
>> >
>> > Regards,
>> > Siyuan
>> >
>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>> wrote:
>> >
>> >>
>> >> My vote is to do 2&3
>> >>
>> >> Thks
>> >> Amol
>> >>
>> >>
>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >> VKottapalli@directv.com> wrote:
>> >>
>> >>> +1 for deprecating the packages listed below.
>> >>>
>> >>> -----Original Message-----
>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>
>> >>> +1
>> >>>
>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>> > Hi all,
>> >>> >
>> >>> > I would like to renew the discussion of retiring operators in
>> Malhar.
>> >>> >
>> >>> > As stated before, the reason why we would like to retire operators
>> in
>> >>> > Malhar is because some of them were written a long time ago before
>> >>> > Apache incubation, and they do not pertain to real use cases, are
>> not
>> >>> > up to par in code quality, have no potential for improvement, and
>> >>> > probably completely unused by anybody.
>> >>> >
>> >>> > We do not want contributors to use them as a model of their
>> >>> > contribution, or users to use them thinking they are of quality, and
>> >>> > then hit a wall. Both scenarios are not beneficial to the reputation
>> >>> > of Apex.
>> >>> >
>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>
>> >>> >
>> >>> > I'm adding this thread to the users list. Please speak up if you are
>> >>> > using any operator in these 3 packages. We would like to hear from
>> you.
>> >>> >
>> >>> > These are the options I can think of for retiring those operators:
>> >>> >
>> >>> > 1) Completely remove them from the malhar repository.
>> >>> > 2) Move them from malhar-library into a separate artifact called
>> >>> > malhar-misc
>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>> >>> > longer supported
>> >>> >
>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>> >
>> >>> > David
>> >>> >
>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> >>> > <pr...@datatorrent.com>
>> >>> > wrote:
>> >>> >
>> >>> >> I wanted to close the loop on this discussion. In general everyone
>> >>> >> seemed to be favorable to this idea with no serious objections.
>> Folks
>> >>> >> had good suggestions like documenting capabilities of operators,
>> come
>> >>> >> up well defined criteria for graduation of operators and what those
>> >>> >> criteria may be and what to do with existing operators that may not
>> >>> >> yet be mature or unused.
>> >>> >>
>> >>> >> I am going to summarize the key points that resulted from the
>> >>> >> discussion and would like to proceed with them.
>> >>> >>
>> >>> >>    - Operators that do not yet provide the key platform capabilities to
>> >>> >>    make an operator useful across different applications such as reusability,
>> >>> >>    partitioning static or dynamic, idempotency, exactly once will still be
>> >>> >>    accepted as long as they are functionally correct, have unit tests and will
>> >>> >>    go into a separate module.
>> >>> >>    - Contrib module was suggested as a place where new contributions go in
>> >>> >>    that don't yet have all the platform capabilities and are not yet mature.
>> >>> >>    If there are no other suggestions we will go with this one.
>> >>> >>    - It was suggested the operators documentation list those platform
>> >>> >>    capabilities it currently provides from the list above. I will document a
>> >>> >>    structure for this in the contribution guidelines.
>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
>> >>> >>    operator to the big leagues :). I will kick-off a separate thread for it as
>> >>> >>    I think it requires its own discussion and hopefully we can come up with a
>> >>> >>    set of guidelines for it.
>> >>> >>    - David brought up state of some of the existing operators and their
>> >>> >>    retirement and the layout of operators in Malhar in general and how it
>> >>> >>    causes problems with development. I will ask him to lead the discussion on
>> >>> >>    that.
>> >>> >>
>> >>> >> Thanks
>> >>> >>
>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>> >>
>> >>> >> > The two ideas are not conflicting, but rather complementing.
>> >>> >> >
>> >>> >> > On the contrary, putting a new process for people trying to
>> >>> >> > contribute while NOT addressing the old unused subpar operators
>> in
>> >>> >> > the repository
>> >>> >> is
>> >>> >> > what is conflicting.
>> >>> >> >
>> >>> >> > Keep in mind that when people try to contribute, they always look
>> >>> >> > at the existing operators already in the repository as examples
>> and
>> >>> >> > likely a
>> >>> >> model
>> >>> >> > for their new operators.
>> >>> >> >
>> >>> >> > David
>> >>> >> >
>> >>> >> >
>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>> amol@datatorrent.com>
>> >>> >> wrote:
>> >>> >> >
>> >>> >> > > Yes there are two conflicting threads now. The original thread
>> >>> >> > > was to
>> >>> >> > open
>> >>> >> > > up a way for contributors to submit code in a dir (contrib?) as
>> >>> >> > > long
>> >>> >> as
>> >>> >> > > license part of taken care of.
>> >>> >> > >
>> >>> >> > > On the thread of removing non-used operators -> How do we know
>> >>> >> > > what is being used?
>> >>> >> > >
>> >>> >> > > Thks,
>> >>> >> > > Amol
>> >>> >> > >
>> >>> >> > >
>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> >>> >> sandesh@datatorrent.com>
>> >>> >> > > wrote:
>> >>> >> > >
>> >>> >> > > > +1 for removing the not-used operators.
>> >>> >> > > >
>> >>> >> > > > So we are creating a process for operator writers who don't
>> >>> >> > > > want to understand the platform, yet wants to contribute? How
>> >>> >> > > > big is that
>> >>> >> set?
>> >>> >> > > > If we tell the app-user, here is the code which has not
>> passed
>> >>> >> > > > all
>> >>> >> the
>> >>> >> > > > checklist, will they be ready to use that in production?
>> >>> >> > > >
>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators
>> and
>> >>> >> > > > make
>> >>> >> it
>> >>> >> > > easy
>> >>> >> > > > to add more operators.
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > > >
>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> >>> >> > pramod@datatorrent.com>
>> >>> >> > > > wrote:
>> >>> >> > > >
>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> >>> >> > > gaurav.gopi123@gmail.com>
>> >>> >> > > > > wrote:
>> >>> >> > > > >
>> >>> >> > > > > > Pramod,
>> >>> >> > > > > >
>> >>> >> > > > > > By that logic I would say let's put all partitionable
>> >>> >> > > > > > operators
>> >>> >> > into
>> >>> >> > > > one
>> >>> >> > > > > > folder, non-partitionable operators in another and so
>> on...
>> >>> >> > > > > >
>> >>> >> > > > >
>> >>> >> > > > > Remember the original goal of making it easier for new
>> >>> >> > > > > members to contribute and managing those contributions to
>> >>> >> > > > > maturity. It is
>> >>> >> not a
>> >>> >> > > > > functional level separation.
>> >>> >> > > > >
>> >>> >> > > > >
>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>> >>> >> > > > > > used at
>> >>> >> > > class
>> >>> >> > > > > > level and not at package/folder level.
>> >>> >> > > > >
>> >>> >> > > > >
>> >>> >> > > > > I had a typo in my email, I meant to say "think of this
>> like
>> >>> >> > > > > a
>> >>> >> > > folder..."
>> >>> >> > > > > as an analogy and not literally.
>> >>> >> > > > >
>> >>> >> > > > > Thanks
>> >>> >> > > > >
>> >>> >> > > > >
>> >>> >> > > > > > Thanks
>> >>> >> > > > > >
>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> >>> >> > > > pramod@datatorrent.com
>> >>> >> > > > > >
>> >>> >> > > > > > wrote:
>> >>> >> > > > > >
>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> >>> >> > > > > gaurav.gopi123@gmail.com>
>> >>> >> > > > > > > wrote:
>> >>> >> > > > > > >
>> >>> >> > > > > > > > Can same goal not be achieved by using
>> >>> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
>> >>> >> > > > /
>> >>> >> > > > > > > >
>> org.apache.hadoop.classification.InterfaceStability.Uns
>> >>> >> > > > > > > > table
>> >>> >> > > > > > annotation?
>> >>> >> > > > > > > >
>> >>> >> > > > > > >
>> >>> >> > > > > > > I think it is important to localize the additions in
>> one
>> >>> >> place so
>> >>> >> > > > that
>> >>> >> > > > > it
>> >>> >> > > > > > > becomes clearer to users about the maturity level of
>> >>> >> > > > > > > these,
>> >>> >> > easier
>> >>> >> > > > for
>> >>> >> > > > > > > developers to track them towards the path to maturity
>> and
>> >>> >> > > > > > > also
>> >>> >> > > > > provides a
>> >>> >> > > > > > > clearer directive for committers and contributors on
>> >>> >> acceptance
>> >>> >> > of
>> >>> >> > > > new
>> >>> >> > > > > > > submissions. Relying on the annotations alone makes
>> them
>> >>> >> spread
>> >>> >> > all
>> >>> >> > > > > over
>> >>> >> > > > > > > the place and adds an additional layer of difficulty in
>> >>> >> > > > identification
>> >>> >> > > > > > not
>> >>> >> > > > > > > just for users but also for developers who want to find
>> >>> >> > > > > > > such
>> >>> >> > > > operators
>> >>> >> > > > > > and
>> >>> >> > > > > > > improve them. This of this like a folder level
>> annotation
>> >>> >> where
>> >>> >> > > > > > everything
>> >>> >> > > > > > > under this folder is unstable or evolving.
>> >>> >> > > > > > >
>> >>> >> > > > > > > Thanks
>> >>> >> > > > > > >
>> >>> >> > > > > > >
>> >>> >> > > > > > > >
>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> >>> >> > > david@datatorrent.com
>> >>> >> > > > >
>> >>> >> > > > > > > wrote:
>> >>> >> > > > > > > >
>> >>> >> > > > > > > > > >
>> >>> >> > > > > > > > > > >
>> >>> >> > > > > > > > > > > >
>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many
>> >>> >> > > > > > > > > > > > operators that fall in the "non-production quality"
>> >>> >> > > > > > > > > > > > category. We should make it obvious to users that
>> >>> >> > > > > > > > > > > > which operators are up to par, and which operators
>> >>> >> > > > > > > > > > > > are not, and maybe even remove those that are
>> >>> >> > > > > > > > > > > > likely not ever used in a real use case.
>> >>> >> > > > > > > > > > > >
>> >>> >> > > > > > > > > > >
>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators
>> >>> >> > > > > > > > > > > and doing this exercise as this can cause
>> >>> >> > > > > > > > > > > unnecessary tensions. My original intent is for
>> >>> >> > > > > > > > > > > contributions going forward.
>> >>> >> > > > > > > > > > >
>> >>> >> > > > > > > > > > >
>> >>> >> > > > > > > > > > IMO it is important to address this as well.
>> >>> >> > > > > > > > > > Operators outside the play area should be of well
>> >>> >> > > > > > > > > > known quality.
>> >>> >> > > > > > > > > >
>> >>> >> > > > > > > > > >
>> >>> >> > > > > > > > > I think this is important, and I don't anticipate
>> >>> >> > > > > > > > > much tension if we establish clear criteria.
>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators
>> >>> >> > > > > > > > > stay and put up the bars for new operators.
>> >>> >> > > > > > > > >
>> >>> >> > > > > > > > > David
>> >>> >> > > > > > > > >
>> >>> >> > > > > > > >
>> >>> >> > > > > > >
>> >>> >> > > > > >
>> >>> >> > > > >
>> >>> >> > > >
>> >>> >> > >
>> >>> >> >
>> >>> >>
>> >>> >
>> >>> >
>> >>>
>> >>
>> >>
>> >
>>
>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Thomas, thanks for the suggestions and the comments in the document. I will
take another look at the ones that I had shortlisted in the document to
keep. Within that subset, would it be OK to leave the ones that don't have
a large state problem for the time being, until we have replacement
operators implemented with the new windowing and state management? After
the cleanup, I can also help with the development of those replacement
operators.

Thanks

On Tue, Aug 9, 2016 at 11:21 AM, Thomas Weise <th...@gmail.com>
wrote:

> There are a bunch of operators that don't have proper state management and
> also don't support generic windowing (event time etc.). I would suggest to
> move those out or deprecate them.
>
> The new windowing and state management support along with the appropriate
> aggregators is going to make them obsolete.
>
> Thomas
>
>
> On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> Hi,
>>
>> Friendly reminder:
>>
>> I created a shared Google Sheet and tracked the various details of
>> operators. The sheet contains information about operators under lib/algo,
>> lib/math & lib/streamquery. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>> I also added a recommendation for each operator. Please take a look and
>> provide comments, if any.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
>> wrote:
>>
>>> Added comments, also recommend having the misc folder for the remaining
>>> operators in contrib according to proposed guidelines
>>>
>>> https://github.com/apache/apex-site/pull/44
>>>
>>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com>
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > I also added recommendations for lib/math operators to the same
>>> > document as a separate sheet. Please have a look.
>>> >
>>> > Thanks
>>> > Lakshmi Prasanna
>>> >
>>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com
>>> > > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I also added recommendation for each operator . Please take a look.
>>> >>
>>> >> thanks
>>> >>
>>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>>> >> lakshmi@datatorrent.com> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I created a shared google sheet and tracked the various details of
>>> >>> operators. Currently, the sheet contains information about operators
>>> >>> under lib/algo only. Link is
>>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>>> >>> Will update the sheet soon with lib/math too.
>>> >>>
>>> >>> Thanks
>>> >>> Lakshmi Prasanna
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Hi Lakshmi,
>>> >>>>
>>> >>>> Thanks for volunteering.
>>> >>>>
>>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
>>> and
>>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> >>>> individual operators are both good, with the exception that
>>> lib/streamquery
>>> >>>> is one unit and we probably do not need to look at individual
>>> operators
>>> >>>> under it.
>>> >>>>
>>> >>>> If we don't have any objection in the community, let's start the
>>> >>>> process.
>>> >>>>
>>> >>>> David
>>> >>>>
>>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> >>>> lakshmi@datatorrent.com> wrote:
>>> >>>>
>>> >>>>> I am interested to work on this.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Lakshmi prasanna
>>> >>>>>
>>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <
>>> hsy541@gmail.com>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>> > Why not have a shared google sheet with a list of operators and
>>> >>>>> options
>>> >>>>> > that we want to do with it.
>>> >>>>> > I think it's case by case.
>>> >>>>> > But retire unused or obsolete operators is important and we
>>> should
>>> >>>>> do it
>>> >>>>> > sooner rather than later.
>>> >>>>> >
>>> >>>>> > Regards,
>>> >>>>> > Siyuan
>>> >>>>> >
>>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <
>>> amol@datatorrent.com>
>>> >>>>> wrote:
>>> >>>>> >
>>> >>>>> >>
>>> >>>>> >> My vote is to do 2&3
>>> >>>>> >>
>>> >>>>> >> Thks
>>> >>>>> >> Amol
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> >>>>> >> VKottapalli@directv.com> wrote:
>>> >>>>> >>
>>> >>>>> >>> +1 for deprecating the packages listed below.
>>> >>>>> >>>
>>> >>>>> >>> -----Original Message-----
>>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>>> >>>
>>> >>>>> >>> +1
>>> >>>>> >>>
>>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>>> david@datatorrent.com
>>> >>>>> >
>>> >>>>> >>> wrote:
>>> >>>>> >>>
>>> >>>>> >>> > Hi all,
>>> >>>>> >>> >
>>> >>>>> >>> > I would like to renew the discussion of retiring operators in
>>> >>>>> Malhar.
>>> >>>>> >>> >
>>> >>>>> >>> > As stated before, the reason why we would like to retire
>>> >>>>> operators in
>>> >>>>> >>> > Malhar is because some of them were written a long time ago
>>> >>>>> before
>>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases,
>>> >>>>> are not
>>> >>>>> >>> > up to par in code quality, have no potential for
>>> improvement, and
>>> >>>>> >>> > probably completely unused by anybody.
>>> >>>>> >>> >
>>> >>>>> >>> > We do not want contributors to use them as a model of their
>>> >>>>> >>> > contribution, or users to use them thinking they are of
>>> quality,
>>> >>>>> and
>>> >>>>> >>> then hit a wall.
>>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >>>>> >>> >
>>> >>>>> >>> > The initial 3 packages that we would like to target are
>>> >>>>> *lib/algo*,
>>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>>> >>>>> >>>
>>> >>>>> >>> >
>>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if
>>> you
>>> >>>>> are
>>> >>>>> >>> > using any operator in these 3 packages. We would like to hear
>>> >>>>> from you.
>>> >>>>> >>> >
>>> >>>>> >>> > These are the options I can think of for retiring those
>>> >>>>> operators:
>>> >>>>> >>> >
>>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact
>>> called
>>> >>>>> >>> > malhar-misc
>>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they
>>> are no
>>> >>>>> >>> > longer supported
>>> >>>>> >>> >
>>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>>>> >>> >
>>> >>>>> >>> > David

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Thomas, thanks for the suggestions and the comments in the document. I will
take another look at the operators that I had shortlisted in the document to
keep. Within that subset, would it be OK to leave the ones that don't have
a large state problem for the time being, until replacement operators are
implemented with the new windowing and state management? After the cleanup,
I can also help with the development of those replacement operators.

Thanks

On Tue, Aug 9, 2016 at 11:21 AM, Thomas Weise <th...@gmail.com>
wrote:

> There are a bunch of operators that don't have proper state management and
> also don't support generic windowing (event time etc.). I would suggest
> moving those out or deprecating them.
>
> The new windowing and state management support along with the appropriate
> aggregators is going to make them obsolete.
>
> Thomas
>
>
> On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> Hi,
>>
>> Friendly Reminder :
>>
>> I created a shared google sheet and tracked the various details of
>> operators. The sheet contains information about operators under lib/algo,
>> lib/math & lib/streamquery. Link is https://docs.google.com/a/d
>> atatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptD
>> J_CaWpXt3GDccM/edit?usp=sharing . I also added recommendation for each
>> operator . Please take a look and provide comments as if any.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
>> wrote:
>>
>>> Added comments, also recommend having the misc folder for the remaining
>>> operators in contrib according to proposed guidelines
>>>
>>> https://github.com/apache/apex-site/pull/44
>>>
>>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com>
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > I also added recommendation for lib/math operators to the same
>>> document as
>>> > a separate sheet. Please have a look.
>>> >
>>> > Thanks
>>> > Lakshmi Prasanna
>>> >
>>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com
>>> > > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I also added recommendation for each operator . Please take a look.
>>> >>
>>> >> thanks
>>> >>
>>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>>> >> lakshmi@datatorrent.com> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I created a shared google sheet and tracked the various details of
>>> >>> operators. Currently, the sheet contains information about operators
>>> under
>>> >>> lib/algo only. Link is https://docs.google.com/a/
>>> >>> datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_
>>> >>> CaWpXt3GDccM/edit?usp=sharing . Will update the sheet soon with
>>>
>>> >>> lib/math too.
>>> >>>
>>> >>> Thanks
>>> >>> Lakshmi Prasanna
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Hi Lakshmi,
>>> >>>>
>>> >>>> Thanks for volunteering.
>>> >>>>
>>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
>>> and
>>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> >>>> individual operators are both good, with the exception that
>>> lib/streamquery
>>> >>>> is one unit and we probably do not need to look at individual
>>> operators
>>> >>>> under it.
>>> >>>>
>>> >>>> If we don't have any objection in the community, let's start the
>>> >>>> process.
>>> >>>>
>>> >>>> David
>>> >>>>
>>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> >>>> lakshmi@datatorrent.com> wrote:
>>> >>>>
>>> >>>>> I am interested to work on this.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Lakshmi prasanna
>>> >>>>>
>>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <
>>> hsy541@gmail.com>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>> > Why not have a shared google sheet with a list of operators and
>>> >>>>> options
>>> >>>>> > that we want to do with it.
>>> >>>>> > I think it's case by case.
>>> >>>>> > But retire unused or obsolete operators is important and we
>>> should
>>> >>>>> do it
>>> >>>>> > sooner rather than later.
>>> >>>>> >
>>> >>>>> > Regards,
>>> >>>>> > Siyuan
>>> >>>>> >
>>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <
>>> amol@datatorrent.com>
>>> >>>>> wrote:
>>> >>>>> >
>>> >>>>> >>
>>> >>>>> >> My vote is to do 2&3
>>> >>>>> >>
>>> >>>>> >> Thks
>>> >>>>> >> Amol
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> >>>>> >> VKottapalli@directv.com> wrote:
>>> >>>>> >>
>>> >>>>> >>> +1 for deprecating the packages listed below.
>>> >>>>> >>>
>>> >>>>> >>> -----Original Message-----
>>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>>> >>>
>>> >>>>> >>> +1
>>> >>>>> >>>
>>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>>> david@datatorrent.com
>>> >>>>> >
>>> >>>>> >>> wrote:
>>> >>>>> >>>
>>> >>>>> >>> > Hi all,
>>> >>>>> >>> >
>>> >>>>> >>> > I would like to renew the discussion of retiring operators in
>>> >>>>> Malhar.
>>> >>>>> >>> >
>>> >>>>> >>> > As stated before, the reason why we would like to retire
>>> >>>>> operators in
>>> >>>>> >>> > Malhar is because some of them were written a long time ago
>>> >>>>> before
>>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases,
>>> >>>>> are not
>>> >>>>> >>> > up to par in code quality, have no potential for
>>> improvement, and
>>> >>>>> >>> > probably completely unused by anybody.
>>> >>>>> >>> >
>>> >>>>> >>> > We do not want contributors to use them as a model of their
>>> >>>>> >>> > contribution, or users to use them thinking they are of
>>> quality,
>>> >>>>> and
>>> >>>>> >>> then hit a wall.
>>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >>>>> >>> >
>>> >>>>> >>> > The initial 3 packages that we would like to target are
>>> >>>>> *lib/algo*,
>>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>>> >>>>> >>>
>>> >>>>> >>> >
>>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if
>>> you
>>> >>>>> are
>>> >>>>> >>> > using any operator in these 3 packages. We would like to hear
>>> >>>>> from you.
>>> >>>>> >>> >
>>> >>>>> >>> > These are the options I can think of for retiring those
>>> >>>>> operators:
>>> >>>>> >>> >
>>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact
>>> called
>>> >>>>> >>> > malhar-misc
>>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they
>>> are no
>>> >>>>> >>> > longer supported
>>> >>>>> >>> >
>>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>>>> >>> >
>>> >>>>> >>> > David
>>> >>>>> >>> >
>>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> >>>>> >>> > <pr...@datatorrent.com>
>>> >>>>> >>> > wrote:
>>> >>>>> >>> >
>>> >>>>> >>> >> I wanted to close the loop on this discussion. In general
>>> >>>>> everyone
>>> >>>>> >>> >> seemed to be favorable to this idea with no serious
>>> objections.
>>> >>>>> Folks
>>> >>>>> >>> >> had good suggestions like documenting capabilities of
>>> >>>>> operators, come
>>> >>>>> >>> >> up well defined criteria for graduation of operators and
>>> what
>>> >>>>> those
>>> >>>>> >>> >> criteria may be and what to do with existing operators that
>>> may
>>> >>>>> not
>>> >>>>> >>> >> yet be mature or unused.
>>> >>>>> >>> >>
>>> >>>>> >>> >> I am going to summarize the key points that resulted from
>>> the
>>> >>>>> >>> >> discussion and would like to proceed with them.
>>> >>>>> >>> >>
>>> >>>>> >>> >>    - Operators that do not yet provide the key platform
>>> >>>>> capabilities
>>> >>>>> >>> to
>>> >>>>> >>> >>    make an operator useful across different applications
>>> such as
>>> >>>>> >>> >> reusability,
>>> >>>>> >>> >>    partitioning static or dynamic, idempotency, exactly once
>>> >>>>> will
>>> >>>>> >>> still be
>>> >>>>> >>> >>    accepted as long as they are functionally correct, have
>>> unit
>>> >>>>> tests
>>> >>>>> >>> >> and will
>>> >>>>> >>> >>    go into a separate module.
>>> >>>>> >>> >>    - Contrib module was suggested as a place where new
>>> >>>>> contributions
>>> >>>>> >>> go in
>>> >>>>> >>> >>    that don't yet have all the platform capabilities and are
>>> >>>>> not yet
>>> >>>>> >>> >> mature.
>>> >>>>> >>> >>    If there are no other suggestions we will go with this
>>> one.
>>> >>>>> >>> >>    - It was suggested the operators documentation list those
>>> >>>>> platform
>>> >>>>> >>> >>    capabilities it currently provides from the list above. I
>>> >>>>> will
>>> >>>>> >>> >> document a
>>> >>>>> >>> >>    structure for this in the contribution guidelines.
>>> >>>>> >>> >>    - Folks wanted to know what would be the criteria to
>>> >>>>> graduate an
>>> >>>>> >>> >>    operator to the big leagues :). I will kick-off a
>>> separate
>>> >>>>> thread
>>> >>>>> >>> >> for it as
>>> >>>>> >>> >>    I think it requires its own discussion and hopefully we
>>> can
>>> >>>>> come
>>> >>>>> >>> >> up with a
>>> >>>>> >>> >>    set of guidelines for it.
>>> >>>>> >>> >>    - David brought up state of some of the existing
>>> operators
>>> >>>>> and
>>> >>>>> >>> their
>>> >>>>> >>> >>    retirement and the layout of operators in Malhar in
>>> general
>>> >>>>> and
>>> >>>>> >>> how it
>>> >>>>> >>> >>    causes problems with development. I will ask him to lead
>>> the
>>> >>>>> >>> >> discussion on
>>> >>>>> >>> >>    that.
>>> >>>>> >>> >>
>>> >>>>> >>> >> Thanks
>>> >>>>> >>> >>
>>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
>>> >>>>> david@datatorrent.com>
>>> >>>>> >>> wrote:
>>> >>>>> >>> >>
>>> >>>>> >>> >> > The two ideas are not conflicting, but rather
>>> complementing.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On the contrary, putting a new process for people trying
>>> to
>>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
>>> >>>>> operators in
>>> >>>>> >>> >> > the repository
>>> >>>>> >>> >> is
>>> >>>>> >>> >> > what is conflicting.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > Keep in mind that when people try to contribute, they
>>> always
>>> >>>>> look
>>> >>>>> >>> >> > at the existing operators already in the repository as
>>> >>>>> examples and
>>> >>>>> >>> >> > likely a
>>> >>>>> >>> >> model
>>> >>>>> >>> >> > for their new operators.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > David
>>> >>>>> >>> >> >
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>>> >>>>> amol@datatorrent.com>
>>> >>>>> >>> >> wrote:
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original
>>> >>>>> thread
>>> >>>>> >>> >> > > was to
>>> >>>>> >>> >> > open
>>> >>>>> >>> >> > > up a way for contributors to submit code in a dir
>>> >>>>> (contrib?) as
>>> >>>>> >>> >> > > long
>>> >>>>> >>> >> as
>>> >>>>> >>> >> > > license part of taken care of.
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do
>>> we
>>> >>>>> know
>>> >>>>> >>> >> > > what is being used?
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > Thks,
>>> >>>>> >>> >> > > Amol
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>>> >>>>> >>> >> sandesh@datatorrent.com>
>>> >>>>> >>> >> > > wrote:
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > So we are creating a process for operator writers who
>>> >>>>> don't
>>> >>>>> >>> >> > > > want to understand the platform, yet wants to
>>> contribute?
>>> >>>>> How
>>> >>>>> >>> >> > > > big is that
>>> >>>>> >>> >> set?
>>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has
>>> not
>>> >>>>> passed
>>> >>>>> >>> >> > > > all
>>> >>>>> >>> >> the
>>> >>>>> >>> >> > > > checklist, will they be ready to use that in
>>> production?
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
>>> >>>>> operators and
>>> >>>>> >>> >> > > > make
>>> >>>>> >>> >> it
>>> >>>>> >>> >> > > easy
>>> >>>>> >>> >> > > > to add more operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>>> >>>>> >>> >> > pramod@datatorrent.com>
>>> >>>>> >>> >> > > > wrote:
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>>> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
>>> >>>>> >>> >> > > > > wrote:
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Pramod,
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > By that logic I would say let's put all
>>> partitionable
>>> >>>>> >>> >> > > > > > operators
>>> >>>>> >>> >> > into
>>> >>>>> >>> >> > > > one
>>> >>>>> >>> >> > > > > > folder, non-partitionable operators in another
>>> and so
>>> >>>>> on...
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Remember the original goal of making it easier for
>>> new
>>> >>>>> >>> >> > > > > members to contribute and managing those
>>> contributions
>>> >>>>> to
>>> >>>>> >>> >> > > > > maturity. It is
>>> >>>>> >>> >> not a
>>> >>>>> >>> >> > > > > functional level separation.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
>>> >>>>> being
>>> >>>>> >>> >> > > > > > used at
>>> >>>>> >>> >> > > class
>>> >>>>> >>> >> > > > > > level and not at package/folder level.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
>>> this
>>> >>>>> like
>>> >>>>> >>> >> > > > > a
>>> >>>>> >>> >> > > folder..."
>>> >>>>> >>> >> > > > > as an analogy and not literally.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Thanks
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Thanks
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>>> >>>>> >>> >> > > > pramod@datatorrent.com
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > wrote:
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>>> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
>>> >>>>> >>> >> > > > > > > wrote:
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>> >>>>> >>> >> > > org.apache.hadoop.classification.
>>> >>>>> InterfaceStability.Evolving
>>> >>>>> >>> >> > > > /
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.
>>> >>>>> InterfaceStability.Uns
>>> >>>>> >>> >> > > > > > > > table
>>> >>>>> >>> >> > > > > > annotation?
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > I think it is important to localize the
>>> additions
>>> >>>>> in one
>>> >>>>> >>> >> place so
>>> >>>>> >>> >> > > > that
>>> >>>>> >>> >> > > > > it
>>> >>>>> >>> >> > > > > > > becomes clearer to users about the maturity
>>> level of
>>> >>>>> >>> >> > > > > > > these,
>>> >>>>> >>> >> > easier
>>> >>>>> >>> >> > > > for
>>> >>>>> >>> >> > > > > > > developers to track them towards the path to
>>> >>>>> maturity and
>>> >>>>> >>> >> > > > > > > also
>>> >>>>> >>> >> > > > > provides a
>>> >>>>> >>> >> > > > > > > clearer directive for committers and
>>> contributors on
>>> >>>>> >>> >> acceptance
>>> >>>>> >>> >> > of
>>> >>>>> >>> >> > > > new
>>> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone
>>> makes
>>> >>>>> them
>>> >>>>> >>> >> spread
>>> >>>>> >>> >> > all
>>> >>>>> >>> >> > > > > over
>>> >>>>> >>> >> > > > > > > the place and adds an additional layer of
>>> >>>>> difficulty in
>>> >>>>> >>> >> > > > identification
>>> >>>>> >>> >> > > > > > not
>>> >>>>> >>> >> > > > > > > just for users but also for developers who want
>>> to
>>> >>>>> find
>>> >>>>> >>> >> > > > > > > such
>>> >>>>> >>> >> > > > operators
>>> >>>>> >>> >> > > > > > and
>>> >>>>> >>> >> > > > > > > improve them. This of this like a folder level
>>> >>>>> annotation
>>> >>>>> >>> >> where
>>> >>>>> >>> >> > > > > > everything
>>> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > Thanks
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>>> >>>>> >>> >> > > david@datatorrent.com
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > > wrote:
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way
>>> too
>>> >>>>> many
>>> >>>>> >>> >> operators
>>> >>>>> >>> >> > > > that
>>> >>>>> >>> >> > > > > > fall
>>> >>>>> >>> >> > > > > > > > in
>>> >>>>> >>> >> > > > > > > > > > the
>>> >>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
>>> >>>>> should
>>> >>>>> >>> >> > > > > > > > > > > > make it
>>> >>>>> >>> >> > > > obvious
>>> >>>>> >>> >> > > > > to
>>> >>>>> >>> >> > > > > > > > users
>>> >>>>> >>> >> > > > > > > > > > > that
>>> >>>>> >>> >> > > > > > > > > > > > which operators are up to par, and
>>> which
>>> >>>>> >>> >> > > > > > > > > > > > operators
>>> >>>>> >>> >> are
>>> >>>>> >>> >> > > not,
>>> >>>>> >>> >> > > > > and
>>> >>>>> >>> >> > > > > > > > maybe
>>> >>>>> >>> >> > > > > > > > > > > even
>>> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
>>> >>>>> used in a
>>> >>>>> >>> >> > > > > > > > > > > > real
>>> >>>>> >>> >> > use
>>> >>>>> >>> >> > > > > case.
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
>>> >>>>> operators
>>> >>>>> >>> >> > > > > > > > > > > and
>>> >>>>> >>> >> > doing
>>> >>>>> >>> >> > > > this
>>> >>>>> >>> >> > > > > > > > > exercise
>>> >>>>> >>> >> > > > > > > > > > as
>>> >>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
>>> >>>>> original
>>> >>>>> >>> >> intent
>>> >>>>> >>> >> > is
>>> >>>>> >>> >> > > > for
>>> >>>>> >>> >> > > > > > > > > > > contributions going forward.
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as
>>> well.
>>> >>>>> >>> >> > > > > > > > > > Operators
>>> >>>>> >>> >> > > outside
>>> >>>>> >>> >> > > > > the
>>> >>>>> >>> >> > > > > > > play
>>> >>>>> >>> >> > > > > > > > > > area should be of well known quality.
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't
>>> >>>>> anticipate
>>> >>>>> >>> >> > > > > > > > > much
>>> >>>>> >>> >> > tension
>>> >>>>> >>> >> > > if
>>> >>>>> >>> >> > > > > we
>>> >>>>> >>> >> > > > > > > > > establish clear criteria.
>>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
>>> >>>>> operators
>>> >>>>> >>> >> > > > > > > > > stay
>>> >>>>> >>> >> and
>>> >>>>> >>> >> > > put
>>> >>>>> >>> >> > > > up
>>> >>>>> >>> >> > > > > > the
>>> >>>>> >>> >> > > > > > > > > bars for new operators.
>>> >>>>> >>> >> > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > David
>>> >>>>> >>> >> > > > > > > > >
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> >
>>> >>>>> >>> >>
>>> >>>>> >>> >
>>> >>>>> >>> >
>>> >>>>> >>>
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >
>>> >>>>>
>>> >>>>
>>> >>>>
>>> >>>
>>> >>
>>> >
>>>
>>
>>
>
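
The class-level stability annotations raised in this thread (Hadoop's InterfaceStability.Evolving / Unstable) can be illustrated with a rough sketch. It is only an illustration under stated assumptions: the `Evolving` annotation below is a hypothetical local stand-in, not Hadoop's own class, and the operator class names are invented. It shows that annotated classes can be found by reflection even when they are spread across packages, which is the discoverability concern discussed above.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

/** Finds which of the given classes carry the (hypothetical) @Evolving marker. */
class StabilityDemo {
    static List<String> evolving(Class<?>... classes) {
        List<String> names = new ArrayList<>();
        for (Class<?> c : classes) {
            // Works only because Evolving is declared with RUNTIME retention.
            if (c.isAnnotationPresent(Evolving.class)) {
                names.add(c.getSimpleName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(evolving(NewParserOperator.class, MatureFileReader.class));
    }
}

/** Hypothetical stand-in for a stability annotation; assumed for this sketch only. */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Evolving {}

/** Invented example of a new, not yet mature operator. */
@Evolving
class NewParserOperator {}

/** Invented example of a mature operator with no marker. */
class MatureFileReader {}
```

A scan like this needs the list of candidate classes up front, whereas a dedicated module or folder makes the maturity level visible without any tooling; that trade-off is what the thread is weighing.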

Re: A proposal for Malhar

Posted by Thomas Weise <th...@gmail.com>.
There are a bunch of operators that don't have proper state management and
also don't support generic windowing (event time etc.). I would suggest
moving those out or deprecating them.

The new windowing and state management support along with the appropriate
aggregators is going to make them obsolete.

Thomas


On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> Friendly Reminder :
>
> I created a shared google sheet and tracked the various details of
> operators. The sheet contains information about operators under lib/algo,
> lib/math & lib/streamquery. Link is
> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
> I also added a recommendation for each operator. Please take a look and
> provide comments, if any.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
>> Added comments, also recommend having the misc folder for the remaining
>> operators in contrib according to proposed guidelines
>>
>> https://github.com/apache/apex-site/pull/44
>>
>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <lakshmi@datatorrent.com>
>> wrote:
>>
>> > Hi
>> >
>> > I also added recommendation for lib/math operators to the same document
>> > as a separate sheet. Please have a look.
>> >
>> > Thanks
>> > Lakshmi Prasanna
>> >
>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com>
>> > wrote:
>> >
>> >> Hi,
>> >>
>> >> I also added recommendation for each operator . Please take a look.
>> >>
>> >> thanks
>> >>
>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> >> lakshmi@datatorrent.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I created a shared google sheet and tracked the various details of
>> >>> operators. Currently, the sheet contains information about operators
>> >>> under lib/algo only. Link is
>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> >>> Will update the sheet soon with lib/math too.
>> >>>
>> >>> Thanks
>> >>> Lakshmi Prasanna
>> >>>
>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>>> Hi Lakshmi,
>> >>>>
>> >>>> Thanks for volunteering.
>> >>>>
>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> >>>> individual operators are both good, with the exception that
>> >>>> lib/streamquery is one unit and we probably do not need to look at
>> >>>> individual operators under it.
>> >>>>
>> >>>> If we don't have any objection in the community, let's start the
>> >>>> process.
>> >>>>
>> >>>> David
>> >>>>
>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>> >>>> lakshmi@datatorrent.com> wrote:
>> >>>>
>> >>>>> I am interested to work on this.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Lakshmi prasanna
>> >>>>>
>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hsy541@gmail.com>
>> >>>>> wrote:
>> >>>>>
>> >>>>> > Why not have a shared google sheet with a list of operators and the
>> >>>>> > options for what we want to do with each of them?
>> >>>>> > I think it's case by case.
>> >>>>> > But retiring unused or obsolete operators is important and we should
>> >>>>> > do it sooner rather than later.
>> >>>>> >
>> >>>>> > Regards,
>> >>>>> > Siyuan
>> >>>>> >
>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <amol@datatorrent.com>
>> >>>>> > wrote:
>> >>>>> >
>> >>>>> >>
>> >>>>> >> My vote is to do 2&3
>> >>>>> >>
>> >>>>> >> Thks
>> >>>>> >> Amol
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >>>>> >> VKottapalli@directv.com> wrote:
>> >>>>> >>
>> >>>>> >>> +1 for deprecating the packages listed below.
>> >>>>> >>>
>> >>>>> >>> -----Original Message-----
>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>>> >>>
>> >>>>> >>> +1
>> >>>>> >>>
>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com>
>> >>>>> >>> wrote:
>> >>>>> >>>
>> >>>>> >>> > Hi all,
>> >>>>> >>> >
>> >>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>> >>>>> >>> >
>> >>>>> >>> > As stated before, the reason why we would like to retire operators in
>> >>>>> >>> > Malhar is that some of them were written a long time ago, before
>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>> >>>>> >>> > up to par in code quality, have no potential for improvement, and
>> >>>>> >>> > are probably completely unused by anybody.
>> >>>>> >>> >
>> >>>>> >>> > We do not want contributors to use them as a model for their
>> >>>>> >>> > contributions, or users to use them thinking they are of good quality
>> >>>>> >>> > and then hit a wall. Neither scenario is beneficial to the reputation
>> >>>>> >>> > of Apex.
>> >>>>> >>> >
>> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>>> >>> >
>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>> >>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>> >>>>> >>> >
>> >>>>> >>> > These are the options I can think of for retiring those operators:
>> >>>>> >>> >
>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>> >>>>> >>> > malhar-misc
>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>> >>>>> >>> > longer supported
>> >>>>> >>> >
>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>>>> >>> >
>> >>>>> >>> > David
>> >>>>> >>> >
>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> >>>>> >>> > <pr...@datatorrent.com>
>> >>>>> >>> > wrote:
>> >>>>> >>> >
>> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>> >>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>> >>>>> >>> >> had good suggestions like documenting capabilities of operators, come
>> >>>>> >>> >> up well defined criteria for graduation of operators and what those
>> >>>>> >>> >> criteria may be and what to do with existing operators that may not
>> >>>>> >>> >> yet be mature or unused.
>> >>>>> >>> >>
>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>> >>>>> >>> >> discussion and would like to proceed with them.
>> >>>>> >>> >>
>> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities to
>> >>>>> >>> >>    make an operator useful across different applications such as
>> >>>>> >>> >>    reusability, partitioning static or dynamic, idempotency, exactly once
>> >>>>> >>> >>    will still be accepted as long as they are functionally correct, have
>> >>>>> >>> >>    unit tests and will go into a separate module.
>> >>>>> >>> >>    - Contrib module was suggested as a place where new contributions go in
>> >>>>> >>> >>    that don't yet have all the platform capabilities and are not yet
>> >>>>> >>> >>    mature. If there are no other suggestions we will go with this one.
>> >>>>> >>> >>    - It was suggested the operators documentation list those platform
>> >>>>> >>> >>    capabilities it currently provides from the list above. I will
>> >>>>> >>> >>    document a structure for this in the contribution guidelines.
>> >>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
>> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate thread for
>> >>>>> >>> >>    it as I think it requires its own discussion and hopefully we can come
>> >>>>> >>> >>    up with a set of guidelines for it.
>> >>>>> >>> >>    - David brought up state of some of the existing operators and their
>> >>>>> >>> >>    retirement and the layout of operators in Malhar in general and how it
>> >>>>> >>> >>    causes problems with development. I will ask him to lead the
>> >>>>> >>> >>    discussion on that.
>> >>>>> >>> >>
>> >>>>> >>> >> Thanks
>> >>>>> >>> >>
>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com>
>> >>>>> >>> >> wrote:
>> >>>>> >>> >>
>> >>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>> >>>>> >>> >> >
>> >>>>> >>> >> > On the contrary, putting a new process for people trying to
>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar operators in
>> >>>>> >>> >> > the repository is what is conflicting.
>> >>>>> >>> >> >
>> >>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>> >>>>> >>> >> > at the existing operators already in the repository as examples and
>> >>>>> >>> >> > likely a model for their new operators.
>> >>>>> >>> >> >
>> >>>>> >>> >> > David
>> >>>>> >>> >> >
>> >>>>> >>> >> >
>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
>> >>>>> >>> >> > wrote:
>> >>>>> >>> >> >
>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>> >>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>> >>>>> >>> >> > > (contrib?) as long as license part is taken care of.
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>> >>>>> >>> >> > > what is being used?
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > Thks,
>> >>>>> >>> >> > > Amol
>> >>>>> >>> >> > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>> >>>>> >>> >> > > wrote:
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > So we are creating a process for operator writers who don't
>> >>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>> >>>>> >>> >> > > > big is that set?
>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>> >>>>> >>> >> > > > all the checklist, will they be ready to use that in production?
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>> >>>>> >>> >> > > > make it easy to add more operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>> >>>>> >>> >> > > > wrote:
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > wrote:
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Pramod,
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators
>> >>>>> >>> >> > > > > > into one folder, non-partitionable operators in another and so on...
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Remember the original goal of making it easier for new members to
>> >>>>> >>> >> > > > > contribute and managing those contributions to maturity. It is not a
>> >>>>> >>> >> > > > > functional level separation.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being used at
>> >>>>> >>> >> > > > > > class level and not at package/folder level.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like a
>> >>>>> >>> >> > > > > folder..." as an analogy and not literally.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Thanks
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Thanks
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>> >>>>> >>> >> > > > > > wrote:
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > > > wrote:
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>> >>>>> >>> >> > > > > > > > annotation?
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so
>> >>>>> >>> >> > > > > > > that it becomes clearer to users about the maturity level of
>> >>>>> >>> >> > > > > > > these, easier for developers to track them towards the path to
>> >>>>> >>> >> > > > > > > maturity and also provides a clearer directive for committers and
>> >>>>> >>> >> > > > > > > contributors on acceptance of new submissions. Relying on the
>> >>>>> >>> >> > > > > > > annotations alone makes them spread all over the place and adds
>> >>>>> >>> >> > > > > > > an additional layer of difficulty in identification not just for
>> >>>>> >>> >> > > > > > > users but also for developers who want to find such operators and
>> >>>>> >>> >> > > > > > > improve them. This of this like a folder level annotation where
>> >>>>> >>> >> > > > > > > everything under this folder is unstable or evolving.
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > Thanks
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>> >>>>> >>> >> > > > > > > > wrote:
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators
>> >>>>> >>> >> > > > > > > > > > > > that fall in the "non-production quality" category. We
>> >>>>> >>> >> > > > > > > > > > > > should make it obvious to users that which operators are
>> >>>>> >>> >> > > > > > > > > > > > up to par, and which operators are not, and maybe even
>> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever used in a real use
>> >>>>> >>> >> > > > > > > > > > > > case.
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing
>> >>>>> >>> >> > > > > > > > > > > this exercise as this can cause unnecessary tensions. My
>> >>>>> >>> >> > > > > > > > > > > original intent is for contributions going forward.
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>> >>>>> >>> >> > > > > > > > > > outside the play area should be of well known quality.
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much
>> >>>>> >>> >> > > > > > > > > tension if we establish clear criteria.
>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay
>> >>>>> >>> >> > > > > > > > > and put up the bars for new operators.
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > > > David
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> >
>> >>>>> >>> >>
>> >>>>> >>> >
>> >>>>> >>> >
>> >>>>> >>>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>
>
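
Option 3 in David's list above (mark operators deprecated and note in their javadoc that they are no longer supported) could look roughly like the sketch below. The class names `LegacySumOperator` and `WindowedSumOperator` are invented for illustration and are not actual Malhar operators; the reflection helper is only there to show that the marker is machine-readable.

```java
import java.util.ArrayList;
import java.util.List;

/** Lists which of the given classes are marked @Deprecated. */
class DeprecationDemo {
    static List<String> deprecatedNames(Class<?>... classes) {
        List<String> names = new ArrayList<>();
        for (Class<?> c : classes) {
            // java.lang.Deprecated has runtime retention, so reflection can see it.
            if (c.isAnnotationPresent(Deprecated.class)) {
                names.add(c.getSimpleName());
            }
        }
        return names;
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        System.out.println(deprecatedNames(LegacySumOperator.class, WindowedSumOperator.class));
    }
}

/**
 * Hypothetical legacy operator that keeps a running sum.
 *
 * @deprecated No longer supported; see {@link WindowedSumOperator} (also
 *             hypothetical) for a replacement built on newer windowing support.
 */
@Deprecated
class LegacySumOperator {
    private long sum;

    void process(long value) {
        sum += value;
    }

    long getSum() {
        return sum;
    }
}

/** Hypothetical replacement operator, shown only so the javadoc link resolves. */
class WindowedSumOperator {
    private long sum;

    void process(long value) {
        sum += value;
    }

    long getSum() {
        return sum;
    }
}
```

The `@Deprecated` annotation makes the compiler warn callers, while the `@deprecated` javadoc tag carries the human-readable note; options 2 and 3 can be combined by applying both to classes after they move to a separate artifact.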

Re: A proposal for Malhar

Posted by Thomas Weise <th...@gmail.com>.
There are a bunch of operators that don't have proper state management and
also don't support generic windowing (event time etc.). I would suggest to
move those out or deprecate them.

The new windowing and state management support along with the appropriate
aggregators is going to make them obsolete.

Thomas


On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> Friendly Reminder :
>
> I created a shared google sheet and tracked the various details of
> operators. The sheet contains information about operators under lib/algo,
> lib/math & lib/streamquery. Link is https://docs.google.com/a/d
> atatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptD
> J_CaWpXt3GDccM/edit?usp=sharing . I also added recommendation for each
> operator . Please take a look and provide comments as if any.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
>> Added comments, also recommend having the misc folder for the remaining
>> operators in contrib according to proposed guidelines
>>
>> https://github.com/apache/apex-site/pull/44
>>
>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>> lakshmi@datatorrent.com>
>> wrote:
>>
>> > Hi
>> >
>> > I also added recommendation for lib/math operators to the same document
>> as
>> > a separate sheet. Please have a look.
>> >
>> > Thanks
>> > Lakshmi Prasanna
>> >
>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>> lakshmi@datatorrent.com
>> > > wrote:
>> >
>> >> Hi,
>> >>
>> >> I also added recommendation for each operator . Please take a look.
>> >>
>> >> thanks
>> >>
>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> >> lakshmi@datatorrent.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I created a shared google sheet and tracked the various details of
>> >>> operators. Currently, the sheet contains information about operators
>> under
>> >>> lib/algo only. Link is https://docs.google.com/a/
>> >>> datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_
>> >>> CaWpXt3GDccM/edit?usp=sharing . Will update the sheet soon with
>>
>> >>> lib/math too.
>> >>>
>> >>> Thanks
>> >>> Lakshmi Prasanna
>> >>>
>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>>> Hi Lakshmi,
>> >>>>
>> >>>> Thanks for volunteering.
>> >>>>
>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
>> and
>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> >>>> individual operators are both good, with the exception that
>> lib/streamquery
>> >>>> is one unit and we probably do not need to look at individual
>> operators
>> >>>> under it.
>> >>>>
>> >>>> If we don't have any objection in the community, let's start the
>> >>>> process.
>> >>>>
>> >>>> David
>> >>>>
>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>> >>>> lakshmi@datatorrent.com> wrote:
>> >>>>
>> >>>>> I am interested to work on this.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Lakshmi prasanna
>> >>>>>
>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hsy541@gmail.com
>> >
>> >>>>> wrote:
>> >>>>>
>> >>>>> > Why not have a shared google sheet with a list of operators and
>> >>>>> options
>> >>>>> > that we want to do with it.
>> >>>>> > I think it's case by case.
>> >>>>> > But retire unused or obsolete operators is important and we should
>> >>>>> do it
>> >>>>> > sooner rather than later.
>> >>>>> >
>> >>>>> > Regards,
>> >>>>> > Siyuan
>> >>>>> >
>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <amol@datatorrent.com
>> >
>> >>>>> wrote:
>> >>>>> >
>> >>>>> >>
>> >>>>> >> My vote is to do 2&3
>> >>>>> >>
>> >>>>> >> Thks
>> >>>>> >> Amol
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >>>>> >> VKottapalli@directv.com> wrote:
>> >>>>> >>
>> >>>>> >>> +1 for deprecating the packages listed below.
>> >>>>> >>>
>> >>>>> >>> -----Original Message-----
>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>>> >>>
>> >>>>> >>> +1
>> >>>>> >>>
>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>> david@datatorrent.com
>> >>>>> >
>> >>>>> >>> wrote:
>> >>>>> >>>
>> >>>>> >>> > Hi all,
>> >>>>> >>> >
>> >>>>> >>> > I would like to renew the discussion of retiring operators in
>> >>>>> Malhar.
>> >>>>> >>> >
>> >>>>> >>> > As stated before, the reason why we would like to retire
>> >>>>> operators in
>> >>>>> >>> > Malhar is because some of them were written a long time ago
>> >>>>> before
>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases,
>> >>>>> are not
>> >>>>> >>> > up to par in code quality, have no potential for improvement,
>> and
>> >>>>> >>> > probably completely unused by anybody.
>> >>>>> >>> >
>> >>>>> >>> > We do not want contributors to use them as a model of their
>> >>>>> >>> > contribution, or users to use them thinking they are of
>> quality,
>> >>>>> and
>> >>>>> >>> then hit a wall.
>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>> >>>>> >>> >
>> >>>>> >>> > The initial 3 packages that we would like to target are
>> >>>>> *lib/algo*,
>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>>> >>>
>> >>>>> >>> >
>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if
>> you
>> >>>>> are
>> >>>>> >>> > using any operator in these 3 packages. We would like to hear
>> >>>>> from you.
>> >>>>> >>> >
>> >>>>> >>> > These are the options I can think of for retiring those
>> >>>>> operators:
>> >>>>> >>> >
>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact
>> called
>> >>>>> >>> > malhar-misc
>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they
>> are no
>> >>>>> >>> > longer supported
>> >>>>> >>> >
>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>>>> >>> >
>> >>>>> >>> > David
>> >>>>> >>> >
>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> >>>>> >>> > <pr...@datatorrent.com>
>> >>>>> >>> > wrote:
>> >>>>> >>> >
>> >>>>> >>> >> I wanted to close the loop on this discussion. In general
>> >>>>> everyone
>> >>>>> >>> >> seemed to be favorable to this idea with no serious
>> objections.
>> >>>>> Folks
>> >>>>> >>> >> had good suggestions like documenting capabilities of
>> >>>>> operators, come
>> >>>>> >>> >> up well defined criteria for graduation of operators and what
>> >>>>> those
>> >>>>> >>> >> criteria may be and what to do with existing operators that
>> may
>> >>>>> not
>> >>>>> >>> >> yet be mature or unused.
>> >>>>> >>> >>
>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>> >>>>> >>> >> discussion and would like to proceed with them.
>> >>>>> >>> >>
>> >>>>> >>> >>    - Operators that do not yet provide the key platform
>> >>>>> capabilities
>> >>>>> >>> to
>> >>>>> >>> >>    make an operator useful across different applications
>> such as
>> >>>>> >>> >> reusability,
>> >>>>> >>> >>    partitioning static or dynamic, idempotency, exactly once
>> >>>>> will
>> >>>>> >>> still be
>> >>>>> >>> >>    accepted as long as they are functionally correct, have
>> unit
>> >>>>> tests
>> >>>>> >>> >> and will
>> >>>>> >>> >>    go into a separate module.
>> >>>>> >>> >>    - Contrib module was suggested as a place where new
>> >>>>> contributions
>> >>>>> >>> go in
>> >>>>> >>> >>    that don't yet have all the platform capabilities and are
>> >>>>> not yet
>> >>>>> >>> >> mature.
>> >>>>> >>> >>    If there are no other suggestions we will go with this
>> one.
>> >>>>> >>> >>    - It was suggested the operators documentation list those
>> >>>>> platform
>> >>>>> >>> >>    capabilities it currently provides from the list above. I
>> >>>>> will
>> >>>>> >>> >> document a
>> >>>>> >>> >>    structure for this in the contribution guidelines.
>> >>>>> >>> >>    - Folks wanted to know what would be the criteria to
>> >>>>> graduate an
>> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate
>> >>>>> thread
>> >>>>> >>> >> for it as
>> >>>>> >>> >>    I think it requires its own discussion and hopefully we
>> can
>> >>>>> come
>> >>>>> >>> >> up with a
>> >>>>> >>> >>    set of guidelines for it.
>> >>>>> >>> >>    - David brought up state of some of the existing operators
>> >>>>> and
>> >>>>> >>> their
>> >>>>> >>> >>    retirement and the layout of operators in Malhar in
>> general
>> >>>>> and
>> >>>>> >>> how it
>> >>>>> >>> >>    causes problems with development. I will ask him to lead
>> the
>> >>>>> >>> >> discussion on
>> >>>>> >>> >>    that.
>> >>>>> >>> >>
>> >>>>> >>> >> Thanks
>> >>>>> >>> >>
>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
>> >>>>> david@datatorrent.com>
>> >>>>> >>> wrote:
>> >>>>> >>> >>
>> >>>>> >>> >> > The two ideas are not conflicting, but rather
>> complementing.
>> >>>>> >>> >> >
>> >>>>> >>> >> > On the contrary, putting a new process for people trying to
>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
>> >>>>> operators in
>> >>>>> >>> >> > the repository
>> >>>>> >>> >> is
>> >>>>> >>> >> > what is conflicting.
>> >>>>> >>> >> >
>> >>>>> >>> >> > Keep in mind that when people try to contribute, they
>> always
>> >>>>> look
>> >>>>> >>> >> > at the existing operators already in the repository as
>> >>>>> examples and
>> >>>>> >>> >> > likely a
>> >>>>> >>> >> model
>> >>>>> >>> >> > for their new operators.
>> >>>>> >>> >> >
>> >>>>> >>> >> > David
>> >>>>> >>> >> >
>> >>>>> >>> >> >
>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>> >>>>> amol@datatorrent.com>
>> >>>>> >>> >> wrote:
>> >>>>> >>> >> >
>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original
>> >>>>> thread
>> >>>>> >>> >> > > was to
>> >>>>> >>> >> > open
>> >>>>> >>> >> > > up a way for contributors to submit code in a dir
>> >>>>> (contrib?) as
>> >>>>> >>> >> > > long
>> >>>>> >>> >> as
>> >>>>> >>> >> > > license part of taken care of.
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we
>> >>>>> know
>> >>>>> >>> >> > > what is being used?
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > Thks,
>> >>>>> >>> >> > > Amol
>> >>>>> >>> >> > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> >>>>> >>> >> sandesh@datatorrent.com>
>> >>>>> >>> >> > > wrote:
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > So we are creating a process for operator writers who
>> >>>>> don't
>> >>>>> >>> >> > > > want to understand the platform, yet wants to
>> contribute?
>> >>>>> How
>> >>>>> >>> >> > > > big is that
>> >>>>> >>> >> set?
>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not
>> >>>>> passed
>> >>>>> >>> >> > > > all
>> >>>>> >>> >> the
>> >>>>> >>> >> > > > checklist, will they be ready to use that in
>> production?
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
>> >>>>> operators and
>> >>>>> >>> >> > > > make
>> >>>>> >>> >> it
>> >>>>> >>> >> > > easy
>> >>>>> >>> >> > > > to add more operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>> >>>>> >>> >> > > > wrote:
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > wrote:
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Pramod,
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators
>> >>>>> >>> >> > > > > > into one folder, non-partitionable operators in another and so
>> >>>>> >>> >> > > > > > on...
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Remember the original goal of making it easier for new members
>> >>>>> >>> >> > > > > to contribute and managing those contributions to maturity. It
>> >>>>> >>> >> > > > > is not a functional level separation.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being used
>> >>>>> >>> >> > > > > > at class level and not at package/folder level.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like a
>> >>>>> >>> >> > > > > folder..." as an analogy and not literally.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Thanks
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Thanks
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>> >>>>> >>> >> > > > > > wrote:
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > > > wrote:
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > > Can the same goal not be achieved by using the
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>> >>>>> >>> >> > > > > > > > annotation?
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so that
>> >>>>> >>> >> > > > > > > it becomes clearer to users about the maturity level of these, easier
>> >>>>> >>> >> > > > > > > for developers to track them towards the path to maturity and also
>> >>>>> >>> >> > > > > > > provides a clearer directive for committers and contributors on
>> >>>>> >>> >> > > > > > > acceptance of new submissions. Relying on the annotations alone makes
>> >>>>> >>> >> > > > > > > them spread all over the place and adds an additional layer of
>> >>>>> >>> >> > > > > > > difficulty in identification not just for users but also for developers
>> >>>>> >>> >> > > > > > > who want to find such operators and improve them. This of this like a
>> >>>>> >>> >> > > > > > > folder level annotation where everything under this folder is unstable
>> >>>>> >>> >> > > > > > > or evolving.
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > Thanks
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>> >>>>> >>> >> > > > > > > > wrote:
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state has way too many operators that
>> >>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should make
>> >>>>> >>> >> > > > > > > > > > > > it obvious to users which operators are up to par and which
>> >>>>> >>> >> > > > > > > > > > > > operators are not, and maybe even remove those that are likely
>> >>>>> >>> >> > > > > > > > > > > > not ever used in a real use case.
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
>> >>>>> >>> >> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original
>> >>>>> >>> >> > > > > > > > > > > intent is for contributions going forward.
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators outside
>> >>>>> >>> >> > > > > > > > > > the play area should be of well known quality.
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension if
>> >>>>> >>> >> > > > > > > > > we establish clear criteria.
>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and put
>> >>>>> >>> >> > > > > > > > > up the bars for new operators.
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > > > David
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> >
>> >>>>> >>> >>
>> >>>>> >>> >
>> >>>>> >>> >
>> >>>>> >>>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi,

Friendly reminder:

I created a shared Google Sheet and tracked the various details of the
operators. The sheet contains information about operators under lib/algo,
lib/math and lib/streamquery. The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
I also added a recommendation for each operator. Please take a look and
provide comments, if any.

Thanks
Lakshmi Prasanna

On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Added comments. I also recommend having the misc folder for the remaining
> operators in contrib, according to the proposed guidelines
>
> https://github.com/apache/apex-site/pull/44
>
> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <lakshmi@datatorrent.com>
> wrote:
>
> > Hi
> >
> > I also added recommendations for the lib/math operators to the same
> > document as a separate sheet. Please have a look.
> >
> > Thanks
> > Lakshmi Prasanna
> >
> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I also added a recommendation for each operator. Please take a look.
> >>
> >> thanks
> >>
> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> >> lakshmi@datatorrent.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I created a shared Google Sheet and tracked the various details of the
> >>> operators. Currently, the sheet contains information about operators
> >>> under lib/algo only. The link is
> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
> >>> I will update the sheet soon with lib/math too.
> >>>
> >>> Thanks
> >>> Lakshmi Prasanna
> >>>
> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
> >>> wrote:
> >>>
> >>>> Hi Lakshmi,
> >>>>
> >>>> Thanks for volunteering.
> >>>>
> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
> >>>> individual operators are both good, with the exception that
> >>>> lib/streamquery is one unit and we probably do not need to look at
> >>>> individual operators under it.
> >>>>
> >>>> If we don't have any objection in the community, let's start the
> >>>> process.
> >>>>
> >>>> David
> >>>>
> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
> >>>> lakshmi@datatorrent.com> wrote:
> >>>>
> >>>>> I am interested to work on this.
> >>>>>
> >>>>> Regards,
> >>>>> Lakshmi prasanna
> >>>>>
> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> > Why not have a shared Google Sheet with a list of operators and the
> >>>>> > options for what we want to do with each of them?
> >>>>> > I think it's case by case.
> >>>>> > But retiring unused or obsolete operators is important and we should
> >>>>> > do it sooner rather than later.
> >>>>> >
> >>>>> > Regards,
> >>>>> > Siyuan
> >>>>> >
> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
> >>>>> > wrote:
> >>>>> >
> >>>>> >>
> >>>>> >> My vote is to do 2&3
> >>>>> >>
> >>>>> >> Thks
> >>>>> >> Amol
> >>>>> >>
> >>>>> >>
> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >>>>> >> VKottapalli@directv.com> wrote:
> >>>>> >>
> >>>>> >>> +1 for deprecating the packages listed below.
> >>>>> >>>
> >>>>> >>> -----Original Message-----
> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>>> >>>
> >>>>> >>> +1
> >>>>> >>>
> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com>
> >>>>> >>> wrote:
> >>>>> >>>
> >>>>> >>> > Hi all,
> >>>>> >>> >
> >>>>> >>> > I would like to renew the discussion of retiring operators in
> >>>>> >>> > Malhar.
> >>>>> >>> >
> >>>>> >>> > As stated before, the reason we would like to retire operators in
> >>>>> >>> > Malhar is that some of them were written a long time ago, before
> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are
> >>>>> >>> > not up to par in code quality, have no potential for improvement,
> >>>>> >>> > and are probably completely unused by anybody.
> >>>>> >>> >
> >>>>> >>> > We do not want contributors to use them as a model for their
> >>>>> >>> > contributions, or users to use them thinking they are of good
> >>>>> >>> > quality and then hit a wall.
> >>>>> >>> > Neither scenario is beneficial to the reputation of Apex.
> >>>>> >>> >
> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
> >>>>> >>> > *lib/math*, and *lib/streamquery*.
> >>>>> >>> >
> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
> >>>>> >>> > using any operator in these 3 packages. We would like to hear from
> >>>>> >>> > you.
> >>>>> >>> >
> >>>>> >>> > These are the options I can think of for retiring those operators:
> >>>>> >>> >
> >>>>> >>> > 1) Completely remove them from the malhar repository.
> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
> >>>>> >>> >    malhar-misc.
> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
> >>>>> >>> >    longer supported.
> >>>>> >>> >
> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>>>> >>> >
> >>>>> >>> > David
> >>>>> >>> >
> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>>>> >>> > <pr...@datatorrent.com>
> >>>>> >>> > wrote:
> >>>>> >>> >
> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
> >>>>> >>> >> seemed to be favorable to this idea with no serious objections.
> >>>>> >>> >> Folks had good suggestions like documenting capabilities of
> >>>>> >>> >> operators, coming up with well-defined criteria for graduation of
> >>>>> >>> >> operators and what those criteria may be, and what to do with
> >>>>> >>> >> existing operators that may not yet be mature or are unused.
> >>>>> >>> >>
> >>>>> >>> >> I am going to summarize the key points that resulted from the
> >>>>> >>> >> discussion and would like to proceed with them.
> >>>>> >>> >>
> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities
> >>>>> >>> >>    that make an operator useful across different applications, such
> >>>>> >>> >>    as reusability, partitioning (static or dynamic), idempotency,
> >>>>> >>> >>    and exactly once, will still be accepted as long as they are
> >>>>> >>> >>    functionally correct and have unit tests, and will go into a
> >>>>> >>> >>    separate module.
> >>>>> >>> >>    - Contrib module was suggested as a place where new contributions
> >>>>> >>> >>    go in that don't yet have all the platform capabilities and are
> >>>>> >>> >>    not yet mature. If there are no other suggestions we will go with
> >>>>> >>> >>    this one.
> >>>>> >>> >>    - It was suggested that an operator's documentation list those
> >>>>> >>> >>    platform capabilities it currently provides from the list above.
> >>>>> >>> >>    I will document a structure for this in the contribution
> >>>>> >>> >>    guidelines.
> >>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
> >>>>> >>> >>    operator to the big leagues :). I will kick off a separate thread
> >>>>> >>> >>    for it as I think it requires its own discussion and hopefully we
> >>>>> >>> >>    can come up with a set of guidelines for it.
> >>>>> >>> >>    - David brought up the state of some of the existing operators and
> >>>>> >>> >>    their retirement and the layout of operators in Malhar in general
> >>>>> >>> >>    and how it causes problems with development. I will ask him to
> >>>>> >>> >>    lead the discussion on that.
> >>>>> >>> >>
> >>>>> >>> >> Thanks
> >>>>> >>> >>
> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com>
> >>>>> >>> >> wrote:
> >>>>> >>> >>
> >>>>> >>> >> > The two ideas are not conflicting, but rather complementary.
> >>>>> >>> >> >
> >>>>> >>> >> > On the contrary, putting a new process in place for people trying
> >>>>> >>> >> > to contribute while NOT addressing the old unused subpar operators
> >>>>> >>> >> > in the repository is what is conflicting.
> >>>>> >>> >> >
> >>>>> >>> >> > Keep in mind that when people try to contribute, they always look
> >>>>> >>> >> > at the existing operators already in the repository as examples and
> >>>>> >>> >> > likely a model for their new operators.
> >>>>> >>> >> >
> >>>>> >>> >> > David
> >>>>> >>> >> >
> >>>>> >>> >> >
> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
> >>>>> >>> >> > wrote:
> >>>>> >>> >> >
> >>>>> >>> >> > > Yes, there are two conflicting threads now. The original thread
> >>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
> >>>>> >>> >> > > (contrib?) as long as the license part is taken care of.
> >>>>> >>> >> > >
> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
> >>>>> >>> >> > > what is being used?
> >>>>> >>> >> > >
> >>>>> >>> >> > > Thks,
> >>>>> >>> >> > > Amol
> >>>>> >>> >> > >
> >>>>> >>> >> > >
> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
> >>>>> >>> >> > > wrote:
> >>>>> >>> >> > >
> >>>>> >>> >> > > > +1 for removing the not-used operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > So we are creating a process for operator writers who don't
> >>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
> >>>>> >>> >> > > > big is that set?
> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
> >>>>> >>> >> > > > the entire checklist, will they be ready to use that in
> >>>>> >>> >> > > > production?
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > This thread has 2 conflicting forces: reduce the operators and
> >>>>> >>> >> > > > make it easy to add more operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
> >>>>> >>> >> > > > wrote:
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > wrote:
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Pramod,
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators
> >>>>> >>> >> > > > > > into one folder, non-partitionable operators in another and so
> >>>>> >>> >> > > > > > on...
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Remember the original goal of making it easier for new members
> >>>>> >>> >> > > > > to contribute and managing those contributions to maturity. It
> >>>>> >>> >> > > > > is not a functional level separation.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being used
> >>>>> >>> >> > > > > > at class level and not at package/folder level.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like a
> >>>>> >>> >> > > > > folder..." as an analogy and not literally.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Thanks
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Thanks
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
> >>>>> >>> >> > > > > > wrote:
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > > Can the same goal not be achieved by using the
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> >>>>> >>> >> > > > > > > > annotation?
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so that
> >>>>> >>> >> > > > > > > it becomes clearer to users about the maturity level of these, easier
> >>>>> >>> >> > > > > > > for developers to track them towards the path to maturity and also
> >>>>> >>> >> > > > > > > provides a clearer directive for committers and contributors on
> >>>>> >>> >> > > > > > > acceptance of new submissions. Relying on the annotations alone makes
> >>>>> >>> >> > > > > > > them spread all over the place and adds an additional layer of
> >>>>> >>> >> > > > > > > difficulty in identification not just for users but also for developers
> >>>>> >>> >> > > > > > > who want to find such operators and improve them. This of this like a
> >>>>> >>> >> > > > > > > folder level annotation where everything under this folder is unstable
> >>>>> >>> >> > > > > > > or evolving.
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > Thanks
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
> >>>>> >>> >> > > > > > > > wrote:
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state has way too many operators that
> >>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should make
> >>>>> >>> >> > > > > > > > > > > > it obvious to users which operators are up to par and which
> >>>>> >>> >> > > > > > > > > > > > operators are not, and maybe even remove those that are likely
> >>>>> >>> >> > > > > > > > > > > > not ever used in a real use case.
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
> >>>>> >>> >> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original
> >>>>> >>> >> > > > > > > > > > > intent is for contributions going forward.
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators outside
> >>>>> >>> >> > > > > > > > > > the play area should be of well known quality.
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension if
> >>>>> >>> >> > > > > > > > > we establish clear criteria.
> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and put
> >>>>> >>> >> > > > > > > > > up the bars for new operators.
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > > > David
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > >
> >>>>> >>> >> >
> >>>>> >>> >>
> >>>>> >>> >
> >>>>> >>> >
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi,

Friendly reminder:

I created a shared Google Sheet and tracked the various details of the
operators. The sheet contains information about operators under lib/algo,
lib/math and lib/streamquery. The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
I also added a recommendation for each operator. Please take a look and
provide comments, if any.

Thanks
Lakshmi Prasanna

On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Added comments. I also recommend having the misc folder for the remaining
> operators in contrib, according to the proposed guidelines
>
> https://github.com/apache/apex-site/pull/44
>
> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <lakshmi@datatorrent.com>
> wrote:
>
> > Hi
> >
> > I also added recommendations for the lib/math operators to the same
> > document as a separate sheet. Please have a look.
> >
> > Thanks
> > Lakshmi Prasanna
> >
> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com>
> > wrote:
> >
> >> Hi,
> >>
> >> I also added a recommendation for each operator. Please take a look.
> >>
> >> thanks
> >>
> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> >> lakshmi@datatorrent.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I created a shared Google Sheet and tracked the various details of the
> >>> operators. Currently, the sheet contains information about operators
> >>> under lib/algo only. The link is
> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
> >>> I will update the sheet soon with lib/math too.
> >>>
> >>> Thanks
> >>> Lakshmi Prasanna
> >>>
> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
> >>> wrote:
> >>>
> >>>> Hi Lakshmi,
> >>>>
> >>>> Thanks for volunteering.
> >>>>
> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
> >>>> individual operators are both good, with the exception that
> >>>> lib/streamquery is one unit and we probably do not need to look at
> >>>> individual operators under it.
> >>>>
> >>>> If we don't have any objection in the community, let's start the
> >>>> process.
> >>>>
> >>>> David
> >>>>
> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
> >>>> lakshmi@datatorrent.com> wrote:
> >>>>
> >>>>> I am interested to work on this.
> >>>>>
> >>>>> Regards,
> >>>>> Lakshmi prasanna
> >>>>>
> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> > Why not have a shared Google Sheet with a list of operators and the
> >>>>> > options for what we want to do with each of them?
> >>>>> > I think it's case by case.
> >>>>> > But retiring unused or obsolete operators is important and we should
> >>>>> > do it sooner rather than later.
> >>>>> >
> >>>>> > Regards,
> >>>>> > Siyuan
> >>>>> >
> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
> >>>>> > wrote:
> >>>>> >
> >>>>> >>
> >>>>> >> My vote is to do 2&3
> >>>>> >>
> >>>>> >> Thks
> >>>>> >> Amol
> >>>>> >>
> >>>>> >>
> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >>>>> >> VKottapalli@directv.com> wrote:
> >>>>> >>
> >>>>> >>> +1 for deprecating the packages listed below.
> >>>>> >>>
> >>>>> >>> -----Original Message-----
> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>>> >>>
> >>>>> >>> +1
> >>>>> >>>
> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com>
> >>>>> >>> wrote:
> >>>>> >>>
> >>>>> >>> > Hi all,
> >>>>> >>> >
> >>>>> >>> > I would like to renew the discussion of retiring operators in
> >>>>> >>> > Malhar.
> >>>>> >>> >
> >>>>> >>> > As stated before, the reason we would like to retire operators in
> >>>>> >>> > Malhar is that some of them were written a long time ago, before
> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are
> >>>>> >>> > not up to par in code quality, have no potential for improvement,
> >>>>> >>> > and are probably completely unused by anybody.
> >>>>> >>> >
> >>>>> >>> > We do not want contributors to use them as a model for their
> >>>>> >>> > contributions, or users to use them thinking they are of good
> >>>>> >>> > quality and then hit a wall.
> >>>>> >>> > Neither scenario is beneficial to the reputation of Apex.
> >>>>> >>> >
> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
> >>>>> >>> > *lib/math*, and *lib/streamquery*.
> >>>>> >>> >
> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
> >>>>> >>> > using any operator in these 3 packages. We would like to hear from
> >>>>> >>> > you.
> >>>>> >>> >
> >>>>> >>> > These are the options I can think of for retiring those operators:
> >>>>> >>> >
> >>>>> >>> > 1) Completely remove them from the malhar repository.
> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
> >>>>> >>> >    malhar-misc.
> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
> >>>>> >>> >    longer supported.
> >>>>> >>> >
> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>>>> >>> >
> >>>>> >>> > David
> >>>>> >>> >
> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>>>> >>> > <pr...@datatorrent.com>
> >>>>> >>> > wrote:
> >>>>> >>> >
> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
> >>>>> >>> >> seemed to be favorable to this idea with no serious objections.
> >>>>> >>> >> Folks had good suggestions like documenting capabilities of
> >>>>> >>> >> operators, coming up with well-defined criteria for graduation of
> >>>>> >>> >> operators and what those criteria may be, and what to do with
> >>>>> >>> >> existing operators that may not yet be mature or are unused.
> >>>>> >>> >>
> >>>>> >>> >> I am going to summarize the key points that resulted from the
> >>>>> >>> >> discussion and would like to proceed with them.
> >>>>> >>> >>
> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities
> >>>>> >>> >>    that make an operator useful across different applications, such
> >>>>> >>> >>    as reusability, partitioning (static or dynamic), idempotency,
> >>>>> >>> >>    and exactly once, will still be accepted as long as they are
> >>>>> >>> >>    functionally correct and have unit tests, and will go into a
> >>>>> >>> >>    separate module.
> >>>>> >>> >>    - Contrib module was suggested as a place where new contributions
> >>>>> >>> >>    go in that don't yet have all the platform capabilities and are
> >>>>> >>> >>    not yet mature. If there are no other suggestions we will go with
> >>>>> >>> >>    this one.
> >>>>> >>> >>    - It was suggested that an operator's documentation list those
> >>>>> >>> >>    platform capabilities it currently provides from the list above.
> >>>>> >>> >>    I will document a structure for this in the contribution
> >>>>> >>> >>    guidelines.
> >>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
> >>>>> >>> >>    operator to the big leagues :). I will kick off a separate thread
> >>>>> >>> >>    for it as I think it requires its own discussion and hopefully we
> >>>>> >>> >>    can come up with a set of guidelines for it.
> >>>>> >>> >>    - David brought up the state of some of the existing operators and
> >>>>> >>> >>    their retirement and the layout of operators in Malhar in general
> >>>>> >>> >>    and how it causes problems with development. I will ask him to
> >>>>> >>> >>    lead the discussion on that.
> >>>>> >>> >>
> >>>>> >>> >> Thanks
> >>>>> >>> >>
> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com>
> >>>>> >>> >> wrote:
> >>>>> >>> >>
> >>>>> >>> >> > The two ideas are not conflicting, but rather complementary.
> >>>>> >>> >> >
> >>>>> >>> >> > On the contrary, putting a new process in place for people trying
> >>>>> >>> >> > to contribute while NOT addressing the old unused subpar operators
> >>>>> >>> >> > in the repository is what is conflicting.
> >>>>> >>> >> >
> >>>>> >>> >> > Keep in mind that when people try to contribute, they always look
> >>>>> >>> >> > at the existing operators already in the repository as examples and
> >>>>> >>> >> > likely a model for their new operators.
> >>>>> >>> >> >
> >>>>> >>> >> > David
> >>>>> >>> >> >
> >>>>> >>> >> >
> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
> >>>>> amol@datatorrent.com>
> >>>>> >>> >> wrote:
> >>>>> >>> >> >
> >>>>> >>> >> > > Yes there are two conflicting threads now. The original
> >>>>> thread
> >>>>> >>> >> > > was to
> >>>>> >>> >> > open
> >>>>> >>> >> > > up a way for contributors to submit code in a dir
> >>>>> (contrib?) as
> >>>>> >>> >> > > long
> >>>>> >>> >> as
> >>>>> >>> >> > > license part of taken care of.
> >>>>> >>> >> > >
> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we
> >>>>> know
> >>>>> >>> >> > > what is being used?
> >>>>> >>> >> > >
> >>>>> >>> >> > > Thks,
> >>>>> >>> >> > > Amol
> >>>>> >>> >> > >
> >>>>> >>> >> > >
> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >>>>> >>> >> sandesh@datatorrent.com>
> >>>>> >>> >> > > wrote:
> >>>>> >>> >> > >
> >>>>> >>> >> > > > +1 for removing the not-used operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > So we are creating a process for operator writers who
> >>>>> don't
> >>>>> >>> >> > > > want to understand the platform, yet wants to
> contribute?
> >>>>> How
> >>>>> >>> >> > > > big is that
> >>>>> >>> >> set?
> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not
> >>>>> passed
> >>>>> >>> >> > > > all
> >>>>> >>> >> the
> >>>>> >>> >> > > > checklist, will they be ready to use that in production?
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
> >>>>> operators and
> >>>>> >>> >> > > > make
> >>>>> >>> >> it
> >>>>> >>> >> > > easy
> >>>>> >>> >> > > > to add more operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> >>>>> >>> >> > pramod@datatorrent.com>
> >>>>> >>> >> > > > wrote:
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > wrote:
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Pramod,
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > By that logic I would say let's put all
> partitionable
> >>>>> >>> >> > > > > > operators
> >>>>> >>> >> > into
> >>>>> >>> >> > > > one
> >>>>> >>> >> > > > > > folder, non-partitionable operators in another and
> so
> >>>>> on...
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Remember the original goal of making it easier for new
> >>>>> >>> >> > > > > members to contribute and managing those contributions
> >>>>> to
> >>>>> >>> >> > > > > maturity. It is
> >>>>> >>> >> not a
> >>>>> >>> >> > > > > functional level separation.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
> >>>>> being
> >>>>> >>> >> > > > > > used at
> >>>>> >>> >> > > class
> >>>>> >>> >> > > > > > level and not at package/folder level.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
> this
> >>>>> like
> >>>>> >>> >> > > > > a
> >>>>> >>> >> > > folder..."
> >>>>> >>> >> > > > > as an analogy and not literally.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Thanks
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Thanks
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> >>>>> >>> >> > > > pramod@datatorrent.com
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > wrote:
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > > Can the same goal not be achieved by using the
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> >>>>> >>> >> > > > > > > > annotation?
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > I think it is important to localize the additions
> >>>>> in one
> >>>>> >>> >> place so
> >>>>> >>> >> > > > that
> >>>>> >>> >> > > > > it
> >>>>> >>> >> > > > > > > becomes clearer to users about the maturity level
> of
> >>>>> >>> >> > > > > > > these,
> >>>>> >>> >> > easier
> >>>>> >>> >> > > > for
> >>>>> >>> >> > > > > > > developers to track them towards the path to
> >>>>> maturity and
> >>>>> >>> >> > > > > > > also
> >>>>> >>> >> > > > > provides a
> >>>>> >>> >> > > > > > > clearer directive for committers and contributors
> on
> >>>>> >>> >> acceptance
> >>>>> >>> >> > of
> >>>>> >>> >> > > > new
> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone
> makes
> >>>>> them
> >>>>> >>> >> spread
> >>>>> >>> >> > all
> >>>>> >>> >> > > > > over
> >>>>> >>> >> > > > > > > the place and adds an additional layer of
> >>>>> difficulty in
> >>>>> >>> >> > > > identification
> >>>>> >>> >> > > > > > not
> >>>>> >>> >> > > > > > > just for users but also for developers who want to
> >>>>> find
> >>>>> >>> >> > > > > > > such
> >>>>> >>> >> > > > operators
> >>>>> >>> >> > > > > > and
> >>>>> >>> >> > > > > > > improve them. This of this like a folder level
> >>>>> annotation
> >>>>> >>> >> where
> >>>>> >>> >> > > > > > everything
> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > Thanks
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> >>>>> >>> >> > > david@datatorrent.com
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too
> >>>>> many
> >>>>> >>> >> operators
> >>>>> >>> >> > > > that
> >>>>> >>> >> > > > > > fall
> >>>>> >>> >> > > > > > > > in
> >>>>> >>> >> > > > > > > > > > the
> >>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
> >>>>> should
> >>>>> >>> >> > > > > > > > > > > > make it
> >>>>> >>> >> > > > obvious
> >>>>> >>> >> > > > > to
> >>>>> >>> >> > > > > > > > users
> >>>>> >>> >> > > > > > > > > > > that
> >>>>> >>> >> > > > > > > > > > > > which operators are up to par, and which
> >>>>> >>> >> > > > > > > > > > > > operators
> >>>>> >>> >> are
> >>>>> >>> >> > > not,
> >>>>> >>> >> > > > > and
> >>>>> >>> >> > > > > > > > maybe
> >>>>> >>> >> > > > > > > > > > > even
> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
> >>>>> used in a
> >>>>> >>> >> > > > > > > > > > > > real
> >>>>> >>> >> > use
> >>>>> >>> >> > > > > case.
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
> >>>>> operators
> >>>>> >>> >> > > > > > > > > > > and
> >>>>> >>> >> > doing
> >>>>> >>> >> > > > this
> >>>>> >>> >> > > > > > > > > exercise
> >>>>> >>> >> > > > > > > > > > as
> >>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
> >>>>> original
> >>>>> >>> >> intent
> >>>>> >>> >> > is
> >>>>> >>> >> > > > for
> >>>>> >>> >> > > > > > > > > > > contributions going forward.
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well.
> >>>>> >>> >> > > > > > > > > > Operators
> >>>>> >>> >> > > outside
> >>>>> >>> >> > > > > the
> >>>>> >>> >> > > > > > > play
> >>>>> >>> >> > > > > > > > > > area should be of well known quality.
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > I think this is important, and I don't
> >>>>> anticipate
> >>>>> >>> >> > > > > > > > > much
> >>>>> >>> >> > tension
> >>>>> >>> >> > > if
> >>>>> >>> >> > > > > we
> >>>>> >>> >> > > > > > > > > establish clear criteria.
> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
> >>>>> operators
> >>>>> >>> >> > > > > > > > > stay
> >>>>> >>> >> and
> >>>>> >>> >> > > put
> >>>>> >>> >> > > > up
> >>>>> >>> >> > > > > > the
> >>>>> >>> >> > > > > > > > > bars for new operators.
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > > > David
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > >
> >>>>> >>> >> >
> >>>>> >>> >>
> >>>>> >>> >
> >>>>> >>> >
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>

Re: A proposal for Malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
I added comments, and I also recommend having the misc folder for the
remaining operators in contrib, in line with the proposed guidelines.

https://github.com/apache/apex-site/pull/44

On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi
>
> I also added recommendation for lib/math operators to the same document as
> a separate sheet. Please have a look.
>
> Thanks
> Lakshmi Prasanna
>
> On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> Hi,
>>
>> I also added recommendation for each operator . Please take a look.
>>
>> thanks
>>
>> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> lakshmi@datatorrent.com> wrote:
>>
>>> Hi,
>>>
>>> I created a shared google sheet and tracked the various details of
>>> operators. Currently, the sheet contains information about operators under
>>> lib/algo only. Link is
>>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
>>> Will update the sheet soon with lib/math too.
>>>
>>> Thanks
>>> Lakshmi Prasanna
>>>
>>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> wrote:
>>>
>>>> Hi Lakshmi,
>>>>
>>>> Thanks for volunteering.
>>>>
>>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>>> individual operators are both good, with the exception that lib/streamquery
>>>> is one unit and we probably do not need to look at individual operators
>>>> under it.
>>>>
>>>> If we don't have any objection in the community, let's start the
>>>> process.
>>>>
>>>> David
>>>>
>>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>>> lakshmi@datatorrent.com> wrote:
>>>>
>>>>> I am interested to work on this.
>>>>>
>>>>> Regards,
>>>>> Lakshmi prasanna
>>>>>
>>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Why not have a shared Google Sheet with a list of operators and
>>>>> > the options for what we want to do with each of them?
>>>>> > I think it's case by case.
>>>>> > But retiring unused or obsolete operators is important and we should
>>>>> > do it sooner rather than later.
>>>>> >
>>>>> > Regards,
>>>>> > Siyuan
>>>>> >
>>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>>>> wrote:
>>>>> >
>>>>> >>
>>>>> >> My vote is to do 2&3
>>>>> >>
>>>>> >> Thks
>>>>> >> Amol
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>>>> >> VKottapalli@directv.com> wrote:
>>>>> >>
>>>>> >>> +1 for deprecating the packages listed below.
>>>>> >>>
>>>>> >>> -----Original Message-----
>>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>>> >>>
>>>>> >>> +1
>>>>> >>>
>>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com>
>>>>> >>> wrote:
>>>>> >>>
>>>>> >>> > Hi all,
>>>>> >>> >
>>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>>>> >>> >
>>>>> >>> > As stated before, the reason we would like to retire operators in
>>>>> >>> > Malhar is that some of them were written a long time ago, before
>>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>>> >>> > are probably completely unused by anybody.
>>>>> >>> >
>>>>> >>> > We do not want contributors to use them as a model for their
>>>>> >>> > contributions, or users to use them thinking they are of good
>>>>> >>> > quality and then hit a wall.
>>>>> >>> > Neither scenario is beneficial to the reputation of Apex.
>>>>> >>> >
>>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>>> >>> >
>>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>>>> >>> >
>>>>> >>> > These are the options I can think of for retiring those operators:
>>>>> >>> >
>>>>> >>> > 1) Completely remove them from the malhar repository.
>>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>>> >>> > malhar-misc
>>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>>> >>> > longer supported
>>>>> >>> >
>>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>>> >>> >
>>>>> >>> > David
>>>>> >>> >
>>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>>>> >>> > <pr...@datatorrent.com>
>>>>> >>> > wrote:
>>>>> >>> >
>>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>>>> >>> >> seemed to be favorable to this idea with no serious objections.
>>>>> >>> >> Folks had good suggestions, like documenting capabilities of
>>>>> >>> >> operators, coming up with well-defined criteria for graduation of
>>>>> >>> >> operators (and what those criteria may be), and what to do with
>>>>> >>> >> existing operators that may not yet be mature or may be unused.
>>>>> >>> >>
>>>>> >>> >> I am going to summarize the key points that resulted from the
>>>>> >>> >> discussion and would like to proceed with them.
>>>>> >>> >>
>>>>> >>> >>    - Operators that do not yet provide the key platform
>>>>> >>> >>    capabilities to make an operator useful across different
>>>>> >>> >>    applications such as reusability, partitioning static or
>>>>> >>> >>    dynamic, idempotency, exactly once will still be accepted as
>>>>> >>> >>    long as they are functionally correct, have unit tests and
>>>>> >>> >>    will go into a separate module.
>>>>> >>> >>    - Contrib module was suggested as a place where new
>>>>> >>> >>    contributions go in that don't yet have all the platform
>>>>> >>> >>    capabilities and are not yet mature. If there are no other
>>>>> >>> >>    suggestions we will go with this one.
>>>>> >>> >>    - It was suggested that the operator's documentation list the
>>>>> >>> >>    platform capabilities it currently provides from the list
>>>>> >>> >>    above. I will document a structure for this in the
>>>>> >>> >>    contribution guidelines.
>>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate
>>>>> >>> >>    an operator to the big leagues :). I will kick off a separate
>>>>> >>> >>    thread for it as I think it requires its own discussion and
>>>>> >>> >>    hopefully we can come up with a set of guidelines for it.
>>>>> >>> >>    - David brought up the state of some of the existing operators
>>>>> >>> >>    and their retirement and the layout of operators in Malhar in
>>>>> >>> >>    general and how it causes problems with development. I will
>>>>> >>> >>    ask him to lead the discussion on that.
>>>>> >>> >>
>>>>> >>> >> Thanks
>>>>> >>> >>
>>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
>>>>> david@datatorrent.com>
>>>>> >>> wrote:
>>>>> >>> >>
>>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>>>>> >>> >> >
>>>>> >>> >> > On the contrary, putting a new process for people trying to
>>>>> >>> >> > contribute while NOT addressing the old unused subpar operators
>>>>> >>> >> > in the repository is what is conflicting.
>>>>> >>> >> >
>>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>>>> >>> >> > at the existing operators already in the repository as examples
>>>>> >>> >> > and likely a model for their new operators.
>>>>> >>> >> >
>>>>> >>> >> > David
>>>>> >>> >> >
>>>>> >>> >> >
>>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>>>>> amol@datatorrent.com>
>>>>> >>> >> wrote:
>>>>> >>> >> >
>>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>>>> >>> >> > > (contrib?) as long as the license part is taken care of.
>>>>> >>> >> > >
>>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>>> >>> >> > > what is being used?
>>>>> >>> >> > >
>>>>> >>> >> > > Thks,
>>>>> >>> >> > > Amol
>>>>> >>> >> > >
>>>>> >>> >> > >
>>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>>>>> >>> >> sandesh@datatorrent.com>
>>>>> >>> >> > > wrote:
>>>>> >>> >> > >
>>>>> >>> >> > > > +1 for removing the not-used operators.
>>>>> >>> >> > > >
>>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>>>>> >>> >> > > > big is that set?
>>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>>>>> >>> >> > > > the whole checklist, will they be ready to use that in
>>>>> >>> >> > > > production?
>>>>> >>> >> > > >
>>>>> >>> >> > > > This thread has 2 conflicting forces: reduce the operators and
>>>>> >>> >> > > > make it easy to add more operators.
>>>>> >>> >> > > >
>>>>> >>> >> > > >
>>>>> >>> >> > > >
>>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>>>>> >>> >> > pramod@datatorrent.com>
>>>>> >>> >> > > > wrote:
>>>>> >>> >> > > >
>>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>>>>> >>> >> > > gaurav.gopi123@gmail.com>
>>>>> >>> >> > > > > wrote:
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > Pramod,
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>>>> >>> >> > > > > > operators into one folder, non-partitionable operators
>>>>> >>> >> > > > > > in another and so on...
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>>>> >>> >> > > > > members to contribute and managing those contributions to
>>>>> >>> >> > > > > maturity. It is not a functional level separation.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > Thanks
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > Thanks
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>>>>> >>> >> > > > pramod@datatorrent.com
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > wrote:
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
>>>>> >>> >> > > > > > > wrote:
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > > Can the same goal not be achieved by using the
>>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>>> >>> >> > > > > > > > annotation?
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > I think it is important to localize the additions in one
>>>>> >>> >> > > > > > > place so that it becomes clearer to users about the
>>>>> >>> >> > > > > > > maturity level of these, easier for developers to track
>>>>> >>> >> > > > > > > them towards the path to maturity and also provides a
>>>>> >>> >> > > > > > > clearer directive for committers and contributors on
>>>>> >>> >> > > > > > > acceptance of new submissions. Relying on the annotations
>>>>> >>> >> > > > > > > alone makes them spread all over the place and adds an
>>>>> >>> >> > > > > > > additional layer of difficulty in identification not just
>>>>> >>> >> > > > > > > for users but also for developers who want to find such
>>>>> >>> >> > > > > > > operators and improve them. This of this like a folder
>>>>> >>> >> > > > > > > level annotation where everything under this folder is
>>>>> >>> >> > > > > > > unstable or evolving.
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > Thanks
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>>>>> >>> >> > > david@datatorrent.com
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > > wrote:
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too
>>>>> many
>>>>> >>> >> operators
>>>>> >>> >> > > > that
>>>>> >>> >> > > > > > fall
>>>>> >>> >> > > > > > > > in
>>>>> >>> >> > > > > > > > > > the
>>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
>>>>> should
>>>>> >>> >> > > > > > > > > > > > make it
>>>>> >>> >> > > > obvious
>>>>> >>> >> > > > > to
>>>>> >>> >> > > > > > > > users
>>>>> >>> >> > > > > > > > > > > that
>>>>> >>> >> > > > > > > > > > > > which operators are up to par, and which
>>>>> >>> >> > > > > > > > > > > > operators
>>>>> >>> >> are
>>>>> >>> >> > > not,
>>>>> >>> >> > > > > and
>>>>> >>> >> > > > > > > > maybe
>>>>> >>> >> > > > > > > > > > > even
>>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
>>>>> used in a
>>>>> >>> >> > > > > > > > > > > > real
>>>>> >>> >> > use
>>>>> >>> >> > > > > case.
>>>>> >>> >> > > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
>>>>> operators
>>>>> >>> >> > > > > > > > > > > and
>>>>> >>> >> > doing
>>>>> >>> >> > > > this
>>>>> >>> >> > > > > > > > > exercise
>>>>> >>> >> > > > > > > > > > as
>>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
>>>>> original
>>>>> >>> >> intent
>>>>> >>> >> > is
>>>>> >>> >> > > > for
>>>>> >>> >> > > > > > > > > > > contributions going forward.
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > IMO it is important to address this as well.
>>>>> >>> >> > > > > > > > > > Operators
>>>>> >>> >> > > outside
>>>>> >>> >> > > > > the
>>>>> >>> >> > > > > > > play
>>>>> >>> >> > > > > > > > > > area should be of well known quality.
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > I think this is important, and I don't
>>>>> anticipate
>>>>> >>> >> > > > > > > > > much
>>>>> >>> >> > tension
>>>>> >>> >> > > if
>>>>> >>> >> > > > > we
>>>>> >>> >> > > > > > > > > establish clear criteria.
>>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
>>>>> operators
>>>>> >>> >> > > > > > > > > stay
>>>>> >>> >> and
>>>>> >>> >> > > put
>>>>> >>> >> > > > up
>>>>> >>> >> > > > > > the
>>>>> >>> >> > > > > > > > > bars for new operators.
>>>>> >>> >> > > > > > > > >
>>>>> >>> >> > > > > > > > > David
>>>>> >>> >> > > > > > > > >
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > >
>>>>> >>> >> > >
>>>>> >>> >> >
>>>>> >>> >>
>>>>> >>> >
>>>>> >>> >
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>

Re: A proposal for Malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Added comments, also recommend having the misc folder for the remaining
operators in contrib according to proposed guidelines

https://github.com/apache/apex-site/pull/44

On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi
>
> I also added recommendation for lib/math operators to the same document as
> a separate sheet. Please have a look.
>
> Thanks
> Lakshmi Prasanna
>
> On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> Hi,
>>
>> I also added recommendation for each operator . Please take a look.
>>
>> thanks
>>
>> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> lakshmi@datatorrent.com> wrote:
>>
>>> Hi,
>>>
>>> I created a shared google sheet and tracked the various details of
>>> operators. Currently, the sheet contains information about operators under
>>> lib/algo only. Link is https://docs.google.com/a/
>>> datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_
>>> CaWpXt3GDccM/edit?usp=sharing . Will update the sheet soon with
>>> lib/math too.
>>>
>>> Thanks
>>> Lakshmi Prasanna
>>>
>>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> wrote:
>>>
>>>> Hi Lakshmi,
>>>>
>>>> Thanks for volunteering.
>>>>
>>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>>> individual operators are both good, with the exception that lib/streamquery
>>>> is one unit and we probably do not need to look at individual operators
>>>> under it.
>>>>
>>>> If we don't have any objection in the community, let's start the
>>>> process.
>>>>
>>>> David
>>>>
>>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>>> lakshmi@datatorrent.com> wrote:
>>>>
>>>>> I am interested to work on this.
>>>>>
>>>>> Regards,
>>>>> Lakshmi prasanna
>>>>>
>>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Why not have a shared google sheet with a list of operators and
>>>>> options
>>>>> > that we want to do with it.
>>>>> > I think it's case by case.
>>>>> > But retire unused or obsolete operators is important and we should
>>>>> do it
>>>>> > sooner rather than later.
>>>>> >
>>>>> > Regards,
>>>>> > Siyuan
>>>>> >
>>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>>>> wrote:
>>>>> >
>>>>> >>
>>>>> >> My vote is to do 2&3
>>>>> >>
>>>>> >> Thks
>>>>> >> Amol
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>>>> >> VKottapalli@directv.com> wrote:
>>>>> >>
>>>>> >>> +1 for deprecating the packages listed below.
>>>>> >>>
>>>>> >>> -----Original Message-----
>>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>>> >>>
>>>>> >>> +1
>>>>> >>>
>>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com
>>>>> >
>>>>> >>> wrote:
>>>>> >>>
>>>>> >>> > Hi all,
>>>>> >>> >
>>>>> >>> > I would like to renew the discussion of retiring operators in
>>>>> Malhar.
>>>>> >>> >
>>>>> >>> > As stated before, the reason why we would like to retire
>>>>> operators in
>>>>> >>> > Malhar is because some of them were written a long time ago
>>>>> before
>>>>> >>> > Apache incubation, and they do not pertain to real use cases,
>>>>> are not
>>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>>> >>> > probably completely unused by anybody.
>>>>> >>> >
>>>>> >>> > We do not want contributors to use them as a model of their
>>>>> >>> > contribution, or users to use them thinking they are of quality,
>>>>> and
>>>>> >>> then hit a wall.
>>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>>>> >>> >
>>>>> >>> > The initial 3 packages that we would like to target are
>>>>> *lib/algo*,
>>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>>> >>>
>>>>> >>> >
>>>>> >>> > I'm adding this thread to the users list. Please speak up if you
>>>>> are
>>>>> >>> > using any operator in these 3 packages. We would like to hear
>>>>> from you.
>>>>> >>> >
>>>>> >>> > These are the options I can think of for retiring those
>>>>> operators:
>>>>> >>> >
>>>>> >>> > 1) Completely remove them from the malhar repository.
>>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>>> >>> > malhar-misc
>>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>>> >>> > longer supported
>>>>> >>> >
>>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>>> >>> >
>>>>> >>> > David
>>>>> >>> >
>>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>>>> >>> > <pr...@datatorrent.com>
>>>>> >>> > wrote:
>>>>> >>> >
>>>>> >>> >> I wanted to close the loop on this discussion. In general
>>>>> everyone
>>>>> >>> >> seemed to be favorable to this idea with no serious objections.
>>>>> Folks
>>>>> >>> >> had good suggestions like documenting capabilities of
>>>>> operators, come
>>>>> >>> >> up well defined criteria for graduation of operators and what
>>>>> those
>>>>> >>> >> criteria may be and what to do with existing operators that may
>>>>> not
>>>>> >>> >> yet be mature or unused.
>>>>> >>> >>
>>>>> >>> >> I am going to summarize the key points that resulted from the
>>>>> >>> >> discussion and would like to proceed with them.
>>>>> >>> >>
>>>>> >>> >>    - Operators that do not yet provide the key platform capabilities
>>>>> >>> >>    that make an operator useful across different applications, such as
>>>>> >>> >>    reusability, static or dynamic partitioning, idempotency, and
>>>>> >>> >>    exactly-once processing, will still be accepted as long as they are
>>>>> >>> >>    functionally correct and have unit tests. They will go into a
>>>>> >>> >>    separate module.
>>>>> >>> >>    - A contrib module was suggested as the place for new contributions
>>>>> >>> >>    that don't yet have all the platform capabilities and are not yet
>>>>> >>> >>    mature. If there are no other suggestions we will go with this one.
>>>>> >>> >>    - It was suggested that each operator's documentation list the
>>>>> >>> >>    platform capabilities, from the list above, that it currently
>>>>> >>> >>    provides. I will document a structure for this in the contribution
>>>>> >>> >>    guidelines.
>>>>> >>> >>    - Folks wanted to know what the criteria would be to graduate an
>>>>> >>> >>    operator to the big leagues :). I will kick off a separate thread
>>>>> >>> >>    for it, as I think it requires its own discussion, and hopefully we
>>>>> >>> >>    can come up with a set of guidelines for it.
>>>>> >>> >>    - David brought up the state of some of the existing operators and
>>>>> >>> >>    their retirement, and the layout of operators in Malhar in general
>>>>> >>> >>    and how it causes problems with development. I will ask him to lead
>>>>> >>> >>    the discussion on that.
>>>>> >>> >>
>>>>> >>> >> Thanks
>>>>> >>> >>
>>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com> wrote:
>>>>> >>> >>
>>>>> >>> >> > The two ideas are not conflicting, but rather complementary.
>>>>> >>> >> >
>>>>> >>> >> > On the contrary, putting a new process in place for people trying to
>>>>> >>> >> > contribute while NOT addressing the old, unused, subpar operators in
>>>>> >>> >> > the repository is what is conflicting.
>>>>> >>> >> >
>>>>> >>> >> > Keep in mind that when people try to contribute, they always look at
>>>>> >>> >> > the existing operators already in the repository as examples and
>>>>> >>> >> > likely as a model for their new operators.
>>>>> >>> >> >
>>>>> >>> >> > David
>>>>> >>> >> >
>>>>> >>> >> >
>>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com> wrote:
>>>>> >>> >> >
>>>>> >>> >> > > Yes, there are two conflicting threads now. The original thread was
>>>>> >>> >> > > to open up a way for contributors to submit code in a dir
>>>>> >>> >> > > (contrib?) as long as the license part is taken care of.
>>>>> >>> >> > >
>>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>>> >>> >> > > what is being used?
>>>>> >>> >> > >
>>>>> >>> >> > > Thks,
>>>>> >>> >> > > Amol
>>>>> >>> >> > >
>>>>> >>> >> > >
>>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>>>> >>> >> > > wrote:
>>>>> >>> >> > >
>>>>> >>> >> > > > +1 for removing the not-used operators.
>>>>> >>> >> > > >
>>>>> >>> >> > > > So we are creating a process for operator writers who don't want
>>>>> >>> >> > > > to understand the platform, yet want to contribute? How big is
>>>>> >>> >> > > > that set? If we tell the app user that here is code which has
>>>>> >>> >> > > > not passed the whole checklist, will they be ready to use it in
>>>>> >>> >> > > > production?
>>>>> >>> >> > > >
>>>>> >>> >> > > > This thread has 2 conflicting forces: reduce the operators, and
>>>>> >>> >> > > > make it easy to add more operators.
>>>>> >>> >> > > >
>>>>> >>> >> > > >
>>>>> >>> >> > > >
>>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>>>> >>> >> > > > wrote:
>>>>> >>> >> > > >
>>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>>> >>> >> > > > > wrote:
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > Pramod,
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators
>>>>> >>> >> > > > > > into one folder, non-partitionable operators in another and so
>>>>> >>> >> > > > > > on...
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > Remember the original goal of making it easier for new members to
>>>>> >>> >> > > > > contribute and managing those contributions to maturity. It is
>>>>> >>> >> > > > > not a functional level separation.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > When I look at Hadoop code I see these annotations being used at
>>>>> >>> >> > > > > > class level and not at package/folder level.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like a
>>>>> >>> >> > > > > folder..." as an analogy and not literally.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > Thanks
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > Thanks
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>>>> >>> >> > > > > > wrote:
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>>> >>> >> > > > > > > wrote:
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>>> >>> >> > > > > > > > annotation?
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so
>>>>> >>> >> > > > > > > that it becomes clearer to users about the maturity level of
>>>>> >>> >> > > > > > > these, easier for developers to track them towards the path to
>>>>> >>> >> > > > > > > maturity and also provides a clearer directive for committers and
>>>>> >>> >> > > > > > > contributors on acceptance of new submissions. Relying on the
>>>>> >>> >> > > > > > > annotations alone makes them spread all over the place and adds
>>>>> >>> >> > > > > > > an additional layer of difficulty in identification not just for
>>>>> >>> >> > > > > > > users but also for developers who want to find such operators and
>>>>> >>> >> > > > > > > improve them. This of this like a folder level annotation where
>>>>> >>> >> > > > > > > everything under this folder is unstable or evolving.
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > Thanks
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>>>>> >>> >> > > > > > > > wrote:
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > > Malhar in its current state has way too many operators that
>>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should
>>>>> >>> >> > > > > > > > > > > > make it obvious to users which operators are up to par and
>>>>> >>> >> > > > > > > > > > > > which operators are not, and maybe even remove those that
>>>>> >>> >> > > > > > > > > > > > are likely not ever used in a real use case.
>>>>> >>> >> > > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing
>>>>> >>> >> > > > > > > > > > > this exercise as this can cause unnecessary tensions. My
>>>>> >>> >> > > > > > > > > > > original intent is for contributions going forward.
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>>>>> >>> >> > > > > > > > > > outside the play area should be of well known quality.
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension
>>>>> >>> >> > > > > > > > > if we establish clear criteria.
>>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and
>>>>> >>> >> > > > > > > > > raise the bar for new operators.
>>>>> >>> >> > > > > > > > >
>>>>> >>> >> > > > > > > > > David
>>>>> >>> >> > > > > > > > >
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > >
>>>>> >>> >> > >
>>>>> >>> >> >
>>>>> >>> >>
>>>>> >>> >
>>>>> >>> >
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi

I also added recommendations for the lib/math operators to the same document,
as a separate sheet. Please have a look.

Thanks
Lakshmi Prasanna

On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> I also added a recommendation for each operator. Please take a look.
>
> thanks
>
> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> lakshmi@datatorrent.com> wrote:
>
>> Hi,
>>
>> I created a shared google sheet and tracked the various details of
>> operators. Currently, the sheet contains information about operators under
>> lib/algo only. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> Will update the sheet soon with lib/math too.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:
>>
>>> Hi Lakshmi,
>>>
>>> Thanks for volunteering.
>>>
>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> individual operators are both good, with the exception that lib/streamquery
>>> is one unit and we probably do not need to look at individual operators
>>> under it.
>>>
>>> If we don't have any objection in the community, let's start the process.
>>>
>>> David
>>>
>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com> wrote:
>>>
>>>> I am interested in working on this.
>>>>
>>>> Regards,
>>>> Lakshmi prasanna
>>>>
>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>>> wrote:
>>>>
>>>> > Why not have a shared Google Sheet with a list of operators and what
>>>> > we want to do with each of them?
>>>> > I think it's case by case.
>>>> > But retiring unused or obsolete operators is important, and we should
>>>> > do it sooner rather than later.
>>>> >
>>>> > Regards,
>>>> > Siyuan
>>>> >
>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>>> wrote:
>>>> >
>>>> >>
>>>> >> My vote is to do 2&3
>>>> >>
>>>> >> Thks
>>>> >> Amol
>>>> >>
>>>> >>
>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>>> >> VKottapalli@directv.com> wrote:
>>>> >>
>>>> >>> +1 for deprecating the packages listed below.
>>>> >>>
>>>> >>> -----Original Message-----
>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>> >>>
>>>> >>> +1
>>>> >>>
>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>>>> >>> wrote:
>>>> >>>
>>>> >>> > Hi all,
>>>> >>> >
>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>>> >>> >
>>>> >>> > As stated before, the reason we would like to retire operators in
>>>> >>> > Malhar is that some of them were written a long time ago, before
>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>> >>> > are probably completely unused by anybody.
>>>> >>> >
>>>> >>> > We do not want contributors to use them as a model for their
>>>> >>> > contributions, or users to use them thinking they are of quality and
>>>> >>> > then hit a wall. Neither scenario is beneficial to the reputation of
>>>> >>> > Apex.
>>>> >>> >
>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>> >>> >
>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>>> >>> >
>>>> >>> > These are the options I can think of for retiring those operators:
>>>> >>> >
>>>> >>> > 1) Completely remove them from the malhar repository.
>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>> >>> > malhar-misc
>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>> >>> > longer supported
>>>> >>> >
>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>> >>> >
>>>> >>> > David
>>>> >>> >
>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>>> >>> > <pr...@datatorrent.com>
>>>> >>> > wrote:
>>>> >>> >
>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>>>> >>> >> had good suggestions like documenting capabilities of operators,
>>>> >>> >> defining clear criteria for graduation of operators, and deciding
>>>> >>> >> what to do with existing operators that may not yet be mature or may
>>>> >>> >> be unused.
>>>> >>> >>
>>>> >>> >> I am going to summarize the key points that resulted from the
>>>> >>> >> discussion and would like to proceed with them.
>>>> >>> >>
>>>> >>> >>    - Operators that do not yet provide the key platform capabilities
>>>> >>> >>    that make an operator useful across different applications, such as
>>>> >>> >>    reusability, static or dynamic partitioning, idempotency, and
>>>> >>> >>    exactly-once processing, will still be accepted as long as they are
>>>> >>> >>    functionally correct and have unit tests. They will go into a
>>>> >>> >>    separate module.
>>>> >>> >>    - A contrib module was suggested as the place for new contributions
>>>> >>> >>    that don't yet have all the platform capabilities and are not yet
>>>> >>> >>    mature. If there are no other suggestions we will go with this one.
>>>> >>> >>    - It was suggested that each operator's documentation list the
>>>> >>> >>    platform capabilities, from the list above, that it currently
>>>> >>> >>    provides. I will document a structure for this in the contribution
>>>> >>> >>    guidelines.
>>>> >>> >>    - Folks wanted to know what the criteria would be to graduate an
>>>> >>> >>    operator to the big leagues :). I will kick off a separate thread
>>>> >>> >>    for it, as I think it requires its own discussion, and hopefully we
>>>> >>> >>    can come up with a set of guidelines for it.
>>>> >>> >>    - David brought up the state of some of the existing operators and
>>>> >>> >>    their retirement, and the layout of operators in Malhar in general
>>>> >>> >>    and how it causes problems with development. I will ask him to lead
>>>> >>> >>    the discussion on that.
>>>> >>> >>
>>>> >>> >> Thanks
>>>> >>> >>
>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com> wrote:
>>>> >>> >>
>>>> >>> >> > The two ideas are not conflicting, but rather complementary.
>>>> >>> >> >
>>>> >>> >> > On the contrary, putting a new process in place for people trying to
>>>> >>> >> > contribute while NOT addressing the old, unused, subpar operators in
>>>> >>> >> > the repository is what is conflicting.
>>>> >>> >> >
>>>> >>> >> > Keep in mind that when people try to contribute, they always look at
>>>> >>> >> > the existing operators already in the repository as examples and
>>>> >>> >> > likely as a model for their new operators.
>>>> >>> >> >
>>>> >>> >> > David
>>>> >>> >> >
>>>> >>> >> >
>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com> wrote:
>>>> >>> >> >
>>>> >>> >> > > Yes, there are two conflicting threads now. The original thread was
>>>> >>> >> > > to open up a way for contributors to submit code in a dir
>>>> >>> >> > > (contrib?) as long as the license part is taken care of.
>>>> >>> >> > >
>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>> >>> >> > > what is being used?
>>>> >>> >> > >
>>>> >>> >> > > Thks,
>>>> >>> >> > > Amol
>>>> >>> >> > >
>>>> >>> >> > >
>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>>> >>> >> > > wrote:
>>>> >>> >> > >
>>>> >>> >> > > > +1 for removing the not-used operators.
>>>> >>> >> > > >
>>>> >>> >> > > > So we are creating a process for operator writers who don't want
>>>> >>> >> > > > to understand the platform, yet want to contribute? How big is
>>>> >>> >> > > > that set? If we tell the app user that here is code which has
>>>> >>> >> > > > not passed the whole checklist, will they be ready to use it in
>>>> >>> >> > > > production?
>>>> >>> >> > > >
>>>> >>> >> > > > This thread has 2 conflicting forces: reduce the operators, and
>>>> >>> >> > > > make it easy to add more operators.
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>>> >>> >> > > > wrote:
>>>> >>> >> > > >
>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>> >>> >> > > > > wrote:
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Pramod,
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > By that logic I would say let's put all partitionable operators
>>>> >>> >> > > > > > into one folder, non-partitionable operators in another and so
>>>> >>> >> > > > > > on...
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > Remember the original goal of making it easier for new members to
>>>> >>> >> > > > > contribute and managing those contributions to maturity. It is
>>>> >>> >> > > > > not a functional level separation.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > When I look at Hadoop code I see these annotations being used at
>>>> >>> >> > > > > > class level and not at package/folder level.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like a
>>>> >>> >> > > > > folder..." as an analogy and not literally.
>>>> >>> >> > > > >
>>>> >>> >> > > > > Thanks
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Thanks
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>>> >>> >> > > > > > wrote:
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>> >>> >> > > > > > > wrote:
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>> >>> >> > > > > > > > annotation?
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so
>>>> >>> >> > > > > > > that it becomes clearer to users about the maturity level of
>>>> >>> >> > > > > > > these, easier for developers to track them towards the path to
>>>> >>> >> > > > > > > maturity and also provides a clearer directive for committers and
>>>> >>> >> > > > > > > contributors on acceptance of new submissions. Relying on the
>>>> >>> >> > > > > > > annotations alone makes them spread all over the place and adds
>>>> >>> >> > > > > > > an additional layer of difficulty in identification not just for
>>>> >>> >> > > > > > > users but also for developers who want to find such operators and
>>>> >>> >> > > > > > > improve them. This of this like a folder level annotation where
>>>> >>> >> > > > > > > everything under this folder is unstable or evolving.
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > Thanks
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>>>> >>> >> > > > > > > > wrote:
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > > Malhar in its current state has way too many operators that
>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should
>>>> >>> >> > > > > > > > > > > > make it obvious to users which operators are up to par and
>>>> >>> >> > > > > > > > > > > > which operators are not, and maybe even remove those that
>>>> >>> >> > > > > > > > > > > > are likely not ever used in a real use case.
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing
>>>> >>> >> > > > > > > > > > > this exercise as this can cause unnecessary tensions. My
>>>> >>> >> > > > > > > > > > > original intent is for contributions going forward.
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>>>> >>> >> > > > > > > > > > outside the play area should be of well known quality.
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension
>>>> >>> >> > > > > > > > > if we establish clear criteria.
>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and
>>>> >>> >> > > > > > > > > raise the bar for new operators.
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > > > David
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > >
>>>> >>> >> > >
>>>> >>> >> >
>>>> >>> >>
>>>> >>> >
>>>> >>> >
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi

I also added recommendation for lib/math operators to the same document as
a separate sheet. Please have a look.

Thanks
Lakshmi Prasanna

On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> I also added recommendation for each operator . Please take a look.
>
> thanks
>
> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> lakshmi@datatorrent.com> wrote:
>
>> Hi,
>>
>> I created a shared google sheet and tracked the various details of
>> operators. Currently, the sheet contains information about operators under
>> lib/algo only. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> Will update the sheet soon with lib/math too.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:
>>
>>> Hi Lakshmi,
>>>
>>> Thanks for volunteering.
>>>
>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> individual operators are both good, with the exception that lib/streamquery
>>> is one unit and we probably do not need to look at individual operators
>>> under it.
>>>
>>> If we don't have any objection in the community, let's start the process.
>>>
>>> David
>>>
>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com> wrote:
>>>
>>>> I am interested to work on this.
>>>>
>>>> Regards,
>>>> Lakshmi prasanna
>>>>
>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>>> wrote:
>>>>
>>>> > Why not have a shared google sheet with a list of operators and
>>>> options
>>>> > that we want to do with it.
>>>> > I think it's case by case.
>>>> > But retire unused or obsolete operators is important and we should do
>>>> it
>>>> > sooner rather than later.
>>>> >
>>>> > Regards,
>>>> > Siyuan
>>>> >
>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>>> wrote:
>>>> >
>>>> >>
>>>> >> My vote is to do 2&3
>>>> >>
>>>> >> Thks
>>>> >> Amol
>>>> >>
>>>> >>
>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>>> >> VKottapalli@directv.com> wrote:
>>>> >>
>>>> >>> +1 for deprecating the packages listed below.
>>>> >>>
>>>> >>> -----Original Message-----
>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>> >>>
>>>> >>> +1
>>>> >>>
>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>>>> >>> wrote:
>>>> >>>
>>>> >>> > Hi all,
>>>> >>> >
>>>> >>> > I would like to renew the discussion of retiring operators in
>>>> Malhar.
>>>> >>> >
>>>> >>> > As stated before, the reason why we would like to retire
>>>> operators in
>>>> >>> > Malhar is because some of them were written a long time ago before
>>>> >>> > Apache incubation, and they do not pertain to real use cases, are
>>>> not
>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>> >>> > probably completely unused by anybody.
>>>> >>> >
>>>> >>> > We do not want contributors to use them as a model of their
>>>> >>> > contribution, or users to use them thinking they are of quality,
>>>> and
>>>> >>> then hit a wall.
>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>>> >>> >
>>>> >>> > The initial 3 packages that we would like to target are
>>>> *lib/algo*,
>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>> >>>
>>>> >>> >
>>>> >>> > I'm adding this thread to the users list. Please speak up if you
>>>> are
>>>> >>> > using any operator in these 3 packages. We would like to hear
>>>> from you.
>>>> >>> >
>>>> >>> > These are the options I can think of for retiring those operators:
>>>> >>> >
>>>> >>> > 1) Completely remove them from the malhar repository.
>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>> >>> > malhar-misc
>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>> >>> > longer supported
>>>> >>> >
>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>> >>> >
>>>> >>> > David
>>>> >>> >
>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>>> >>> > <pr...@datatorrent.com>
>>>> >>> > wrote:
>>>> >>> >
>>>> >>> >> I wanted to close the loop on this discussion. In general
>>>> everyone
>>>> >>> >> seemed to be favorable to this idea with no serious objections.
>>>> Folks
>>>> >>> >> had good suggestions like documenting capabilities of operators,
>>>> come
>>>> >>> >> up well defined criteria for graduation of operators and what
>>>> those
>>>> >>> >> criteria may be and what to do with existing operators that may
>>>> not
>>>> >>> >> yet be mature or unused.
>>>> >>> >>
>>>> >>> >> I am going to summarize the key points that resulted from the
>>>> >>> >> discussion and would like to proceed with them.
>>>> >>> >>
>>>> >>> >>    - Operators that do not yet provide the key platform
>>>> capabilities
>>>> >>> to
>>>> >>> >>    make an operator useful across different applications such as
>>>> >>> >> reusability,
>>>> >>> >>    partitioning static or dynamic, idempotency, exactly once will
>>>> >>> still be
>>>> >>> >>    accepted as long as they are functionally correct, have unit
>>>> tests
>>>> >>> >> and will
>>>> >>> >>    go into a separate module.
>>>> >>> >>    - Contrib module was suggested as a place where new
>>>> contributions
>>>> >>> go in
>>>> >>> >>    that don't yet have all the platform capabilities and are not
>>>> yet
>>>> >>> >> mature.
>>>> >>> >>    If there are no other suggestions we will go with this one.
>>>> >>> >>    - It was suggested the operators documentation list those
>>>> platform
>>>> >>> >>    capabilities it currently provides from the list above. I will
>>>> >>> >> document a
>>>> >>> >>    structure for this in the contribution guidelines.
>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate
>>>> an
>>>> >>> >>    operator to the big leagues :). I will kick-off a separate
>>>> thread
>>>> >>> >> for it as
>>>> >>> >>    I think it requires its own discussion and hopefully we can
>>>> come
>>>> >>> >> up with a
>>>> >>> >>    set of guidelines for it.
>>>> >>> >>    - David brought up state of some of the existing operators and
>>>> >>> their
>>>> >>> >>    retirement and the layout of operators in Malhar in general
>>>> and
>>>> >>> how it
>>>> >>> >>    causes problems with development. I will ask him to lead the
>>>> >>> >> discussion on
>>>> >>> >>    that.
>>>> >>> >>
>>>> >>> >> Thanks
>>>> >>> >>
>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com> wrote:
>>>> >>> >>
>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>>>> >>> >> >
>>>> >>> >> > On the contrary, putting a new process for people trying to
>>>> >>> >> > contribute while NOT addressing the old unused subpar operators in
>>>> >>> >> > the repository is what is conflicting.
>>>> >>> >> >
>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>>> >>> >> > at the existing operators already in the repository as examples and
>>>> >>> >> > likely a model for their new operators.
>>>> >>> >> >
>>>> >>> >> > David
>>>> >>> >> >
>>>> >>> >> >
>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com> wrote:
>>>> >>> >> >
>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>>> >>> >> > > (contrib?) as long as license part is taken care of.
>>>> >>> >> > >
>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>> >>> >> > > what is being used?
>>>> >>> >> > >
>>>> >>> >> > > Thks,
>>>> >>> >> > > Amol
>>>> >>> >> > >
>>>> >>> >> > >
>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>>> >>> >> > > wrote:
>>>> >>> >> > >
>>>> >>> >> > > > +1 for removing the not-used operators.
>>>> >>> >> > > >
>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>>>> >>> >> > > > big is that set?
>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>>>> >>> >> > > > all the checklist, will they be ready to use that in production?
>>>> >>> >> > > >
>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>>> >>> >> > > > make it easy to add more operators.
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>>> >>> >> > > > wrote:
>>>> >>> >> > > >
>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>> >>> >> > > > > wrote:
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Pramod,
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>>> >>> >> > > > > > operators into one folder, non-partitionable operators in
>>>> >>> >> > > > > > another and so on...
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>>> >>> >> > > > > members to contribute and managing those contributions to
>>>> >>> >> > > > > maturity. It is not a functional level separation.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>>> >>> >> > > > >
>>>> >>> >> > > > > Thanks
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Thanks
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>>> >>> >> > > > > > wrote:
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>> >>> >> > > > > > > wrote:
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>> >>> >> > > > > > > > annotation?
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > I think it is important to localize the additions in one
>>>> >>> >> > > > > > > place so that it becomes clearer to users about the maturity
>>>> >>> >> > > > > > > level of these, easier for developers to track them towards
>>>> >>> >> > > > > > > the path to maturity and also provides a clearer directive
>>>> >>> >> > > > > > > for committers and contributors on acceptance of new
>>>> >>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>>> >>> >> > > > > > > spread all over the place and adds an additional layer of
>>>> >>> >> > > > > > > difficulty in identification not just for users but also for
>>>> >>> >> > > > > > > developers who want to find such operators and improve them.
>>>> >>> >> > > > > > > This of this like a folder level annotation where everything
>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > Thanks
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com> wrote:
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators
>>>> >>> >> > > > > > > > > > > > that fall in the "non-production quality" category. We
>>>> >>> >> > > > > > > > > > > > should make it obvious to users that which operators are
>>>> >>> >> > > > > > > > > > > > up to par, and which operators are not, and maybe even
>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever used in a real use
>>>> >>> >> > > > > > > > > > > > case.
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing
>>>> >>> >> > > > > > > > > > > this exercise as this can cause unnecessary tensions. My
>>>> >>> >> > > > > > > > > > > original intent is for contributions going forward.
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>>>> >>> >> > > > > > > > > > outside the play area should be of well known quality.
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much
>>>> >>> >> > > > > > > > > tension if we establish clear criteria.
>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay
>>>> >>> >> > > > > > > > > and put up the bars for new operators.
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > > > David
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > >
>>>> >>> >> > >
>>>> >>> >> >
>>>> >>> >>
>>>> >>> >
>>>> >>> >
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi,

I also added a recommendation for each operator. Please take a look.

thanks

On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> I created a shared google sheet and tracked the various details of
> operators. Currently, the sheet contains information about operators under
> lib/algo only. Link is
> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
> Will update the sheet soon with lib/math too.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:
>
>> Hi Lakshmi,
>>
>> Thanks for volunteering.
>>
>> I think Pramod's suggestion of putting the operators into 3 buckets and
>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> individual operators are both good, with the exception that lib/streamquery
>> is one unit and we probably do not need to look at individual operators
>> under it.
>>
>> If we don't have any objection in the community, let's start the process.
>>
>> David
>>
>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <lakshmi@datatorrent.com> wrote:
>>
>>> I am interested to work on this.
>>>
>>> Regards,
>>> Lakshmi prasanna
>>>
>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>> wrote:
>>>
>>> > Why not have a shared google sheet with a list of operators and options
>>> > that we want to do with it.
>>> > I think it's case by case.
>>> > But retiring unused or obsolete operators is important and we should do it
>>> > sooner rather than later.
>>> >
>>> > Regards,
>>> > Siyuan
>>> >
>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>> > wrote:
>>> >
>>> >>
>>> >> My vote is to do 2&3
>>> >>
>>> >> Thks
>>> >> Amol
>>> >>
>>> >>
>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> >> VKottapalli@directv.com> wrote:
>>> >>
>>> >>> +1 for deprecating the packages listed below.
>>> >>>
>>> >>> -----Original Message-----
>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>
>>> >>> +1
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>>> >>> wrote:
>>> >>>
>>> >>> > Hi all,
>>> >>> >
>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>> >>> >
>>> >>> > As stated before, the reason why we would like to retire operators in
>>> >>> > Malhar is because some of them were written a long time ago before
>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>> >>> > up to par in code quality, have no potential for improvement, and
>>> >>> > are probably completely unused by anybody.
>>> >>> >
>>> >>> > We do not want contributors to use them as a model of their
>>> >>> > contribution, or users to use them thinking they are of quality, and
>>> >>> > then hit a wall.
>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >>> >
>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> >>> > *lib/math*, and *lib/streamquery*.
>>> >>> >
>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>> >>> >
>>> >>> > These are the options I can think of for retiring those operators:
>>> >>> >
>>> >>> > 1) Completely remove them from the malhar repository.
>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>> >>> > malhar-misc
>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> >>> > longer supported
>>> >>> >
>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>> >
>>> >>> > David
>>> >>> >
>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> >>> > <pr...@datatorrent.com>
>>> >>> > wrote:
>>> >>> >
>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>>> >>> >> had good suggestions like documenting capabilities of operators, come
>>> >>> >> up well defined criteria for graduation of operators and what those
>>> >>> >> criteria may be and what to do with existing operators that may not
>>> >>> >> yet be mature or unused.
>>> >>> >>
>>> >>> >> I am going to summarize the key points that resulted from the
>>> >>> >> discussion and would like to proceed with them.
>>> >>> >>
>>> >>> >>    - Operators that do not yet provide the key platform capabilities
>>> >>> >>    to make an operator useful across different applications such as
>>> >>> >>    reusability, partitioning static or dynamic, idempotency, exactly
>>> >>> >>    once will still be accepted as long as they are functionally
>>> >>> >>    correct, have unit tests and will go into a separate module.
>>> >>> >>    - Contrib module was suggested as a place where new contributions
>>> >>> >>    go in that don't yet have all the platform capabilities and are
>>> >>> >>    not yet mature. If there are no other suggestions we will go with
>>> >>> >>    this one.
>>> >>> >>    - It was suggested the operators documentation list those platform
>>> >>> >>    capabilities it currently provides from the list above. I will
>>> >>> >>    document a structure for this in the contribution guidelines.
>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
>>> >>> >>    operator to the big leagues :). I will kick-off a separate thread
>>> >>> >>    for it as I think it requires its own discussion and hopefully we
>>> >>> >>    can come up with a set of guidelines for it.
>>> >>> >>    - David brought up state of some of the existing operators and
>>> >>> >>    their retirement and the layout of operators in Malhar in general
>>> >>> >>    and how it causes problems with development. I will ask him to
>>> >>> >>    lead the discussion on that.
>>> >>> >>
>>> >>> >> Thanks
>>> >>> >>
>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com> wrote:
>>> >>> >>
>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>>> >>> >> >
>>> >>> >> > On the contrary, putting a new process for people trying to
>>> >>> >> > contribute while NOT addressing the old unused subpar operators in
>>> >>> >> > the repository is what is conflicting.
>>> >>> >> >
>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>> >>> >> > at the existing operators already in the repository as examples and
>>> >>> >> > likely a model for their new operators.
>>> >>> >> >
>>> >>> >> > David
>>> >>> >> >
>>> >>> >> >
>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com> wrote:
>>> >>> >> >
>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>> >>> >> > > (contrib?) as long as license part is taken care of.
>>> >>> >> > >
>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>> >>> >> > > what is being used?
>>> >>> >> > >
>>> >>> >> > > Thks,
>>> >>> >> > > Amol
>>> >>> >> > >
>>> >>> >> > >
>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>> >>> >> > > wrote:
>>> >>> >> > >
>>> >>> >> > > > +1 for removing the not-used operators.
>>> >>> >> > > >
>>> >>> >> > > > So we are creating a process for operator writers who don't
>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>>> >>> >> > > > big is that set?
>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>>> >>> >> > > > all the checklist, will they be ready to use that in production?
>>> >>> >> > > >
>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>> >>> >> > > > make it easy to add more operators.
>>> >>> >> > > >
>>> >>> >> > > >
>>> >>> >> > > >
>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>> >>> >> > > > wrote:
>>> >>> >> > > >
>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>> >>> >> > > > > wrote:
>>> >>> >> > > > >
>>> >>> >> > > > > > Pramod,
>>> >>> >> > > > > >
>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>> >>> >> > > > > > operators into one folder, non-partitionable operators in
>>> >>> >> > > > > > another and so on...
>>> >>> >> > > > > >
>>> >>> >> > > > >
>>> >>> >> > > > > Remember the original goal of making it easier for new
>>> >>> >> > > > > members to contribute and managing those contributions to
>>> >>> >> > > > > maturity. It is not a functional level separation.
>>> >>> >> > > > >
>>> >>> >> > > > >
>>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>>> >>> >> > > > > > used at class level and not at package/folder level.
>>> >>> >> > > > >
>>> >>> >> > > > >
>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>> >>> >> > > > >
>>> >>> >> > > > > Thanks
>>> >>> >> > > > >
>>> >>> >> > > > >
>>> >>> >> > > > > > Thanks
>>> >>> >> > > > > >
>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>> >>> >> > > > > > wrote:
>>> >>> >> > > > > >
>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>> >>> >> > > > > > > wrote:
>>> >>> >> > > > > > >
>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>> >>> >> > > > > > > > annotation?
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > >
>>> >>> >> > > > > > > I think it is important to localize the additions in one
>>> >>> >> > > > > > > place so that it becomes clearer to users about the maturity
>>> >>> >> > > > > > > level of these, easier for developers to track them towards
>>> >>> >> > > > > > > the path to maturity and also provides a clearer directive
>>> >>> >> > > > > > > for committers and contributors on acceptance of new
>>> >>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>> >>> >> > > > > > > spread all over the place and adds an additional layer of
>>> >>> >> > > > > > > difficulty in identification not just for users but also for
>>> >>> >> > > > > > > developers who want to find such operators and improve them.
>>> >>> >> > > > > > > This of this like a folder level annotation where everything
>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>> >>> >> > > > > > >
>>> >>> >> > > > > > > Thanks
>>> >>> >> > > > > > >
>>> >>> >> > > > > > >
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com> wrote:
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > > > > >
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > > >
>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators
>>> >>> >> > > > > > > > > > > > that fall in the "non-production quality" category. We
>>> >>> >> > > > > > > > > > > > should make it obvious to users that which operators are
>>> >>> >> > > > > > > > > > > > up to par, and which operators are not, and maybe even
>>> >>> >> > > > > > > > > > > > remove those that are likely not ever used in a real use
>>> >>> >> > > > > > > > > > > > case.
>>> >>> >> > > > > > > > > > > >
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing
>>> >>> >> > > > > > > > > > > this exercise as this can cause unnecessary tensions. My
>>> >>> >> > > > > > > > > > > original intent is for contributions going forward.
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>>> >>> >> > > > > > > > > > outside the play area should be of well known quality.
>>> >>> >> > > > > > > > > >
>>> >>> >> > > > > > > > > >
>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much
>>> >>> >> > > > > > > > > tension if we establish clear criteria.
>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay
>>> >>> >> > > > > > > > > and put up the bars for new operators.
>>> >>> >> > > > > > > > >
>>> >>> >> > > > > > > > > David
>>> >>> >> > > > > > > > >
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > >
>>> >>> >> > > > > >
>>> >>> >> > > > >
>>> >>> >> > > >
>>> >>> >> > >
>>> >>> >> >
>>> >>> >>
>>> >>> >
>>> >>> >
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>
>>
>

>>> one
>>> >>> >> place so
>>> >>> >> > > > that
>>> >>> >> > > > > it
>>> >>> >> > > > > > > becomes clearer to users about the maturity level of
>>> >>> >> > > > > > > these,
>>> >>> >> > easier
>>> >>> >> > > > for
>>> >>> >> > > > > > > developers to track them towards the path to maturity
>>> and
>>> >>> >> > > > > > > also
>>> >>> >> > > > > provides a
>>> >>> >> > > > > > > clearer directive for committers and contributors on
>>> >>> >> acceptance
>>> >>> >> > of
>>> >>> >> > > > new
>>> >>> >> > > > > > > submissions. Relying on the annotations alone makes
>>> them
>>> >>> >> spread
>>> >>> >> > all
>>> >>> >> > > > > over
>>> >>> >> > > > > > > the place and adds an additional layer of difficulty
>>> in
>>> >>> >> > > > identification
>>> >>> >> > > > > > not
>>> >>> >> > > > > > > just for users but also for developers who want to
>>> find
>>> >>> >> > > > > > > such
>>> >>> >> > > > operators
>>> >>> >> > > > > > and
>>> >>> >> > > > > > > improve them. This of this like a folder level
>>> annotation
>>> >>> >> where
>>> >>> >> > > > > > everything
>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>> >>> >> > > > > > >
>>> >>> >> > > > > > > Thanks
>>> >>> >> > > > > > >
>>> >>> >> > > > > > >
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>>> >>> >> > > david@datatorrent.com
>>> >>> >> > > > >
>>> >>> >> > > > > > > wrote:
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > > > > >
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > > >
>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too
>>> many
>>> >>> >> operators
>>> >>> >> > > > that
>>> >>> >> > > > > > fall
>>> >>> >> > > > > > > > in
>>> >>> >> > > > > > > > > > the
>>> >>> >> > > > > > > > > > > > "non-production quality" category. We should
>>> >>> >> > > > > > > > > > > > make it
>>> >>> >> > > > obvious
>>> >>> >> > > > > to
>>> >>> >> > > > > > > > users
>>> >>> >> > > > > > > > > > > that
>>> >>> >> > > > > > > > > > > > which operators are up to par, and which
>>> >>> >> > > > > > > > > > > > operators
>>> >>> >> are
>>> >>> >> > > not,
>>> >>> >> > > > > and
>>> >>> >> > > > > > > > maybe
>>> >>> >> > > > > > > > > > > even
>>> >>> >> > > > > > > > > > > > remove those that are likely not ever used
>>> in a
>>> >>> >> > > > > > > > > > > > real
>>> >>> >> > use
>>> >>> >> > > > > case.
>>> >>> >> > > > > > > > > > > >
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
>>> operators
>>> >>> >> > > > > > > > > > > and
>>> >>> >> > doing
>>> >>> >> > > > this
>>> >>> >> > > > > > > > > exercise
>>> >>> >> > > > > > > > > > as
>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
>>> original
>>> >>> >> intent
>>> >>> >> > is
>>> >>> >> > > > for
>>> >>> >> > > > > > > > > > > contributions going forward.
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > >
>>> >>> >> > > > > > > > > > IMO it is important to address this as well.
>>> >>> >> > > > > > > > > > Operators
>>> >>> >> > > outside
>>> >>> >> > > > > the
>>> >>> >> > > > > > > play
>>> >>> >> > > > > > > > > > area should be of well known quality.
>>> >>> >> > > > > > > > > >
>>> >>> >> > > > > > > > > >
>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate
>>> >>> >> > > > > > > > > much
>>> >>> >> > tension
>>> >>> >> > > if
>>> >>> >> > > > > we
>>> >>> >> > > > > > > > > establish clear criteria.
>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
>>> operators
>>> >>> >> > > > > > > > > stay
>>> >>> >> and
>>> >>> >> > > put
>>> >>> >> > > > up
>>> >>> >> > > > > > the
>>> >>> >> > > > > > > > > bars for new operators.
>>> >>> >> > > > > > > > >
>>> >>> >> > > > > > > > > David
>>> >>> >> > > > > > > > >
>>> >>> >> > > > > > > >
>>> >>> >> > > > > > >
>>> >>> >> > > > > >
>>> >>> >> > > > >
>>> >>> >> > > >
>>> >>> >> > >
>>> >>> >> >
>>> >>> >>
>>> >>> >
>>> >>> >
>>> >>>
>>> >>
>>> >>
>>> >
>>>
>>
>>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi,

I created a shared Google Sheet to track the details of individual
operators. Currently, the sheet covers only the operators under lib/algo.
The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing
I will update the sheet soon with lib/math too.

Thanks
Lakshmi Prasanna

On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:

> Hi Lakshmi,
>
> Thanks for volunteering.
>
> I think Pramod's suggestion of putting the operators into 3 buckets and
> Siyuan's suggestion of starting a shared Google Sheet that tracks
> individual operators are both good, with the exception that lib/streamquery
> is one unit and we probably do not need to look at individual operators
> under it.
>
> If we don't have any objection in the community, let's start the process.
>
> David
>
> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> I am interested to work on this.
>>
>> Regards,
>> Lakshmi prasanna
>>
>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>> wrote:
>>
>> > Why not have a shared google sheet with a list of operators and options
>> > that we want to do with it.
>> > I think it's case by case.
>> > But retire unused or obsolete operators is important and we should do it
>> > sooner rather than later.
>> >
>> > Regards,
>> > Siyuan
>> >
>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>> wrote:
>> >
>> >>
>> >> My vote is to do 2&3
>> >>
>> >> Thks
>> >> Amol
>> >>
>> >>
>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >> VKottapalli@directv.com> wrote:
>> >>
>> >>> +1 for deprecating the packages listed below.
>> >>>
>> >>> -----Original Message-----
>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>
>> >>> +1
>> >>>
>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>> > Hi all,
>> >>> >
>> >>> > I would like to renew the discussion of retiring operators in
>> Malhar.
>> >>> >
>> >>> > As stated before, the reason why we would like to retire operators
>> in
>> >>> > Malhar is because some of them were written a long time ago before
>> >>> > Apache incubation, and they do not pertain to real use cases, are
>> not
>> >>> > up to par in code quality, have no potential for improvement, and
>> >>> > probably completely unused by anybody.
>> >>> >
>> >>> > We do not want contributors to use them as a model of their
>> >>> > contribution, or users to use them thinking they are of quality, and
>> >>> then hit a wall.
>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>> >>> >
>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>
>> >>> >
>> >>> > I'm adding this thread to the users list. Please speak up if you are
>> >>> > using any operator in these 3 packages. We would like to hear from
>> you.
>> >>> >
>> >>> > These are the options I can think of for retiring those operators:
>> >>> >
>> >>> > 1) Completely remove them from the malhar repository.
>> >>> > 2) Move them from malhar-library into a separate artifact called
>> >>> > malhar-misc
>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>> >>> > longer supported
>> >>> >
>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>> >
>> >>> > David
>> >>> >
>
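
To illustrate option 3 quoted above (mark the operators deprecated and state in
their javadoc that they are no longer supported), here is a minimal Java sketch.
The class names LegacyExampleOperator and MaintainedExampleOperator and the
javadoc wording are hypothetical, chosen only for this example; they are not
taken from Malhar. The sketch relies only on the standard @Deprecated annotation
and the javadoc @deprecated tag, and uses reflection to list which classes are
marked.

```java
import java.util.ArrayList;
import java.util.List;

/** Helper for the sketch: finds classes that carry the @Deprecated annotation. */
final class DeprecationSketch {

  /** Returns the simple names of the given classes that are annotated with @Deprecated. */
  static List<String> deprecatedNames(Class<?>... candidates) {
    List<String> names = new ArrayList<>();
    for (Class<?> c : candidates) {
      // @Deprecated has runtime retention, so reflection can see it.
      if (c.isAnnotationPresent(Deprecated.class)) {
        names.add(c.getSimpleName());
      }
    }
    return names;
  }

  @SuppressWarnings("deprecation") // we reference the deprecated class on purpose
  public static void main(String[] args) {
    System.out.println(
        deprecatedNames(LegacyExampleOperator.class, MaintainedExampleOperator.class));
  }
}

/**
 * Hypothetical operator standing in for a class that is being retired.
 *
 * @deprecated No longer supported; kept only so that existing applications
 *     continue to compile.
 */
@Deprecated
class LegacyExampleOperator {
  int process(int tuple) {
    return tuple;
  }
}

/** Hypothetical operator that remains supported. */
class MaintainedExampleOperator {
  int process(int tuple) {
    return tuple + 1;
  }
}
```

With this marking, the compiler warns anyone who references the retired class
and the generated javadoc carries the "no longer supported" notice. Moving the
class to a separate artifact (option 2) would be an independent build change.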

Re: A proposal for Malhar

Posted by Chinmay Kolhatkar <ch...@datatorrent.com>.
+1. This is a really good starting point to clean up Malhar.

On Wed, Jul 13, 2016 at 3:06 AM, David Yan <da...@datatorrent.com> wrote:

> Hi Lakshmi,
>
> Thanks for volunteering.
>
> I think Pramod's suggestion of putting the operators into 3 buckets and
> Siyuan's suggestion of starting a shared Google Sheet that tracks
> individual operators are both good, with the exception that lib/streamquery
> is one unit and we probably do not need to look at individual operators
> under it.
>
> If we don't have any objection in the community, let's start the process.
>
> David
>

Re: A proposal for Malhar

Posted by David Yan <da...@datatorrent.com>.
Hi Lakshmi,

Thanks for volunteering.

I think Pramod's suggestion of putting the operators into 3 buckets and
Siyuan's suggestion of starting a shared Google Sheet that tracks
individual operators are both good, with the exception that lib/streamquery
is one unit and we probably do not need to look at individual operators
under it.

If we don't have any objection in the community, let's start the process.

David

On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> I am interested in working on this.
>
> Regards,
> Lakshmi prasanna
>
> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
> wrote:
>
> > Why not have a shared Google Sheet with a list of operators and what we
> > want to do with each of them?
> > I think it's case by case.
> > But retiring unused or obsolete operators is important and we should do it
> > sooner rather than later.
> >
> > Regards,
> > Siyuan
> >
> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
> > wrote:
> >
> >>
> >> My vote is to do 2&3
> >>
> >> Thks
> >> Amol
> >>
> >>
> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >> VKottapalli@directv.com> wrote:
> >>
> >>> +1 for deprecating the packages listed below.
> >>>
> >>> -----Original Message-----
> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>
> >>> +1
> >>>
> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
> >>> wrote:
> >>>
> >>> > Hi all,
> >>> >
> >>> > I would like to renew the discussion of retiring operators in Malhar.
> >>> >
> >>> > As stated before, we would like to retire operators in Malhar because
> >>> > some of them were written a long time ago, before Apache incubation.
> >>> > They do not pertain to real use cases, are not up to par in code
> >>> > quality, have no potential for improvement, and are probably
> >>> > completely unused by anybody.
> >>> >
> >>> > We do not want contributors to use them as a model for their
> >>> > contributions, or users to use them thinking they are of quality and
> >>> > then hit a wall. Neither scenario is beneficial to the reputation of
> >>> > Apex.
> >>> >
> >>> > The initial 3 packages that we would like to target are *lib/algo*,
> >>> > *lib/math*, and *lib/streamquery*.
> >>> >
> >>> > I'm adding this thread to the users list. Please speak up if you are
> >>> > using any operator in these 3 packages. We would like to hear from
> >>> > you.
> >>> >
> >>> > These are the options I can think of for retiring those operators:
> >>> >
> >>> > 1) Completely remove them from the malhar repository.
> >>> > 2) Move them from malhar-library into a separate artifact called
> >>> > malhar-misc
> >>> > 3) Mark them deprecated and add to their javadoc that they are no
> >>> > longer supported
> >>> >
> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>> >
> >>> > David
> >>> >
> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>> > <pr...@datatorrent.com>
> >>> > wrote:
> >>> >
> >>> >> I wanted to close the loop on this discussion. In general, everyone
> >>> >> seemed to be in favor of this idea, with no serious objections. Folks
> >>> >> had good suggestions, such as documenting the capabilities of
> >>> >> operators, coming up with well-defined criteria for the graduation of
> >>> >> operators, and deciding what to do with existing operators that are
> >>> >> unused or not yet mature.
> >>> >>
> >>> >> I am going to summarize the key points that resulted from the
> >>> >> discussion and would like to proceed with them.
> >>> >>
> >>> >>    - Operators that do not yet provide the key platform capabilities
> >>> >>    that make an operator useful across different applications (such
> >>> >>    as reusability, static or dynamic partitioning, idempotency, and
> >>> >>    exactly-once processing) will still be accepted as long as they
> >>> >>    are functionally correct and have unit tests. They will go into a
> >>> >>    separate module.
> >>> >>    - The contrib module was suggested as the place for new
> >>> >>    contributions that don't yet have all the platform capabilities
> >>> >>    and are not yet mature. If there are no other suggestions we will
> >>> >>    go with this one.
> >>> >>    - It was suggested that each operator's documentation list which
> >>> >>    of the platform capabilities above it currently provides. I will
> >>> >>    document a structure for this in the contribution guidelines.
> >>> >>    - Folks wanted to know what the criteria would be to graduate an
> >>> >>    operator to the big leagues :). I will kick off a separate thread
> >>> >>    for it, as I think it requires its own discussion, and hopefully
> >>> >>    we can come up with a set of guidelines for it.
> >>> >>    - David brought up the state of some of the existing operators,
> >>> >>    their retirement, and how the layout of operators in Malhar in
> >>> >>    general causes problems with development. I will ask him to lead
> >>> >>    the discussion on that.
> >>> >>
> >>> >> Thanks
> >>> >>
> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
> >>> >> wrote:
> >>> >>
> >>> >> > The two ideas are not conflicting, but rather complementary.
> >>> >> >
> >>> >> > On the contrary, putting a new process in place for people trying
> >>> >> > to contribute while NOT addressing the old, unused, subpar
> >>> >> > operators in the repository is what is conflicting.
> >>> >> >
> >>> >> > Keep in mind that when people try to contribute, they always look
> >>> >> > at the existing operators already in the repository as examples,
> >>> >> > and likely as a model for their new operators.
> >>> >> >
> >>> >> > David
> >>> >> >
> >>> >> >
> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
> >>> >> > wrote:
> >>> >> >
> >>> >> > > Yes, there are two conflicting threads now. The original thread
> >>> >> > > was to open up a way for contributors to submit code in a dir
> >>> >> > > (contrib?) as long as the license part is taken care of.
> >>> >> > >
> >>> >> > > On the thread of removing unused operators: how do we know what
> >>> >> > > is being used?
> >>> >> > >
> >>> >> > > Thks,
> >>> >> > > Amol
> >>> >> > >
> >>> >> > >
> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde
> >>> >> > > <sandesh@datatorrent.com> wrote:
> >>> >> > >
> >>> >> > > > +1 for removing the unused operators.
> >>> >> > > >
> >>> >> > > > So we are creating a process for operator writers who don't
> >>> >> > > > want to understand the platform, yet want to contribute? How
> >>> >> > > > big is that set?
> >>> >> > > > If we tell the app user "here is code that has not passed the
> >>> >> > > > whole checklist", will they be ready to use it in production?
> >>> >> > > >
> >>> >> > > > This thread has 2 conflicting forces: reduce the operators, and
> >>> >> > > > make it easy to add more operators.
> >>> >> > > >
> >>> >> > > >
> >>> >> > > >
> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni
> >>> >> > > > <pramod@datatorrent.com> wrote:
> >>> >> > > >
> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta
> >>> >> > > > > <gaurav.gopi123@gmail.com> wrote:
> >>> >> > > > >
> >>> >> > > > > > Pramod,
> >>> >> > > > > >
> >>> >> > > > > > By that logic I would say let's put all partitionable
> >>> >> > > > > > operators into one folder, non-partitionable operators
> >>> >> > > > > > in another and so on...
> >>> >> > > > > >
> >>> >> > > > >
> >>> >> > > > > Remember the original goal of making it easier for new
> >>> >> > > > > members to contribute and managing those contributions to
> >>> >> > > > > maturity. It is not a functional level separation.
> >>> >> > > > >
> >>> >> > > > >
> >>> >> > > > > > When I look at hadoop code I see these annotations being
> >>> >> > > > > > used at class level and not at package/folder level.
> >>> >> > > > >
> >>> >> > > > >
> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
> >>> >> > > > > a folder..." as an analogy and not literally.
> >>> >> > > > >
> >>> >> > > > > Thanks
> >>> >> > > > >
> >>> >> > > > > > Thanks
> >>> >> > > > > >
> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni
> >>> >> > > > > > <pramod@datatorrent.com> wrote:
> >>> >> > > > > >
> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta
> >>> >> > > > > > > <gaurav.gopi123@gmail.com> wrote:
> >>> >> > > > > > >
> >>> >> > > > > > > > Can the same goal not be achieved by using the
> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> >>> >> > > > > > > > annotation?
> >>> >> > > > > > > >
> >>> >> > > > > > >
> >>> >> > > > > > > I think it is important to localize the additions in one
> >>> >> > > > > > > place so that it becomes clearer to users about the
> >>> >> > > > > > > maturity level of these, easier for developers to track
> >>> >> > > > > > > them towards the path to maturity and also provides a
> >>> >> > > > > > > clearer directive for committers and contributors on
> >>> >> > > > > > > acceptance of new submissions. Relying on the annotations
> >>> >> > > > > > > alone makes them spread all over the place and adds an
> >>> >> > > > > > > additional layer of difficulty in identification not
> >>> >> > > > > > > just for users but also for developers who want to find
> >>> >> > > > > > > such operators and improve them. This of this like a
> >>> >> > > > > > > folder level annotation where everything under this
> >>> >> > > > > > > folder is unstable or evolving.
> >>> >> > > > > > >
> >>> >> > > > > > > Thanks
> >>> >> > > > > > >
> >>> >> > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan
> >>> >> > > > > > > > <david@datatorrent.com> wrote:
> >>> >> > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > > > Malhar, in its current state, has way too many
> >>> >> > > > > > > > > > > > operators that fall in the "non-production
> >>> >> > > > > > > > > > > > quality" category. We should make it obvious
> >>> >> > > > > > > > > > > > to users which operators are up to par and
> >>> >> > > > > > > > > > > > which operators are not, and maybe even
> >>> >> > > > > > > > > > > > remove those that are likely not ever used in
> >>> >> > > > > > > > > > > > a real use case.
> >>> >> > > > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators
> >>> >> > > > > > > > > > > and doing this exercise as this can cause
> >>> >> > > > > > > > > > > unnecessary tensions. My original intent is for
> >>> >> > > > > > > > > > > contributions going forward.
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > >
> >>> >> > > > > > > > > > IMO it is important to address this as well.
> >>> >> > > > > > > > > > Operators outside the play area should be of
> >>> >> > > > > > > > > > well-known quality.
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > >
> >>> >> > > > > > > > > I think this is important, and I don't anticipate
> >>> >> > > > > > > > > much tension if we establish clear criteria.
> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators
> >>> >> > > > > > > > > stay and raise the bar for new operators.
> >>> >> > > > > > > > >
> >>> >> > > > > > > > > David
> >>> >> > > > > > > > >
> >>> >> > > > > > > >
> >>> >> > > > > > >
> >>> >> > > > > >
> >>> >> > > > >
> >>> >> > > >
> >>> >> > >
> >>> >> >
> >>> >>
> >>> >
> >>> >
> >>>
> >>
> >>
> >
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
I am interested in working on this.

Regards,
Lakshmi prasanna

On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com> wrote:

> Why not have a shared Google Sheet with a list of operators and what we
> want to do with each of them?
> I think it's case by case.
> But retiring unused or obsolete operators is important and we should do it
> sooner rather than later.
>
> Regards,
> Siyuan
>
> On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com> wrote:
>
>>
>> My vote is to do 2&3
>>
>> Thks
>> Amol
>>
>>
>> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> VKottapalli@directv.com> wrote:
>>
>>> +1 for deprecating the packages listed below.
>>>
>>> -----Original Message-----
>>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>
>>> +1
>>>
>>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>>> wrote:
>>>
>>> > Hi all,
>>> >
>>> > I would like to renew the discussion of retiring operators in Malhar.
>>> >
>>> > As stated before, we would like to retire operators in Malhar because
>>> > some of them were written a long time ago, before Apache incubation.
>>> > They do not pertain to real use cases, are not up to par in code
>>> > quality, have no potential for improvement, and are probably
>>> > completely unused by anybody.
>>> >
>>> > We do not want contributors to use them as a model for their
>>> > contributions, or users to use them thinking they are of quality and
>>> > then hit a wall. Neither scenario is beneficial to the reputation of
>>> > Apex.
>>> >
>>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> > *lib/math*, and *lib/streamquery*.
>>> >
>>> > I'm adding this thread to the users list. Please speak up if you are
>>> > using any operator in these 3 packages. We would like to hear from you.
>>> >
>>> > These are the options I can think of for retiring those operators:
>>> >
>>> > 1) Completely remove them from the malhar repository.
>>> > 2) Move them from malhar-library into a separate artifact called
>>> > malhar-misc
>>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> > longer supported
>>> >
>>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >
>>> > David
>>> >
>>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> > <pr...@datatorrent.com>
>>> > wrote:
>>> >
>>> >> I wanted to close the loop on this discussion. In general, everyone
>>> >> seemed to be in favor of this idea, with no serious objections. Folks
>>> >> had good suggestions, such as documenting the capabilities of
>>> >> operators, coming up with well-defined criteria for the graduation of
>>> >> operators, and deciding what to do with existing operators that are
>>> >> unused or not yet mature.
>>> >>
>>> >> I am going to summarize the key points that resulted from the
>>> >> discussion and would like to proceed with them.
>>> >>
>>> >>    - Operators that do not yet provide the key platform capabilities
>>> >>    that make an operator useful across different applications (such
>>> >>    as reusability, static or dynamic partitioning, idempotency, and
>>> >>    exactly-once processing) will still be accepted as long as they
>>> >>    are functionally correct and have unit tests. They will go into a
>>> >>    separate module.
>>> >>    - The contrib module was suggested as the place for new
>>> >>    contributions that don't yet have all the platform capabilities
>>> >>    and are not yet mature. If there are no other suggestions we will
>>> >>    go with this one.
>>> >>    - It was suggested that each operator's documentation list which
>>> >>    of the platform capabilities above it currently provides. I will
>>> >>    document a structure for this in the contribution guidelines.
>>> >>    - Folks wanted to know what the criteria would be to graduate an
>>> >>    operator to the big leagues :). I will kick off a separate thread
>>> >>    for it, as I think it requires its own discussion, and hopefully
>>> >>    we can come up with a set of guidelines for it.
>>> >>    - David brought up the state of some of the existing operators,
>>> >>    their retirement, and how the layout of operators in Malhar in
>>> >>    general causes problems with development. I will ask him to lead
>>> >>    the discussion on that.
>>> >>
>>> >> Thanks
>>> >>
>>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
>>> wrote:
>>> >>
>>> >> > The two ideas are not conflicting, but rather complementing.
>>> >> >
>>> >> > On the contrary, putting a new process for people trying to
>>> >> > contribute while NOT addressing the old unused subpar operators in
>>> >> > the repository is what is conflicting.
>>> >> >
>>> >> > Keep in mind that when people try to contribute, they always look
>>> >> > at the existing operators already in the repository as examples and
>>> >> > likely a model for their new operators.
>>> >> >
>>> >> > David
>>> >> >
>>> >> >
>>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
>>> >> wrote:
>>> >> >
>>> >> > > Yes there are two conflicting threads now. The original thread
>>> >> > > was to
>>> >> > open
>>> >> > > up a way for contributors to submit code in a dir (contrib?) as
>>> >> > > long
>>> >> as
>>> >> > > license part of taken care of.
>>> >> > >
>>> >> > > On the thread of removing non-used operators -> How do we know
>>> >> > > what is being used?
>>> >> > >
>>> >> > > Thks,
>>> >> > > Amol
>>> >> > >
>>> >> > >
>>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>>> >> sandesh@datatorrent.com>
>>> >> > > wrote:
>>> >> > >
>>> >> > > > +1 for removing the not-used operators.
>>> >> > > >
>>> >> > > > So we are creating a process for operator writers who don't
>>> >> > > > want to understand the platform, yet wants to contribute? How
>>> >> > > > big is that
>>> >> set?
>>> >> > > > If we tell the app-user, here is the code which has not passed
>>> >> > > > all
>>> >> the
>>> >> > > > checklist, will they be ready to use that in production?
>>> >> > > >
>>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>> >> > > > make
>>> >> it
>>> >> > > easy
>>> >> > > > to add more operators.
>>> >> > > >
>>> >> > > >
>>> >> > > >
>>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>>> >> > pramod@datatorrent.com>
>>> >> > > > wrote:
>>> >> > > >
>>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>>> >> > > gaurav.gopi123@gmail.com>
>>> >> > > > > wrote:
>>> >> > > > >
>>> >> > > > > > Pramod,
>>> >> > > > > >
>>> >> > > > > > By that logic I would say let's put all partitionable
>>> >> > > > > > operators
>>> >> > into
>>> >> > > > one
>>> >> > > > > > folder, non-partitionable operators in another and so on...
>>> >> > > > > >
>>> >> > > > >
>>> >> > > > > Remember the original goal of making it easier for new
>>> >> > > > > members to contribute and managing those contributions to
>>> >> > > > > maturity. It is
>>> >> not a
>>> >> > > > > functional level separation.
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > > When I look at hadoop code I see these annotations being
>>> >> > > > > > used at
>>> >> > > class
>>> >> > > > > > level and not at package/folder level.
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>> >> > > > > a
>>> >> > > folder..."
>>> >> > > > > as an analogy and not literally.
>>> >> > > > >
>>> >> > > > > Thanks
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > > Thanks
>>> >> > > > > >
>>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>>> >> > > > pramod@datatorrent.com
>>> >> > > > > >
>>> >> > > > > > wrote:
>>> >> > > > > >
>>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>>> >> > > > > gaurav.gopi123@gmail.com>
>>> >> > > > > > > wrote:
>>> >> > > > > > >
>>> >> > > > > > > > Can same goal not be achieved by using
>>> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
>>> >> > > > /
>>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Uns
>>> >> > > > > > > > table
>>> >> > > > > > annotation?
>>> >> > > > > > > >
>>> >> > > > > > >
>>> >> > > > > > > I think it is important to localize the additions in one
>>> >> place so
>>> >> > > > that
>>> >> > > > > it
>>> >> > > > > > > becomes clearer to users about the maturity level of
>>> >> > > > > > > these,
>>> >> > easier
>>> >> > > > for
>>> >> > > > > > > developers to track them towards the path to maturity and
>>> >> > > > > > > also
>>> >> > > > > provides a
>>> >> > > > > > > clearer directive for committers and contributors on
>>> >> acceptance
>>> >> > of
>>> >> > > > new
>>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>> >> spread
>>> >> > all
>>> >> > > > > over
>>> >> > > > > > > the place and adds an additional layer of difficulty in
>>> >> > > > identification
>>> >> > > > > > not
>>> >> > > > > > > just for users but also for developers who want to find
>>> >> > > > > > > such
>>> >> > > > operators
>>> >> > > > > > and
>>> >> > > > > > > improve them. This of this like a folder level annotation
>>> >> where
>>> >> > > > > > everything
>>> >> > > > > > > under this folder is unstable or evolving.
>>> >> > > > > > >
>>> >> > > > > > > Thanks
>>> >> > > > > > >
>>> >> > > > > > >
>>> >> > > > > > > >
>>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>>> >> > > david@datatorrent.com
>>> >> > > > >
>>> >> > > > > > > wrote:
>>> >> > > > > > > >
>>> >> > > > > > > > > >
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > > >
>>> >> > > > > > > > > > > > Malhar in its current state, has way too many
>>> >> operators
>>> >> > > > that
>>> >> > > > > > fall
>>> >> > > > > > > > in
>>> >> > > > > > > > > > the
>>> >> > > > > > > > > > > > "non-production quality" category. We should
>>> >> > > > > > > > > > > > make it
>>> >> > > > obvious
>>> >> > > > > to
>>> >> > > > > > > > users
>>> >> > > > > > > > > > > that
>>> >> > > > > > > > > > > > which operators are up to par, and which
>>> >> > > > > > > > > > > > operators
>>> >> are
>>> >> > > not,
>>> >> > > > > and
>>> >> > > > > > > > maybe
>>> >> > > > > > > > > > > even
>>> >> > > > > > > > > > > > remove those that are likely not ever used in a
>>> >> > > > > > > > > > > > real
>>> >> > use
>>> >> > > > > case.
>>> >> > > > > > > > > > > >
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > > I am ambivalent about revisiting older operators
>>> >> > > > > > > > > > > and
>>> >> > doing
>>> >> > > > this
>>> >> > > > > > > > > exercise
>>> >> > > > > > > > > > as
>>> >> > > > > > > > > > > this can cause unnecessary tensions. My original
>>> >> intent
>>> >> > is
>>> >> > > > for
>>> >> > > > > > > > > > > contributions going forward.
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > IMO it is important to address this as well.
>>> >> > > > > > > > > > Operators
>>> >> > > outside
>>> >> > > > > the
>>> >> > > > > > > play
>>> >> > > > > > > > > > area should be of well known quality.
>>> >> > > > > > > > > >
>>> >> > > > > > > > > >
>>> >> > > > > > > > > I think this is important, and I don't anticipate
>>> >> > > > > > > > > much
>>> >> > tension
>>> >> > > if
>>> >> > > > > we
>>> >> > > > > > > > > establish clear criteria.
>>> >> > > > > > > > > It's not helpful if we let the old subpar operators
>>> >> > > > > > > > > stay
>>> >> and
>>> >> > > put
>>> >> > > > up
>>> >> > > > > > the
>>> >> > > > > > > > > bars for new operators.
>>> >> > > > > > > > >
>>> >> > > > > > > > > David
>>> >> > > > > > > > >
>>> >> > > > > > > >
>>> >> > > > > > >
>>> >> > > > > >
>>> >> > > > >
>>> >> > > >
>>> >> > >
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>
>>
>
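
For illustration, option 3 in David's list (mark the operators deprecated and add to their javadoc that they are no longer supported) could look roughly like the Java sketch below. `LegacySumOperator` and `DeprecationSketch` are hypothetical names used only for this example, not classes from Malhar; the only things assumed are the standard `@Deprecated` annotation and the javadoc `@deprecated` tag.

```java
/**
 * Hypothetical stand-in for an operator in one of the packages being retired.
 *
 * @deprecated This operator is no longer supported and is kept only for
 *             backward compatibility.
 */
@Deprecated
class LegacySumOperator {
    private long sum;

    /** Adds one incoming value to the running total. */
    void process(long tuple) {
        sum += tuple;
    }

    long getSum() {
        return sum;
    }
}

public class DeprecationSketch {
    /** Returns true when the class is marked with the standard @Deprecated annotation. */
    static boolean isDeprecated(Class<?> type) {
        return type.isAnnotationPresent(Deprecated.class);
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        // Existing applications keep working; the compiler only flags the usage.
        LegacySumOperator op = new LegacySumOperator();
        op.process(2);
        op.process(3);
        System.out.println("sum = " + op.getSum());
        System.out.println("deprecated = " + isDeprecated(LegacySumOperator.class));
    }
}
```

Because option 3 changes neither the package nor the artifact, it can be combined with option 2 (moving the classes into a separate artifact), as the proposal notes.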

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
I am interested in working on this.

Regards,
Lakshmi prasanna

On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com> wrote:

> Why not have a shared Google Sheet with a list of operators and the
> options we want to apply to each?
> I think it's case by case.
> But retiring unused or obsolete operators is important, and we should do
> it sooner rather than later.
>
> Regards,
> Siyuan
>
> On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com> wrote:
>
>>
>> My vote is to do 2&3
>>
>> Thks
>> Amol
>>
>>
>> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> VKottapalli@directv.com> wrote:
>>
>>> +1 for deprecating the packages listed below.
>>>
>>> -----Original Message-----
>>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>
>>> +1
>>>
>>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>>> wrote:
>>>
>>> > Hi all,
>>> >
>>> > I would like to renew the discussion of retiring operators in Malhar.
>>> >
>>> > As stated before, the reason why we would like to retire operators in
>>> > Malhar is because some of them were written a long time ago before
>>> > Apache incubation, and they do not pertain to real use cases, are not
>>> > up to par in code quality, have no potential for improvement, and
>>> > probably completely unused by anybody.
>>> >
>>> > We do not want contributors to use them as a model of their
>>> > contribution, or users to use them thinking they are of quality, and
>>> > then hit a wall.
>>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >
>>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> > *lib/math*, and *lib/streamquery*.
>>>
>>> >
>>> > I'm adding this thread to the users list. Please speak up if you are
>>> > using any operator in these 3 packages. We would like to hear from you.
>>> >
>>> > These are the options I can think of for retiring those operators:
>>> >
>>> > 1) Completely remove them from the malhar repository.
>>> > 2) Move them from malhar-library into a separate artifact called
>>> > malhar-misc
>>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> > longer supported
>>> >
>>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >
>>> > David
>>> >
>>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> > <pr...@datatorrent.com>
>>> > wrote:
>>> >
>>> >> I wanted to close the loop on this discussion. In general everyone
>>> >> seemed to be favorable to this idea with no serious objections. Folks
>>> >> had good suggestions like documenting capabilities of operators, come
>>> >> up well defined criteria for graduation of operators and what those
>>> >> criteria may be and what to do with existing operators that may not
>>> >> yet be mature or unused.
>>> >>
>>> >> I am going to summarize the key points that resulted from the
>>> >> discussion and would like to proceed with them.
>>> >>
>>> >>    - Operators that do not yet provide the key platform
>>> >>    capabilities to make an operator useful across different
>>> >>    applications such as reusability, partitioning static or
>>> >>    dynamic, idempotency, exactly once will still be accepted as
>>> >>    long as they are functionally correct, have unit tests and will
>>> >>    go into a separate module.
>>> >>    - Contrib module was suggested as a place where new
>>> >>    contributions go in that don't yet have all the platform
>>> >>    capabilities and are not yet mature. If there are no other
>>> >>    suggestions we will go with this one.
>>> >>    - It was suggested the operators documentation list those
>>> >>    platform capabilities it currently provides from the list above.
>>> >>    I will document a structure for this in the contribution
>>> >>    guidelines.
>>> >>    - Folks wanted to know what would be the criteria to graduate an
>>> >>    operator to the big leagues :). I will kick-off a separate
>>> >>    thread for it as I think it requires its own discussion and
>>> >>    hopefully we can come up with a set of guidelines for it.
>>> >>    - David brought up state of some of the existing operators and
>>> >>    their retirement and the layout of operators in Malhar in
>>> >>    general and how it causes problems with development. I will ask
>>> >>    him to lead the discussion on that.
>>> >>
>>> >> Thanks
>>> >>
>>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
>>> wrote:
>>> >>
>>> >> > The two ideas are not conflicting, but rather complementing.
>>> >> >
>>> >> > On the contrary, putting a new process for people trying to
>>> >> > contribute while NOT addressing the old unused subpar operators in
>>> >> > the repository is what is conflicting.
>>> >> >
>>> >> > Keep in mind that when people try to contribute, they always look
>>> >> > at the existing operators already in the repository as examples and
>>> >> > likely a model for their new operators.
>>> >> >
>>> >> > David
>>> >> >
>>> >> >
>>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
>>> >> wrote:
>>> >> >
>>> >> > > Yes there are two conflicting threads now. The original thread
>>> >> > > was to
>>> >> > open
>>> >> > > up a way for contributors to submit code in a dir (contrib?) as
>>> >> > > long
>>> >> as
>>> >> > > license part of taken care of.
>>> >> > >
>>> >> > > On the thread of removing non-used operators -> How do we know
>>> >> > > what is being used?
>>> >> > >
>>> >> > > Thks,
>>> >> > > Amol
>>> >> > >
>>> >> > >
>>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>>> >> sandesh@datatorrent.com>
>>> >> > > wrote:
>>> >> > >
>>> >> > > > +1 for removing the not-used operators.
>>> >> > > >
>>> >> > > > So we are creating a process for operator writers who don't
>>> >> > > > want to understand the platform, yet wants to contribute? How
>>> >> > > > big is that
>>> >> set?
>>> >> > > > If we tell the app-user, here is the code which has not passed
>>> >> > > > all
>>> >> the
>>> >> > > > checklist, will they be ready to use that in production?
>>> >> > > >
>>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>> >> > > > make
>>> >> it
>>> >> > > easy
>>> >> > > > to add more operators.
>>> >> > > >
>>> >> > > >
>>> >> > > >
>>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>>> >> > pramod@datatorrent.com>
>>> >> > > > wrote:
>>> >> > > >
>>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>>> >> > > gaurav.gopi123@gmail.com>
>>> >> > > > > wrote:
>>> >> > > > >
>>> >> > > > > > Pramod,
>>> >> > > > > >
>>> >> > > > > > By that logic I would say let's put all partitionable
>>> >> > > > > > operators
>>> >> > into
>>> >> > > > one
>>> >> > > > > > folder, non-partitionable operators in another and so on...
>>> >> > > > > >
>>> >> > > > >
>>> >> > > > > Remember the original goal of making it easier for new
>>> >> > > > > members to contribute and managing those contributions to
>>> >> > > > > maturity. It is
>>> >> not a
>>> >> > > > > functional level separation.
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > > When I look at hadoop code I see these annotations being
>>> >> > > > > > used at
>>> >> > > class
>>> >> > > > > > level and not at package/folder level.
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>> >> > > > > a
>>> >> > > folder..."
>>> >> > > > > as an analogy and not literally.
>>> >> > > > >
>>> >> > > > > Thanks
>>> >> > > > >
>>> >> > > > >
>>> >> > > > > > Thanks
>>> >> > > > > >
>>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>>> >> > > > pramod@datatorrent.com
>>> >> > > > > >
>>> >> > > > > > wrote:
>>> >> > > > > >
>>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>>> >> > > > > gaurav.gopi123@gmail.com>
>>> >> > > > > > > wrote:
>>> >> > > > > > >
>>> >> > > > > > > > Can same goal not be achieved by using
>>> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
>>> >> > > > /
>>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Uns
>>> >> > > > > > > > table
>>> >> > > > > > annotation?
>>> >> > > > > > > >
>>> >> > > > > > >
>>> >> > > > > > > I think it is important to localize the additions in one
>>> >> place so
>>> >> > > > that
>>> >> > > > > it
>>> >> > > > > > > becomes clearer to users about the maturity level of
>>> >> > > > > > > these,
>>> >> > easier
>>> >> > > > for
>>> >> > > > > > > developers to track them towards the path to maturity and
>>> >> > > > > > > also
>>> >> > > > > provides a
>>> >> > > > > > > clearer directive for committers and contributors on
>>> >> acceptance
>>> >> > of
>>> >> > > > new
>>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>> >> spread
>>> >> > all
>>> >> > > > > over
>>> >> > > > > > > the place and adds an additional layer of difficulty in
>>> >> > > > identification
>>> >> > > > > > not
>>> >> > > > > > > just for users but also for developers who want to find
>>> >> > > > > > > such
>>> >> > > > operators
>>> >> > > > > > and
>>> >> > > > > > > improve them. This of this like a folder level annotation
>>> >> where
>>> >> > > > > > everything
>>> >> > > > > > > under this folder is unstable or evolving.
>>> >> > > > > > >
>>> >> > > > > > > Thanks
>>> >> > > > > > >
>>> >> > > > > > >
>>> >> > > > > > > >
>>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>>> >> > > david@datatorrent.com
>>> >> > > > >
>>> >> > > > > > > wrote:
>>> >> > > > > > > >
>>> >> > > > > > > > > >
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > > >
>>> >> > > > > > > > > > > > Malhar in its current state, has way too many
>>> >> operators
>>> >> > > > that
>>> >> > > > > > fall
>>> >> > > > > > > > in
>>> >> > > > > > > > > > the
>>> >> > > > > > > > > > > > "non-production quality" category. We should
>>> >> > > > > > > > > > > > make it
>>> >> > > > obvious
>>> >> > > > > to
>>> >> > > > > > > > users
>>> >> > > > > > > > > > > that
>>> >> > > > > > > > > > > > which operators are up to par, and which
>>> >> > > > > > > > > > > > operators
>>> >> are
>>> >> > > not,
>>> >> > > > > and
>>> >> > > > > > > > maybe
>>> >> > > > > > > > > > > even
>>> >> > > > > > > > > > > > remove those that are likely not ever used in a
>>> >> > > > > > > > > > > > real
>>> >> > use
>>> >> > > > > case.
>>> >> > > > > > > > > > > >
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > > I am ambivalent about revisiting older operators
>>> >> > > > > > > > > > > and
>>> >> > doing
>>> >> > > > this
>>> >> > > > > > > > > exercise
>>> >> > > > > > > > > > as
>>> >> > > > > > > > > > > this can cause unnecessary tensions. My original
>>> >> intent
>>> >> > is
>>> >> > > > for
>>> >> > > > > > > > > > > contributions going forward.
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > >
>>> >> > > > > > > > > > IMO it is important to address this as well.
>>> >> > > > > > > > > > Operators
>>> >> > > outside
>>> >> > > > > the
>>> >> > > > > > > play
>>> >> > > > > > > > > > area should be of well known quality.
>>> >> > > > > > > > > >
>>> >> > > > > > > > > >
>>> >> > > > > > > > > I think this is important, and I don't anticipate
>>> >> > > > > > > > > much
>>> >> > tension
>>> >> > > if
>>> >> > > > > we
>>> >> > > > > > > > > establish clear criteria.
>>> >> > > > > > > > > It's not helpful if we let the old subpar operators
>>> >> > > > > > > > > stay
>>> >> and
>>> >> > > put
>>> >> > > > up
>>> >> > > > > > the
>>> >> > > > > > > > > bars for new operators.
>>> >> > > > > > > > >
>>> >> > > > > > > > > David
>>> >> > > > > > > > >
>>> >> > > > > > > >
>>> >> > > > > > >
>>> >> > > > > >
>>> >> > > > >
>>> >> > > >
>>> >> > >
>>> >> >
>>> >>
>>> >
>>> >
>>>
>>
>>
>
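
As a rough illustration of the trade-off debated in the quoted thread (class-level stability annotations versus collecting immature operators in one module), the Java sketch below marks a class with a locally defined annotation and then has to inspect each class to find the marked ones. The `Evolving` annotation here is a hypothetical local stand-in written for this example; it is not Hadoop's `InterfaceStability.Evolving`, and `NewParserOperator`, `MatureFileReader`, and `StabilitySketch` are likewise made-up names.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical class-level marker for operators that are not yet mature. */
@Documented
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Evolving {
}

/** Hypothetical new contribution that still lacks some platform capabilities. */
@Evolving
class NewParserOperator {
}

/** Hypothetical mature operator with no marker. */
class MatureFileReader {
}

public class StabilitySketch {
    /** Collects the simple names of the candidate classes that carry @Evolving. */
    static List<String> evolvingOperators(Class<?>... candidates) {
        List<String> names = new ArrayList<>();
        for (Class<?> candidate : candidates) {
            if (candidate.isAnnotationPresent(Evolving.class)) {
                names.add(candidate.getSimpleName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        // With annotations alone, every class must be checked individually.
        System.out.println(
            evolvingOperators(NewParserOperator.class, MatureFileReader.class));
    }
}
```

With annotations alone, finding the immature operators means checking every class, which is the identification cost Pramod describes; placing them in a single module makes the same information visible from the location of the code.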

Re: A proposal for Malhar

Posted by "hsy541@gmail.com" <hs...@gmail.com>.
Why not have a shared Google Sheet with a list of operators and the
options we want to apply to each?
I think it's case by case.
But retiring unused or obsolete operators is important, and we should do
it sooner rather than later.

Regards,
Siyuan

On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com> wrote:

>
> My vote is to do 2&3
>
> Thks
> Amol
>
>
> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> VKottapalli@directv.com> wrote:
>
>> +1 for deprecating the packages listed below.
>>
>> -----Original Message-----
>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> Sent: Tuesday, July 12, 2016 12:01 PM
>>
>> +1
>>
>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > I would like to renew the discussion of retiring operators in Malhar.
>> >
>> > As stated before, the reason why we would like to retire operators in
>> > Malhar is because some of them were written a long time ago before
>> > Apache incubation, and they do not pertain to real use cases, are not
>> > up to par in code quality, have no potential for improvement, and
>> > probably completely unused by anybody.
>> >
>> > We do not want contributors to use them as a model of their
>> > contribution, or users to use them thinking they are of quality, and
>> > then hit a wall.
>> > Both scenarios are not beneficial to the reputation of Apex.
>> >
>> > The initial 3 packages that we would like to target are *lib/algo*,
>> > *lib/math*, and *lib/streamquery*.
>>
>> >
>> > I'm adding this thread to the users list. Please speak up if you are
>> > using any operator in these 3 packages. We would like to hear from you.
>> >
>> > These are the options I can think of for retiring those operators:
>> >
>> > 1) Completely remove them from the malhar repository.
>> > 2) Move them from malhar-library into a separate artifact called
>> > malhar-misc
>> > 3) Mark them deprecated and add to their javadoc that they are no
>> > longer supported
>> >
>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >
>> > David
>> >
>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> > <pr...@datatorrent.com>
>> > wrote:
>> >
>> >> I wanted to close the loop on this discussion. In general everyone
>> >> seemed to be favorable to this idea with no serious objections. Folks
>> >> had good suggestions like documenting capabilities of operators, come
>> >> up well defined criteria for graduation of operators and what those
>> >> criteria may be and what to do with existing operators that may not
>> >> yet be mature or unused.
>> >>
>> >> I am going to summarize the key points that resulted from the
>> >> discussion and would like to proceed with them.
>> >>
>> >>    - Operators that do not yet provide the key platform
>> >>    capabilities to make an operator useful across different
>> >>    applications such as reusability, partitioning static or
>> >>    dynamic, idempotency, exactly once will still be accepted as
>> >>    long as they are functionally correct, have unit tests and will
>> >>    go into a separate module.
>> >>    - Contrib module was suggested as a place where new
>> >>    contributions go in that don't yet have all the platform
>> >>    capabilities and are not yet mature. If there are no other
>> >>    suggestions we will go with this one.
>> >>    - It was suggested the operators documentation list those
>> >>    platform capabilities it currently provides from the list above.
>> >>    I will document a structure for this in the contribution
>> >>    guidelines.
>> >>    - Folks wanted to know what would be the criteria to graduate an
>> >>    operator to the big leagues :). I will kick-off a separate
>> >>    thread for it as I think it requires its own discussion and
>> >>    hopefully we can come up with a set of guidelines for it.
>> >>    - David brought up state of some of the existing operators and
>> >>    their retirement and the layout of operators in Malhar in
>> >>    general and how it causes problems with development. I will ask
>> >>    him to lead the discussion on that.
>> >>
>> >> Thanks
>> >>
>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
>> wrote:
>> >>
>> >> > The two ideas are not conflicting, but rather complementing.
>> >> >
>> >> > On the contrary, putting a new process for people trying to
>> >> > contribute while NOT addressing the old unused subpar operators in
>> >> > the repository is what is conflicting.
>> >> >
>> >> > Keep in mind that when people try to contribute, they always look
>> >> > at the existing operators already in the repository as examples and
>> >> > likely a model for their new operators.
>> >> >
>> >> > David
>> >> >
>> >> >
>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
>> >> wrote:
>> >> >
>> >> > > Yes there are two conflicting threads now. The original thread
>> >> > > was to
>> >> > open
>> >> > > up a way for contributors to submit code in a dir (contrib?) as
>> >> > > long
>> >> as
>> >> > > license part of taken care of.
>> >> > >
>> >> > > On the thread of removing non-used operators -> How do we know
>> >> > > what is being used?
>> >> > >
>> >> > > Thks,
>> >> > > Amol
>> >> > >
>> >> > >
>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> >> sandesh@datatorrent.com>
>> >> > > wrote:
>> >> > >
>> >> > > > +1 for removing the not-used operators.
>> >> > > >
>> >> > > > So we are creating a process for operator writers who don't
>> >> > > > want to understand the platform, yet wants to contribute? How
>> >> > > > big is that
>> >> set?
>> >> > > > If we tell the app-user, here is the code which has not passed
>> >> > > > all
>> >> the
>> >> > > > checklist, will they be ready to use that in production?
>> >> > > >
>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>> >> > > > make
>> >> it
>> >> > > easy
>> >> > > > to add more operators.
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> >> > pramod@datatorrent.com>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> >> > > gaurav.gopi123@gmail.com>
>> >> > > > > wrote:
>> >> > > > >
>> >> > > > > > Pramod,
>> >> > > > > >
>> >> > > > > > By that logic I would say let's put all partitionable
>> >> > > > > > operators
>> >> > into
>> >> > > > one
>> >> > > > > > folder, non-partitionable operators in another and so on...
>> >> > > > > >
>> >> > > > >
>> >> > > > > Remember the original goal of making it easier for new
>> >> > > > > members to contribute and managing those contributions to
>> >> > > > > maturity. It is
>> >> not a
>> >> > > > > functional level separation.
>> >> > > > >
>> >> > > > >
>> >> > > > > > When I look at hadoop code I see these annotations being
>> >> > > > > > used at
>> >> > > class
>> >> > > > > > level and not at package/folder level.
>> >> > > > >
>> >> > > > >
>> >> > > > > I had a typo in my email, I meant to say "think of this like
>> >> > > > > a
>> >> > > folder..."
>> >> > > > > as an analogy and not literally.
>> >> > > > >
>> >> > > > > Thanks
>> >> > > > >
>> >> > > > >
>> >> > > > > > Thanks
>> >> > > > > >
>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> >> > > > pramod@datatorrent.com
>> >> > > > > >
>> >> > > > > > wrote:
>> >> > > > > >
>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> >> > > > > gaurav.gopi123@gmail.com>
>> >> > > > > > > wrote:
>> >> > > > > > >
>> >> > > > > > > > Can same goal not be achieved by using
>> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
>> >> > > > /
>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Uns
>> >> > > > > > > > table
>> >> > > > > > annotation?
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > > > I think it is important to localize the additions in one
>> >> place so
>> >> > > > that
>> >> > > > > it
>> >> > > > > > > becomes clearer to users about the maturity level of
>> >> > > > > > > these,
>> >> > easier
>> >> > > > for
>> >> > > > > > > developers to track them towards the path to maturity and
>> >> > > > > > > also
>> >> > > > > provides a
>> >> > > > > > > clearer directive for committers and contributors on
>> >> acceptance
>> >> > of
>> >> > > > new
>> >> > > > > > > submissions. Relying on the annotations alone makes them
>> >> spread
>> >> > all
>> >> > > > > over
>> >> > > > > > > the place and adds an additional layer of difficulty in
>> >> > > > identification
>> >> > > > > > not
>> >> > > > > > > just for users but also for developers who want to find
>> >> > > > > > > such
>> >> > > > operators
>> >> > > > > > and
>> >> > > > > > > improve them. This of this like a folder level annotation
>> >> where
>> >> > > > > > everything
>> >> > > > > > > under this folder is unstable or evolving.
>> >> > > > > > >
>> >> > > > > > > Thanks
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > > >
>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> >> > > david@datatorrent.com
>> >> > > > >
>> >> > > > > > > wrote:
>> >> > > > > > > >
>> >> > > > > > > > > >
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > Malhar in its current state, has way too many
>> >> operators
>> >> > > > that
>> >> > > > > > fall
>> >> > > > > > > > in
>> >> > > > > > > > > > the
>> >> > > > > > > > > > > > "non-production quality" category. We should
>> >> > > > > > > > > > > > make it
>> >> > > > obvious
>> >> > > > > to
>> >> > > > > > > > users
>> >> > > > > > > > > > > that
>> >> > > > > > > > > > > > which operators are up to par, and which
>> >> > > > > > > > > > > > operators
>> >> are
>> >> > > not,
>> >> > > > > and
>> >> > > > > > > > maybe
>> >> > > > > > > > > > > even
>> >> > > > > > > > > > > > remove those that are likely not ever used in a
>> >> > > > > > > > > > > > real
>> >> > use
>> >> > > > > case.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > I am ambivalent about revisiting older operators
>> >> > > > > > > > > > > and
>> >> > doing
>> >> > > > this
>> >> > > > > > > > > exercise
>> >> > > > > > > > > > as
>> >> > > > > > > > > > > this can cause unnecessary tensions. My original
>> >> intent
>> >> > is
>> >> > > > for
>> >> > > > > > > > > > > contributions going forward.
>> >> > > > > > > > > > >
>> >> > > > > > > > > > >
>> >> > > > > > > > > > IMO it is important to address this as well.
>> >> > > > > > > > > > Operators
>> >> > > outside
>> >> > > > > the
>> >> > > > > > > play
>> >> > > > > > > > > > area should be of well known quality.
>> >> > > > > > > > > >
>> >> > > > > > > > > >
>> >> > > > > > > > > I think this is important, and I don't anticipate
>> >> > > > > > > > > much
>> >> > tension
>> >> > > if
>> >> > > > > we
>> >> > > > > > > > > establish clear criteria.
>> >> > > > > > > > > It's not helpful if we let the old subpar operators
>> >> > > > > > > > > stay
>> >> and
>> >> > > put
>> >> > > > up
>> >> > > > > > the
>> >> > > > > > > > > bars for new operators.
>> >> > > > > > > > >
>> >> > > > > > > > > David
>> >> > > > > > > > >
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >
>> >
>>
>
>

Re: A proposal for Malhar

Posted by "hsy541@gmail.com" <hs...@gmail.com>.
Why not have a shared Google Sheet with a list of operators and the
options we want to apply to each?
I think it's case by case.
But retiring unused or obsolete operators is important, and we should do
it sooner rather than later.

Regards,
Siyuan

On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com> wrote:

>
> My vote is to do 2&3
>
> Thks
> Amol
>
>
> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> VKottapalli@directv.com> wrote:
>
>> +1 for deprecating the packages listed below.
>>
>> -----Original Message-----
>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> Sent: Tuesday, July 12, 2016 12:01 PM
>>
>> +1
>>
>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>> wrote:
>>
>> > Hi all,
>> >
>> > I would like to renew the discussion of retiring operators in Malhar.
>> >
>> > As stated before, the reason why we would like to retire operators in
>> > Malhar is because some of them were written a long time ago before
>> > Apache incubation, and they do not pertain to real use cases, are not
>> > up to par in code quality, have no potential for improvement, and
>> > probably completely unused by anybody.
>> >
>> > We do not want contributors to use them as a model of their
>> > contribution, or users to use them thinking they are of quality, and
>> then hit a wall.
>> > Both scenarios are not beneficial to the reputation of Apex.
>> >
>> > The initial 3 packages that we would like to target are *lib/algo*,
>> > *lib/math*, and *lib/streamquery*.
>>
>> >
>> > I'm adding this thread to the users list. Please speak up if you are
>> > using any operator in these 3 packages. We would like to hear from you.
>> >
>> > These are the options I can think of for retiring those operators:
>> >
>> > 1) Completely remove them from the malhar repository.
>> > 2) Move them from malhar-library into a separate artifact called
>> > malhar-misc
>> > 3) Mark them deprecated and add to their javadoc that they are no
>> > longer supported
>> >
>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >
>> > David
>> >
>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> > <pr...@datatorrent.com>
>> > wrote:
>> >
>> >> I wanted to close the loop on this discussion. In general everyone
>> >> seemed to be favorable to this idea with no serious objections. Folks
>> >> had good suggestions like documenting capabilities of operators, come
>> >> up well defined criteria for graduation of operators and what those
>> >> criteria may be and what to do with existing operators that may not
>> >> yet be mature or unused.
>> >>
>> >> I am going to summarize the key points that resulted from the
>> >> discussion and would like to proceed with them.
>> >>
>> >>    - Operators that do not yet provide the key platform capabilities to
>> >>    make an operator useful across different applications such as
>> >> reusability,
>> >>    partitioning static or dynamic, idempotency, exactly once will
>> still be
>> >>    accepted as long as they are functionally correct, have unit tests
>> >> and will
>> >>    go into a separate module.
>> >>    - Contrib module was suggested as a place where new contributions
>> go in
>> >>    that don't yet have all the platform capabilities and are not yet
>> >> mature.
>> >>    If there are no other suggestions we will go with this one.
>> >>    - It was suggested the operators documentation list those platform
>> >>    capabilities it currently provides from the list above. I will
>> >> document a
>> >>    structure for this in the contribution guidelines.
>> >>    - Folks wanted to know what would be the criteria to graduate an
>> >>    operator to the big leagues :). I will kick-off a separate thread
>> >> for it as
>> >>    I think it requires its own discussion and hopefully we can come
>> >> up with a
>> >>    set of guidelines for it.
>> >>    - David brought up state of some of the existing operators and their
>> >>    retirement and the layout of operators in Malhar in general and how
>> it
>> >>    causes problems with development. I will ask him to lead the
>> >> discussion on
>> >>    that.
>> >>
>> >> Thanks
>> >>
>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
>> wrote:
>> >>
>> >> > The two ideas are not conflicting, but rather complementing.
>> >> >
>> >> > On the contrary, putting a new process for people trying to
>> >> > contribute while NOT addressing the old unused subpar operators in
>> >> > the repository
>> >> is
>> >> > what is conflicting.
>> >> >
>> >> > Keep in mind that when people try to contribute, they always look
>> >> > at the existing operators already in the repository as examples and
>> >> > likely a
>> >> model
>> >> > for their new operators.
>> >> >
>> >> > David
>> >> >
>> >> >
>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
>> >> wrote:
>> >> >
>> >> > > Yes there are two conflicting threads now. The original thread
>> >> > > was to
>> >> > open
>> >> > > up a way for contributors to submit code in a dir (contrib?) as
>> >> > > long
>> >> as
>> >> > > license part of taken care of.
>> >> > >
>> >> > > On the thread of removing non-used operators -> How do we know
>> >> > > what is being used?
>> >> > >
>> >> > > Thks,
>> >> > > Amol
>> >> > >
>> >> > >
>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> >> sandesh@datatorrent.com>
>> >> > > wrote:
>> >> > >
>> >> > > > +1 for removing the not-used operators.
>> >> > > >
>> >> > > > So we are creating a process for operator writers who don't
>> >> > > > want to understand the platform, yet wants to contribute? How
>> >> > > > big is that
>> >> set?
>> >> > > > If we tell the app-user, here is the code which has not passed
>> >> > > > all
>> >> the
>> >> > > > checklist, will they be ready to use that in production?
>> >> > > >
>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>> >> > > > make
>> >> it
>> >> > > easy
>> >> > > > to add more operators.
>> >> > > >
>> >> > > >
>> >> > > >
>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> >> > pramod@datatorrent.com>
>> >> > > > wrote:
>> >> > > >
>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> >> > > gaurav.gopi123@gmail.com>
>> >> > > > > wrote:
>> >> > > > >
>> >> > > > > > Pramod,
>> >> > > > > >
>> >> > > > > > By that logic I would say let's put all partitionable
>> >> > > > > > operators
>> >> > into
>> >> > > > one
>> >> > > > > > folder, non-partitionable operators in another and so on...
>> >> > > > > >
>> >> > > > >
>> >> > > > > Remember the original goal of making it easier for new
>> >> > > > > members to contribute and managing those contributions to
>> >> > > > > maturity. It is
>> >> not a
>> >> > > > > functional level separation.
>> >> > > > >
>> >> > > > >
>> >> > > > > > When I look at hadoop code I see these annotations being
>> >> > > > > > used at
>> >> > > class
>> >> > > > > > level and not at package/folder level.
>> >> > > > >
>> >> > > > >
>> >> > > > > I had a typo in my email, I meant to say "think of this like
>> >> > > > > a
>> >> > > folder..."
>> >> > > > > as an analogy and not literally.
>> >> > > > >
>> >> > > > > Thanks
>> >> > > > >
>> >> > > > >
>> >> > > > > > Thanks
>> >> > > > > >
>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> >> > > > pramod@datatorrent.com
>> >> > > > > >
>> >> > > > > > wrote:
>> >> > > > > >
>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> >> > > > > gaurav.gopi123@gmail.com>
>> >> > > > > > > wrote:
>> >> > > > > > >
>> >> > > > > > > > Can same goal not be achieved by using
>> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
>> >> > > > /
>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Uns
>> >> > > > > > > > table
>> >> > > > > > annotation?
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > > > I think it is important to localize the additions in one
>> >> place so
>> >> > > > that
>> >> > > > > it
>> >> > > > > > > becomes clearer to users about the maturity level of
>> >> > > > > > > these,
>> >> > easier
>> >> > > > for
>> >> > > > > > > developers to track them towards the path to maturity and
>> >> > > > > > > also
>> >> > > > > provides a
>> >> > > > > > > clearer directive for committers and contributors on
>> >> acceptance
>> >> > of
>> >> > > > new
>> >> > > > > > > submissions. Relying on the annotations alone makes them
>> >> spread
>> >> > all
>> >> > > > > over
>> >> > > > > > > the place and adds an additional layer of difficulty in
>> >> > > > identification
>> >> > > > > > not
>> >> > > > > > > just for users but also for developers who want to find
>> >> > > > > > > such
>> >> > > > operators
>> >> > > > > > and
>> >> > > > > > > improve them. This of this like a folder level annotation
>> >> where
>> >> > > > > > everything
>> >> > > > > > > under this folder is unstable or evolving.
>> >> > > > > > >
>> >> > > > > > > Thanks
>> >> > > > > > >
>> >> > > > > > >
>> >> > > > > > > >
>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> >> > > david@datatorrent.com
>> >> > > > >
>> >> > > > > > > wrote:
>> >> > > > > > > >
>> >> > > > > > > > > >
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > > > Malhar in its current state, has way too many
>> >> operators
>> >> > > > that
>> >> > > > > > fall
>> >> > > > > > > > in
>> >> > > > > > > > > > the
>> >> > > > > > > > > > > > "non-production quality" category. We should
>> >> > > > > > > > > > > > make it
>> >> > > > obvious
>> >> > > > > to
>> >> > > > > > > > users
>> >> > > > > > > > > > > that
>> >> > > > > > > > > > > > which operators are up to par, and which
>> >> > > > > > > > > > > > operators
>> >> are
>> >> > > not,
>> >> > > > > and
>> >> > > > > > > > maybe
>> >> > > > > > > > > > > even
>> >> > > > > > > > > > > > remove those that are likely not ever used in a
>> >> > > > > > > > > > > > real
>> >> > use
>> >> > > > > case.
>> >> > > > > > > > > > > >
>> >> > > > > > > > > > >
>> >> > > > > > > > > > > I am ambivalent about revisiting older operators
>> >> > > > > > > > > > > and
>> >> > doing
>> >> > > > this
>> >> > > > > > > > > exercise
>> >> > > > > > > > > > as
>> >> > > > > > > > > > > this can cause unnecessary tensions. My original
>> >> intent
>> >> > is
>> >> > > > for
>> >> > > > > > > > > > > contributions going forward.
>> >> > > > > > > > > > >
>> >> > > > > > > > > > >
>> >> > > > > > > > > > IMO it is important to address this as well.
>> >> > > > > > > > > > Operators
>> >> > > outside
>> >> > > > > the
>> >> > > > > > > play
>> >> > > > > > > > > > area should be of well known quality.
>> >> > > > > > > > > >
>> >> > > > > > > > > >
>> >> > > > > > > > > I think this is important, and I don't anticipate
>> >> > > > > > > > > much
>> >> > tension
>> >> > > if
>> >> > > > > we
>> >> > > > > > > > > establish clear criteria.
>> >> > > > > > > > > It's not helpful if we let the old subpar operators
>> >> > > > > > > > > stay
>> >> and
>> >> > > put
>> >> > > > up
>> >> > > > > > the
>> >> > > > > > > > > bars for new operators.
>> >> > > > > > > > >
>> >> > > > > > > > > David
>> >> > > > > > > > >
>> >> > > > > > > >
>> >> > > > > > >
>> >> > > > > >
>> >> > > > >
>> >> > > >
>> >> > >
>> >> >
>> >>
>> >
>> >
>>
>
>

Re: A proposal for Malhar

Posted by Amol Kekre <am...@datatorrent.com>.
My vote is to do 2&3

Thks
Amol
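
For readers following the options quoted below, option 3 (mark the operators
deprecated and say so in their javadoc) might look roughly like this sketch.
The names DeprecationSketch, LegacySumOperator and isRetired are made up for
illustration and are not actual Malhar classes. The only real API used is
the standard java.lang.Deprecated annotation.

```java
public class DeprecationSketch {

    /**
     * Hypothetical stand-in for an operator in one of the targeted packages.
     *
     * @deprecated No longer supported; kept only so existing applications
     *             continue to compile. Do not use as a model for new operators.
     */
    @Deprecated
    public static class LegacySumOperator {
        private long sum;

        /** Adds one incoming value to the running total. */
        public void process(long tuple) {
            sum += tuple;
        }

        public long getSum() {
            return sum;
        }
    }

    /**
     * Returns true if the class carries @Deprecated. The annotation has
     * runtime retention, so it is visible through reflection.
     */
    public static boolean isRetired(Class<?> operatorClass) {
        return operatorClass.isAnnotationPresent(Deprecated.class);
    }

    public static void main(String[] args) {
        System.out.println(isRetired(LegacySumOperator.class));
        System.out.println(isRetired(String.class));
    }
}
```

With the annotation in place the compiler warns at every use site, and the
javadoc note tells users why. Moving the class to a separate artifact
(option 2) would be a build change on top of this and is not shown here.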


On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
VKottapalli@directv.com> wrote:

> +1 for deprecating the packages listed below.
>
> -----Original Message-----
> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> Sent: Tuesday, July 12, 2016 12:01 PM
>
> +1
>
> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:
>
> > Hi all,
> >
> > I would like to renew the discussion of retiring operators in Malhar.
> >
> > As stated before, the reason why we would like to retire operators in
> > Malhar is because some of them were written a long time ago before
> > Apache incubation, and they do not pertain to real use cases, are not
> > up to par in code quality, have no potential for improvement, and
> > probably completely unused by anybody.
> >
> > We do not want contributors to use them as a model of their
> > contribution, or users to use them thinking they are of quality, and
> then hit a wall.
> > Both scenarios are not beneficial to the reputation of Apex.
> >
> > The initial 3 packages that we would like to target are *lib/algo*,
> > *lib/math*, and *lib/streamquery*.
> >
> > I'm adding this thread to the users list. Please speak up if you are
> > using any operator in these 3 packages. We would like to hear from you.
> >
> > These are the options I can think of for retiring those operators:
> >
> > 1) Completely remove them from the malhar repository.
> > 2) Move them from malhar-library into a separate artifact called
> > malhar-misc
> > 3) Mark them deprecated and add to their javadoc that they are no
> > longer supported
> >
> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >
> > David
> >
> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> > <pr...@datatorrent.com>
> > wrote:
> >
> >> I wanted to close the loop on this discussion. In general everyone
> >> seemed to be favorable to this idea with no serious objections. Folks
> >> had good suggestions like documenting capabilities of operators, come
> >> up well defined criteria for graduation of operators and what those
> >> criteria may be and what to do with existing operators that may not
> >> yet be mature or unused.
> >>
> >> I am going to summarize the key points that resulted from the
> >> discussion and would like to proceed with them.
> >>
> >>    - Operators that do not yet provide the key platform capabilities to
> >>    make an operator useful across different applications such as
> >> reusability,
> >>    partitioning static or dynamic, idempotency, exactly once will still
> be
> >>    accepted as long as they are functionally correct, have unit tests
> >> and will
> >>    go into a separate module.
> >>    - Contrib module was suggested as a place where new contributions go
> in
> >>    that don't yet have all the platform capabilities and are not yet
> >> mature.
> >>    If there are no other suggestions we will go with this one.
> >>    - It was suggested the operators documentation list those platform
> >>    capabilities it currently provides from the list above. I will
> >> document a
> >>    structure for this in the contribution guidelines.
> >>    - Folks wanted to know what would be the criteria to graduate an
> >>    operator to the big leagues :). I will kick-off a separate thread
> >> for it as
> >>    I think it requires its own discussion and hopefully we can come
> >> up with a
> >>    set of guidelines for it.
> >>    - David brought up state of some of the existing operators and their
> >>    retirement and the layout of operators in Malhar in general and how
> it
> >>    causes problems with development. I will ask him to lead the
> >> discussion on
> >>    that.
> >>
> >> Thanks
> >>
> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com>
> wrote:
> >>
> >> > The two ideas are not conflicting, but rather complementing.
> >> >
> >> > On the contrary, putting a new process for people trying to
> >> > contribute while NOT addressing the old unused subpar operators in
> >> > the repository
> >> is
> >> > what is conflicting.
> >> >
> >> > Keep in mind that when people try to contribute, they always look
> >> > at the existing operators already in the repository as examples and
> >> > likely a
> >> model
> >> > for their new operators.
> >> >
> >> > David
> >> >
> >> >
> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
> >> wrote:
> >> >
> >> > > Yes there are two conflicting threads now. The original thread
> >> > > was to
> >> > open
> >> > > up a way for contributors to submit code in a dir (contrib?) as
> >> > > long
> >> as
> >> > > license part of taken care of.
> >> > >
> >> > > On the thread of removing non-used operators -> How do we know
> >> > > what is being used?
> >> > >
> >> > > Thks,
> >> > > Amol
> >> > >
> >> > >
> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >> sandesh@datatorrent.com>
> >> > > wrote:
> >> > >
> >> > > > +1 for removing the not-used operators.
> >> > > >
> >> > > > So we are creating a process for operator writers who don't
> >> > > > want to understand the platform, yet wants to contribute? How
> >> > > > big is that
> >> set?
> >> > > > If we tell the app-user, here is the code which has not passed
> >> > > > all
> >> the
> >> > > > checklist, will they be ready to use that in production?
> >> > > >
> >> > > > This thread has 2 conflicting forces, reduce the operators and
> >> > > > make
> >> it
> >> > > easy
> >> > > > to add more operators.
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> >> > pramod@datatorrent.com>
> >> > > > wrote:
> >> > > >
> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> >> > > gaurav.gopi123@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > Pramod,
> >> > > > > >
> >> > > > > > By that logic I would say let's put all partitionable
> >> > > > > > operators
> >> > into
> >> > > > one
> >> > > > > > folder, non-partitionable operators in another and so on...
> >> > > > > >
> >> > > > >
> >> > > > > Remember the original goal of making it easier for new
> >> > > > > members to contribute and managing those contributions to
> >> > > > > maturity. It is
> >> not a
> >> > > > > functional level separation.
> >> > > > >
> >> > > > >
> >> > > > > > When I look at hadoop code I see these annotations being
> >> > > > > > used at
> >> > > class
> >> > > > > > level and not at package/folder level.
> >> > > > >
> >> > > > >
> >> > > > > I had a typo in my email, I meant to say "think of this like
> >> > > > > a
> >> > > folder..."
> >> > > > > as an analogy and not literally.
> >> > > > >
> >> > > > > Thanks
> >> > > > >
> >> > > > >
> >> > > > > > Thanks
> >> > > > > >
> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> >> > > > pramod@datatorrent.com
> >> > > > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> >> > > > > gaurav.gopi123@gmail.com>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > Can same goal not be achieved by using
> >> > > org.apache.hadoop.classification.InterfaceStability.Evolving
> >> > > > /
> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Uns
> >> > > > > > > > table
> >> > > > > > annotation?
> >> > > > > > > >
> >> > > > > > >
> >> > > > > > > I think it is important to localize the additions in one
> >> place so
> >> > > > that
> >> > > > > it
> >> > > > > > > becomes clearer to users about the maturity level of
> >> > > > > > > these,
> >> > easier
> >> > > > for
> >> > > > > > > developers to track them towards the path to maturity and
> >> > > > > > > also
> >> > > > > provides a
> >> > > > > > > clearer directive for committers and contributors on
> >> acceptance
> >> > of
> >> > > > new
> >> > > > > > > submissions. Relying on the annotations alone makes them
> >> spread
> >> > all
> >> > > > > over
> >> > > > > > > the place and adds an additional layer of difficulty in
> >> > > > identification
> >> > > > > > not
> >> > > > > > > just for users but also for developers who want to find
> >> > > > > > > such
> >> > > > operators
> >> > > > > > and
> >> > > > > > > improve them. This of this like a folder level annotation
> >> where
> >> > > > > > everything
> >> > > > > > > under this folder is unstable or evolving.
> >> > > > > > >
> >> > > > > > > Thanks
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> >> > > david@datatorrent.com
> >> > > > >
> >> > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Malhar in its current state, has way too many
> >> operators
> >> > > > that
> >> > > > > > fall
> >> > > > > > > > in
> >> > > > > > > > > > the
> >> > > > > > > > > > > > "non-production quality" category. We should
> >> > > > > > > > > > > > make it
> >> > > > obvious
> >> > > > > to
> >> > > > > > > > users
> >> > > > > > > > > > > that
> >> > > > > > > > > > > > which operators are up to par, and which
> >> > > > > > > > > > > > operators
> >> are
> >> > > not,
> >> > > > > and
> >> > > > > > > > maybe
> >> > > > > > > > > > > even
> >> > > > > > > > > > > > remove those that are likely not ever used in a
> >> > > > > > > > > > > > real
> >> > use
> >> > > > > case.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > I am ambivalent about revisiting older operators
> >> > > > > > > > > > > and
> >> > doing
> >> > > > this
> >> > > > > > > > > exercise
> >> > > > > > > > > > as
> >> > > > > > > > > > > this can cause unnecessary tensions. My original
> >> intent
> >> > is
> >> > > > for
> >> > > > > > > > > > > contributions going forward.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > IMO it is important to address this as well.
> >> > > > > > > > > > Operators
> >> > > outside
> >> > > > > the
> >> > > > > > > play
> >> > > > > > > > > > area should be of well known quality.
> >> > > > > > > > > >
> >> > > > > > > > > >
> >> > > > > > > > > I think this is important, and I don't anticipate
> >> > > > > > > > > much
> >> > tension
> >> > > if
> >> > > > > we
> >> > > > > > > > > establish clear criteria.
> >> > > > > > > > > It's not helpful if we let the old subpar operators
> >> > > > > > > > > stay
> >> and
> >> > > put
> >> > > > up
> >> > > > > > the
> >> > > > > > > > > bars for new operators.
> >> > > > > > > > >
> >> > > > > > > > > David
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> >
> >
>

RE: A proposal for Malhar

Posted by "Kottapalli, Venkatesh" <VK...@DIRECTV.com>.
+1 for deprecating the packages listed below.
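
For reference, a minimal sketch of what marking an operator as deprecated could look like in Java, assuming the standard `@Deprecated` annotation is paired with a `@deprecated` javadoc tag. `LegacySumOperator` and `DeprecationCheck` are hypothetical names used only for illustration; they are not classes from the packages under discussion.

```java
/**
 * Hypothetical legacy operator, used only to illustrate the deprecation marker.
 *
 * @deprecated No longer supported; kept only for backward compatibility.
 */
@Deprecated
class LegacySumOperator {
    private long sum;

    /** Adds one tuple value to the running sum. */
    void process(long value) {
        sum += value;
    }

    long getSum() {
        return sum;
    }
}

class DeprecationCheck {
    /** Returns true when the class carries the standard @Deprecated annotation. */
    static boolean isRetired(Class<?> type) {
        return type.isAnnotationPresent(Deprecated.class);
    }

    @SuppressWarnings("deprecation")
    public static void main(String[] args) {
        System.out.println(isRetired(LegacySumOperator.class)); // prints true
        System.out.println(isRetired(String.class));            // prints false
    }
}
```

Since `@Deprecated` is retained at runtime, a check like `isRetired` could also be used to list which classes in a package have been marked.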

-----Original Message-----
From: hsy541@gmail.com [mailto:hsy541@gmail.com] 
Sent: Tuesday, July 12, 2016 12:01 PM

+1

On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason why we would like to retire operators in
> Malhar is because some of them were written a long time ago before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and probably completely
> unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*, 
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are 
> using any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called 
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no 
> longer supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David

Re: A proposal for Malhar

Posted by "hsy541@gmail.com" <hs...@gmail.com>.
+1

On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason why we would like to retire operators in
> Malhar is because some of them were written a long time ago before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and probably completely
> unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
> supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David

Re: A proposal for Malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
I would suggest we go through the operators in those packages individually
and grade them into three buckets: those that meet the level we expect from
operators (there could be few of them), those that are potentially useful
but need additional work, and those that we don't think would be useful.
The ones in the first bucket can remain in place, the second set can be
moved to misc, and the third set can be moved to misc and deprecated.
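
The grading described above can be sketched as a simple mapping from bucket to action. This is only an illustration: the class, enum, and method names are hypothetical labels, not an agreed naming or existing code.

```java
class OperatorTriage {
    /** The three buckets an operator can be graded into. */
    enum Grade { MEETS_BAR, NEEDS_WORK, NOT_USEFUL }

    /** What happens to an operator once it has been graded. */
    enum Action { KEEP_IN_PLACE, MOVE_TO_MISC, MOVE_TO_MISC_AND_DEPRECATE }

    /** Maps a grade to the proposed follow-up action. */
    static Action actionFor(Grade grade) {
        switch (grade) {
            case MEETS_BAR:
                return Action.KEEP_IN_PLACE;
            case NEEDS_WORK:
                return Action.MOVE_TO_MISC;
            case NOT_USEFUL:
                return Action.MOVE_TO_MISC_AND_DEPRECATE;
            default:
                throw new IllegalArgumentException("unknown grade: " + grade);
        }
    }

    public static void main(String[] args) {
        // Print the proposed action for every bucket.
        for (Grade grade : Grade.values()) {
            System.out.println(grade + " -> " + actionFor(grade));
        }
    }
}
```

Keeping the mapping in one place would make it easy to review the outcome per operator before anything is actually moved.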

Thanks

On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason why we would like to retire operators in
> Malhar is because some of them were written a long time ago before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and probably completely
> unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
> supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David

Re: A proposal for Malhar

Posted by "hsy541@gmail.com" <hs...@gmail.com>.
+1

On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com> wrote:

> Hi all,
>
> I would like to renew the discussion of retiring operators in Malhar.
>
> As stated before, the reason why we would like to retire operators in
> Malhar is because some of them were written a long time ago before Apache
> incubation, and they do not pertain to real use cases, are not up to par in
> code quality, have no potential for improvement, and probably completely
> unused by anybody.
>
> We do not want contributors to use them as a model of their contribution,
> or users to use them thinking they are of quality, and then hit a wall.
> Both scenarios are not beneficial to the reputation of Apex.
>
> The initial 3 packages that we would like to target are *lib/algo*,
> *lib/math*, and *lib/streamquery*.
>
> I'm adding this thread to the users list. Please speak up if you are using
> any operator in these 3 packages. We would like to hear from you.
>
> These are the options I can think of for retiring those operators:
>
> 1) Completely remove them from the malhar repository.
> 2) Move them from malhar-library into a separate artifact called
> malhar-misc
> 3) Mark them deprecated and add to their javadoc that they are no longer
> supported
>
> Note that 2 and 3 are not mutually exclusive. Any thoughts?
>
> David
>
> On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
>> I wanted to close the loop on this discussion. In general everyone seemed
>> to be favorable to this idea with no serious objections. Folks had good
>> suggestions like documenting capabilities of operators, come up well
>> defined criteria for graduation of operators and what those criteria may
>> be
>> and what to do with existing operators that may not yet be mature or
>> unused.
>>
>> I am going to summarize the key points that resulted from the discussion
>> and would like to proceed with them.
>>
>>    - Operators that do not yet provide the key platform capabilities to
>>    make an operator useful across different applications such as
>>    reusability, partitioning static or dynamic, idempotency, exactly once
>>    will still be accepted as long as they are functionally correct, have
>>    unit tests and will go into a separate module.
>>    - Contrib module was suggested as a place where new contributions go in
>>    that don't yet have all the platform capabilities and are not yet
>>    mature. If there are no other suggestions we will go with this one.
>>    - It was suggested the operators documentation list those platform
>>    capabilities it currently provides from the list above. I will
>>    document a structure for this in the contribution guidelines.
>>    - Folks wanted to know what would be the criteria to graduate an
>>    operator to the big leagues :). I will kick-off a separate thread for
>>    it as I think it requires its own discussion and hopefully we can come
>>    up with a set of guidelines for it.
>>    - David brought up state of some of the existing operators and their
>>    retirement and the layout of operators in Malhar in general and how it
>>    causes problems with development. I will ask him to lead the
>>    discussion on that.
>>
>> Thanks
>>
>> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
>>
>> > The two ideas are not conflicting, but rather complementing.
>> >
>> > On the contrary, putting a new process for people trying to contribute
>> > while NOT addressing the old unused subpar operators in the repository
>> is
>> > what is conflicting.
>> >
>> > Keep in mind that when people try to contribute, they always look at the
>> > existing operators already in the repository as examples and likely a
>> model
>> > for their new operators.
>> >
>> > David
>> >
>> >
>> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
>> wrote:
>> >
>> > > Yes there are two conflicting threads now. The original thread was to
>> > open
>> > > up a way for contributors to submit code in a dir (contrib?) as long
>> as
>> > > license part of taken care of.
>> > >
>> > > On the thread of removing non-used operators -> How do we know what is
>> > > being used?
>> > >
>> > > Thks,
>> > > Amol
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> sandesh@datatorrent.com>
>> > > wrote:
>> > >
>> > > > +1 for removing the not-used operators.
>> > > >
>> > > > So we are creating a process for operator writers who don't want to
>> > > > understand the platform, yet wants to contribute? How big is that
>> set?
>> > > > If we tell the app-user, here is the code which has not passed all
>> the
>> > > > checklist, will they be ready to use that in production?
>> > > >
>> > > > This thread has 2 conflicting forces, reduce the operators and make
>> it
>> > > easy
>> > > > to add more operators.
>> > > >
>> > > >
>> > > >
>> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> > pramod@datatorrent.com>
>> > > > wrote:
>> > > >
>> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> > > gaurav.gopi123@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > Pramod,
>> > > > > >
>> > > > > > By that logic I would say let's put all partitionable operators
>> > into
>> > > > one
>> > > > > > folder, non-partitionable operators in another and so on...
>> > > > > >
>> > > > >
>> > > > > Remember the original goal of making it easier for new members to
>> > > > > contribute and managing those contributions to maturity. It is
>> not a
>> > > > > functional level separation.
>> > > > >
>> > > > >
>> > > > > > When I look at hadoop code I see these annotations being used at
>> > > class
>> > > > > > level and not at package/folder level.
>> > > > >
>> > > > >
>> > > > > I had a typo in my email, I meant to say "think of this like a
>> > > folder..."
>> > > > > as an analogy and not literally.
>> > > > >
>> > > > > Thanks
>> > > > >
>> > > > >
>> > > > > > Thanks
>> > > > > >
>> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> > > > pramod@datatorrent.com
>> > > > > >
>> > > > > > wrote:
>> > > > > >
>> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> > > > > gaurav.gopi123@gmail.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Can same goal not be achieved by
>> > > > > > > > using
>> > > org.apache.hadoop.classification.InterfaceStability.Evolving
>> > > > /
>> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>> > > > > > annotation?
>> > > > > > > >
>> > > > > > >
>> > > > > > > I think it is important to localize the additions in one
>> place so
>> > > > that
>> > > > > it
>> > > > > > > becomes clearer to users about the maturity level of these,
>> > easier
>> > > > for
>> > > > > > > developers to track them towards the path to maturity and also
>> > > > > provides a
>> > > > > > > clearer directive for committers and contributors on
>> acceptance
>> > of
>> > > > new
>> > > > > > > submissions. Relying on the annotations alone makes them
>> spread
>> > all
>> > > > > over
>> > > > > > > the place and adds an additional layer of difficulty in
>> > > > identification
>> > > > > > not
>> > > > > > > just for users but also for developers who want to find such
>> > > > operators
>> > > > > > and
>> > > > > > > improve them. This of this like a folder level annotation
>> where
>> > > > > > everything
>> > > > > > > under this folder is unstable or evolving.
>> > > > > > >
>> > > > > > > Thanks
>> > > > > > >
>> > > > > > >
>> > > > > > > >
>> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> > > david@datatorrent.com
>> > > > >
>> > > > > > > wrote:
>> > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > Malhar in its current state, has way too many
>> operators
>> > > > that
>> > > > > > fall
>> > > > > > > > in
>> > > > > > > > > > the
>> > > > > > > > > > > > "non-production quality" category. We should make it
>> > > > obvious
>> > > > > to
>> > > > > > > > users
>> > > > > > > > > > > that
>> > > > > > > > > > > > which operators are up to par, and which operators
>> are
>> > > not,
>> > > > > and
>> > > > > > > > maybe
>> > > > > > > > > > > even
>> > > > > > > > > > > > remove those that are likely not ever used in a real
>> > use
>> > > > > case.
>> > > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > I am ambivalent about revisiting older operators and
>> > doing
>> > > > this
>> > > > > > > > > exercise
>> > > > > > > > > > as
>> > > > > > > > > > > this can cause unnecessary tensions. My original
>> intent
>> > is
>> > > > for
>> > > > > > > > > > > contributions going forward.
>> > > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > IMO it is important to address this as well. Operators
>> > > outside
>> > > > > the
>> > > > > > > play
>> > > > > > > > > > area should be of well known quality.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > I think this is important, and I don't anticipate much
>> > tension
>> > > if
>> > > > > we
>> > > > > > > > > establish clear criteria.
>> > > > > > > > > It's not helpful if we let the old subpar operators stay
>> and
>> > > put
>> > > > up
>> > > > > > the
>> > > > > > > > > bars for new operators.
>> > > > > > > > >
>> > > > > > > > > David
>> > > > > > > > >
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>

Re: A proposal for Malhar

Posted by David Yan <da...@datatorrent.com>.
Hi all,

I would like to renew the discussion of retiring operators in Malhar.

As stated before, we would like to retire some operators in Malhar because
they were written a long time ago, before Apache incubation. They do not
pertain to real use cases, are not up to par in code quality, have no
potential for improvement, and are probably not used by anybody.

We do not want contributors to use them as a model for their contributions,
or users to adopt them thinking they are production quality and then hit a
wall. Neither scenario is good for the reputation of Apex.

The initial 3 packages that we would like to target are *lib/algo*,
*lib/math*, and *lib/streamquery*.

I'm adding this thread to the users list. Please speak up if you are using
any operator in these 3 packages. We would like to hear from you.

These are the options I can think of for retiring those operators:

1) Completely remove them from the malhar repository.
2) Move them from malhar-library into a separate artifact called
malhar-misc.
3) Mark them deprecated and note in their javadoc that they are no longer
supported.

Note that options 2 and 3 are not mutually exclusive. Any thoughts?
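
To make option 3 concrete, here is a minimal sketch of what a deprecated
operator could look like. It is only an illustration: LegacySumOperator and
isRetired are hypothetical names, not classes or methods in Malhar, and the
javadoc wording is just a suggestion.

```java
public class DeprecationSketch {

  /**
   * Hypothetical stand-in for an old operator (not a real Malhar class).
   *
   * @deprecated This operator is no longer supported and may be removed
   *             in a future release.
   */
  @Deprecated
  static class LegacySumOperator {
    private long sum;

    // Accumulates each incoming value into a running total.
    void process(long tuple) {
      sum += tuple;
    }

    long getSum() {
      return sum;
    }
  }

  /** Returns true if the class carries the standard @Deprecated marker. */
  static boolean isRetired(Class<?> type) {
    // java.lang.Deprecated has runtime retention, so reflection can see it.
    return type.isAnnotationPresent(Deprecated.class);
  }

  public static void main(String[] args) {
    LegacySumOperator op = new LegacySumOperator();
    op.process(2);
    op.process(3);
    System.out.println("sum=" + op.getSum());
    System.out.println("retired=" + isRetired(LegacySumOperator.class));
  }
}
```

Because the marker is the standard annotation, the compiler warns callers
outside the declaring class, which gives users a visible signal before
anything is moved or removed.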

David

On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> I wanted to close the loop on this discussion. In general everyone seemed
> to be favorable to this idea with no serious objections. Folks had good
> suggestions like documenting capabilities of operators, come up well
> defined criteria for graduation of operators and what those criteria may be
> and what to do with existing operators that may not yet be mature or
> unused.
>
> I am going to summarize the key points that resulted from the discussion
> and would like to proceed with them.
>
>    - Operators that do not yet provide the key platform capabilities to
>    make an operator useful across different applications such as
>    reusability, partitioning static or dynamic, idempotency, exactly once
>    will still be accepted as long as they are functionally correct, have
>    unit tests and will go into a separate module.
>    - Contrib module was suggested as a place where new contributions go in
>    that don't yet have all the platform capabilities and are not yet
>    mature. If there are no other suggestions we will go with this one.
>    - It was suggested the operators documentation list those platform
>    capabilities it currently provides from the list above. I will document
>    a structure for this in the contribution guidelines.
>    - Folks wanted to know what would be the criteria to graduate an
>    operator to the big leagues :). I will kick-off a separate thread for
>    it as I think it requires its own discussion and hopefully we can come
>    up with a set of guidelines for it.
>    - David brought up state of some of the existing operators and their
>    retirement and the layout of operators in Malhar in general and how it
>    causes problems with development. I will ask him to lead the discussion
>    on that.
>
> Thanks
>
> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
>
> > The two ideas are not conflicting, but rather complementing.
> >
> > On the contrary, putting a new process for people trying to contribute
> > while NOT addressing the old unused subpar operators in the repository is
> > what is conflicting.
> >
> > Keep in mind that when people try to contribute, they always look at the
> > existing operators already in the repository as examples and likely a
> model
> > for their new operators.
> >
> > David
> >
> >
> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com>
> wrote:
> >
> > > Yes there are two conflicting threads now. The original thread was to
> > open
> > > up a way for contributors to submit code in a dir (contrib?) as long as
> > > license part of taken care of.
> > >
> > > On the thread of removing non-used operators -> How do we know what is
> > > being used?
> > >
> > > Thks,
> > > Amol
> > >
> > >
> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> sandesh@datatorrent.com>
> > > wrote:
> > >
> > > > +1 for removing the not-used operators.
> > > >
> > > > So we are creating a process for operator writers who don't want to
> > > > understand the platform, yet wants to contribute? How big is that
> set?
> > > > If we tell the app-user, here is the code which has not passed all
> the
> > > > checklist, will they be ready to use that in production?
> > > >
> > > > This thread has 2 conflicting forces, reduce the operators and make
> it
> > > easy
> > > > to add more operators.
> > > >
> > > >
> > > >
> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> > pramod@datatorrent.com>
> > > > wrote:
> > > >
> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> > > gaurav.gopi123@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Pramod,
> > > > > >
> > > > > > By that logic I would say let's put all partitionable operators
> > into
> > > > one
> > > > > > folder, non-partitionable operators in another and so on...
> > > > > >
> > > > >
> > > > > Remember the original goal of making it easier for new members to
> > > > > contribute and managing those contributions to maturity. It is not
> a
> > > > > functional level separation.
> > > > >
> > > > >
> > > > > > When I look at hadoop code I see these annotations being used at
> > > class
> > > > > > level and not at package/folder level.
> > > > >
> > > > >
> > > > > I had a typo in my email, I meant to say "think of this like a
> > > folder..."
> > > > > as an analogy and not literally.
> > > > >
> > > > > Thanks
> > > > >
> > > > >
> > > > > > Thanks
> > > > > >
> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> > > > pramod@datatorrent.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> > > > > gaurav.gopi123@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Can same goal not be achieved by
> > > > > > > > using
> > > org.apache.hadoop.classification.InterfaceStability.Evolving
> > > > /
> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> > > > > > annotation?
> > > > > > > >
> > > > > > >
> > > > > > > I think it is important to localize the additions in one place
> so
> > > > that
> > > > > it
> > > > > > > becomes clearer to users about the maturity level of these,
> > easier
> > > > for
> > > > > > > developers to track them towards the path to maturity and also
> > > > > provides a
> > > > > > > clearer directive for committers and contributors on acceptance
> > of
> > > > new
> > > > > > > submissions. Relying on the annotations alone makes them spread
> > all
> > > > > over
> > > > > > > the place and adds an additional layer of difficulty in
> > > > identification
> > > > > > not
> > > > > > > just for users but also for developers who want to find such
> > > > operators
> > > > > > and
> > > > > > > improve them. This of this like a folder level annotation where
> > > > > > everything
> > > > > > > under this folder is unstable or evolving.
> > > > > > >
> > > > > > > Thanks
> > > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> > > david@datatorrent.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Malhar in its current state, has way too many
> operators
> > > > that
> > > > > > fall
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > "non-production quality" category. We should make it
> > > > obvious
> > > > > to
> > > > > > > > users
> > > > > > > > > > > that
> > > > > > > > > > > > which operators are up to par, and which operators
> are
> > > not,
> > > > > and
> > > > > > > > maybe
> > > > > > > > > > > even
> > > > > > > > > > > > remove those that are likely not ever used in a real
> > use
> > > > > case.
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I am ambivalent about revisiting older operators and
> > doing
> > > > this
> > > > > > > > > exercise
> > > > > > > > > > as
> > > > > > > > > > > this can cause unnecessary tensions. My original intent
> > is
> > > > for
> > > > > > > > > > > contributions going forward.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > IMO it is important to address this as well. Operators
> > > outside
> > > > > the
> > > > > > > play
> > > > > > > > > > area should be of well known quality.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > I think this is important, and I don't anticipate much
> > tension
> > > if
> > > > > we
> > > > > > > > > establish clear criteria.
> > > > > > > > > It's not helpful if we let the old subpar operators stay
> and
> > > put
> > > > up
> > > > > > the
> > > > > > > > > bars for new operators.
> > > > > > > > >
> > > > > > > > > David
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: A proposal for Malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
A document with the Malhar contribution guidelines has been prepared and
submitted as a pull request:

https://github.com/apache/apex-site/pull/44

Thanks

On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> I wanted to close the loop on this discussion. In general everyone seemed
> to be favorable to this idea with no serious objections. Folks had good
> suggestions like documenting capabilities of operators, come up well
> defined criteria for graduation of operators and what those criteria may be
> and what to do with existing operators that may not yet be mature or
> unused.
>
> I am going to summarize the key points that resulted from the discussion
> and would like to proceed with them.
>
>    - Operators that do not yet provide the key platform capabilities to
>    make an operator useful across different applications such as reusability,
>    partitioning static or dynamic, idempotency, exactly once will still be
>    accepted as long as they are functionally correct, have unit tests and will
>    go into a separate module.
>    - Contrib module was suggested as a place where new contributions go
>    in that don't yet have all the platform capabilities and are not yet
>    mature. If there are no other suggestions we will go with this one.
>    - It was suggested the operators documentation list those platform
>    capabilities it currently provides from the list above. I will document a
>    structure for this in the contribution guidelines.
>    - Folks wanted to know what would be the criteria to graduate an
>    operator to the big leagues :). I will kick-off a separate thread for it as
>    I think it requires its own discussion and hopefully we can come up with a
>    set of guidelines for it.
>    - David brought up state of some of the existing operators and their
>    retirement and the layout of operators in Malhar in general and how it
>    causes problems with development. I will ask him to lead the discussion on
>    that.
>
> Thanks
>
> On Fri, May 27, 2016 at 7:47 PM, David Yan <da...@datatorrent.com> wrote:
>
>> The two ideas are not conflicting, but rather complementing.
>>
>> On the contrary, putting a new process for people trying to contribute
>> while NOT addressing the old unused subpar operators in the repository is
>> what is conflicting.
>>
>> Keep in mind that when people try to contribute, they always look at the
>> existing operators already in the repository as examples and likely a
>> model
>> for their new operators.
>>
>> David
>>
>>
>> On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <am...@datatorrent.com> wrote:
>>
>> > Yes there are two conflicting threads now. The original thread was to
>> open
>> > up a way for contributors to submit code in a dir (contrib?) as long as
>> > license part of taken care of.
>> >
>> > On the thread of removing non-used operators -> How do we know what is
>> > being used?
>> >
>> > Thks,
>> > Amol
>> >
>> >
>> > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>> > wrote:
>> >
>> > > +1 for removing the not-used operators.
>> > >
>> > > So we are creating a process for operator writers who don't want to
>> > > understand the platform, yet wants to contribute? How big is that set?
>> > > If we tell the app-user, here is the code which has not passed all the
>> > > checklist, will they be ready to use that in production?
>> > >
>> > > This thread has 2 conflicting forces, reduce the operators and make it
>> > easy
>> > > to add more operators.
>> > >
>> > >
>> > >
>> > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> pramod@datatorrent.com>
>> > > wrote:
>> > >
>> > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> > gaurav.gopi123@gmail.com>
>> > > > wrote:
>> > > >
>> > > > > Pramod,
>> > > > >
>> > > > > By that logic I would say let's put all partitionable operators
>> into
>> > > one
>> > > > > folder, non-partitionable operators in another and so on...
>> > > > >
>> > > >
>> > > > Remember the original goal of making it easier for new members to
>> > > > contribute and managing those contributions to maturity. It is not a
>> > > > functional level separation.
>> > > >
>> > > >
>> > > > > When I look at hadoop code I see these annotations being used at
>> > class
>> > > > > level and not at package/folder level.
>> > > >
>> > > >
>> > > > I had a typo in my email, I meant to say "think of this like a
>> > folder..."
>> > > > as an analogy and not literally.
>> > > >
>> > > > Thanks
>> > > >
>> > > >
>> > > > > Thanks
>> > > > >
>> > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> > > pramod@datatorrent.com
>> > > > >
>> > > > > wrote:
>> > > > >
>> > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> > > > gaurav.gopi123@gmail.com>
>> > > > > > wrote:
>> > > > > >
>> > > > > > > Can same goal not be achieved by
>> > > > > > > using
>> > org.apache.hadoop.classification.InterfaceStability.Evolving
>> > > /
>> > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>> > > > > annotation?
>> > > > > > >
>> > > > > >
>> > > > > > I think it is important to localize the additions in one place
>> so
>> > > that
>> > > > it
>> > > > > > becomes clearer to users about the maturity level of these,
>> easier
>> > > for
>> > > > > > developers to track them towards the path to maturity and also
>> > > > provides a
>> > > > > > clearer directive for committers and contributors on acceptance
>> of
>> > > new
>> > > > > > submissions. Relying on the annotations alone makes them spread
>> all
>> > > > over
>> > > > > > the place and adds an additional layer of difficulty in
>> > > identification
>> > > > > not
>> > > > > > just for users but also for developers who want to find such
>> > > operators
>> > > > > and
>> > > > > > improve them. This of this like a folder level annotation where
>> > > > > everything
>> > > > > > under this folder is unstable or evolving.
>> > > > > >
>> > > > > > Thanks
>> > > > > >
>> > > > > >
>> > > > > > >
>> > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> > david@datatorrent.com
>> > > >
>> > > > > > wrote:
>> > > > > > >
>> > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > >
>> > > > > > > > > > > Malhar in its current state, has way too many
>> operators
>> > > that
>> > > > > fall
>> > > > > > > in
>> > > > > > > > > the
>> > > > > > > > > > > "non-production quality" category. We should make it
>> > > obvious
>> > > > to
>> > > > > > > users
>> > > > > > > > > > that
>> > > > > > > > > > > which operators are up to par, and which operators are
>> > not,
>> > > > and
>> > > > > > > maybe
>> > > > > > > > > > even
>> > > > > > > > > > > remove those that are likely not ever used in a real
>> use
>> > > > case.
>> > > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > > I am ambivalent about revisiting older operators and
>> doing
>> > > this
>> > > > > > > > exercise
>> > > > > > > > > as
>> > > > > > > > > > this can cause unnecessary tensions. My original intent
>> is
>> > > for
>> > > > > > > > > > contributions going forward.
>> > > > > > > > > >
>> > > > > > > > > >
>> > > > > > > > > IMO it is important to address this as well. Operators
>> > outside
>> > > > the
>> > > > > > play
>> > > > > > > > > area should be of well known quality.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > I think this is important, and I don't anticipate much
>> tension
>> > if
>> > > > we
>> > > > > > > > establish clear criteria.
>> > > > > > > > It's not helpful if we let the old subpar operators stay and
>> > put
>> > > up
>> > > > > the
>> > > > > > > > bars for new operators.
>> > > > > > > >
>> > > > > > > > David
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>>
>
>