Posted to users@apex.apache.org by Lakshmi Velineni <la...@datatorrent.com> on 2016/08/01 07:22:41 UTC

Re: A proposal for Malhar

Hi

I also added recommendations for the lib/math operators to the same document,
as a separate sheet. Please have a look.

Thanks
Lakshmi Prasanna

On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> I also added a recommendation for each operator. Please take a look.
>
> thanks
>
> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> lakshmi@datatorrent.com> wrote:
>
>> Hi,
>>
>> I created a shared Google Sheet to track the various details of the
>> operators. Currently, the sheet contains information about the operators
>> under lib/algo only. The link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> I will update the sheet soon with lib/math too.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com> wrote:
>>
>>> Hi Lakshmi,
>>>
>>> Thanks for volunteering.
>>>
>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> individual operators are both good, with the exception that lib/streamquery
>>> is one unit and we probably do not need to look at individual operators
>>> under it.
>>>
>>> If we don't have any objection in the community, let's start the process.
>>>
>>> David
>>>
>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com> wrote:
>>>
>>>> I am interested in working on this.
>>>>
>>>> Regards,
>>>> Lakshmi prasanna
>>>>
>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>>> wrote:
>>>>
>>>> > Why not have a shared Google Sheet with a list of operators and the
>>>> > options for what we want to do with each of them?
>>>> > I think it's case by case.
>>>> > But retiring unused or obsolete operators is important, and we should
>>>> > do it sooner rather than later.
>>>> >
>>>> > Regards,
>>>> > Siyuan
>>>> >
>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>>> > wrote:
>>>> >
>>>> >>
>>>> >> My vote is to do 2&3
>>>> >>
>>>> >> Thks
>>>> >> Amol
>>>> >>
>>>> >>
>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>>> >> VKottapalli@directv.com> wrote:
>>>> >>
>>>> >>> +1 for deprecating the packages listed below.
>>>> >>>
>>>> >>> -----Original Message-----
>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>> >>>
>>>> >>> +1
>>>> >>>
>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <da...@datatorrent.com>
>>>> >>> wrote:
>>>> >>>
>>>> >>> > Hi all,
>>>> >>> >
>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>>> >>> >
>>>> >>> > As stated before, the reason we would like to retire operators in
>>>> >>> > Malhar is that some of them were written a long time ago, before
>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>> >>> > are probably completely unused by anybody.
>>>> >>> >
>>>> >>> > We do not want contributors to use them as a model for their
>>>> >>> > contributions, or users to use them thinking they are of good quality
>>>> >>> > and then hit a wall.
>>>> >>> > Neither scenario is beneficial to the reputation of Apex.
>>>> >>> >
>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>> >>> >
>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>>> >>> >
>>>> >>> > These are the options I can think of for retiring those operators:
>>>> >>> >
>>>> >>> > 1) Completely remove them from the Malhar repository.
>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>> >>> >    malhar-misc.
>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>> >>> >    longer supported.
>>>> >>> >
>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>> >>> >
>>>> >>> > David
>>>> >>> >
>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>>> >>> > <pr...@datatorrent.com>
>>>> >>> > wrote:
>>>> >>> >
>>>> >>> >> I wanted to close the loop on this discussion. In general, everyone
>>>> >>> >> seemed to be favorable to this idea, with no serious objections. Folks
>>>> >>> >> had good suggestions, such as documenting the capabilities of
>>>> >>> >> operators, coming up with well-defined criteria for graduation of
>>>> >>> >> operators (and what those criteria may be), and deciding what to do
>>>> >>> >> with existing operators that may not yet be mature or may be unused.
>>>> >>> >>
>>>> >>> >> I am going to summarize the key points that resulted from the
>>>> >>> >> discussion and would like to proceed with them.
>>>> >>> >>
>>>> >>> >>    - Operators that do not yet provide the key platform capabilities
>>>> >>> >>    that make an operator useful across different applications, such as
>>>> >>> >>    reusability, static or dynamic partitioning, idempotency, and
>>>> >>> >>    exactly-once, will still be accepted as long as they are
>>>> >>> >>    functionally correct and have unit tests; they will go into a
>>>> >>> >>    separate module.
>>>> >>> >>    - The contrib module was suggested as the place for new
>>>> >>> >>    contributions that don't yet have all the platform capabilities and
>>>> >>> >>    are not yet mature. If there are no other suggestions, we will go
>>>> >>> >>    with this one.
>>>> >>> >>    - It was suggested that each operator's documentation list the
>>>> >>> >>    platform capabilities it currently provides from the list above. I
>>>> >>> >>    will document a structure for this in the contribution guidelines.
>>>> >>> >>    - Folks wanted to know what the criteria would be to graduate an
>>>> >>> >>    operator to the big leagues :). I will kick off a separate thread
>>>> >>> >>    for it, as I think it requires its own discussion, and hopefully we
>>>> >>> >>    can come up with a set of guidelines for it.
>>>> >>> >>    - David brought up the state of some of the existing operators and
>>>> >>> >>    their retirement, and the layout of operators in Malhar in general
>>>> >>> >>    and how it causes problems with development. I will ask him to lead
>>>> >>> >>    the discussion on that.
>>>> >>> >>
>>>> >>> >> Thanks
>>>> >>> >>
>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com>
>>>> >>> >> wrote:
>>>> >>> >>
>>>> >>> >> > The two ideas are not conflicting, but rather complementary.
>>>> >>> >> >
>>>> >>> >> > On the contrary, putting in place a new process for people trying to
>>>> >>> >> > contribute while NOT addressing the old, unused, subpar operators in
>>>> >>> >> > the repository is what is conflicting.
>>>> >>> >> >
>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>>> >>> >> > at the existing operators already in the repository as examples and
>>>> >>> >> > likely as a model for their new operators.
>>>> >>> >> >
>>>> >>> >> > David
>>>> >>> >> >
>>>> >>> >> >
>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
>>>> >>> >> > wrote:
>>>> >>> >> >
>>>> >>> >> > > Yes, there are two conflicting threads now. The original thread
>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>>> >>> >> > > (contrib?) as long as the license part is taken care of.
>>>> >>> >> > >
>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>> >>> >> > > what is being used?
>>>> >>> >> > >
>>>> >>> >> > > Thks,
>>>> >>> >> > > Amol
>>>> >>> >> > >
>>>> >>> >> > >
>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>>> >>> >> > > wrote:
>>>> >>> >> > >
>>>> >>> >> > > > +1 for removing the not-used operators.
>>>> >>> >> > > >
>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>>>> >>> >> > > > big is that set?
>>>> >>> >> > > > If we tell the app user, "here is code which has not passed
>>>> >>> >> > > > the whole checklist", will they be ready to use it in production?
>>>> >>> >> > > >
>>>> >>> >> > > > This thread has 2 conflicting forces: reduce the operators, and
>>>> >>> >> > > > make it easy to add more operators.
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > >
>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>>> >>> >> > > > wrote:
>>>> >>> >> > > >
>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>> >>> >> > > > > wrote:
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Pramod,
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>>> >>> >> > > > > > operators into one folder, non-partitionable operators in
>>>> >>> >> > > > > > another, and so on...
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>>> >>> >> > > > > members to contribute and managing those contributions to
>>>> >>> >> > > > > maturity. It is not a functional-level separation.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > When I look at Hadoop code I see these annotations being
>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > I had a typo in my email; I meant to say "think of this like
>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>>> >>> >> > > > >
>>>> >>> >> > > > > Thanks
>>>> >>> >> > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > > > > Thanks
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>>> >>> >> > > > > > wrote:
>>>> >>> >> > > > > >
>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>> >>> >> > > > > > > wrote:
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>> >>> >> > > > > > > > annotation?
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > I think it is important to localize the additions in one
>>>> >>> >> > > > > > > place so that it becomes clearer to users about the maturity
>>>> >>> >> > > > > > > level of these, easier for developers to track them towards
>>>> >>> >> > > > > > > the path to maturity and also provides a clearer directive
>>>> >>> >> > > > > > > for committers and contributors on acceptance of new
>>>> >>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>>> >>> >> > > > > > > spread all over the place and adds an additional layer of
>>>> >>> >> > > > > > > difficulty in identification not just for users but also for
>>>> >>> >> > > > > > > developers who want to find such operators and improve them.
>>>> >>> >> > > > > > > This of this like a folder level annotation where everything
>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > Thanks
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>>>> >>> >> > > > > > > > wrote:
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > > Malhar, in its current state, has way too many
>>>> >>> >> > > > > > > > > > > > operators that fall in the "non-production quality"
>>>> >>> >> > > > > > > > > > > > category. We should make it obvious to users which
>>>> >>> >> > > > > > > > > > > > operators are up to par and which operators are not,
>>>> >>> >> > > > > > > > > > > > and maybe even remove those that are likely not ever
>>>> >>> >> > > > > > > > > > > > used in a real use case.
>>>> >>> >> > > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and
>>>> >>> >> > > > > > > > > > > doing this exercise, as this can cause unnecessary
>>>> >>> >> > > > > > > > > > > tensions. My original intent is for contributions
>>>> >>> >> > > > > > > > > > > going forward.
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > >
>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators
>>>> >>> >> > > > > > > > > > outside the play area should be of well-known quality.
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > >
>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much
>>>> >>> >> > > > > > > > > tension if we establish clear criteria.
>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay
>>>> >>> >> > > > > > > > > and raise the bar for new operators.
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > > > David
>>>> >>> >> > > > > > > > >
>>>> >>> >> > > > > > > >
>>>> >>> >> > > > > > >
>>>> >>> >> > > > > >
>>>> >>> >> > > > >
>>>> >>> >> > > >
>>>> >>> >> > >
>>>> >>> >> >
>>>> >>> >>
>>>> >>> >
>>>> >>> >
>>>> >>>
>>>> >>
>>>> >>
>>>> >
>>>>
>>>
>>>
>>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Thomas, thanks for the suggestions and the comments in the document. I will
take another look at the ones that I had shortlisted in the document to
keep. Within that subset, would it be OK to leave the ones that don't have
a large state problem for the time being, until we have replacement
operators implemented with the new windowing and state management? After
the cleanup, I can also help with the development of those replacement
operators.

Thanks

On Tue, Aug 9, 2016 at 11:21 AM, Thomas Weise <th...@gmail.com>
wrote:

> There are a bunch of operators that don't have proper state management and
> also don't support generic windowing (event time, etc.). I would suggest
> moving those out or deprecating them.
>
> The new windowing and state management support along with the appropriate
> aggregators is going to make them obsolete.
>
> Thomas
>
>
> On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <lakshmi@datatorrent.com>
> wrote:
>
>> Hi,
>>
>> Friendly reminder:
>>
>> I created a shared Google Sheet to track the various details of the
>> operators. The sheet contains information about operators under lib/algo,
>> lib/math & lib/streamquery. The link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> I also added a recommendation for each operator. Please take a look and
>> provide comments, if any.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
>> wrote:
>>
>>> Added comments. I also recommend having the misc folder for the remaining
>>> operators in contrib, according to the proposed guidelines:
>>>
>>> https://github.com/apache/apex-site/pull/44
>>>
>>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com>
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > I also added recommendations for the lib/math operators to the same
>>> > document, as a separate sheet. Please have a look.
>>> >
>>> > Thanks
>>> > Lakshmi Prasanna
>>> >
>>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com>
>>> > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I also added a recommendation for each operator. Please take a look.
>>> >>
>>> >> thanks
>>> >>
>>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>>> >> lakshmi@datatorrent.com> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I created a shared Google Sheet to track the various details of the
>>> >>> operators. Currently, the sheet contains information about the operators
>>> >>> under lib/algo only. The link is
>>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>>> >>> I will update the sheet soon with lib/math too.
>>> >>>
>>> >>> Thanks
>>> >>> Lakshmi Prasanna
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Hi Lakshmi,
>>> >>>>
>>> >>>> Thanks for volunteering.
>>> >>>>
>>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> >>>> individual operators are both good, with the exception that
>>> >>>> lib/streamquery is one unit and we probably do not need to look at
>>> >>>> individual operators under it.
>>> >>>>
>>> >>>> If we don't have any objection in the community, let's start the
>>> >>>> process.
>>> >>>>
>>> >>>> David
>>> >>>>
>>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> >>>> lakshmi@datatorrent.com> wrote:
>>> >>>>
>>> >>>>> I am interested in working on this.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Lakshmi prasanna
>>> >>>>>
>>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hsy541@gmail.com>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>> > Why not have a shared Google Sheet with a list of operators and the
>>> >>>>> > options for what we want to do with each of them?
>>> >>>>> > I think it's case by case.
>>> >>>>> > But retiring unused or obsolete operators is important, and we should
>>> >>>>> > do it sooner rather than later.
>>> >>>>> >
>>> >>>>> > Regards,
>>> >>>>> > Siyuan
>>> >>>>> >
>>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <amol@datatorrent.com>
>>> >>>>> > wrote:
>>> >>>>> >
>>> >>>>> >>
>>> >>>>> >> My vote is to do 2&3
>>> >>>>> >>
>>> >>>>> >> Thks
>>> >>>>> >> Amol
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> >>>>> >> VKottapalli@directv.com> wrote:
>>> >>>>> >>
>>> >>>>> >>> +1 for deprecating the packages listed below.
>>> >>>>> >>>
>>> >>>>> >>> -----Original Message-----
>>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>>> >>>
>>> >>>>> >>> +1
>>> >>>>> >>>
>>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com>
>>> >>>>> >>> wrote:
>>> >>>>> >>>
>>> >>>>> >>> > Hi all,
>>> >>>>> >>> >
>>> >>>>> >>> > I would like to renew the discussion of retiring operators in
>>> >>>>> >>> > Malhar.
>>> >>>>> >>> >
>>> >>>>> >>> > As stated before, the reason we would like to retire operators in
>>> >>>>> >>> > Malhar is that some of them were written a long time ago, before
>>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are
>>> >>>>> >>> > not up to par in code quality, have no potential for improvement,
>>> >>>>> >>> > and are probably completely unused by anybody.
>>> >>>>> >>> >
>>> >>>>> >>> > We do not want contributors to use them as a model for their
>>> >>>>> >>> > contributions, or users to use them thinking they are of good
>>> >>>>> >>> > quality and then hit a wall.
>>> >>>>> >>> > Neither scenario is beneficial to the reputation of Apex.
>>> >>>>> >>> >
>>> >>>>> >>> > The initial 3 packages that we would like to target are
>>> >>>>> >>> > *lib/algo*, *lib/math*, and *lib/streamquery*.
>>> >>>>> >>> >
>>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you
>>> >>>>> >>> > are using any operator in these 3 packages. We would like to hear
>>> >>>>> >>> > from you.
>>> >>>>> >>> >
>>> >>>>> >>> > These are the options I can think of for retiring those operators:
>>> >>>>> >>> >
>>> >>>>> >>> > 1) Completely remove them from the Malhar repository.
>>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>> >>>>> >>> >    malhar-misc.
>>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> >>>>> >>> >    longer supported.
>>> >>>>> >>> >
>>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>>>> >>> >
>>> >>>>> >>> > David
>>> >>>>> >>> >
>>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> >>>>> >>> > <pr...@datatorrent.com>
>>> >>>>> >>> > wrote:
>>> >>>>> >>> >
>>> >>>>> >>> >> I wanted to close the loop on this discussion. In general, everyone
>>> >>>>> >>> >> seemed to be favorable to this idea, with no serious objections.
>>> >>>>> >>> >> Folks had good suggestions, such as documenting the capabilities of
>>> >>>>> >>> >> operators, coming up with well-defined criteria for graduation of
>>> >>>>> >>> >> operators (and what those criteria may be), and deciding what to do
>>> >>>>> >>> >> with existing operators that may not yet be mature or may be unused.
>>> >>>>> >>> >>
>>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>>> >>>>> >>> >> discussion and would like to proceed with them.
>>> >>>>> >>> >>
>>> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities
>>> >>>>> >>> >>    that make an operator useful across different applications, such
>>> >>>>> >>> >>    as reusability, static or dynamic partitioning, idempotency, and
>>> >>>>> >>> >>    exactly-once, will still be accepted as long as they are
>>> >>>>> >>> >>    functionally correct and have unit tests; they will go into a
>>> >>>>> >>> >>    separate module.
>>> >>>>> >>> >>    - The contrib module was suggested as the place for new
>>> >>>>> >>> >>    contributions that don't yet have all the platform capabilities
>>> >>>>> >>> >>    and are not yet mature. If there are no other suggestions, we
>>> >>>>> >>> >>    will go with this one.
>>> >>>>> >>> >>    - It was suggested that each operator's documentation list the
>>> >>>>> >>> >>    platform capabilities it currently provides from the list above.
>>> >>>>> >>> >>    I will document a structure for this in the contribution
>>> >>>>> >>> >>    guidelines.
>>> >>>>> >>> >>    - Folks wanted to know what the criteria would be to graduate an
>>> >>>>> >>> >>    operator to the big leagues :). I will kick off a separate thread
>>> >>>>> >>> >>    for it, as I think it requires its own discussion, and hopefully
>>> >>>>> >>> >>    we can come up with a set of guidelines for it.
>>> >>>>> >>> >>    - David brought up the state of some of the existing operators
>>> >>>>> >>> >>    and their retirement, and the layout of operators in Malhar in
>>> >>>>> >>> >>    general and how it causes problems with development. I will ask
>>> >>>>> >>> >>    him to lead the discussion on that.
>>> >>>>> >>> >>
>>> >>>>> >>> >> Thanks
>>> >>>>> >>> >>
>>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com>
>>> >>>>> >>> >> wrote:
>>> >>>>> >>> >>
>>> >>>>> >>> >> > The two ideas are not conflicting, but rather complementary.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On the contrary, putting in place a new process for people trying
>>> >>>>> >>> >> > to contribute while NOT addressing the old, unused, subpar
>>> >>>>> >>> >> > operators in the repository is what is conflicting.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>> >>>>> >>> >> > at the existing operators already in the repository as examples
>>> >>>>> >>> >> > and likely as a model for their new operators.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > David
>>> >>>>> >>> >> >
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
>>> >>>>> >>> >> > wrote:
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > > Yes, there are two conflicting threads now. The original thread
>>> >>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>> >>>>> >>> >> > > (contrib?) as long as the license part is taken care of.
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>> >>>>> >>> >> > > what is being used?
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > Thks,
>>> >>>>> >>> >> > > Amol
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>> >>>>> >>> >> > > wrote:
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>> >>>>> >>> >> > > > want to understand the platform, yet want to contribute? How
>>> >>>>> >>> >> > > > big is that set?
>>> >>>>> >>> >> > > > If we tell the app user, "here is code which has not passed
>>> >>>>> >>> >> > > > the whole checklist", will they be ready to use it in
>>> >>>>> >>> >> > > > production?
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > This thread has 2 conflicting forces: reduce the operators, and
>>> >>>>> >>> >> > > > make it easy to add more operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>> >>>>> >>> >> > > > wrote:
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>> >>>>> >>> >> > > > > wrote:
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Pramod,
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>> >>>>> >>> >> > > > > > operators into one folder, non-partitionable operators in
>>> >>>>> >>> >> > > > > > another, and so on...
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>> >>>>> >>> >> > > > > members to contribute and managing those contributions to
>>> >>>>> >>> >> > > > > maturity. It is not a functional-level separation.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > When I look at Hadoop code I see these annotations being
>>> >>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > I had a typo in my email; I meant to say "think of this like
>>> >>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Thanks
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Thanks
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>> >>>>> >>> >> > > > > > wrote:
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>> >>>>> >>> >> > > > > > > wrote:
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>> >>>>> >>> >> > > > > > > > annotation?
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one
>>> >>>>> >>> >> > > > > > > place so that it becomes clearer to users about the maturity
>>> >>>>> >>> >> > > > > > > level of these, easier for developers to track them towards
>>> >>>>> >>> >> > > > > > > the path to maturity and also provides a clearer directive
>>> >>>>> >>> >> > > > > > > for committers and contributors on acceptance of new
>>> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone makes them
>>> >>>>> >>> >> > > > > > > spread all over the place and adds an additional layer of
>>> >>>>> >>> >> > > > > > > difficulty in identification not just for users but also for
>>> >>>>> >>> >> > > > > > > developers who want to find such operators and improve them.
>>> >>>>> >>> >> > > > > > > This of this like a folder level annotation where everything
>>> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > Thanks
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>>> >>>>> >>> >> > > > > > > > wrote:
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators that
>>> >>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should make
>>> >>>>> >>> >> > > > > > > > > > > > it obvious to users that which operators are up to par, and
>>> >>>>> >>> >> > > > > > > > > > > > which operators are not, and maybe even remove those that are
>>> >>>>> >>> >> > > > > > > > > > > > likely not ever used in a real use case.
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
>>> >>>>> >>> >> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original
>>> >>>>> >>> >> > > > > > > > > > > intent is for contributions going forward.
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators outside
>>> >>>>> >>> >> > > > > > > > > > the play area should be of well known quality.
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension if
>>> >>>>> >>> >> > > > > > > > > we establish clear criteria.
>>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and put
>>> >>>>> >>> >> > > > > > > > > up the bars for new operators.
>>> >>>>> >>> >> > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > David

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Thomas, thanks for the suggestions and the comments in the document. I will
take another look at the operators that I had shortlisted in the document to
keep. Within that subset, would it be OK to leave the ones that don't have
a large state problem, for the time being, until we have replacement
operators implemented with the new windowing and state management? After
the cleanup, I can also help with the development of those replacement
operators.

Thanks

On Tue, Aug 9, 2016 at 11:21 AM, Thomas Weise <th...@gmail.com>
wrote:

> There are a bunch of operators that don't have proper state management and
> also don't support generic windowing (event time etc.). I would suggest
> moving those out or deprecating them.
>
> The new windowing and state management support along with the appropriate
> aggregators is going to make them obsolete.
>
> Thomas
>
>
> On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <lakshmi@datatorrent.com
> > wrote:
>
>> Hi,
>>
>> Friendly Reminder :
>>
>> I created a shared google sheet and tracked the various details of
>> operators. The sheet contains information about operators under lib/algo,
>> lib/math & lib/streamquery. Link is
>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> I also added a recommendation for each operator. Please take a look and
>> provide comments, if any.
>>
>> Thanks
>> Lakshmi Prasanna
>>
>> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
>> wrote:
>>
>>> Added comments, also recommend having the misc folder for the remaining
>>> operators in contrib according to proposed guidelines
>>>
>>> https://github.com/apache/apex-site/pull/44
>>>
>>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com>
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > I also added recommendation for lib/math operators to the same
>>> document as
>>> > a separate sheet. Please have a look.
>>> >
>>> > Thanks
>>> > Lakshmi Prasanna
>>> >
>>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>>> lakshmi@datatorrent.com
>>> > > wrote:
>>> >
>>> >> Hi,
>>> >>
>>> >> I also added recommendation for each operator . Please take a look.
>>> >>
>>> >> thanks
>>> >>
>>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>>> >> lakshmi@datatorrent.com> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> I created a shared google sheet and tracked the various details of
>>> >>> operators. Currently, the sheet contains information about operators under
>>> >>> lib/algo only. Link is
>>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>>> >>> Will update the sheet soon with lib/math too.
>>> >>>
>>> >>> Thanks
>>> >>> Lakshmi Prasanna
>>> >>>
>>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> >>> wrote:
>>> >>>
>>> >>>> Hi Lakshmi,
>>> >>>>
>>> >>>> Thanks for volunteering.
>>> >>>>
>>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
>>> and
>>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>> >>>> individual operators are both good, with the exception that
>>> lib/streamquery
>>> >>>> is one unit and we probably do not need to look at individual
>>> operators
>>> >>>> under it.
>>> >>>>
>>> >>>> If we don't have any objection in the community, let's start the
>>> >>>> process.
>>> >>>>
>>> >>>> David
>>> >>>>
>>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>> >>>> lakshmi@datatorrent.com> wrote:
>>> >>>>
>>> >>>>> I am interested to work on this.
>>> >>>>>
>>> >>>>> Regards,
>>> >>>>> Lakshmi prasanna
>>> >>>>>
>>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <
>>> hsy541@gmail.com>
>>> >>>>> wrote:
>>> >>>>>
>>> >>>>> > Why not have a shared google sheet with a list of operators and
>>> >>>>> options
>>> >>>>> > that we want to do with it.
>>> >>>>> > I think it's case by case.
>>> >>>>> > But retire unused or obsolete operators is important and we
>>> should
>>> >>>>> do it
>>> >>>>> > sooner rather than later.
>>> >>>>> >
>>> >>>>> > Regards,
>>> >>>>> > Siyuan
>>> >>>>> >
>>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <
>>> amol@datatorrent.com>
>>> >>>>> wrote:
>>> >>>>> >
>>> >>>>> >>
>>> >>>>> >> My vote is to do 2&3
>>> >>>>> >>
>>> >>>>> >> Thks
>>> >>>>> >> Amol
>>> >>>>> >>
>>> >>>>> >>
>>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>> >>>>> >> VKottapalli@directv.com> wrote:
>>> >>>>> >>
>>> >>>>> >>> +1 for deprecating the packages listed below.
>>> >>>>> >>>
>>> >>>>> >>> -----Original Message-----
>>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>> >>>>> >>>
>>> >>>>> >>> +1
>>> >>>>> >>>
>>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>>> david@datatorrent.com
>>> >>>>> >
>>> >>>>> >>> wrote:
>>> >>>>> >>>
>>> >>>>> >>> > Hi all,
>>> >>>>> >>> >
>>> >>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>> >>>>> >>> >
>>> >>>>> >>> > As stated before, the reason why we would like to retire operators in
>>> >>>>> >>> > Malhar is because some of them were written a long time ago before
>>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>> >>>>> >>> > up to par in code quality, have no potential for improvement, and
>>> >>>>> >>> > probably completely unused by anybody.
>>> >>>>> >>> >
>>> >>>>> >>> > We do not want contributors to use them as a model of their
>>> >>>>> >>> > contribution, or users to use them thinking they are of quality, and
>>> >>>>> >>> > then hit a wall.
>>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>> >>>>> >>> >
>>> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>>> >>>>> >>> >
>>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>> >>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>> >>>>> >>> >
>>> >>>>> >>> > These are the options I can think of for retiring those operators:
>>> >>>>> >>> >
>>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>> >>>>> >>> > malhar-misc
>>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>> >>>>> >>> > longer supported
>>> >>>>> >>> >
>>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>> >>>>> >>> >
>>> >>>>> >>> > David
>>> >>>>> >>> >
>>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>> >>>>> >>> > <pr...@datatorrent.com>
>>> >>>>> >>> > wrote:
>>> >>>>> >>> >
>>> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>> >>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>>> >>>>> >>> >> had good suggestions like documenting capabilities of operators, come
>>> >>>>> >>> >> up well defined criteria for graduation of operators and what those
>>> >>>>> >>> >> criteria may be and what to do with existing operators that may not
>>> >>>>> >>> >> yet be mature or unused.
>>> >>>>> >>> >>
>>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>>> >>>>> >>> >> discussion and would like to proceed with them.
>>> >>>>> >>> >>
>>> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities to
>>> >>>>> >>> >>    make an operator useful across different applications such as
>>> >>>>> >>> >>    reusability, partitioning static or dynamic, idempotency, exactly once
>>> >>>>> >>> >>    will still be accepted as long as they are functionally correct, have
>>> >>>>> >>> >>    unit tests and will go into a separate module.
>>> >>>>> >>> >>    - Contrib module was suggested as a place where new contributions go in
>>> >>>>> >>> >>    that don't yet have all the platform capabilities and are not yet
>>> >>>>> >>> >>    mature. If there are no other suggestions we will go with this one.
>>> >>>>> >>> >>    - It was suggested the operators documentation list those platform
>>> >>>>> >>> >>    capabilities it currently provides from the list above. I will
>>> >>>>> >>> >>    document a structure for this in the contribution guidelines.
>>> >>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
>>> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate thread
>>> >>>>> >>> >>    for it as I think it requires its own discussion and hopefully we can
>>> >>>>> >>> >>    come up with a set of guidelines for it.
>>> >>>>> >>> >>    - David brought up state of some of the existing operators and their
>>> >>>>> >>> >>    retirement and the layout of operators in Malhar in general and how it
>>> >>>>> >>> >>    causes problems with development. I will ask him to lead the
>>> >>>>> >>> >>    discussion on that.
>>> >>>>> >>> >>
>>> >>>>> >>> >> Thanks
>>> >>>>> >>> >>
>>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
>>> >>>>> david@datatorrent.com>
>>> >>>>> >>> wrote:
>>> >>>>> >>> >>
>>> >>>>> >>> >> > The two ideas are not conflicting, but rather
>>> complementing.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On the contrary, putting a new process for people trying
>>> to
>>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
>>> >>>>> operators in
>>> >>>>> >>> >> > the repository
>>> >>>>> >>> >> is
>>> >>>>> >>> >> > what is conflicting.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > Keep in mind that when people try to contribute, they
>>> always
>>> >>>>> look
>>> >>>>> >>> >> > at the existing operators already in the repository as
>>> >>>>> examples and
>>> >>>>> >>> >> > likely a
>>> >>>>> >>> >> model
>>> >>>>> >>> >> > for their new operators.
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > David
>>> >>>>> >>> >> >
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>>> >>>>> amol@datatorrent.com>
>>> >>>>> >>> >> wrote:
>>> >>>>> >>> >> >
>>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original
>>> >>>>> thread
>>> >>>>> >>> >> > > was to
>>> >>>>> >>> >> > open
>>> >>>>> >>> >> > > up a way for contributors to submit code in a dir
>>> >>>>> (contrib?) as
>>> >>>>> >>> >> > > long
>>> >>>>> >>> >> as
>>> >>>>> >>> >> > > license part of taken care of.
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do
>>> we
>>> >>>>> know
>>> >>>>> >>> >> > > what is being used?
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > Thks,
>>> >>>>> >>> >> > > Amol
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>>> >>>>> >>> >> sandesh@datatorrent.com>
>>> >>>>> >>> >> > > wrote:
>>> >>>>> >>> >> > >
>>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > So we are creating a process for operator writers who
>>> >>>>> don't
>>> >>>>> >>> >> > > > want to understand the platform, yet wants to
>>> contribute?
>>> >>>>> How
>>> >>>>> >>> >> > > > big is that
>>> >>>>> >>> >> set?
>>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has
>>> not
>>> >>>>> passed
>>> >>>>> >>> >> > > > all
>>> >>>>> >>> >> the
>>> >>>>> >>> >> > > > checklist, will they be ready to use that in
>>> production?
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
>>> >>>>> operators and
>>> >>>>> >>> >> > > > make
>>> >>>>> >>> >> it
>>> >>>>> >>> >> > > easy
>>> >>>>> >>> >> > > > to add more operators.
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>>> >>>>> >>> >> > pramod@datatorrent.com>
>>> >>>>> >>> >> > > > wrote:
>>> >>>>> >>> >> > > >
>>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>>> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
>>> >>>>> >>> >> > > > > wrote:
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Pramod,
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > By that logic I would say let's put all
>>> partitionable
>>> >>>>> >>> >> > > > > > operators
>>> >>>>> >>> >> > into
>>> >>>>> >>> >> > > > one
>>> >>>>> >>> >> > > > > > folder, non-partitionable operators in another
>>> and so
>>> >>>>> on...
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Remember the original goal of making it easier for
>>> new
>>> >>>>> >>> >> > > > > members to contribute and managing those
>>> contributions
>>> >>>>> to
>>> >>>>> >>> >> > > > > maturity. It is
>>> >>>>> >>> >> not a
>>> >>>>> >>> >> > > > > functional level separation.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
>>> >>>>> being
>>> >>>>> >>> >> > > > > > used at
>>> >>>>> >>> >> > > class
>>> >>>>> >>> >> > > > > > level and not at package/folder level.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
>>> this
>>> >>>>> like
>>> >>>>> >>> >> > > > > a
>>> >>>>> >>> >> > > folder..."
>>> >>>>> >>> >> > > > > as an analogy and not literally.
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > Thanks
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > >
>>> >>>>> >>> >> > > > > > Thanks
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>>> >>>>> >>> >> > > > pramod@datatorrent.com
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > wrote:
>>> >>>>> >>> >> > > > > >
>>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>>> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
>>> >>>>> >>> >> > > > > > > wrote:
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>> >>>>> >>> >> > > > > > > > annotation?
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so
>>> >>>>> >>> >> > > > > > > that it becomes clearer to users about the maturity level of these,
>>> >>>>> >>> >> > > > > > > easier for developers to track them towards the path to maturity and
>>> >>>>> >>> >> > > > > > > also provides a clearer directive for committers and contributors on
>>> >>>>> >>> >> > > > > > > acceptance of new submissions. Relying on the annotations alone makes
>>> >>>>> >>> >> > > > > > > them spread all over the place and adds an additional layer of
>>> >>>>> >>> >> > > > > > > difficulty in identification not just for users but also for
>>> >>>>> >>> >> > > > > > > developers who want to find such operators and improve them. This of
>>> >>>>> >>> >> > > > > > > this like a folder level annotation where everything under this
>>> >>>>> >>> >> > > > > > > folder is unstable or evolving.
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > Thanks
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > >
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>>> >>>>> >>> >> > > > > > > > wrote:
>>> >>>>> >>> >> > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators that
>>> >>>>> >>> >> > > > > > > > > > > > fall in the "non-production quality" category. We should make
>>> >>>>> >>> >> > > > > > > > > > > > it obvious to users that which operators are up to par, and
>>> >>>>> >>> >> > > > > > > > > > > > which operators are not, and maybe even remove those that are
>>> >>>>> >>> >> > > > > > > > > > > > likely not ever used in a real use case.
>>> >>>>> >>> >> > > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
>>> >>>>> >>> >> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original
>>> >>>>> >>> >> > > > > > > > > > > intent is for contributions going forward.
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators outside
>>> >>>>> >>> >> > > > > > > > > > the play area should be of well known quality.
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension if
>>> >>>>> >>> >> > > > > > > > > we establish clear criteria.
>>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and put
>>> >>>>> >>> >> > > > > > > > > up the bars for new operators.
>>> >>>>> >>> >> > > > > > > > >
>>> >>>>> >>> >> > > > > > > > > David

Re: A proposal for Malhar

Posted by Thomas Weise <th...@gmail.com>.
There are a bunch of operators that don't have proper state management and
also don't support generic windowing (event time etc.). I would suggest
moving those out or deprecating them.

The new windowing and state management support along with the appropriate
aggregators is going to make them obsolete.

Thomas


On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> Friendly Reminder :
>
> I created a shared google sheet and tracked the various details of
> operators. The sheet contains information about operators under lib/algo,
> lib/math & lib/streamquery. Link is
> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
> I also added a recommendation for each operator. Please take a look and
> provide comments, if any.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
>> Added comments, also recommend having the misc folder for the remaining
>> operators in contrib according to proposed guidelines
>>
>> https://github.com/apache/apex-site/pull/44
>>
>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>> lakshmi@datatorrent.com>
>> wrote:
>>
>> > Hi
>> >
>> > I also added recommendation for lib/math operators to the same document
>> as
>> > a separate sheet. Please have a look.
>> >
>> > Thanks
>> > Lakshmi Prasanna
>> >
>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>> lakshmi@datatorrent.com
>> > > wrote:
>> >
>> >> Hi,
>> >>
>> >> I also added recommendation for each operator . Please take a look.
>> >>
>> >> thanks
>> >>
>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> >> lakshmi@datatorrent.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I created a shared google sheet and tracked the various details of
>> >>> operators. Currently, the sheet contains information about operators under
>> >>> lib/algo only. Link is
>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> >>> Will update the sheet soon with lib/math too.
>> >>>
>> >>> Thanks
>> >>> Lakshmi Prasanna
>> >>>
>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>>> Hi Lakshmi,
>> >>>>
>> >>>> Thanks for volunteering.
>> >>>>
>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
>> and
>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> >>>> individual operators are both good, with the exception that
>> lib/streamquery
>> >>>> is one unit and we probably do not need to look at individual
>> operators
>> >>>> under it.
>> >>>>
>> >>>> If we don't have any objection in the community, let's start the
>> >>>> process.
>> >>>>
>> >>>> David
>> >>>>
>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>> >>>> lakshmi@datatorrent.com> wrote:
>> >>>>
>> >>>>> I am interested to work on this.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Lakshmi prasanna
>> >>>>>
>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hsy541@gmail.com
>> >
>> >>>>> wrote:
>> >>>>>
>> >>>>> > Why not have a shared google sheet with a list of operators and
>> >>>>> options
>> >>>>> > that we want to do with it.
>> >>>>> > I think it's case by case.
>> >>>>> > But retire unused or obsolete operators is important and we should
>> >>>>> do it
>> >>>>> > sooner rather than later.
>> >>>>> >
>> >>>>> > Regards,
>> >>>>> > Siyuan
>> >>>>> >
>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <amol@datatorrent.com
>> >
>> >>>>> wrote:
>> >>>>> >
>> >>>>> >>
>> >>>>> >> My vote is to do 2&3
>> >>>>> >>
>> >>>>> >> Thks
>> >>>>> >> Amol
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >>>>> >> VKottapalli@directv.com> wrote:
>> >>>>> >>
>> >>>>> >>> +1 for deprecating the packages listed below.
>> >>>>> >>>
>> >>>>> >>> -----Original Message-----
>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>>> >>>
>> >>>>> >>> +1
>> >>>>> >>>
>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>> david@datatorrent.com
>> >>>>> >
>> >>>>> >>> wrote:
>> >>>>> >>>
>> >>>>> >>> > Hi all,
>> >>>>> >>> >
>> >>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>> >>>>> >>> >
>> >>>>> >>> > As stated before, the reason why we would like to retire operators in
>> >>>>> >>> > Malhar is because some of them were written a long time ago before
>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>> >>>>> >>> > up to par in code quality, have no potential for improvement, and
>> >>>>> >>> > probably completely unused by anybody.
>> >>>>> >>> >
>> >>>>> >>> > We do not want contributors to use them as a model of their
>> >>>>> >>> > contribution, or users to use them thinking they are of quality, and
>> >>>>> >>> > then hit a wall.
>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>> >>>>> >>> >
>> >>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>>> >>> >
>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>> >>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>> >>>>> >>> >
>> >>>>> >>> > These are the options I can think of for retiring those operators:
>> >>>>> >>> >
>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>> >>>>> >>> > malhar-misc
>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>> >>>>> >>> > longer supported
>> >>>>> >>> >
>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>>>> >>> >
>> >>>>> >>> > David
>> >>>>> >>> >
>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> >>>>> >>> > <pr...@datatorrent.com>
>> >>>>> >>> > wrote:
>> >>>>> >>> >
>> >>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>> >>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>> >>>>> >>> >> had good suggestions like documenting capabilities of operators, come
>> >>>>> >>> >> up well defined criteria for graduation of operators and what those
>> >>>>> >>> >> criteria may be and what to do with existing operators that may not
>> >>>>> >>> >> yet be mature or unused.
>> >>>>> >>> >>
>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>> >>>>> >>> >> discussion and would like to proceed with them.
>> >>>>> >>> >>
>> >>>>> >>> >>    - Operators that do not yet provide the key platform capabilities to
>> >>>>> >>> >>    make an operator useful across different applications such as
>> >>>>> >>> >>    reusability, partitioning static or dynamic, idempotency, exactly once
>> >>>>> >>> >>    will still be accepted as long as they are functionally correct, have
>> >>>>> >>> >>    unit tests and will go into a separate module.
>> >>>>> >>> >>    - Contrib module was suggested as a place where new contributions go in
>> >>>>> >>> >>    that don't yet have all the platform capabilities and are not yet
>> >>>>> >>> >>    mature. If there are no other suggestions we will go with this one.
>> >>>>> >>> >>    - It was suggested the operators documentation list those platform
>> >>>>> >>> >>    capabilities it currently provides from the list above. I will
>> >>>>> >>> >>    document a structure for this in the contribution guidelines.
>> >>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
>> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate thread
>> >>>>> >>> >>    for it as I think it requires its own discussion and hopefully we can
>> >>>>> >>> >>    come up with a set of guidelines for it.
>> >>>>> >>> >>    - David brought up state of some of the existing operators and their
>> >>>>> >>> >>    retirement and the layout of operators in Malhar in general and how it
>> >>>>> >>> >>    causes problems with development. I will ask him to lead the
>> >>>>> >>> >>    discussion on that.
>> >>>>> >>> >>
>> >>>>> >>> >> Thanks
>> >>>>> >>> >>
>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
>> >>>>> david@datatorrent.com>
>> >>>>> >>> wrote:
>> >>>>> >>> >>
>> >>>>> >>> >> > The two ideas are not conflicting, but rather
>> complementing.
>> >>>>> >>> >> >
>> >>>>> >>> >> > On the contrary, putting a new process for people trying to
>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
>> >>>>> operators in
>> >>>>> >>> >> > the repository
>> >>>>> >>> >> is
>> >>>>> >>> >> > what is conflicting.
>> >>>>> >>> >> >
>> >>>>> >>> >> > Keep in mind that when people try to contribute, they
>> always
>> >>>>> look
>> >>>>> >>> >> > at the existing operators already in the repository as
>> >>>>> examples and
>> >>>>> >>> >> > likely a
>> >>>>> >>> >> model
>> >>>>> >>> >> > for their new operators.
>> >>>>> >>> >> >
>> >>>>> >>> >> > David
>> >>>>> >>> >> >
>> >>>>> >>> >> >
>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>> >>>>> amol@datatorrent.com>
>> >>>>> >>> >> wrote:
>> >>>>> >>> >> >
>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original
>> >>>>> thread
>> >>>>> >>> >> > > was to
>> >>>>> >>> >> > open
>> >>>>> >>> >> > > up a way for contributors to submit code in a dir
>> >>>>> (contrib?) as
>> >>>>> >>> >> > > long
>> >>>>> >>> >> as
>> >>>>> >>> >> > > license part of taken care of.
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we
>> >>>>> know
>> >>>>> >>> >> > > what is being used?
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > Thks,
>> >>>>> >>> >> > > Amol
>> >>>>> >>> >> > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> >>>>> >>> >> sandesh@datatorrent.com>
>> >>>>> >>> >> > > wrote:
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > So we are creating a process for operator writers who
>> >>>>> don't
>> >>>>> >>> >> > > > want to understand the platform, yet wants to
>> contribute?
>> >>>>> How
>> >>>>> >>> >> > > > big is that
>> >>>>> >>> >> set?
>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not
>> >>>>> passed
>> >>>>> >>> >> > > > all
>> >>>>> >>> >> the
>> >>>>> >>> >> > > > checklist, will they be ready to use that in
>> production?
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
>> >>>>> operators and
>> >>>>> >>> >> > > > make
>> >>>>> >>> >> it
>> >>>>> >>> >> > > easy
>> >>>>> >>> >> > > > to add more operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> >>>>> >>> >> > pramod@datatorrent.com>
>> >>>>> >>> >> > > > wrote:
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > wrote:
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Pramod,
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > By that logic I would say let's put all
>> partitionable
>> >>>>> >>> >> > > > > > operators
>> >>>>> >>> >> > into
>> >>>>> >>> >> > > > one
>> >>>>> >>> >> > > > > > folder, non-partitionable operators in another and
>> so
>> >>>>> on...
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Remember the original goal of making it easier for
>> new
>> >>>>> >>> >> > > > > members to contribute and managing those
>> contributions
>> >>>>> to
>> >>>>> >>> >> > > > > maturity. It is
>> >>>>> >>> >> not a
>> >>>>> >>> >> > > > > functional level separation.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
>> >>>>> being
>> >>>>> >>> >> > > > > > used at
>> >>>>> >>> >> > > class
>> >>>>> >>> >> > > > > > level and not at package/folder level.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
>> this
>> >>>>> like
>> >>>>> >>> >> > > > > a
>> >>>>> >>> >> > > folder..."
>> >>>>> >>> >> > > > > as an analogy and not literally.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Thanks
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Thanks
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> >>>>> >>> >> > > > pramod@datatorrent.com
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > wrote:
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > > > wrote:
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>> >>>>> >>> >> > > > > > > > annotation?
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > I think it is important to localize the additions
>> >>>>> in one
>> >>>>> >>> >> place so
>> >>>>> >>> >> > > > that
>> >>>>> >>> >> > > > > it
>> >>>>> >>> >> > > > > > > becomes clearer to users about the maturity
>> level of
>> >>>>> >>> >> > > > > > > these,
>> >>>>> >>> >> > easier
>> >>>>> >>> >> > > > for
>> >>>>> >>> >> > > > > > > developers to track them towards the path to
>> >>>>> maturity and
>> >>>>> >>> >> > > > > > > also
>> >>>>> >>> >> > > > > provides a
>> >>>>> >>> >> > > > > > > clearer directive for committers and
>> contributors on
>> >>>>> >>> >> acceptance
>> >>>>> >>> >> > of
>> >>>>> >>> >> > > > new
>> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone
>> makes
>> >>>>> them
>> >>>>> >>> >> spread
>> >>>>> >>> >> > all
>> >>>>> >>> >> > > > > over
>> >>>>> >>> >> > > > > > > the place and adds an additional layer of
>> >>>>> difficulty in
>> >>>>> >>> >> > > > identification
>> >>>>> >>> >> > > > > > not
>> >>>>> >>> >> > > > > > > just for users but also for developers who want
>> to
>> >>>>> find
>> >>>>> >>> >> > > > > > > such
>> >>>>> >>> >> > > > operators
>> >>>>> >>> >> > > > > > and
>> >>>>> >>> >> > > > > > > improve them. This of this like a folder level
>> >>>>> annotation
>> >>>>> >>> >> where
>> >>>>> >>> >> > > > > > everything
>> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > Thanks
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> >>>>> >>> >> > > david@datatorrent.com
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > > wrote:
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way
>> too
>> >>>>> many
>> >>>>> >>> >> operators
>> >>>>> >>> >> > > > that
>> >>>>> >>> >> > > > > > fall
>> >>>>> >>> >> > > > > > > > in
>> >>>>> >>> >> > > > > > > > > > the
>> >>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
>> >>>>> should
>> >>>>> >>> >> > > > > > > > > > > > make it
>> >>>>> >>> >> > > > obvious
>> >>>>> >>> >> > > > > to
>> >>>>> >>> >> > > > > > > > users
>> >>>>> >>> >> > > > > > > > > > > that
>> >>>>> >>> >> > > > > > > > > > > > which operators are up to par, and
>> which
>> >>>>> >>> >> > > > > > > > > > > > operators
>> >>>>> >>> >> are
>> >>>>> >>> >> > > not,
>> >>>>> >>> >> > > > > and
>> >>>>> >>> >> > > > > > > > maybe
>> >>>>> >>> >> > > > > > > > > > > even
>> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
>> >>>>> used in a
>> >>>>> >>> >> > > > > > > > > > > > real
>> >>>>> >>> >> > use
>> >>>>> >>> >> > > > > case.
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
>> >>>>> operators
>> >>>>> >>> >> > > > > > > > > > > and
>> >>>>> >>> >> > doing
>> >>>>> >>> >> > > > this
>> >>>>> >>> >> > > > > > > > > exercise
>> >>>>> >>> >> > > > > > > > > > as
>> >>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
>> >>>>> original
>> >>>>> >>> >> intent
>> >>>>> >>> >> > is
>> >>>>> >>> >> > > > for
>> >>>>> >>> >> > > > > > > > > > > contributions going forward.
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as
>> well.
>> >>>>> >>> >> > > > > > > > > > Operators
>> >>>>> >>> >> > > outside
>> >>>>> >>> >> > > > > the
>> >>>>> >>> >> > > > > > > play
>> >>>>> >>> >> > > > > > > > > > area should be of well known quality.
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't
>> >>>>> anticipate
>> >>>>> >>> >> > > > > > > > > much
>> >>>>> >>> >> > tension
>> >>>>> >>> >> > > if
>> >>>>> >>> >> > > > > we
>> >>>>> >>> >> > > > > > > > > establish clear criteria.
>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
>> >>>>> operators
>> >>>>> >>> >> > > > > > > > > stay
>> >>>>> >>> >> and
>> >>>>> >>> >> > > put
>> >>>>> >>> >> > > > up
>> >>>>> >>> >> > > > > > the
>> >>>>> >>> >> > > > > > > > > bars for new operators.
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > > > David
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> >
>> >>>>> >>> >>
>> >>>>> >>> >
>> >>>>> >>> >
>> >>>>> >>>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>
>

Re: A proposal for Malhar

Posted by Thomas Weise <th...@gmail.com>.
There are a bunch of operators that don't have proper state management and
also don't support generic windowing (event time etc.). I would suggest
moving those out or deprecating them.

The new windowing and state management support along with the appropriate
aggregators is going to make them obsolete.

Thomas


On Tue, Aug 9, 2016 at 10:47 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi,
>
> Friendly Reminder :
>
> I created a shared google sheet and tracked the various details of
> operators. The sheet contains information about operators under lib/algo,
> lib/math & lib/streamquery. Link is
> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
> I also added a recommendation for each operator. Please take a look and
> provide comments, if any.
>
> Thanks
> Lakshmi Prasanna
>
> On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
> wrote:
>
>> Added comments, also recommend having the misc folder for the remaining
>> operators in contrib according to proposed guidelines
>>
>> https://github.com/apache/apex-site/pull/44
>>
>> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <
>> lakshmi@datatorrent.com>
>> wrote:
>>
>> > Hi
>> >
>> > I also added recommendation for lib/math operators to the same document
>> as
>> > a separate sheet. Please have a look.
>> >
>> > Thanks
>> > Lakshmi Prasanna
>> >
>> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
>> lakshmi@datatorrent.com
>> > > wrote:
>> >
>> >> Hi,
>> >>
>> >> I also added recommendation for each operator . Please take a look.
>> >>
>> >> thanks
>> >>
>> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> >> lakshmi@datatorrent.com> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I created a shared google sheet and tracked the various details of
>> >>> operators. Currently, the sheet contains information about operators under
>> >>> lib/algo only. Link is
>> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>> >>> Will update the sheet soon with lib/math too.
>> >>>
>> >>> Thanks
>> >>> Lakshmi Prasanna
>> >>>
>> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>> >>> wrote:
>> >>>
>> >>>> Hi Lakshmi,
>> >>>>
>> >>>> Thanks for volunteering.
>> >>>>
>> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
>> and
>> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>> >>>> individual operators are both good, with the exception that
>> lib/streamquery
>> >>>> is one unit and we probably do not need to look at individual
>> operators
>> >>>> under it.
>> >>>>
>> >>>> If we don't have any objection in the community, let's start the
>> >>>> process.
>> >>>>
>> >>>> David
>> >>>>
>> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>> >>>> lakshmi@datatorrent.com> wrote:
>> >>>>
>> >>>>> I am interested to work on this.
>> >>>>>
>> >>>>> Regards,
>> >>>>> Lakshmi prasanna
>> >>>>>
>> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hsy541@gmail.com
>> >
>> >>>>> wrote:
>> >>>>>
>> >>>>> > Why not have a shared google sheet with a list of operators and
>> >>>>> options
>> >>>>> > that we want to do with it.
>> >>>>> > I think it's case by case.
>> >>>>> > But retire unused or obsolete operators is important and we should
>> >>>>> do it
>> >>>>> > sooner rather than later.
>> >>>>> >
>> >>>>> > Regards,
>> >>>>> > Siyuan
>> >>>>> >
>> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <amol@datatorrent.com
>> >
>> >>>>> wrote:
>> >>>>> >
>> >>>>> >>
>> >>>>> >> My vote is to do 2&3
>> >>>>> >>
>> >>>>> >> Thks
>> >>>>> >> Amol
>> >>>>> >>
>> >>>>> >>
>> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>> >>>>> >> VKottapalli@directv.com> wrote:
>> >>>>> >>
>> >>>>> >>> +1 for deprecating the packages listed below.
>> >>>>> >>>
>> >>>>> >>> -----Original Message-----
>> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>> >>>>> >>>
>> >>>>> >>> +1
>> >>>>> >>>
>> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
>> david@datatorrent.com
>> >>>>> >
>> >>>>> >>> wrote:
>> >>>>> >>>
>> >>>>> >>> > Hi all,
>> >>>>> >>> >
>> >>>>> >>> > I would like to renew the discussion of retiring operators in
>> >>>>> Malhar.
>> >>>>> >>> >
>> >>>>> >>> > As stated before, the reason why we would like to retire
>> >>>>> operators in
>> >>>>> >>> > Malhar is because some of them were written a long time ago
>> >>>>> before
>> >>>>> >>> > Apache incubation, and they do not pertain to real use cases,
>> >>>>> are not
>> >>>>> >>> > up to par in code quality, have no potential for improvement,
>> and
>> >>>>> >>> > probably completely unused by anybody.
>> >>>>> >>> >
>> >>>>> >>> > We do not want contributors to use them as a model of their
>> >>>>> >>> > contribution, or users to use them thinking they are of
>> quality,
>> >>>>> and
>> >>>>> >>> then hit a wall.
>> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>> >>>>> >>> >
>> >>>>> >>> > The initial 3 packages that we would like to target are
>> >>>>> *lib/algo*,
>> >>>>> >>> > *lib/math*, and *lib/streamquery*.
>> >>>>> >>>
>> >>>>> >>> >
>> >>>>> >>> > I'm adding this thread to the users list. Please speak up if
>> you
>> >>>>> are
>> >>>>> >>> > using any operator in these 3 packages. We would like to hear
>> >>>>> from you.
>> >>>>> >>> >
>> >>>>> >>> > These are the options I can think of for retiring those
>> >>>>> operators:
>> >>>>> >>> >
>> >>>>> >>> > 1) Completely remove them from the malhar repository.
>> >>>>> >>> > 2) Move them from malhar-library into a separate artifact
>> called
>> >>>>> >>> > malhar-misc
>> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they
>> are no
>> >>>>> >>> > longer supported
>> >>>>> >>> >
>> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>> >>>>> >>> >
>> >>>>> >>> > David
>> >>>>> >>> >
>> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>> >>>>> >>> > <pr...@datatorrent.com>
>> >>>>> >>> > wrote:
>> >>>>> >>> >
>> >>>>> >>> >> I wanted to close the loop on this discussion. In general
>> >>>>> everyone
>> >>>>> >>> >> seemed to be favorable to this idea with no serious
>> objections.
>> >>>>> Folks
>> >>>>> >>> >> had good suggestions like documenting capabilities of
>> >>>>> operators, come
>> >>>>> >>> >> up well defined criteria for graduation of operators and what
>> >>>>> those
>> >>>>> >>> >> criteria may be and what to do with existing operators that
>> may
>> >>>>> not
>> >>>>> >>> >> yet be mature or unused.
>> >>>>> >>> >>
>> >>>>> >>> >> I am going to summarize the key points that resulted from the
>> >>>>> >>> >> discussion and would like to proceed with them.
>> >>>>> >>> >>
>> >>>>> >>> >>    - Operators that do not yet provide the key platform
>> >>>>> capabilities
>> >>>>> >>> to
>> >>>>> >>> >>    make an operator useful across different applications
>> such as
>> >>>>> >>> >> reusability,
>> >>>>> >>> >>    partitioning static or dynamic, idempotency, exactly once
>> >>>>> will
>> >>>>> >>> still be
>> >>>>> >>> >>    accepted as long as they are functionally correct, have
>> unit
>> >>>>> tests
>> >>>>> >>> >> and will
>> >>>>> >>> >>    go into a separate module.
>> >>>>> >>> >>    - Contrib module was suggested as a place where new
>> >>>>> contributions
>> >>>>> >>> go in
>> >>>>> >>> >>    that don't yet have all the platform capabilities and are
>> >>>>> not yet
>> >>>>> >>> >> mature.
>> >>>>> >>> >>    If there are no other suggestions we will go with this
>> one.
>> >>>>> >>> >>    - It was suggested the operators documentation list those
>> >>>>> platform
>> >>>>> >>> >>    capabilities it currently provides from the list above. I
>> >>>>> will
>> >>>>> >>> >> document a
>> >>>>> >>> >>    structure for this in the contribution guidelines.
>> >>>>> >>> >>    - Folks wanted to know what would be the criteria to
>> >>>>> graduate an
>> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate
>> >>>>> thread
>> >>>>> >>> >> for it as
>> >>>>> >>> >>    I think it requires its own discussion and hopefully we
>> can
>> >>>>> come
>> >>>>> >>> >> up with a
>> >>>>> >>> >>    set of guidelines for it.
>> >>>>> >>> >>    - David brought up state of some of the existing operators
>> >>>>> and
>> >>>>> >>> their
>> >>>>> >>> >>    retirement and the layout of operators in Malhar in
>> general
>> >>>>> and
>> >>>>> >>> how it
>> >>>>> >>> >>    causes problems with development. I will ask him to lead
>> the
>> >>>>> >>> >> discussion on
>> >>>>> >>> >>    that.
>> >>>>> >>> >>
>> >>>>> >>> >> Thanks
>> >>>>> >>> >>
>> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
>> >>>>> david@datatorrent.com>
>> >>>>> >>> wrote:
>> >>>>> >>> >>
>> >>>>> >>> >> > The two ideas are not conflicting, but rather
>> complementing.
>> >>>>> >>> >> >
>> >>>>> >>> >> > On the contrary, putting a new process for people trying to
>> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
>> >>>>> operators in
>> >>>>> >>> >> > the repository
>> >>>>> >>> >> is
>> >>>>> >>> >> > what is conflicting.
>> >>>>> >>> >> >
>> >>>>> >>> >> > Keep in mind that when people try to contribute, they
>> always
>> >>>>> look
>> >>>>> >>> >> > at the existing operators already in the repository as
>> >>>>> examples and
>> >>>>> >>> >> > likely a
>> >>>>> >>> >> model
>> >>>>> >>> >> > for their new operators.
>> >>>>> >>> >> >
>> >>>>> >>> >> > David
>> >>>>> >>> >> >
>> >>>>> >>> >> >
>> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
>> >>>>> amol@datatorrent.com>
>> >>>>> >>> >> wrote:
>> >>>>> >>> >> >
>> >>>>> >>> >> > > Yes there are two conflicting threads now. The original
>> >>>>> thread
>> >>>>> >>> >> > > was to
>> >>>>> >>> >> > open
>> >>>>> >>> >> > > up a way for contributors to submit code in a dir
>> >>>>> (contrib?) as
>> >>>>> >>> >> > > long
>> >>>>> >>> >> as
>> >>>>> >>> >> > > license part of taken care of.
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we
>> >>>>> know
>> >>>>> >>> >> > > what is being used?
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > Thks,
>> >>>>> >>> >> > > Amol
>> >>>>> >>> >> > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
>> >>>>> >>> >> sandesh@datatorrent.com>
>> >>>>> >>> >> > > wrote:
>> >>>>> >>> >> > >
>> >>>>> >>> >> > > > +1 for removing the not-used operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > So we are creating a process for operator writers who
>> >>>>> don't
>> >>>>> >>> >> > > > want to understand the platform, yet wants to
>> contribute?
>> >>>>> How
>> >>>>> >>> >> > > > big is that
>> >>>>> >>> >> set?
>> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not
>> >>>>> passed
>> >>>>> >>> >> > > > all
>> >>>>> >>> >> the
>> >>>>> >>> >> > > > checklist, will they be ready to use that in
>> production?
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
>> >>>>> operators and
>> >>>>> >>> >> > > > make
>> >>>>> >>> >> it
>> >>>>> >>> >> > > easy
>> >>>>> >>> >> > > > to add more operators.
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
>> >>>>> >>> >> > pramod@datatorrent.com>
>> >>>>> >>> >> > > > wrote:
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
>> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > wrote:
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Pramod,
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > By that logic I would say let's put all
>> partitionable
>> >>>>> >>> >> > > > > > operators
>> >>>>> >>> >> > into
>> >>>>> >>> >> > > > one
>> >>>>> >>> >> > > > > > folder, non-partitionable operators in another and
>> so
>> >>>>> on...
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Remember the original goal of making it easier for
>> new
>> >>>>> >>> >> > > > > members to contribute and managing those
>> contributions
>> >>>>> to
>> >>>>> >>> >> > > > > maturity. It is
>> >>>>> >>> >> not a
>> >>>>> >>> >> > > > > functional level separation.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
>> >>>>> being
>> >>>>> >>> >> > > > > > used at
>> >>>>> >>> >> > > class
>> >>>>> >>> >> > > > > > level and not at package/folder level.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
>> this
>> >>>>> like
>> >>>>> >>> >> > > > > a
>> >>>>> >>> >> > > folder..."
>> >>>>> >>> >> > > > > as an analogy and not literally.
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > Thanks
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > Thanks
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
>> >>>>> >>> >> > > > pramod@datatorrent.com
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > wrote:
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
>> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
>> >>>>> >>> >> > > > > > > wrote:
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>> >>>>> >>> >> > > > > > > > annotation?
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > I think it is important to localize the additions
>> >>>>> in one
>> >>>>> >>> >> place so
>> >>>>> >>> >> > > > that
>> >>>>> >>> >> > > > > it
>> >>>>> >>> >> > > > > > > becomes clearer to users about the maturity
>> level of
>> >>>>> >>> >> > > > > > > these,
>> >>>>> >>> >> > easier
>> >>>>> >>> >> > > > for
>> >>>>> >>> >> > > > > > > developers to track them towards the path to
>> >>>>> maturity and
>> >>>>> >>> >> > > > > > > also
>> >>>>> >>> >> > > > > provides a
>> >>>>> >>> >> > > > > > > clearer directive for committers and
>> contributors on
>> >>>>> >>> >> acceptance
>> >>>>> >>> >> > of
>> >>>>> >>> >> > > > new
>> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone
>> makes
>> >>>>> them
>> >>>>> >>> >> spread
>> >>>>> >>> >> > all
>> >>>>> >>> >> > > > > over
>> >>>>> >>> >> > > > > > > the place and adds an additional layer of
>> >>>>> difficulty in
>> >>>>> >>> >> > > > identification
>> >>>>> >>> >> > > > > > not
>> >>>>> >>> >> > > > > > > just for users but also for developers who want
>> to
>> >>>>> find
>> >>>>> >>> >> > > > > > > such
>> >>>>> >>> >> > > > operators
>> >>>>> >>> >> > > > > > and
>> >>>>> >>> >> > > > > > > improve them. This of this like a folder level
>> >>>>> annotation
>> >>>>> >>> >> where
>> >>>>> >>> >> > > > > > everything
>> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > Thanks
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
>> >>>>> >>> >> > > david@datatorrent.com
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > > > > > wrote:
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way
>> too
>> >>>>> many
>> >>>>> >>> >> operators
>> >>>>> >>> >> > > > that
>> >>>>> >>> >> > > > > > fall
>> >>>>> >>> >> > > > > > > > in
>> >>>>> >>> >> > > > > > > > > > the
>> >>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
>> >>>>> should
>> >>>>> >>> >> > > > > > > > > > > > make it
>> >>>>> >>> >> > > > obvious
>> >>>>> >>> >> > > > > to
>> >>>>> >>> >> > > > > > > > users
>> >>>>> >>> >> > > > > > > > > > > that
>> >>>>> >>> >> > > > > > > > > > > > which operators are up to par, and
>> which
>> >>>>> >>> >> > > > > > > > > > > > operators
>> >>>>> >>> >> are
>> >>>>> >>> >> > > not,
>> >>>>> >>> >> > > > > and
>> >>>>> >>> >> > > > > > > > maybe
>> >>>>> >>> >> > > > > > > > > > > even
>> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
>> >>>>> used in a
>> >>>>> >>> >> > > > > > > > > > > > real
>> >>>>> >>> >> > use
>> >>>>> >>> >> > > > > case.
>> >>>>> >>> >> > > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
>> >>>>> operators
>> >>>>> >>> >> > > > > > > > > > > and
>> >>>>> >>> >> > doing
>> >>>>> >>> >> > > > this
>> >>>>> >>> >> > > > > > > > > exercise
>> >>>>> >>> >> > > > > > > > > > as
>> >>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
>> >>>>> original
>> >>>>> >>> >> intent
>> >>>>> >>> >> > is
>> >>>>> >>> >> > > > for
>> >>>>> >>> >> > > > > > > > > > > contributions going forward.
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as
>> well.
>> >>>>> >>> >> > > > > > > > > > Operators
>> >>>>> >>> >> > > outside
>> >>>>> >>> >> > > > > the
>> >>>>> >>> >> > > > > > > play
>> >>>>> >>> >> > > > > > > > > > area should be of well known quality.
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > >
>> >>>>> >>> >> > > > > > > > > I think this is important, and I don't
>> >>>>> anticipate
>> >>>>> >>> >> > > > > > > > > much
>> >>>>> >>> >> > tension
>> >>>>> >>> >> > > if
>> >>>>> >>> >> > > > > we
>> >>>>> >>> >> > > > > > > > > establish clear criteria.
>> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
>> >>>>> operators
>> >>>>> >>> >> > > > > > > > > stay
>> >>>>> >>> >> and
>> >>>>> >>> >> > > put
>> >>>>> >>> >> > > > up
>> >>>>> >>> >> > > > > > the
>> >>>>> >>> >> > > > > > > > > bars for new operators.
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > > > David
>> >>>>> >>> >> > > > > > > > >
>> >>>>> >>> >> > > > > > > >
>> >>>>> >>> >> > > > > > >
>> >>>>> >>> >> > > > > >
>> >>>>> >>> >> > > > >
>> >>>>> >>> >> > > >
>> >>>>> >>> >> > >
>> >>>>> >>> >> >
>> >>>>> >>> >>
>> >>>>> >>> >
>> >>>>> >>> >
>> >>>>> >>>
>> >>>>> >>
>> >>>>> >>
>> >>>>> >
>> >>>>>
>> >>>>
>> >>>>
>> >>>
>> >>
>> >
>>
>
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi,

Friendly Reminder :

I created a shared google sheet and tracked the various details of
operators. The sheet contains information about operators under lib/algo,
lib/math & lib/streamquery. Link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
I also added a recommendation for each operator. Please take a look and
provide comments, if any.

Thanks
Lakshmi Prasanna

On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Added comments, also recommend having the misc folder for the remaining
> operators in contrib according to proposed guidelines
>
> https://github.com/apache/apex-site/pull/44
>
> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <lakshmi@datatorrent.com
> >
> wrote:
>
> > Hi
> >
> > I also added recommendation for lib/math operators to the same document
> as
> > a separate sheet. Please have a look.
> >
> > Thanks
> > Lakshmi Prasanna
> >
> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
> lakshmi@datatorrent.com
> > > wrote:
> >
> >> Hi,
> >>
> >> I also added recommendation for each operator . Please take a look.
> >>
> >> thanks
> >>
> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> >> lakshmi@datatorrent.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I created a shared google sheet and tracked the various details of
> >>> operators. Currently, the sheet contains information about operators under
> >>> lib/algo only. Link is
> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
> >>> Will update the sheet soon with lib/math too.
> >>>
> >>> Thanks
> >>> Lakshmi Prasanna
> >>>
> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
> >>> wrote:
> >>>
> >>>> Hi Lakshmi,
> >>>>
> >>>> Thanks for volunteering.
> >>>>
> >>>> I think Pramod's suggestion of putting the operators into 3 buckets
> and
> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
> >>>> individual operators are both good, with the exception that
> lib/streamquery
> >>>> is one unit and we probably do not need to look at individual
> operators
> >>>> under it.
> >>>>
> >>>> If we don't have any objection in the community, let's start the
> >>>> process.
> >>>>
> >>>> David
> >>>>
> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
> >>>> lakshmi@datatorrent.com> wrote:
> >>>>
> >>>>> I am interested to work on this.
> >>>>>
> >>>>> Regards,
> >>>>> Lakshmi prasanna
> >>>>>
> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> > Why not have a shared google sheet with a list of operators and
> >>>>> options
> >>>>> > that we want to do with it.
> >>>>> > I think it's case by case.
> >>>>> > But retire unused or obsolete operators is important and we should
> >>>>> do it
> >>>>> > sooner rather than later.
> >>>>> >
> >>>>> > Regards,
> >>>>> > Siyuan
> >>>>> >
> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
> >>>>> wrote:
> >>>>> >
> >>>>> >>
> >>>>> >> My vote is to do 2&3
> >>>>> >>
> >>>>> >> Thks
> >>>>> >> Amol
> >>>>> >>
> >>>>> >>
> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >>>>> >> VKottapalli@directv.com> wrote:
> >>>>> >>
> >>>>> >>> +1 for deprecating the packages listed below.
> >>>>> >>>
> >>>>> >>> -----Original Message-----
> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>>> >>>
> >>>>> >>> +1
> >>>>> >>>
> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
> david@datatorrent.com
> >>>>> >
> >>>>> >>> wrote:
> >>>>> >>>
> >>>>> >>> > Hi all,
> >>>>> >>> >
> >>>>> >>> > I would like to renew the discussion of retiring operators in
> >>>>> Malhar.
> >>>>> >>> >
> >>>>> >>> > As stated before, the reason why we would like to retire
> >>>>> operators in
> >>>>> >>> > Malhar is because some of them were written a long time ago
> >>>>> before
> >>>>> >>> > Apache incubation, and they do not pertain to real use cases,
> >>>>> are not
> >>>>> >>> > up to par in code quality, have no potential for improvement,
> and
> >>>>> >>> > probably completely unused by anybody.
> >>>>> >>> >
> >>>>> >>> > We do not want contributors to use them as a model of their
> >>>>> >>> > contribution, or users to use them thinking they are of
> quality,
> >>>>> and
> >>>>> >>> then hit a wall.
> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
> >>>>> >>> >
> >>>>> >>> > The initial 3 packages that we would like to target are
> >>>>> *lib/algo*,
> >>>>> >>> > *lib/math*, and *lib/streamquery*.
> >>>>> >>>
> >>>>> >>> >
> >>>>> >>> > I'm adding this thread to the users list. Please speak up if
> you
> >>>>> are
> >>>>> >>> > using any operator in these 3 packages. We would like to hear
> >>>>> from you.
> >>>>> >>> >
> >>>>> >>> > These are the options I can think of for retiring those
> >>>>> operators:
> >>>>> >>> >
> >>>>> >>> > 1) Completely remove them from the malhar repository.
> >>>>> >>> > 2) Move them from malhar-library into a separate artifact
> called
> >>>>> >>> > malhar-misc
> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are
> no
> >>>>> >>> > longer supported
> >>>>> >>> >
> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>>>> >>> >
> >>>>> >>> > David
> >>>>> >>> >
> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>>>> >>> > <pr...@datatorrent.com>
> >>>>> >>> > wrote:
> >>>>> >>> >
> >>>>> >>> >> I wanted to close the loop on this discussion. In general
> >>>>> everyone
> >>>>> >>> >> seemed to be favorable to this idea with no serious
> objections.
> >>>>> Folks
> >>>>> >>> >> had good suggestions like documenting capabilities of
> >>>>> operators, come
> >>>>> >>> >> up well defined criteria for graduation of operators and what
> >>>>> those
> >>>>> >>> >> criteria may be and what to do with existing operators that
> may
> >>>>> not
> >>>>> >>> >> yet be mature or unused.
> >>>>> >>> >>
> >>>>> >>> >> I am going to summarize the key points that resulted from the
> >>>>> >>> >> discussion and would like to proceed with them.
> >>>>> >>> >>
> >>>>> >>> >>    - Operators that do not yet provide the key platform
> >>>>> capabilities
> >>>>> >>> to
> >>>>> >>> >>    make an operator useful across different applications such
> as
> >>>>> >>> >> reusability,
> >>>>> >>> >>    partitioning static or dynamic, idempotency, exactly once
> >>>>> will
> >>>>> >>> still be
> >>>>> >>> >>    accepted as long as they are functionally correct, have
> unit
> >>>>> tests
> >>>>> >>> >> and will
> >>>>> >>> >>    go into a separate module.
> >>>>> >>> >>    - Contrib module was suggested as a place where new
> >>>>> contributions
> >>>>> >>> go in
> >>>>> >>> >>    that don't yet have all the platform capabilities and are
> >>>>> not yet
> >>>>> >>> >> mature.
> >>>>> >>> >>    If there are no other suggestions we will go with this one.
> >>>>> >>> >>    - It was suggested the operators documentation list those
> >>>>> platform
> >>>>> >>> >>    capabilities it currently provides from the list above. I
> >>>>> will
> >>>>> >>> >> document a
> >>>>> >>> >>    structure for this in the contribution guidelines.
> >>>>> >>> >>    - Folks wanted to know what would be the criteria to
> >>>>> graduate an
> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate
> >>>>> thread
> >>>>> >>> >> for it as
> >>>>> >>> >>    I think it requires its own discussion and hopefully we can
> >>>>> come
> >>>>> >>> >> up with a
> >>>>> >>> >>    set of guidelines for it.
> >>>>> >>> >>    - David brought up state of some of the existing operators
> >>>>> and
> >>>>> >>> their
> >>>>> >>> >>    retirement and the layout of operators in Malhar in general
> >>>>> and
> >>>>> >>> how it
> >>>>> >>> >>    causes problems with development. I will ask him to lead
> the
> >>>>> >>> >> discussion on
> >>>>> >>> >>    that.
> >>>>> >>> >>
> >>>>> >>> >> Thanks
> >>>>> >>> >>
> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
> >>>>> david@datatorrent.com>
> >>>>> >>> wrote:
> >>>>> >>> >>
> >>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
> >>>>> >>> >> >
> >>>>> >>> >> > On the contrary, putting a new process for people trying to
> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
> >>>>> operators in
> >>>>> >>> >> > the repository
> >>>>> >>> >> is
> >>>>> >>> >> > what is conflicting.
> >>>>> >>> >> >
> >>>>> >>> >> > Keep in mind that when people try to contribute, they always
> >>>>> look
> >>>>> >>> >> > at the existing operators already in the repository as
> >>>>> examples and
> >>>>> >>> >> > likely a
> >>>>> >>> >> model
> >>>>> >>> >> > for their new operators.
> >>>>> >>> >> >
> >>>>> >>> >> > David
> >>>>> >>> >> >
> >>>>> >>> >> >
> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
> >>>>> amol@datatorrent.com>
> >>>>> >>> >> wrote:
> >>>>> >>> >> >
> >>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
> >>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
> >>>>> >>> >> > > (contrib?) as long as license part is taken care of.
> >>>>> >>> >> > >
> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we
> >>>>> know
> >>>>> >>> >> > > what is being used?
> >>>>> >>> >> > >
> >>>>> >>> >> > > Thks,
> >>>>> >>> >> > > Amol
> >>>>> >>> >> > >
> >>>>> >>> >> > >
> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >>>>> >>> >> sandesh@datatorrent.com>
> >>>>> >>> >> > > wrote:
> >>>>> >>> >> > >
> >>>>> >>> >> > > > +1 for removing the not-used operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > So we are creating a process for operator writers who
> >>>>> don't
> >>>>> >>> >> > > > want to understand the platform, yet wants to
> contribute?
> >>>>> How
> >>>>> >>> >> > > > big is that
> >>>>> >>> >> set?
> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not
> >>>>> passed
> >>>>> >>> >> > > > all
> >>>>> >>> >> the
> >>>>> >>> >> > > > checklist, will they be ready to use that in production?
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
> >>>>> operators and
> >>>>> >>> >> > > > make
> >>>>> >>> >> it
> >>>>> >>> >> > > easy
> >>>>> >>> >> > > > to add more operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> >>>>> >>> >> > pramod@datatorrent.com>
> >>>>> >>> >> > > > wrote:
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > wrote:
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Pramod,
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > By that logic I would say let's put all
> partitionable
> >>>>> >>> >> > > > > > operators
> >>>>> >>> >> > into
> >>>>> >>> >> > > > one
> >>>>> >>> >> > > > > > folder, non-partitionable operators in another and
> so
> >>>>> on...
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Remember the original goal of making it easier for new
> >>>>> >>> >> > > > > members to contribute and managing those contributions
> >>>>> to
> >>>>> >>> >> > > > > maturity. It is
> >>>>> >>> >> not a
> >>>>> >>> >> > > > > functional level separation.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
> >>>>> being
> >>>>> >>> >> > > > > > used at
> >>>>> >>> >> > > class
> >>>>> >>> >> > > > > > level and not at package/folder level.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
> this
> >>>>> like
> >>>>> >>> >> > > > > a
> >>>>> >>> >> > > folder..."
> >>>>> >>> >> > > > > as an analogy and not literally.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Thanks
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Thanks
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> >>>>> >>> >> > > > pramod@datatorrent.com
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > wrote:
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> >>>>> >>> >> > > > > > > > annotation?
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > I think it is important to localize the additions
> >>>>> in one
> >>>>> >>> >> place so
> >>>>> >>> >> > > > that
> >>>>> >>> >> > > > > it
> >>>>> >>> >> > > > > > > becomes clearer to users about the maturity level
> of
> >>>>> >>> >> > > > > > > these,
> >>>>> >>> >> > easier
> >>>>> >>> >> > > > for
> >>>>> >>> >> > > > > > > developers to track them towards the path to
> >>>>> maturity and
> >>>>> >>> >> > > > > > > also
> >>>>> >>> >> > > > > provides a
> >>>>> >>> >> > > > > > > clearer directive for committers and contributors
> on
> >>>>> >>> >> acceptance
> >>>>> >>> >> > of
> >>>>> >>> >> > > > new
> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone
> makes
> >>>>> them
> >>>>> >>> >> spread
> >>>>> >>> >> > all
> >>>>> >>> >> > > > > over
> >>>>> >>> >> > > > > > > the place and adds an additional layer of
> >>>>> difficulty in
> >>>>> >>> >> > > > identification
> >>>>> >>> >> > > > > > not
> >>>>> >>> >> > > > > > > just for users but also for developers who want to
> >>>>> find
> >>>>> >>> >> > > > > > > such
> >>>>> >>> >> > > > operators
> >>>>> >>> >> > > > > > and
> >>>>> >>> >> > > > > > > improve them. This of this like a folder level
> >>>>> annotation
> >>>>> >>> >> where
> >>>>> >>> >> > > > > > everything
> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > Thanks
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> >>>>> >>> >> > > david@datatorrent.com
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too
> >>>>> many
> >>>>> >>> >> operators
> >>>>> >>> >> > > > that
> >>>>> >>> >> > > > > > fall
> >>>>> >>> >> > > > > > > > in
> >>>>> >>> >> > > > > > > > > > the
> >>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
> >>>>> should
> >>>>> >>> >> > > > > > > > > > > > make it
> >>>>> >>> >> > > > obvious
> >>>>> >>> >> > > > > to
> >>>>> >>> >> > > > > > > > users
> >>>>> >>> >> > > > > > > > > > > that
> >>>>> >>> >> > > > > > > > > > > > which operators are up to par, and which
> >>>>> >>> >> > > > > > > > > > > > operators
> >>>>> >>> >> are
> >>>>> >>> >> > > not,
> >>>>> >>> >> > > > > and
> >>>>> >>> >> > > > > > > > maybe
> >>>>> >>> >> > > > > > > > > > > even
> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
> >>>>> used in a
> >>>>> >>> >> > > > > > > > > > > > real
> >>>>> >>> >> > use
> >>>>> >>> >> > > > > case.
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
> >>>>> operators
> >>>>> >>> >> > > > > > > > > > > and
> >>>>> >>> >> > doing
> >>>>> >>> >> > > > this
> >>>>> >>> >> > > > > > > > > exercise
> >>>>> >>> >> > > > > > > > > > as
> >>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
> >>>>> original
> >>>>> >>> >> intent
> >>>>> >>> >> > is
> >>>>> >>> >> > > > for
> >>>>> >>> >> > > > > > > > > > > contributions going forward.
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well.
> >>>>> >>> >> > > > > > > > > > Operators
> >>>>> >>> >> > > outside
> >>>>> >>> >> > > > > the
> >>>>> >>> >> > > > > > > play
> >>>>> >>> >> > > > > > > > > > area should be of well known quality.
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > I think this is important, and I don't
> >>>>> anticipate
> >>>>> >>> >> > > > > > > > > much
> >>>>> >>> >> > tension
> >>>>> >>> >> > > if
> >>>>> >>> >> > > > > we
> >>>>> >>> >> > > > > > > > > establish clear criteria.
> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
> >>>>> operators
> >>>>> >>> >> > > > > > > > > stay
> >>>>> >>> >> and
> >>>>> >>> >> > > put
> >>>>> >>> >> > > > up
> >>>>> >>> >> > > > > > the
> >>>>> >>> >> > > > > > > > > bars for new operators.
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > > > David
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > >
> >>>>> >>> >> >
> >>>>> >>> >>
> >>>>> >>> >
> >>>>> >>> >
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>

Re: A proposal for Malhar

Posted by Lakshmi Velineni <la...@datatorrent.com>.
Hi,

Friendly reminder:

I created a shared Google Sheet that tracks the various details of the
operators. The sheet contains information about operators under lib/algo,
lib/math, and lib/streamquery. The link is
https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
I also added a recommendation for each operator. Please take a look and
provide comments, if any.

Thanks
Lakshmi Prasanna

On Tue, Aug 9, 2016 at 10:07 AM, Pramod Immaneni <pr...@datatorrent.com>
wrote:

> Added comments, also recommend having the misc folder for the remaining
> operators in contrib according to proposed guidelines
>
> https://github.com/apache/apex-site/pull/44
>
> On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <lakshmi@datatorrent.com
> >
> wrote:
>
> > Hi
> >
> > I also added recommendation for lib/math operators to the same document
> > as a separate sheet. Please have a look.
> >
> > Thanks
> > Lakshmi Prasanna
> >
> > On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <
> lakshmi@datatorrent.com
> > > wrote:
> >
> >> Hi,
> >>
> >> I also added recommendation for each operator . Please take a look.
> >>
> >> thanks
> >>
> >> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
> >> lakshmi@datatorrent.com> wrote:
> >>
> >>> Hi,
> >>>
> >>> I created a shared google sheet and tracked the various details of
> >>> operators. Currently, the sheet contains information about operators
> >>> under lib/algo only. Link is
> >>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
> >>> Will update the sheet soon with lib/math too.
> >>>
> >>> Thanks
> >>> Lakshmi Prasanna
> >>>
> >>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
> >>> wrote:
> >>>
> >>>> Hi Lakshmi,
> >>>>
> >>>> Thanks for volunteering.
> >>>>
> >>>> I think Pramod's suggestion of putting the operators into 3 buckets and
> >>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
> >>>> individual operators are both good, with the exception that
> >>>> lib/streamquery is one unit and we probably do not need to look at
> >>>> individual operators under it.
> >>>>
> >>>> If we don't have any objection in the community, let's start the
> >>>> process.
> >>>>
> >>>> David
> >>>>
> >>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
> >>>> lakshmi@datatorrent.com> wrote:
> >>>>
> >>>>> I am interested to work on this.
> >>>>>
> >>>>> Regards,
> >>>>> Lakshmi prasanna
> >>>>>
> >>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
> >>>>> wrote:
> >>>>>
> >>>>> > Why not have a shared google sheet with a list of operators and
> >>>>> options
> >>>>> > that we want to do with it.
> >>>>> > I think it's case by case.
> >>>>> > But retire unused or obsolete operators is important and we should
> >>>>> do it
> >>>>> > sooner rather than later.
> >>>>> >
> >>>>> > Regards,
> >>>>> > Siyuan
> >>>>> >
> >>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
> >>>>> wrote:
> >>>>> >
> >>>>> >>
> >>>>> >> My vote is to do 2&3
> >>>>> >>
> >>>>> >> Thks
> >>>>> >> Amol
> >>>>> >>
> >>>>> >>
> >>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
> >>>>> >> VKottapalli@directv.com> wrote:
> >>>>> >>
> >>>>> >>> +1 for deprecating the packages listed below.
> >>>>> >>>
> >>>>> >>> -----Original Message-----
> >>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
> >>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
> >>>>> >>>
> >>>>> >>> +1
> >>>>> >>>
> >>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <
> david@datatorrent.com
> >>>>> >
> >>>>> >>> wrote:
> >>>>> >>>
> >>>>> >>> > Hi all,
> >>>>> >>> >
> >>>>> >>> > I would like to renew the discussion of retiring operators in
> >>>>> Malhar.
> >>>>> >>> >
> >>>>> >>> > As stated before, the reason why we would like to retire
> >>>>> operators in
> >>>>> >>> > Malhar is because some of them were written a long time ago
> >>>>> before
> >>>>> >>> > Apache incubation, and they do not pertain to real use cases,
> >>>>> are not
> >>>>> >>> > up to par in code quality, have no potential for improvement,
> and
> >>>>> >>> > probably completely unused by anybody.
> >>>>> >>> >
> >>>>> >>> > We do not want contributors to use them as a model of their
> >>>>> >>> > contribution, or users to use them thinking they are of
> quality,
> >>>>> and
> >>>>> >>> then hit a wall.
> >>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
> >>>>> >>> >
> >>>>> >>> > The initial 3 packages that we would like to target are
> >>>>> *lib/algo*,
> >>>>> >>> > *lib/math*, and *lib/streamquery*.
> >>>>> >>>
> >>>>> >>> >
> >>>>> >>> > I'm adding this thread to the users list. Please speak up if
> you
> >>>>> are
> >>>>> >>> > using any operator in these 3 packages. We would like to hear
> >>>>> from you.
> >>>>> >>> >
> >>>>> >>> > These are the options I can think of for retiring those
> >>>>> operators:
> >>>>> >>> >
> >>>>> >>> > 1) Completely remove them from the malhar repository.
> >>>>> >>> > 2) Move them from malhar-library into a separate artifact
> called
> >>>>> >>> > malhar-misc
> >>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are
> no
> >>>>> >>> > longer supported
> >>>>> >>> >
> >>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
> >>>>> >>> >
> >>>>> >>> > David
> >>>>> >>> >
> >>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
> >>>>> >>> > <pr...@datatorrent.com>
> >>>>> >>> > wrote:
> >>>>> >>> >
> >>>>> >>> >> I wanted to close the loop on this discussion. In general
> >>>>> everyone
> >>>>> >>> >> seemed to be favorable to this idea with no serious
> objections.
> >>>>> Folks
> >>>>> >>> >> had good suggestions like documenting capabilities of
> >>>>> operators, come
> >>>>> >>> >> up well defined criteria for graduation of operators and what
> >>>>> those
> >>>>> >>> >> criteria may be and what to do with existing operators that
> may
> >>>>> not
> >>>>> >>> >> yet be mature or unused.
> >>>>> >>> >>
> >>>>> >>> >> I am going to summarize the key points that resulted from the
> >>>>> >>> >> discussion and would like to proceed with them.
> >>>>> >>> >>
> >>>>> >>> >>    - Operators that do not yet provide the key platform
> >>>>> capabilities
> >>>>> >>> to
> >>>>> >>> >>    make an operator useful across different applications such
> as
> >>>>> >>> >> reusability,
> >>>>> >>> >>    partitioning static or dynamic, idempotency, exactly once
> >>>>> will
> >>>>> >>> still be
> >>>>> >>> >>    accepted as long as they are functionally correct, have
> unit
> >>>>> tests
> >>>>> >>> >> and will
> >>>>> >>> >>    go into a separate module.
> >>>>> >>> >>    - Contrib module was suggested as a place where new
> >>>>> contributions
> >>>>> >>> go in
> >>>>> >>> >>    that don't yet have all the platform capabilities and are
> >>>>> not yet
> >>>>> >>> >> mature.
> >>>>> >>> >>    If there are no other suggestions we will go with this one.
> >>>>> >>> >>    - It was suggested the operators documentation list those
> >>>>> platform
> >>>>> >>> >>    capabilities it currently provides from the list above. I
> >>>>> will
> >>>>> >>> >> document a
> >>>>> >>> >>    structure for this in the contribution guidelines.
> >>>>> >>> >>    - Folks wanted to know what would be the criteria to
> >>>>> graduate an
> >>>>> >>> >>    operator to the big leagues :). I will kick-off a separate
> >>>>> thread
> >>>>> >>> >> for it as
> >>>>> >>> >>    I think it requires its own discussion and hopefully we can
> >>>>> come
> >>>>> >>> >> up with a
> >>>>> >>> >>    set of guidelines for it.
> >>>>> >>> >>    - David brought up state of some of the existing operators
> >>>>> and
> >>>>> >>> their
> >>>>> >>> >>    retirement and the layout of operators in Malhar in general
> >>>>> and
> >>>>> >>> how it
> >>>>> >>> >>    causes problems with development. I will ask him to lead
> the
> >>>>> >>> >> discussion on
> >>>>> >>> >>    that.
> >>>>> >>> >>
> >>>>> >>> >> Thanks
> >>>>> >>> >>
> >>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <
> >>>>> david@datatorrent.com>
> >>>>> >>> wrote:
> >>>>> >>> >>
> >>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
> >>>>> >>> >> >
> >>>>> >>> >> > On the contrary, putting a new process for people trying to
> >>>>> >>> >> > contribute while NOT addressing the old unused subpar
> >>>>> operators in
> >>>>> >>> >> > the repository
> >>>>> >>> >> is
> >>>>> >>> >> > what is conflicting.
> >>>>> >>> >> >
> >>>>> >>> >> > Keep in mind that when people try to contribute, they always
> >>>>> look
> >>>>> >>> >> > at the existing operators already in the repository as
> >>>>> examples and
> >>>>> >>> >> > likely a
> >>>>> >>> >> model
> >>>>> >>> >> > for their new operators.
> >>>>> >>> >> >
> >>>>> >>> >> > David
> >>>>> >>> >> >
> >>>>> >>> >> >
> >>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <
> >>>>> amol@datatorrent.com>
> >>>>> >>> >> wrote:
> >>>>> >>> >> >
> >>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
> >>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
> >>>>> >>> >> > > (contrib?) as long as license part is taken care of.
> >>>>> >>> >> > >
> >>>>> >>> >> > > On the thread of removing non-used operators -> How do we
> >>>>> know
> >>>>> >>> >> > > what is being used?
> >>>>> >>> >> > >
> >>>>> >>> >> > > Thks,
> >>>>> >>> >> > > Amol
> >>>>> >>> >> > >
> >>>>> >>> >> > >
> >>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <
> >>>>> >>> >> sandesh@datatorrent.com>
> >>>>> >>> >> > > wrote:
> >>>>> >>> >> > >
> >>>>> >>> >> > > > +1 for removing the not-used operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > So we are creating a process for operator writers who
> >>>>> don't
> >>>>> >>> >> > > > want to understand the platform, yet wants to
> contribute?
> >>>>> How
> >>>>> >>> >> > > > big is that
> >>>>> >>> >> set?
> >>>>> >>> >> > > > If we tell the app-user, here is the code which has not
> >>>>> passed
> >>>>> >>> >> > > > all
> >>>>> >>> >> the
> >>>>> >>> >> > > > checklist, will they be ready to use that in production?
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the
> >>>>> operators and
> >>>>> >>> >> > > > make
> >>>>> >>> >> it
> >>>>> >>> >> > > easy
> >>>>> >>> >> > > > to add more operators.
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <
> >>>>> >>> >> > pramod@datatorrent.com>
> >>>>> >>> >> > > > wrote:
> >>>>> >>> >> > > >
> >>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <
> >>>>> >>> >> > > gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > wrote:
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Pramod,
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > By that logic I would say let's put all
> partitionable
> >>>>> >>> >> > > > > > operators
> >>>>> >>> >> > into
> >>>>> >>> >> > > > one
> >>>>> >>> >> > > > > > folder, non-partitionable operators in another and
> so
> >>>>> on...
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Remember the original goal of making it easier for new
> >>>>> >>> >> > > > > members to contribute and managing those contributions
> >>>>> to
> >>>>> >>> >> > > > > maturity. It is
> >>>>> >>> >> not a
> >>>>> >>> >> > > > > functional level separation.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > When I look at hadoop code I see these annotations
> >>>>> being
> >>>>> >>> >> > > > > > used at
> >>>>> >>> >> > > class
> >>>>> >>> >> > > > > > level and not at package/folder level.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of
> this
> >>>>> like
> >>>>> >>> >> > > > > a
> >>>>> >>> >> > > folder..."
> >>>>> >>> >> > > > > as an analogy and not literally.
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > Thanks
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > Thanks
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <
> >>>>> >>> >> > > > pramod@datatorrent.com
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > wrote:
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <
> >>>>> >>> >> > > > > gaurav.gopi123@gmail.com>
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > > Can same goal not be achieved by using
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
> >>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
> >>>>> >>> >> > > > > > > > annotation?
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > I think it is important to localize the additions
> >>>>> in one
> >>>>> >>> >> place so
> >>>>> >>> >> > > > that
> >>>>> >>> >> > > > > it
> >>>>> >>> >> > > > > > > becomes clearer to users about the maturity level
> of
> >>>>> >>> >> > > > > > > these,
> >>>>> >>> >> > easier
> >>>>> >>> >> > > > for
> >>>>> >>> >> > > > > > > developers to track them towards the path to
> >>>>> maturity and
> >>>>> >>> >> > > > > > > also
> >>>>> >>> >> > > > > provides a
> >>>>> >>> >> > > > > > > clearer directive for committers and contributors
> on
> >>>>> >>> >> acceptance
> >>>>> >>> >> > of
> >>>>> >>> >> > > > new
> >>>>> >>> >> > > > > > > submissions. Relying on the annotations alone
> makes
> >>>>> them
> >>>>> >>> >> spread
> >>>>> >>> >> > all
> >>>>> >>> >> > > > > over
> >>>>> >>> >> > > > > > > the place and adds an additional layer of
> >>>>> difficulty in
> >>>>> >>> >> > > > identification
> >>>>> >>> >> > > > > > not
> >>>>> >>> >> > > > > > > just for users but also for developers who want to
> >>>>> find
> >>>>> >>> >> > > > > > > such
> >>>>> >>> >> > > > operators
> >>>>> >>> >> > > > > > and
> >>>>> >>> >> > > > > > > improve them. This of this like a folder level
> >>>>> annotation
> >>>>> >>> >> where
> >>>>> >>> >> > > > > > everything
> >>>>> >>> >> > > > > > > under this folder is unstable or evolving.
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > Thanks
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <
> >>>>> >>> >> > > david@datatorrent.com
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > > > > > wrote:
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too
> >>>>> many
> >>>>> >>> >> operators
> >>>>> >>> >> > > > that
> >>>>> >>> >> > > > > > fall
> >>>>> >>> >> > > > > > > > in
> >>>>> >>> >> > > > > > > > > > the
> >>>>> >>> >> > > > > > > > > > > > "non-production quality" category. We
> >>>>> should
> >>>>> >>> >> > > > > > > > > > > > make it
> >>>>> >>> >> > > > obvious
> >>>>> >>> >> > > > > to
> >>>>> >>> >> > > > > > > > users
> >>>>> >>> >> > > > > > > > > > > that
> >>>>> >>> >> > > > > > > > > > > > which operators are up to par, and which
> >>>>> >>> >> > > > > > > > > > > > operators
> >>>>> >>> >> are
> >>>>> >>> >> > > not,
> >>>>> >>> >> > > > > and
> >>>>> >>> >> > > > > > > > maybe
> >>>>> >>> >> > > > > > > > > > > even
> >>>>> >>> >> > > > > > > > > > > > remove those that are likely not ever
> >>>>> used in a
> >>>>> >>> >> > > > > > > > > > > > real
> >>>>> >>> >> > use
> >>>>> >>> >> > > > > case.
> >>>>> >>> >> > > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older
> >>>>> operators
> >>>>> >>> >> > > > > > > > > > > and
> >>>>> >>> >> > doing
> >>>>> >>> >> > > > this
> >>>>> >>> >> > > > > > > > > exercise
> >>>>> >>> >> > > > > > > > > > as
> >>>>> >>> >> > > > > > > > > > > this can cause unnecessary tensions. My
> >>>>> original
> >>>>> >>> >> intent
> >>>>> >>> >> > is
> >>>>> >>> >> > > > for
> >>>>> >>> >> > > > > > > > > > > contributions going forward.
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > >
> >>>>> >>> >> > > > > > > > > > IMO it is important to address this as well.
> >>>>> >>> >> > > > > > > > > > Operators
> >>>>> >>> >> > > outside
> >>>>> >>> >> > > > > the
> >>>>> >>> >> > > > > > > play
> >>>>> >>> >> > > > > > > > > > area should be of well known quality.
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > >
> >>>>> >>> >> > > > > > > > > I think this is important, and I don't
> >>>>> anticipate
> >>>>> >>> >> > > > > > > > > much
> >>>>> >>> >> > tension
> >>>>> >>> >> > > if
> >>>>> >>> >> > > > > we
> >>>>> >>> >> > > > > > > > > establish clear criteria.
> >>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar
> >>>>> operators
> >>>>> >>> >> > > > > > > > > stay
> >>>>> >>> >> and
> >>>>> >>> >> > > put
> >>>>> >>> >> > > > up
> >>>>> >>> >> > > > > > the
> >>>>> >>> >> > > > > > > > > bars for new operators.
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > > > David
> >>>>> >>> >> > > > > > > > >
> >>>>> >>> >> > > > > > > >
> >>>>> >>> >> > > > > > >
> >>>>> >>> >> > > > > >
> >>>>> >>> >> > > > >
> >>>>> >>> >> > > >
> >>>>> >>> >> > >
> >>>>> >>> >> >
> >>>>> >>> >>
> >>>>> >>> >
> >>>>> >>> >
> >>>>> >>>
> >>>>> >>
> >>>>> >>
> >>>>> >
> >>>>>
> >>>>
> >>>>
> >>>
> >>
> >
>

Re: A proposal for Malhar

Posted by Pramod Immaneni <pr...@datatorrent.com>.
Added comments. I also recommend having the misc folder for the remaining
operators in contrib, according to the proposed guidelines.

https://github.com/apache/apex-site/pull/44

On Mon, Aug 1, 2016 at 12:22 AM, Lakshmi Velineni <la...@datatorrent.com>
wrote:

> Hi
>
> I also added recommendation for lib/math operators to the same document as
> a separate sheet. Please have a look.
>
> Thanks
> Lakshmi Prasanna
>
> On Fri, Jul 29, 2016 at 7:19 PM, Lakshmi Velineni <lakshmi@datatorrent.com>
> wrote:
>
>> Hi,
>>
>> I also added recommendation for each operator . Please take a look.
>>
>> thanks
>>
>> On Thu, Jul 28, 2016 at 11:03 AM, Lakshmi Velineni <
>> lakshmi@datatorrent.com> wrote:
>>
>>> Hi,
>>>
>>> I created a shared google sheet and tracked the various details of
>>> operators. Currently, the sheet contains information about operators under
>>> lib/algo only. Link is
>>> https://docs.google.com/a/datatorrent.com/spreadsheets/d/15IjMa-vYK6Wru4kZnUGIgg6sQAptDJ_CaWpXt3GDccM/edit?usp=sharing .
>>> Will update the sheet soon with lib/math too.
>>>
>>> Thanks
>>> Lakshmi Prasanna
>>>
>>> On Tue, Jul 12, 2016 at 2:36 PM, David Yan <da...@datatorrent.com>
>>> wrote:
>>>
>>>> Hi Lakshmi,
>>>>
>>>> Thanks for volunteering.
>>>>
>>>> I think Pramod's suggestion of putting the operators into 3 buckets and
>>>> Siyuan's suggestion of starting a shared Google Sheet that tracks
>>>> individual operators are both good, with the exception that lib/streamquery
>>>> is one unit and we probably do not need to look at individual operators
>>>> under it.
>>>>
>>>> If we don't have any objection in the community, let's start the
>>>> process.
>>>>
>>>> David
>>>>
>>>> On Tue, Jul 12, 2016 at 2:24 PM, Lakshmi Velineni <
>>>> lakshmi@datatorrent.com> wrote:
>>>>
>>>>> I am interested to work on this.
>>>>>
>>>>> Regards,
>>>>> Lakshmi prasanna
>>>>>
>>>>> On Tue, Jul 12, 2016 at 1:55 PM, hsy541@gmail.com <hs...@gmail.com>
>>>>> wrote:
>>>>>
>>>>> > Why not have a shared google sheet with a list of operators and options
>>>>> > that we want to do with it.
>>>>> > I think it's case by case.
>>>>> > But retire unused or obsolete operators is important and we should do it
>>>>> > sooner rather than later.
>>>>> >
>>>>> > Regards,
>>>>> > Siyuan
>>>>> >
>>>>> > On Tue, Jul 12, 2016 at 1:09 PM, Amol Kekre <am...@datatorrent.com>
>>>>> > wrote:
>>>>> >
>>>>> >>
>>>>> >> My vote is to do 2&3
>>>>> >>
>>>>> >> Thks
>>>>> >> Amol
>>>>> >>
>>>>> >>
>>>>> >> On Tue, Jul 12, 2016 at 12:14 PM, Kottapalli, Venkatesh <
>>>>> >> VKottapalli@directv.com> wrote:
>>>>> >>
>>>>> >>> +1 for deprecating the packages listed below.
>>>>> >>>
>>>>> >>> -----Original Message-----
>>>>> >>> From: hsy541@gmail.com [mailto:hsy541@gmail.com]
>>>>> >>> Sent: Tuesday, July 12, 2016 12:01 PM
>>>>> >>>
>>>>> >>> +1
>>>>> >>>
>>>>> >>> On Tue, Jul 12, 2016 at 11:53 AM, David Yan <david@datatorrent.com>
>>>>> >>> wrote:
>>>>> >>>
>>>>> >>> > Hi all,
>>>>> >>> >
>>>>> >>> > I would like to renew the discussion of retiring operators in Malhar.
>>>>> >>> >
>>>>> >>> > As stated before, the reason why we would like to retire operators in
>>>>> >>> > Malhar is because some of them were written a long time ago before
>>>>> >>> > Apache incubation, and they do not pertain to real use cases, are not
>>>>> >>> > up to par in code quality, have no potential for improvement, and
>>>>> >>> > probably completely unused by anybody.
>>>>> >>> >
>>>>> >>> > We do not want contributors to use them as a model of their
>>>>> >>> > contribution, or users to use them thinking they are of quality, and
>>>>> >>> > then hit a wall.
>>>>> >>> > Both scenarios are not beneficial to the reputation of Apex.
>>>>> >>> >
>>>>> >>> > The initial 3 packages that we would like to target are *lib/algo*,
>>>>> >>> > *lib/math*, and *lib/streamquery*.
>>>>> >>> >
>>>>> >>> > I'm adding this thread to the users list. Please speak up if you are
>>>>> >>> > using any operator in these 3 packages. We would like to hear from you.
>>>>> >>> >
>>>>> >>> > These are the options I can think of for retiring those operators:
>>>>> >>> >
>>>>> >>> > 1) Completely remove them from the malhar repository.
>>>>> >>> > 2) Move them from malhar-library into a separate artifact called
>>>>> >>> > malhar-misc
>>>>> >>> > 3) Mark them deprecated and add to their javadoc that they are no
>>>>> >>> > longer supported
>>>>> >>> >
>>>>> >>> > Note that 2 and 3 are not mutually exclusive. Any thoughts?
>>>>> >>> >
>>>>> >>> > David
>>>>> >>> >
>>>>> >>> > On Tue, Jun 7, 2016 at 2:27 PM, Pramod Immaneni
>>>>> >>> > <pr...@datatorrent.com>
>>>>> >>> > wrote:
>>>>> >>> >
>>>>> >>> >> I wanted to close the loop on this discussion. In general everyone
>>>>> >>> >> seemed to be favorable to this idea with no serious objections. Folks
>>>>> >>> >> had good suggestions like documenting capabilities of operators, come
>>>>> >>> >> up well defined criteria for graduation of operators and what those
>>>>> >>> >> criteria may be and what to do with existing operators that may not
>>>>> >>> >> yet be mature or unused.
>>>>> >>> >>
>>>>> >>> >> I am going to summarize the key points that resulted from the
>>>>> >>> >> discussion and would like to proceed with them.
>>>>> >>> >>
>>>>> >>> >>    - Operators that do not yet provide the key platform capabilities to
>>>>> >>> >>    make an operator useful across different applications such as
>>>>> >>> >>    reusability, partitioning static or dynamic, idempotency, exactly once
>>>>> >>> >>    will still be accepted as long as they are functionally correct, have
>>>>> >>> >>    unit tests and will go into a separate module.
>>>>> >>> >>    - Contrib module was suggested as a place where new contributions go in
>>>>> >>> >>    that don't yet have all the platform capabilities and are not yet
>>>>> >>> >>    mature. If there are no other suggestions we will go with this one.
>>>>> >>> >>    - It was suggested the operators documentation list those platform
>>>>> >>> >>    capabilities it currently provides from the list above. I will
>>>>> >>> >>    document a structure for this in the contribution guidelines.
>>>>> >>> >>    - Folks wanted to know what would be the criteria to graduate an
>>>>> >>> >>    operator to the big leagues :). I will kick-off a separate thread
>>>>> >>> >>    for it as I think it requires its own discussion and hopefully we can
>>>>> >>> >>    come up with a set of guidelines for it.
>>>>> >>> >>    - David brought up state of some of the existing operators and their
>>>>> >>> >>    retirement and the layout of operators in Malhar in general and how it
>>>>> >>> >>    causes problems with development. I will ask him to lead the
>>>>> >>> >>    discussion on that.
>>>>> >>> >>
>>>>> >>> >> Thanks
>>>>> >>> >>
>>>>> >>> >> On Fri, May 27, 2016 at 7:47 PM, David Yan <david@datatorrent.com>
>>>>> >>> >> wrote:
>>>>> >>> >>
>>>>> >>> >> > The two ideas are not conflicting, but rather complementing.
>>>>> >>> >> >
>>>>> >>> >> > On the contrary, putting a new process for people trying to
>>>>> >>> >> > contribute while NOT addressing the old unused subpar operators in
>>>>> >>> >> > the repository is what is conflicting.
>>>>> >>> >> >
>>>>> >>> >> > Keep in mind that when people try to contribute, they always look
>>>>> >>> >> > at the existing operators already in the repository as examples and
>>>>> >>> >> > likely a model for their new operators.
>>>>> >>> >> >
>>>>> >>> >> > David
>>>>> >>> >> >
>>>>> >>> >> >
>>>>> >>> >> > On Fri, May 27, 2016 at 4:05 PM, Amol Kekre <amol@datatorrent.com>
>>>>> >>> >> > wrote:
>>>>> >>> >> >
>>>>> >>> >> > > Yes there are two conflicting threads now. The original thread
>>>>> >>> >> > > was to open up a way for contributors to submit code in a dir
>>>>> >>> >> > > (contrib?) as long as license part is taken care of.
>>>>> >>> >> > >
>>>>> >>> >> > > On the thread of removing non-used operators -> How do we know
>>>>> >>> >> > > what is being used?
>>>>> >>> >> > >
>>>>> >>> >> > > Thks,
>>>>> >>> >> > > Amol
>>>>> >>> >> > >
>>>>> >>> >> > >
>>>>> >>> >> > > On Fri, May 27, 2016 at 3:40 PM, Sandesh Hegde <sandesh@datatorrent.com>
>>>>> >>> >> > > wrote:
>>>>> >>> >> > >
>>>>> >>> >> > > > +1 for removing the not-used operators.
>>>>> >>> >> > > >
>>>>> >>> >> > > > So we are creating a process for operator writers who don't
>>>>> >>> >> > > > want to understand the platform, yet wants to contribute? How
>>>>> >>> >> > > > big is that set?
>>>>> >>> >> > > > If we tell the app-user, here is the code which has not passed
>>>>> >>> >> > > > all the checklist, will they be ready to use that in production?
>>>>> >>> >> > > >
>>>>> >>> >> > > > This thread has 2 conflicting forces, reduce the operators and
>>>>> >>> >> > > > make it easy to add more operators.
>>>>> >>> >> > > >
>>>>> >>> >> > > >
>>>>> >>> >> > > >
>>>>> >>> >> > > > On Fri, May 27, 2016 at 3:03 PM Pramod Immaneni <pramod@datatorrent.com>
>>>>> >>> >> > > > wrote:
>>>>> >>> >> > > >
>>>>> >>> >> > > > > On Fri, May 27, 2016 at 2:30 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>>> >>> >> > > > > wrote:
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > Pramod,
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > By that logic I would say let's put all partitionable
>>>>> >>> >> > > > > > operators into one folder, non-partitionable operators in
>>>>> >>> >> > > > > > another and so on...
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > Remember the original goal of making it easier for new
>>>>> >>> >> > > > > members to contribute and managing those contributions to
>>>>> >>> >> > > > > maturity. It is not a functional level separation.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > When I look at hadoop code I see these annotations being
>>>>> >>> >> > > > > > used at class level and not at package/folder level.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > I had a typo in my email, I meant to say "think of this like
>>>>> >>> >> > > > > a folder..." as an analogy and not literally.
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > Thanks
>>>>> >>> >> > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > > > > Thanks
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > On Fri, May 27, 2016 at 2:10 PM, Pramod Immaneni <pramod@datatorrent.com>
>>>>> >>> >> > > > > > wrote:
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > > > > On Fri, May 27, 2016 at 1:05 PM, Gaurav Gupta <gaurav.gopi123@gmail.com>
>>>>> >>> >> > > > > > > wrote:
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > > Can same goal not be achieved by using
>>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Evolving /
>>>>> >>> >> > > > > > > > org.apache.hadoop.classification.InterfaceStability.Unstable
>>>>> >>> >> > > > > > > > annotation?
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > I think it is important to localize the additions in one place so
>>>>> >>> >> > > > > > > that it becomes clearer to users about the maturity level of these,
>>>>> >>> >> > > > > > > easier for developers to track them towards the path to maturity and
>>>>> >>> >> > > > > > > also provides a clearer directive for committers and contributors on
>>>>> >>> >> > > > > > > acceptance of new submissions. Relying on the annotations alone makes
>>>>> >>> >> > > > > > > them spread all over the place and adds an additional layer of
>>>>> >>> >> > > > > > > difficulty in identification not just for users but also for developers
>>>>> >>> >> > > > > > > who want to find such operators and improve them. This of this like a
>>>>> >>> >> > > > > > > folder level annotation where everything under this folder is unstable
>>>>> >>> >> > > > > > > or evolving.
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > Thanks
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > > > On Fri, May 27, 2016 at 12:35 PM, David Yan <david@datatorrent.com>
>>>>> >>> >> > > > > > > > wrote:
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > > Malhar in its current state, has way too many operators that fall in
>>>>> >>> >> > > > > > > > > > > > the "non-production quality" category. We should make it obvious to
>>>>> >>> >> > > > > > > > > > > > users that which operators are up to par, and which operators are
>>>>> >>> >> > > > > > > > > > > > not, and maybe even remove those that are likely not ever used in a
>>>>> >>> >> > > > > > > > > > > > real use case.
>>>>> >>> >> > > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > > I am ambivalent about revisiting older operators and doing this
>>>>> >>> >> > > > > > > > > > > exercise as this can cause unnecessary tensions. My original intent is
>>>>> >>> >> > > > > > > > > > > for contributions going forward.
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > >
>>>>> >>> >> > > > > > > > > > IMO it is important to address this as well. Operators outside the
>>>>> >>> >> > > > > > > > > > play area should be of well known quality.
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > >
>>>>> >>> >> > > > > > > > > I think this is important, and I don't anticipate much tension if we
>>>>> >>> >> > > > > > > > > establish clear criteria.
>>>>> >>> >> > > > > > > > > It's not helpful if we let the old subpar operators stay and put up
>>>>> >>> >> > > > > > > > > the bars for new operators.
>>>>> >>> >> > > > > > > > >
>>>>> >>> >> > > > > > > > > David
>>>>> >>> >> > > > > > > > >
>>>>> >>> >> > > > > > > >
>>>>> >>> >> > > > > > >
>>>>> >>> >> > > > > >
>>>>> >>> >> > > > >
>>>>> >>> >> > > >
>>>>> >>> >> > >
>>>>> >>> >> >
>>>>> >>> >>
>>>>> >>> >
>>>>> >>> >
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >
>>>>>
>>>>
>>>>
>>>
>>
>
