Posted to dev@apex.apache.org by "York, Brennon" <Br...@capitalone.com> on 2015/12/09 22:42:23 UTC

Simple Operators within Malhar (MLHR-1914)

All, I’ve been working on the JIRA ticket MLHR-1914 (at https://malhar.atlassian.net/projects/MLHR/issues/MLHR-1914?filter=allopenissues) and I wanted to send this out to describe what I’ve been doing and get feedback now that it’s in a state we can discuss ;)

Before going into depth, here is the code in my repo:
https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/complex
https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/simple
The tests are in the corresponding test directories.

So, the biggest impetus for this JIRA is that there should be a set of operators that (1) standardize the input and output ports and (2) make it very simple for a developer to implement just a ‘process’ method and forget the rest. Given this, I found that the operators fall into two sets, based on the complexity of the ports and how they map to each other. I gave them the package names ‘simple’ and ‘complex’ for lack of a better idea at the time. Feel free to propose something better :)

Under ‘simple’ are three operators:

 *   SingleInputOutput: abstracts the input and output ports (named ‘input’ and ‘output’) so that a user only has to implement a ‘process’ method.
 *   SingleInputMultiOutput: like above, but the return value of the ‘process’ method is emitted to N output ports, where N defaults to 2.
 *   MultiInputSingleOutput: N inputs are mapped into a single ‘process’ method with a single output port, where N defaults to 2.
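
As a rough illustration of the ‘simple’ operators above, here is a minimal plain-Java sketch. It does not use the Apex port classes and is not the code from the linked branch: SimplePort, the *Sketch class names, and the input(...) method are hypothetical stand-ins, and emitting the same return value to every one of the N ports is my reading of the description.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an output port; it only records emitted tuples.
class SimplePort<T> {
    final List<T> emitted = new ArrayList<>();

    void emit(T tuple) {
        emitted.add(tuple);
    }
}

// Model of SingleInputOutput: the ports are fixed, the subclass supplies process().
abstract class SingleInputOutputSketch<I, O> {
    final SimplePort<O> output = new SimplePort<>();

    // Stands in for a tuple arriving on the 'input' port.
    void input(I tuple) {
        output.emit(process(tuple));
    }

    abstract O process(I tuple);
}

// Model of SingleInputMultiOutput: the return value goes to each of N ports (assumption).
abstract class SingleInputMultiOutputSketch<I, O> {
    final List<SimplePort<O>> outputs = new ArrayList<>();

    SingleInputMultiOutputSketch() {
        this(2); // N defaults to 2, as in the description
    }

    SingleInputMultiOutputSketch(int n) {
        for (int i = 0; i < n; i++) {
            outputs.add(new SimplePort<O>());
        }
    }

    void input(I tuple) {
        O result = process(tuple);
        for (SimplePort<O> port : outputs) {
            port.emit(result);
        }
    }

    abstract O process(I tuple);
}

// What a developer would write: only the process method.
class UpperCaseSketch extends SingleInputOutputSketch<String, String> {
    @Override
    String process(String tuple) {
        return tuple.toUpperCase();
    }
}

class SimpleOperatorDemo {
    public static void main(String[] args) {
        UpperCaseSketch op = new UpperCaseSketch();
        op.input("apex");
        System.out.println(op.output.emitted); // prints [APEX]
    }
}
```

The point of the pattern is that UpperCaseSketch contains no port declarations at all; everything port-related lives in the base class.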

Under ‘complex’ are four operators:

 *   SingleInputListOutput: a single input port and ‘process’ method, where the return value of the ‘process’ method is a list of values and each value in the list goes to the matching one of the N output ports, with N defaulting to 2.
 *   DirectMultiInputOutput: maps N inputs to N outputs processed under a single ‘process’ method, with N defaulting to 2.
 *   AllWayMultiInputOutput: maps N inputs to M outputs such that the ‘process’ method is called for each input and its return value is sent to each of the M output ports, with M and N defaulting to 2.
 *   AllWayMultiInputListOutput: like above except that, instead of the ‘process’ return value being emitted to each of the M output ports, the return value is a list with each element emitted to a different output port. Concretely, v[0] => O[0], v[1] => O[1], etc., where v[] is the list of values from the ‘process’ method and O[] is the array of output ports.
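
The v[i] => O[i] mapping described above can be sketched in plain Java as follows. This is only an illustration under assumptions: ListPort, ListOutputSketch and SplitSignSketch are hypothetical names, and throwing an IllegalStateException when the list length differs from the number of ports is just one possible way to handle that case, not what the branch does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for an output port; it only records emitted tuples.
class ListPort<T> {
    final List<T> emitted = new ArrayList<>();

    void emit(T tuple) {
        emitted.add(tuple);
    }
}

// Model of the list-output operators: element i of the result goes to port i.
abstract class ListOutputSketch<I, O> {
    final List<ListPort<O>> outputs = new ArrayList<>();

    ListOutputSketch(int n) {
        for (int i = 0; i < n; i++) {
            outputs.add(new ListPort<O>());
        }
    }

    // Stands in for a tuple arriving on an input port.
    void input(I tuple) {
        List<O> values = process(tuple);
        if (values.size() != outputs.size()) {
            // Assumption: fail fast on a length mismatch.
            throw new IllegalStateException(
                "process returned " + values.size() + " values for " + outputs.size() + " ports");
        }
        for (int i = 0; i < values.size(); i++) {
            outputs.get(i).emit(values.get(i)); // v[i] => O[i]
        }
    }

    abstract List<O> process(I tuple);
}

// Example subclass: port 0 receives |x|, port 1 receives -|x|.
class SplitSignSketch extends ListOutputSketch<Integer, Integer> {
    SplitSignSketch() {
        super(2);
    }

    @Override
    List<Integer> process(Integer tuple) {
        return Arrays.asList(Math.abs(tuple), -Math.abs(tuple));
    }
}

class ListOutputDemo {
    public static void main(String[] args) {
        SplitSignSketch op = new SplitSignSketch();
        op.input(-3);
        System.out.println(op.outputs.get(0).emitted + " " + op.outputs.get(1).emitted);
    }
}
```

Other policies for the mismatch case (dropping extra values, or leaving some ports without a tuple) would fit the same structure; the check is isolated in input(...) for that reason.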

Like I said, I’m still working through the tests and error cases (say, where v[].len != O[].len), but I’d love to get feedback on everything so far! Also, I forgot to mention above that this work is closely related to MLHR-1915 and will be its base, so that we can build higher-level operators such as ‘map’, ‘filter’, ‘reduce’, ‘join’, etc. Thoughts?
________________________________________________________

The information contained in this e-mail is confidential and/or proprietary to Capital One and/or its affiliates and may only be used solely in performance of work or services for Capital One. The information transmitted herewith is intended only for use by the individual or entity to which it is addressed. If the reader of this message is not the intended recipient, you are hereby notified that any review, retransmission, dissemination, distribution, copying or other use of, or taking of any action in reliance upon this information is strictly prohibited. If you have received this communication in error, please contact the sender and delete the material from your computer.

Re: Simple Operators within Malhar (MLHR-1914)

Posted by Timothy Farkas <ti...@datatorrent.com>.

Hey Siyuan and Brennon,

Just an idea, not sure how relevant it is to the work you guys are doing. I
have a use case for alerts where I want a proxy operator to receive an
alert and then, based on the communication channels listed in the alert,
forward it to one or more operators. Currently, to do this I have to hard
code output ports in the operator, then add custom logic to forward the
alert to the correct operator. Later, if I want to add a new output, I have
to add a new output port and modify my operator logic. It would be nice if
I could implement a mapping function in the proxy operator once, which
dictates which stream ID a tuple goes to, and at the DAG level just connect
outputs to the proxy operator without needing to declare output ports in
it. Would this fall in the scope of the work being done on the operators
and API?
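
To make the idea concrete, here is a small plain-Java sketch of such a proxy. Everything in it is hypothetical (AlertSketch, RoutingProxySketch, connect, and the plain lists standing in for downstream operators); it is not an existing Apex API, and silently dropping tuples routed to an unconnected stream ID is only an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical alert tuple carrying the channels it should be sent to.
class AlertSketch {
    final String message;
    final List<String> channels;

    AlertSketch(String message, List<String> channels) {
        this.message = message;
        this.channels = channels;
    }
}

// Proxy with no declared output ports: a mapping function picks stream IDs per tuple.
class RoutingProxySketch<T> {
    private final Function<T, Collection<String>> router;
    private final Map<String, List<T>> streams = new HashMap<>();

    RoutingProxySketch(Function<T, Collection<String>> router) {
        this.router = router;
    }

    // Stands in for DAG-level wiring: attach a downstream sink under a stream ID.
    List<T> connect(String streamId) {
        return streams.computeIfAbsent(streamId, id -> new ArrayList<T>());
    }

    // Stands in for a tuple arriving on the proxy's input port.
    void input(T tuple) {
        for (String streamId : router.apply(tuple)) {
            List<T> sink = streams.get(streamId);
            if (sink != null) {
                sink.add(tuple);
            }
            // Assumption: tuples for unconnected stream IDs are dropped.
        }
    }
}

class RoutingProxyDemo {
    public static void main(String[] args) {
        RoutingProxySketch<AlertSketch> proxy =
            new RoutingProxySketch<AlertSketch>(alert -> alert.channels);
        List<AlertSketch> email = proxy.connect("email");
        List<AlertSketch> sms = proxy.connect("sms");
        proxy.input(new AlertSketch("disk full", Arrays.asList("email")));
        System.out.println(email.size() + " " + sms.size());
    }
}
```

Adding a new channel here means one more connect call at wiring time; the mapping function and the proxy itself stay unchanged.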

Thanks,
Tim

On Fri, Dec 11, 2015 at 8:44 AM, Siyuan Hua <si...@datatorrent.com> wrote:

> I think the work is different.
>
> We need the work of general operator (ie one to one, one to many, many to
> one, and many to many) to ease the construction of DAG. That is what
> Brennon will work on.
>
> I will work on, I would say another abstraction layer on top of operators
> and ports, which is stream.  The ticket is here (
> https://malhar.atlassian.net/browse/MLHR-1939)
> A stream could be a bunch of operators and ports chained together. And a
> DAG can be expressed by num of streams.
> A good example is flink stream API
> https://flink.apache.org/news/2015/02/09/streaming-example.html
>
> And stream becomes a standard concept as Java 8 introduced the Stream api
> https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html
>
> In some case, stream is easier to understand and less verbose.
>
> It's still at the design phase.  When I collect enough information, I will
> start another thread to talk more about stream API and collect feedbacks
> from there.

Re: Simple Operators within Malhar (MLHR-1914)

Posted by Siyuan Hua <si...@datatorrent.com>.

I think the work is different.

We need the general operators (i.e. one to one, one to many, many to
one, and many to many) to ease the construction of a DAG. That is what
Brennon will work on.

I will work on what I would call another abstraction layer on top of
operators and ports: the stream. The ticket is here
(https://malhar.atlassian.net/browse/MLHR-1939).
A stream could be a bunch of operators and ports chained together, and a
DAG can be expressed as a number of streams.
A good example is the Flink streaming API:
https://flink.apache.org/news/2015/02/09/streaming-example.html

Streams have also become a standard concept since Java 8 introduced the
Stream API:
https://docs.oracle.com/javase/8/docs/api/java/util/stream/Stream.html

In some cases, a stream is easier to understand and less verbose.

It's still at the design phase. When I have collected enough information, I
will start another thread to talk more about the stream API and collect
feedback there.
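
For readers who have not used it, the chained style of the Java 8 Stream API linked above looks roughly like the sketch below. It runs on an in-memory list with java.util.stream and only illustrates the chaining; it is not the proposed Apex stream API, and StreamChainSketch and wordLengths are made-up names.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class StreamChainSketch {
    // Split lines into words, drop empty strings, and map each word to its length.
    static List<Integer> wordLengths(List<String> lines) {
        return lines.stream()
            .flatMap(line -> Arrays.stream(line.split(" ")))
            .filter(word -> !word.isEmpty())
            .map(String::length)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(wordLengths(Arrays.asList("apex malhar", "dag"))); // prints [4, 6, 3]
    }
}
```

Each call in the chain plays the role that a separate operator and the stream connecting it would play in a hand-built DAG.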


On Wed, Dec 9, 2015 at 10:44 PM, York, Brennon <Br...@capitalone.com>
wrote:

> I see the goals as twofold.
>
> First, to abstract away what an app developer needs to write to be
> successful (ie input and output ports) and to provide a common interface
> for accessing such (ie all input and output ports and titled "input" and
> "output" respectively).
>
> Second, to use these sets of operator processing primitives (ie one to
> one, one to many, many to one, and many to many) to build a suite of
> functional operators such as 'map', 'reduce', 'groupBy', etc. to, again,
> abstract away what is necessary for the developer to write a solid apex
> application.
>
> I see this benefitting the community as a whole in that it allows Apex to
> build higher level tools and operators to ease the burden off the
> application developer. It is great that someone can define their own input
> and output ports, but is it necessary in 90% of the cases? I know
> personally from applications we've built that having these design patterns
> makes it easier as we internally developed versions of SingleInputOutput
> and SingleInputMultiOutput exactly for that reason.
>
> Does that answer the question and/or clear things up? Happy to discuss
> further :)

RE: Simple Operators within Malhar (MLHR-1914)

Posted by "York, Brennon" <Br...@capitalone.com>.

I see the goals as twofold.

First, to abstract away what an app developer needs to write to be successful (i.e. input and output ports) and to provide a common interface for accessing them (i.e. all input and output ports are titled "input" and "output" respectively).

Second, to use these sets of operator processing primitives (i.e. one to one, one to many, many to one, and many to many) to build a suite of functional operators such as 'map', 'reduce', 'groupBy', etc. to, again, abstract away what is necessary for the developer to write a solid Apex application.

I see this benefiting the community as a whole in that it allows Apex to build higher-level tools and operators to ease the burden on the application developer. It is great that someone can define their own input and output ports, but is it necessary in 90% of the cases? I know personally from applications we've built that having these design patterns makes it easier, as we internally developed versions of SingleInputOutput and SingleInputMultiOutput exactly for that reason.

Does that answer the question and/or clear things up? Happy to discuss further :)

-----Original Message-----
From: Thomas Weise [thomas@datatorrent.com]
Sent: Wednesday, December 09, 2015 07:30 PM Eastern Standard Time
To: dev@apex.incubator.apache.org
Subject: Re: Simple Operators within Malhar (MLHR-1914)

Hi Brennon,

What is the goal here? Make it easier for someone to build an application
or make it easier to write an operator or both? Is this for custom operator
development?

Siyuan was also looking at the higher level API, from an application
developer's perspective.

For the application developer, it should not matter how the operators were
written, how they are connected should be hidden by the API. That will be
important since we already have many operators that we want to reuse, such
as join or the adapters.

Thomas


Re: Simple Operators within Malhar (MLHR-1914)

Posted by Thomas Weise <th...@datatorrent.com>.

Hi Brennon,

What is the goal here? Make it easier for someone to build an application
or make it easier to write an operator or both? Is this for custom operator
development?

Siyuan was also looking at the higher level API, from an application
developer's perspective.

For the application developer, it should not matter how the operators were
written; how they are connected should be hidden by the API. That will be
important since we already have many operators that we want to reuse, such
as join or the adapters.

Thomas


On Wed, Dec 9, 2015 at 1:42 PM, York, Brennon <Br...@capitalone.com>
wrote:

> All, I’ve been working on the JIRA ticket MLHR-1914 (at
> https://malhar.atlassian.net/projects/MLHR/issues/MLHR-1914?filter=allopenissues)
> and I wanted to shoot this out to describe what I’ve been doing and get
> feedback now that its in a state of something that we can discuss ;)
>
> Before going into depth here is the code on my local repo:
>
> https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/complex
>
> https://github.com/brennonyork/incubator-apex-malhar/tree/MLHR-1914/library/src/main/java/com/datatorrent/lib/simple
> The tests are in the same respective test directory.
>
> So, the biggest impetus for this JIRA is that there should be a set of
> operators that 1. standardize the input and output ports and 2. make it
> very simple for a developer to merely implement a process method and forget
> the rest. Given all of this I found that there were two sets of operators
> based on the complexity of ports and how they mapped to each other. I gave
> them the package names ‘simple’ and ‘complex’ for lack of a better idea at
> the time. Feel free to propose something better :)
>
> Under ‘simple’ are three operators:
>
>  *   SingleInputOutput: This abstracts the input and output port (defined
> as ‘input’ and ‘output’) and merely allows a user to implement a process
> method.
>  *   SingleInputMultiOutput: Like above, but the return value from the
> ‘process’ method is emitted to N output ports where N defaults to 2.
>  *   MultiInputSingleOutput: N inputs are mapped into a single ‘process’
> method with a single output port with N defaulting to 2.
>
> Under ‘complex’ are four operators:
>
>  *   SingleInputListOutput: a single input port and ‘process’ method where
> the return value of the ‘process’ method is a list of values with each
> value in the array matching the N output ports with N defaulting to 2.
>  *   DirectMultiInputOutput: This maps N inputs to N outputs processed
> under a single ‘process’ method with N defaulting to 2.
>  *   AllWayMultiInputOutput: maps N inputs to M outputs such that, for
> each input the ‘process’ method is called and, with the return value of the
> process method, it is sent to each of the M output ports with M and N
> defaulting to 2.
>  *   AllWayMultiInputListOutput: like above except that, instead of having
> the ‘process’ method return value emit to each of the M output ports, the
> return value from ‘process’ is a list with each element in the list
> emitting to a different output port. Concretely, v[0] => O[0], v[1] =>
> O[1], etc. where v[] is the array of values from the ‘process’ method and
> O[] is the array of output ports.
>
> Like I said I’m still working through the test and error cases (say where
> v[].len != O[].len) although I’d love to get feedback on everything thus
> far! Also, forgot to mention above, but this work is heavily related and
> will be the base of MLHR-1915 whereby we can build higher level operators
> such as ‘map’, ‘filter’, ‘reduce’, ‘join’, etc. Thoughts?