Posted to dev@apex.apache.org by Chinmay Kolhatkar <ch...@datatorrent.com> on 2017/02/27 06:33:35 UTC

Proposal: CompositeAccumulation for Windowed Operator

Dear Community,

Currently we have a separate accumulation for each type of aggregation.
If one wants to apply more than one accumulation in a single stage of the
Windowed Operator, that is not possible.

I want to propose a "CompositeAccumulation" in which more than one
accumulation can be configured, and which relies on those
accumulations to generate the final result/output.

The output can take either of two forms:
1. Just the list of outputs, with AccumulationTags as identifiers.
2. A merge of the results of multiple accumulations using some user-defined
logic.
     For example, in the aggregation case, the input to this accumulation can
be a POJO containing NumberOfOrders as a field, and in the output one might
need to generate a final (single) POJO that contains the results of multiple
accumulations, such as SUM and COUNT on NumberOfOrders, as different fields
of the outgoing POJO.
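
As a rough sketch of the second form, the per-accumulation results could be
merged into one outgoing POJO by user-defined logic. Everything below is
hypothetical: the OrderStats class, its field names, the merge method, and the
string tags are illustrative stand-ins and not an existing Malhar API. The
sub-results are simply assumed to arrive as a map from a tag to a value.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: merging the outputs of several accumulations
// (here SUM and COUNT over NumberOfOrders) into one outgoing POJO.
class MergeSketch {

  // Hypothetical outgoing POJO; one field per accumulation result.
  static class OrderStats {
    long sumOfOrders;
    long countOfOrders;
  }

  // User-defined merge logic: picks each sub-result by its tag.
  // The string tags stand in for the proposed AccumulationTags.
  static OrderStats merge(Map<String, Long> subOutputs) {
    OrderStats stats = new OrderStats();
    stats.sumOfOrders = subOutputs.get("SUM");
    stats.countOfOrders = subOutputs.get("COUNT");
    return stats;
  }

  // Computes SUM and COUNT in one pass over the inputs, then merges them.
  static OrderStats aggregate(long[] numberOfOrders) {
    long sum = 0;
    long count = 0;
    for (long n : numberOfOrders) {
      sum += n;
      count++;
    }
    Map<String, Long> subOutputs = new LinkedHashMap<>();
    subOutputs.put("SUM", sum);     // form 1: outputs identified by tag
    subOutputs.put("COUNT", count);
    return merge(subOutputs);       // form 2: merged into a single POJO
  }

  public static void main(String[] args) {
    OrderStats stats = aggregate(new long[] {10, 20, 30});
    System.out.println(stats.sumOfOrders + " " + stats.countOfOrders);
  }
}
```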

I particularly see a use for this in multiple aggregation, which we would
like to support in the SQL on Apex integration.

Please share your thoughts on the same.

Thanks,
Chinmay.

Re: Proposal: CompositeAccumulation for Windowed Operator

Posted by Chinmay Kolhatkar <ch...@datatorrent.com>.
Thanks, Bright. I've reviewed your PR. It looks good; just a minor change is
required. Please see my comment there.

On Mon, Feb 27, 2017 at 11:26 PM, Bright Chen <br...@datatorrent.com>
wrote:

> A JIRA has been created: https://issues.apache.org/jira/browse/APEXMALHAR-2428

Re: Proposal: CompositeAccumulation for Windowed Operator

Posted by Bright Chen <br...@datatorrent.com>.
A JIRA has been created: https://issues.apache.org/jira/browse/APEXMALHAR-2428


On Mon, Feb 27, 2017 at 9:53 AM, Bright Chen <br...@datatorrent.com> wrote:

> I think Chinmay's proposal could make applications clearer and improve
> performance, since locating the key/window takes most of the time.

Re: Proposal: CompositeAccumulation for Windowed Operator

Posted by Bright Chen <br...@datatorrent.com>.
I think Chinmay's proposal could make applications clearer and improve
performance, since locating the key/window takes most of the time.

A suggested usage for CompositeAccumulation could be as follows:

    // Sample code showing how to add sub-accumulations
    CompositeAccumulation<Long> accumulations = new CompositeAccumulation<>();
    AccumulationTag sumTag = accumulations.addAccumulation((Accumulation)new SumAccumulation());
    AccumulationTag countTag = accumulations.addAccumulation((Accumulation)new Count());
    AccumulationTag maxTag = accumulations.addAccumulation(new Max());
    AccumulationTag minTag = accumulations.addAccumulation(new Min());

    // Sample code showing how to get each sub-accumulation's output
    accumulations.getSubOutput(sumTag, outputValues);
    accumulations.getSubOutput(countTag, outputValues);
    accumulations.getSubOutput(maxTag, outputValues);
    accumulations.getSubOutput(minTag, outputValues);
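
To make the idea behind the usage above concrete, here is a minimal,
self-contained sketch of how such a composite might work internally. It is
only an illustration under assumptions: SimpleAccumulation is a simplified
stand-in and not the Malhar Accumulation interface, Tag is a plain index
wrapper, and the actual implementation in the PR may differ. The point it
shows is that all sub-accumulations share one list of accumulated values, so
the key/window needs to be located only once per input.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed composite; not the actual Malhar API.
class CompositeSketch {

  // Simplified stand-in for an accumulation over Long inputs (an assumption).
  interface SimpleAccumulation {
    Long defaultValue();
    Long accumulate(Long accumulated, Long input);
  }

  static class Sum implements SimpleAccumulation {
    public Long defaultValue() { return 0L; }
    public Long accumulate(Long accumulated, Long input) { return accumulated + input; }
  }

  static class Count implements SimpleAccumulation {
    public Long defaultValue() { return 0L; }
    public Long accumulate(Long accumulated, Long input) { return accumulated + 1; }
  }

  static class Max implements SimpleAccumulation {
    public Long defaultValue() { return Long.MIN_VALUE; }
    public Long accumulate(Long accumulated, Long input) { return Math.max(accumulated, input); }
  }

  // Tag that identifies one sub-accumulation by its position.
  static final class Tag {
    final int index;
    Tag(int index) { this.index = index; }
  }

  static class Composite {
    private final List<SimpleAccumulation> subs = new ArrayList<>();

    // Registers a sub-accumulation and returns the tag used to read its output.
    Tag addAccumulation(SimpleAccumulation sub) {
      subs.add(sub);
      return new Tag(subs.size() - 1);
    }

    // One accumulated value per sub-accumulation, kept together in one list.
    List<Long> defaultAccumulatedValue() {
      List<Long> values = new ArrayList<>();
      for (SimpleAccumulation sub : subs) {
        values.add(sub.defaultValue());
      }
      return values;
    }

    // Feeds the input to every sub-accumulation in turn.
    List<Long> accumulate(List<Long> values, Long input) {
      for (int i = 0; i < subs.size(); i++) {
        values.set(i, subs.get(i).accumulate(values.get(i), input));
      }
      return values;
    }

    // Reads the output of one sub-accumulation by its tag.
    Long getSubOutput(Tag tag, List<Long> outputValues) {
      return outputValues.get(tag.index);
    }
  }

  public static void main(String[] args) {
    Composite accumulations = new Composite();
    Tag sumTag = accumulations.addAccumulation(new Sum());
    Tag countTag = accumulations.addAccumulation(new Count());
    Tag maxTag = accumulations.addAccumulation(new Max());

    List<Long> values = accumulations.defaultAccumulatedValue();
    for (long input : new long[] {3, 5, 1}) {
      values = accumulations.accumulate(values, input);
    }
    System.out.println("sum=" + accumulations.getSubOutput(sumTag, values));
    System.out.println("count=" + accumulations.getSubOutput(countTag, values));
    System.out.println("max=" + accumulations.getSubOutput(maxTag, values));
  }
}
```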


Thanks

Bright
