Posted to dev@storm.apache.org by Jungtaek Lim <ka...@gmail.com> on 2016/08/30 08:16:58 UTC

Question regarding design of the Trident (operations)

Hi devs,

While implementing some features of Storm SQL on top of Trident, I realized
that no Trident operation can add the output fields of a Function and remove
the existing fields in a single operation.
'each' appends the function's output fields to the original input fields, and
'map' and 'flatMap' keep the same output fields as the input, so neither helps.
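
To make the field behaviour concrete, here is a minimal plain-Java sketch that models a tuple as an ordered map from field name to value. It is not Trident code: the class EachVersusMap and its each and map helpers are hypothetical stand-ins that only mimic the semantics described above (append for 'each', unchanged field names for 'map').

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class EachVersusMap {
    // Models 'each': all input fields survive and the function's field is appended.
    static Map<String, Object> each(Map<String, Object> tuple, String inputField,
                                    Function<Object, Object> fn, String outputField) {
        Map<String, Object> out = new LinkedHashMap<>(tuple);
        out.put(outputField, fn.apply(tuple.get(inputField)));
        return out;
    }

    // Models 'map' without output fields: values change, field names stay the same.
    static Map<String, Object> map(Map<String, Object> tuple,
                                   Function<List<Object>, List<Object>> fn) {
        List<String> names = new ArrayList<>(tuple.keySet());
        List<Object> values = fn.apply(new ArrayList<>(tuple.values()));
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            out.put(names.get(i), values.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("a", 1);
        in.put("b", 2);

        // 'each' can add "c" but cannot drop "a" or "b".
        System.out.println(each(in, "a", v -> (Integer) v * 10, "c")); // {a=1, b=2, c=10}

        // 'map' can change the values, but the fields are still named "a" and "b".
        System.out.println(map(in, vs -> Arrays.<Object>asList(vs.get(0), "x"))); // {a=1, b=x}
    }
}
```

In this model, neither helper alone can turn {a, b} into {c}, which is the gap the question is about.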

Is this a missing feature, or is there a reason for this design? Trident uses
TridentTupleView, so I guess it is related, but I haven't looked deeply into
Trident itself.
If anyone knows the history, please let me know.

Thanks,
Jungtaek Lim (HeartSaVioR)

P.S. It might help to explain why I'm looking for this feature. I'm trying
to collapse the each -> project -> each -> project chain below into a single
operation.
https://github.com/apache/storm/blob/master/external/sql/storm-sql-core/src/jvm/org/apache/storm/sql/compiler/backends/trident/TridentLogicalPlanCompiler.java#L164-L167
Please note that Trident doesn't allow duplicate field names, so a
temporary field name is necessary if we need to preserve existing fields.
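
The shape of such a chain can be sketched in plain Java. This is an illustrative guess at the pattern, not a copy of the linked compiler code: ChainModel, its each and project methods, and the field names "id", "name" and "tmp" are all hypothetical, and the model only mimics the rules mentioned above (append on 'each', no duplicate field names).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

public class ChainModel {
    // Hypothetical stand-in for a tuple: ordered field name -> value.
    private final Map<String, Object> fields;

    ChainModel(Map<String, Object> fields) {
        this.fields = new LinkedHashMap<>(fields);
    }

    // Models 'each': append one field; reusing an existing name is rejected.
    ChainModel each(String input, Function<Object, Object> fn, String output) {
        if (fields.containsKey(output)) {
            throw new IllegalArgumentException("duplicate field name: " + output);
        }
        Map<String, Object> out = new LinkedHashMap<>(fields);
        out.put(output, fn.apply(fields.get(input)));
        return new ChainModel(out);
    }

    // Models 'project': keep only the named fields, in the given order.
    ChainModel project(String... keep) {
        Map<String, Object> out = new LinkedHashMap<>();
        for (String name : keep) {
            out.put(name, fields.get(name));
        }
        return new ChainModel(out);
    }

    Map<String, Object> fields() {
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("id", 7);
        in.put("name", "storm");

        ChainModel result = new ChainModel(in)
            .each("name", v -> ((String) v).toUpperCase(), "tmp") // compute under a temporary name
            .project("id", "tmp")                                  // drop the old "name"
            .each("tmp", v -> v, "name")                           // copy back under the original name
            .project("id", "name");                                // drop the temporary field

        System.out.println(result.fields()); // {id=7, name=STORM}
    }
}
```

Four steps are needed here only because the new value has to end up under a name that already exists; an operation that replaces fields directly would reduce this to one step.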

Re: Question regarding design of the Trident (operations)

Posted by Jungtaek Lim <ka...@gmail.com>.
FYI: I just put together an implementation -
http://issues.apache.org/jira/browse/STORM-2072

It adds an additional parameter, outputFields, to map and flatMap. If
outputFields is specified, the output fields replace the original fields.
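
As a rough illustration of the intended semantics, here is a plain-Java model of a map that takes output fields. It is only a sketch under stated assumptions: MapWithOutputFields and its map helper are hypothetical, and the actual signature in the STORM-2072 patch may differ.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class MapWithOutputFields {
    // Models the described behaviour: the function's values are emitted under
    // outputFields, and none of the original fields are carried over.
    static Map<String, Object> map(Map<String, Object> tuple,
                                   Function<Map<String, Object>, List<Object>> fn,
                                   List<String> outputFields) {
        List<Object> values = fn.apply(tuple);
        if (values.size() != outputFields.size()) {
            throw new IllegalArgumentException(
                "expected " + outputFields.size() + " values, got " + values.size());
        }
        Map<String, Object> out = new LinkedHashMap<>();
        for (int i = 0; i < outputFields.size(); i++) {
            out.put(outputFields.get(i), values.get(i));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> in = new LinkedHashMap<>();
        in.put("id", 7);
        in.put("name", "storm");

        // Replace both fields in one step, reusing the original names.
        System.out.println(map(in,
            t -> Arrays.<Object>asList(t.get("id"), ((String) t.get("name")).toUpperCase()),
            Arrays.asList("id", "name"))); // {id=7, name=STORM}

        // Replace the whole tuple with a single new field.
        System.out.println(map(in,
            t -> Arrays.<Object>asList(((String) t.get("name")).length()),
            Arrays.asList("len"))); // {len=5}
    }
}
```

Because the output replaces the input fields wholesale, no temporary field name is needed in this model even when an output name matches an input name.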

- Jungtaek Lim (HeartSaVioR)


On Wed, Aug 31, 2016 at 10:27 AM, Jungtaek Lim <ka...@gmail.com> wrote:

> FYI: I just found that the Stream class has a TODO comment above the class:
> *// TODO: need to be able to replace existing fields with the function
> fields (like Cascading Fields.REPLACE)*
> It was written in Aug. 2012, as part of the initial import of the Trident
> source code.
>
> Judging from the TODO comment, this seems to be simply a missing feature.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
>
> On Wed, Aug 31, 2016 at 7:29 AM, Jungtaek Lim <ka...@gmail.com> wrote:
>
>> Hi Manu,
>>
>> I'm looking for a 1:1 tuple transformation, not aggregation. In other
>> words, I'm looking for something like 'V map(T, Function<T, V>)', which
>> seems to fit many use cases, but Trident doesn't support it yet, so I'd
>> like to find out why.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Wed, Aug 31, 2016 at 7:09 AM, Manu Zhang <ow...@gmail.com> wrote:
>>
>>> I think we have 'aggregate' in Trident, where the function output fields
>>> replace the input fields.
>>>
>>> Thanks,
>>> Manu Zhang

Re: Question regarding design of the Trident (operations)

Posted by Jungtaek Lim <ka...@gmail.com>.
FYI: I just found that the Stream class has a TODO comment above the class:
*// TODO: need to be able to replace existing fields with the function
fields (like Cascading Fields.REPLACE)*
It was written in Aug. 2012, as part of the initial import of the Trident
source code.

Judging from the TODO comment, this seems to be simply a missing feature.

Thanks,
Jungtaek Lim (HeartSaVioR)



Re: Question regarding design of the Trident (operations)

Posted by Jungtaek Lim <ka...@gmail.com>.
Hi Manu,

I'm looking for a 1:1 tuple transformation, not aggregation. In other words,
I'm looking for something like 'V map(T, Function<T, V>)', which seems to fit
many use cases, but Trident doesn't support it yet, so I'd like to find out why.

Thanks,
Jungtaek Lim (HeartSaVioR)


Re: Question regarding design of the Trident (operations)

Posted by Manu Zhang <ow...@gmail.com>.
I think we have 'aggregate' in Trident, where the function output fields
replace the input fields.

Thanks,
Manu Zhang
