
Posted to dev@flink.apache.org by Fabian Hueske <fh...@gmail.com> on 2015/02/03 10:39:03 UTC

Re: Design Question in Expression API

I am also +1 for hiding the internals.
Having conversion functions from and to DataSet sounds like the way to go
to me.


2015-01-31 11:04 GMT+01:00 Aljoscha Krettek <al...@apache.org>:

> Yes, that's exactly my reasoning for wanting to hide it.
>
> On Sat, Jan 31, 2015 at 10:32 AM, Stephan Ewen <se...@apache.org> wrote:
> > My first intuition is to not expose the row data type. If we add columnar
> > execution later, there may never be a Row data type at runtime (cf. the
> > paper on the HyPer runtime engine).
> >
> > For these declarative operations, I think it is a big advantage to keep the
> > underpinnings strictly separate so we can change the execution model.
> >
> > Also, I think that explicit switches between the logical and physical
> > abstraction (switching from class type to logical row type and vice versa)
> > make things more transparent to the user. As an example: a filter in a
> > logical query expression may be pushed down, whereas a filter defined as a
> > UDF on a physical type is not pushed down.
>

Re: Design Question in Expression API

Posted by Max Michels <ma...@data-artisans.com>.

If we want to have a tight integration with our existing API, we have
to hide the results of the expressions behind a wrapper. This enables
us to change the internal implementation at any time and to support
future Flink API changes and features.

+1 for not directly exposing the results as a row DataSet.
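
To make the idea concrete, here is a rough, self-contained Java sketch of
such a wrapper. It is not Flink code: ExpressionSketch, ExpressionResult,
FieldEquals, fromDataSet, toDataSet and Person are hypothetical names, and a
plain java.util.List stands in for a DataSet. It only illustrates the two
points from the thread: the row layout stays private behind explicit
conversion functions, and a logical filter is recorded as inspectable data
(so a planner could move it), unlike an opaque UDF on the physical type.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.function.Function;

public class ExpressionSketch {

    /** Hypothetical user-facing (physical) type. */
    static final class Person {
        final String name;
        final int age;
        Person(String name, int age) { this.name = name; this.age = age; }
    }

    /** Hypothetical logical predicate: plain data, so the system can inspect it. */
    static final class FieldEquals {
        final int fieldIndex;
        final Object value;
        FieldEquals(int fieldIndex, Object value) {
            this.fieldIndex = fieldIndex;
            this.value = value;
        }
    }

    /** Hypothetical wrapper; the Object[] row layout never leaks to the user. */
    static final class ExpressionResult {
        private final List<Object[]> rows;            // internal detail, free to change
        private final List<FieldEquals> filters;      // recorded, not yet executed

        private ExpressionResult(List<Object[]> rows, List<FieldEquals> filters) {
            this.rows = rows;
            this.filters = filters;
        }

        /** Explicit switch from the physical type to the logical representation. */
        static <T> ExpressionResult fromDataSet(List<T> data, Function<T, Object[]> toFields) {
            List<Object[]> rows = new ArrayList<>();
            for (T element : data) {
                rows.add(toFields.apply(element));
            }
            return new ExpressionResult(rows, new ArrayList<>());
        }

        /** Records a logical filter; a planner could reorder or push it down. */
        ExpressionResult filter(FieldEquals predicate) {
            List<FieldEquals> next = new ArrayList<>(filters);
            next.add(predicate);
            return new ExpressionResult(rows, next);
        }

        /** Number of filters the system can see and reason about. */
        int inspectableFilterCount() {
            return filters.size();
        }

        /** Explicit switch back: evaluates the recorded filters, then converts. */
        <T> List<T> toDataSet(Function<Object[], T> fromFields) {
            List<T> out = new ArrayList<>();
            for (Object[] row : rows) {
                boolean keep = true;
                for (FieldEquals f : filters) {
                    keep = keep && Objects.equals(row[f.fieldIndex], f.value);
                }
                if (keep) {
                    out.add(fromFields.apply(row));
                }
            }
            return out;
        }
    }

    public static void main(String[] args) {
        List<Person> people = new ArrayList<>();
        people.add(new Person("alice", 30));
        people.add(new Person("bob", 41));

        // Logical filter: stored as data inside the wrapper.
        ExpressionResult result = ExpressionResult
                .fromDataSet(people, p -> new Object[] {p.name, p.age})
                .filter(new FieldEquals(1, 30));
        List<Person> viaExpression =
                result.toDataSet(f -> new Person((String) f[0], (Integer) f[1]));

        // UDF on the physical type: an opaque lambda the system cannot inspect.
        long viaUdf = people.stream().filter(p -> p.age == 30).count();

        System.out.println(viaExpression.size() + " " + viaUdf);
    }
}
```

Since callers only ever see fromDataSet, filter and toDataSet, the private
row list could later be replaced by a columnar layout without touching user
code, which is the flexibility argued for above.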
