Posted to dev@calcite.apache.org by Iñigo Mediavilla <im...@gmail.com> on 2017/11/22 21:16:27 UTC

Adding UnsafeRow to Calcite

Hello,

I'm using Calcite to provide a read-only SQL interface for a Java
service that keeps its data in memory. However, most of the fields are
primitive types, and exposing them in a ProjectableFilterableTable forces
boxing of the field values, since the scan method returns
Enumerable<Object[]>.
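
To make the boxing concern concrete, here is a minimal sketch that uses
only the JDK. The Trade class and the toRows helper are hypothetical
names chosen for illustration; they are not part of Calcite. The sketch
only mimics the shape of a scan that has to return Object[] rows.

```java
import java.util.ArrayList;
import java.util.List;

public class BoxingSketch {
    // Hypothetical in-memory record whose fields are all primitives.
    static final class Trade {
        final long id;
        final int quantity;
        final double price;

        Trade(long id, int quantity, double price) {
            this.id = id;
            this.quantity = quantity;
            this.price = price;
        }
    }

    // Packs each record into an Object[] row. Storing a primitive in an
    // Object[] slot autoboxes it into Long, Integer or Double.
    static List<Object[]> toRows(List<Trade> trades) {
        List<Object[]> rows = new ArrayList<>(trades.size());
        for (Trade t : trades) {
            rows.add(new Object[] {t.id, t.quantity, t.price});
        }
        return rows;
    }

    public static void main(String[] args) {
        List<Trade> trades = new ArrayList<>();
        trades.add(new Trade(1L, 10, 99.5));
        Object[] row = toRows(trades).get(0);
        for (Object cell : row) {
            // Each cell is a wrapper object, not a primitive.
            System.out.println(cell.getClass().getSimpleName());
        }
    }
}
```

Every row therefore costs one array plus one wrapper object per field,
unless the JVM happens to reuse a cached wrapper for small values.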

When I look at how ProjectableFilterableTable is used, it seems that for
every Object array Calcite generates a Row object that relies on an
array of Object internally.

I was wondering whether an UnsafeRow, similar to what Spark has
implemented, could be considered for Calcite, given the memory savings it
could bring and how, in some cases like mine, it could help avoid
unnecessary boxing / unboxing.

https://github.com/apache/spark/blob/master/sql/catalyst/src/main/java/org/apache/spark/sql/catalyst/expressions/UnsafeRow.java
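
As a rough illustration of the idea, the sketch below stores every field
of a row in one 8-byte slot of a long[] and exposes primitive getters, so
no wrapper objects are created. It is only loosely inspired by UnsafeRow
and is not Spark's implementation, which works on raw memory and also
handles nulls and variable-length data. PackedRow and its methods are
hypothetical names, not an existing Calcite or Spark API.

```java
public class PackedRowSketch {
    // Hypothetical fixed-width row: one 64-bit word per field, no nulls.
    static final class PackedRow {
        private final long[] words;

        PackedRow(int fieldCount) {
            this.words = new long[fieldCount];
        }

        void setLong(int field, long value) {
            words[field] = value;
        }

        void setInt(int field, int value) {
            words[field] = value; // widened to long, sign preserved
        }

        void setDouble(int field, double value) {
            words[field] = Double.doubleToRawLongBits(value);
        }

        long getLong(int field) {
            return words[field];
        }

        int getInt(int field) {
            return (int) words[field];
        }

        double getDouble(int field) {
            return Double.longBitsToDouble(words[field]);
        }
    }

    public static void main(String[] args) {
        PackedRow row = new PackedRow(3);
        row.setLong(0, 1L);
        row.setInt(1, -10);
        row.setDouble(2, 99.5);
        // Values come back as primitives; nothing is boxed on the way.
        System.out.println(row.getLong(0) + " " + row.getInt(1) + " " + row.getDouble(2));
    }
}
```

The caller has to know the type of each field, which is the trade-off for
dropping the per-cell wrapper objects.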

Kind regards,

Inigo Mediavilla

Re: Adding UnsafeRow to Calcite

Posted by txomin pelu <tx...@gmail.com>.
Thanks Enrico and sorry for using a different email address to reply.

I agree that an interface could be a good idea. I am far from an expert in
Calcite, but I may try to implement something along those lines just to see
how difficult it is. Does anyone else see other reasons for or against
Object[] as the base for Calcite rows?

Best,

Inigo

On Thu, Nov 23, 2017 at 10:04 PM, Enrico Olivelli <eo...@gmail.com>
wrote:

> On Wed, Nov 22, 2017, 22:28 Iñigo Mediavilla <im...@gmail.com> wrote:
>
> > Hello,
> >
> > I'm using Calcite to provide a read-only SQL interface for a Java
> > service that keeps its data in memory. However, most of the fields are
> > primitive types, and exposing them in a ProjectableFilterableTable forces
> > boxing of the field values, since the scan method returns
> > Enumerable<Object[]>.
> >
>
> I see another issue, maybe from a different perspective.
> Having to deal with Object[] means that you always have to unpack your
> record into this particular representation. For instance, in my system I
> have a page of data loaded in memory that contains a bunch of
> records/tuples.
> To feed Calcite I always have to unpack each record into an Object[],
> and because a projection usually has to be applied, a new Object[] is
> very often created as well. This is neither efficient nor GC friendly.
> A great enhancement might be to have some interface instead of the
> bare array.
> This would let the underlying implementation avoid deserializing cells
> into an Object[].
>
> As far as my limited experience of Calcite suggests, this change would be
> very difficult to introduce because it would break binary compatibility,
> as most of the Enumerable stuff is done with reflection.
>
> Just my 2 cents
> Enrico

Re: Adding UnsafeRow to Calcite

Posted by Enrico Olivelli <eo...@gmail.com>.
On Wed, Nov 22, 2017, 22:28 Iñigo Mediavilla <im...@gmail.com> wrote:

> Hello,
>
> I'm using Calcite to provide a read-only SQL interface for a Java
> service that keeps its data in memory. However, most of the fields are
> primitive types, and exposing them in a ProjectableFilterableTable forces
> boxing of the field values, since the scan method returns
> Enumerable<Object[]>.
>

I see another issue, maybe from a different perspective.
Having to deal with Object[] means that you always have to unpack your
record into this particular representation. For instance, in my system I
have a page of data loaded in memory that contains a bunch of
records/tuples.
To feed Calcite I always have to unpack each record into an Object[],
and because a projection usually has to be applied, a new Object[] is
very often created as well. This is neither efficient nor GC friendly.
A great enhancement might be to have some interface instead of the
bare array.
This would let the underlying implementation avoid deserializing cells
into an Object[].
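
A minimal sketch of what such an interface could look like, using only
the JDK. Row, PageRow and ProjectedRow are hypothetical names, and the
page layout (fixed-width records made of 8-byte longs) is an assumption
made to keep the example short. None of this is an existing Calcite API.

```java
import java.nio.ByteBuffer;

public class RowViewSketch {
    // Hypothetical row abstraction replacing the bare Object[].
    interface Row {
        int fieldCount();
        long getLong(int field);
    }

    // View over one record inside a page; cells are decoded on access only.
    static final class PageRow implements Row {
        private final ByteBuffer page;
        private final int offset;
        private final int fieldCount;

        PageRow(ByteBuffer page, int recordIndex, int fieldCount) {
            this.page = page;
            this.offset = recordIndex * fieldCount * Long.BYTES;
            this.fieldCount = fieldCount;
        }

        public int fieldCount() {
            return fieldCount;
        }

        public long getLong(int field) {
            if (field < 0 || field >= fieldCount) {
                throw new IndexOutOfBoundsException("field " + field);
            }
            return page.getLong(offset + field * Long.BYTES);
        }
    }

    // Projection as a view: remaps field indexes instead of copying cells.
    static final class ProjectedRow implements Row {
        private final Row input;
        private final int[] projects;

        ProjectedRow(Row input, int[] projects) {
            this.input = input;
            this.projects = projects;
        }

        public int fieldCount() {
            return projects.length;
        }

        public long getLong(int field) {
            return input.getLong(projects[field]);
        }
    }

    public static void main(String[] args) {
        // A page holding two records of three long fields each.
        ByteBuffer page = ByteBuffer.allocate(2 * 3 * Long.BYTES);
        for (long v = 1; v <= 6; v++) {
            page.putLong(v);
        }
        Row second = new PageRow(page, 1, 3);
        Row projected = new ProjectedRow(second, new int[] {2, 0});
        System.out.println(projected.getLong(0)); // 6
        System.out.println(projected.getLong(1)); // 4
    }
}
```

Here the projection allocates one small view object per row instead of a
new Object[] plus boxed cells, and fields that are never read are never
decoded.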

As far as my limited experience of Calcite suggests, this change would be
very difficult to introduce because it would break binary compatibility,
as most of the Enumerable stuff is done with reflection.

Just my 2 cents
Enrico


-- Enrico Olivelli