Posted to dev@calcite.apache.org by Ajay Babu Maguluri <aj...@6dtech.co.in> on 2019/07/18 16:44:13 UTC

Calcite JDBC-Spark Integration

Hello Calcite Team,

 

I have a requirement to query multiple data sources in a single SQL query. I
saw that Calcite provides this through its JDBC adapter, but I ran into some
challenges during execution:

 

1.      Execution (joins etc.) happens in memory, so it causes OOM
when run on big data.

2.      I saw there is also a Spark option. If we enable this option,
will execution happen on Spark?

3.      If the answer to 2 is yes: when I try that option, I get an
exception like this:

Caused by: java.lang.NullPointerException
       at CalciteProgram162944.bind(Unknown Source)
       at org.apache.calcite.jdbc.CalcitePrepare$CalciteSignature.enumerable(CalcitePrepare.java:355)
       at org.apache.calcite.jdbc.CalciteConnectionImpl.enumerable(CalciteConnectionImpl.java:314)
       at org.apache.calcite.jdbc.CalciteMetaImpl._createIterable(CalciteMetaImpl.java:506)
       at org.apache.calcite.jdbc.CalciteMetaImpl.createIterable(CalciteMetaImpl.java:497)
       at org.apache.calcite.avatica.AvaticaResultSet.execute(AvaticaResultSet.java:182)
       at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:64)
       at org.apache.calcite.jdbc.CalciteResultSet.execute(CalciteResultSet.java:43)
       at org.apache.calcite.avatica.AvaticaConnection$1.execute(AvaticaConnection.java:667)
       at org.apache.calcite.jdbc.CalciteMetaImpl.prepareAndExecute(CalciteMetaImpl.java:566)
       at org.apache.calcite.avatica.AvaticaConnection.prepareAndExecuteInternal(AvaticaConnection.java:675)
       at org.apache.calcite.avatica.AvaticaStatement.executeInternal(AvaticaStatement.java:156)
   ... 2 more

4.      Please also let me know how to avoid OOM when executing on big data.

 

Thanks and Regards

Ajay Babu Maguluri.

Re: Calcite JDBC-Spark Integration

Posted by Danny Chan <yu...@gmail.com>.

Hi, Ajay
AFAIK, calcite-spark is not really usable at the moment, and the default Enumerable execution holds all the data in memory, so it is no surprise that it triggers an OOM exception.

That means Calcite does not handle big data sets very well right now; the most common ways to use it are for query planning and optimization, or for querying small data sets.
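
To illustrate why fully in-memory execution runs out of heap on large inputs, here is a minimal sketch of a hash join that materializes one whole input before producing any output. It is not Calcite's actual code; the class name, the Row type, and the sample data are hypothetical and exist only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class InMemoryJoinSketch {
    /** Hypothetical row type: a join key plus one payload column. */
    public static final class Row {
        final int key;
        final String value;

        public Row(int key, String value) {
            this.key = key;
            this.value = value;
        }
    }

    /**
     * Equi-join on key. Every row of the build input is copied into a
     * HashMap first, so heap usage grows linearly with the build input.
     */
    public static List<String> hashJoin(List<Row> build, List<Row> probe) {
        Map<Integer, List<Row>> table = new HashMap<>();
        for (Row b : build) {
            table.computeIfAbsent(b.key, k -> new ArrayList<>()).add(b);
        }
        List<String> out = new ArrayList<>();
        for (Row p : probe) {
            // Look up all build rows with the same key as the probe row.
            for (Row b : table.getOrDefault(p.key, Collections.<Row>emptyList())) {
                out.add(b.value + "|" + p.value);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Row> customers = Arrays.asList(new Row(1, "alice"), new Row(3, "carol"));
        List<Row> orders = Arrays.asList(
            new Row(1, "order-a"), new Row(2, "order-b"), new Row(1, "order-c"));
        System.out.println(hashJoin(customers, orders));
    }
}
```

With two small lists this is harmless, but if the build input is a table with hundreds of millions of rows fetched over JDBC, the map alone can exceed the JVM heap.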

Best,
Danny Chan

Re: Calcite JDBC-Spark Integration

Posted by Stamatis Zampetakis <za...@gmail.com>.

Hi Ajay,

There is indeed an option to use Spark as the internal execution engine of
Calcite, but looking at the code it seems to be a work in progress, so the
OOM does not surprise me.
There are various places in the code indicating that the work is not yet
finished and, if I am not mistaken, nobody is actively working on this part of
the codebase.
Any contribution to this part is very welcome.
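
For reference, a minimal sketch of how that option is typically passed, assuming the option discussed in this thread is the `spark` connection property of the Calcite JDBC driver. The class name and the model path are hypothetical, and the connection call is left as a comment because it needs Calcite (and the calcite-spark module) on the classpath.

```java
import java.util.Properties;

public class SparkOptionSketch {
    /**
     * Builds the properties handed to the Calcite JDBC driver.
     * "model" points at a model file; "spark" asks for the Spark engine.
     */
    public static Properties connectionProperties(String modelPath, boolean useSpark) {
        Properties info = new Properties();
        info.setProperty("model", modelPath);
        info.setProperty("spark", Boolean.toString(useSpark));
        return info;
    }

    public static void main(String[] args) {
        // Hypothetical model path, for illustration only.
        Properties info = connectionProperties("/path/to/model.json", true);
        System.out.println(info);
        // With Calcite on the classpath the connection would be opened as:
        // Connection c = DriverManager.getConnection("jdbc:calcite:", info);
    }
}
```

Given the state of the Spark adapter described above, setting this property may well lead to failures such as the NullPointerException reported in this thread rather than to distributed execution.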

Best,
Stamatis
