Posted to dev@calcite.apache.org by Jesse Yates <je...@gmail.com> on 2015/07/21 17:58:31 UTC

streaming support for infinite streams

Hi,

I've only been using Calcite for a short while and am trying to hook up my
own streaming table. The problem I'm running into seems to be, at its core,
an impedance mismatch.

Streams, by their very nature, are expected to be infinite (this is what
Julian is getting at in the stream tutorial
<http://calcite.incubator.apache.org/docs/stream.html>). As such, they
should send incremental results along to the ResultSet.

However, when you run a simple query (e.g. select stream * from orders) and
the StreamableTable returns an Enumerable<Object[]> over an infinite
stream built with the standard Linq4j tools, you quickly end up trying to
store all the values of the stream in a ListSink in a TableScanNode.
<https://github.com/apache/incubator-calcite/blob/3e50232b681e8dadb921580ee6f3e0376dd0f664/core/src/main/java/org/apache/calcite/interpreter/TableScanNode.java#L73>

It seems like the TableScanNode needs to be made stream aware and the sink
needs to forward results onward.
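
To make the mismatch concrete, here is a minimal sketch in plain Java. It does not use Calcite; SinkSketch, infiniteOrders, bufferAll and forward are hypothetical names invented for illustration, and the cap/limit parameters exist only so the demo terminates. It contrasts a sink that buffers every row into a list with one that hands each row onward as it arrives.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical stand-ins for illustration; none of these are Calcite classes. */
final class SinkSketch {
  /** An endless source of rows, standing in for an infinite stream. */
  static Iterator<Object[]> infiniteOrders() {
    return new Iterator<Object[]>() {
      private int id = 0;
      @Override public boolean hasNext() { return true; } // the stream never ends
      @Override public Object[] next() { return new Object[] {id++, "order"}; }
    };
  }

  /**
   * List-building approach: every row is retained before anything is returned.
   * Without the artificial cap this loop would never finish on an endless source.
   */
  static List<Object[]> bufferAll(Iterator<Object[]> source, int cap) {
    List<Object[]> rows = new ArrayList<>();
    while (source.hasNext() && rows.size() < cap) {
      rows.add(source.next());
    }
    return rows;
  }

  /** Forwarding approach: each row goes downstream as soon as it is read. */
  static void forward(Iterator<Object[]> source, Consumer<Object[]> downstream, int limit) {
    for (int i = 0; i < limit && source.hasNext(); i++) {
      downstream.accept(source.next());
    }
  }

  public static void main(String[] args) {
    // Buffering: memory use grows with the number of rows read.
    System.out.println("buffered rows: " + bufferAll(infiniteOrders(), 5).size());
    // Forwarding: rows are seen one at a time and nothing is retained.
    forward(infiniteOrders(), row -> System.out.println("row id " + row[0]), 3);
  }
}
```

With forwarding, the consumer sees the first row immediately and memory use stays flat; with buffering, nothing is visible until the loop ends, which on a real infinite stream is never.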

Here is a simple test
<https://github.com/jyates/incubator-calcite/commit/7111a0ed8456a567bd846660485bf509d77614fc>
that verifies that an infinite stream keeps adding data to the ListSink
indefinitely (set a breakpoint in TableScanNode to see it in action).

As I mentioned, I'm still pretty new to Calcite, so any pointers to my
being completely wrong would be much appreciated (or I'm happy to follow up
in a JIRA if this looks like a bug).

Thanks,
Jesse Yates

Re: streaming support for infinite streams

Posted by Jesse Yates <je...@gmail.com>.
Thanks Julian! Filed: https://issues.apache.org/jira/browse/CALCITE-809

On Tue, Jul 21, 2015 at 12:50 PM Julian Hyde <jh...@apache.org> wrote:

> Ah yes. You're hitting the interpreter's "cheap and dirty"
> implementation of TableScan. I made the interpreter the simplest thing
> that could possibly work, so I made every operator build a list. (I
> know, I know. Enumerable uses iterators, and other implementations do
> even better. But I wanted to fit it into one page of code.)
>
> Can you log a jira case please?
>
> The solution will be either to fix the interpreter to use iterators
> (or similar) rather than lists, or to recognize that a query is
> infinite and not use the interpreter.
>
> Julian
>
>

Re: streaming support for infinite streams

Posted by Julian Hyde <jh...@apache.org>.
Ah yes. You're hitting the interpreter's "cheap and dirty"
implementation of TableScan. I made the interpreter the simplest thing
that could possibly work, so I made every operator build a list. (I
know, I know. Enumerable uses iterators, and other implementations do
even better. But I wanted to fit it into one page of code.)

Can you log a jira case please?

The solution will be either to fix the interpreter to use iterators
(or similar) rather than lists, or to recognize that a query is
infinite and not use the interpreter.
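
As a rough illustration of the first option, here is a minimal sketch of pull-based operators in plain Java. It is not the interpreter's code and does not use Calcite; IteratorOperators, scan, filter and take are hypothetical names. Each operator wraps its input iterator and produces a row only when asked, so no operator ever holds the whole input.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Predicate;

/** Hypothetical pull-based operators; an illustration, not the Calcite interpreter. */
final class IteratorOperators {
  /** Endless scan producing 0, 1, 2, ... on demand. */
  static Iterator<Integer> scan() {
    return new Iterator<Integer>() {
      private int value = 0;
      @Override public boolean hasNext() { return true; }
      @Override public Integer next() { return value++; }
    };
  }

  /** Lazy filter: reads from its input only when a row is requested. */
  static <T> Iterator<T> filter(Iterator<T> input, Predicate<T> predicate) {
    return new Iterator<T>() {
      private T pending;
      private boolean hasPending;

      @Override public boolean hasNext() {
        // Advance the input until a matching row is found or the input ends.
        while (!hasPending && input.hasNext()) {
          T candidate = input.next();
          if (predicate.test(candidate)) {
            pending = candidate;
            hasPending = true;
          }
        }
        return hasPending;
      }

      @Override public T next() {
        if (!hasNext()) {
          throw new NoSuchElementException();
        }
        hasPending = false;
        T row = pending;
        pending = null;
        return row;
      }
    };
  }

  /** Consumer that pulls only the first n rows, then stops asking. */
  static <T> List<T> take(Iterator<T> input, int n) {
    List<T> out = new ArrayList<>();
    while (out.size() < n && input.hasNext()) {
      out.add(input.next());
    }
    return out;
  }

  public static void main(String[] args) {
    // The scan is endless, yet the pipeline returns as soon as four rows are taken.
    System.out.println(take(filter(scan(), x -> x % 2 == 0), 4));
  }
}
```

One caveat of this sketch: a filter whose predicate never matches would still spin forever on an endless input, so laziness alone does not replace the second option of recognizing infinite queries.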

Julian

