Posted to dev@calcite.apache.org by Christian Tzolov <ct...@pivotal.io> on 2017/12/10 04:39:08 UTC

GroupBy for Indefinite SQL Streams

Following the CsvStreamingTableXXX and StreamTest.java examples, I've
implemented a naive Geode "streaming" adapter prototype [1].

If I understand correctly, the SQL stream implementation for querying
"indefinite" tables returns an "indefinite" ResultSet, so one can keep
calling ResultSet.next() to fetch each new result as it is produced.

This logic works for non-aggregation stream queries, where every input row
in the table triggers a new result in the ResultSet. For example, the
following query yields the expected result:

SELECT STREAM FLOOR(rowtime TO MINUTE) AS rowtime FROM BookMasterStream
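
To illustrate why a per-row query like this can return results from an
unbounded input, here is a minimal sketch in plain Java. It does not use
Calcite or Geode; the class and method names (LazyProjectionSketch,
unboundedRowtimes, project, floorToMinute, firstN) and the sample data are
all hypothetical. Each call to next() pulls exactly one input row, so the
consumer can stop whenever it likes:

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

class LazyProjectionSketch {
  /** Hypothetical unbounded source: one rowtime (in millis) every 20 seconds. */
  static Iterator<Long> unboundedRowtimes() {
    return new Iterator<Long>() {
      private long next = 0L;
      public boolean hasNext() { return true; }  // the stream never ends
      public Long next() { long v = next; next += 20_000L; return v; }
    };
  }

  /** Lazy projection: one output row per input row, computed on demand. */
  static <T, R> Iterator<R> project(Iterator<T> source, Function<T, R> fn) {
    return new Iterator<R>() {
      public boolean hasNext() { return source.hasNext(); }
      public R next() { return fn.apply(source.next()); }
    };
  }

  /** Stand-in for FLOOR(rowtime TO MINUTE) on a millisecond timestamp. */
  static long floorToMinute(long millis) {
    return millis - Math.floorMod(millis, 60_000L);
  }

  /** Reads only the first n projected rows, leaving the source unfinished. */
  static List<Long> firstN(int n) {
    Iterator<Long> out =
        project(unboundedRowtimes(), LazyProjectionSketch::floorToMinute);
    List<Long> rows = new ArrayList<>();
    for (int i = 0; i < n && out.hasNext(); i++) {
      rows.add(out.next());
    }
    return rows;
  }

  public static void main(String[] args) {
    System.out.println(firstN(4)); // prints [0, 0, 0, 60000]
  }
}
```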

For streaming aggregation (GROUP BY) queries, though, the
GeodeStreamEnumerator's current and moveNext methods are called
indefinitely but no result is ever produced. It seems that the execution
never gets out of this loop:
https://github.com/apache/calcite/blob/master/linq4j/src/main/java/org/apache/calcite/linq4j/EnumerableDefaults.java#L825
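
The behaviour I am describing can be sketched in plain Java, assuming the
linked loop works like a typical hash-based group-by that drains its whole
input before returning anything. This is not the Calcite code; the names
EagerGroupBySketch and countByKey are hypothetical:

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

class EagerGroupBySketch {
  /**
   * Eager group-by: counts rows per key, returning only after the source
   * is exhausted. With an unbounded source the while loop never exits,
   * so no result is ever handed back to the caller.
   */
  static Map<String, Integer> countByKey(Iterator<String> source) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    while (source.hasNext()) {
      counts.merge(source.next(), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // A finite source terminates and yields the groups.
    Iterator<String> finite = Arrays.asList("a", "b", "a").iterator();
    System.out.println(countByKey(finite)); // prints {a=2, b=1}
  }
}
```

Passing an iterator whose hasNext() always returns true would keep
countByKey spinning forever, which matches what I observe with the
enumerator.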

In a way this is expected, given the indefinite nature of the input table.
Therefore I don't understand why or how this seems to work in the
StreamTest.java#148 tests.
I'm almost certain that it will not work if you use an aggregation instead
of a simple scan in the CsvTest.java#758 example.

I am really stuck here and would appreciate any advice.

Thanks,
Christian

[1]
https://github.com/tzolov/calcite/tree/geode-1.3-stream/geode/src/main/java/org/apache/calcite/adapter/geode/stream




-- 
Christian Tzolov <http://www.linkedin.com/in/tzolov> | Principal Software
Engineer | Spring.io <https://spring.io/> | Pivotal <http://pivotal.io/> |
ctzolov@pivotal.io