You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Γιώργος Θεοδωράκης <gi...@gmail.com> on 2016/11/02 17:53:48 UTC

Re: Window Semantics for Streams

Hello,

Can someone inform me if we can define the bounds of sliding windows with
OVER in Calcite at this moment? I am trying to define sliding windows
according to the examples given in https://calcite.apache
.org/docs/stream.html and I keep getting wrong results. Some examples and
the plans (I use ProjectToWindowRUle) they generate are :
1)
               "select s.orders.productid, SUM(units) over pr " + "from
s.orders " + "window pr as (ORDER BY productid ROWS BETWEEN 5 PRECEDING AND
10 FOLLOWING)"

==>
LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), CAST($3):INTEGER,
null)]) LogicalWindow(window#0=[window(partition {} order by [0] rows
between $2 PRECEDING and $3 FOLLOWING aggs [COUNT($1), $SUM0($1)])])
LogicalProject(productid=[$1], units=[$2]) LogicalTableScan(table=[[s,
orders]])

in which, the numbers I have used are "converted" to columns.

2)
"select s.orders.productid, SUM(units) over (ORDER BY productid ROWS
BETWEEN 5 PRECEDING AND 10 FOLLOWING) " + "from s.orders "

For this I get the same plan as before.

3)The same goes for RANGE :

             "select s.orders.productid, SUM(units) over (ORDER BY productid
RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
                     + "from s.orders "

4)
             "select s.orders.productid, SUM(units) over (ORDER BY productid
RANGE 3600 PRECEDING) "
                     + "from s.orders "

==>
LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), CAST($3):INTEGER,
null)])
  LogicalWindow(window#0=[window(partition {} order by [0] range between $2
PRECEDING and CURRENT ROW aggs [COUNT($1), $SUM0($1)])])
    LogicalProject(productid=[$1], units=[$2])
      LogicalTableScan(table=[[s, orders]])

Can someone provide me a working example?

Thanks,
George

2016-10-18 20:33 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:

> Also this query in Calcite :
>
>  "select * from ("
>                 + "select s.orders.productid , avg(units) OVER (product
> ROWS BETWEEN 10 PRECEDING and 5 FOLLOWING)"                   + " as m10, "
>                 + "AVG(units) OVER (product RANGE INTERVAL '7' DAY
> PRECEDING) AS d7 "
>                 + "from s.orders "
>                 + " WINDOW product AS (PARTITION BY productId)) "
>                 + "where m10>d7 "
>          );
>
> gives me after optimization the following plan, that doesn't have any
> window boundaries :
>
> LogicalFilter(condition=[>($1, $2)])
>   LogicalProject(productid=[$0], m10=[CAST(/($1, $2)):INTEGER NOT NULL],
> d7=[CAST(/($3, $4)):INTEGER NOT NULL])
>     LogicalProject(productid=[$0], $1=[$2], $2=[$3], $3=[$4], $4=[$5])
>       LogicalWindow(window#0=[window(partition {0} order by [] rows
> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1),
> COUNT($1)])], window#1=[window(partition {0} order by [] range between
> UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), COUNT($1)])])
>         LogicalProject(productid=[$1], units=[$2])
>           LogicalTableScan(table=[[s, orders]])
>
> 2016-10-18 20:23 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
>
>> Hi,
>>
>> I was wondering if there is any possible way to define windows with SQl
>> in Calcite for queries that don't have an aggregate function? For example,
>> I want to define the queries from Linear Road Benchmark of the STREAM
>> project (http://infolab.stanford.edu/stream/cql-benchmark.html):
>>
>> 1)
>> SELECT DISTINCT car_id
>> FROM CarSegStr [RANGE 30 SECONDS];
>>
>> 2)
>> SELECT car_id, exp_way, dir, seg
>> FROM CarSegStr [PARTITION BY car_id ROWS 1], CurActiveCars
>> WHERE CarSegStr.car_id = CurActiveCars.car_id;
>>
>> Thank you in advance,
>> George.
>>
>
>

RE: Window Semantics for Streams

Posted by Radu Tudoran <ra...@huawei.com>.
Hi,

In this case that the way to define the windows differs from the website I am keeping my suggestion to modify the examples on the website.


Regarding your question about unbounded widnows, I followed the syntax you used and indeed I get the same error with UNBOUNDED lower bound for the window 

Lowerbound UNBOUNDED PRECEDING
Upper bound CURRENT ROW


However, I tried also the syntax suggested by Julian


"select s.orders.productid, SUM(units) over pr "
+ "from s.orders "
+ "window pr as (ORDER BY productid ROWS BETWEEN 5 PRECEDING AND 10 FOLLOWING)"
);

And it correctly defines the boundaries for the window. Perhaps you can try it this way.





-----Original Message-----
From: Γιώργος Θεοδωράκης [mailto:giwrgosrtheod@gmail.com] 
Sent: Thursday, November 03, 2016 10:38 AM
To: dev@calcite.apache.org
Subject: Re: Window Semantics for Streams

I think the right syntax is like this:

"select s.orders.productid, SUM(units) over pr "
+ "from s.orders "
+ "window pr as (PARTITION BY productid ORDER BY productid ROWS BETWEEN 
+ 5
PRECEDING AND 10 FOLLOWING)"
);

or this :

"select s.orders.productid, SUM(units) over (pr ROWS BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
+ "from s.orders "
+ "window pr as (PARTITION BY productid ORDER BY productid)"
);

Although with the second way the bounds are undefined for me.
Do you get right the bounds for your query?

2016-11-03 11:22 GMT+02:00 Radu Tudoran <ra...@huawei.com>:

> Hi,
>
> I am also working on a similar topic and encountered a problem with 
> the window definition and parsing it's syntax.
>
> I am following the example to define the windows with the OVER clause. 
> And I am trying to use also the Partition BY clause when defining the window.
> Can I get some help to find the proper syntax to define the window. It 
> seems that the example syntax did not work for me as shown below.
>
>
> SELECT CLIENT_NAME,
>         SUM(AMOUNT) OVER prodW (RANGE INTERVAL '10' MINUTE PRECEDING) 
> AS
> m10
>       FROM inputdata
>       WINDOW prodW AS (ORDER BY ETIME PARTITION BY ID)
>
> However this leads to a parsing error
>
>
> Encountered "(" at line 1, column 46.
> Was expecting one of:
>     "FROM" ...
>     "," ...
>     "AS" ...
> ...
> at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(
> SqlParserImpl.java:388)
>         at org.apache.calcite.sql.parser.impl.SqlParserImpl.
> normalizeException(SqlParserImpl.java:119)
>         at org.apache.calcite.sql.parser.SqlParser.parseQuery(
> SqlParser.java:131)
>         at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.
> java:156)
>         at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.
> java:167)
>         at org.huawei.demo.sqlparsercsv.LocalParser3.main(
> LocalParser3.java:68)
>
>
> Alternatively I tried to define the window completely
>
>
> SELECT CLIENT_NAME,
>         SUM(AMOUNT) OVER prodW AS m10
>       FROM inputdata
>       WINDOW prodW AS (ORDER BY ETIME RANGE INTERVAL '10' MINUTE 
> PRECEDING PARTITION BY ID)
>
> Which leads to a different error:
>
> Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException:
> Encountered "PARTITION" at line 1, column 158.
> Was expecting one of:
>     ")" ...
>     "ALLOW" ...
>     "DISALLOW" ...
>
>         at org.apache.calcite.sql.parser.impl.SqlParserImpl.
> convertException(SqlParserImpl.java:388)
>         at org.apache.calcite.sql.parser.impl.SqlParserImpl.
> normalizeException(SqlParserImpl.java:119)
>         at org.apache.calcite.sql.parser.SqlParser.parseQuery(
> SqlParser.java:131)
>         at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.
> java:156)
>         at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.
> java:167)
>         at org.huawei.demo.sqlparsercsv.LocalParser3.main(
> LocalParser3.java:74)
>
>
>
>
>
> -----Original Message-----
> From: Julian Hyde [mailto:jhyde@apache.org]
> Sent: Wednesday, November 02, 2016 7:14 PM
> To: dev@calcite.apache.org
> Subject: Re: Window Semantics for Streams
>
> As you know, streams.html is a specification. We do not claim that it 
> is all implemented.
>
> Did you do a search of the existing tests? JdbcTest.testWinAgg2 
> features windows that have a variety of bounds, and produces the correct results.
> There are also tests in winagg.iq.
>
> I suspect that the “constants” field of Window is not output as part 
> of the explain for Window (or LogicalWindow). The $2 and $3 refer to 
> those hidden constants.
>
> Julian
>
> > On Nov 2, 2016, at 10:53 AM, Γιώργος Θεοδωράκης 
> > <gi...@gmail.com>
> wrote:
> >
> > Hello,
> >
> > Can someone inform me if we can define the bounds of sliding windows 
> > with OVER in Calcite at this moment? I am trying to define sliding 
> > windows according to the examples given in https://calcite.apache 
> > .org/docs/stream.html and I keep getting wrong results. Some 
> > examples and the plans (I use ProjectToWindowRUle) they generate are :
> > 1)
> >               "select s.orders.productid, SUM(units) over pr " + 
> > "from s.orders " + "window pr as (ORDER BY productid ROWS BETWEEN 5 
> > PRECEDING AND
> > 10 FOLLOWING)"
> >
> > ==>
> > LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), 
> > CAST($3):INTEGER,
> > null)]) LogicalWindow(window#0=[window(partition {} order by [0] 
> > rows between $2 PRECEDING and $3 FOLLOWING aggs [COUNT($1), 
> > $SUM0($1)])]) LogicalProject(productid=[$1], units=[$2]) 
> > LogicalTableScan(table=[[s,
> > orders]])
> >
> > in which, the numbers I have used are "converted" to columns.
> >
> > 2)
> > "select s.orders.productid, SUM(units) over (ORDER BY productid ROWS 
> > BETWEEN 5 PRECEDING AND 10 FOLLOWING) " + "from s.orders "
> >
> > For this I get the same plan as before.
> >
> > 3)The same goes for RANGE :
> >
> >             "select s.orders.productid, SUM(units) over (ORDER BY 
> > productid RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
> >                     + "from s.orders "
> >
> > 4)
> >             "select s.orders.productid, SUM(units) over (ORDER BY 
> > productid RANGE 3600 PRECEDING) "
> >                     + "from s.orders "
> >
> > ==>
> > LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), 
> > CAST($3):INTEGER,
> > null)])
> >  LogicalWindow(window#0=[window(partition {} order by [0] range 
> > between $2 PRECEDING and CURRENT ROW aggs [COUNT($1), $SUM0($1)])])
> >    LogicalProject(productid=[$1], units=[$2])
> >      LogicalTableScan(table=[[s, orders]])
> >
> > Can someone provide me a working example?
> >
> > Thanks,
> > George
> >
> > 2016-10-18 20:33 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
> >
> >> Also this query in Calcite :
> >>
> >> "select * from ("
> >>                + "select s.orders.productid , avg(units) OVER (product
> >> ROWS BETWEEN 10 PRECEDING and 5 FOLLOWING)"                   + " as
> m10, "
> >>                + "AVG(units) OVER (product RANGE INTERVAL '7' DAY
> >> PRECEDING) AS d7 "
> >>                + "from s.orders "
> >>                + " WINDOW product AS (PARTITION BY productId)) "
> >>                + "where m10>d7 "
> >>         );
> >>
> >> gives me after optimization the following plan, that doesn't have 
> >> any window boundaries :
> >>
> >> LogicalFilter(condition=[>($1, $2)])  
> >> LogicalProject(productid=[$0], m10=[CAST(/($1, $2)):INTEGER NOT 
> >> NULL], d7=[CAST(/($3, $4)):INTEGER NOT NULL])
> >>    LogicalProject(productid=[$0], $1=[$2], $2=[$3], $3=[$4], $4=[$5])
> >>      LogicalWindow(window#0=[window(partition {0} order by [] rows 
> >> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), 
> >> COUNT($1)])], window#1=[window(partition {0} order by [] range 
> >> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1),
> COUNT($1)])])
> >>        LogicalProject(productid=[$1], units=[$2])
> >>          LogicalTableScan(table=[[s, orders]])
> >>
> >> 2016-10-18 20:23 GMT+03:00 Γιώργος Θεοδωράκης 
> >> <giwrgosrtheod@gmail.com
> >:
> >>
> >>> Hi,
> >>>
> >>> I was wondering if there is any possible way to define windows 
> >>> with SQl in Calcite for queries that don't have an aggregate function?
> >>> For example, I want to define the queries from Linear Road 
> >>> Benchmark of the STREAM project (http://infolab.stanford.edu/
> stream/cql-benchmark.html):
> >>>
> >>> 1)
> >>> SELECT DISTINCT car_id
> >>> FROM CarSegStr [RANGE 30 SECONDS];
> >>>
> >>> 2)
> >>> SELECT car_id, exp_way, dir, seg
> >>> FROM CarSegStr [PARTITION BY car_id ROWS 1], CurActiveCars WHERE 
> >>> CarSegStr.car_id = CurActiveCars.car_id;
> >>>
> >>> Thank you in advance,
> >>> George.
> >>>
> >>
> >>
>
>

Re: Window Semantics for Streams

Posted by Γιώργος Θεοδωράκης <gi...@gmail.com>.
I think the right syntax is like this:

"select s.orders.productid, SUM(units) over pr "
+ "from s.orders "
+ "window pr as (PARTITION BY productid ORDER BY productid ROWS BETWEEN 5
PRECEDING AND 10 FOLLOWING)"
);

or this :

"select s.orders.productid, SUM(units) over (pr ROWS BETWEEN 5 PRECEDING
AND 10 FOLLOWING) "
+ "from s.orders "
+ "window pr as (PARTITION BY productid ORDER BY productid)"
);

Although with the second way the bounds are undefined for me.
Do you get right the bounds for your query?

2016-11-03 11:22 GMT+02:00 Radu Tudoran <ra...@huawei.com>:

> Hi,
>
> I am also working on a similar topic and encountered a problem with the
> window definition and parsing it's syntax.
>
> I am following the example to define the windows with the OVER clause. And
> I am trying to use also the Partition BY clause when defining the window.
> Can I get some help to find the proper syntax to define the window. It
> seems that the example syntax did not work for me as shown below.
>
>
> SELECT CLIENT_NAME,
>         SUM(AMOUNT) OVER prodW (RANGE INTERVAL '10' MINUTE PRECEDING) AS
> m10
>       FROM inputdata
>       WINDOW prodW AS (ORDER BY ETIME PARTITION BY ID)
>
> However this leads to a parsing error
>
>
> Encountered "(" at line 1, column 46.
> Was expecting one of:
>     "FROM" ...
>     "," ...
>     "AS" ...
> ...
> at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(
> SqlParserImpl.java:388)
>         at org.apache.calcite.sql.parser.impl.SqlParserImpl.
> normalizeException(SqlParserImpl.java:119)
>         at org.apache.calcite.sql.parser.SqlParser.parseQuery(
> SqlParser.java:131)
>         at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.
> java:156)
>         at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.
> java:167)
>         at org.huawei.demo.sqlparsercsv.LocalParser3.main(
> LocalParser3.java:68)
>
>
> Alternatively I tried to define the window completely
>
>
> SELECT CLIENT_NAME,
>         SUM(AMOUNT) OVER prodW AS m10
>       FROM inputdata
>       WINDOW prodW AS (ORDER BY ETIME RANGE INTERVAL '10' MINUTE PRECEDING
> PARTITION BY ID)
>
> Which leads to a different error:
>
> Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException:
> Encountered "PARTITION" at line 1, column 158.
> Was expecting one of:
>     ")" ...
>     "ALLOW" ...
>     "DISALLOW" ...
>
>         at org.apache.calcite.sql.parser.impl.SqlParserImpl.
> convertException(SqlParserImpl.java:388)
>         at org.apache.calcite.sql.parser.impl.SqlParserImpl.
> normalizeException(SqlParserImpl.java:119)
>         at org.apache.calcite.sql.parser.SqlParser.parseQuery(
> SqlParser.java:131)
>         at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.
> java:156)
>         at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.
> java:167)
>         at org.huawei.demo.sqlparsercsv.LocalParser3.main(
> LocalParser3.java:74)
>
>
>
>
>
> -----Original Message-----
> From: Julian Hyde [mailto:jhyde@apache.org]
> Sent: Wednesday, November 02, 2016 7:14 PM
> To: dev@calcite.apache.org
> Subject: Re: Window Semantics for Streams
>
> As you know, streams.html is a specification. We do not claim that it is
> all implemented.
>
> Did you do a search of the existing tests? JdbcTest.testWinAgg2 features
> windows that have a variety of bounds, and produces the correct results.
> There are also tests in winagg.iq.
>
> I suspect that the “constants” field of Window is not output as part of
> the explain for Window (or LogicalWindow). The $2 and $3 refer to those
> hidden constants.
>
> Julian
>
> > On Nov 2, 2016, at 10:53 AM, Γιώργος Θεοδωράκης <gi...@gmail.com>
> wrote:
> >
> > Hello,
> >
> > Can someone inform me if we can define the bounds of sliding windows
> > with OVER in Calcite at this moment? I am trying to define sliding
> > windows according to the examples given in https://calcite.apache
> > .org/docs/stream.html and I keep getting wrong results. Some examples
> > and the plans (I use ProjectToWindowRUle) they generate are :
> > 1)
> >               "select s.orders.productid, SUM(units) over pr " + "from
> > s.orders " + "window pr as (ORDER BY productid ROWS BETWEEN 5
> > PRECEDING AND
> > 10 FOLLOWING)"
> >
> > ==>
> > LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0),
> > CAST($3):INTEGER,
> > null)]) LogicalWindow(window#0=[window(partition {} order by [0] rows
> > between $2 PRECEDING and $3 FOLLOWING aggs [COUNT($1), $SUM0($1)])])
> > LogicalProject(productid=[$1], units=[$2]) LogicalTableScan(table=[[s,
> > orders]])
> >
> > in which, the numbers I have used are "converted" to columns.
> >
> > 2)
> > "select s.orders.productid, SUM(units) over (ORDER BY productid ROWS
> > BETWEEN 5 PRECEDING AND 10 FOLLOWING) " + "from s.orders "
> >
> > For this I get the same plan as before.
> >
> > 3)The same goes for RANGE :
> >
> >             "select s.orders.productid, SUM(units) over (ORDER BY
> > productid RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
> >                     + "from s.orders "
> >
> > 4)
> >             "select s.orders.productid, SUM(units) over (ORDER BY
> > productid RANGE 3600 PRECEDING) "
> >                     + "from s.orders "
> >
> > ==>
> > LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0),
> > CAST($3):INTEGER,
> > null)])
> >  LogicalWindow(window#0=[window(partition {} order by [0] range
> > between $2 PRECEDING and CURRENT ROW aggs [COUNT($1), $SUM0($1)])])
> >    LogicalProject(productid=[$1], units=[$2])
> >      LogicalTableScan(table=[[s, orders]])
> >
> > Can someone provide me a working example?
> >
> > Thanks,
> > George
> >
> > 2016-10-18 20:33 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
> >
> >> Also this query in Calcite :
> >>
> >> "select * from ("
> >>                + "select s.orders.productid , avg(units) OVER (product
> >> ROWS BETWEEN 10 PRECEDING and 5 FOLLOWING)"                   + " as
> m10, "
> >>                + "AVG(units) OVER (product RANGE INTERVAL '7' DAY
> >> PRECEDING) AS d7 "
> >>                + "from s.orders "
> >>                + " WINDOW product AS (PARTITION BY productId)) "
> >>                + "where m10>d7 "
> >>         );
> >>
> >> gives me after optimization the following plan, that doesn't have any
> >> window boundaries :
> >>
> >> LogicalFilter(condition=[>($1, $2)])
> >>  LogicalProject(productid=[$0], m10=[CAST(/($1, $2)):INTEGER NOT
> >> NULL], d7=[CAST(/($3, $4)):INTEGER NOT NULL])
> >>    LogicalProject(productid=[$0], $1=[$2], $2=[$3], $3=[$4], $4=[$5])
> >>      LogicalWindow(window#0=[window(partition {0} order by [] rows
> >> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1),
> >> COUNT($1)])], window#1=[window(partition {0} order by [] range
> >> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1),
> COUNT($1)])])
> >>        LogicalProject(productid=[$1], units=[$2])
> >>          LogicalTableScan(table=[[s, orders]])
> >>
> >> 2016-10-18 20:23 GMT+03:00 Γιώργος Θεοδωράκης <giwrgosrtheod@gmail.com
> >:
> >>
> >>> Hi,
> >>>
> >>> I was wondering if there is any possible way to define windows with
> >>> SQl in Calcite for queries that don't have an aggregate function?
> >>> For example, I want to define the queries from Linear Road Benchmark
> >>> of the STREAM project (http://infolab.stanford.edu/
> stream/cql-benchmark.html):
> >>>
> >>> 1)
> >>> SELECT DISTINCT car_id
> >>> FROM CarSegStr [RANGE 30 SECONDS];
> >>>
> >>> 2)
> >>> SELECT car_id, exp_way, dir, seg
> >>> FROM CarSegStr [PARTITION BY car_id ROWS 1], CurActiveCars WHERE
> >>> CarSegStr.car_id = CurActiveCars.car_id;
> >>>
> >>> Thank you in advance,
> >>> George.
> >>>
> >>
> >>
>
>

RE: Window Semantics for Streams

Posted by Radu Tudoran <ra...@huawei.com>.
Hi,

I am also working on a similar topic and encountered a problem with the window definition and parsing it's syntax.

I am following the example to define the windows with the OVER clause. And I am trying to use also the Partition BY clause when defining the window. Can I get some help to find the proper syntax to define the window. It seems that the example syntax did not work for me as shown below. 


SELECT CLIENT_NAME,
	SUM(AMOUNT) OVER prodW (RANGE INTERVAL '10' MINUTE PRECEDING) AS m10
      FROM inputdata 
      WINDOW prodW AS (ORDER BY ETIME PARTITION BY ID)

However this leads to a parsing error


Encountered "(" at line 1, column 46.
Was expecting one of:
    "FROM" ...
    "," ...
    "AS" ...
...
at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:388)
	at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:119)
	at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:131)
	at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:156)
	at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:167)
	at org.huawei.demo.sqlparsercsv.LocalParser3.main(LocalParser3.java:68)


Alternatively I tried to define the window completely


SELECT CLIENT_NAME,
	SUM(AMOUNT) OVER prodW AS m10
      FROM inputdata 
      WINDOW prodW AS (ORDER BY ETIME RANGE INTERVAL '10' MINUTE PRECEDING PARTITION BY ID)

Which leads to a different error:

Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException: Encountered "PARTITION" at line 1, column 158.
Was expecting one of:
    ")" ...
    "ALLOW" ...
    "DISALLOW" ...
    
	at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:388)
	at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:119)
	at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:131)
	at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:156)
	at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:167)
	at org.huawei.demo.sqlparsercsv.LocalParser3.main(LocalParser3.java:74)





-----Original Message-----
From: Julian Hyde [mailto:jhyde@apache.org] 
Sent: Wednesday, November 02, 2016 7:14 PM
To: dev@calcite.apache.org
Subject: Re: Window Semantics for Streams

As you know, streams.html is a specification. We do not claim that it is all implemented.

Did you do a search of the existing tests? JdbcTest.testWinAgg2 features windows that have a variety of bounds, and produces the correct results. There are also tests in winagg.iq.

I suspect that the “constants” field of Window is not output as part of the explain for Window (or LogicalWindow). The $2 and $3 refer to those hidden constants.

Julian

> On Nov 2, 2016, at 10:53 AM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:
> 
> Hello,
> 
> Can someone inform me if we can define the bounds of sliding windows 
> with OVER in Calcite at this moment? I am trying to define sliding 
> windows according to the examples given in https://calcite.apache 
> .org/docs/stream.html and I keep getting wrong results. Some examples 
> and the plans (I use ProjectToWindowRUle) they generate are :
> 1)
>               "select s.orders.productid, SUM(units) over pr " + "from 
> s.orders " + "window pr as (ORDER BY productid ROWS BETWEEN 5 
> PRECEDING AND
> 10 FOLLOWING)"
> 
> ==>
> LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), 
> CAST($3):INTEGER,
> null)]) LogicalWindow(window#0=[window(partition {} order by [0] rows 
> between $2 PRECEDING and $3 FOLLOWING aggs [COUNT($1), $SUM0($1)])]) 
> LogicalProject(productid=[$1], units=[$2]) LogicalTableScan(table=[[s,
> orders]])
> 
> in which, the numbers I have used are "converted" to columns.
> 
> 2)
> "select s.orders.productid, SUM(units) over (ORDER BY productid ROWS 
> BETWEEN 5 PRECEDING AND 10 FOLLOWING) " + "from s.orders "
> 
> For this I get the same plan as before.
> 
> 3)The same goes for RANGE :
> 
>             "select s.orders.productid, SUM(units) over (ORDER BY 
> productid RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
>                     + "from s.orders "
> 
> 4)
>             "select s.orders.productid, SUM(units) over (ORDER BY 
> productid RANGE 3600 PRECEDING) "
>                     + "from s.orders "
> 
> ==>
> LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), 
> CAST($3):INTEGER,
> null)])
>  LogicalWindow(window#0=[window(partition {} order by [0] range 
> between $2 PRECEDING and CURRENT ROW aggs [COUNT($1), $SUM0($1)])])
>    LogicalProject(productid=[$1], units=[$2])
>      LogicalTableScan(table=[[s, orders]])
> 
> Can someone provide me a working example?
> 
> Thanks,
> George
> 
> 2016-10-18 20:33 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
> 
>> Also this query in Calcite :
>> 
>> "select * from ("
>>                + "select s.orders.productid , avg(units) OVER (product
>> ROWS BETWEEN 10 PRECEDING and 5 FOLLOWING)"                   + " as m10, "
>>                + "AVG(units) OVER (product RANGE INTERVAL '7' DAY
>> PRECEDING) AS d7 "
>>                + "from s.orders "
>>                + " WINDOW product AS (PARTITION BY productId)) "
>>                + "where m10>d7 "
>>         );
>> 
>> gives me after optimization the following plan, that doesn't have any 
>> window boundaries :
>> 
>> LogicalFilter(condition=[>($1, $2)])
>>  LogicalProject(productid=[$0], m10=[CAST(/($1, $2)):INTEGER NOT 
>> NULL], d7=[CAST(/($3, $4)):INTEGER NOT NULL])
>>    LogicalProject(productid=[$0], $1=[$2], $2=[$3], $3=[$4], $4=[$5])
>>      LogicalWindow(window#0=[window(partition {0} order by [] rows 
>> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), 
>> COUNT($1)])], window#1=[window(partition {0} order by [] range 
>> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), COUNT($1)])])
>>        LogicalProject(productid=[$1], units=[$2])
>>          LogicalTableScan(table=[[s, orders]])
>> 
>> 2016-10-18 20:23 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
>> 
>>> Hi,
>>> 
>>> I was wondering if there is any possible way to define windows with 
>>> SQl in Calcite for queries that don't have an aggregate function? 
>>> For example, I want to define the queries from Linear Road Benchmark 
>>> of the STREAM project (http://infolab.stanford.edu/stream/cql-benchmark.html):
>>> 
>>> 1)
>>> SELECT DISTINCT car_id
>>> FROM CarSegStr [RANGE 30 SECONDS];
>>> 
>>> 2)
>>> SELECT car_id, exp_way, dir, seg
>>> FROM CarSegStr [PARTITION BY car_id ROWS 1], CurActiveCars WHERE 
>>> CarSegStr.car_id = CurActiveCars.car_id;
>>> 
>>> Thank you in advance,
>>> George.
>>> 
>> 
>> 


RE: Window Semantics for Streams

Posted by Radu Tudoran <ra...@huawei.com>.
As per any Murphy law I got a working example right after sending the email :(

It seems that the only working way (at least for me) is to use the following syntax:

SELECT CLIENT_NAME,
	SUM(AMOUNT) OVER prodW AS m10
      FROM inputdata 
      WINDOW prodW AS (PARTITION BY ID ORDER BY ETIME RANGE INTERVAL '10' MINUTE PRECEDING)

If it is confirmed that indeed the other options are not correct I would suggest that the examples on the website to be modified to show the working syntax




-----Original Message-----
From: Radu Tudoran 
Sent: Thursday, November 03, 2016 10:23 AM
To: dev@calcite.apache.org
Subject: RE: Window Semantics for Streams

Hi,

I am also working on a similar topic and encountered a problem with the window definition and parsing it's syntax.

I am following the example to define the windows with the OVER clause. And I am trying to use also the Partition BY clause when defining the window. Can I get some help to find the proper syntax to define the window. It seems that the example syntax did not work for me as shown below. 


SELECT CLIENT_NAME,
	SUM(AMOUNT) OVER prodW (RANGE INTERVAL '10' MINUTE PRECEDING) AS m10
      FROM inputdata 
      WINDOW prodW AS (ORDER BY ETIME PARTITION BY ID)

However this leads to a parsing error


Encountered "(" at line 1, column 46.
Was expecting one of:
    "FROM" ...
    "," ...
    "AS" ...
...
at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:388)
	at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:119)
	at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:131)
	at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:156)
	at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:167)
	at org.huawei.demo.sqlparsercsv.LocalParser3.main(LocalParser3.java:68)


Alternatively I tried to define the window completely


SELECT CLIENT_NAME,
	SUM(AMOUNT) OVER prodW AS m10
      FROM inputdata 
      WINDOW prodW AS (ORDER BY ETIME RANGE INTERVAL '10' MINUTE PRECEDING PARTITION BY ID)

Which leads to a different error:

Exception in thread "main" org.apache.calcite.sql.parser.SqlParseException: Encountered "PARTITION" at line 1, column 158.
Was expecting one of:
    ")" ...
    "ALLOW" ...
    "DISALLOW" ...
    
	at org.apache.calcite.sql.parser.impl.SqlParserImpl.convertException(SqlParserImpl.java:388)
	at org.apache.calcite.sql.parser.impl.SqlParserImpl.normalizeException(SqlParserImpl.java:119)
	at org.apache.calcite.sql.parser.SqlParser.parseQuery(SqlParser.java:131)
	at org.apache.calcite.sql.parser.SqlParser.parseStmt(SqlParser.java:156)
	at org.apache.calcite.prepare.PlannerImpl.parse(PlannerImpl.java:167)
	at org.huawei.demo.sqlparsercsv.LocalParser3.main(LocalParser3.java:74)





-----Original Message-----
From: Julian Hyde [mailto:jhyde@apache.org]
Sent: Wednesday, November 02, 2016 7:14 PM
To: dev@calcite.apache.org
Subject: Re: Window Semantics for Streams

As you know, streams.html is a specification. We do not claim that it is all implemented.

Did you do a search of the existing tests? JdbcTest.testWinAgg2 features windows that have a variety of bounds, and produces the correct results. There are also tests in winagg.iq.

I suspect that the “constants” field of Window is not output as part of the explain for Window (or LogicalWindow). The $2 and $3 refer to those hidden constants.

Julian

> On Nov 2, 2016, at 10:53 AM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:
> 
> Hello,
> 
> Can someone inform me if we can define the bounds of sliding windows 
> with OVER in Calcite at this moment? I am trying to define sliding 
> windows according to the examples given in https://calcite.apache 
> .org/docs/stream.html and I keep getting wrong results. Some examples 
> and the plans (I use ProjectToWindowRUle) they generate are :
> 1)
>               "select s.orders.productid, SUM(units) over pr " + "from 
> s.orders " + "window pr as (ORDER BY productid ROWS BETWEEN 5 
> PRECEDING AND
> 10 FOLLOWING)"
> 
> ==>
> LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), 
> CAST($3):INTEGER,
> null)]) LogicalWindow(window#0=[window(partition {} order by [0] rows 
> between $2 PRECEDING and $3 FOLLOWING aggs [COUNT($1), $SUM0($1)])]) 
> LogicalProject(productid=[$1], units=[$2]) LogicalTableScan(table=[[s,
> orders]])
> 
> in which, the numbers I have used are "converted" to columns.
> 
> 2)
> "select s.orders.productid, SUM(units) over (ORDER BY productid ROWS 
> BETWEEN 5 PRECEDING AND 10 FOLLOWING) " + "from s.orders "
> 
> For this I get the same plan as before.
> 
> 3)The same goes for RANGE :
> 
>             "select s.orders.productid, SUM(units) over (ORDER BY 
> productid RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
>                     + "from s.orders "
> 
> 4)
>             "select s.orders.productid, SUM(units) over (ORDER BY 
> productid RANGE 3600 PRECEDING) "
>                     + "from s.orders "
> 
> ==>
> LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), 
> CAST($3):INTEGER,
> null)])
>  LogicalWindow(window#0=[window(partition {} order by [0] range 
> between $2 PRECEDING and CURRENT ROW aggs [COUNT($1), $SUM0($1)])])
>    LogicalProject(productid=[$1], units=[$2])
>      LogicalTableScan(table=[[s, orders]])
> 
> Can someone provide me a working example?
> 
> Thanks,
> George
> 
> 2016-10-18 20:33 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
> 
>> Also this query in Calcite :
>> 
>> "select * from ("
>>                + "select s.orders.productid , avg(units) OVER (product
>> ROWS BETWEEN 10 PRECEDING and 5 FOLLOWING)"                   + " as m10, "
>>                + "AVG(units) OVER (product RANGE INTERVAL '7' DAY
>> PRECEDING) AS d7 "
>>                + "from s.orders "
>>                + " WINDOW product AS (PARTITION BY productId)) "
>>                + "where m10>d7 "
>>         );
>> 
>> gives me after optimization the following plan, that doesn't have any 
>> window boundaries :
>> 
>> LogicalFilter(condition=[>($1, $2)])
>>  LogicalProject(productid=[$0], m10=[CAST(/($1, $2)):INTEGER NOT 
>> NULL], d7=[CAST(/($3, $4)):INTEGER NOT NULL])
>>    LogicalProject(productid=[$0], $1=[$2], $2=[$3], $3=[$4], $4=[$5])
>>      LogicalWindow(window#0=[window(partition {0} order by [] rows 
>> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), 
>> COUNT($1)])], window#1=[window(partition {0} order by [] range 
>> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), COUNT($1)])])
>>        LogicalProject(productid=[$1], units=[$2])
>>          LogicalTableScan(table=[[s, orders]])
>> 
>> 2016-10-18 20:23 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
>> 
>>> Hi,
>>> 
>>> I was wondering if there is any possible way to define windows with 
>>> SQl in Calcite for queries that don't have an aggregate function?
>>> For example, I want to define the queries from Linear Road Benchmark 
>>> of the STREAM project (http://infolab.stanford.edu/stream/cql-benchmark.html):
>>> 
>>> 1)
>>> SELECT DISTINCT car_id
>>> FROM CarSegStr [RANGE 30 SECONDS];
>>> 
>>> 2)
>>> SELECT car_id, exp_way, dir, seg
>>> FROM CarSegStr [PARTITION BY car_id ROWS 1], CurActiveCars WHERE 
>>> CarSegStr.car_id = CurActiveCars.car_id;
>>> 
>>> Thank you in advance,
>>> George.
>>> 
>> 
>> 


Re: Window Semantics for Streams

Posted by Julian Hyde <jh...@apache.org>.
As you know, streams.html is a specification. We do not claim that it is all implemented.

Did you do a search of the existing tests? JdbcTest.testWinAgg2 features windows that have a variety of bounds, and produces the correct results. There are also tests in winagg.iq.

I suspect that the “constants” field of Window is not output as part of the explain for Window (or LogicalWindow). The $2 and $3 refer to those hidden constants.

Julian

> On Nov 2, 2016, at 10:53 AM, Γιώργος Θεοδωράκης <gi...@gmail.com> wrote:
> 
> Hello,
> 
> Can someone inform me if we can define the bounds of sliding windows with
> OVER in Calcite at this moment? I am trying to define sliding windows
> according to the examples given in https://calcite.apache
> .org/docs/stream.html and I keep getting wrong results. Some examples and
> the plans (I use ProjectToWindowRUle) they generate are :
> 1)
>               "select s.orders.productid, SUM(units) over pr " + "from
> s.orders " + "window pr as (ORDER BY productid ROWS BETWEEN 5 PRECEDING AND
> 10 FOLLOWING)"
> 
> ==>
> LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), CAST($3):INTEGER,
> null)]) LogicalWindow(window#0=[window(partition {} order by [0] rows
> between $2 PRECEDING and $3 FOLLOWING aggs [COUNT($1), $SUM0($1)])])
> LogicalProject(productid=[$1], units=[$2]) LogicalTableScan(table=[[s,
> orders]])
> 
> in which, the numbers I have used are "converted" to columns.
> 
> 2)
> "select s.orders.productid, SUM(units) over (ORDER BY productid ROWS
> BETWEEN 5 PRECEDING AND 10 FOLLOWING) " + "from s.orders "
> 
> For this I get the same plan as before.
> 
> 3)The same goes for RANGE :
> 
>             "select s.orders.productid, SUM(units) over (ORDER BY productid
> RANGE BETWEEN 5 PRECEDING AND 10 FOLLOWING) "
>                     + "from s.orders "
> 
> 4)
>             "select s.orders.productid, SUM(units) over (ORDER BY productid
> RANGE 3600 PRECEDING) "
>                     + "from s.orders "
> 
> ==>
> LogicalProject(productid=[$0], EXPR$1=[CASE(>($2, 0), CAST($3):INTEGER,
> null)])
>  LogicalWindow(window#0=[window(partition {} order by [0] range between $2
> PRECEDING and CURRENT ROW aggs [COUNT($1), $SUM0($1)])])
>    LogicalProject(productid=[$1], units=[$2])
>      LogicalTableScan(table=[[s, orders]])
> 
> Can someone provide me a working example?
> 
> Thanks,
> George
> 
> 2016-10-18 20:33 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
> 
>> Also this query in Calcite :
>> 
>> "select * from ("
>>                + "select s.orders.productid , avg(units) OVER (product
>> ROWS BETWEEN 10 PRECEDING and 5 FOLLOWING)"                   + " as m10, "
>>                + "AVG(units) OVER (product RANGE INTERVAL '7' DAY
>> PRECEDING) AS d7 "
>>                + "from s.orders "
>>                + " WINDOW product AS (PARTITION BY productId)) "
>>                + "where m10>d7 "
>>         );
>> 
>> gives me after optimization the following plan, that doesn't have any
>> window boundaries :
>> 
>> LogicalFilter(condition=[>($1, $2)])
>>  LogicalProject(productid=[$0], m10=[CAST(/($1, $2)):INTEGER NOT NULL],
>> d7=[CAST(/($3, $4)):INTEGER NOT NULL])
>>    LogicalProject(productid=[$0], $1=[$2], $2=[$3], $3=[$4], $4=[$5])
>>      LogicalWindow(window#0=[window(partition {0} order by [] rows
>> between UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1),
>> COUNT($1)])], window#1=[window(partition {0} order by [] range between
>> UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [SUM($1), COUNT($1)])])
>>        LogicalProject(productid=[$1], units=[$2])
>>          LogicalTableScan(table=[[s, orders]])
>> 
>> 2016-10-18 20:23 GMT+03:00 Γιώργος Θεοδωράκης <gi...@gmail.com>:
>> 
>>> Hi,
>>> 
>>> I was wondering if there is any possible way to define windows with SQl
>>> in Calcite for queries that don't have an aggregate function? For example,
>>> I want to define the queries from Linear Road Benchmark of the STREAM
>>> project (http://infolab.stanford.edu/stream/cql-benchmark.html):
>>> 
>>> 1)
>>> SELECT DISTINCT car_id
>>> FROM CarSegStr [RANGE 30 SECONDS];
>>> 
>>> 2)
>>> SELECT car_id, exp_way, dir, seg
>>> FROM CarSegStr [PARTITION BY car_id ROWS 1], CurActiveCars
>>> WHERE CarSegStr.car_id = CurActiveCars.car_id;
>>> 
>>> Thank you in advance,
>>> George.
>>> 
>> 
>>