You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@calcite.apache.org by Yoon-Min Nam <yo...@heavy.ai> on 2022/06/03 14:35:32 UTC

Express millisecond / microsecond / nanosecond with INTERVAL keyword

Hi, all.
I have questions regarding the INTERVAL keyword supported by Apache Calcite.
Recently, I noticed that Calcite can represent milliseconds with INTERVAL
keyword like
*INTERVAL '0.001' SECOND* which stands for 1ms.
But AFAIK there is no way to represent a microsecond without using SECOND.
So, the following statement returns an exception:

*INTERVAL 1 MILLISECOND*
SQL Error: Encountered "microsecond" at line 1, column 21.
Was expecting one of:
    "DAY" ...
    "DAYS" ...
    "HOUR" ...
    "HOURS" ...
    "MINUTE" ...
    "MINUTES" ...
    "MONTH" ...
    "MONTHS" ...
    "SECOND" ...
    "SECONDS" ...
    "YEAR" ...
    "YEARS" ...

The same issues occur for both microseconds and nanoseconds.
So, the first question is how can we represent microseconds and nanoseconds
with the INTERVAL keyword.

The second question is about the window framing clause that we can only use
the INTERVAL keyword to define window framing bound when order by key has
one of date/time/timestamp types.
So, the following statement returns an exception:


*OVER (ORDER BY timestamp_ RANGE BETWEEN EXTRACT(MILLISECOND FROM tm9)
PRECEDING AND CURRENT ROW).*SQL Error: From line 1, column 72 to line 1,
column 105: Data Type mismatch between ORDER BY and RANGE clause

When tracking the exception, I guess the following logic causes this
exception:

In SqlTypeFamili.java:

public List<SqlTypeFamily> allowableDifferenceTypes() {
  switch (this) {
  case NUMERIC:
    return ImmutableList.of(NUMERIC);
  case DATE:
  case TIME:
  case TIMESTAMP:
    return ImmutableList.of(INTERVAL_DAY_TIME, INTERVAL_YEAR_MONTH);
  default:
    return ImmutableList.of();
  }
And IIUC this means we can only use INTERVAL statement there and there
is no way to use milliseconds/microseconds/nanoseconds even if the
ordering column has timestamp(9) type.

So the second question is can we relax the requirement to use other
data types in this case?
If so, we can exploit the extracted milliseconds to compute window
frame bounds with the ordering column having timestamp type as:

*OVER (ORDER BY timestamp_ RANGE BETWEEN **EXTRACT(MILLISECOND FROM
tm9)** PRECEDING AND CURRENT ROW)*

Ultimately, it's very useful if we can allow the following:
*OVER (ORDER BY timestamp_ RANGE BETWEEN INTERVAL 1 MILLISECOND PRECEDING
AND INTERVAL 3 MICROSECOND FOLLOWING).*

Any feedback regarding those issues can be very helpful to figure out
the solution we need.

Best,
Yoon-Min Nam