You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user-zh@flink.apache.org by Mic <sh...@163.com> on 2020/10/19 07:03:54 UTC
SQL interval join 问题
现有 SQL 语句如下:
create table source1(
id varchar PRIMARY KEY,
a varchar,
proctime AS PROCTIME()
) with (
'connector' = 'kafka'
...
);
create table source2(
id varchar PRIMARY KEY,
a varchar,
proctime AS PROCTIME()
) with (
'connector' = 'kafka'
...
);
select
case
when s1.id is not null then s1.id
else s2.id
end as ids,
s1.a, s2.b
from source1 as s1 full outer join source2 as s2 on s1.id = s2.id where s1.proctime between s2.proctime - INTERVAL '5' SECOND and s2.proctime + INTERVAL '5' SECOND;
最后的 join 语句预期是 如果两个source的消息<id, a>, <id, b>先后到达时间超过 10 秒,则输出<id, a>, <id, b> 两条消息。
目前的观察结果是,如果两条消息<id, a>, <id, b>先后到达时间超过10 秒,输出为:<id, a>, <id, a, b>
为何超过 10 秒后,仍然会输出 <id, a, b> ?
Re: SQL interval join 问题
Posted by Benchao Li <li...@apache.org>.
Hi Mic,
感谢关注这个issue,这个issue当前还在讨论中。
我认为问题已经定位清楚了,抄送了其他的committer同学进一步讨论确认。
Mic <sh...@163.com> 于2020年10月19日周一 下午3:51写道:
> 搜了一下,目前是有一个 issue 看起来相关,https://issues.apache.org/jira/browse/FLINK-18996
> <br/><br/>不知道处理进度如何?
> 在 2020-10-19 15:03:54,"Mic" <sh...@163.com> 写道:
> >现有 SQL 语句如下:
> >create table source1(
> > id varchar PRIMARY KEY,
> > a varchar,
> > proctime AS PROCTIME()
> >) with (
> > 'connector' = 'kafka'
> > ...
> >);
> >create table source2(
> > id varchar PRIMARY KEY,
> > a varchar,
> > proctime AS PROCTIME()
> >) with (
> > 'connector' = 'kafka'
> > ...
> >);
> >select
> > case
> > when s1.id is not null then s1.id
> > else s2.id
> > end as ids,
> > s1.a, s2.b
> >from source1 as s1 full outer join source2 as s2 on s1.id = s2.id where
> s1.proctime between s2.proctime - INTERVAL '5' SECOND and s2.proctime +
> INTERVAL '5' SECOND;
> >
> >
> >最后的 join 语句预期是 如果两个source的消息<id, a>, <id, b>先后到达时间超过 10 秒,则输出<id, a>,
> <id, b> 两条消息。
> >
> >
> >目前的观察结果是,如果两条消息<id, a>, <id, b>先后到达时间超过10 秒,输出为:<id, a>, <id, a, b>
> >为何超过 10 秒后,仍然会输出 <id, a, b> ?
>
--
Best,
Benchao Li
Re:SQL interval join 问题
Posted by Mic <sh...@163.com>.
搜了一下,目前是有一个 issue 看起来相关,https://issues.apache.org/jira/browse/FLINK-18996<br/><br/>不知道处理进度如何?
在 2020-10-19 15:03:54,"Mic" <sh...@163.com> 写道:
>现有 SQL 语句如下:
>create table source1(
> id varchar PRIMARY KEY,
> a varchar,
> proctime AS PROCTIME()
>) with (
> 'connector' = 'kafka'
> ...
>);
>create table source2(
> id varchar PRIMARY KEY,
> a varchar,
> proctime AS PROCTIME()
>) with (
> 'connector' = 'kafka'
> ...
>);
>select
> case
> when s1.id is not null then s1.id
> else s2.id
> end as ids,
> s1.a, s2.b
>from source1 as s1 full outer join source2 as s2 on s1.id = s2.id where s1.proctime between s2.proctime - INTERVAL '5' SECOND and s2.proctime + INTERVAL '5' SECOND;
>
>
>最后的 join 语句预期是 如果两个source的消息<id, a>, <id, b>先后到达时间超过 10 秒,则输出<id, a>, <id, b> 两条消息。
>
>
>目前的观察结果是,如果两条消息<id, a>, <id, b>先后到达时间超过10 秒,输出为:<id, a>, <id, a, b>
>为何超过 10 秒后,仍然会输出 <id, a, b> ?