Posted to issues@flink.apache.org by "Dawid Wysakowicz (Jira)" <ji...@apache.org> on 2022/01/20 13:15:00 UTC

[jira] [Comment Edited] (FLINK-25683) wrong result if table transfrom to DataStream then window process in batch mode

    [ https://issues.apache.org/jira/browse/FLINK-25683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17479357#comment-17479357 ] 

Dawid Wysakowicz edited comment on FLINK-25683 at 1/20/22, 1:14 PM:
--------------------------------------------------------------------

I can't explain where the flag comes from, as I don't know this part of the code. However, as Timo explained:

{cite}
that it doesn't make sense to forward other watermarks if there is no time attribute in SQL for it. I wanted to be conservative here and let the user decide whether it makes sense to rely on DataStream API or not. We should not let watermarks flow through the SQL engine without a clear purpose, at least this was the original thought behind it.
{cite}

So the idea is that if the Table API does not expect watermarks from the DataStream API, it cuts off any unexpected watermarks.
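
The cut-off described above can be pictured with a small toy model. This is not Flink code: `Record`, `Watermark`, `from_data_stream`, and the `has_time_attribute` flag are hypothetical names invented for illustration, and the sketch only assumes that the conversion boundary forwards watermarks when the table side has declared a time attribute.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, Union


@dataclass(frozen=True)
class Record:
    name: str
    timestamp: int  # event time in milliseconds


@dataclass(frozen=True)
class Watermark:
    timestamp: int


Element = Union[Record, Watermark]


def from_data_stream(
    elements: Iterable[Element], has_time_attribute: bool
) -> Iterator[Element]:
    """Toy conversion boundary (hypothetical, not a Flink API).

    Records always pass. Watermarks pass only when the table side declared
    a time attribute, i.e. when it actually expects them.
    """
    for element in elements:
        if isinstance(element, Watermark) and not has_time_attribute:
            continue  # unexpected watermark: cut off at the boundary
        yield element


stream = [Record("BOB", 1), Watermark(1), Record("ALICE", 2), Watermark(2)]

# No time attribute declared: only the records survive the conversion.
without_attr = list(from_data_stream(stream, has_time_attribute=False))

# Time attribute declared: records and watermarks are both forwarded.
with_attr = list(from_data_stream(stream, has_time_attribute=True))

print(without_attr)
print(with_attr)
```

In this sketch a downstream operator sees watermarks only in the second case, which is the conservative behaviour the quote argues for.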


> wrong result if table transfrom to DataStream then window process in batch mode
> -------------------------------------------------------------------------------
>
>                 Key: FLINK-25683
>                 URL: https://issues.apache.org/jira/browse/FLINK-25683
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API, Table SQL / Runtime
>    Affects Versions: 1.14.2
>         Environment: mac book pro m1 
> jdk 8 
> scala 2.11
> flink 1.14.2
> idea 2020
>            Reporter: zhangzh
>            Assignee: Yao Zhang
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: TableToDataStreamBatchWindowTest.scala, pom.xml
>
>
> I have 5 rows of data.
> I first need to transform the current data with SQL,
> then combine the current data with historical data that is fetched in batch from HBase.
> For some special reason the program must run in batch mode.
> I think the correct result should look like this:
> (BOB,1)
> (EMA,1)
> (DOUG,1)
> (ALICE,1)
> (CENDI,1)
> but the actual result is:
> (EMA,1)
>  
> If I set a different parallelism, the result is different.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)