Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/11/15 08:55:59 UTC

[GitHub] [flink] godfreyhe commented on pull request #20745: [FLINK-28988] Don't push above filters down into the right table for temporal join

godfreyhe commented on PR #20745:
URL: https://github.com/apache/flink/pull/20745#issuecomment-1314988767

   > > Hi @lincoln-lil, I have updated the PR accordingly, please have a look. BTW, is there a principle that filters cannot be pushed down into a temporal table?
   > 
   > There are no specific rules, but we should always preserve the correct semantics, and here, filters that corrupt the versioned table must definitely be avoided.
   > 
   > While it seems semantically safe to push down a filter that references only the left table (since it is not pushed all the way down to the watermark node and so does not break watermark generation), there is another important factor to consider. If the left stream is an upsert source, e.g., upsert-kafka, and no filter is pushed down while the upsert key and join key are consistent, the final execution plan may be optimized to upsert mode instead of the more expensive retract mode. Once this partial pushdown takes effect, however, the pushed-down filter degrades upsert mode to retract mode: the corresponding upsert-kafka source adds an expensive materialization node, ChangelogNormalize, to preserve correctness (see [FLINK-9528](https://issues.apache.org/jira/browse/FLINK-9528)). It is hard to tell which of these two choices is better in practice. (Also, if we tried to de-optimize in this bad pushdown case, the filter pull-up would become more complicated.)
   > 
   > So, taking all factors together, I prefer to keep the solution simple, i.e., not pushing any filter down into an event-time temporal join at all. WDYT?
   
   +1 for the simple solution
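
For readers following the thread, here is a minimal sketch of why a filter on right-table columns can corrupt the versioned table when it is pushed below an event-time temporal join. It is a plain-Python simulation, not Flink code; the `Rate` and `Order` types, the sample data, and the `rate > 1.0` predicate are all assumptions made up for illustration.

```python
from typing import NamedTuple


class Rate(NamedTuple):
    """One version of a row in the (hypothetical) versioned right table."""
    currency: str
    rate: float
    ts: int  # event time at which this version becomes valid


class Order(NamedTuple):
    """One row of the (hypothetical) left stream."""
    order_id: int
    currency: str
    ts: int  # event time of the order


def version_as_of(versions, currency, ts):
    """Return the latest version for `currency` with timestamp <= ts, or None."""
    candidates = [v for v in versions if v.currency == currency and v.ts <= ts]
    return max(candidates, key=lambda v: v.ts) if candidates else None


def temporal_join(orders, versions):
    """Inner event-time temporal join: each order sees the version valid at its time."""
    out = []
    for o in orders:
        v = version_as_of(versions, o.currency, o.ts)
        if v is not None:
            out.append((o.order_id, v.rate))
    return out


def keep(rate):
    """Illustrative filter on a right-table column."""
    return rate > 1.0


rates = [Rate("EUR", 1.10, 1), Rate("EUR", 0.90, 5)]
orders = [Order(1, "EUR", 3), Order(2, "EUR", 6)]

# Filter evaluated above the join: order 2 sees the 0.90 version and is dropped.
filter_after_join = [(oid, r) for oid, r in temporal_join(orders, rates) if keep(r)]

# Filter pushed into the right table: the 0.90 version disappears from the
# version history, so order 2 wrongly joins with the stale 1.10 version.
filter_pushed_down = temporal_join(orders, [v for v in rates if keep(v.rate)])

print(filter_after_join)
print(filter_pushed_down)
```

The two plans disagree on order 2, which is the kind of semantic change the PR avoids by not pushing such filters into the right side.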

