You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by "hililiwei (via GitHub)" <gi...@apache.org> on 2023/03/31 10:34:25 UTC

[GitHub] [iceberg] hililiwei opened a new pull request, #7254: Flink: Introduce Flink 1.17

hililiwei opened a new pull request, #7254:
URL: https://github.com/apache/iceberg/pull/7254

   There are 4 commits, please do not use `Squash`.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu merged pull request #7254: Flink: Introduce Flink 1.17

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu merged PR #7254:
URL: https://github.com/apache/iceberg/pull/7254


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#issuecomment-1491709587

   I'm not sure why approval is needed to execute UT. @stevenzwu Could you please help to review it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1154700876


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,7 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN (1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), lastScanEvent.filter());
+    Assert.assertNull("NULL In the not-in statement results in empty result", lastScanEvent);

Review Comment:
   the message doesn't match the assertion. the message seems applicable to the assertion line above



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =

Review Comment:
   add a comment that `%%` won't match the row with null value



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   why is this query match zero rows now?



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";

Review Comment:
   we should move this to a separate test method



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1156786134


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";

Review Comment:
   `testFilterPushDownLike()`  is used to test the `like filter` that can be pushed down, and we seem to have migrated it here from `testFilterNotPushDownLike()` just fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1156784659


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =

Review Comment:
   done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1156783455


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   `%%i` is equivalent to `startWith('%i')`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159288642


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,9 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN (1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), lastScanEvent.filter());
+    Assert.assertNull(
+        "As the filtering is pushed down, Flink does not generate data, so there is no ScanEvent",

Review Comment:
   Flink skipped scanning the source table, did not create `DataStream`, returned an empty result, and finished the job. So the iceberg scanning plan is not created here.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159231201


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   > I thought `%%i` meant a string ending with char `i`. that is why it shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is still sth I am not understanding here.
   
   What i say `%%i is equivalent to startWith('%i')`  means that the filter `data LIKE '%%i'` pushed down by flink is `like(data, '%i')`, then `FlinkFilters(org.apache.iceberg.flink.FlinkFilters)` will parse it to `startWith('%i')`. Because `StartWith`  literal cannot start with `%`, `FlinkFilters` has not successfully pushed it down.
   That's how `FlinkFilters` handles this filter in detail.
   
   https://github.com/apache/iceberg/blob/7c61537194f9ab68e3d83a16b339ec47180f5f10/flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/FlinkFilters.java#L60-L63
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1161099428


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   Sorry for the misunderstanding. Comments have been added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1160435338


##########
.github/workflows/flink-ci.yml:
##########
@@ -62,7 +62,7 @@ jobs:
     strategy:
       matrix:
         jvm: [8, 11]

Review Comment:
   Both Java 8 and 11 are supported (though 8 has been deprecated).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159314226


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   In previous versions, Flink did not push down the `IN ()` to the source tableļ¼ˆ `@Override
     public Result applyFilters(List<ResolvedExpression> flinkFilters) `) if it contained `NULL`.  This time it removes the `NULL` and pushes the rest down.
   
   I'm not sure whether this change was done by Flink itself or because it upgraded the version of Calcite to to 1.29.0. Let me look into it.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] 911432 commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "911432 (via GitHub)" <gi...@apache.org>.
911432 commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1160389849


##########
.github/workflows/flink-ci.yml:
##########
@@ -62,7 +62,7 @@ jobs:
     strategy:
       matrix:
         jvm: [8, 11]

Review Comment:
   I don't know if Java 8 needs to be uninstalled.
   This is because, according to the Flink 1.17 documentation, Flink 1.17 requires Java 11, not Java 8.
   https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/try-flink/local_installation/
   
   However, looking at the Dockerfile it seems Java 8 is still supported.
   https://github.com/apache/flink-docker/tree/master/1.17
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159288642


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,9 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN (1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), lastScanEvent.filter());
+    Assert.assertNull(
+        "As the filtering is pushed down, Flink does not generate data, so there is no ScanEvent",

Review Comment:
   Flink skipped scanning the source table, did not create `DataStream`, returned an empty result, and finished the job.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on pull request #7254: Flink: Introduce Flink 1.17

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#issuecomment-1500909672

   thanks @hililiwei for the contribution


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159231201


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   > I thought `%%i` meant a string ending with char `i`. that is why it shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is still sth I am not understanding here.
   
   What i say `%%i is equivalent to startWith('%i')`  means that the filter `data LIKE '%%i'` pushed down by flink is `like(data, '%i')`, then `FlinkFilters(org.apache.iceberg.flink.FlinkFilters)` will parse it to `startWith('%i')`. Because the filter condition cannot start with `%`, `FlinkFilters` has not successfully pushed it down.
   That's how `FlinkFilters` handles this filter in detail.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159231201


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   > I thought `%%i` meant a string ending with char `i`. that is why it shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is still sth I am not understanding here.
   
   What i say `% %i is equivalent to startWith('%i')`  means that the filter `data LIKE '%%i'` pushed down by flink is `like(data, '%i')`, then `FlinkFilters(org.apache.iceberg.flink.FlinkFilters)` will parse it to `startWith('%i')`. Because the filter condition cannot start with `%`, `FlinkFilters` has not successfully pushed it down.
   That's how `FlinkFilters` handles this filter in detail.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1160839862


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   I am not saying the behavior is wrong. maybe just add a comment in the code to explain it a bit. E.g. your note above is perfect. In SQL, null check can only be done as `IS NULL` or `IS NOT NULL`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1154314699


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   Currently Flink 1.17 can push this down.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1157537021


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   it is a behavior change? 
   
   I thought `%%i` meant a string ending with char `i`. that is why it shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is still sth I am not understanding here.



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,9 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN (1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), lastScanEvent.filter());
+    Assert.assertNull(
+        "As the filtering is pushed down, Flink does not generate data, so there is no ScanEvent",

Review Comment:
   nit: "As the predicate pushdown filter out all rows, Iceberg scan planning doesn't publish any ScanEvent`.
   
   On the other hand, is this correct? shouldn't `ScanEvent` always be notified since there is a table scan planning?
   ```
   /** Event sent to listeners when a table scan is planned. */
   public final class ScanEvent {
   ```



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";

Review Comment:
   sounds good to me, as it seems to match the existing style. originally I was thinking each test method covers one scenario. In this case, we can have `testFilterPushDownLikeMatchingPrefix` and `testFilterPushDownLikeMatchingAnyChars`



##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   you are saying Flink skips the `null` for predicate pushdown? if yes, please add it the comment. it will also be great we can link to a Flink jira (if there is) for future improvement on pushdown with `null` value.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159206389


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   No, nothing has changed here. I just fixed the clerical error. The Assert Message is incorrect.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#issuecomment-1499943292

   Rebase this pr because #7269 was merged.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1156784837


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,7 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN (1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), lastScanEvent.filter());
+    Assert.assertNull("NULL In the not-in statement results in empty result", lastScanEvent);

Review Comment:
   changed to "As the filtering is pushed down, Flink does not generate data, so there is no ScanEvent"



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1156786134


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";

Review Comment:
   `testFilterPushDownLike()`  is used to test the `like filter` that can be pushed down, and it seems ok to migrate it here from `testFilterNotPushDownLike()`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159231201


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   > I thought `%%i` meant a string ending with char `i`. that is why it shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is still sth I am not understanding here.
   
   What i say `%%i is equivalent to startWith('%i')`  means that the filter `data LIKE '%%i'` pushed down by flink is `like(data, '%i')`, then `FlinkFilters(org.apache.iceberg.flink.FlinkFilters)` will parse it to `startWith('%i')`. Because `StartWith`  literal cannot start with `%`, `FlinkFilters` has not successfully pushed it down.
   That's how `FlinkFilters` handles this filter in detail.
   
   https://github.com/apache/iceberg/blob/7c61537194f9ab68e3d83a16b339ec47180f5f10/flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/FlinkFilters.java#L47
   
   https://github.com/apache/iceberg/blob/7c61537194f9ab68e3d83a16b339ec47180f5f10/flink/v1.16/flink/src/main/java/org/apache/iceberg/flink/FlinkFilters.java#L60-L63
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1154314474


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";

Review Comment:
   Currently Flink 1.17 can push this down.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159231201


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   > I thought `%%i` meant a string ending with char `i`. that is why it shouldn't match any rows. why does it match the row `iceberg` in 1.16. There is still sth I am not understanding here.
   
   What i say `%%i is equivalent to startWith('%i')`  means that the filter `data LIKE '%%i'` pushed down by flink is `like(data, '%i')`, then `FlinkFilters(org.apache.iceberg.flink.FlinkFilters)` will parse it to `startWith('%i')`. Because `StartWith`  literal cannot start with `%`, `FlinkFilters` has not successfully pushed it down.
   That's how `FlinkFilters` handles this filter in detail.
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] stevenzwu commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "stevenzwu (via GitHub)" <gi...@apache.org>.
stevenzwu commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1159950419


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -542,14 +542,24 @@ public void testFilterPushDownLike() {
     Assert.assertEquals("Should create only one scan", 1, scanEventCount);
     Assert.assertEquals(
         "Should contain the push down filter", expectedFilter, lastScanEvent.filter().toString());
+
+    sqlLike = "SELECT * FROM  " + TABLE_NAME + "  WHERE data LIKE '%%' ";
+    resultLike = sql(sqlLike);
+    Assert.assertEquals("Should have 2 records", 2, resultLike.size());
+    List<Row> expectedRecords =
+        Lists.newArrayList(Row.of(1, "iceberg", 10.0), Row.of(2, "b", 20.0));
+    assertSameElements(expectedRecords, resultLike);
+    String expectedScan = "not_null(ref(name=\"data\"))";
+    Assert.assertEquals(
+        "Should contain the push down filter", expectedScan, lastScanEvent.filter().toString());
   }
 
   @Test
   public void testFilterNotPushDownLike() {
     Row expectRecord = Row.of(1, "iceberg", 10.0);
     String sqlNoPushDown = "SELECT * FROM " + TABLE_NAME + " WHERE data LIKE '%%i' ";
     List<Row> resultLike = sql(sqlNoPushDown);
-    Assert.assertEquals("Should have 1 record", 0, resultLike.size());
+    Assert.assertEquals("Should have 0 record", 0, resultLike.size());

Review Comment:
   oh. didn't realize it is just a fix for the error msg. somehow I thought the assertion also changed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1160371074


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -445,8 +446,9 @@ public void testFilterPushDownNotInNull() {
     String sqlNotInNull = String.format("SELECT * FROM %s WHERE id NOT IN (1,2,NULL) ", TABLE_NAME);
     List<Row> resultGT = sql(sqlNotInNull);
     Assert.assertEquals("Should have 0 record", 0, resultGT.size());
-    Assert.assertEquals(
-        "Should not push down a filter", Expressions.alwaysTrue(), lastScanEvent.filter());
+    Assert.assertNull(
+        "As the filtering is pushed down, Flink does not generate data, so there is no ScanEvent",

Review Comment:
   changed to "As the predicate pushdown filter out all rows, Flink did not create scan plan, so it doesn't publish any ScanEvent." 
   Is it OK to write it this way? @stevenzwu 
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] hililiwei commented on a diff in pull request #7254: Flink: Introduce Flink 1.17

Posted by "hililiwei (via GitHub)" <gi...@apache.org>.
hililiwei commented on code in PR #7254:
URL: https://github.com/apache/iceberg/pull/7254#discussion_r1160433815


##########
flink/v1.17/flink/src/test/java/org/apache/iceberg/flink/source/TestFlinkTableSource.java:
##########
@@ -422,8 +422,9 @@ public void testFilterPushDownInNull() {
     Assert.assertEquals("Should have 1 record", 1, result.size());
     Assert.assertEquals(
         "Should produce the expected record", Row.of(1, "iceberg", 10.0), result.get(0));
+    String expectedScan = "ref(name=\"data\") == \"iceberg\"";

Review Comment:
   > you are saying Flink skips the `null` for predicate pushdown? if yes, please add it the comment. it will also be great we can link to a Flink jira (if there is) for future improvement on pushdown with `null` value.
   
   As a side note, NULL has no meaning in `in ()`, and it cannot query data which is  NULL, it is correct to ignore it here. So we don't have to push it down.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org