You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/05/05 02:42:09 UTC

[GitHub] [iceberg] kbendick opened a new pull request, #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

kbendick opened a new pull request, #4697:
URL: https://github.com/apache/iceberg/pull/4697

   Some tests in Flink's `TestFlinkTableSource` have been found to be order dependent.
   
   As the order of results on read is not defined (without an order by clause), most of the tests were fixed in https://github.com/apache/iceberg/commit/c726b2222f491aac02f0f1910898b4a8dfda2b14
   
   However, there is one test that has a `LIMIT 1` that then asserts on the row, which could be any of 3 possible rows.
   
   As Flink happened to be returning the records in a stable manner prior to Flink 1.15, this test was passing. But in PR https://github.com/apache/iceberg/pull/4693, this test was discovered to occasionally fail, as `SELECT * .... LIMIT 1` on a table with 3 results has no guarantee of which record will be returned.
   
   Failed execution output:
   ```
   org.apache.iceberg.flink.TestFlinkTableSource > testLimitPushDown FAILED
       java.lang.AssertionError: Should produce the expected records expected:<+I[1, iceberg, 10.0]> but was:<+I[2, b, 20.0]>
           at org.junit.Assert.fail(Assert.java:89)
           at org.junit.Assert.failNotEquals(Assert.java:835)
           at org.junit.Assert.assertEquals(Assert.java:120)
           at org.apache.iceberg.flink.TestFlinkTableSource.testLimitPushDown(TestFlinkTableSource.java:107)
   ```
   
   I've updated the test to assert that there is only one record, and that it is one of the 3 records in the table (which is assured by the previous query, which issues `SELECT * .. LIMIT 4` with no filter and gets 3 records back).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer merged pull request #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

Posted by GitBox <gi...@apache.org>.
RussellSpitzer merged PR #4697:
URL: https://github.com/apache/iceberg/pull/4697


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on pull request #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

Posted by GitBox <gi...@apache.org>.
kbendick commented on PR #4697:
URL: https://github.com/apache/iceberg/pull/4697#issuecomment-1120073714

   > I think the check on result size is redundant, but other than that looks good.
   
   Agreed. I left it in to make the changes as minimal as possible. I can remove it though.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on a diff in pull request #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on code in PR #4697:
URL: https://github.com/apache/iceberg/pull/4697#discussion_r867254031


##########
flink/v1.15/flink/src/test/java/org/apache/iceberg/flink/TestFlinkTableSource.java:
##########
@@ -121,6 +115,16 @@ public void testLimitPushDown() {
     );
     assertSameElements(expectedList, resultExceed);
 
+    String querySql = String.format("SELECT * FROM %s LIMIT 1", TABLE_NAME);
+    String explain = getTableEnv().explainSql(querySql);
+    String expectedExplain = "limit=[1]";
+    Assert.assertTrue("Explain should contain LimitPushDown", explain.contains(expectedExplain));
+    List<Row> result = sql(querySql);
+    Assert.assertEquals("Should have 1 record", 1, result.size());
+    Assertions.assertThat(result)
+        .containsAnyElementsOf(expectedList)
+        .size().isEqualTo(1);

Review Comment:
   Not sure why we need the .size().isEqualTo(1); here when we checked that result.size() == 1 immediately before



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] RussellSpitzer commented on pull request #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

Posted by GitBox <gi...@apache.org>.
RussellSpitzer commented on PR #4697:
URL: https://github.com/apache/iceberg/pull/4697#issuecomment-1120078934

   Thanks @kbendick for the patch!
   Also thanks to @stevenzwu and @hililiwei for the reviews!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on a diff in pull request #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

Posted by GitBox <gi...@apache.org>.
kbendick commented on code in PR #4697:
URL: https://github.com/apache/iceberg/pull/4697#discussion_r865522274


##########
flink/v1.15/flink/src/test/java/org/apache/iceberg/flink/TestFlinkTableSource.java:
##########
@@ -121,6 +115,16 @@ public void testLimitPushDown() {
     );
     assertSameElements(expectedList, resultExceed);
 
+    String querySql = String.format("SELECT * FROM %s LIMIT 1", TABLE_NAME);
+    String explain = getTableEnv().explainSql(querySql);
+    String expectedExplain = "limit=[1]";
+    Assert.assertTrue("Explain should contain LimitPushDown", explain.contains(expectedExplain));
+    List<Row> result = sql(querySql);
+    Assert.assertEquals("Should have 1 record", 1, result.size());
+    Assertions.assertThat(result)
+        .containsAnyElementsOf(expectedList)
+        .size().isEqualTo(1);

Review Comment:
   For people who cherry-pick, I'm not sure if this reordering is the best way to go about this.
   
   I needed to use `expectedList`, so I just moved the test to be below the test that declares all 3 expected records of that table. Given that there are already multiple tests in this single test function, I think it should be fine.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] kbendick commented on pull request #4697: Flink 1.15 - Fix flaky test that has implicit order dependency in TestFlinkTableSource

Posted by GitBox <gi...@apache.org>.
kbendick commented on PR #4697:
URL: https://github.com/apache/iceberg/pull/4697#issuecomment-1118112986

   cc @rdblue @openinx @stevenzwu @hililiwei @yittg this test is flaky (and technically always has been - but previously Flink was returning the records in a defined ordering).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org