Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/02 18:11:33 UTC

[GitHub] [iceberg] szehon-ho opened a new issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

szehon-ho opened a new issue #4031:
URL: https://github.com/apache/iceberg/issues/4031


   https://github.com/apache/iceberg/blob/master/spark/v3.2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestUpdate.java#L222
   
   Example failure: https://github.com/apache/iceberg/runs/5030269431?check_suite_focus=true
   
   It seems the hash function is not deterministic, and with low probability two rows sometimes hash into the same file.
   
   I'm not aware of a deterministic way to test this. Could we just assert that deletedFiles == addedDataFiles and that this value is <= 3, instead of exactly 3? Open to suggestions.
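
The relaxation proposed above could be written roughly as follows. This is a minimal sketch, not the change that was eventually merged: `RelaxedFileCountCheck` and `checkCopyOnWriteCounts` are hypothetical names rather than Iceberg test utilities, and the counts are passed in as plain integers instead of being read from a snapshot summary. Whether the two counts stay equal when several source files are written by a single task is not settled in this thread, so the equality check is an assumption taken directly from the proposal.

```java
// Hypothetical helper illustrating the relaxed assertion proposed in the issue.
// It is not part of Iceberg; names and signature are assumptions.
final class RelaxedFileCountCheck {

  // Passes when the number of deleted files equals the number of added files
  // and that shared value is between 1 and maxFiles (inclusive).
  public static void checkCopyOnWriteCounts(int deletedFiles, int addedFiles, int maxFiles) {
    if (deletedFiles != addedFiles) {
      throw new AssertionError(
          "deleted (" + deletedFiles + ") and added (" + addedFiles + ") file counts differ");
    }
    if (addedFiles < 1 || addedFiles > maxFiles) {
      throw new AssertionError(
          "expected between 1 and " + maxFiles + " files but got " + addedFiles);
    }
  }

  public static void main(String[] args) {
    // Exactly 3 files still passes, and so does a run that produced fewer files.
    checkCopyOnWriteCounts(3, 3, 3);
    checkCopyOnWriteCounts(2, 2, 3);
    System.out.println("relaxed checks passed");
  }
}
```

The trade-off of such a check is that it no longer detects a regression that silently merges output files, which may be why a more reliable fix was preferred later in the thread.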
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org


[GitHub] [iceberg] rdblue commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1032889318


   Got it. So the problem is that Spark's sampling isn't deterministic. That makes sense to me.




[GitHub] [iceberg] aokolnychyi commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1033346700


   Resolving as it was fixed in PR #4033.




[GitHub] [iceberg] rdblue commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1030905935


   What hash function is not deterministic? That seems odd to me. Are we generating random data or something?




[GitHub] [iceberg] aokolnychyi commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1032886531


   I commented on the PR.




[GitHub] [iceberg] aokolnychyi edited a comment on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
aokolnychyi edited a comment on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1032851191


   @szehon-ho @rdblue, I think the problem is that we request a hash distribution by `_file` and sometimes records for multiple files end up in a single output task. That leads to an inconsistent number of output files. It used to happen frequently when we had only 4 shuffle partitions but then I changed it to 200. Looks like this is still not enough and we should come up with a more reliable fix.
   
   ```
   // set the num of shuffle partitions to 200 instead of default 4 to reduce the chance of hashing
   // records for multiple source files to one writing task (needed for a predictable num of output files)
   withSQLConf(ImmutableMap.of(SQLConf.SHUFFLE_PARTITIONS().key(), "200"), () -> {
     sql("UPDATE %s SET id = -1", tableName);
   });
   ```
   
   Let me see the PR.
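
To see why 200 shuffle partitions reduces the failures without eliminating them, the chance of a collision can be estimated with a birthday-problem calculation. The sketch below assumes 3 source files (inferred from the "exactly 3" expectation in the issue) and assumes each file's records are assigned to a shuffle partition uniformly and independently, which is an idealization of Spark's hash partitioning rather than a description of it. `ShuffleCollisionEstimate` and `collisionProbability` are hypothetical names introduced for illustration.

```java
// Illustrative estimate only; assumes uniform, independent assignment of
// each source file to a shuffle partition (an idealization).
final class ShuffleCollisionEstimate {

  // Probability that at least two of `files` keys land in the same one of
  // `partitions` buckets: 1 - (p/p) * ((p-1)/p) * ... * ((p-files+1)/p).
  public static double collisionProbability(int files, int partitions) {
    if (files > partitions) {
      return 1.0; // pigeonhole: some partition must receive two files
    }
    double allDistinct = 1.0;
    for (int i = 0; i < files; i++) {
      allDistinct *= (double) (partitions - i) / partitions;
    }
    return 1.0 - allDistinct;
  }

  public static void main(String[] args) {
    System.out.println("4 partitions:   " + collisionProbability(3, 4));
    System.out.println("200 partitions: " + collisionProbability(3, 200));
  }
}
```

Under these assumptions the estimate is 62.5% with 4 partitions and roughly 1.5% with 200, which would be consistent with a test that used to fail frequently and now fails only occasionally.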




[GitHub] [iceberg] aokolnychyi closed issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
aokolnychyi closed issue #4031:
URL: https://github.com/apache/iceberg/issues/4031


   




[GitHub] [iceberg] szehon-ho commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky

Posted by GitBox <gi...@apache.org>.
szehon-ho commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1031785164


   Yeah, I meant there seems to be some randomness in how the Spark shuffle assigns rows to the same or different Spark partitions. I'll investigate when I have some time.

