Posted to issues@iceberg.apache.org by GitBox <gi...@apache.org> on 2022/02/02 18:11:33 UTC
[GitHub] [iceberg] szehon-ho opened a new issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
szehon-ho opened a new issue #4031:
URL: https://github.com/apache/iceberg/issues/4031
https://github.com/apache/iceberg/blob/master/spark/v3.2/spark-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestUpdate.java#L222
Example failure: https://github.com/apache/iceberg/runs/5030269431?check_suite_focus=true
It seems the hash function is not deterministic, and with low probability two rows sometimes hash into the same file.
I'm not aware of a deterministic way to test this. Could we just assert that deletedFiles == addedDataFiles and that this value is <= 3, instead of exactly 3? Open to suggestions.
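For illustration only, the relaxed assertion proposed above might look like the sketch below. It is not the actual test code: the class and method names are hypothetical, and the summary keys `added-data-files` and `deleted-data-files` are assumed to be the ones the snapshot summary uses. It mirrors the wording of the proposal and does not establish whether the two counts really stay equal when source files collide.

```java
import java.util.Map;

// Hypothetical helper, not part of Iceberg: checks the relaxed condition from the proposal.
final class RelaxedFileCountCheck {
  private RelaxedFileCountCheck() {}

  // summary: snapshot summary as string key/value pairs (key names are an assumption)
  // maxFiles: upper bound on the file count (3 in the proposal)
  public static void check(Map<String, String> summary, int maxFiles) {
    int added = Integer.parseInt(summary.getOrDefault("added-data-files", "0"));
    int deleted = Integer.parseInt(summary.getOrDefault("deleted-data-files", "0"));

    // the proposal keeps the equality check between deleted and added files
    if (added != deleted) {
      throw new AssertionError("added=" + added + " but deleted=" + deleted);
    }

    // ...and replaces "exactly 3" with an upper bound
    if (added > maxFiles) {
      throw new AssertionError("expected at most " + maxFiles + " files, got " + added);
    }
  }
}
```

The trade-off is that an upper bound tolerates the flaky case but also accepts results that the exact check would have flagged.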
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@iceberg.apache.org
For additional commands, e-mail: issues-help@iceberg.apache.org
[GitHub] [iceberg] rdblue commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1032889318
Got it. So the problem is that Spark's sampling isn't deterministic. That makes sense to me.
[GitHub] [iceberg] aokolnychyi commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1033346700
Resolving as it was fixed in PR #4033.
[GitHub] [iceberg] rdblue commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
rdblue commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1030905935
What hash function is not deterministic? That seems odd to me. Are we generating random data or something?
[GitHub] [iceberg] aokolnychyi commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
aokolnychyi commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1032886531
I commented on the PR.
[GitHub] [iceberg] aokolnychyi edited a comment on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
aokolnychyi edited a comment on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1032851191
@szehon-ho @rdblue, I think the problem is that we request a hash distribution by `_file` and sometimes records for multiple files end up in a single output task. That leads to an inconsistent number of output files. It used to happen frequently when we had only 4 shuffle partitions but then I changed it to 200. Looks like this is still not enough and we should come up with a more reliable fix.
```
// set the num of shuffle partitions to 200 instead of default 4 to reduce the chance of hashing
// records for multiple source files to one writing task (needed for a predictable num of output files)
withSQLConf(ImmutableMap.of(SQLConf.SHUFFLE_PARTITIONS().key(), "200"), () -> {
  sql("UPDATE %s SET id = -1", tableName);
});
```
Let me see the PR.
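As a rough back-of-the-envelope check on why 200 shuffle partitions reduces but does not eliminate the flakiness, the sketch below computes the chance that at least two source files land in the same shuffle partition. It assumes three source files (taken from the expected count of 3 in the issue) and that each `_file` value behaves like an independent, uniformly distributed key. That is a simplification, not a description of Spark's actual partitioner, and the class name is hypothetical.

```java
// Hypothetical helper: birthday-problem estimate for file-to-partition collisions.
final class ShuffleCollisionOdds {
  private ShuffleCollisionOdds() {}

  // Probability that at least two of `sourceFiles` keys fall into the same one of
  // `partitions` buckets, assuming uniform and independent assignment.
  public static double collisionProbability(int sourceFiles, int partitions) {
    if (sourceFiles > partitions) {
      return 1.0; // pigeonhole: a collision is unavoidable
    }
    double allDistinct = 1.0;
    for (int i = 0; i < sourceFiles; i++) {
      // the i-th key must avoid the i buckets that are already taken
      allDistinct *= (double) (partitions - i) / partitions;
    }
    return 1.0 - allDistinct;
  }

  public static void main(String[] args) {
    System.out.println("4 partitions:   " + collisionProbability(3, 4));
    System.out.println("200 partitions: " + collisionProbability(3, 200));
  }
}
```

Under these assumptions the collision chance is 62.5% with 4 partitions and roughly 1.5% with 200, which would be consistent with a test that used to fail frequently and now fails only occasionally.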
[GitHub] [iceberg] aokolnychyi closed issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
aokolnychyi closed issue #4031:
URL: https://github.com/apache/iceberg/issues/4031
[GitHub] [iceberg] szehon-ho commented on issue #4031: TestCopyOnWriteUpdate::testUpdateWithoutCondition is flaky
Posted by GitBox <gi...@apache.org>.
szehon-ho commented on issue #4031:
URL: https://github.com/apache/iceberg/issues/4031#issuecomment-1031785164
Yeah, I meant there seems to be some randomness in how the Spark shuffle assigns rows to the same or different Spark partitions. I'll have to investigate if I have some time.