You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by ChengXiangLi <gi...@git.apache.org> on 2015/08/24 11:52:34 UTC
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
GitHub user ChengXiangLi opened a pull request:
https://github.com/apache/flink/pull/1047
[FLINK-2564] Failing Test: RandomSamplerTest
As discussed in [here](https://github.com/apache/flink/pull/949#issuecomment-133990256), it's a balance between accurate verification and failure-positive result. As this failure and KS test failure happens more often than expected, we should expand the verification boundary which would reduce the fail-positive case to an acceptable level.
With this PR, the KS test and fraction verification never fail after tens of thousands executions in my local environment. @tillrohrmann , @chiwanpark .
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ChengXiangLi/flink FLINK-2564
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/1047.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #1047
----
commit ab5c81d708c31591df160b323244df7fe4445dc6
Author: chengxiang li <ch...@intel.com>
Date: 2015-08-24T09:49:44Z
[FLINK-2564] [test] expand the verification boundary for random sampler test.
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134130912
I've tested with increased count of sources and count of samples. It seems okay but I'm not sure which approach is more proper.
https://travis-ci.org/chiwanpark/flink/builds/76923033
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/flink/pull/1047
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134145495
Here is my changes: chiwanpark@0699411
The changes are tested in my local environment about 200 hundred times and also tested in Travis-CI 10 times.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134459358
Merging....
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134517776
Are you still merging @chiwanpark? Doesn't seem that the commit made it to the master.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134531036
@tillrohrmann Sorry for late. I forgot push the commit to apache master. The commit is pushed.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by ChengXiangLi <gi...@git.apache.org>.
Github user ChengXiangLi commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134442744
Merge part of chiwanpark's change, add the execute time for fraction sample from 5 to 20. As changing input size from 10000 to 50000 would increase the unit test costs from 15s to 74s in my local environment, it seems too much for a unit test, so i didn't merge this part.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by ChengXiangLi <gi...@git.apache.org>.
Github user ChengXiangLi commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134138815
@chiwanpark , it may need thousands of times' execution to verify, as the failure possibility is quite low. Besides, how many times do you increased count of sources and count of samples, the tests of reservoir samplers are executed on seconds level, so be careful about the unit test cost as well.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134145892
I think both ways (increasing the number of times we sample as well as softening the accuracy thresholds) will improve the test stability. However, as @ChengXiangLi pointed out, increasing the number of samplings and the sample size, will also increase the unit test costs (even though it should be in the range of fractions of a second). Maybe we can also combine both approaches.
Other than that, LGTM.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---
[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest
Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:
https://github.com/apache/flink/pull/1047#issuecomment-134449682
Looks good to merge.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---