You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by ChengXiangLi <gi...@git.apache.org> on 2015/08/24 11:52:34 UTC

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

GitHub user ChengXiangLi opened a pull request:

    https://github.com/apache/flink/pull/1047

    [FLINK-2564] Failing Test: RandomSamplerTest

    As discussed in [here](https://github.com/apache/flink/pull/949#issuecomment-133990256), it's a balance between accurate verification and failure-positive result. As this failure and KS test failure happens more often than expected, we should expand the verification boundary which would reduce the fail-positive case to an acceptable level.
    With this PR, the KS test and fraction verification never fail after tens of thousands executions in my local environment. @tillrohrmann , @chiwanpark .

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ChengXiangLi/flink FLINK-2564

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/flink/pull/1047.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1047
    
----
commit ab5c81d708c31591df160b323244df7fe4445dc6
Author: chengxiang li <ch...@intel.com>
Date:   2015-08-24T09:49:44Z

    [FLINK-2564] [test] expand the verification boundary for random sampler test.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134130912
  
    I've tested with increased count of sources and count of samples. It seems okay but I'm not sure which approach is more proper.
    
    https://travis-ci.org/chiwanpark/flink/builds/76923033


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/flink/pull/1047


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134145495
  
    Here is my changes: chiwanpark@0699411
    
    The changes are tested in my local environment about 200 hundred times and also tested in Travis-CI 10 times.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134459358
  
    Merging....


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134517776
  
    Are you still merging @chiwanpark? Doesn't seem that the commit made it to the master.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134531036
  
    @tillrohrmann Sorry for late. I forgot push the commit to apache master. The commit is pushed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by ChengXiangLi <gi...@git.apache.org>.
Github user ChengXiangLi commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134442744
  
    Merge part of chiwanpark's change, add the execute time for fraction sample from 5 to 20. As changing input size from 10000 to 50000 would increase the unit test costs from 15s to 74s in my local environment, it seems too much for a unit test, so i didn't merge this part.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by ChengXiangLi <gi...@git.apache.org>.
Github user ChengXiangLi commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134138815
  
    @chiwanpark , it may need thousands of times' execution to verify, as the failure possibility is quite low. Besides, how many times do you increased count of sources and count of samples, the tests of reservoir samplers are executed on seconds level, so be careful about the unit test cost as well. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by tillrohrmann <gi...@git.apache.org>.
Github user tillrohrmann commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134145892
  
    I think both ways (increasing the number of times we sample as well as softening the accuracy thresholds) will improve the test stability. However, as @ChengXiangLi pointed out, increasing the number of samplings and the sample size, will also increase the unit test costs (even though it should be in the range of fractions of a second). Maybe we can also combine both approaches.
    
    Other than that, LGTM. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] flink pull request: [FLINK-2564] Failing Test: RandomSamplerTest

Posted by chiwanpark <gi...@git.apache.org>.
Github user chiwanpark commented on the pull request:

    https://github.com/apache/flink/pull/1047#issuecomment-134449682
  
    Looks good to merge.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---