Posted to issues@flink.apache.org by "Chengxiang Li (JIRA)" <ji...@apache.org> on 2015/09/09 05:29:45 UTC

[jira] [Updated] (FLINK-2533) Gap based random sample optimization

     [ https://issues.apache.org/jira/browse/FLINK-2533?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Chengxiang Li updated FLINK-2533:
---------------------------------
    Assignee: GaoLun

> Gap based random sample optimization
> ------------------------------------
>
>                 Key: FLINK-2533
>                 URL: https://issues.apache.org/jira/browse/FLINK-2533
>             Project: Flink
>          Issue Type: Improvement
>          Components: Core
>            Reporter: Chengxiang Li
>            Assignee: GaoLun
>            Priority: Minor
>
> For fraction-based random samplers such as BernoulliSampler and PoissonSampler, a gap-based random sampler could use an O(k) sampling implementation instead of the previous O(n) implementation, so it should perform better when the sample fraction is very small. [This blog|http://erikerlandson.github.io/blog/2014/09/11/faster-random-samples-with-gap-sampling/] describes gap-based random sampling in more detail.
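
To illustrate the idea, here is a minimal sketch of gap sampling for the Bernoulli case. It is not Flink's implementation: the class name GapSamplingSketch and the method sample are hypothetical names chosen for this example. Instead of drawing one random number per input element, it draws the number of elements to skip before the next accepted one from a geometric distribution, so the number of random draws is proportional to the sample size rather than the input size.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; not the Flink BernoulliSampler.
class GapSamplingSketch {

    // Keeps each input element independently with probability `fraction`,
    // drawing one random number per accepted element instead of one per input element.
    static <T> List<T> sample(Iterator<T> input, double fraction, Random rng) {
        List<T> out = new ArrayList<>();
        if (fraction <= 0.0) {
            return out; // nothing is ever selected
        }
        if (fraction >= 1.0) {
            while (input.hasNext()) {
                out.add(input.next()); // everything is selected
            }
            return out;
        }
        double logOneMinusP = Math.log(1.0 - fraction);
        while (true) {
            // u lies in (0, 1], so Math.log(u) is finite and <= 0.
            double u = 1.0 - rng.nextDouble();
            // Geometric gap: P(gap >= k) = (1 - fraction)^k.
            long gap = (long) (Math.log(u) / logOneMinusP);
            // Skip `gap` elements without drawing any further random numbers.
            while (gap > 0 && input.hasNext()) {
                input.next();
                gap--;
            }
            if (!input.hasNext()) {
                return out; // input exhausted before the next accepted element
            }
            out.add(input.next());
        }
    }
}
```

Note that over a plain Iterator the skipped elements are still consumed one at a time; the saving in this sketch is in random number generation, which is where a very small fraction benefits most.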



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)