Posted to dev@flink.apache.org by "Sihua Zhou (JIRA)" <ji...@apache.org> on 2018/05/30 05:35:00 UTC

[jira] [Created] (FLINK-9474) Introduce an approximate version of "count distinct"

Sihua Zhou created FLINK-9474:
---------------------------------

             Summary: Introduce an approximate version of "count distinct"
                 Key: FLINK-9474
                 URL: https://issues.apache.org/jira/browse/FLINK-9474
             Project: Flink
          Issue Type: New Feature
          Components: Table API & SQL
    Affects Versions: 1.5.0
            Reporter: Sihua Zhou
            Assignee: Sihua Zhou


We can implement an approximate version of "count distinct" based on the "Elastic Bloom Filter". It could be very fast because we no longer need to query the state, and its accuracy should be configurable, e.g. 95% or 98%.
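
To illustrate the idea, here is a minimal sketch of a Bloom-filter-backed distinct counter. It is not the "Elastic Bloom Filter" itself and not Flink code: it uses a plain fixed-size Bloom filter, and the class name ApproxDistinctCounter, its constructor parameters, and the mapping from "accuracy" to a false-positive rate (e.g. 98% accuracy treated as a 2% false-positive rate) are all assumptions made for illustration. A value is counted only when the filter has not seen it before, so false positives can make the result undercount but never overcount.

```java
import java.util.BitSet;

/** Hypothetical sketch: approximate "count distinct" backed by a fixed-size Bloom filter. */
final class ApproxDistinctCounter {
    private final BitSet bits;
    private final int numBits;
    private final int numHashes;
    private long count;

    /**
     * @param expectedDistinct  assumed upper bound on the number of distinct values
     * @param falsePositiveRate target false-positive rate at that load, e.g. 0.02
     */
    ApproxDistinctCounter(int expectedDistinct, double falsePositiveRate) {
        if (expectedDistinct <= 0 || falsePositiveRate <= 0.0 || falsePositiveRate >= 1.0) {
            throw new IllegalArgumentException("invalid sizing parameters");
        }
        // Standard Bloom filter sizing: m = -n * ln(p) / (ln 2)^2, k = (m / n) * ln 2.
        double ln2 = Math.log(2);
        this.numBits = Math.max(64,
                (int) Math.ceil(-expectedDistinct * Math.log(falsePositiveRate) / (ln2 * ln2)));
        this.numHashes = Math.max(1,
                (int) Math.round((double) numBits / expectedDistinct * ln2));
        this.bits = new BitSet(numBits);
    }

    /** Adds a value; the counter grows only if the filter had not seen it. */
    void add(Object value) {
        long h = mix(value.hashCode());
        int h1 = (int) h;
        int h2 = (int) (h >>> 32) | 1; // odd step for double hashing
        boolean seen = true;
        for (int i = 0; i < numHashes; i++) {
            int idx = Math.floorMod(h1 + i * h2, numBits);
            if (!bits.get(idx)) {
                seen = false;
                bits.set(idx);
            }
        }
        if (!seen) {
            count++;
        }
    }

    /** Approximate number of distinct values added so far (never an overcount). */
    long estimate() {
        return count;
    }

    // SplitMix64 finalizer, used here only to spread hashCode() bits.
    private static long mix(long x) {
        x += 0x9E3779B97F4A7C15L;
        x = (x ^ (x >>> 30)) * 0xBF58476D1CE4E5B9L;
        x = (x ^ (x >>> 27)) * 0x94D049BB133111EBL;
        return x ^ (x >>> 31);
    }

    public static void main(String[] args) {
        ApproxDistinctCounter counter = new ApproxDistinctCounter(10_000, 0.02);
        for (int round = 0; round < 2; round++) {
            for (int i = 0; i < 10_000; i++) {
                counter.add(i); // each value is added twice
            }
        }
        System.out.println("approximate distinct count: " + counter.estimate());
    }
}
```

In this sketch each add() touches only the in-memory bit set and a counter, with no per-key lookup, which is the property the proposal relies on for speed. A fixed-size filter degrades once the number of distinct values exceeds expectedDistinct; handling that growth is presumably what an elastic variant would address.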



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)