Posted to dev@hivemall.apache.org by "Makoto Yui (JIRA)" <ji...@apache.org> on 2016/11/15 11:52:58 UTC

[jira] [Created] (HIVEMALL-18) Support approx_count UDAF using HyperLogLog

Makoto Yui created HIVEMALL-18:
----------------------------------

             Summary: Support approx_count UDAF using HyperLogLog
                 Key: HIVEMALL-18
                 URL: https://issues.apache.org/jira/browse/HIVEMALL-18
             Project: Hivemall
          Issue Type: Sub-task
            Reporter: Makoto Yui
            Priority: Minor


https://github.com/addthis/stream-lib could be used as the underlying library.

http://www.slideshare.net/bzamecnik/hyperloglog-in-hive-how-to-count-sheep-efficiently
https://databricks.com/blog/2016/05/19/approximate-algorithms-in-apache-spark-hyperloglog-and-quantiles.html

Several HLL implementations already exist as Hive UDAFs:
https://github.com/MLnick/hive-udf/wiki
https://github.com/klout/brickhouse
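
For reference, here is a rough sketch of the state such a UDAF would need to carry. It is not stream-lib's API and not a proposed Hivemall implementation; the class name HyperLogLogSketch, the precision parameter p, and the FNV-1a based hash are all assumptions made for illustration (a real implementation would more likely delegate to the library's own HyperLogLog class and a MurmurHash variant). The point is that the sketch is a small register array that can be merged by taking a per-register maximum, which is what makes it suitable for partial aggregation.

```java
import java.nio.charset.StandardCharsets;

/** Illustrative HyperLogLog sketch (hypothetical class, stdlib only). */
final class HyperLogLogSketch {
    private final int p;            // number of bits used as the register index
    private final int m;            // number of registers, 2^p
    private final byte[] registers; // max observed rank per register

    HyperLogLogSketch(int p) {
        if (p < 4 || p > 16) {
            throw new IllegalArgumentException("p must be in [4, 16]");
        }
        this.p = p;
        this.m = 1 << p;
        this.registers = new byte[m];
    }

    /** Adds one value; duplicates do not change the registers. */
    void add(String value) {
        long h = hash64(value.getBytes(StandardCharsets.UTF_8));
        int idx = (int) (h >>> (64 - p)); // top p bits select the register
        long rest = h << p;               // remaining 64 - p bits, left-aligned
        int rank = (rest == 0) ? (64 - p + 1) : Long.numberOfLeadingZeros(rest) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    /** Merges another sketch of the same precision (per-register max). */
    void merge(HyperLogLogSketch other) {
        if (other.p != p) {
            throw new IllegalArgumentException("precision mismatch");
        }
        for (int i = 0; i < m; i++) {
            if (other.registers[i] > registers[i]) {
                registers[i] = other.registers[i];
            }
        }
    }

    /** Returns the estimated number of distinct values added so far. */
    long cardinality() {
        double sum = 0.0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.scalb(1.0, -r); // 2^-r
            if (r == 0) {
                zeros++;
            }
        }
        double estimate = alpha(m) * m * m / sum;
        // Small-range correction: fall back to linear counting.
        if (estimate <= 2.5 * m && zeros > 0) {
            estimate = m * Math.log((double) m / zeros);
        }
        return Math.round(estimate);
    }

    private static double alpha(int m) {
        switch (m) {
            case 16: return 0.673;
            case 32: return 0.697;
            case 64: return 0.709;
            default: return 0.7213 / (1.0 + 1.079 / m);
        }
    }

    /** Assumed hash for this sketch: FNV-1a 64-bit plus the MurmurHash3 finalizer. */
    private static long hash64(byte[] data) {
        long h = 0xcbf29ce484222325L;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 0x100000001b3L;
        }
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }
}
```

In a Hive GenericUDAFEvaluator, add() would presumably be called from iterate(), merge() from merge() on the deserialized partial result, and cardinality() from terminate(). With p = 12 the sketch uses 4096 registers and the expected relative error is roughly 1.04 / sqrt(4096), i.e. about 1.6%.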



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)