Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/04/15 02:53:55 UTC

[GitHub] [incubator-doris] wangbo opened a new issue #3319: Support Java Version HyperLogLog

wangbo opened a new issue #3319: Support Java Version HyperLogLog
URL: https://github.com/apache/incubator-doris/issues/3319
 
 
   **Description**
   For the Spark Load Process, we need a Java version of hll to calculate approximate distinct values.
   The main requirement for the Java implementation is to stay consistent with the BE C++ version.
   
   **Hll Write**
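   ```Hll Write``` can be kept consistent with the C++ version.
   Java has no unsigned integer types, but in two's-complement arithmetic addition, subtraction, multiplication, and left shift produce the same bits for signed and unsigned operands; for right shifts, the unsigned ```>>>``` operator gives the C++ unsigned result.
   These operators are enough for ```Hll Write```.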
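The point about signed and unsigned arithmetic can be checked with a small sketch. The class and method names below are hypothetical and are not taken from the Doris code base; the sketch only assumes that the C++ side works on ```uint64_t``` values.

```java
// Hypothetical demo class; names are illustrative, not from Doris.
final class UnsignedWrapDemo {
    // Low 64 bits of a * b; the bits are identical for signed and unsigned operands.
    static long mulLow64(long a, long b) {
        return a * b;
    }

    // Logical right shift, the counterpart of >> on a C++ uint64_t.
    static long shiftRightUnsigned(long x, int n) {
        return x >>> n;
    }

    public static void main(String[] args) {
        long a = 0xFFFFFFFFFFFFFFFFL;  // 2^64 - 1 read as unsigned, -1 read as signed
        long sum = a + 2L;             // wraps around to 1, as uint64_t does in C++

        System.out.println(Long.toUnsignedString(a));                     // 18446744073709551615
        System.out.println(sum);                                          // 1
        System.out.println(Long.toHexString(shiftRightUnsigned(a, 60)));  // f
        System.out.println(Long.toHexString(a >> 60));                    // ffffffffffffffff (sign-extended)
    }
}
```

The last two lines show the one place where care is needed: the arithmetic shift ```>>``` sign-extends, so only ```>>>``` reproduces the unsigned result.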
   The ```mod``` operator does differ between signed and unsigned operands, so for it we can use BigInteger.
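A minimal sketch of the BigInteger approach follows, assuming the hash is held in a Java ```long``` whose bits should be read as an unsigned 64-bit value. The class and method names are hypothetical. For comparison, the sketch also calls ```Long.remainderUnsigned```, which has been in the JDK since Java 8 and could serve as an alternative.

```java
import java.math.BigInteger;

// Hypothetical demo class; names are illustrative, not from Doris.
final class UnsignedModDemo {
    // 2^64 - 1, used to reinterpret a signed long as an unsigned 64-bit value.
    private static final BigInteger MASK_64 =
            BigInteger.ONE.shiftLeft(64).subtract(BigInteger.ONE);

    // Unsigned remainder computed through BigInteger.
    static long modViaBigInteger(long value, long divisor) {
        BigInteger v = BigInteger.valueOf(value).and(MASK_64);
        BigInteger d = BigInteger.valueOf(divisor).and(MASK_64);
        return v.mod(d).longValue();
    }

    // Same result using the JDK built-in.
    static long modViaBuiltin(long value, long divisor) {
        return Long.remainderUnsigned(value, divisor);
    }

    public static void main(String[] args) {
        long hash = -1L;  // 2^64 - 1 when read as unsigned
        System.out.println(hash % 10);                   // -1 (signed remainder)
        System.out.println(modViaBigInteger(hash, 10));  // 5
        System.out.println(modViaBuiltin(hash, 10));     // 5
    }
}
```

The divisor 10 is arbitrary; the plain ```%``` line shows why the signed operator cannot be used directly when the high bit of the hash is set.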
   
   **data structure**
   The main data structure for hll is built on integers and does not involve unsigned long, so it can be consistent with the C++ version.
   
   **Hll.estimateCardinality**
   Further research is needed to determine whether the C++ floating-point calculation can be made fully consistent with Java's.
   Although both follow ```IEEE 754```, in C++ the target architecture and compiler options may affect the actual result.
   ```estimateCardinality``` is not required for the Spark Load Process, so this is not an urgent problem for now.
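One possible way to carry out that research is to compare results bit for bit rather than as printed decimals. The sketch below is only an assumption about how such a check might look, with hypothetical names; the C++ side would need to print the same 64 bits of its ```double``` result for the comparison to work.

```java
// Hypothetical demo class; names are illustrative, not from Doris.
final class DoubleBitsDemo {
    // Raw IEEE 754 bit pattern of a double as 16 hex digits.
    static String bitsHex(double value) {
        return String.format("%016x", Double.doubleToLongBits(value));
    }

    public static void main(String[] args) {
        System.out.println(bitsHex(1.0));        // 3ff0000000000000
        System.out.println(bitsHex(0.1 + 0.2));  // differs from bitsHex(0.3) in the last bit
        System.out.println(bitsHex(0.3));
    }
}
```

Two estimates that print the same hex string are identical down to the last bit, which is a stricter test than comparing rounded decimal output.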

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] 924060929 edited a comment on issue #3319: Support Java Version HyperLogLog

Posted by GitBox <gi...@apache.org>.
924060929 edited a comment on issue #3319:
URL: https://github.com/apache/incubator-doris/issues/3319#issuecomment-788863320


   @wangbo I have already implemented Scala versions of hll and bitmap and run many unit tests on them at our company; see data/spark-on-doris/common/src/main/scala/org/apache/doris/spark/sql/function/




[GitHub] [incubator-doris] wangbo closed issue #3319: Support Java Version HyperLogLog

Posted by GitBox <gi...@apache.org>.
wangbo closed issue #3319:
URL: https://github.com/apache/incubator-doris/issues/3319


   




[GitHub] [incubator-doris] 924060929 commented on issue #3319: Support Java Version HyperLogLog

Posted by GitBox <gi...@apache.org>.
924060929 commented on issue #3319:
URL: https://github.com/apache/incubator-doris/issues/3319#issuecomment-788863320


   @wangbo I have already implemented Scala versions of hll and bitmap and run many unit tests on them at our company; see data/spark-on-doris/common/src/main/scala/org/apache/doris/spark/sql/function/

