Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/01/26 23:12:58 UTC

[GitHub] [pinot] kishoreg commented on pull request #8074: For DISTINCT_COUNT, automatically convert Set to HyperLogLog when cardinality is too high

kishoreg commented on pull request #8074:
URL: https://github.com/apache/pinot/pull/8074#issuecomment-1022688453


   Can we list out all the options we have?
   1. Automatically convert the Set to HyperLogLog after a threshold
      a. Threshold is set to something like 100K by default
      b. Threshold is set to -1 by default, which means the feature is off, and folks can change it
      c. User has the ability to control the threshold through a query option (enable_approx_distinct_threshold=100,000)
   
   2. Return an error if the threshold is reached
      a. User then uses distinctcounthll
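
   For illustration, option 1 could be sketched roughly as below. This is not the code in the PR: DistinctCounter and TinyHll are hypothetical names, and TinyHll is a stripped-down HyperLogLog written only to keep the example self-contained (a real implementation would use an existing HyperLogLog library and Pinot's own aggregation classes). The counter stays exact until the set grows past the threshold, then replays the set into the sketch and continues approximately; a threshold of -1 never converts.

```java
import java.util.HashSet;
import java.util.Set;

/** Minimal HyperLogLog, included only so the sketch is self-contained. */
final class TinyHll {
    private final int p;            // number of bits used to pick a register
    private final byte[] registers; // 2^p registers

    TinyHll(int p) {
        this.p = p;
        this.registers = new byte[1 << p];
    }

    void add(long value) {
        long h = mix64(value);
        int idx = (int) (h >>> (64 - p));   // top p bits select the register
        long rest = h << p;                 // remaining bits, left-aligned
        int rank = (rest == 0) ? (64 - p + 1) : Long.numberOfLeadingZeros(rest) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    long estimate() {
        int m = registers.length;
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1 + 1.079 / m); // bias constant, valid for m >= 128
        double raw = alpha * m * m / sum;
        if (raw <= 2.5 * m && zeros > 0) {
            // Small-range correction: fall back to linear counting.
            return Math.round(m * Math.log((double) m / zeros));
        }
        return Math.round(raw);
    }

    // SplitMix64 mixing step, used here as a cheap 64-bit hash.
    private static long mix64(long z) {
        z += 0x9e3779b97f4a7c15L;
        z = (z ^ (z >>> 30)) * 0xbf58476d1ce4e5b9L;
        z = (z ^ (z >>> 27)) * 0x94d049bb133111ebL;
        return z ^ (z >>> 31);
    }
}

/** Hypothetical accumulator: exact Set first, HyperLogLog once the threshold is passed. */
final class DistinctCounter {
    private final int threshold;   // -1 means "never convert" (feature off)
    private Set<Long> exact = new HashSet<>();
    private TinyHll sketch;        // null while still exact

    DistinctCounter(int threshold) {
        this.threshold = threshold;
    }

    void add(long value) {
        if (sketch != null) {
            sketch.add(value);
            return;
        }
        exact.add(value);
        if (threshold >= 0 && exact.size() > threshold) {
            // Threshold passed: replay the exact values into the sketch, drop the set.
            sketch = new TinyHll(12);
            for (long v : exact) {
                sketch.add(v);
            }
            exact = null;
        }
    }

    long count() {
        return sketch != null ? sketch.estimate() : exact.size();
    }

    boolean isApproximate() {
        return sketch != null;
    }
}
```

   With this shape, queries whose cardinality stays under the threshold still get an exact answer, which is the property that a blanket switch to distinctcounthll would lose.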
      
   The reasons why I don't prefer the second option, where we return an error and ask users to use distinctcounthll:
   - Users cannot simply switch to distinctcounthll because it will always return an approximate result, even when the cardinality does not hit the threshold.
   - Most of them will not hit this error in testing and will first see it in production, which is too late.
   - Pinot is mostly accessed programmatically via apps, and the app user cannot really do much when the app returns an error.
   - distinctcounthll is not standard SQL and won't work with standard tools like Tableau, Superset, etc.
   
   My preference is to go with option 1 but start with -1 as the default value, which keeps the feature off by default, while keeping the ability to override it using a per-query option or server config.
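
   A minimal sketch of how that override order might be resolved, assuming the per-query option wins over the server config and -1 is the fallback. The class and method names are hypothetical, the option name is the one floated above rather than a confirmed Pinot option, and stripping commas so that "100,000" parses is a choice made for this example only.

```java
import java.util.Map;

/** Hypothetical helper: picks the effective Set-to-HyperLogLog threshold for a query. */
final class ApproxDistinctThreshold {
    // Option name as written in this comment; not a confirmed Pinot query option.
    static final String QUERY_OPTION = "enable_approx_distinct_threshold";
    static final int DISABLED = -1;

    /**
     * @param queryOptions      per-query options (may be empty)
     * @param serverConfigValue server-level threshold, or null if not configured
     */
    static int resolve(Map<String, String> queryOptions, Integer serverConfigValue) {
        String raw = queryOptions.get(QUERY_OPTION);
        if (raw != null) {
            // Per-query option takes precedence; tolerate "100,000"-style input.
            return Integer.parseInt(raw.replace(",", "").trim());
        }
        // Otherwise use the server config, and stay off when nothing is set.
        return serverConfigValue != null ? serverConfigValue : DISABLED;
    }
}
```

   Under these assumptions a query can also pass -1 explicitly to force exact counting on a server that has a threshold configured.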
   
   
    


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


