Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2020/04/23 05:40:20 UTC

[GitHub] [druid] clintropolis edited a comment on issue #9736: HLLSketchMerge aggregator failing for some metrics after upgrade to v0.18

clintropolis edited a comment on issue #9736:
URL: https://github.com/apache/druid/issues/9736#issuecomment-618189702


   > Is there a schema file somewhere that describes the layout of the hll_segment.zip file that @scrawfor attached to this issue? I'd like to write a parser so that in the future if we need to pull out just the sketches I can do it more efficiently. I was able to find some of the sketches by hand, but it is a lot of work :)
   
   @gianm described how to get the raw column out of the segment in [this comment](https://github.com/apache/druid/issues/9736#issuecomment-617411096) by finding the position information in the `meta.smoosh`. If you just want to look at the values for individual rows, you can also extract base64-serialized versions of the column with the [dump-segment tool](https://druid.apache.org/docs/latest/operations/dump-segment.html).
   
   From the druid package directory:
   
   ```
   $ java -classpath "lib/*" -Ddruid.extensions.loadList="[\"druid-datasketches\"]" org.apache.druid.cli.Main tools dump-segment -d /Users/clint/workspace/data/druid/localStorage/hll_segment/ --dump rows --column unique_views_hll
   ```
   
   which will spit out something like this:
   ```
   {"unique_views_hll":"AwEHDggYAAGwAAAAAK6iCQFzJwQDzXwGBTMWCgeFZARCyewJDVSOBw8Q6QQRuVwO2jliD4GTxAnxybMEFnw+CRj2xw4Z9/MEGovwBxyq+godenYKHl8CBx9N7AaPkw0KIzRKEQ8VoATGX+QGJnkMBydvIAXWU8YNKqXbBCuKUgYu5Y0LMOZHBzLeBQczolUHJ66KBjVFJA427xcFGB5sB+tuxQs5gRAbYG1uFD13/RA+TOQF+EZ8FIe4PxBCoDwEKhhEF27WWgc9CJYGSFPOD0ld1A3NfH8JS/3JB0ynuAu6ugsETuvmCk/ByAutozQHUVk8FkiEUQdUcOoIVRcHCFZKLgRXXloFNSM8DqxXSwZbS8IEXNjhDF3WcxJeRQYIYGuxB2EK6wdjnhESZNyBBGeCsghoY0UHSCCMBWwN7gRuu5sKb9xvCXES+wpy6RALcx69BXSE1At1ZcgEvF0GBrPFjQx5L7YFeia2Bkwu+A2AfysHgDD6C4GBoQSCmcQLRZ43BO+VRROFOvMHh6nKBrPqiQWLPqQHjcHlBo5JHQSPwzgErnF6DpF0vAWSLWcLaQ4UC5ckBAyYzLcHm+LIBZ3OQAuBHaIFoEwADKGVjASibboM7VpAB6SJAAWlDrwHptSdCaf2UxAdjHAFgSqBG6YGjgiul1EIM19xBLAZqgqyZUEMs2H7EbTFIRC2l1YOuIUZJLlyjw26y6cINoV1CbxJvQi9qn0HvpsMB78T3Qa8Bu4MxgD9Bch6GwjvbC8Hy/MyB5E7aAXN0FEFzmVGCc8vIwfSby4G1ENTD4JTYwTW2XcH18g/BNhTUBr/bVUF2tK0BtxzuAbd9OYR3/THBuCZsQrirXUHrpsqBeUdggnmZX8E5zomDurPIQXr6l0H7Vs3Be5otAY2ua4E8aijCvJQzgf0gtAW+CxWBPrluhH7eqkKrsTVBHKM9wQ="}
   {"unique_views_hll":"AgEHDgMIBQDERQoE+BSGB6dz6ATE/HoHZs5tIA=="}
   {"unique_views_hll":"AgEHDgMIBQCPrFwWNt2mCkmuNQZFWZgYxJpXDg=="}
   ....
   ```
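
If you want the raw sketch bytes rather than the base64 text, something along these lines should work. This is only a rough sketch: `decode_dump_rows` and `sample` are hypothetical names made up for illustration, and it assumes each row is printed as one JSON object per line, as in the output above. It does not interpret the sketch contents; for that you would hand the bytes to the DataSketches library.

```python
import base64
import json


def decode_dump_rows(lines, column):
    """Yield the decoded bytes of `column` for each JSON row.

    Hypothetical helper: assumes one JSON object per line and skips
    anything else (blank lines, log output, the "...." placeholder).
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue
        row = json.loads(line)
        value = row.get(column)
        if value is None:
            continue
        yield base64.b64decode(value)


# Two of the short rows from the output above, plus the elision marker.
sample = [
    '{"unique_views_hll":"AgEHDgMIBQDERQoE+BSGB6dz6ATE/HoHZs5tIA=="}',
    '{"unique_views_hll":"AgEHDgMIBQCPrFwWNt2mCkmuNQZFWZgYxJpXDg=="}',
    "....",
]

sketches = list(decode_dump_rows(sample, "unique_views_hll"))
for raw in sketches:
    # Print the size and the first few bytes of each serialized sketch.
    print(len(raw), raw[:8].hex())
```

To run it against a real dump, redirect the dump-segment output to a file and pass the open file object as `lines`.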

