Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2020/02/21 09:08:46 UTC
[GitHub] [incubator-doris] kangpinghuang opened a new issue #2967: add a
test for different encoding
kangpinghuang opened a new issue #2967: add a test for different encoding
URL: https://github.com/apache/incubator-doris/issues/2967
I added a test that compares encodings in different situations.
I generated 100 million ints, classified into 4 types: sequence, random, small step, and large step.
The original data sizes are as follows:
sequence | random | small_step | big_step
-- | -- | -- | --
848 | 1000M | 859 | 859
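The issue does not say how the four data sets were produced. The following is a minimal sketch of one way to generate them, not the author's generator: the function name `make_dataset`, the step bounds (8 and 1024), the value range for random data, and the seed are all assumptions made for illustration.

```python
import random


def make_dataset(kind, n, seed=0):
    """Return n non-negative ints following one of four assumed patterns."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    if kind == "sequence":
        # Consecutive integers: 0, 1, 2, ...
        return list(range(n))
    if kind == "random":
        # Uniformly random 31-bit values (assumed range).
        return [rng.randrange(2**31) for _ in range(n)]
    if kind in ("small_step", "big_step"):
        # Increasing values whose gaps are random but bounded.
        max_step = 8 if kind == "small_step" else 1024  # assumed bounds
        value, out = 0, []
        for _ in range(n):
            out.append(value)
            value += rng.randint(1, max_step)
        return out
    raise ValueError(f"unknown kind: {kind}")
```

For a quick check, `make_dataset("small_step", 1000)` yields a strictly increasing list whose neighbouring values differ by at most 8.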
- tests
I tested four encoding methods: alpha, beta_bitshuffle, beta_for (frame of reference), and beta_rle. The results are as follows:
1. space
The following is the space used after encoding 100 million ints.
Unit (KB) | sequence | random | small_step | big_step
-- | -- | -- | -- | --
alpha | 2865.152 | 104420.4 | 2108.416 | 2224.128
beta_bitshuffle | 4094.976 | 143268.9 | 1682.432 | 2679.808
beta_for | 4582.4 | 94251.01 | 797.3325 | 956.233728
beta_rle | 818.0101 | 10342.4 | 778.4581 | 783.970304
The graph is as follows:
![image](https://user-images.githubusercontent.com/40422952/75019415-c9441d00-54cb-11ea-92f7-a6ac0e2ae9f7.png)
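To give a rough intuition for why frame-of-reference encoding can shrink sorted or slowly growing data, here is a simplified sketch of the general technique. It is not Doris's `beta_for` implementation: the frame size of 128, the 32-bit base, and the 8-bit width header are assumptions, and the function only estimates the encoded size rather than producing bytes.

```python
def for_encoded_bits(values, frame_size=128):
    """Estimate the bit count of a simple frame-of-reference encoding.

    Each frame stores its minimum as a 32-bit base, an 8-bit bit width,
    and every value as an offset from the base packed at that width.
    """
    total = 0
    for start in range(0, len(values), frame_size):
        frame = values[start:start + frame_size]
        lo, hi = min(frame), max(frame)
        width = (hi - lo).bit_length()  # bits needed for the largest offset
        total += 32 + 8 + width * len(frame)
    return total
```

With these assumptions, one frame holding the values 0 to 127 needs 936 bits instead of 4096 bits as raw 32-bit ints, because every offset fits in 7 bits.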
2. query time cost for count(*)
The time is the 95th percentile time cost, in ms.
Time (ms) | sequence | random | small_step | big_step
-- | -- | -- | -- | --
alpha | 7399.1 | 5416.48 | 6231 | 5372.88
beta_bitshuffle | 14342 | 12059.82 | 9186.91 | 8817.78
beta_for | 8752.04 | 11379.43 | 12403.98 | 8415.49
beta_rle | 8544.95 | 8614.29 | 9299.58 | 8295.44
the graph is:
![image](https://user-images.githubusercontent.com/40422952/75019604-22ac4c00-54cc-11ea-8ad9-38d9bb419ace.png)
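The issue does not state how the 95th percentile was derived from the individual query timings. One common definition is the nearest-rank method, sketched below; the function name `percentile_nearest_rank` is hypothetical, and the original test may have used a different percentile definition.

```python
import math


def percentile_nearest_rank(samples, pct):
    """Return the pct-th percentile of samples using the nearest-rank method."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    # Smallest rank such that at least pct percent of samples are <= the result.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[rank - 1]
```

For 20 latency samples, the 95th percentile under this definition is the 19th smallest value.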
3. query time cost for point query
select count(*) from table where id = xxx;
encoding | sequence | random | small_step | big_step
-- | -- | -- | -- | --
alpha | 8.3 | 8.66 | 477.26 | 10.73
beta_bitshuffle | 9.65 | 9.86 | 413.63 | 10.91
beta_for | 25.3 | 29.98 | 401.32 | 30.13
beta_rle | 8.65 | 9.06 | 398.92 | 10.86
the graph is:
![image](https://user-images.githubusercontent.com/40422952/75019723-61420680-54cc-11ea-80ef-da9862433f4c.png)
- conclusion
beta_rle achieves the best space efficiency in all situations, compared with both the other beta encodings and the alpha encoding. The query performance of beta_rle is the best among the Segment V2 encodings, but slightly worse than that of the alpha encoding.
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org
[GitHub] [incubator-doris] gaodayue commented on issue #2967: add a test for
different encoding
Posted by GitBox <gi...@apache.org>.
gaodayue commented on issue #2967: add a test for different encoding
URL: https://github.com/apache/incubator-doris/issues/2967#issuecomment-589928568
To obtain accurate and reproducible test results, we should
1. write benchmark code that directly tests different `PageBuilder` and `PageDecoder`
2. open source it so that other people can review and run the benchmark
In addition, I think we should add a test case for seek read performance in segment_v2, because it significantly affects the performance of SegmentIterator.
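The shape of such a benchmark can be sketched as follows. This is only an illustration of the harness structure, assuming a toy run-length encoder in place of the real `PageBuilder` and `PageDecoder`; the names `rle_encode`, `rle_decode`, and `bench` are hypothetical, and a real benchmark would call the C++ classes directly.

```python
import time


def rle_encode(values):
    """Toy stand-in for a page builder: collapse equal neighbours into runs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return runs


def rle_decode(runs):
    """Toy stand-in for a page decoder: expand runs back into values."""
    out = []
    for v, count in runs:
        out.extend([v] * count)
    return out


def bench(name, encode, decode, values, repeats=5):
    """Time encode and decode separately and keep the best of several runs."""
    best_enc = best_dec = float("inf")
    for _ in range(repeats):
        t0 = time.perf_counter()
        encoded = encode(values)
        t1 = time.perf_counter()
        decoded = decode(encoded)
        t2 = time.perf_counter()
        if decoded != values:
            raise AssertionError(f"{name}: round trip changed the data")
        best_enc = min(best_enc, t1 - t0)
        best_dec = min(best_dec, t2 - t1)
    return {"name": name, "encode_s": best_enc, "decode_s": best_dec}
```

Calling `bench("toy_rle", rle_encode, rle_decode, data)` for each data set gives per-encoding timings that exclude query planning and I/O, and the round-trip check guards against measuring a broken codec.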
[GitHub] [incubator-doris] kangpinghuang commented on issue #2967: add a
test for different encoding
Posted by GitBox <gi...@apache.org>.
kangpinghuang commented on issue #2967: add a test for different encoding
URL: https://github.com/apache/incubator-doris/issues/2967#issuecomment-589564958
parent issue: #2886