Posted to common-issues@hadoop.apache.org by "Vikas Vishwakarma (JIRA)" <ji...@apache.org> on 2017/04/17 02:29:41 UTC
[jira] [Comment Edited] (HADOOP-14313) Replace/improve Hadoop's byte[] comparator

    [ https://issues.apache.org/jira/browse/HADOOP-14313?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15970603#comment-15970603 ] 

Vikas Vishwakarma edited comment on HADOOP-14313 at 4/17/17 2:29 AM:
---------------------------------------------------------------------

These microbenchmarks were run with JMH, using 20 warmup cycles and 50 measurement cycles (total run time of 20 minutes each for Hadoop and Guava).

|byte array size|Hadoop min|Hadoop mean|Hadoop max|Guava min|Guava mean|Guava max|%diff min|%diff mean|%diff max|
|4|19838.782|20072.437|20090.503|24026.12|24284.021|24300.338|21|21|21|
|8|22012.932|22044.713|22051.793|22199.453|22253.173|22261.712|1|1|1|
|16|19606.912|19616.322|19649.318|21995.475|22113.443|22120.836|12|13|13|
|20|16482.859|16705.744|16776.35|18625.111|18660.355|18704.285|13|12|11|
|32|19307.22|19345.122|19352.411|21309.337|21359.051|21377.868|10|10|10|
|50|15864.759|15941.543|15953.412|18468.621|18613.202|18749.651|16|17|18|
|64|20152.624|20379.32|20397.359|21065.113|21116.799|21128.523|5|4|4|
|100|13293.668|13385.7|13403.439|15594.286|15690.369|15796.081|17|17|18|
|128|17016.59|17260.48|17278.799|17668.509|19205.199|19333.922|4|11|12|
|200|11599.469|11733.228|11755.622|14540.79|14648.077|14728.363|25|25|25|
|256|13205.591|13315.903|13326.772|14448.858|14933.008|15064.242|9|12|13|
|512|9031.652|9142.564|9149.54|10236.501|10376.17|10389.971|13|13|14|
|1024|3863.105|3864.757|3871.94|6911.951|7002.067|7009.792|79|81|81|
|2048|2129.32|2151.807|2155.381|4072.481|4085.278|4089.185|91|90|90|
|4096|1069.962|1076.303|1076.993|2319.74|2326.514|2328.69|117|116|116|
|8192|863.716|866.808|867.296|931.945|1131.406|1136.288|8|31|31|
|16384|432.556|434.158|434.698|582.37|584.294|584.852|35|35|35|
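
The comment does not say how the %diff columns are derived. The values appear consistent with the relative gain of the Guava score over the Hadoop score, rounded to the nearest integer. The sketch below checks that reading against three mean values from the table. The formula and the class and method names are assumptions made for illustration, not something stated in the comment.

```java
// Sketch only: assumes %diff = (guava - hadoop) / hadoop * 100,
// rounded to the nearest integer. The comment does not state this formula.
final class PercentDiff {

    // Relative gain of the Guava score over the Hadoop score, in percent.
    static long percentDiff(double hadoop, double guava) {
        return Math.round((guava - hadoop) / hadoop * 100.0);
    }

    public static void main(String[] args) {
        // Mean scores copied from the table rows for sizes 4, 1024 and 4096.
        System.out.println(percentDiff(20072.437, 24284.021)); // 21
        System.out.println(percentDiff(3864.757, 7002.067));   // 81
        System.out.println(percentDiff(1076.303, 2326.514));   // 116
    }
}
```

Under this reading, a positive %diff means the Guava comparator scored higher than the Hadoop one for that array size.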


was (Author: vik.karma):
These microbenchmarks are calculated using JMH, 20 warmup cycles, 50 measurement cycles (total run time 20 mins each for Hadoop & guava)

|---|Hadoop|---|---|guava|---|---|diff|---|---|
|byte array size|min|mean|max|min|mean|max|min|mean|max|
|19838.782|20072.437|20090.503|24026.12|24284.021|24300.338|21|21|21|
|22012.932|22044.713|22051.793|22199.453|22253.173|22261.712|1|1|1|
|19606.912|19616.322|19649.318|21995.475|22113.443|22120.836|12|13|13|
|16482.859|16705.744|16776.35|18625.111|18660.355|18704.285|13|12|11|
|19307.22|19345.122|19352.411|21309.337|21359.051|21377.868|10|10|10|
|15864.759|15941.543|15953.412|18468.621|18613.202|18749.651|16|17|18|
|20152.624|20379.32|20397.359|21065.113|21116.799|21128.523|5|4|4|
|13293.668|13385.7|13403.439|15594.286|15690.369|15796.081|17|17|18|
|17016.59|17260.48|17278.799|17668.509|19205.199|19333.922|4|11|12|
|11599.469|11733.228|11755.622|14540.79|14648.077|14728.363|25|25|25|
|13205.591|13315.903|13326.772|14448.858|14933.008|15064.242|9|12|13|
|9031.652|9142.564|9149.54|10236.501|10376.17|10389.971|13|13|14|
|3863.105|3864.757|3871.94|6911.951|7002.067|7009.792|79|81|81|
|2129.32|2151.807|2155.381|4072.481|4085.278|4089.185|91|90|90|
|1069.962|1076.303|1076.993|2319.74|2326.514|2328.69|117|116|116|
|863.716|866.808|867.296|931.945|1131.406|1136.288|8|31|31|
|432.556|434.158|434.698|582.37|584.294|584.852|35|35|35|

> Replace/improve Hadoop's byte[] comparator
> ------------------------------------------
>
>                 Key: HADOOP-14313
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14313
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: common
>            Reporter: Vikas Vishwakarma
>
> Hi,
> Recently we were looking at the lexicographic byte array comparison in HBase. We ran a microbenchmark comparing the byte array comparators of Hadoop and HBase against the latest byte array comparator from Guava ( https://github.com/google/guava/blob/master/guava/src/com/google/common/primitives/UnsignedBytes.java#L362 ) and observed that the Guava main branch version is much faster.
> Specifically, we see a very good improvement when byteArraySize % 8 != 0 and also for large byte arrays. I will update this issue with JMH benchmark results for Hadoop vs Guava. For the corresponding JIRA on HBase, please refer to HBASE-17877.
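
For readers unfamiliar with the technique, here is a minimal plain-Java sketch of lexicographic comparison of unsigned byte arrays, in a byte-at-a-time form and an 8-bytes-at-a-time form. It is not the Hadoop, HBase or Guava implementation; the class and method names are hypothetical and `ByteBuffer` is used only to keep the example self-contained. It shows why lengths that are not a multiple of 8 matter: the leftover tail bytes need a separate code path.

```java
import java.nio.ByteBuffer;

// Illustrative sketch only; not the code of any of the libraries discussed.
final class LexByteCompare {

    // Reference version: compare one unsigned byte at a time.
    static int compareBytewise(byte[] a, byte[] b) {
        int min = Math.min(a.length, b.length);
        for (int i = 0; i < min; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        // All shared bytes are equal, so the shorter array sorts first.
        return a.length - b.length;
    }

    // Word version: compare 8 bytes per step, then finish the tail bytewise.
    static int compareWordwise(byte[] a, byte[] b) {
        int min = Math.min(a.length, b.length);
        // ByteBuffer is big-endian by default, so an unsigned comparison of
        // two longs orders them the same way as their 8 underlying bytes.
        ByteBuffer ba = ByteBuffer.wrap(a);
        ByteBuffer bb = ByteBuffer.wrap(b);
        int i = 0;
        for (; i + 8 <= min; i += 8) {
            long la = ba.getLong(i);
            long lb = bb.getLong(i);
            if (la != lb) {
                return Long.compareUnsigned(la, lb);
            }
        }
        // Tail: the up to 7 bytes left when min is not a multiple of 8.
        for (; i < min; i++) {
            int cmp = (a[i] & 0xff) - (b[i] & 0xff);
            if (cmp != 0) {
                return cmp;
            }
        }
        return a.length - b.length;
    }
}
```

The two methods agree on the sign of the result, though not necessarily on its magnitude, so callers should only test whether the result is negative, zero or positive.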


