Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2019/04/19 04:28:00 UTC

[jira] [Commented] (SPARK-27367) Faster RoaringBitmap Serialization with v0.8.0

    [ https://issues.apache.org/jira/browse/SPARK-27367?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16821653#comment-16821653 ] 

Liang-Chi Hsieh commented on SPARK-27367:
-----------------------------------------

I upgraded it locally, but the performance improvement isn't obvious. Maybe the optimization is only significant for larger bitmaps. I'm not sure whether Spark produces bitmaps large enough to take advantage of it.

I compared 0.7.45 (used in current master) with 0.8.1 (the latest release). Apart from serde to ByteBuffer, I didn't see any other noticeable commits.

So, do we still want to upgrade to 0.8.1? If so, I can make a PR.

 

> Faster RoaringBitmap Serialization with v0.8.0
> ----------------------------------------------
>
>                 Key: SPARK-27367
>                 URL: https://issues.apache.org/jira/browse/SPARK-27367
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core
>    Affects Versions: 3.0.0
>            Reporter: Imran Rashid
>            Priority: Major
>
> RoaringBitmap 0.8.0 adds faster serde, but also requires us to change how we call the serde routines slightly to take advantage of it.  This is probably a worthwhile optimization, as every shuffle map task with a large number of partitions generates these bitmaps, and the driver especially has to deserialize many of these messages.
> See 
> * https://github.com/apache/spark/pull/24264#issuecomment-479675572
> * https://github.com/RoaringBitmap/RoaringBitmap/pull/325
> * https://github.com/RoaringBitmap/RoaringBitmap/issues/319


