Posted to issues@spark.apache.org by "Sital Kedia (JIRA)" <ji...@apache.org> on 2016/10/10 17:45:20 UTC

[jira] [Updated] (SPARK-17839) Use Nio's directbuffer instead of BufferedInputStream in order to avoid additional copy from os buffer cache to user buffer

     [ https://issues.apache.org/jira/browse/SPARK-17839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sital Kedia updated SPARK-17839:
--------------------------------
    Summary: Use Nio's directbuffer instead of BufferedInputStream in order to avoid additional copy from os buffer cache to user buffer   (was: UnsafeSorterSpillReader should use Nio's directbuffer to read the spill files in order to avoid additional copy)

> Use Nio's directbuffer instead of BufferedInputStream in order to avoid additional copy from os buffer cache to user buffer 
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-17839
>                 URL: https://issues.apache.org/jira/browse/SPARK-17839
>             Project: Spark
>          Issue Type: Improvement
>          Components: Shuffle
>    Affects Versions: 2.0.1
>            Reporter: Sital Kedia
>            Priority: Minor
>
> Currently we use BufferedInputStream to read the shuffle file, which copies the file content from the OS buffer cache to the user buffer. This adds latency when reading the spill files. We made a change to use Java NIO's direct buffer to read the spill files, and for certain jobs that spill a significant amount of data we see a speedup of 5 to 7%.
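
For illustration only, here is a minimal sketch of the general technique the description refers to: reading a file through a FileChannel into a buffer obtained from ByteBuffer.allocateDirect, rather than through a BufferedInputStream backed by an on-heap byte array. This is not the actual Spark change. The class name DirectBufferReadSketch, the helper sumBytes, and the buffer size used below are hypothetical names and values chosen for the example.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class DirectBufferReadSketch {

    // Hypothetical helper: sums every byte of the file (treated as unsigned)
    // while reading through an off-heap direct buffer of the given capacity.
    public static long sumBytes(Path file, int bufferSize) throws IOException {
        if (bufferSize <= 0) {
            throw new IllegalArgumentException("bufferSize must be positive");
        }
        // Direct buffers live outside the Java heap, so the channel can fill
        // them without staging the data in an intermediate on-heap array.
        ByteBuffer buffer = ByteBuffer.allocateDirect(bufferSize);
        long sum = 0;
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            // FileChannel.read returns -1 once the end of the file is reached.
            while (channel.read(buffer) != -1) {
                buffer.flip();   // switch the buffer from filling to draining
                while (buffer.hasRemaining()) {
                    sum += buffer.get() & 0xFF;
                }
                buffer.clear();  // reset position and limit for the next read
            }
        }
        return sum;
    }

    public static void main(String[] args) throws IOException {
        // Write a small temporary file and read it back through the sketch.
        Path tmp = Files.createTempFile("spill-sketch", ".bin");
        try {
            Files.write(tmp, new byte[] {1, 2, 3, 4, 5});
            System.out.println(sumBytes(tmp, 4));
        } finally {
            Files.deleteIfExists(tmp);
        }
    }
}
```

Whether this pattern yields a speedup like the one reported above depends on the workload and on the buffer size, so any real change should be measured on the jobs in question.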


