Posted to issues@spark.apache.org by "Liang-Chi Hsieh (JIRA)" <ji...@apache.org> on 2016/12/20 08:49:58 UTC

[jira] [Updated] (SPARK-18800) Correct the assert in UnsafeKVExternalSorter which ensures array size

     [ https://issues.apache.org/jira/browse/SPARK-18800?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liang-Chi Hsieh updated SPARK-18800:
------------------------------------
    Summary: Correct the assert in UnsafeKVExternalSorter which ensures array size  (was: UnsafeInMemorySorter throws exception when used in UnsafeKVExternalSorter)

> Correct the assert in UnsafeKVExternalSorter which ensures array size
> ---------------------------------------------------------------------
>
>                 Key: SPARK-18800
>                 URL: https://issues.apache.org/jira/browse/SPARK-18800
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>            Reporter: Liang-Chi Hsieh
>
> UnsafeKVExternalSorter uses UnsafeInMemorySorter to sort the records of a BytesToBytesMap when it is given a map.
> Currently we use the number of keys in the BytesToBytesMap to decide whether the array used for sorting is large enough. This is wrong, because a single key can have multiple values. In the extreme case, BytesToBytesMap.numKeys() == 1 while BytesToBytesMap.numValues() is very large.
> In that case we cannot simply reuse the BytesToBytesMap's array for sorting. Otherwise, an exception like this is thrown:
> {code}
> [info] - SPARK-18800: pass BytesToBytesMap which contains numValues is more than numKeys *** FAILED *** (61 milliseconds)
> [info]   java.lang.IllegalStateException: There is no space for new record
> [info]   at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.insertRecord(UnsafeInMemorySorter.java:225)
> [info]   at org.apache.spark.sql.execution.UnsafeKVExternalSorter.<init>(UnsafeKVExternalSorter.java:147)
> {code}
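
The sizing problem described above can be illustrated with a small, self-contained sketch. This is not Spark code: ToyMultiMapSizing, fitsByKeys, fitsByValues, and singleKeyMap are hypothetical names, and the sketch assumes the sorter needs one slot per record. It only shows why a check based on numKeys() can pass while the number of records to sort (numValues()) exceeds the available space.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy stand-in for a map that may hold several values per key.
// All names are hypothetical; this is NOT Spark's BytesToBytesMap.
class ToyMultiMapSizing {
    private final Map<String, List<String>> entries = new LinkedHashMap<>();

    // Append a value under a key; the same key may be appended to many times.
    void append(String key, String value) {
        entries.computeIfAbsent(key, k -> new ArrayList<>()).add(value);
    }

    // Number of distinct keys.
    int numKeys() {
        return entries.size();
    }

    // Number of stored records (one per value).
    int numValues() {
        int n = 0;
        for (List<String> values : entries.values()) {
            n += values.size();
        }
        return n;
    }

    // Assumption for this sketch: sorting needs one slot per record.
    // Check based on the number of keys (the flawed variant).
    static boolean fitsByKeys(ToyMultiMapSizing map, int slots) {
        return map.numKeys() <= slots;
    }

    // Check based on the number of values (counts every record).
    static boolean fitsByValues(ToyMultiMapSizing map, int slots) {
        return map.numValues() <= slots;
    }

    // Build a map with one key and the given number of values.
    static ToyMultiMapSizing singleKeyMap(int values) {
        ToyMultiMapSizing map = new ToyMultiMapSizing();
        for (int i = 0; i < values; i++) {
            map.append("k", "v" + i);
        }
        return map;
    }

    public static void main(String[] args) {
        ToyMultiMapSizing map = singleKeyMap(100);
        int slots = 8; // hypothetical capacity of the reused array
        System.out.println("numKeys=" + map.numKeys() + ", numValues=" + map.numValues());
        System.out.println("fits by keys:   " + fitsByKeys(map, slots));
        System.out.println("fits by values: " + fitsByValues(map, slots));
    }
}
```

With one key and 100 values, the key-based check accepts an 8-slot array while the value-based check rejects it, which mirrors the "There is no space for new record" failure in the trace above.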



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
