Posted to issues@flink.apache.org by "Stephan Ewen (JIRA)" <ji...@apache.org> on 2014/07/09 11:06:05 UTC

[jira] [Commented] (FLINK-1013) ArithmeticException: / by zero in MutableHashTable

    [ https://issues.apache.org/jira/browse/FLINK-1013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14056018#comment-14056018 ] 

Stephan Ewen commented on FLINK-1013:
-------------------------------------

Can we simply solve this by assuming a max for the avgRecordLenPartition? It makes sense anyway to not let the number of buckets get too low. This formula only exists to find a tradeoff between really short records (like Tuple2<Long, Double>) and longer records.

How about capping that value at 512? That should be fine; the loss in initial partition buffers (in favor of bucket buffers) is then rather small...

Make the formula
{code}
bucketCount = (int) (((long) totalBuffersAvailable) * RECORD_TABLE_BYTES / (Math.min(512, avgRecordLenPartition) + RECORD_OVERHEAD_BYTES));
{code}
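
For illustration, here is a small standalone sketch of the capped computation. The constant values are assumed examples, not the actual ones from MutableHashTable:
{code}
public class BucketCountSketch {

    // Assumed example values; the real constants live in MutableHashTable.
    private static final long RECORD_TABLE_BYTES = 24 * 1024; // table bytes per buffer
    private static final long RECORD_OVERHEAD_BYTES = 8;      // per-record overhead
    private static final int MAX_AVG_RECORD_LEN = 512;        // proposed cap

    static int bucketCount(int totalBuffersAvailable, long avgRecordLenPartition) {
        // Capping the average record length bounds the denominator,
        // so the bucket count cannot collapse to zero for huge records.
        return (int) (((long) totalBuffersAvailable) * RECORD_TABLE_BYTES
                / (Math.min(MAX_AVG_RECORD_LEN, avgRecordLenPartition) + RECORD_OVERHEAD_BYTES));
    }

    public static void main(String[] args) {
        long hugeRecord = 45L * 1024 * 1024; // ~45 MB average record length

        // Uncapped: 10 buffers with a ~45 MB average record length -> 0.
        System.out.println(10L * RECORD_TABLE_BYTES / (hugeRecord + RECORD_OVERHEAD_BYTES)); // 0

        // Capped: the same inputs yield a positive bucket count.
        System.out.println(bucketCount(10, hugeRecord)); // 472
    }
}
{code}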

> ArithmeticException: / by zero in MutableHashTable
> --------------------------------------------------
>
>                 Key: FLINK-1013
>                 URL: https://issues.apache.org/jira/browse/FLINK-1013
>             Project: Flink
>          Issue Type: Bug
>            Reporter: Till Rohrmann
>
> I encountered a division by zero exception in the MutableHashTable. It happened when I joined two datasets with relatively big records (approx. 40-50 MB each, I think). Joining them invokes the buildTableFromSpilledPartition method of the MutableHashTable. If fewer buffers are available than needed, the hash table calculates the bucket count as
> {code}
> bucketCount = (int) (((long) totalBuffersAvailable) * RECORD_TABLE_BYTES / (avgRecordLenPartition + RECORD_OVERHEAD_BYTES));
> {code}
> If the average record length is sufficiently large, the bucket count will be 0. Initializing the hash table with a bucket count of 0 then causes the division by zero exception. I don't know whether this problem can be mitigated, but it should at least throw a meaningful exception instead of the ArithmeticException.
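
For reference, a minimal sketch of the failure mode described in the report. The constant values and the final modulo step are assumptions for illustration, not the actual MutableHashTable internals:
{code}
// Minimal sketch of the failure mode (assumed example values, not the
// actual MutableHashTable constants or code).
long totalBuffersAvailable = 10;
long RECORD_TABLE_BYTES = 24 * 1024;
long RECORD_OVERHEAD_BYTES = 8;
long avgRecordLenPartition = 45L * 1024 * 1024; // ~45 MB average record length

int bucketCount = (int) (totalBuffersAvailable * RECORD_TABLE_BYTES
        / (avgRecordLenPartition + RECORD_OVERHEAD_BYTES)); // evaluates to 0

// Any subsequent hash-to-bucket mapping of this form then fails:
int bucket = 12345 % bucketCount; // ArithmeticException: / by zero
{code}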



--
This message was sent by Atlassian JIRA
(v6.2#6252)