Posted to issues@storm.apache.org by "Ethan Li (JIRA)" <ji...@apache.org> on 2018/02/07 19:40:00 UTC
[jira] [Updated] (STORM-2940) Worker dies when there are too many
downstream tasks if using LoadAwareShuffleGrouping
[ https://issues.apache.org/jira/browse/STORM-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ethan Li updated STORM-2940:
----------------------------
Description:
We have seen workers die with the following exception:
{code:java}
java.lang.IllegalArgumentException: bound must be positive
{code}
The stack trace points to LoadAwareShuffleGrouping.java#L234:
{code:java}
// in case we didn't fill in enough
for (; currentIdx < CAPACITY; currentIdx++) {
    prepareChoices[currentIdx] = prepareChoices[random.nextInt(currentIdx)];
}
{code}
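To make the link between this loop and the exception concrete, here is a minimal stand-alone sketch. It is not Storm code: the class NextIntBoundDemo and the method messageFor are hypothetical names used only for illustration. It relies on the documented behavior of java.util.Random.nextInt(int bound), which rejects any bound that is not positive, so a call with currentIdx equal to 0 fails.

```java
import java.util.Random;

// Hypothetical demo class, not part of Storm.
final class NextIntBoundDemo {

    // Returns the message of the IllegalArgumentException thrown by
    // Random.nextInt(bound), or null if the call succeeds.
    static String messageFor(int bound) {
        try {
            new Random().nextInt(bound);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A bound of 0 is what the fill-in loop passes when currentIdx is still 0.
        System.out.println("bound 0: " + messageFor(0));
        // Any positive bound is accepted, so no message is produced.
        System.out.println("bound 1: " + messageFor(1));
    }
}
```

If the fill-in loop starts with currentIdx at 0, its very first iteration is equivalent to the bound-0 call above.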
This happens because, in some situations, the expression
{code:java}
int count = (int) ((indexAndWeights.weight / (double) weightSum) * CAPACITY);
{code}
evaluates to 0 for every target task in the for-loop, so no choices are filled in, currentIdx is still 0 when the fill-in loop above runs, and random.nextInt(0) throws. For example, with more than 1000 downstream tasks (1000 being CAPACITY), say 1001, each with weight 1, every target task gets count = (int) ((1 / 1001.0) * 1000) = 0.
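The arithmetic can be checked with a small stand-alone sketch. Only CAPACITY = 1000 and the count expression are taken from this report. The class CountTruncationSketch, the methods countFor and slotsFilled, and the simplified fill loop are assumptions made for illustration and may not match the real LoadAwareShuffleGrouping code.

```java
// Hypothetical stand-alone sketch; only CAPACITY and the count expression
// come from the report, everything else is assumed for illustration.
final class CountTruncationSketch {
    static final int CAPACITY = 1000;

    // Slots assigned to one task of the given weight, using the expression
    // quoted in the report. The cast to int truncates toward zero.
    static int countFor(int weight, long weightSum) {
        return (int) ((weight / (double) weightSum) * CAPACITY);
    }

    // Total slots filled when each of numTasks tasks has weight 1.
    // This stands in for the value of currentIdx after the proportional pass.
    static int slotsFilled(int numTasks) {
        int currentIdx = 0;
        for (int task = 0; task < numTasks; task++) {
            int count = countFor(1, numTasks);
            for (int i = 0; i < count && currentIdx < CAPACITY; i++) {
                currentIdx++;
            }
        }
        return currentIdx;
    }

    public static void main(String[] args) {
        System.out.println(countFor(1, 1001)); // 0: 0.999... truncates to 0
        System.out.println(slotsFilled(1001)); // 0: nothing is filled in
        System.out.println(slotsFilled(2));    // 1000: 500 slots per task
    }
}
```

With 1001 equal-weight tasks the sketch ends with zero filled slots, which is the state in which the fill-in loop calls random.nextInt(0).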
was:
We have seen exceptions from workers
{code:java}
java.lang.IllegalArgumentException: bound must be positive
{code}
when there are too many downstream task. The stack trace points to LoadAwareShuffleGrouping.java#L234:
{code:java}
//in case we didn't fill in enough
for (; currentIdx < CAPACITY; currentIdx++) {
prepareChoices[currentIdx] = prepareChoices[random.nextInt(currentIdx)];
}
{code}
This is because in some situation,
{code:java}
int count = (int) ((indexAndWeights.weight / (double) weightSum) * CAPACITY);
{code}
the above code will be 0 during the whole for-loop. For example, when there are more than 1000 (which is CAPACITY) downstream tasks, say 1001, and each task has weight 1. Then for every target task, count =(int) ( (1 / 1001.0) * 1000) = 0.
> Worker dies when there are too many downstream tasks if using LoadAwareShuffleGrouping
> --------------------------------------------------------------------------------------
>
> Key: STORM-2940
> URL: https://issues.apache.org/jira/browse/STORM-2940
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Ethan Li
> Assignee: Ethan Li
> Priority: Major