Posted to issues@storm.apache.org by "Ethan Li (JIRA)" <ji...@apache.org> on 2018/02/07 19:40:00 UTC

[jira] [Updated] (STORM-2940) Worker dies when there are too many downstream tasks if using LoadAwareShuffleGrouping

     [ https://issues.apache.org/jira/browse/STORM-2940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ethan Li updated STORM-2940:
----------------------------
    Description: 
We have seen workers die with the following exception:
{code:java}
java.lang.IllegalArgumentException: bound must be positive
{code}
The stack trace points to LoadAwareShuffleGrouping.java#L234:
{code:java}
//in case we didn't fill in enough
for (; currentIdx < CAPACITY; currentIdx++) {
    prepareChoices[currentIdx] = prepareChoices[random.nextInt(currentIdx)];
}
{code}
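This happens because, in some situations, the following expression evaluates to 0 for every target task:
{code:java}
int count = (int) ((indexAndWeights.weight / (double) weightSum) * CAPACITY);
{code}
If count is 0 on every iteration of the for-loop, currentIdx never advances, so it is still 0 when the fill-in loop above starts and random.nextInt(0) throws the exception. For example, suppose there are more downstream tasks than CAPACITY (which is 1000), say 1001, and each task has weight 1. Then for every target task, count = (int) ((1 / 1001.0) * 1000) = 0.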
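The arithmetic above can be checked with a small standalone sketch. This is a simplified reproduction, not the actual Storm source: the class ShuffleFillDemo and its helpers slotsFor, filledSlots and fillInThrows are hypothetical names, and the weighted pass is assumed to simply advance currentIdx by count for each task. Only CAPACITY = 1000, the count formula, and the fill-in loop are taken from the description.

```java
import java.util.Random;

// Simplified, hypothetical reproduction of the failure; not the actual Storm source.
class ShuffleFillDemo {
    static final int CAPACITY = 1000; // value quoted in the issue

    // Slots given to one task, using the same truncating formula as in the issue.
    static int slotsFor(double weight, double weightSum) {
        return (int) ((weight / weightSum) * CAPACITY);
    }

    // Assumption: the weighted pass advances currentIdx by count for each task.
    // Every task has weight 1, so weightSum equals numTasks.
    static int filledSlots(int numTasks) {
        int currentIdx = 0;
        for (int task = 0; task < numTasks; task++) {
            currentIdx += slotsFor(1, numTasks);
        }
        return currentIdx;
    }

    // Runs the fill-in loop from the issue, starting at the given index.
    // Returns true if Random.nextInt rejects the bound.
    static boolean fillInThrows(int currentIdx) {
        int[] prepareChoices = new int[CAPACITY];
        Random random = new Random();
        try {
            for (; currentIdx < CAPACITY; currentIdx++) {
                prepareChoices[currentIdx] = prepareChoices[random.nextInt(currentIdx)];
            }
            return false;
        } catch (IllegalArgumentException e) {
            return true; // "bound must be positive" when currentIdx is 0
        }
    }

    public static void main(String[] args) {
        int filled = filledSlots(1001);
        System.out.println("filled slots: " + filled);
        System.out.println("fill-in loop throws: " + fillInThrows(filled));
    }
}
```

With 1001 tasks of weight 1, every per-task count truncates to 0, so the fill-in loop starts at index 0 and its first call is random.nextInt(0), which is rejected. Starting from any index of 1 or more, the same loop completes normally.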

 

  was:
We have seen exceptions from workers 
{code:java}
java.lang.IllegalArgumentException: bound must be positive
{code}
when there are too many downstream task. The stack trace points to LoadAwareShuffleGrouping.java#L234:
{code:java}
//in case we didn't fill in enough
for (; currentIdx < CAPACITY; currentIdx++) {
prepareChoices[currentIdx] = prepareChoices[random.nextInt(currentIdx)];
}
{code}
This is because in some situation, 
{code:java}
int count = (int) ((indexAndWeights.weight / (double) weightSum) * CAPACITY);
{code}
the above code will be 0 during the whole for-loop. For example, when there are more than 1000 (which is CAPACITY) downstream tasks, say 1001, and each task has weight 1.  Then for every target task, count =(int) ( (1 / 1001.0) * 1000) = 0.

 


> Worker dies when there are too many downstream tasks if using LoadAwareShuffleGrouping
> --------------------------------------------------------------------------------------
>
>                 Key: STORM-2940
>                 URL: https://issues.apache.org/jira/browse/STORM-2940
>             Project: Apache Storm
>          Issue Type: Bug
>            Reporter: Ethan Li
>            Assignee: Ethan Li
>            Priority: Major


