Posted to dev@whirr.apache.org by "Evan Pollan (Created) (JIRA)" <ji...@apache.org> on 2012/01/26 23:25:49 UTC

[jira] [Created] (WHIRR-488) whirr hangs in certain cases when creating a spot-priced EC2 cluster

whirr hangs in certain cases when creating a spot-priced EC2 cluster
--------------------------------------------------------------------

                 Key: WHIRR-488
                 URL: https://issues.apache.org/jira/browse/WHIRR-488
             Project: Whirr
          Issue Type: Bug
    Affects Versions: 0.6.0
         Environment: EC2, creating a cdh hadoop cluster
            Reporter: Evan Pollan


In about 1 out of every 5-7 attempts, whirr will hang while creating a spot-priced cluster in EC2.  The process just sits there, consuming no system resources and writing nothing to stderr or stdout.  In each case, some number of cluster nodes are up and running.

This happened again to me today, and whirr was hung for about 4 hours.  As usual, there were a bunch of errors logged while it was trying to create the instances.  About 10 minutes in, though, whirr just went radio silent and stayed that way until I killed it.

I'll attach the output -- it looks similar to the other instances where whirr has had this problem.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

[jira] [Commented] (WHIRR-488) whirr hangs in certain cases when creating a spot-priced EC2 cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194340#comment-13194340 ] 

Andrei Savu commented on WHIRR-488:
-----------------------------------

As far as I can see you are getting an InternalError from Amazon when starting / requesting a bunch of spot instances: 

{noformat}
org.jclouds.aws.AWSResponseException: request POST https://ec2.us-east-1.amazonaws.com/ HTTP/1.1 failed with code 400, error: AWSError{requestId='19b7d163-0c51-4c1d-8447-947beff61dbc', requestToken='null', code='InternalError', message='An internal error has occurred', context='{Response=, Errors=}'}
{noformat}

We can't do much to work around this except fail faster. Thanks for reporting!

                

        

[jira] [Commented] (WHIRR-488) whirr hangs in certain cases when creating a spot-priced EC2 cluster

Posted by "Andrei Savu (Commented) (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13194341#comment-13194341 ] 

Andrei Savu commented on WHIRR-488:
-----------------------------------

Adrian, have you seen this before? Is there a property in jclouds to decrease the wait time on spot requests?
                

        

[jira] [Updated] (WHIRR-488) whirr hangs in certain cases when creating a spot-priced EC2 cluster

Posted by "Evan Pollan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evan Pollan updated WHIRR-488:
------------------------------

    Attachment: whirr.jclouds.spotPriceHang.log

Here's how this behavior manifests itself in 0.7.0 and the trunk (see whirr.jclouds.spotPriceHang.log).

It's worth noting that I have a cron job that checks to see if whirr is still running launch-cluster 30 minutes after it was started.  If so, it kills it, waits 5 minutes, and tries to start the cluster again with the same properties.  ~75% of the time, this works.  If it hangs again, I edit the cluster properties to use on-demand pricing, and it works 100% of the time.
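
For anyone who wants to script the same workaround, here is a minimal sketch in Python of the kill-wait-retry-then-fall-back logic described above. It is not the actual cron job. The function names, the `runner` and `sleep` parameters, and the two properties file names in the usage comment are hypothetical, and it assumes the launch can be run as a single command whose exit status is 0 on success. It also treats a non-zero exit the same as a hang, and pauses before the on-demand attempt too, which are simplifications.

```python
import subprocess
import time


def run_with_timeout(cmd, timeout_s):
    """Run cmd; return True if it exits with status 0 before timeout_s elapses.

    subprocess.run kills the child process when the timeout expires and then
    raises TimeoutExpired, which is treated here as a hang.
    """
    try:
        return subprocess.run(cmd, timeout=timeout_s).returncode == 0
    except subprocess.TimeoutExpired:
        return False


def launch_with_fallback(spot_cmd, on_demand_cmd, timeout_s=30 * 60,
                         wait_s=5 * 60, runner=run_with_timeout,
                         sleep=time.sleep):
    """Try the spot launch twice with the same command, then try on-demand.

    Returns a label for the attempt that succeeded, or None if all failed.
    runner and sleep are injectable so the logic can be tested without
    starting real processes or waiting.
    """
    attempts = [("spot", spot_cmd), ("spot-retry", spot_cmd),
                ("on-demand", on_demand_cmd)]
    for i, (label, cmd) in enumerate(attempts):
        if runner(cmd, timeout_s):
            return label
        if i < len(attempts) - 1:
            sleep(wait_s)  # pause before the next attempt
    return None


# Hypothetical usage (file names are placeholders):
# launch_with_fallback(
#     ["whirr", "launch-cluster", "--config", "spot.properties"],
#     ["whirr", "launch-cluster", "--config", "on-demand.properties"])
```

Note that the timeout only kills the process that was started directly. If the command is a wrapper that spawns a separate JVM, that JVM may need to be cleaned up separately, along with any instances that were already running.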

Hangs now occur in more than 50% of attempts, BTW.  Seems like the problem is worse in 0.7.0+...
                

        

[jira] [Updated] (WHIRR-488) whirr hangs in certain cases when creating a spot-priced EC2 cluster

Posted by "Evan Pollan (Updated) (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Evan Pollan updated WHIRR-488:
------------------------------

    Attachment: whirr.startup.hang.log
    