Posted to dev@whirr.apache.org by "Lars George (JIRA)" <ji...@apache.org> on 2011/01/14 16:25:46 UTC

[jira] Created: (WHIRR-207) Handle wget timeouts better

Handle wget timeouts better
---------------------------

                 Key: WHIRR-207
                 URL: https://issues.apache.org/jira/browse/WHIRR-207
             Project: Whirr
          Issue Type: Bug
    Affects Versions: 0.3.0
            Reporter: Lars George


I have hit this download failure before, and it has now happened again. We need to handle it better:

{code}
+ for i in '`seq 1 3`'
+ curl --retry 3 --silent --show-error --fail -O http://archive.apache.org/dist/hbase/hbase-0.20.6/hbase-0.20.6.tar.gz
curl: (18) transfer closed with 12646997 bytes remaining to read
{code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991466#comment-12991466 ] 

Andrei Savu commented on WHIRR-207:
-----------------------------------

From the {{set}} manpage:

{code}
-e   errexit
          Exit immediately if a simple command exits with a non-zero
          status, unless the command that fails is part of an until or
          while loop, part of an if statement, part of a && or || list,
          or if the command's return status is being inverted using !. 
{code}

I will replace the {{for}} loop with a {{while}} loop, add a wait before each retry as Lars suggested, and move everything into a reusable function.
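
A minimal sketch of what such a reusable function could look like. This is not the code from the attached patch: the function name download_with_retries, its parameters, the default values, and the CURL override variable are all assumptions made for illustration. The curl call sits in the condition of an until loop, which the errexit excerpt above lists among the exemptions, so a failed transfer leads to a wait and a retry instead of ending the script.

```shell
# download_with_retries URL [MAX_ATTEMPTS] [WAIT_SECONDS]
# Hypothetical helper; names and defaults are illustrative only.
# CURL can be set to another command name to swap out curl (e.g. in tests).
download_with_retries() {
  local url=$1
  local max_attempts=${2:-3}
  local wait_seconds=${3:-10}
  local attempt=1

  # curl is the condition of `until`, so a non-zero exit status
  # does not trigger errexit (set -e).
  until ${CURL:-curl} --retry 3 --silent --show-error --fail -O "$url"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up on $url after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$wait_seconds"   # wait before the next attempt
  done
}
```

A caller that wants the script to stop once every attempt has failed can invoke the function as a plain command under {{set -e}}; a caller that wants to continue can test it with {{if}}.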


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Assignee: Andrei Savu
      Status: Patch Available  (was: Open)


[jira] Commented: (WHIRR-207) Handle wget timeouts better

Posted by "Lars George (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981851#action_12981851 ] 

Lars George commented on WHIRR-207:
-----------------------------------

+1. I suggest a reusable bash function for downloading with retries.



[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Attachment: WHIRR-207.patch

I've updated the patch and it should work for all the services (only tested with hadoop and zookeeper). Let me know if it works for you. 


[jira] Commented: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991465#comment-12991465 ] 

Andrei Savu commented on WHIRR-207:
-----------------------------------

We are seeing this failure because the script runs {{set -e}} at the beginning and we don't handle the curl exit code (18). In this failure scenario the script exits at the failed curl call, so the loop never retries.
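
The effect can be reproduced without any network access. The sketch below is an illustration, not part of the Whirr scripts: fake_download is a hypothetical stand-in for a curl call that exits with status 18, and each variant runs in its own bash process so that errexit behaves as it would in a real script.

```shell
# Variant 1: the failing command is a bare statement inside the for loop.
unguarded_script='
set -e
fake_download() { return 18; }   # hypothetical stand-in for the failing curl
for i in 1 2 3; do
  echo "attempt $i"
  fake_download
done
echo "loop finished"
'

# Variant 2: the same command is tested by `if`, which errexit exempts.
guarded_script='
set -e
fake_download() { return 18; }
for i in 1 2 3; do
  echo "attempt $i"
  if fake_download; then break; fi
done
echo "loop finished"
'

unguarded_status=0
unguarded_out=$(bash -c "$unguarded_script") || unguarded_status=$?
guarded_status=0
guarded_out=$(bash -c "$guarded_script") || guarded_status=$?

echo "unguarded: exit status $unguarded_status"
echo "$unguarded_out"
echo "guarded: exit status $guarded_status"
echo "$guarded_out"
```

The unguarded variant stops after the first attempt with status 18, while the guarded variant runs all three attempts and reaches the end of the script.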


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Status: Open  (was: Patch Available)


[jira] Commented: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992726#comment-12992726 ] 

Andrei Savu commented on WHIRR-207:
-----------------------------------

Great! I will fix the patch as soon as possible.


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Attachment: WHIRR-207.patch

I've tested the {{install_tar}} function on the development machine by shutting down and restarting the connection. I haven't run the integration tests yet. I'm planning to do that tomorrow. 


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Attachment: WHIRR-207.patch

I've fixed the patch. Tested with hbase (which should also cover hadoop and zookeeper) and cassandra.


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Fix Version/s: 0.4.0


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Tom White (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tom White updated WHIRR-207:
----------------------------

    Attachment: WHIRR-207.patch

Unfortunately WHIRR-225 broke this patch completely, so I've generated an equivalent. The nice thing is that we only need one copy of the install_tarball function with the WHIRR-225 approach.

I've tested with ZooKeeper, but haven't done HBase yet, since it has some different semantics for resolving the tar name from the URL (this was a problem with the original patch too). Can we generalize the function to take an optional tar name too?
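
One way the optional tar name could be handled is sketched below. This is only an assumption about the shape of such a helper, not the function from any of the attached patches: tarball_name and its arguments are hypothetical names, and the fallback simply takes the last path component of the URL, which may not match the HBase naming mentioned above.

```shell
# tarball_name URL [TAR_NAME]
# Hypothetical helper: prints TAR_NAME when it is given, otherwise the
# last path component of URL with any query string removed.
tarball_name() {
  local url=$1
  local name=${2:-}
  if [ -z "$name" ]; then
    name=${url##*/}     # strip everything up to the last slash
    name=${name%%\?*}   # drop a query string, if any
  fi
  printf '%s\n' "$name"
}

# Falls back to the file name in the URL.
tarball_name http://archive.apache.org/dist/hbase/hbase-0.20.6/hbase-0.20.6.tar.gz
# An explicit name overrides the URL-derived one.
tarball_name 'http://example.invalid/download?id=42' hbase-0.20.6.tar.gz
```

Both calls print hbase-0.20.6.tar.gz. The resolved name could then be passed to curl with {{-o}} in place of {{-O}}, so the download and the later untar step agree on the file name.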


[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Lars George (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Lars George updated WHIRR-207:
------------------------------

    Summary: Handle curl timeouts better  (was: Handle wget timeouts better)

It is actually curl, sorry for the wrong title (corrected). The scripts already have a retry:

{code}
  curl="curl --retry 3 --silent --show-error --fail"
  for i in `seq 1 3`;
  do
    $curl -O $hbase_tar_url
    $curl -O $hbase_tar_url.md5
    if md5sum -c $hbase_tar_md5_file; then
      break;
    else
      rm -f $hbase_tar_file $hbase_tar_md5_file
    fi
  done
{code}

Do these errors actually trigger the retry loop? And if so, should there be a

{code}
if [ $i -gt 1 ]; then
    sleep 10;
fi
{code}

or something similar to wait before the retry? And when should we give up entirely?



[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Resolution: Fixed
        Status: Resolved  (was: Patch Available)

I've just committed this. Thanks Tom for reviewing. 


[jira] Issue Comment Edited: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991465#comment-12991465 ] 

Andrei Savu edited comment on WHIRR-207 at 2/7/11 4:52 PM:
-----------------------------------------------------------

We are seeing this failure because the script runs {{set -e}} at the beginning and we don't handle the curl exit code (18). In this failure scenario the script exits at the failed curl call, so the loop never retries.

      was (Author: savu.andrei):
    We are seeing this failure because we do at the beginning of script {{set -e}} and we don't handle the curl exit code (18). The loop and the retry is never executed in this failure scenario.  
  

[jira] Updated: (WHIRR-207) Handle curl timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-207:
------------------------------

    Affects Version/s:     (was: 0.3.0)
               Status: Patch Available  (was: Open)


[jira] Commented: (WHIRR-207) Handle wget timeouts better

Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981840#action_12981840 ] 

Andrei Savu commented on WHIRR-207:
-----------------------------------

The curl exit codes are well documented. We should check the exit code and retry as needed.
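
A rough sketch of such a check follows. The function name is_retryable_curl_status is hypothetical, and which codes count as transient is a judgment call rather than anything decided in this issue. The numbers themselves come from the curl manual: 6 (could not resolve host), 7 (failed to connect), 18 (partial file, the error reported here), 28 (operation timeout), 52 (empty reply from server), and 56 (failure receiving network data).

```shell
# is_retryable_curl_status STATUS
# Hypothetical helper: returns 0 when the curl exit status looks like a
# transient network problem that is worth retrying, 1 otherwise.
is_retryable_curl_status() {
  case "$1" in
    6|7|18|28|52|56) return 0 ;;  # DNS, connect, partial file, timeout, empty reply, recv failure
    *) return 1 ;;                # e.g. 3 (malformed URL) or 22 (HTTP error with --fail)
  esac
}

for status in 18 22; do
  if is_retryable_curl_status "$status"; then
    echo "curl exit $status: retry"
  else
    echo "curl exit $status: give up"
  fi
done
```

With this split, the partial transfer from the report (18) is retried, while an HTTP error such as a 404 reported through {{--fail}} (22) is treated as final.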



[jira] Commented: (WHIRR-207) Handle curl timeouts better

Posted by "Tom White (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996818#comment-12996818 ] 

Tom White commented on WHIRR-207:
---------------------------------

+1 looks good.
