Posted to dev@whirr.apache.org by "Lars George (JIRA)" <ji...@apache.org> on 2011/01/14 16:25:46 UTC
[jira] Created: (WHIRR-207) Handle wget timeouts better
Handle wget timeouts better
---------------------------
Key: WHIRR-207
URL: https://issues.apache.org/jira/browse/WHIRR-207
Project: Whirr
Issue Type: Bug
Affects Versions: 0.3.0
Reporter: Lars George
I have hit this download failure before, and now it has happened again. We need to handle it better:
{code}
+ for i in '`seq 1 3`'
+ curl --retry 3 --silent --show-error --fail -O http://archive.apache.org/dist/hbase/hbase-0.20.6/hbase-0.20.6.tar.gz
curl: (18) transfer closed with 12646997 bytes remaining to read
{code}
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Commented: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991466#comment-12991466 ]
Andrei Savu commented on WHIRR-207:
-----------------------------------
From the {{set}} manpage:
{code}
-e errexit
Exit immediately if a simple command exits with a non-zero
status, unless the command that fails is part of an until or
while loop, part of an if statement, part of a && or || list,
or if the command's return status is being inverted using !.
{code}
I will replace the {{for}} loop with a {{while}} and add a wait before a retry
as Lars suggested and move everything inside a function that can be reused.
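A minimal sketch of what such a reusable function could look like, assuming bash. This is not the code from the attached patch; the name retry_with_wait and its arguments are made up for illustration. The command runs as the condition of a {{while}} loop, so per the manpage excerpt above a failure does not trigger errexit.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the WHIRR-207 patch).
# Usage: retry_with_wait MAX_ATTEMPTS WAIT_SECONDS COMMAND [ARGS...]
retry_with_wait() {
  local max_attempts="$1"
  local wait_seconds="$2"
  shift 2
  local attempt=1
  # The command is the (negated) condition of the while loop, so a
  # non-zero exit status does not abort a script running under `set -e`.
  while ! "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$wait_seconds"   # wait before the next attempt
  done
}
```

For the download in this issue, a call could look like: retry_with_wait 3 10 curl --silent --show-error --fail -O "$hbase_tar_url". Because the final {{return 1}} is not guarded, a script running under {{set -e}} still stops once all attempts are used up.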
--
This message is automatically generated by JIRA.
-
For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Assignee: Andrei Savu
Status: Patch Available (was: Open)
[jira] Commented: (WHIRR-207) Handle wget timeouts better
Posted by "Lars George (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981851#action_12981851 ]
Lars George commented on WHIRR-207:
-----------------------------------
+1, I suggest some bash function() that can be reused to download with retries.
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Attachment: WHIRR-207.patch
I've updated the patch and it should work for all the services (only tested with hadoop and zookeeper). Let me know if it works for you.
[jira] Commented: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991465#comment-12991465 ]
Andrei Savu commented on WHIRR-207:
-----------------------------------
We are seeing this failure because we do at the beginning of script {{set -e}} and we don't handle the curl exit code (18). The loop and the retry is never executed in this failure scenario.
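The effect can be reproduced without curl. A minimal sketch, assuming bash is on the PATH, with {{false}} standing in for the failing download (an assumption for illustration): the unguarded loop stops at the first failure, while the same loop with the command inside an {{if}} runs all three iterations.

```shell
#!/usr/bin/env bash
# `false` stands in for the failing curl call (illustration only).

# Unguarded: errexit ends the script at the first failing command.
unguarded='
set -e
for i in 1 2 3; do
  echo "attempt $i"
  false
done
echo "loop finished"
'

# Guarded: the failing command is the condition of an `if`, so errexit
# is suppressed and the loop keeps going.
guarded='
set -e
for i in 1 2 3; do
  echo "attempt $i"
  if false; then break; fi
done
echo "loop finished"
'

# Each script runs in its own bash process so its `set -e` is independent.
unguarded_status=0
unguarded_out=$(bash -c "$unguarded") || unguarded_status=$?
guarded_out=$(bash -c "$guarded")

printf 'unguarded (exit status %s):\n%s\n' "$unguarded_status" "$unguarded_out"
printf 'guarded:\n%s\n' "$guarded_out"
```

The unguarded run prints only "attempt 1" and exits non-zero, which matches the trace in the issue description where a single curl call is visible.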
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Status: Open (was: Patch Available)
[jira] Commented: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12992726#comment-12992726 ]
Andrei Savu commented on WHIRR-207:
-----------------------------------
Great! I will fix the patch as soon as possible.
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Attachment: WHIRR-207.patch
I've tested the {{install_tar}} function on the development machine by shutting down and restarting the connection. I haven't run the integration tests yet. I'm planning to do that tomorrow.
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Attachment: WHIRR-207.patch
I've fixed the patch. Tested with hbase (should also cover hadoop and zookeeper) and cassandra.
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Fix Version/s: 0.4.0
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Tom White (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tom White updated WHIRR-207:
----------------------------
Attachment: WHIRR-207.patch
Unfortunately WHIRR-225 broke this patch completely, so I've generated an equivalent. The nice thing is that we only need one copy of the install_tarball function with the WHIRR-225 approach.
I've tested with ZooKeeper, but haven't done HBase yet, since it has some different semantics for resolving the tar name from the URL (this was a problem with the original patch too). Can we generalize the function to take an optional tar name too?
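One way the optional tar name could work, sketched under assumptions: the helper name tar_name_for is hypothetical, and how HBase actually derives its tar name is not shown in this thread. The second argument, when given, overrides the name taken from the last path segment of the URL.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not from the WHIRR-207 or WHIRR-225 patches).
# $1 = tarball URL
# $2 = optional local file name; defaults to the last path segment of $1
tar_name_for() {
  local url="$1"
  local name="${2:-}"
  if [ -z "$name" ]; then
    name="${url##*/}"   # strip everything up to the last slash
  fi
  printf '%s\n' "$name"
}
```

The result could then be passed to curl with -o "$name" in place of -O, so the download and the later untar step agree on the file name.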
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Lars George (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lars George updated WHIRR-207:
------------------------------
Summary: Handle curl timeouts better (was: Handle wget timeouts better)
It is actually curl, sorry for the wrong title (corrected). The scripts already have a retry:
{code}
curl="curl --retry 3 --silent --show-error --fail"
for i in `seq 1 3`;
do
  $curl -O $hbase_tar_url
  $curl -O $hbase_tar_url.md5
  if md5sum -c $hbase_tar_md5_file; then
    break;
  else
    rm -f $hbase_tar_file $hbase_tar_md5_file
  fi
done
{code}
Do these errors actually trigger the retry loop? And if so, should there be a
{code}
if [ $i -gt 1 ]; then
  sleep 10;
fi
{code}
or something similar to wait before the retry? And when should we give up entirely?
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Resolution: Fixed
Status: Resolved (was: Patch Available)
I've just committed this. Thanks Tom for reviewing.
[jira] Issue Comment Edited: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12991465#comment-12991465 ]
Andrei Savu edited comment on WHIRR-207 at 2/7/11 4:52 PM:
-----------------------------------------------------------
We are seeing this failure because the script runs {{set -e}} at the beginning and we don't handle the curl exit code (18). The loop and the retry are never executed in this failure scenario.
was (Author: savu.andrei):
We are seeing this failure because we do at the beginning of script {{set -e}} and we don't handle the curl exit code (18). The loop and the retry is never executed in this failure scenario.
[jira] Updated: (WHIRR-207) Handle curl timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andrei Savu updated WHIRR-207:
------------------------------
Affects Version/s: (was: 0.3.0)
Status: Patch Available (was: Open)
[jira] Commented: (WHIRR-207) Handle wget timeouts better
Posted by "Andrei Savu (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12981840#action_12981840 ]
Andrei Savu commented on WHIRR-207:
-----------------------------------
The curl exit codes are well documented. We should check them and retry as needed.
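A minimal sketch of such a check, assuming bash. The meanings of the exit codes follow the curl man page; the helper name is_retryable_curl_status and the choice of which codes count as worth retrying are assumptions made for illustration, not something decided in this issue.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds if a curl exit status looks transient.
# The selection of codes is an assumption for illustration.
is_retryable_curl_status() {
  case "$1" in
    6)  return 0 ;;  # could not resolve host
    7)  return 0 ;;  # failed to connect to host
    18) return 0 ;;  # partial file (the error reported in this issue)
    28) return 0 ;;  # operation timeout
    52) return 0 ;;  # server returned nothing
    56) return 0 ;;  # failure receiving network data
    *)  return 1 ;;  # e.g. 22: HTTP error reported by --fail
  esac
}
```

A caller would capture the status with something like: status=0; $curl -O "$url" || status=$? (the || keeps {{set -e}} from aborting), and then retry only when is_retryable_curl_status "$status" succeeds.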
[jira] Commented: (WHIRR-207) Handle curl timeouts better
Posted by "Tom White (JIRA)" <ji...@apache.org>.
[ https://issues.apache.org/jira/browse/WHIRR-207?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12996818#comment-12996818 ]
Tom White commented on WHIRR-207:
---------------------------------
+1 looks good.