You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@impala.apache.org by "David Knupp (Code Review)" <ge...@cloudera.org> on 2016/09/09 18:00:18 UTC

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

David Knupp has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4348

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic for a couple of reasons (most
notably because it queried the wrong Zookeeper node) so we stopped
using it during our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with certain updates:

- replace the original script with check-hbase-nodes.py
- query the corect node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 116 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 9: Code-Review+2

Thanks for the explanation. Harrison, thanks for reviewing!

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs...@hotmail.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 88:         LOGGER.debug("Success: " + str(zk_client))
> That's a good catch, but I think it might be OK as-is. This is the output o
Wfm.


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 9:

What about the other clients like for HDFS?

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs...@hotmail.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Ishaan Joshi (Code Review)" <ge...@cloudera.org>.
Ishaan Joshi has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/3/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 103:             except Exception:
> Unless you want this to catch SIGINT(Ctrl-C), you want to do:
This is wrong, ignore. Nevertheless, thanks to David for looking into it. FWIW, the problems occures with an except:, not an exception Exception:


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 5:

(9 comments)

Under what circumstances is more than one zookeeper needed?

http://gerrit.cloudera.org:8080/#/c/4348/5//COMMIT_MSG
Commit Message:

PS5, Line 11: The original script was problematic for a couple of reasons (most
            : notably because it queried the wrong Zookeeper node) so we stopped
            : using it during our mini-cluster HBase start up procedure.
Note quite right. I didn't use it because it wasn't clear it was needed anymore, the so-called "HBase race" had been fixed. I didn't notice the master vs. rs discrepancy at that time.


http://gerrit.cloudera.org:8080/#/c/4348/5/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

PS5, Line 50:     parser.add_argument('--timeout', '-t', action='store'
I find action='store' to be redundant since that's the default.


PS5, Line 52:                               'Default is 30 seconds.'))
Don't hardcode 30. Use a format string and read from TIMEOUT_SECONDS.


PS5, Line 56:                               'e.g, 0.0.0.0:5070. Default is localhost:5070.'))
Use a format string and read from HDFS_HOST.


PS5, Line 60:                               'e.g, 0.0.0.0:2181. Default is localhost:2181.'))
Use a format string and read from ZK_HOSTS.


PS5, Line 66:                               '/hbase/master and /hbase/rs.')
Use a format string. Maybe use '-n '.join(HBASE_NODES)  as part of it?


PS5, Line 127:         hdfs_client.list('/')
What does this do?


PS5, Line 125:     try:
             :         hdfs_client = InsecureClient('http://' + args.hdfs_host)
             :         hdfs_client.list('/')
             :     except requests.exceptions.ConnectionError as e:
             :         msg = 'Could not connect to HDFS web host http://{0} - {1}'.format(args.hdfs_host, e)
             :         LOGGER.error(msg)
             :         sys.exit(1)
             : 
             :     zk_client = connect_to_zookeeper(args.zookeeper_hosts, args.timeout)
             :     errors = sum([check_hbase_node(zk_client, node, args.timeout) for node in args.nodes])
I suggest you make this a method. The __main__ can get the arguments, the errors, and print the error and exit accordingly. It would be better to have a separate method to make a connection and do the query work.


PS5, Line 135:     if errors:
It would be better to have the explicit numerical > 0 comparison.


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 9:

Harrison and I looked at that too, but the hdfs lib binds to the WebHDFS/HttpFS API, and doesn't maintain active sessions. (The hdfs client doesn't provide any kind of __exit__() or close() interface.)

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs...@hotmail.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 8: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 8: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Reviewed-on: http://gerrit.cloudera.org:8080/4348
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Internal Jenkins
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 176 insertions(+), 59 deletions(-)

Approvals:
  Internal Jenkins: Verified
  Alex Behm: Looks good to me, approved



-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 10
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs...@hotmail.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has uploaded a new patch set (#7).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 173 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/7
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has uploaded a new patch set (#6).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 154 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/6
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has uploaded a new patch set (#5).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic for a couple of reasons (most
notably because it queried the wrong Zookeeper node) so we stopped
using it during our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 140 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/5
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 8:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 88:         LOGGER.debug("Success: " + str(zk_client))
> LOGGER.info? Otherwise it looks like we're trying to connect but never succ
That's a good catch, but I think it might be OK as-is. This is the output of a typical run:

    
    Contents of HDFS root: [u'hbase', u'home', u'test-warehouse', u'tmp', u'user']
    Connecting to Zookeeper host(s).
    Waiting for HBase node: /hbase/master
    Success: /hbase/master
    Waiting for HBase node: /hbase/rs
    Success: /hbase/rs 
    [etc...]
    

The missing line would have been:

    
    Success: <kazoo.client.KazooClient object at 0x7f12dda65250>
    

If the connection attempt didn't succeed, I don't think it would be a mystery -- you'd get the exception msg, and the script would exit with an error. Is that acceptable? (Honestly, I think might have intended to take that line out.)


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
Hello Michael Brown, Internal Jenkins, Alex Behm,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4348

to look at the new patch set (#9).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 176 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/9
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 9: Verified+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs...@hotmail.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4348/6/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

PS6, Line 59:                         help=('Comma-delineated string of hosts in host:PORT format, '
When is more than one zk host needed?


PS6, Line 135:     # Confirm that HDFS is available. There is a pathological case where the HBase master
             :     # can start up briefly if HDFS is not available, and then quit immediately, but that can
             :     # be long enough to give a false positive that the HBase master is running.
             :     #
             :     try:
             :         hdfs_client = InsecureClient('http://' + args.hdfs_host)
             :         hdfs_client.list('/')
             :     except requests.exceptions.ConnectionError as e:
             :         msg = 'Could not connect to HDFS web host http://{0} - {1}'.format(args.hdfs_host, e)
             :         LOGGER.error(msg)
             :         sys.exit(1)
             : 
             :     errors = check_znodes_list_for_errors(args.nodes, args.zookeeper_hosts, args.timeout)
I meant for this section also to be a method, with errors returned.


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Harrison Sheinblatt (Code Review)" <ge...@cloudera.org>.
Harrison Sheinblatt has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/9/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 135:         zk_client.stop()
A problem is that zk client wants you to stop() and then close().  In this case, if there is an exception or something that gets through, the close() will be called without stop().  This may work without a 'graceful shutdown' of the client, and be OK.  I've done a 2-phase context manager in the past to ensure stop() is called before close().


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Harrison Sheinblatt <hs...@hotmail.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has uploaded a new patch set (#8).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic, so we stopped using it during
our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 173 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/8
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 9:

Harrison very astutely pointed out in an out-of-band message that I had neglected to call close on the zk_client. Latest patch wraps the client in contextlib.closing() to avoid potentially leaking file handles.

-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4348/8/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 88:         LOGGER.debug("Success: " + str(zk_client))
LOGGER.info? Otherwise it looks like we're trying to connect but never succeed


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "Ishaan Joshi (Code Review)" <ge...@cloudera.org>.
Ishaan Joshi has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4348/3/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 75:     except Exception as e:
Does the message you tack on to the exception give you a lot more information than the exception? If not, then there's no need to catch it and then do an exit().


Line 92:     while True:
More readable to put the timeout at the top of the loop and exit if you bug out of it.

while time.time() - start_time > TIMEOUT_SECONDS:
  ...
  if foo: return 0

LOGGER.error(..)
return 1

On that note, why not return True/False instead of 1/0 ?


Line 103:             except Exception:
Unless you want this to catch SIGINT(Ctrl-C), you want to do:

except Exception as e: instead of except Exception


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 5:

(8 comments)

Reworked the git commit message, and fixed all the issues. I'm trying to find a good way to check the remote health of the namenode on a remote cluster.

http://gerrit.cloudera.org:8080/#/c/4348/5/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

PS5, Line 50:     parser.add_argument('--timeout', '-t', action='store'
> I find action='store' to be redundant since that's the default.
Done


PS5, Line 52:                               'Default is 30 seconds.'))
> Don't hardcode 30. Use a format string and read from TIMEOUT_SECONDS.
Done


PS5, Line 56:                               'e.g, 0.0.0.0:5070. Default is localhost:5070.'))
> Use a format string and read from HDFS_HOST.
Done


PS5, Line 60:                               'e.g, 0.0.0.0:2181. Default is localhost:2181.'))
> Use a format string and read from ZK_HOSTS.
Done


PS5, Line 66:                               '/hbase/master and /hbase/rs.')
> Use a format string. Maybe use '-n '.join(HBASE_NODES)  as part of it?
Done


PS5, Line 127:         hdfs_client.list('/')
> What does this do?
Throws an exception if we can't connect to HDFS. I was re-thinking this weekend that maybe this whole bit should either moved or reworked. It doesn't seem to be working on remote hosts.


PS5, Line 125:     try:
             :         hdfs_client = InsecureClient('http://' + args.hdfs_host)
             :         hdfs_client.list('/')
             :     except requests.exceptions.ConnectionError as e:
             :         msg = 'Could not connect to HDFS web host http://{0} - {1}'.format(args.hdfs_host, e)
             :         LOGGER.error(msg)
             :         sys.exit(1)
             : 
             :     zk_client = connect_to_zookeeper(args.zookeeper_hosts, args.timeout)
             :     errors = sum([check_hbase_node(zk_client, node, args.timeout) for node in args.nodes])
> I suggest you make this a method. The __main__ can get the arguments, the e
Done


PS5, Line 135:     if errors:
> It would be better to have the explicit numerical > 0 comparison.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has uploaded a new patch set (#4).

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................

IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

We used to include a step in run-hbase.sh for calling a python
script that queried Zookeeper to see if the HBase master was up.
The original script was problematic for a couple of reasons (most
notably because it queried the wrong Zookeeper node) so we stopped
using it during our mini-cluster HBase start up procedure.

HBase start up issues continue to plague us, however. This patch
reintroduces a Zookeeper check, with the following updates:

- replace the original script with check-hbase-nodes.py
- query the correct node /hbase/master, not just /hbase/rs
- use the python Zookeeper library kazoo, rather than calling
  out to the shell and parsing the return string
- since we are moving toward testing on a remote cluster, also
  add the capability to pass in the address for the host that
  provides the Zookeeper and HBase services
- add an additional check that the HDFS service is running,
  because of an edge case where the HBase master can briefly
  start without a cluster running.

In addition to the expected tests, this script was also tested
under the conditions of IMPALA-4088, whereby the HBase RegionServer
is running, but the master fails because another listening process
has already taken its TCP port (60010) during startup.

Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
---
M infra/python/deps/requirements.txt
A testdata/bin/check-hbase-nodes.py
M testdata/bin/run-hbase.sh
D testdata/bin/wait-for-hbase-master.py
4 files changed, 139 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/48/4348/4
-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 3:

(3 comments)

Thanks for the feedback. It was helpful.

Also, note that I'm adding a preliminary check for HDFS.

http://gerrit.cloudera.org:8080/#/c/4348/3/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

Line 75:     except Exception as e:
> Does the message you tack on to the exception give you a lot more informati
No, but the exception doesn't give me any addtional helpful information, and I prefer not to look at tracebacks if I don't need to. Purely an aesthetic choice. If you strongly prefer the traceback, I can remove.


Line 92:     while True:
> More readable to put the timeout at the top of the loop and exit if you bug
I think you mean "<", yes? ;-)

I find the more verbose version more explicit and legible, but eh, not a big deal either way.

WRT to 1/0 vs. T/F, I'm actually summing the number of errors at the end. Probably not necessary, but now you can pass in any number of znodes, and see which ones fail and which don't. (It's probably overkill.)


Line 103:             except Exception:
> This is wrong, ignore. Nevertheless, thanks to David for looking into it. F
Actually, this was a useful comment, because it prompted me to not be lazy and actually import and check for specific exceptions, which is the right thing to do.


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-2013: Reintroduce steps for checking HBase health in run-hbase.sh
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4348/6/testdata/bin/check-hbase-nodes.py
File testdata/bin/check-hbase-nodes.py:

PS6, Line 59:                         help=('Comma-delineated string of hosts in host:PORT format, '
> When is more than one zk host needed?
Don't know, but that's the way the API is written.


PS6, Line 135:     # Confirm that HDFS is available. There is a pathological case where the HBase master
             :     # can start up briefly if HDFS is not available, and then quit immediately, but that can
             :     # be long enough to give a false positive that the HBase master is running.
             :     #
             :     try:
             :         hdfs_client = InsecureClient('http://' + args.hdfs_host)
             :         hdfs_client.list('/')
             :     except requests.exceptions.ConnectionError as e:
             :         msg = 'Could not connect to HDFS web host http://{0} - {1}'.format(args.hdfs_host, e)
             :         LOGGER.error(msg)
             :         sys.exit(1)
             : 
             :     errors = check_znodes_list_for_errors(args.nodes, args.zookeeper_hosts, args.timeout)
> I meant for this section also to be a method, with errors returned.
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4348
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I9b81f3cfb6ea0ba7b18ce5fcd5d268f515c8b0c3
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Ishaan Joshi <is...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes