Posted to reviews@impala.apache.org by "Michael Brown (Code Review)" <ge...@cloudera.org> on 2017/04/28 21:52:55 UTC

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/6763

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................

IMPALA-5162,IMPALA-5163: stress test support on secure clusters

This patch adds support for running the stress test
(concurrent_select.py) and loading nested data (load_nested.py) into a
Kerberized, SSL-enabled Impala cluster. It assumes the calling user
already has a valid Kerberos ticket. One way to do that is:

1. Get access to a keytab and krb5.conf
2. Set KRB5_CONFIG and KRB5CCNAME appropriately
3. Run kinit(1)
4. Run load_nested.py and/or concurrent_select.py within this
   environment.
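
As a rough illustration of the steps above, here is a minimal Python sketch.
It is not part of the patch: the helper names (kerberos_env, kinit_command,
run_in_kerberos_env) and all paths and principals are hypothetical, and it
assumes an MIT-style kinit that accepts -k -t to read a keytab.

```python
import os
import subprocess


def kerberos_env(krb5_config, ccache, base_env=None):
    """Return a copy of the environment with the Kerberos variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["KRB5_CONFIG"] = krb5_config  # path to the Kerberos configuration file
    env["KRB5CCNAME"] = ccache        # where kinit stores the ticket cache
    return env


def kinit_command(keytab, principal):
    """Build a kinit(1) invocation that obtains a ticket from a keytab."""
    return ["kinit", "-k", "-t", keytab, principal]


def run_in_kerberos_env(command, env):
    """Run a command (kinit, load_nested.py, ...) inside the prepared environment."""
    return subprocess.check_call(command, env=env)
```

A caller would build the environment once, run kinit_command(...) through
run_in_kerberos_env(), and then launch load_nested.py or concurrent_select.py
with the same env so that they pick up the ticket cache.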

Because our Python clients already support Kerberos and SSL, we simply
need to make sure to use the correct options when calling the entry
points and initializing the clients:

Impala: Impyla
Hive: Impyla
HDFS: hdfs.ext.kerberos.KerberosClient
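
To make the client options concrete, here is a hedged sketch of how such
connections might be opened. It is not the code from this patch: the helper
names are hypothetical, and the sketch assumes Impyla's connect() keyword
arguments (use_ssl, auth_mechanism, kerberos_service_name) and an HdfsCLI
install that provides hdfs.ext.kerberos.KerberosClient. For Hive, the same
helper would presumably be used with kerberos_service_name="hive".

```python
def impyla_connect_kwargs(host, port, use_kerberos, use_ssl,
                          kerberos_service_name="impala"):
    """Build keyword arguments for impala.dbapi.connect()."""
    kwargs = {"host": host, "port": port, "use_ssl": use_ssl}
    if use_kerberos:
        # GSSAPI is the SASL mechanism Impyla uses for Kerberos.
        kwargs["auth_mechanism"] = "GSSAPI"
        kwargs["kerberos_service_name"] = kerberos_service_name
    return kwargs


def webhdfs_url(namenode_http_address, use_ssl):
    """Pick the URL scheme for the WebHDFS endpoint."""
    scheme = "https" if use_ssl else "http"
    return "%s://%s" % (scheme, namenode_http_address)


def connect_impala(host, port, use_kerberos, use_ssl):
    """Open an Impyla connection; Impyla is imported only when needed."""
    from impala.dbapi import connect
    return connect(**impyla_connect_kwargs(host, port, use_kerberos, use_ssl))


def connect_hdfs(namenode_http_address, use_kerberos, use_ssl):
    """Return an HdfsCLI client, Kerberized if requested."""
    url = webhdfs_url(namenode_http_address, use_ssl)
    if use_kerberos:
        from hdfs.ext.kerberos import KerberosClient
        return KerberosClient(url)
    from hdfs import InsecureClient
    return InsecureClient(url)
```

Both Kerberized clients rely on the ticket already present in the caller's
credential cache, which is why the patch assumes kinit has been run first.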

With this patch, I was able to manually do a short concurrent_select.py
run against a secure cluster without connection or auth errors, and I
was able to do the same with load_nested.py for a cluster that already
had TPC-H loaded.

Follow-ons for future cleanup work:

IMPALA-5263: support CA bundles when running stress test against SSL'd
             Impala

IMPALA-5264: fix InsecurePlatformWarning under stress test with SSL

Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
---
M testdata/bin/load_nested.py
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
5 files changed, 61 insertions(+), 18 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/63/6763/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6763
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Matthew Mulder (Code Review)" <ge...@cloudera.org>.
Matthew Mulder has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS1, Line 404: 
Why is this removed?



Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

PS1, Line 404: 
> Why is this removed?
It's an unsupported parameter. http://hdfscli.readthedocs.io/en/latest/api.html#hdfs.ext.kerberos.KerberosClient



Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


IMPALA-5162,IMPALA-5163: stress test support on secure clusters

This patch adds support for running the stress test
(concurrent_select.py) and loading nested data (load_nested.py) into a
Kerberized, SSL-enabled Impala cluster. It assumes the calling user
already has a valid Kerberos ticket. One way to do that is:

1. Get access to a keytab and krb5.conf
2. Set KRB5_CONFIG and KRB5CCNAME appropriately
3. Run kinit(1)
4. Run load_nested.py and/or concurrent_select.py within this
   environment.

Because our Python clients already support Kerberos and SSL, we simply
need to make sure to use the correct options when calling the entry
points and initializing the clients:

Impala: Impyla
Hive: Impyla
HDFS: hdfs.ext.kerberos.KerberosClient

With this patch, I was able to manually do a short concurrent_select.py
run against a secure cluster without connection or auth errors, and I
was able to do the same with load_nested.py for a cluster that already
had TPC-H loaded.

Follow-ons for future cleanup work:

IMPALA-5263: support CA bundles when running stress test against SSL'd
             Impala

IMPALA-5264: fix InsecurePlatformWarning under stress test with SSL

Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Reviewed-on: http://gerrit.cloudera.org:8080/6763
Reviewed-by: Matthew Mulder <mm...@cloudera.com>
Reviewed-by: Alex Behm <al...@cloudera.com>
Tested-by: Impala Public Jenkins
---
M testdata/bin/load_nested.py
M tests/comparison/cli_options.py
M tests/comparison/cluster.py
M tests/comparison/db_connection.py
M tests/stress/concurrent_select.py
5 files changed, 61 insertions(+), 18 deletions(-)

Approvals:
  Matthew Mulder: Looks good to me, but someone else must approve
  Impala Public Jenkins: Verified
  Alex Behm: Looks good to me, approved




Gerrit-MessageType: merged
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

Line 364:                                               "0.0.0.0:50070")
> Independent question:
I can only speculate: my guess is that it has to do with supporting Mini vs. real clusters, where the port numbers differ, and dev environments that are only half set up. Do you want me to alter get_hadoop_config() in this patch and remove the use of default values?


Line 412:           local_shell(pip_path + " install pykerberos==1.1.14 requests-kerberos==0.11.0",
> Not your change, but this flow strikes me as odd. We have packages+versions
I can only speculate: I've seen this pattern in a few places, and it's likely an attempt at micro-optimization to keep unnecessary packages out of the Impala Python virtual environment. Pre-commit tests won't go through this path, for example, and thus don't need the packages.
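
For readers unfamiliar with the pattern, a minimal sketch of lazy install+import follows. It is an illustration only, not the code in cluster.py: the names lazy_import and pip_install are hypothetical, it assumes Python 3, and it invokes pip via "python -m pip" rather than the pip_path and local_shell() used in the patch. The pinned versions are the ones quoted above.

```python
import importlib
import subprocess
import sys

# Versions quoted in the review comment above.
PINNED = ["pykerberos==1.1.14", "requests-kerberos==0.11.0"]


def pip_install(requirements):
    """Install the given requirements into the current interpreter's environment."""
    subprocess.check_call([sys.executable, "-m", "pip", "install"] + list(requirements))


def lazy_import(module_name, requirements, installer=pip_install):
    """Import a module, installing its requirements only if the import fails."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        installer(requirements)
        # Make the import system notice packages installed after startup.
        importlib.invalidate_caches()
        return importlib.import_module(module_name)
```

Code paths that never call lazy_import("requests_kerberos", PINNED) never trigger the install, which is the presumed benefit; the cost is that package versions live in code rather than in a requirements file.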



Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1: Verified+1


Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1: Code-Review+2

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

Line 364:                                               "0.0.0.0:50070")
> I can only speculate: My guess it has to do with supporting Mini vs. real c
Let's not make the changes in this patch to avoid breaking functionality. Just wanted to get your take on this pattern.


Line 412:           local_shell(pip_path + " install pykerberos==1.1.14 requests-kerberos==0.11.0",
> I can only speculate: I've seen this pattern in a few places and it's likel
Thanks. If you agree there is questionable/little benefit to this lazy install+import, we should consider simplifying it - but not in this patch.



Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1:

Build started: http://jenkins.impala.io:8080/job/gerrit-verify-dryrun/524/


Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Matthew Mulder (Code Review)" <ge...@cloudera.org>.
Matthew Mulder has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1: Code-Review+1


Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-5162,IMPALA-5163: stress test support on secure clusters

Posted by "Alex Behm (Code Review)" <ge...@cloudera.org>.
Alex Behm has posted comments on this change.

Change subject: IMPALA-5162,IMPALA-5163: stress test support on secure clusters
......................................................................


Patch Set 1:

(2 comments)

Changes look reasonable, but I'm not super familiar with this code.

http://gerrit.cloudera.org:8080/#/c/6763/1/tests/comparison/cluster.py
File tests/comparison/cluster.py:

Line 364:                                               "0.0.0.0:50070")
Independent question:
Does it even make sense to plug in default values here?
Seems like a misconfiguration might be hard to debug if we plug in default values, instead of throwing an error.
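
To illustrate the trade-off, here is a small sketch contrasting the two lookups. It is hypothetical and not taken from cluster.py: the helper names and the MissingConfigError class are invented, and using "dfs.namenode.http-address" as the key is an assumption about which setting the default on this line belongs to.

```python
class MissingConfigError(KeyError):
    """Raised when a required Hadoop setting is absent."""


def get_with_default(config, key, default):
    """Current style: silently fall back to a default value."""
    return config.get(key, default)


def get_required(config, key):
    """Fail-fast style: surface the misconfiguration at lookup time."""
    try:
        return config[key]
    except KeyError:
        raise MissingConfigError("required Hadoop setting %r is not configured" % key)
```

With an empty config, get_with_default() returns "0.0.0.0:50070" and any failure shows up later as a connection error, whereas get_required() names the missing key immediately.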


Line 412:           local_shell(pip_path + " install pykerberos==1.1.14 requests-kerberos==0.11.0",
Not your change, but this flow strikes me as odd. We have packages+versions baked into the code here. What's the benefit of doing this lazy install+import as opposed to requiring these to be installed up-front?



Gerrit-MessageType: comment
Gerrit-Change-Id: I0daad57bb8ceeb5071b75125f11c1997ed7e0179
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Alex Behm <al...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Matthew Mulder <mm...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: Yes