Posted to reviews@impala.apache.org by "Michael Brown (Code Review)" <ge...@cloudera.org> on 2016/10/10 23:08:20 UTC

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Michael Brown has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/4678

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 182 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4678
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 1
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 6:

Patch set 6 is a rebase; it resolves conflicts with the commit from https://gerrit.cloudera.org/#/c/4187/.


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 8:

(3 comments)

Thanks for the review, Taras. Please see patch set 8.

http://gerrit.cloudera.org:8080/#/c/4678/7/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS7, Line 142: art_command += (
> I feel like it's not elegant that we have to add another -v here. How about
Done


PS7, Line 171:   
> It's interesting that the closing brace is on a separate line. Is this some
It satisfies flake8 to switch to this style. If I join this line up to the line above and run flake8, I get an error like this:

  impala_docker_env.py:169:9: E125 continuation line with same indent as next logical line
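
For illustration, a minimal sketch of the layout being discussed. The function name and arguments are hypothetical and not taken from impala_docker_env.py. If the last continuation line shared the 4-space indent of the function body, flake8 could report E125; putting the closing parenthesis on its own line at the "def" column avoids that.

```python
# Hypothetical helper, used only to show the layout that avoids E125.
def build_volume_option(
    host_path,
    container_path
):
    # The closing parenthesis sits on its own line, dedented to the 'def'
    # column, so no continuation line shares the body's indent.
    return '-v {0}:{1}'.format(host_path, container_path)


print(build_volume_option('/host/testdata', '/container/testdata'))
```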


PS7, Line 312: '--delete -
> it's a little confusing here, why are there two single quotes followed by a
I'm gonna call this line a Casey-ism: It's so I can align the ssh options under the ssh command, not include extra white space in the rendered Fabric command, and satisfy flake8.

This:

  '       -o UserKnownHostsFile=/dev/null -p {ssh_port}" '

...causes a bunch of white space to exist in the rendered command; it can be seen when you are running ps.

And this:

  'rsync -e "ssh [etc]" '
             '-o UserKnownHostsFile [etc]" '

... is seen by flake8 as an over-indented line and produces an error.

This seemed the easiest way to satisfy all of the above.
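
A small sketch of the trick, relying on Python's implicit concatenation of adjacent string literals. The render_fragment name and the sample values are hypothetical, and the string is only a fragment of the real command (no source or destination). The empty literal '' contributes nothing to the result, so the spaces between it and the next literal exist only in the source file.

```python
def render_fragment(priv_key, ssh_port):
    # Adjacent string literals are concatenated at compile time. The empty
    # literal '' starts the second line at the normal indent; the spaces
    # after it align the ssh options in the source without ending up in the
    # rendered command.
    return (
        'rsync -e "ssh -i {priv_key} -o StrictHostKeyChecking=no '
        ''            '-o UserKnownHostsFile=/dev/null -p {ssh_port}" '
        '--delete'
    ).format(priv_key=priv_key, ssh_port=ssh_port)


rendered = render_fragment('/path/to/key', 2222)
print(rendered)
```

The rendered text contains only single spaces between options, which is what keeps the ps output tidy.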



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 7:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/4678/7/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS7, Line 142: volume_ops = '-v ' + volume_ops
I feel like it's not elegant that we have to add another -v here. How about changing the template in the line above to '-v {host_path}:{container_path}' and removing '-v' from line 138? This line can then be removed.
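
A rough sketch of what that suggestion could look like. The function name and the dict of paths are hypothetical and only the template string comes from the comment above. Each mapping renders its own -v, so nothing needs to be prepended afterwards.

```python
def render_volume_opts(volume_map):
    # volume_map is a hypothetical dict of {host_path: container_path}.
    template = '-v {host_path}:{container_path}'
    return ' '.join(
        template.format(host_path=host_path, container_path=container_path)
        for host_path, container_path in sorted(volume_map.items())
    )


print(render_volume_opts({'/data/a': '/mnt/a', '/data/b': '/mnt/b'}))
```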


PS7, Line 171: ):
It's interesting that the closing brace is on a separate line. Is this some new style?


PS7, Line 312: ''         
It's a little confusing here: why are there two single quotes followed by a bunch of spaces?



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4678/4/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS4, Line 23: normpath
Perhaps parentheses around these. It took me a little bit to parse this.


    (join as join_path, normpath)


(Probably would have been easier if I were familiar with normpath.)
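
For readers who also have not used normpath: a brief sketch of the parenthesized import and of what normpath does. The path components are made up for the example.

```python
from os.path import (join as join_path, normpath)

# normpath collapses redundant separators and resolves '.' and '..'
# components purely textually; it does not touch the filesystem.
messy = join_path('home', 'dev', '.', 'Impala', '..', 'Impala', 'testdata')
clean = normpath(messy)
print(clean)
```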


PS4, Line 55: os.path.sep + join_path('home', DOCKER_USER_NAME, 'Impala', 'testdata', 'cluster')
This is a bit long, which makes it hard to read. Maybe it could be saved as DEFAULT_TESTDATA_VOLUME_PATH?
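
Something along these lines, as a sketch only. The value of DOCKER_USER_NAME here is an assumption based on the /home/dev/Impala path mentioned in the commit message; the real module may define it differently.

```python
import os
from os.path import join as join_path

# Assumed value, for illustration only.
DOCKER_USER_NAME = 'dev'

# Naming the expression once keeps later call sites short and readable.
DEFAULT_TESTDATA_VOLUME_PATH = os.path.sep + join_path(
    'home', DOCKER_USER_NAME, 'Impala', 'testdata', 'cluster')

print(DEFAULT_TESTDATA_VOLUME_PATH)
```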



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#7).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to mount an external Docker volume as a workaround to KUDU-1419.
This patch introduces a series of environment variables a user may tweak
in order to help with that purpose. The patch assumes a viable,
reasonable Docker container based on a standard Linux distribution like
Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need to copy the testdata off the
container onto the host running Docker Engine. To do that we suggest a
strategy using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 210 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/7

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#6).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 199 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/6

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has submitted this change and it was merged.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to mount an external Docker volume as a workaround to KUDU-1419.
This patch introduces a series of environment variables a user may tweak
in order to help with that purpose. The patch assumes a viable,
reasonable Docker container based on a standard Linux distribution like
Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need to copy the testdata off the
container onto the host running Docker Engine. To do that we suggest a
strategy using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Reviewed-on: http://gerrit.cloudera.org:8080/4678
Reviewed-by: Michael Brown <mi...@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tb...@cloudera.com>
Tested-by: Internal Jenkins
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 209 insertions(+), 60 deletions(-)

Approvals:
  Michael Brown: Looks good to me, but someone else must approve
  Taras Bobrovytsky: Looks good to me, approved
  Internal Jenkins: Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 9
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 8: Code-Review+1

Carry David's +1


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 7: Code-Review+1


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 7
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 2:

Patch set 3 fixes a tiny "regression" introduced in patch set 2.


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Internal Jenkins (Code Review)" <ge...@cloudera.org>.
Internal Jenkins has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 8: Verified+1


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#4).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 192 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/4

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/4678/4/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS4, Line 23: normpath
> Perhaps parentheses around these. It took me a little bit to parse this.
Done


PS4, Line 55: os.path.sep + join_path('home', DOCKER_USER_NAME, 'Impala', 'testdata', 'cluster')
> This is a bit long, and make it hard to read. Maybe it could be saved as DE
Done



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4678/5//COMMIT_MSG
Commit Message:

PS5, Line 10: use
> Would "mount" be an appropriate verb? If so, I think it's more descriptive.
Done


PS5, Line 19: need the
> "need to move the"?
Done, but it's not a "move"; it's a copy.


PS5, Line 21: rsync
> Why rsync versus, say, just cp or mv?
This is over a virtual networking layer, so cp won't work. Again, we're not moving data, we're copying it. rsync is more efficient than scp once you've warmed a volume. It's also convenient to be able to use rsync -a and --chown, and the usage is a lot better than scp's. If you want to do any serious transferring of data, rsync is the better tool.
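
To make the options concrete, a hedged sketch that assembles an rsync argument list in Python. The function name, paths, and owner value are hypothetical and this is not the command used in impala_docker_env.py. It assumes an rsync new enough to support --chown.

```python
def build_rsync_argv(src, dest, priv_key, ssh_port, owner=None):
    # -e selects the remote shell, here ssh with a passwordless key.
    ssh = 'ssh -i {0} -p {1}'.format(priv_key, ssh_port)
    # -a preserves permissions, times, and symlinks; --delete removes files
    # on the destination that no longer exist on the source.
    argv = ['rsync', '-a', '--delete', '-e', ssh]
    if owner:
        # Assumes rsync 3.1.0 or later, where --chown=USER:GROUP exists.
        argv.append('--chown={0}'.format(owner))
    argv.extend([src, dest])
    return argv


print(build_rsync_argv('dev@container:/src/', '/dest/', '/path/to/key', 2222))
```

Because rsync only transfers what changed, a second run against an already populated volume has far less to copy than the first.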


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/README
File tests/comparison/leopard/README:

PS5, Line 17: Basic Configuration
> It's a nit, but it might be nicer if this line, and "External Volume Config
Done


PS5, Line 43: use
> "mount external Docker volumes that contain the necessary testdata"?
Done


PS5, Line 54: path on TARGET_HOST where the
            : external volume will reside
> This is just a directory, right?
Yes, and Leopard is responsible for creating it.


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

Line 298:     if os.environ.get('KUDU_IS_SUPPORTED') == 'true':
> Does it make sense to reference KUDU-1419 here, or at least the README, to
Done
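
For context, a minimal sketch of this kind of environment gate with the explanatory comment attached. The helper name and the optional environ argument are hypothetical; only the KUDU_IS_SUPPORTED check itself appears in the patch.

```python
import os


def kudu_is_supported(environ=None):
    # External volumes are only needed as a workaround for KUDU-1419, so the
    # extra setup is gated on this variable. The comparison is exact: only
    # the lowercase string 'true' enables it.
    environ = os.environ if environ is None else environ
    return environ.get('KUDU_IS_SUPPORTED') == 'true'


print(kudu_is_supported({'KUDU_IS_SUPPORTED': 'true'}))
```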


Line 308:             'rsync -e "ssh -i {priv_key} -o StrictHostKeyChecking=no '
> I'm just curious -- how long does this process take?
I timed it last week and it was about 20 minutes in the cold case; almost instantly in the very hot case.



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#3).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 192 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/3

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "David Knupp (Code Review)" <ge...@cloudera.org>.
David Knupp has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 5:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/4678/5//COMMIT_MSG
Commit Message:

PS5, Line 10: use
Would "mount" be an appropriate verb? If so, I think it's more descriptive.


PS5, Line 19: need the
"need to move the"?


PS5, Line 21: rsync
Why rsync versus, say, just cp or mv?


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/README
File tests/comparison/leopard/README:

PS5, Line 17: Basic Configuration
It's a nit, but it might be nicer if this line, and "External Volume Configuration" below, looked a little more like headers.


PS5, Line 43: use
"mount external Docker volumes that contain the necessary testdata"?


PS5, Line 54: path on TARGET_HOST where the
            : external volume will reside
This is just a directory, right?


http://gerrit.cloudera.org:8080/#/c/4678/5/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

Line 298:     if os.environ.get('KUDU_IS_SUPPORTED') == 'true':
Does it make sense to reference KUDU-1419 here, or at least the README, to explain why the extra work is being done?


Line 308:             'rsync -e "ssh -i {priv_key} -o StrictHostKeyChecking=no '
I'm just curious -- how long does this process take?



Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#5).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need the testdata off the container
onto the host running Docker Engine. To do that we suggest a strategy
using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 198 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/5

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 2:

Patch set 2 makes a few cleanups but doesn't introduce significant functional change.


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Taras Bobrovytsky (Code Review)" <ge...@cloudera.org>.
Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................


Patch Set 8: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4678/7/tests/comparison/leopard/impala_docker_env.py
File tests/comparison/leopard/impala_docker_env.py:

PS7, Line 312: '--delete -
> I'm gonna call this line a Casey-ism: It's so I can align the ssh options u
Sounds good. I think it would be good to pick a standard and move to it gradually. (If flake8 does not like big indents, we should stop making them.)


Gerrit-MessageType: comment
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Hello David Knupp,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4678

to look at the new patch set (#8).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to mount an external Docker volume as a workaround to KUDU-1419.
This patch introduces a series of environment variables a user may tweak
in order to help with that purpose. The patch assumes a viable,
reasonable Docker container based on a standard Linux distribution like
Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need to copy the testdata off the
container onto the host running Docker Engine. To do that we suggest a
strategy using rsync via passwordless SSH key.
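
As a rough illustration of the rsync strategy described above, here is a
minimal Python 3 sketch. It is not the code in impala_docker_env.py. The
function name, its parameters, the default user, and the example host, port,
and paths are all hypothetical; only the use of rsync over SSH with a
passwordless key, and the --delete option quoted in review, come from this
thread.

```python
import shlex


def build_rsync_command(container_host, ssh_port, ssh_key_path,
                        remote_dir, local_dir, ssh_user="dev"):
    """Return an rsync argv list that mirrors remote_dir onto local_dir.

    All names and defaults here are illustrative assumptions, not the
    values used by Leopard.
    """
    # Options for the ssh transport. -i selects the passwordless private key;
    # host key checking is disabled because container host keys are assumed
    # to change whenever a container is recreated.
    ssh_command = " ".join([
        "ssh",
        "-i", shlex.quote(ssh_key_path),
        "-p", str(ssh_port),
        "-o", "StrictHostKeyChecking=no",
        "-o", "UserKnownHostsFile=/dev/null",
    ])
    # Trailing slashes make rsync copy the directory contents rather than
    # nesting the source directory inside the destination.
    source = "{0}@{1}:{2}/".format(
        ssh_user, container_host, remote_dir.rstrip("/"))
    destination = local_dir.rstrip("/") + "/"
    # -a preserves permissions and timestamps; --delete removes files on the
    # host that no longer exist in the container.
    return ["rsync", "-a", "--delete", "-e", ssh_command, source, destination]


# Hypothetical values: a container reachable on localhost port 2222.
command = build_rsync_command(
    "localhost", 2222, "/tmp/leopard_key",
    "/home/dev/Impala/testdata", "/data/impala_testdata")
```

The returned list can be passed to subprocess.check_call(). Because the
container is the source of truth, --delete keeps the host copy from
accumulating files that the container no longer has.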

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 209 insertions(+), 60 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/8

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 8
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>

[Impala-ASF-CR] IMPALA-4188: Leopard: support external Docker volumes

Posted by "Michael Brown (Code Review)" <ge...@cloudera.org>.
Michael Brown has uploaded a new patch set (#2).

Change subject: IMPALA-4188: Leopard: support external Docker volumes
......................................................................

IMPALA-4188: Leopard: support external Docker volumes

To be able to run the Random Query Generator with Impala and Kudu, we
need to use external Docker volumes as a workaround to KUDU-1419. This
patch introduces a series of environment variables a user may tweak in
order to help with that purpose. The patch assumes a viable, reasonable
Docker container based on a standard Linux distribution like Ubuntu 14.

To assist users, I've updated the Leopard README with instructions on
the environment variables' meanings.

The gist here is that the container is the source of truth, which means
to create an external volume, we need to copy the testdata off the
container onto the host running Docker Engine. To do that we suggest a
strategy using rsync via passwordless SSH key.

Testing:
I used a Cloudera Docker container that has Impala in /home/dev/Impala.
Before, Kudu would fail to start due to KUDU-1419. Now, we load testdata
into an external volume, build Impala, run the minicluster including
Kudu, and can access the tpch_kudu data.

I made flake8 fixes as well. flake8 on this file is now clean.

Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
---
M tests/comparison/leopard/README
M tests/comparison/leopard/impala_docker_env.py
2 files changed, 188 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/78/4678/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia7d9d9253fcd7e3905e389ddeb1438cee3e24480
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <mi...@cloudera.com>
Gerrit-Reviewer: David Knupp <dk...@cloudera.com>
Gerrit-Reviewer: Taras Bobrovytsky <tb...@cloudera.com>