Posted to dev@impala.apache.org by "Huaisi Xu (Code Review)" <ge...@cloudera.org> on 2016/02/28 05:58:08 UTC

[Impala-CR](cdh5-2.2.0_5.4.x) This is a combination of 6 commits. Fixing 5.4.x branch jenkins job error. Using virtualenv's impala-python.

Hello Casey Ching, Internal Jenkins,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/2337

to review the following change.

Change subject: This is a combination of 6 commits. Fixing 5.4.x branch jenkins job error. Using virtualenv's impala-python.
......................................................................

This is a combination of 6 commits. Fixing 5.4.x branch jenkins job error.
Using virtualenv's impala-python.

Python: Bootstrap a virtualenv and add impala-python command

This adds a bootstrap script and an "impala-python" command to
$IMPALA_HOME/bin that automatically runs the bootstrap and redirects to
the virtualenv python. Existing python scripts will later be updated to
use the new "impala-python" command.

The bootstrap script will build a virtualenv to ensure a minimum python
version (2.6) and a well known set of dependencies. The bootstrap script
can be run with python 2.4 but 2.6 must already be installed on the
system. The resulting virtualenv will use 2.6 at a minimum.

Only dependencies explicitly listed in requirements.txt will be
installed and available (no system packages will ever be used). No
packages will ever be downloaded when setting up the virtualenv. In the
future new dependencies can be added by editing the requirements.txt
file. Installation through requirements.txt is a standard pip feature.
When requirements.txt is updated, the next run of "impala-python" will
rebuild the virtualenv.

Change-Id: I150595d7e09a45d5f2e3c30a845bc8d6a761eeed
Reviewed-on: http://gerrit.cloudera.org:8080/424
Reviewed-by: Casey Ching <ca...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit 6a3af6747e420c7bb64dae613921222f5b32d7f5)
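
The rebuild-on-change behavior described in the bootstrap commit above could
be sketched roughly as follows. This is not the code from
bootstrap_virtualenv.py. The stamp file and the function names (file_digest,
needs_rebuild, record_build, offline_pip_args) are hypothetical, and the
sketch assumes the check compares a digest of requirements.txt against one
saved after the last successful build. The pip options --no-index and
--find-links are standard pip flags that restrict installation to a local
directory of archives.

```python
import hashlib
import os


def file_digest(path):
    """Return the SHA-1 hex digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.sha1(f.read()).hexdigest()


def needs_rebuild(requirements_path, stamp_path):
    """Return True if no build was recorded or requirements.txt changed since.

    stamp_path is a hypothetical file holding the digest of the
    requirements.txt that was used for the last successful build.
    """
    if not os.path.exists(stamp_path):
        return True
    with open(stamp_path) as f:
        recorded = f.read().strip()
    return recorded != file_digest(requirements_path)


def record_build(requirements_path, stamp_path):
    """Save the current requirements.txt digest after a successful build."""
    with open(stamp_path, "w") as f:
        f.write(file_digest(requirements_path))


def offline_pip_args(deps_dir, requirements_path):
    """Build pip arguments that install only from local archives.

    --no-index stops pip from contacting a package index and
    --find-links points it at a directory of downloaded archives.
    """
    return ["install", "--no-index", "--find-links", deps_dir,
            "-r", requirements_path]
```

A wrapper in the spirit of "impala-python" would call needs_rebuild() on each
run, rebuild the virtualenv and call record_build() only when it returns
True, and then hand control to the virtualenv's python.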

Use "impala-python" (virtualenv) instead of system python

Python tests and infra scripts will now use "python" from the virtualenv
via $IMPALA_HOME/bin/impala-python. Some scripts could be simplified now
that python 2.6 and a dependable set of third-party libraries are
available but that is not done as part of this commit.

Change-Id: If1cf96898d6350e78ea107b9026b12ba63a4162f
Reviewed-on: http://gerrit.cloudera.org:8080/603
Reviewed-by: Taras Bobrovytsky <tb...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit d8de07c01e39b2aae4cdbe81b40a9ed1fcf5dc36)

Python: Add thrift_sasl to virtualenv

Previously thrift_sasl was brought into the virtualenv by building
the shell. That meant the shell had to be built before the
virtualenv could be used. By including thrift_sasl directly, the
virtualenv can be used even if impala/shell is not built.

Change-Id: Id1a099036b1ac8add5a314af981789ebf69ce465
Reviewed-on: http://gerrit.cloudera.org:8080/685
Reviewed-by: Casey Ching <ca...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit 060561d061cb98d01089cb7979e3132f60ca3121)

Simplify shell cancellation tests

The tests were doing unnecessary things. One such thing that stopped
working with the virtualenv patch was searching for the shell process to
get the pid. The search was never needed since the process was spawned
with Popen which provides the pid directly.

Change-Id: I2455e58de4fdba8fd2770f0489fac8cddf6b90a0
Reviewed-on: http://gerrit.cloudera.org:8080/555
Reviewed-by: Casey Ching <ca...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit ae283668451a85ad352289fc2d8f41af81829994)
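
The point about Popen can be illustrated with a short sketch. It is not taken
from the shell tests. The helper names spawn_child and interrupt are
hypothetical, and the child command in the usage note stands in for the
impala shell. The sketch assumes a POSIX system, since it delivers SIGINT
with os.kill.

```python
import os
import signal
import subprocess
import sys


def spawn_child(cmd):
    """Start a child process; the returned Popen object carries its pid."""
    return subprocess.Popen(cmd)


def interrupt(proc):
    """Send SIGINT to the child using the pid from Popen, then wait for exit.

    No process-table search is needed: proc.pid is the child's pid.
    """
    os.kill(proc.pid, signal.SIGINT)
    proc.wait()
    return proc.returncode
```

For example, proc = spawn_child([sys.executable, "-c", "import time;
time.sleep(30)"]) followed by interrupt(proc) stops the child without ever
looking it up by name.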

Remove hashbang from non-script python files

Many python files had a hashbang and the executable bit set though
they were not intended to be run as standalone scripts. That makes
determining which python files are actually scripts very difficult.
A future patch will update the hashbang in real python scripts so they
use $IMPALA_HOME/bin/impala-python.

Change-Id: I04eafdc73201feefe65b85817a00474e182ec2ba
Reviewed-on: http://gerrit.cloudera.org:8080/599
Reviewed-by: Casey Ching <ca...@cloudera.com>
Reviewed-by: Taras Bobrovytsky <tb...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit 7bc4dc0fcfc310cdad78b74a10c5154a4e09d07c)
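
One way to list the files this commit is concerned with, python files that
both start with a hashbang and have an executable bit set, is sketched below.
This is an illustration only, not the procedure used to prepare the patch,
and the function names are hypothetical.

```python
import os
import stat


def has_hashbang(path):
    """Return True if the file starts with the two bytes '#!'."""
    with open(path, "rb") as f:
        return f.read(2) == b"#!"


def is_executable(path):
    """Return True if any of the user/group/other execute bits is set."""
    mode = os.stat(path).st_mode
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def find_script_candidates(root):
    """Return .py files under root that have a hashbang and are executable."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            if has_hashbang(path) and is_executable(path):
                matches.append(path)
    return matches
```

Each path returned still needs a manual check to decide whether it is a real
script or a module that picked up the hashbang and executable bit by
accident.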

IMPALA-2187: Run py.test through impala python env.

The symptom of this bug was that we were seeing "ValueError: bad marshal data"
when trying to import from tests.hs2.test_hs2 during custom cluster tests.

The problem was that we were not running the custom cluster tests through the
new Impala Python virtualenv.

Some tests (properly running with the virtualenv) that run before the custom
cluster tests had caused the generation of pyc files for tests.hs2.test_hs2.
Those pyc files then appeared corrupted when executing the custom cluster
tests because the default python env is running a different version than the
virtualenv those pyc files were generated from in earlier tests.

Change-Id: Ie9d8f90c65921247dd885804165f9b7271ea807b
Reviewed-on: http://gerrit.cloudera.org:8080/618
Reviewed-by: Casey Ching <ca...@cloudera.com>
Tested-by: Internal Jenkins
(cherry picked from commit 0e709a9c05662bd06e2602929881dc6522c584c3)
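
The explanation above turns on pyc files written by one python version being
read by another. A possible diagnostic, sketched here for Python 3, compares
the 4-byte magic number at the start of a pyc file with that of the running
interpreter. It is not part of this change, whose fix is to run py.test
through the virtualenv, and the function names are hypothetical. The branch
itself targets python 2.6, where imp.get_magic() plays the role of
importlib.util.MAGIC_NUMBER.

```python
import importlib.util


def pyc_magic(pyc_path):
    """Return the 4-byte magic number stored at the start of a pyc file."""
    with open(pyc_path, "rb") as f:
        return f.read(4)


def pyc_matches_interpreter(pyc_path):
    """Return True if the pyc was written by a compatible interpreter.

    The magic number changes between python versions, so a mismatch
    means the file came from a different python than the one running.
    """
    return pyc_magic(pyc_path) == importlib.util.MAGIC_NUMBER
```

Running such a check under both the system python and the virtualenv python
would show which of the two produced a given pyc file.
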
---
M be/src/codegen/gen_ir_descriptions.py
M bin/gen_build_version.py
A bin/impala-ipython
A bin/impala-py.test
A bin/impala-python
M bin/impala-shell.sh
M bin/load-data.py
M bin/make_impala.sh
M bin/run-workload.py
M bin/start-impala-cluster.py
M common/function-registry/impala_functions.py
A infra/python/.gitignore
A infra/python/README
A infra/python/bootstrap_virtualenv.py
A infra/python/deps/AllPairs-2.0.1.tar.gz
A infra/python/deps/Fabric-1.10.2-py2-none-any.whl
A infra/python/deps/PyHive-0.1.5.tar.gz
A infra/python/deps/apipkg-1.4-py2.py3-none-any.whl
A infra/python/deps/cm_api-10.0.0.tar.gz
A infra/python/deps/download_requirements
A infra/python/deps/ecdsa-0.13-py2.py3-none-any.whl
A infra/python/deps/execnet-1.3.0-py2.py3-none-any.whl
A infra/python/deps/execnet-1.4.0-py2.py3-none-any.whl
A infra/python/deps/impyla-0.11.0.dev0.tar.gz
A infra/python/deps/ipython-1.2.1.tar.gz
A infra/python/deps/paramiko-1.15.2-py2.py3-none-any.whl
A infra/python/deps/pexpect-3.3.tar.gz
A infra/python/deps/pg8000-1.10.2-py2.py3-none-any.whl
A infra/python/deps/prettytable-0.7.2.tar.bz2
A infra/python/deps/psutil-0.7.1.tar.gz
A infra/python/deps/py-1.4.30-py2.py3-none-any.whl
A infra/python/deps/pycrypto-2.6.1.tar.gz
A infra/python/deps/pytest-2.7.2-py2.py3-none-any.whl
A infra/python/deps/pytest-xdist-1.12.tar.gz
A infra/python/deps/pywebhdfs-0.3.2.tar.gz
A infra/python/deps/requests-2.7.0-py2.py3-none-any.whl
A infra/python/deps/requirements.txt
A infra/python/deps/sh-1.11.tar.gz
A infra/python/deps/six-1.9.0-py2.py3-none-any.whl
A infra/python/deps/sqlparse-0.1.15.tar.gz
A infra/python/deps/texttable-0.8.3.tar.gz
A infra/python/deps/thrift-0.9.0.tar.gz
A infra/python/deps/thrift_sasl-0.1.0.tar.gz
A infra/python/deps/virtualenv-13.1.0.tar.gz
M testdata/bin/cache_tables.py
M testdata/bin/compute-table-stats.sh
M testdata/bin/create-load-data.sh
M testdata/bin/generate-schema-statements.py
M testdata/bin/generate-test-vectors.py
M testdata/bin/run-hbase.sh
M testdata/bin/run-hive-server.sh
M testdata/bin/wait-for-hbase-master.py
M testdata/bin/wait-for-hiveserver2.py
M testdata/bin/wait-for-metastore.py
M testdata/common/text_delims_table.py
M testdata/common/widetable.py
M tests/authorization/test_authorization.py
M tests/authorization/test_grant_revoke.py
M tests/beeswax/impala_beeswax.py
M tests/benchmark/perf_result_datastore.py
M tests/benchmark/plugins/__init__.py
M tests/benchmark/plugins/clear_buffer_cache.py
M tests/benchmark/plugins/vtune_plugin.py
M tests/benchmark/report-benchmark-results.py
M tests/catalog_service/test_catalog_service_client.py
M tests/catalog_service/test_hms_failure.py
M tests/catalog_service/test_large_num_partitions.py
M tests/common/base_test_suite.py
M tests/common/custom_cluster_test_suite.py
M tests/common/failure_injector.py
M tests/common/impala_cluster.py
M tests/common/impala_cluster_cm.py
M tests/common/impala_connection.py
M tests/common/impala_service.py
M tests/common/impala_test_suite.py
M tests/common/query.py
M tests/common/query_executor.py
M tests/common/scheduler.py
M tests/common/skip.py
M tests/common/test_dimensions.py
M tests/common/test_result_verifier.py
M tests/common/test_vector.py
M tests/common/workload.py
M tests/common/workload_runner.py
M tests/comparison/data_generator.py
M tests/comparison/discrepancy_searcher.py
M tests/conftest.py
M tests/custom_cluster/test_admission_controller.py
M tests/custom_cluster/test_delegation.py
M tests/custom_cluster/test_insert_behaviour.py
M tests/custom_cluster/test_query_expiration.py
M tests/custom_cluster/test_scratch_disk.py
M tests/custom_cluster/test_session_expiration.py
M tests/custom_cluster/test_spillling.py
M tests/data_errors/test_data_errors.py
M tests/experiments/test_process_failures.py
M tests/experiments/test_targeted_perf.py
M tests/failure/test_failpoints.py
M tests/hs2/hs2_test_suite.py
M tests/hs2/test_fetch.py
M tests/hs2/test_fetch_first.py
M tests/hs2/test_hs2.py
M tests/metadata/test_col_stats.py
M tests/metadata/test_compute_stats.py
M tests/metadata/test_ddl.py
M tests/metadata/test_explain.py
M tests/metadata/test_hbase_metadata.py
M tests/metadata/test_hdfs_encryption.py
M tests/metadata/test_hdfs_permissions.py
M tests/metadata/test_last_ddl_time_update.py
M tests/metadata/test_load.py
M tests/metadata/test_metadata_query_statements.py
M tests/metadata/test_partition_metadata.py
M tests/metadata/test_set.py
M tests/metadata/test_show_create_table.py
M tests/metadata/test_views_compatibility.py
M tests/query_test/test_aggregation.py
M tests/query_test/test_analytic_tpcds.py
M tests/query_test/test_avro_schema_resolution.py
M tests/query_test/test_cancellation.py
M tests/query_test/test_chars.py
M tests/query_test/test_compressed_formats.py
M tests/query_test/test_decimal_casting.py
M tests/query_test/test_decimal_queries.py
M tests/query_test/test_delimited_text.py
M tests/query_test/test_expr_limits.py
M tests/query_test/test_hbase_queries.py
M tests/query_test/test_hdfs_caching.py
M tests/query_test/test_insert.py
M tests/query_test/test_insert_behaviour.py
M tests/query_test/test_insert_parquet.py
M tests/query_test/test_insert_permutation.py
M tests/query_test/test_join_queries.py
M tests/query_test/test_limit.py
M tests/query_test/test_local_fs.py
M tests/query_test/test_mem_usage_scaling.py
M tests/query_test/test_multiple_filesystems.py
M tests/query_test/test_partitioning.py
M tests/query_test/test_queries.py
M tests/query_test/test_query_mem_limit.py
M tests/query_test/test_rows_availability.py
M tests/query_test/test_scanners.py
M tests/query_test/test_sort.py
M tests/query_test/test_timezones.py
M tests/query_test/test_tpcds_queries.py
M tests/query_test/test_tpch_queries.py
M tests/query_test/test_udfs.py
M tests/run-custom-cluster-tests.sh
M tests/run-process-failure-tests.sh
M tests/run-tests.py
M tests/shell/impala_shell_results.py
M tests/shell/test_shell_commandline.py
M tests/shell/test_shell_interactive.py
M tests/statestore/test_statestore.py
M tests/stress/test_ddl_stress.py
M tests/stress/test_mini_stress.py
M tests/unittests/test_file_parser.py
M tests/unittests/test_result_verifier.py
M tests/util/calculation_util.py
M tests/util/cluster_controller.py
M tests/util/compute_table_stats.py
M tests/util/filesystem_utils.py
M tests/util/hdfs_util.py
M tests/util/plugin_runner.py
M tests/util/shell_util.py
M tests/util/thrift_util.py
M tests/verifiers/metric_verifier.py
M tests/verifiers/test_verify_metrics.py
168 files changed, 290 insertions(+), 178 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala refs/changes/37/2337/1
-- 
To view, visit http://gerrit.cloudera.org:8080/2337
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ie9d8f90c65921247dd885804165f9b7271ea807b
Gerrit-PatchSet: 1
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.2.0_5.4.x
Gerrit-Owner: Huaisi Xu <hx...@cloudera.com>
Gerrit-Reviewer: Casey Ching <ca...@cloudera.com>
Gerrit-Reviewer: Internal Jenkins