Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/08/09 12:46:00 UTC
[jira] [Created] (IMPALA-10848) Provide compile-only option to skip
downloading test dependencies
Quanlong Huang created IMPALA-10848:
---------------------------------------
Summary: Provide compile-only option to skip downloading test dependencies
Key: IMPALA-10848
URL: https://issues.apache.org/jira/browse/IMPALA-10848
Project: IMPALA
Issue Type: Improvement
Components: Infrastructure
Reporter: Quanlong Huang
Attachments: pywebhdfs_failure.png
Compiling Impala is not easy for a beginner. A portion of the failures occur while downloading or installing dependencies.
For instance, old versions of Impala may fail to compile because the cdh components of old GBNs have been removed from S3. However, the cdh component artifacts are only used in testing (minicluster & holding testdata), so we can still compile without them.
Take pip dependencies as another example. Here is a failure I got from a community user. It failed while installing pywebhdfs:
!pywebhdfs_failure.png!
However, a simple git grep shows that pywebhdfs is only used in tests:
{code:bash}
$ git grep pywebhdfs
bin/bootstrap_system.sh:# >>> from pywebhdfs.webhdfs import PyWebHdfsClient
infra/python/deps/requirements.txt:pywebhdfs == 0.3.2
tests/common/impala_test_suite.py: # HDFS: uses a mixture of pywebhdfs (which is faster than the HDFS CLI) and the
tests/util/hdfs_util.py:from pywebhdfs.webhdfs import PyWebHdfsClient, errors, _raise_pywebhdfs_exception
tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
tests/util/hdfs_util.py: _raise_pywebhdfs_exception(response.status_code, response.text)
{code}
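The grep output above can be summarized by top-level directory to make the "tests only" claim easier to check. The following is a minimal sketch; the helper name top_level_dirs is hypothetical and is not part of the Impala repository. It assumes the usual "path:match" output format of git grep and paths that contain no colons. The sample lines fed to it are copied from the output above.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Impala): reads `git grep` output
# ("path:matched text") on stdin and prints each distinct top-level
# directory that contains a match.
top_level_dirs() {
  cut -d: -f1 | cut -d/ -f1 | sort -u
}

# Demo on a few lines taken from the grep output in this issue.
# Inside a checkout one would run: git grep pywebhdfs | top_level_dirs
top_level_dirs <<'EOF'
bin/bootstrap_system.sh:# >>> from pywebhdfs.webhdfs import PyWebHdfsClient
infra/python/deps/requirements.txt:pywebhdfs == 0.3.2
tests/util/hdfs_util.py:from pywebhdfs.webhdfs import PyWebHdfsClient, errors, _raise_pywebhdfs_exception
EOF
```

For these lines the helper prints bin, infra and tests. The bin hit is a comment line and the infra hit is the requirements pin itself, so the only code that imports the package lives under tests.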
If the user just wants to compile Impala and deploy it in their existing Hadoop cluster, dealing with these failures is a waste of their time.
*Target for this JIRA*
* Provide a compile-only option in bin/bootstrap_system.sh. It should skip downloading/installing dependencies that compilation does not use, like postgresql.
* Provide a compile-only option in buildall.sh. It should skip downloading cdh/cdp components that compilation does not use.
* Update our [wiki|https://cwiki.apache.org/confluence/display/IMPALA/Building+Impala] about this.
Note that we already have some env vars to control the download behaviors, e.g. SKIP_PYTHON_DOWNLOAD, SKIP_TOOLCHAIN_BOOTSTRAP. We just need to make the compile-only scenario work with minimal requirements and document it.
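One possible shape for such an option is a single switch that gates every test-only setup step. The sketch below is illustrative only: the COMPILE_ONLY variable, both function names and the step descriptions are assumptions made for this example, not existing Impala options or the actual contents of bin/bootstrap_system.sh or buildall.sh. It only prints the plan and installs nothing.

```bash
#!/usr/bin/env bash
# Sketch only. COMPILE_ONLY, compile_only_enabled and plan_setup_steps
# are hypothetical names; they do not exist in Impala today.

# Succeeds (exit status 0) when test-only setup should be skipped.
compile_only_enabled() {
  [[ "${COMPILE_ONLY:-false}" == "true" ]]
}

# Prints the setup steps that would run, one per line.
# The step names are placeholders for the real work.
plan_setup_steps() {
  echo "install packages needed for compilation"
  if ! compile_only_enabled; then
    echo "install postgresql"
    echo "download cdh/cdp components"
    echo "install test-only pip dependencies such as pywebhdfs"
  fi
}

plan_setup_steps
```

With COMPILE_ONLY=true exported, only the first step is printed. With it unset, all four are. A real implementation might set the existing SKIP_* variables from the same switch, but that wiring is left open here.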