Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/08/09 12:46:00 UTC

[jira] [Created] (IMPALA-10848) Provide compile-only option to skip downloading test dependencies

Quanlong Huang created IMPALA-10848:
---------------------------------------

             Summary: Provide compile-only option to skip downloading test dependencies
                 Key: IMPALA-10848
                 URL: https://issues.apache.org/jira/browse/IMPALA-10848
             Project: IMPALA
          Issue Type: Improvement
          Components: Infrastructure
            Reporter: Quanlong Huang
         Attachments: pywebhdfs_failure.png

Compiling Impala is not easy for a beginner. Some of the failures occur while downloading or installing dependencies.

For instance, old versions of Impala may fail to compile because the CDH components of old GBNs have been removed from S3. However, the CDH component artifacts are only used in testing (minicluster and holding testdata), so we can still compile without them.

Take pip dependencies as another example. Here is a failure I got from a community user. It failed while installing pywebhdfs:

!pywebhdfs_failure.png!

However, a simple git grep shows that pywebhdfs is only used in tests:
{code:bash}
$ git grep pywebhdfs
bin/bootstrap_system.sh:#  >>> from pywebhdfs.webhdfs import PyWebHdfsClient
infra/python/deps/requirements.txt:pywebhdfs == 0.3.2
tests/common/impala_test_suite.py:    #     HDFS: uses a mixture of pywebhdfs (which is faster than the HDFS CLI) and the
tests/util/hdfs_util.py:from pywebhdfs.webhdfs import PyWebHdfsClient, errors, _raise_pywebhdfs_exception
tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
tests/util/hdfs_util.py:      _raise_pywebhdfs_exception(response.status_code, response.text)
{code}
If the user just wants to compile Impala and deploy it in their existing Hadoop cluster, dealing with these failures is a waste of their time.
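
The git grep check above could be narrowed to the files that still need a manual look. The following is only a minimal sketch: non_test_paths is a hypothetical helper name, not something in the Impala repository, and it assumes that test-only code lives under tests/.

```bash
# Hypothetical helper (not part of Impala): read file paths on stdin and
# print only those that are NOT under tests/.
non_test_paths() {
  # grep -v exits with status 1 when nothing is left to print;
  # "|| true" keeps that case from looking like a failure.
  grep -v '^tests/' || true
}

# Usage, from the root of an Impala checkout (git grep -l lists file names only):
#   git grep -l pywebhdfs | non_test_paths
```

For the file list shown above, this would leave bin/bootstrap_system.sh (where the match is a comment) and infra/python/deps/requirements.txt (the version pin) for manual inspection.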

*Target for this JIRA*
 * Provide a compile-only option for bin/bootstrap_system.sh. It should skip downloading/installing dependencies that compilation does not use, like postgresql.
 * Provide a compile-only option for buildall.sh. It should skip downloading cdh/cdp components that are not used in compilation.
 * Update our [wiki|https://cwiki.apache.org/confluence/display/IMPALA/Building+Impala] about this.

Note that we already have some env vars to control the download behaviors, e.g. SKIP_PYTHON_DOWNLOAD, SKIP_TOOLCHAIN_BOOTSTRAP. We just need to make the compile-only scenario work with minimal requirements and document it.
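
One possible shape for such an option, as a rough sketch only: COMPILE_ONLY and run_unless_compile_only are hypothetical names made up for illustration, not existing Impala variables or functions, and the install step is a placeholder echo rather than a real command from bin/bootstrap_system.sh.

```bash
# Hypothetical variable (an assumption, not an existing Impala option).
# Defaults to false so existing behavior stays unchanged.
: "${COMPILE_ONLY:=false}"

# Run a command unless COMPILE_ONLY=true, in which case only report the skip.
# $1 is a human-readable description; the remaining arguments are the command.
run_unless_compile_only() {
  description="$1"
  shift
  if [ "${COMPILE_ONLY}" = "true" ]; then
    echo "Skipping (compile-only): ${description}"
    return 0
  fi
  "$@"
}

# Placeholder for a test-only step such as installing postgresql.
run_unless_compile_only "install postgresql" echo "would install postgresql here"
```

Wrapping test-only steps this way would keep the default path identical to today's scripts and make every skipped step visible in the log.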


