Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/08/18 14:34:00 UTC

[jira] [Commented] (IMPALA-10429) Add Support for specifying HDFS path in 'scratch_dirs' startup option

    [ https://issues.apache.org/jira/browse/IMPALA-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401099#comment-17401099 ] 

ASF subversion and git services commented on IMPALA-10429:
----------------------------------------------------------

Commit 1a3ff11d82c92cda1296a7cc5062bd035553dff2 in impala's branch refs/heads/master from baggio000
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=1a3ff11 ]

IMPALA-10429 Add support for specifying HDFS path in 'scratch_dirs' startup option

We already support HDFS scratch space, but only as a test-only
feature with a fixed default HDFS path.

This patch extends the HDFS scratch space to accept a
user-specified path. To support this, we add a new format for
the HDFS scratch space path. The format requires the HDFS path
to include a port number, which resolves the conflict with the
existing scratch space path format, where colons separate the
path from the optional size limit and priority.

For example, the existing format for an S3 scratch space path is
s3a://bucketpath:#bytes:#priority. In this case, the bucketpath
has no port number.

The new format for an HDFS scratch path is
hdfs://ipaddr:#port:#bytes:#priority. The port number is required,
so there must be at least one colon in the HDFS path. The bytes
and priority are optional, as before. For other scratch spaces,
the path format doesn't change.
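
The two formats above can be illustrated with a minimal Python sketch.
It is not the Impala implementation (that lives in the C++ TmpFileMgr
code); the names ScratchDirSpec and parse_scratch_dir are hypothetical,
the size limit and priority are kept as raw text rather than
interpreted, and the handling of empty fields and of a directory after
the port are assumptions made for the example.

```python
from typing import NamedTuple, Optional


class ScratchDirSpec(NamedTuple):
    path: str                   # scheme, host and (for HDFS) port
    bytes_limit: Optional[str]  # raw text of the optional size limit
    priority: Optional[str]     # raw text of the optional priority


def parse_scratch_dir(spec: str) -> ScratchDirSpec:
    """Split one scratch_dirs entry into path, bytes limit and priority."""
    scheme, sep, rest = spec.partition("://")
    if not sep:
        # No scheme: treat the whole entry as a local path.
        scheme, rest = "", spec
    fields = rest.split(":")

    if scheme == "hdfs":
        # The port is mandatory, so the first colon always belongs to the path.
        # Assumption: a directory may follow the port, e.g. host:8020/tmp.
        if len(fields) < 2 or not fields[1].split("/", 1)[0].isdigit():
            raise ValueError(f"HDFS scratch path needs a port number: {spec!r}")
        path_fields = 2
    else:
        path_fields = 1

    options = fields[path_fields:]
    if len(options) > 2:
        raise ValueError(f"too many ':'-separated fields: {spec!r}")
    # Assumption: an empty field means "not specified".
    options = [opt or None for opt in options] + [None, None]

    path = scheme + sep + ":".join(fields[:path_fields])
    return ScratchDirSpec(path, options[0], options[1])


print(parse_scratch_dir("s3a://bucketpath:1024:1"))
print(parse_scratch_dir("hdfs://10.0.0.1:8020:1024:1"))
print(parse_scratch_dir("hdfs://10.0.0.1:8020"))
```

Because the port is mandatory, an entry such as hdfs://xxx:1:10000
has a single reading in this sketch: 1 is the port and 10000 is the
bytes limit.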

Also, the allow_spill_to_hdfs option is removed because spilling
to HDFS is no longer a test-only feature. The affected e2e tests
are updated accordingly.

Tests:
Added and passed TmpFileMgrTest::TestDirectoryLimitParsingRemotePath.
Ran the Core tests.

Change-Id: I0882ed1e80b02724dd5cb3cdb1fa7b6c2debcbf4
Reviewed-on: http://gerrit.cloudera.org:8080/17720
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add Support for specifying HDFS path in 'scratch_dirs' startup option
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-10429
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10429
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Yida Wu
>            Assignee: Yida Wu
>            Priority: Major
>
> We support HDFS scratch space as a test-only feature, but only with a fixed default HDFS path, regardless of the HDFS path given as input. The reason is that the port number in an HDFS path conflicts with the way we parse the priority and size limit, for example in hdfs://xxx:1:10000. A general parsing logic for HDFS paths needs some extra effort.
> This JIRA tracks the spill-to-HDFS limitation in case we need to support a flexible HDFS scratch space input.


