Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/10/08 00:51:00 UTC

[jira] [Commented] (IMPALA-10429) Add Support for specifying HDFS path in 'scratch_dirs' startup option

    [ https://issues.apache.org/jira/browse/IMPALA-10429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17425886#comment-17425886 ] 

ASF subversion and git services commented on IMPALA-10429:
----------------------------------------------------------

Commit 45fd3320ad4f68ca86998dff0c9504aa896a278a in impala's branch refs/heads/master from Yida Wu
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=45fd332 ]

IMPALA-10945: Fix S3 scratch path behavior

IMPALA-10429 "Support Spill to HDFS" changed the behavior of the S3
scratch path. It added verification of the S3 path, but
HdfsFsCache::GetNameNodeFromPath, which the verification calls,
requires at least one directory after the authority, as in
"s3a://authority/dir". Otherwise it returns an error and the
TmpFileMgr initialization fails. This breaks the previous behavior,
which accepted "s3a://authority", and may affect current users.

This patch restores the previous behavior of the S3 scratch path so
that "s3a://authority" is allowed again. The fix passes the user's
input path combined with the scratch suffix "/impala-scratch" to the
verification function, so the verified path always contains at least
one directory.
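
The effect of appending the suffix before verification can be sketched
as follows. This is an illustrative Python sketch, not Impala's C++
code: has_dir_after_authority is a hypothetical stand-in for the strict
check described above, and verify_scratch_path is a hypothetical name
for the patched call site. Only the "/impala-scratch" suffix comes from
the commit message.

```python
from urllib.parse import urlsplit

# Suffix named in the commit message.
SCRATCH_SUFFIX = "/impala-scratch"


def has_dir_after_authority(path: str) -> bool:
    """Hypothetical stand-in for the strict check: the path needs a
    scheme, an authority, and at least one directory after it."""
    parts = urlsplit(path)
    return bool(parts.scheme) and bool(parts.netloc) and parts.path.strip("/") != ""


def verify_scratch_path(user_path: str) -> bool:
    """Hypothetical patched call site: verify the user's path with the
    scratch suffix appended, so a bare authority still passes."""
    return has_dir_after_authority(user_path.rstrip("/") + SCRATCH_SUFFIX)


if __name__ == "__main__":
    for p in ("s3a://authority", "s3a://authority/dir"):
        print(p, has_dir_after_authority(p), verify_scratch_path(p))
```

With the strict check alone, "s3a://authority" is rejected; with the
suffix appended first, both path formats pass.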

Tests:
Ran core tests.
Added logic to TmpFileMgrTest to run both path formats:
"s3a://authority" and "s3a://authority/dir".

Change-Id: I028f375b9f535f8641261cc02f921497e076aa9b
Reviewed-on: http://gerrit.cloudera.org:8080/17901
Reviewed-by: Abhishek Rawat <ar...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add Support for specifying HDFS path in 'scratch_dirs' startup option
> ---------------------------------------------------------------------
>
>                 Key: IMPALA-10429
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10429
>             Project: IMPALA
>          Issue Type: Improvement
>            Reporter: Yida Wu
>            Assignee: Yida Wu
>            Priority: Major
>             Fix For: Impala 4.1.0
>
>
> We support HDFS scratch space as a test-only feature, but only with a fixed default HDFS path, regardless of the HDFS path given as input. The reason is that the port number in an HDFS path conflicts with the way we parse the priority and size limit, as in hdfs://xxx:1:10000, so a general parsing logic for HDFS paths needs some extra effort.
> This JIRA tracks this spill-to-HDFS limitation in case we need to support a flexible HDFS scratch space input.
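
The parsing conflict described above can be illustrated with a short
Python sketch. It is not Impala's parser: both function names are
hypothetical, and the rule used by split_scratch_entry (options are
only read after the first "/" that follows the authority) is one
possible convention assumed here for illustration, not necessarily the
one Impala adopted.

```python
def naive_parse(entry: str):
    """Hypothetical naive parser: split on every colon and treat the
    first field as the path, the rest as options."""
    fields = entry.split(":")
    return fields[0], fields[1:]


def split_scratch_entry(entry: str):
    """Hypothetical URI-aware parser. Assumed convention: for a URI,
    colon-separated options are only read from the part after the
    first "/" following the authority, so a port is never mistaken
    for an option."""
    scheme, sep, rest = entry.partition("://")
    if not sep:
        # Local path: every colon separates an option.
        path, *opts = entry.split(":")
        return path, opts
    authority, slash, tail = rest.partition("/")
    if not slash:
        if ":" in authority:
            # e.g. hdfs://xxx:1:10000 -- port or option? Cannot tell.
            raise ValueError(f"ambiguous scratch entry: {entry}")
        return entry, []
    path_part, *opts = tail.split(":")
    return f"{scheme}://{authority}/{path_part}", opts


if __name__ == "__main__":
    print(naive_parse("hdfs://xxx:1:10000"))
    print(split_scratch_entry("hdfs://xxx:8020/tmp:1:10000"))
```

The naive split breaks the URI apart at the scheme and cannot tell a
port from an option, which is the limitation this issue tracks.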



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
