Posted to issues-all@impala.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/08/15 06:22:00 UTC

[jira] [Commented] (IMPALA-7408) Add a flag to selectively disable fs operations used by catalogd

    [ https://issues.apache.org/jira/browse/IMPALA-7408?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16580756#comment-16580756 ] 

ASF subversion and git services commented on IMPALA-7408:
---------------------------------------------------------

Commit c692e5cc9ec2ab2d626c2510d300a03c27790a9b in impala's branch refs/heads/master from [~vercego]
[ https://git-wip-us.apache.org/repos/asf?p=impala.git;h=c692e5c ]

IMPALA-7408: add a debugging flag to disable reading fs data from catalogd

Add the flag --disable_catalog_data_ops_debug_only, which makes catalogd
skip loading files from the file-system. The flag defaults to false
and is hidden. Its intent is to avoid time-consuming accesses to
the file-system when debugging metadata issues while the file-system
contents are not available. For example, a recent ~18 GB catalog
takes 10 hours to load without the flag set vs. 1 hour to load with
the flag. The extra time comes from accessing the file-system, failing,
and logging exceptions.

Specifically, this flag disables copying jars from the fs when loading
Java functions, and it skips loading Avro schema files. Additional cases
can be added under this flag if more are needed.
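
The gating pattern described above can be sketched as follows. This is an
illustrative Python sketch only, not Impala's actual (Java) catalogd code:
the class CatalogDataLoader, its methods, and the callables passed in are
hypothetical names invented for this example. Only the flag's intent
(default false, skip jar copies and Avro schema reads when set) comes from
the commit message.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class CatalogDataLoader:
    """Hypothetical loader that gates data-dependent fs operations on one flag."""

    # Mirrors the intent of --disable_catalog_data_ops_debug_only (default false).
    disable_data_ops: bool = False
    # Records which operations were skipped, so the behavior can be checked.
    skipped: List[str] = field(default_factory=list)

    def _guarded(self, description: str, op: Callable[[], str]) -> Optional[str]:
        if self.disable_data_ops:
            # Skip the fs access entirely instead of trying, failing,
            # and logging an exception.
            self.skipped.append(description)
            return None
        return op()

    def load_function_jar(
        self, jar_uri: str, copy_jar: Callable[[str], str]
    ) -> Optional[str]:
        # copy_jar stands in for copying a Java function's jar from the fs.
        return self._guarded(f"jar:{jar_uri}", lambda: copy_jar(jar_uri))

    def load_avro_schema(
        self, schema_uri: str, read_schema: Callable[[str], str]
    ) -> Optional[str]:
        # read_schema stands in for reading an Avro schema file from the fs.
        return self._guarded(f"avro:{schema_uri}", lambda: read_schema(schema_uri))


def unavailable_fs(uri: str) -> str:
    # Simulates a cluster where only metadata is present, so data reads fail.
    raise IOError(f"data blocks not available for {uri}")


# With the flag set, neither operation touches the (unavailable) file-system.
debug_loader = CatalogDataLoader(disable_data_ops=True)
jar_result = debug_loader.load_function_jar("hdfs://nn/udfs/my_udf.jar", unavailable_fs)
schema_result = debug_loader.load_avro_schema("hdfs://nn/schemas/t.avsc", unavailable_fs)
```

Keeping the check in a single helper means additional data-dependent
operations can be placed under the same flag later without repeating the
condition at each call site.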

Testing:
- manually confirmed that jars and Avro schema files are skipped.
- added a test to check the same behavior in a custom cluster test.
- ran core tests.

Change-Id: I15789fb489b285e2a6565025eb17c63cdc726354
Reviewed-on: http://gerrit.cloudera.org:8080/11191
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>


> Add a flag to selectively disable fs operations used by catalogd
> ----------------------------------------------------------------
>
>                 Key: IMPALA-7408
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7408
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Vuk Ercegovac
>            Assignee: Vuk Ercegovac
>            Priority: Major
>             Fix For: Impala 3.1.0
>
>
> It's often useful to start an Impala cluster when only metadata is present, for example, HMS data and an HDFS fsimage. In particular, this is useful when debugging metadata-related issues.
> However, catalogd requires FS operations that depend on data blocks, not just metadata. Examples include Avro schema files and function jars, which are stored on the FS. When these cases come up in large instances, a substantial amount of time is spent trying, failing, and throwing an exception. To skip this, we can add a flag that suppresses these operations. The flag will be hidden; its only use is for development and for reproducing metadata-related issues.


