Posted to issues@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2018/11/20 05:32:00 UTC

[jira] [Resolved] (IMPALA-7871) Don't load Hive builtin jars for dataload

     [ https://issues.apache.org/jira/browse/IMPALA-7871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-7871.
-----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 3.2.0

> Don't load Hive builtin jars for dataload
> -----------------------------------------
>
>                 Key: IMPALA-7871
>                 URL: https://issues.apache.org/jira/browse/IMPALA-7871
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 3.1.0
>            Reporter: Joe McDonnell
>            Assignee: Joe McDonnell
>            Priority: Major
>             Fix For: Impala 3.2.0
>
>
> One step in dataload is "Loading Hive Builtins", which copies a large number of jars into HDFS (or whichever storage backend is in use). This step takes a couple of minutes on an HDFS dataload and 8 minutes on S3. Despite its name, I can't find any indication that Hive or anything else uses these jars: dataload and core tests run fine without it, and S3 can load data without it.
> Unless we find something using these jars, we should remove this step.


