Posted to issues@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/12/23 20:38:00 UTC

[jira] [Resolved] (IMPALA-8821) Dataload for remote clusters should use recover partitions

     [ https://issues.apache.org/jira/browse/IMPALA-8821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-8821.
-----------------------------------
    Fix Version/s: Not Applicable
       Resolution: Won't Fix

I don't think we need this anymore, so I'm closing it. If there is demand for this functionality, the issue can be reopened.

> Dataload for remote clusters should use recover partitions
> ----------------------------------------------------------
>
>                 Key: IMPALA-8821
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8821
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Infrastructure
>    Affects Versions: Impala 3.3.0
>            Reporter: Joe McDonnell
>            Priority: Major
>             Fix For: Not Applicable
>
>
> Some test setups have data already in place and only need to run the DDLs to sync up the metadata. This corresponds to running testdata/bin/create-load-data.sh using a data snapshot but without skip_metadata_load.
> Right now, for partitioned tables where the partitions are created dynamically as part of the insert, generate-schema-statements.py forces a reload:
> {noformat}
> # Force reloading of the table if the user specified the --force option or
> # if the table is partitioned and there was no ALTER section specified. This is to
> # ensure the partition metadata is always properly created. The ALTER section is
> # used to create partitions, so if that section exists there is no need to force
> # reload.
> # IMPALA-6579: Also force reload all Kudu tables. The Kudu entity referenced
> # by the table may or may not exist, so requiring a force reload guarantees
> # that the Kudu entity is always created correctly.
> # TODO: Rename the ALTER section to ALTER_TABLE_ADD_PARTITION
> force_reload = options.force_reload or (partition_columns and not alter) or \
>     file_format == 'kudu'
> {noformat}
> When the data is already in place, this drops that data and reloads it. Instead, we should just run "recover partitions" on the table to pick up all the partition information.
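
The proposal in the description could be sketched roughly as follows. This is only an illustration, not the actual generate-schema-statements.py code: the function names, the action constants, and the data_in_place flag are all hypothetical, and how the script would learn that data is already in place is left as an assumption. The only piece taken from Impala itself is the ALTER TABLE ... RECOVER PARTITIONS statement.

```python
# Hypothetical sketch of the decision proposed in IMPALA-8821.
# None of these names exist in generate-schema-statements.py.
FORCE_RELOAD = "force_reload"
RECOVER_PARTITIONS = "recover_partitions"
DEFAULT = "default"


def choose_load_action(force_reload, partition_columns, alter, file_format,
                       data_in_place):
    """Pick how to sync one table's data and partition metadata.

    data_in_place is an assumed flag meaning the table's files already exist
    on the target filesystem (for example, restored from a data snapshot).
    """
    # An explicit --force and Kudu tables (IMPALA-6579) keep the existing
    # behavior: always drop and reload.
    if force_reload or file_format == 'kudu':
        return FORCE_RELOAD
    # Partitioned table whose partitions are created by the insert, with no
    # ALTER section to add them.
    if partition_columns and not alter:
        # With data already present, discover partitions from the existing
        # directories instead of dropping and reloading the data.
        return RECOVER_PARTITIONS if data_in_place else FORCE_RELOAD
    return DEFAULT


def recover_partitions_stmt(db_name, table_name):
    """Build the Impala statement that registers existing partition dirs."""
    return "ALTER TABLE {0}.{1} RECOVER PARTITIONS".format(db_name, table_name)
```

For a dynamically partitioned text table whose data is already loaded, choose_load_action(False, "year int", None, "text", True) would select RECOVER_PARTITIONS, and the caller would then run the statement built by recover_partitions_stmt instead of reloading.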



--
This message was sent by Atlassian Jira
(v8.3.4#803005)