Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2018/03/08 00:13:00 UTC
[jira] [Commented] (AIRFLOW-2150) Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition()
[ https://issues.apache.org/jira/browse/AIRFLOW-2150?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16390489#comment-16390489 ]
ASF subversion and git services commented on AIRFLOW-2150:
----------------------------------------------------------
Commit b8c2cea36299d6a3264d8bb1dc5a3995732b8855 in incubator-airflow's branch refs/heads/master from [~kevinyang]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=b8c2cea ]
[AIRFLOW-2150] Use lighter call in HiveMetastoreHook().max_partition()
Call self.metastore.get_partition_names() instead of
self.metastore.get_partitions(), which is extremely expensive for
large tables, in HiveMetastoreHook().max_partition().
Closes #3082 from yrqls21/kevin_yang_fix_hive_max_partition
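To illustrate why the lighter call is sufficient, here is a minimal sketch (not the code from the commit) of finding the maximum partition value from partition name strings alone. It assumes the names have the usual "key=value/key=value" form, e.g. "ds=2018-03-07/hour=01", and it ignores any escaping of special characters. The helpers parse_partition_name() and max_partition_from_names() are hypothetical names chosen for this example; they are not part of Airflow or the metastore client.

```python
def parse_partition_name(name):
    """Split 'ds=2018-03-07/hour=01' into {'ds': '2018-03-07', 'hour': '01'}.

    Assumes the 'key=value/key=value' layout; escaping is not handled.
    """
    return dict(part.split("=", 1) for part in name.split("/"))


def max_partition_from_names(partition_names, key):
    """Return the largest value of `key` across the names, or None if absent.

    Values are compared as strings, so this suits sortable formats
    such as ISO dates.
    """
    values = [
        spec[key]
        for spec in map(parse_partition_name, partition_names)
        if key in spec
    ]
    return max(values) if values else None


# Example input, standing in for what a partition-names call might return.
names = [
    "ds=2018-03-06/hour=23",
    "ds=2018-03-07/hour=00",
    "ds=2018-03-07/hour=01",
]
print(max_partition_from_names(names, "ds"))  # prints 2018-03-07
```

Because only the name strings are needed, nothing here depends on the full partition objects that the heavier call would return.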
> Use get_partition_names() instead of get_partitions() in HiveMetastoreHook().max_partition()
> --------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-2150
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2150
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Kevin Yang
> Assignee: Kevin Yang
> Priority: Major
>
> get_partitions() is extremely expensive for large tables; max_partition() should use get_partition_names() instead.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)