Posted to issues@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/06/03 07:18:00 UTC
[jira] [Created] (IMPALA-10727) Share identical CachedHmsPartitionDescriptor across HdfsPartitions
Quanlong Huang created IMPALA-10727:
---------------------------------------
Summary: Share identical CachedHmsPartitionDescriptor across HdfsPartitions
Key: IMPALA-10727
URL: https://issues.apache.org/jira/browse/IMPALA-10727
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Reporter: Quanlong Huang
In catalogd, we keep one CachedHmsPartitionDescriptor for each HdfsPartition. Many of its fields are often identical across partitions, e.g. sdBucketCols and sdSortCols. Instead, we can keep the distinct {{CachedHmsPartitionDescriptor}} instances in HdfsTable and share them among its HdfsPartitions. Fields that differ across partitions, e.g. msCreateTime and msLastAccessTime, can be moved to HdfsPartition.
https://github.com/apache/impala/blob/1a84a1420c5d517f43e4c7e90ee204db30f27d57/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java#L543
{code:java}
// TODO: Cache this descriptor in HdfsTable so that identical descriptors are shared
// between HdfsPartition instances.
// TODO: sdInputFormat and sdOutputFormat can be mutated by Impala when the file format
// of a partition changes; move these fields to HdfsPartition.
private static class CachedHmsPartitionDescriptor {
{code}
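A rough sketch of the sharing idea, not the actual Impala change: the table owns a pool that interns descriptors by value, and each partition holds a reference to the pooled instance plus its own per-partition fields. The names SharedDescriptor, DescriptorPool and PartitionSketch are hypothetical. Only two descriptor fields are shown, and the sort columns are simplified to strings for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for CachedHmsPartitionDescriptor. It is immutable and
// compared by value so that identical descriptors can be detected and shared.
final class SharedDescriptor {
  final List<String> sdBucketCols;
  final List<String> sdSortCols;  // simplified to strings for this sketch

  SharedDescriptor(List<String> sdBucketCols, List<String> sdSortCols) {
    // Defensive copies keep the pooled instance safe from later mutation.
    this.sdBucketCols = Collections.unmodifiableList(new ArrayList<>(sdBucketCols));
    this.sdSortCols = Collections.unmodifiableList(new ArrayList<>(sdSortCols));
  }

  @Override
  public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof SharedDescriptor)) return false;
    SharedDescriptor other = (SharedDescriptor) o;
    return sdBucketCols.equals(other.sdBucketCols)
        && sdSortCols.equals(other.sdSortCols);
  }

  @Override
  public int hashCode() {
    return Objects.hash(sdBucketCols, sdSortCols);
  }
}

// Hypothetical per-table pool: returns the already pooled instance when an
// equal descriptor has been seen before, otherwise pools the new one.
final class DescriptorPool {
  private final Map<SharedDescriptor, SharedDescriptor> pool = new HashMap<>();

  synchronized SharedDescriptor intern(SharedDescriptor d) {
    SharedDescriptor existing = pool.putIfAbsent(d, d);
    return existing != null ? existing : d;
  }

  synchronized int size() {
    return pool.size();
  }
}

// Hypothetical partition: fields that differ per partition live here, while
// the descriptor reference points at the shared, pooled instance.
final class PartitionSketch {
  final SharedDescriptor descriptor;
  final int msCreateTime;
  final int msLastAccessTime;

  PartitionSketch(DescriptorPool pool, SharedDescriptor d,
      int msCreateTime, int msLastAccessTime) {
    this.descriptor = pool.intern(d);
    this.msCreateTime = msCreateTime;
    this.msLastAccessTime = msLastAccessTime;
  }
}
```

This sketch never evicts entries, so a real implementation would probably need to release descriptors that no remaining partition references, e.g. after partitions are dropped. Fields that Impala can mutate, such as sdInputFormat and sdOutputFormat, would have to stay out of the shared object, as the second TODO suggests.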