Posted to commits@impala.apache.org by bo...@apache.org on 2022/03/08 12:53:17 UTC

[impala] 01/02: IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables

This is an automated email from the ASF dual-hosted git repository.

boroknagyz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit c10e951bcbc57e8e16d2427a7a6a4532f30f904a
Author: Zoltan Borok-Nagy <bo...@cloudera.com>
AuthorDate: Mon Jan 31 17:16:19 2022 +0100

    IMPALA-11053: Impala should be able to read migrated partitioned Iceberg tables
    
    When Hive (and probably other engines as well) converts a legacy Hive
    table to Iceberg, it doesn't rewrite the data files. This means that
    the data files contain neither write ids nor partition column data.
    Currently Impala expects the partition columns to be present in the
    data files, so it cannot read converted partitioned tables.
    
    With this patch Impala loads partition values from the Iceberg metadata.
    The extra metadata information is attached to the file descriptor
    objects and propagated to the scanners. This metadata contains the
    Iceberg data file format (later it could be used to handle mixed-format
    tables), and partition data.
    
    We use the partition data in the HdfsScanner to create the template
    tuple that contains the partition values of identity-partitioned
    columns. This applies not only to migrated tables, but to all Iceberg
    tables with identity partitions, which means we also save some IO
    and CPU time for such columns. The partition information could also
    be used for Dynamic Partition Pruning later.
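
The matching step described above can be sketched in isolation. This is a
minimal illustration, not the code from the patch: `PartitionKey`, `Slot` and
`IdentityPartitionValues` are hypothetical stand-ins for the FlatBuffer and
descriptor types, and a plain map stands in for the template tuple. It only
shows the idea that a slot receives a value when its Iceberg field id equals
the source id of an identity-transformed partition key.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the transform type enum in the patch.
enum class TransformType { IDENTITY, HOUR, DAY, MONTH, YEAR, BUCKET, TRUNCATE, VOID };

// Hypothetical stand-in for one partition key stored per data file.
struct PartitionKey {
  TransformType type;
  int32_t source_id;  // Iceberg field id of the source column
  std::string value;  // human-readable partition value
};

// Hypothetical stand-in for a materialized top-level column.
struct Slot {
  std::string name;
  int32_t field_id;
};

// Returns slot name -> partition value for every slot that is backed by an
// identity partition. Slots covered only by other transforms are skipped.
std::map<std::string, std::string> IdentityPartitionValues(
    const std::vector<Slot>& slots, const std::vector<PartitionKey>& keys) {
  std::map<std::string, std::string> result;
  for (const Slot& slot : slots) {
    for (const PartitionKey& key : keys) {
      if (key.type != TransformType::IDENTITY) continue;
      if (key.source_id != slot.field_id) continue;
      result[slot.name] = key.value;
    }
  }
  return result;
}
```

Columns that end up in the result do not need a column reader, which is where
the IO and CPU saving mentioned in the commit message would come from.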
    
    We use the (human-readable) string representation of the partition data
    when storing it in the flat buffers. This helps debugging and also
    provides the flexibility needed when the partition columns
    evolve (e.g. INT -> BIGINT, DECIMAL(4,2) -> DECIMAL(6,2)).
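
To illustrate why a string survives type evolution, here is a small sketch
under stated assumptions: `ParsePartitionInt` is a hypothetical helper, not
part of Impala, and it uses `std::strtoll` rather than Impala's own parsing
utilities. The same stored text is parsed against whatever integer type the
column has at read time, so a value written while the column was INT still
parses after the column becomes BIGINT.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <string>

// Hypothetical helper: parses 'text' as a base-10 integer that must fit in T.
// Returns false on empty input, trailing garbage, or overflow.
template <typename T>
bool ParsePartitionInt(const std::string& text, T* out) {
  if (text.empty()) return false;
  errno = 0;
  char* end = nullptr;
  long long v = std::strtoll(text.c_str(), &end, 10);
  if (errno == ERANGE || *end != '\0') return false;
  if (v < std::numeric_limits<T>::min() || v > std::numeric_limits<T>::max()) {
    return false;
  }
  *out = static_cast<T>(v);
  return true;
}
```

A binary encoding would tie the stored bytes to the width of the original
type. With text, only the parse target changes.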
    
    Testing
     * e2e test for all data types that can be used to partition a table
     * e2e test for migrated partitioned table + schema evolution (without
       renaming columns)
     * e2e for table where all columns are used as identity-partitions
    
    Change-Id: Iac11a02de709d43532056f71359c49d20c1be2b8
    Reviewed-on: http://gerrit.cloudera.org:8080/18240
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 be/src/exec/CMakeLists.txt                         |   1 +
 be/src/exec/file-metadata-utils.cc                 | 143 ++++++++++++++++
 be/src/exec/file-metadata-utils.h                  |  56 ++++++
 be/src/exec/hdfs-orc-scanner.cc                    |   3 +-
 be/src/exec/hdfs-scan-node-base.cc                 |   7 +
 be/src/exec/hdfs-scan-node-base.h                  |   7 +
 be/src/exec/hdfs-scanner.cc                        |  13 +-
 be/src/exec/hdfs-scanner.h                         |   4 +
 be/src/exec/orc-column-readers.cc                  |  14 +-
 be/src/exec/parquet/hdfs-parquet-scanner.cc        |   3 +-
 be/src/exec/parquet/parquet-metadata-utils.h       |   3 +
 be/src/runtime/dml-exec-state.cc                   |   2 +-
 be/src/scheduling/scheduler.cc                     |   6 +
 common/fbs/CatalogObjects.fbs                      |   8 +
 common/fbs/IcebergObjects.fbs                      |  33 +++-
 common/protobuf/planner.proto                      |   1 +
 common/thrift/CatalogObjects.thrift                |   4 +
 common/thrift/PlanNodes.thrift                     |   1 +
 .../org/apache/impala/catalog/FeIcebergTable.java  |  14 +-
 .../org/apache/impala/catalog/HdfsPartition.java   |  32 +++-
 .../java/org/apache/impala/catalog/HdfsTable.java  |   5 +-
 .../org/apache/impala/planner/HdfsScanNode.java    |   3 +
 .../org/apache/impala/planner/IcebergScanNode.java |   3 +
 .../java/org/apache/impala/util/IcebergUtil.java   | 131 +++++++++++++-
 testdata/data/README                               |  23 ++-
 .../283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd-m0.avro   | Bin 0 -> 3926 bytes
 ...621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro | Bin 0 -> 1999 bytes
 .../metadata/v1.metadata.json                      | 164 ++++++++++++++++++
 .../metadata/v2.metadata.json                      | 188 +++++++++++++++++++++
 .../p_date=2022-02-22/p_string=impala/000000_0     | Bin 0 -> 433 bytes
 .../db72fbf2-f9f6-4985-8a5f-fd9f632f2c77-m0.avro   | Bin 0 -> 3925 bytes
 ...230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro | Bin 0 -> 2003 bytes
 .../metadata/v1.metadata.json                      | 164 ++++++++++++++++++
 .../metadata/v2.metadata.json                      | 188 +++++++++++++++++++++
 .../metadata/version-hint.text                     |   1 +
 .../p_date=2022-02-22/p_string=impala/000000_0     | Bin 0 -> 189 bytes
 .../2d05a7d4-c229-44c3-860e-e77e46e71a19-m0.avro   | Bin 0 -> 3691 bytes
 ...186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro | Bin 0 -> 1938 bytes
 .../metadata/v1.metadata.json                      |  89 ++++++++++
 .../metadata/v2.metadata.json                      | 116 +++++++++++++
 .../metadata/version-hint.text                     |   1 +
 .../p_float_double=1.1/p_dec_dec=2.718/000000_0    | Bin 0 -> 429 bytes
 .../p_float_double=1.1/p_dec_dec=3.141/000000_0    | Bin 0 -> 429 bytes
 .../8db62f0e-38e5-434b-94dc-c84210302ad8-m0.avro   | Bin 0 -> 3687 bytes
 ...046-1-8db62f0e-38e5-434b-94dc-c84210302ad8.avro | Bin 0 -> 1941 bytes
 .../metadata/v1.metadata.json                      |  89 ++++++++++
 .../metadata/v2.metadata.json                      | 116 +++++++++++++
 .../metadata/version-hint.text                     |   1 +
 .../p_float_double=1.1/p_dec_dec=2.718/000000_0    | Bin 0 -> 189 bytes
 .../p_float_double=1.1/p_dec_dec=3.141/000000_0    | Bin 0 -> 189 bytes
 .../functional/functional_schema_template.sql      |  56 ++++++
 .../datasets/functional/schema_constraints.csv     |   4 +
 .../queries/QueryTest/iceberg-migrated-tables.test | 106 ++++++++++++
 tests/query_test/test_iceberg.py                   |   3 +
 54 files changed, 1780 insertions(+), 26 deletions(-)

diff --git a/be/src/exec/CMakeLists.txt b/be/src/exec/CMakeLists.txt
index 87d3b1a..b26b7e5 100644
--- a/be/src/exec/CMakeLists.txt
+++ b/be/src/exec/CMakeLists.txt
@@ -44,6 +44,7 @@ add_library(Exec
   exec-node.cc
   exchange-node.cc
   external-data-source-executor.cc
+  file-metadata-utils.cc
   filter-context.cc
   grouping-aggregator.cc
   grouping-aggregator-ir.cc
diff --git a/be/src/exec/file-metadata-utils.cc b/be/src/exec/file-metadata-utils.cc
new file mode 100644
index 0000000..7ed2b23
--- /dev/null
+++ b/be/src/exec/file-metadata-utils.cc
@@ -0,0 +1,143 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#include "exec/file-metadata-utils.h"
+
+#include "exec/hdfs-scan-node-base.h"
+#include "exec/scanner-context.h"
+#include "exec/text-converter.inline.h"
+#include "runtime/descriptors.h"
+#include "runtime/tuple.h"
+#include "util/flat_buffer.h"
+#include "util/string-parser.h"
+
+#include "common/names.h"
+
+namespace impala {
+
+void FileMetadataUtils::Open(ScannerContext* context) {
+  DCHECK(context != nullptr);
+  context_ = context;
+  file_desc_ = scan_node_->GetFileDesc(context->partition_descriptor()->id(),
+                                       context->GetStream()->filename());
+}
+
+Tuple* FileMetadataUtils::CreateTemplateTuple(MemPool* mem_pool) {
+  DCHECK(context_ != nullptr);
+  DCHECK(file_desc_ != nullptr);
+  // Initialize the template tuple, it is copied from the template tuple map in the
+  // HdfsScanNodeBase.
+  Tuple* template_tuple =
+      scan_node_->GetTemplateTupleForPartitionId(context_->partition_descriptor()->id());
+  if (template_tuple != nullptr) {
+    template_tuple =
+        template_tuple->DeepCopy(*scan_node_->tuple_desc(), mem_pool);
+  }
+  if (!scan_node_->hdfs_table()->IsIcebergTable()) {
+    return template_tuple;
+  }
+  using namespace org::apache::impala::fb;
+  TextConverter text_converter(/* escape_char = */ '\\',
+      scan_node_->hdfs_table()->null_column_value(),
+      /* check_null = */ true, /* strict_mode = */ true);
+  const FbFileMetadata* file_metadata = file_desc_->file_metadata;
+  const FbIcebergMetadata* ice_metadata = file_metadata->iceberg_metadata();
+  auto transforms = ice_metadata->partition_keys();
+  if (transforms == nullptr) return template_tuple;
+
+  const TupleDescriptor* tuple_desc = scan_node_->tuple_desc();
+  if (template_tuple == nullptr) {
+    template_tuple = Tuple::Create(tuple_desc->byte_size(), mem_pool);
+  }
+  for (const SlotDescriptor* slot_desc : scan_node_->tuple_desc()->slots()) {
+    const SchemaPath& path = slot_desc->col_path();
+    if (path.size() != 1) continue;
+    const ColumnDescriptor& col_desc =
+        scan_node_->hdfs_table()->col_descs()[path.front()];
+    int field_id = col_desc.field_id();
+    for (int i = 0; i < transforms->Length(); ++i) {
+      auto transform = transforms->Get(i);
+      if (transform->transform_type() !=
+          FbIcebergTransformType::FbIcebergTransformType_IDENTITY) {
+        continue;
+      }
+      if (field_id != transform->source_id()) continue;
+      if (!text_converter.WriteSlot(slot_desc, template_tuple,
+                                    transform->transform_value()->c_str(),
+                                    transform->transform_value()->size(),
+                                    true, false,
+                                    mem_pool)) {
+        ErrorMsg error_msg(TErrorCode::GENERAL,
+            Substitute("Could not parse partition value for "
+                "column '$0' in file '$1'. Partition string is '$2'",
+                col_desc.name(), file_desc_->filename,
+                transform->transform_value()->c_str()));
+        // Dates are stored as INTs in the partition data in Iceberg, so let's try
+        // to parse them as INTs.
+        if (col_desc.type().type == PrimitiveType::TYPE_DATE) {
+          int32_t* slot = template_tuple->GetIntSlot(slot_desc->tuple_offset());
+          StringParser::ParseResult parse_result;
+          *slot = StringParser::StringToInt<int32_t>(
+              transform->transform_value()->c_str(),
+              transform->transform_value()->size(),
+              &parse_result);
+          if (parse_result == StringParser::ParseResult::PARSE_SUCCESS) {
+            template_tuple->SetNotNull(slot_desc->null_indicator_offset());
+          } else {
+            state_->LogError(error_msg);
+          }
+        } else {
+          state_->LogError(error_msg);
+        }
+      }
+    }
+  }
+  return template_tuple;
+}
+
+bool FileMetadataUtils::IsValuePartitionCol(const SlotDescriptor* slot_desc) {
+  DCHECK(context_ != nullptr);
+  DCHECK(file_desc_ != nullptr);
+  if (slot_desc->parent() != scan_node_->tuple_desc()) return false;
+  if (slot_desc->col_pos() < scan_node_->num_partition_keys()) {
+    return true;
+  }
+
+  if (!scan_node_->hdfs_table()->IsIcebergTable()) return false;
+
+  using namespace org::apache::impala::fb;
+
+  const SchemaPath& path = slot_desc->col_path();
+  if (path.size() != 1) return false;
+
+  int field_id = scan_node_->hdfs_table()->col_descs()[path.front()].field_id();
+  const FbFileMetadata* file_metadata = file_desc_->file_metadata;
+  const FbIcebergMetadata* ice_metadata = file_metadata->iceberg_metadata();
+  auto transforms = ice_metadata->partition_keys();
+  if (transforms == nullptr) return false;
+  for (int i = 0; i < transforms->Length(); ++i) {
+    auto transform = transforms->Get(i);
+    if (transform->source_id() == field_id &&
+        transform->transform_type() ==
+            FbIcebergTransformType::FbIcebergTransformType_IDENTITY) {
+      return true;
+    }
+  }
+  return false;
+}
+
+} // namespace impala
diff --git a/be/src/exec/file-metadata-utils.h b/be/src/exec/file-metadata-utils.h
new file mode 100644
index 0000000..baf2ad7
--- /dev/null
+++ b/be/src/exec/file-metadata-utils.h
@@ -0,0 +1,56 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+#pragma once
+
+namespace impala {
+
+struct HdfsFileDesc;
+class HdfsScanNodeBase;
+class MemPool;
+class RuntimeState;
+class ScannerContext;
+class SlotDescriptor;
+class Tuple;
+class TupleDescriptor;
+
+/// Helper class for scanners dealing with different table/file formats.
+class FileMetadataUtils {
+public:
+  FileMetadataUtils(HdfsScanNodeBase* scan_node, RuntimeState* state) :
+      scan_node_(scan_node), state_(state) {}
+
+  void Open(ScannerContext* context);
+
+  /// Returns the template tuple corresponding to this scanner context. I.e. it sets
+  /// partition columns and default values in the template tuple.
+  Tuple* CreateTemplateTuple(MemPool* mem_pool);
+
+  /// Returns true if 'slot_desc' refers to a value-based partition column. Returns false
+  /// for transform-based partition columns and non-partition columns.
+  bool IsValuePartitionCol(const SlotDescriptor* slot_desc);
+
+private:
+  HdfsScanNodeBase* scan_node_;
+  RuntimeState* state_;
+
+  // Members below are set in Open()
+  const ScannerContext* context_ = nullptr;
+  const HdfsFileDesc* file_desc_ = nullptr;
+};
+
+} // namespace impala
diff --git a/be/src/exec/hdfs-orc-scanner.cc b/be/src/exec/hdfs-orc-scanner.cc
index f9c368a..cd5282d 100644
--- a/be/src/exec/hdfs-orc-scanner.cc
+++ b/be/src/exec/hdfs-orc-scanner.cc
@@ -532,8 +532,7 @@ inline THdfsCompression::type HdfsOrcScanner::TranslateCompressionKind(
 }
 
 bool HdfsOrcScanner::IsPartitionKeySlot(const SlotDescriptor* slot) {
-  return slot->parent() == scan_node_->tuple_desc() &&
-      slot->col_pos() < scan_node_->num_partition_keys();
+  return file_metadata_utils_.IsValuePartitionCol(slot);
 }
 
 bool HdfsOrcScanner::IsMissingField(const SlotDescriptor* slot) {
diff --git a/be/src/exec/hdfs-scan-node-base.cc b/be/src/exec/hdfs-scan-node-base.cc
index da9f110..0e9c07d 100644
--- a/be/src/exec/hdfs-scan-node-base.cc
+++ b/be/src/exec/hdfs-scan-node-base.cc
@@ -50,6 +50,7 @@
 #include "runtime/runtime-state.h"
 #include "util/compression-util.h"
 #include "util/disk-info.h"
+#include "util/flat_buffer.h"
 #include "util/hdfs-util.h"
 #include "util/impalad-metrics.h"
 #include "util/metrics.h"
@@ -250,6 +251,11 @@ Status HdfsScanPlanNode::ProcessScanRangesAndInitSharedState(FragmentState* stat
     for (const ScanRangeParamsPB& params : ranges->second.scan_ranges()) {
       DCHECK(params.scan_range().has_hdfs_file_split());
       const HdfsFileSplitPB& split = params.scan_range().hdfs_file_split();
+      const org::apache::impala::fb::FbFileMetadata* file_metadata = nullptr;
+      if (params.scan_range().has_file_metadata()) {
+        file_metadata = flatbuffers::GetRoot<org::apache::impala::fb::FbFileMetadata>(
+            params.scan_range().file_metadata().c_str());
+      }
       HdfsPartitionDescriptor* partition_desc =
           hdfs_table_->GetPartition(split.partition_id());
       if (template_tuple_map_.find(split.partition_id()) == template_tuple_map_.end()) {
@@ -284,6 +290,7 @@ Status HdfsScanPlanNode::ProcessScanRangesAndInitSharedState(FragmentState* stat
         file_desc->mtime = split.mtime();
         file_desc->file_compression = CompressionTypePBToThrift(split.file_compression());
         file_desc->file_format = partition_desc->file_format();
+        file_desc->file_metadata = file_metadata;
         RETURN_IF_ERROR(HdfsFsCache::instance()->GetConnection(
             native_file_path, &file_desc->fs, &fs_cache));
         shared_state_.per_type_files_[partition_desc->file_format()].push_back(file_desc);
diff --git a/be/src/exec/hdfs-scan-node-base.h b/be/src/exec/hdfs-scan-node-base.h
index e3d2ee8..e6cd7c6 100644
--- a/be/src/exec/hdfs-scan-node-base.h
+++ b/be/src/exec/hdfs-scan-node-base.h
@@ -41,6 +41,10 @@
 #include "util/spinlock.h"
 #include "util/unique-id-hash.h"
 
+namespace org { namespace apache { namespace impala { namespace fb {
+struct FbFileMetadata;
+}}}}
+
 namespace impala {
 
 class ScannerContext;
@@ -83,6 +87,9 @@ struct HdfsFileDesc {
   /// Splits (i.e. raw byte ranges) for this file, assigned to this scan node.
   std::vector<io::ScanRange*> splits;
 
+  /// Extra file metadata, e.g. Iceberg-related file-level info.
+  const ::org::apache::impala::fb::FbFileMetadata* file_metadata;
+
   /// Some useful typedefs for creating HdfsFileDesc related data structures.
   /// This is a pair for partition ID and filename which uniquely identifies a file.
   typedef pair<int64_t, std::string> PartitionFileKey;
diff --git a/be/src/exec/hdfs-scanner.cc b/be/src/exec/hdfs-scanner.cc
index a5114dd..f4ff166 100644
--- a/be/src/exec/hdfs-scanner.cc
+++ b/be/src/exec/hdfs-scanner.cc
@@ -20,6 +20,7 @@
 #include "codegen/codegen-anyval.h"
 #include "exec/base-sequence-scanner.h"
 #include "exec/exec-node.inline.h"
+#include "exec/file-metadata-utils.h"
 #include "exec/hdfs-scan-node.h"
 #include "exec/hdfs-scan-node-mt.h"
 #include "exec/read-write-util.h"
@@ -50,6 +51,7 @@ const char* HdfsScanner::LLVM_CLASS_NAME = "class.impala::HdfsScanner";
 HdfsScanner::HdfsScanner(HdfsScanNodeBase* scan_node, RuntimeState* state)
     : scan_node_(scan_node),
       state_(state),
+      file_metadata_utils_(scan_node, state),
       expr_perm_pool_(new MemPool(scan_node->expr_mem_tracker())),
       template_tuple_pool_(new MemPool(scan_node->mem_tracker())),
       tuple_byte_size_(scan_node->tuple_desc()->byte_size()),
@@ -65,6 +67,7 @@ HdfsScanner::HdfsScanner(HdfsScanNodeBase* scan_node, RuntimeState* state)
 HdfsScanner::HdfsScanner()
     : scan_node_(nullptr),
       state_(nullptr),
+      file_metadata_utils_(nullptr, nullptr),
       tuple_byte_size_(0) {
   DCHECK(TestInfo::is_test());
 }
@@ -74,6 +77,7 @@ HdfsScanner::~HdfsScanner() {
 
 Status HdfsScanner::Open(ScannerContext* context) {
   context_ = context;
+  file_metadata_utils_.Open(context);
   stream_ = context->GetStream();
 
   // Clone the scan node's conjuncts map. The cloned evaluators must be closed by the
@@ -108,14 +112,7 @@ Status HdfsScanner::Open(ScannerContext* context) {
     }
   }
 
-  // Initialize the template_tuple_, it is copied from the template tuple map in the
-  // HdfsScanNodeBase.
-  Tuple* template_tuple =
-      scan_node_->GetTemplateTupleForPartitionId(context_->partition_descriptor()->id());
-  if (template_tuple != nullptr) {
-    template_tuple_ =
-        template_tuple->DeepCopy(*scan_node_->tuple_desc(), template_tuple_pool_.get());
-  }
+  template_tuple_ = file_metadata_utils_.CreateTemplateTuple(template_tuple_pool_.get());
   template_tuple_map_[scan_node_->tuple_desc()] = template_tuple_;
 
   decompress_timer_ = ADD_TIMER(scan_node_->runtime_profile(), "DecompressionTime");
diff --git a/be/src/exec/hdfs-scanner.h b/be/src/exec/hdfs-scanner.h
index 6b49f7d..e68e4be 100644
--- a/be/src/exec/hdfs-scanner.h
+++ b/be/src/exec/hdfs-scanner.h
@@ -31,6 +31,7 @@
 #include "common/object-pool.h"
 #include "common/status.h"
 #include "exec/exec-node.inline.h"
+#include "exec/file-metadata-utils.h"
 #include "exec/hdfs-scan-node-base.h"
 #include "exec/scanner-context.h"
 #include "runtime/io/disk-io-mgr.h"
@@ -239,6 +240,9 @@ class HdfsScanner {
   /// Context for this scanner
   ScannerContext* context_ = nullptr;
 
+  /// Utility class for handling file metadata.
+  FileMetadataUtils file_metadata_utils_;
+
   /// Object pool for objects with same lifetime as scanner.
   ObjectPool obj_pool_;
 
diff --git a/be/src/exec/orc-column-readers.cc b/be/src/exec/orc-column-readers.cc
index 53495b5..c2136f5 100644
--- a/be/src/exec/orc-column-readers.cc
+++ b/be/src/exec/orc-column-readers.cc
@@ -459,8 +459,18 @@ Status OrcStructReader::TopLevelReadValueBatch(ScratchTupleBatch* scratch_batch,
          (scanner_->row_batches_need_validation_ &&
           scanner_->scan_node_->IsZeroSlotTableScan());
     if (!valid_empty_children) {
-      return Status(Substitute("Parse error in possibly corrupt ORC file: '$0'",
-          scanner_->filename()));
+      bool only_partitions = true;
+      for (SlotDescriptor* slot : tuple_desc_->slots()) {
+        if (!scanner_->IsPartitionKeySlot(slot)) {
+          only_partitions = false;
+          break;
+        }
+      }
+      if (!only_partitions) {
+        return Status(Substitute("Parse error in possibly corrupt ORC file: '$0'. "
+            "No columns found for this scan.",
+            scanner_->filename()));
+      }
     }
     DCHECK_EQ(0, num_rows_read);
     num_rows_read = std::min(scratch_batch->capacity - scratch_batch->num_tuples,
diff --git a/be/src/exec/parquet/hdfs-parquet-scanner.cc b/be/src/exec/parquet/hdfs-parquet-scanner.cc
index 3306b64..71f2079 100644
--- a/be/src/exec/parquet/hdfs-parquet-scanner.cc
+++ b/be/src/exec/parquet/hdfs-parquet-scanner.cc
@@ -2756,8 +2756,7 @@ Status HdfsParquetScanner::CreateColumnReaders(const TupleDescriptor& tuple_desc
 
   for (SlotDescriptor* slot_desc: tuple_desc.slots()) {
     // Skip partition columns
-    if (&tuple_desc == scan_node_->tuple_desc() &&
-        slot_desc->col_pos() < scan_node_->num_partition_keys()) continue;
+    if (file_metadata_utils_.IsValuePartitionCol(slot_desc)) continue;
 
     SchemaNode* node = nullptr;
     bool pos_field;
diff --git a/be/src/exec/parquet/parquet-metadata-utils.h b/be/src/exec/parquet/parquet-metadata-utils.h
index efdb05d..df9656d 100644
--- a/be/src/exec/parquet/parquet-metadata-utils.h
+++ b/be/src/exec/parquet/parquet-metadata-utils.h
@@ -160,6 +160,9 @@ class ParquetSchemaResolver {
       const parquet::SchemaElement& first_column = schema[1];
       if (first_column.__isset.field_id) {
         fallback_schema_resolution_ = TSchemaResolutionStrategy::type::FIELD_ID;
+      } else {
+        // Use Name-based schema resolution in case of missing field ids.
+        fallback_schema_resolution_ = TSchemaResolutionStrategy::type::NAME;
       }
     }
     return CreateSchemaTree(file_metadata->schema, &schema_);
diff --git a/be/src/runtime/dml-exec-state.cc b/be/src/runtime/dml-exec-state.cc
index 862eab8..4877b35 100644
--- a/be/src/runtime/dml-exec-state.cc
+++ b/be/src/runtime/dml-exec-state.cc
@@ -510,7 +510,7 @@ string createIcebergDataFileString(
   flatbuffers::Offset<FbIcebergDataFile> data_file = CreateFbIcebergDataFile(fbb,
       fbb.CreateString(final_path),
       // Currently we can only write Parquet to Iceberg
-      FbFileFormat::FbFileFormat_PARQUET,
+      FbIcebergDataFileFormat::FbIcebergDataFileFormat_PARQUET,
       num_rows,
       file_size,
       fbb.CreateString(partition_name),
diff --git a/be/src/scheduling/scheduler.cc b/be/src/scheduling/scheduler.cc
index c6cb5c0..19e830e 100644
--- a/be/src/scheduling/scheduler.cc
+++ b/be/src/scheduling/scheduler.cc
@@ -136,6 +136,9 @@ Status Scheduler::GenerateScanRanges(const vector<TFileSplitGeneratorSpec>& spec
       hdfs_scan_range.__set_partition_path_hash(spec.partition_path_hash);
       TScanRange scan_range;
       scan_range.__set_hdfs_file_split(hdfs_scan_range);
+      if (spec.file_desc.__isset.file_metadata) {
+        scan_range.__set_file_metadata(spec.file_desc.file_metadata);
+      }
       TScanRangeLocationList scan_range_list;
       scan_range_list.__set_scan_range(scan_range);
 
@@ -1126,6 +1129,9 @@ void TScanRangeToScanRangePB(const TScanRange& tscan_range, ScanRangePB* scan_ra
   if (tscan_range.__isset.kudu_scan_token) {
     scan_range_pb->set_kudu_scan_token(tscan_range.kudu_scan_token);
   }
+  if (tscan_range.__isset.file_metadata) {
+    scan_range_pb->set_file_metadata(tscan_range.file_metadata);
+  }
 }
 
 void Scheduler::AssignmentCtx::RecordScanRangeAssignment(
diff --git a/common/fbs/CatalogObjects.fbs b/common/fbs/CatalogObjects.fbs
index 3cf20d2..629938c 100644
--- a/common/fbs/CatalogObjects.fbs
+++ b/common/fbs/CatalogObjects.fbs
@@ -15,6 +15,8 @@
 // specific language governing permissions and limitations
 // under the License.
 
+include "IcebergObjects.fbs";
+
 namespace org.apache.impala.fb;
 
 // Supported compression algorithms. This needs to match the values in
@@ -78,3 +80,9 @@ table FbFileDesc {
   // Whether this file is erasure-coded
   is_ec: bool = false (id: 5);
 }
+
+// Additional file-related metadata
+table FbFileMetadata {
+  // Iceberg-related metadata about the data file
+  iceberg_metadata : FbIcebergMetadata;
+}
diff --git a/common/fbs/IcebergObjects.fbs b/common/fbs/IcebergObjects.fbs
index 585c300..9a55ef8 100644
--- a/common/fbs/IcebergObjects.fbs
+++ b/common/fbs/IcebergObjects.fbs
@@ -17,9 +17,36 @@
 
 namespace org.apache.impala.fb;
 
-enum FbFileFormat: byte {
+enum FbIcebergDataFileFormat: byte {
   PARQUET,
-  ORC
+  ORC,
+  // We add AVRO here as a future possibility.
+  // The Iceberg spec allows AVRO data files, but currently Impala
+  // cannot read such Iceberg tables. See IMPALA-11158.
+  AVRO
+}
+
+enum FbIcebergTransformType : byte {
+  IDENTITY,
+  HOUR,
+  DAY,
+  MONTH,
+  YEAR,
+  BUCKET,
+  TRUNCATE,
+  VOID
+}
+
+table FbIcebergPartitionTransformValue {
+  transform_type: FbIcebergTransformType;
+  transform_param: int;
+  transform_value: string;
+  source_id: int;
+}
+
+table FbIcebergMetadata {
+  file_format : FbIcebergDataFileFormat;
+  partition_keys : [FbIcebergPartitionTransformValue];
 }
 
 table FbIcebergColumnStats {
@@ -32,7 +59,7 @@ table FbIcebergColumnStats {
 
 table FbIcebergDataFile {
   path: string;
-  format: FbFileFormat = PARQUET;
+  format: FbIcebergDataFileFormat = PARQUET;
   record_count: long = 0;
   file_size_in_bytes: long = 0;
   partition_path: string;
diff --git a/common/protobuf/planner.proto b/common/protobuf/planner.proto
index 80cf305..adccdee 100644
--- a/common/protobuf/planner.proto
+++ b/common/protobuf/planner.proto
@@ -73,4 +73,5 @@ message ScanRangePB {
   optional HdfsFileSplitPB hdfs_file_split = 1;
   optional HBaseKeyRangePB hbase_key_range = 2;
   optional bytes kudu_scan_token = 3;
+  optional bytes file_metadata = 4;
 }
diff --git a/common/thrift/CatalogObjects.thrift b/common/thrift/CatalogObjects.thrift
index ef3ad40..77568bb 100644
--- a/common/thrift/CatalogObjects.thrift
+++ b/common/thrift/CatalogObjects.thrift
@@ -292,6 +292,10 @@ struct THdfsFileDesc {
   // (defined in common/fbs/CatalogObjects.fbs).
   // TODO: Put this in a KRPC sidecar to avoid serialization cost.
   1: required binary file_desc_data
+
+  // Additional file metadata serialized into a FlatBuffer
+  // TODO: Put this in a KRPC sidecar to avoid serialization cost.
+  2: optional binary file_metadata
 }
 
 // Represents an HDFS partition's location in a compressed format. 'prefix_index'
diff --git a/common/thrift/PlanNodes.thrift b/common/thrift/PlanNodes.thrift
index 5883d15..b608c69 100644
--- a/common/thrift/PlanNodes.thrift
+++ b/common/thrift/PlanNodes.thrift
@@ -268,6 +268,7 @@ struct TScanRange {
   1: optional THdfsFileSplit hdfs_file_split
   2: optional THBaseKeyRange hbase_key_range
   3: optional binary kudu_scan_token
+  4: optional binary file_metadata
 }
 
 // Specification of an overlap predicate desc.
diff --git a/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java b/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
index 6a020d3..991bef5 100644
--- a/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
@@ -41,6 +41,7 @@ import org.apache.iceberg.DataFile;
 import org.apache.iceberg.PartitionField;
 import org.apache.iceberg.PartitionSpec;
 import org.apache.iceberg.Schema;
+import org.apache.iceberg.Table;
 import org.apache.iceberg.TableMetadata;
 import org.apache.impala.analysis.IcebergPartitionField;
 import org.apache.impala.analysis.IcebergPartitionSpec;
@@ -49,6 +50,8 @@ import org.apache.impala.catalog.HdfsPartition.FileDescriptor;
 import org.apache.impala.common.FileSystemUtil;
 import org.apache.impala.common.Reference;
 import org.apache.impala.compat.HdfsShim;
+import org.apache.impala.fb.FbFileDesc;
+import org.apache.impala.fb.FbIcebergMetadata;
 import org.apache.impala.thrift.TColumn;
 import org.apache.impala.thrift.TCompressionCodec;
 import org.apache.impala.thrift.THdfsCompression;
@@ -569,18 +572,25 @@ public interface FeIcebergTable extends FeFsTable {
       Map<String, HdfsPartition.FileDescriptor> fileDescMap = new HashMap<>();
       List<DataFile> dataFileList = IcebergUtil.getIcebergDataFiles(table,
           new ArrayList<>(), /*timeTravelSpecl=*/null);
+      Table iceTable = table.getIcebergBaseTable();
       for (DataFile dataFile : dataFileList) {
           Path path = new Path(dataFile.path().toString());
           if (hdfsFileDescMap.containsKey(path.toUri().getPath())) {
             String pathHash = IcebergUtil.getDataFilePathHash(dataFile);
-            fileDescMap.put(pathHash, hdfsFileDescMap.get(path.toUri().getPath()));
+            HdfsPartition.FileDescriptor fsFd = hdfsFileDescMap.get(
+                path.toUri().getPath());
+            HdfsPartition.FileDescriptor iceFd = fsFd.cloneWithFileMetadata(
+                IcebergUtil.createIcebergMetadata(iceTable, dataFile));
+            fileDescMap.put(pathHash, iceFd);
           } else {
             LOG.warn("Iceberg DataFile '{}' cannot be found in the HDFS recursive file "
                 + "listing results.", path.toString());
             HdfsPartition.FileDescriptor fileDesc = getFileDescriptor(
                 new Path(dataFile.path().toString()),
                 new Path(table.getIcebergTableLocation()), table.getHostIndex());
-            fileDescMap.put(IcebergUtil.getDataFilePathHash(dataFile), fileDesc);
+            HdfsPartition.FileDescriptor iceFd = fileDesc.cloneWithFileMetadata(
+                IcebergUtil.createIcebergMetadata(iceTable, dataFile));
+            fileDescMap.put(IcebergUtil.getDataFilePathHash(dataFile), iceFd);
           }
       }
       return fileDescMap;
diff --git a/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java b/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
index c92f5ef..b294289 100644
--- a/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
+++ b/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
@@ -51,6 +51,7 @@ import org.apache.impala.compat.MetastoreShim;
 import org.apache.impala.fb.FbCompression;
 import org.apache.impala.fb.FbFileBlock;
 import org.apache.impala.fb.FbFileDesc;
+import org.apache.impala.fb.FbFileMetadata;
 import org.apache.impala.thrift.CatalogObjectsConstants;
 import org.apache.impala.thrift.TAccessLevel;
 import org.apache.impala.thrift.TCatalogObject;
@@ -113,10 +114,26 @@ public class HdfsPartition extends CatalogObjectImpl
     // Internal representation of a file descriptor using a FlatBuffer.
     private final FbFileDesc fbFileDescriptor_;
 
-    private FileDescriptor(FbFileDesc fileDescData) { fbFileDescriptor_ = fileDescData; }
+    // Internal representation of additional file metadata, e.g. Iceberg metadata.
+    private final FbFileMetadata fbFileMetadata_;
+
+    private FileDescriptor(FbFileDesc fileDescData) {
+      fbFileDescriptor_ = fileDescData;
+      fbFileMetadata_ = null;
+    }
+
+    private FileDescriptor(FbFileDesc fileDescData, FbFileMetadata fileMetadata) {
+      fbFileDescriptor_ = fileDescData;
+      fbFileMetadata_ = fileMetadata;
+    }
 
     public static FileDescriptor fromThrift(THdfsFileDesc desc) {
       ByteBuffer bb = ByteBuffer.wrap(desc.getFile_desc_data());
+      if (desc.isSetFile_metadata()) {
+        ByteBuffer bbMd = ByteBuffer.wrap(desc.getFile_metadata());
+        return new FileDescriptor(FbFileDesc.getRootAsFbFileDesc(bb),
+                                  FbFileMetadata.getRootAsFbFileMetadata(bbMd));
+      }
       return new FileDescriptor(FbFileDesc.getRootAsFbFileDesc(bb));
     }
 
@@ -145,7 +162,11 @@ public class HdfsPartition extends CatalogObjectImpl
           it.mutateReplicaHostIdxs(j, FileBlock.makeReplicaIdx(isCached, newHostIdx));
         }
       }
-      return new FileDescriptor(cloned);
+      return new FileDescriptor(cloned, fbFileMetadata_);
+    }
+
+    public FileDescriptor cloneWithFileMetadata(FbFileMetadata fileMetadata) {
+      return new FileDescriptor(fbFileDescriptor_, fileMetadata);
     }
 
     /**
@@ -254,10 +275,17 @@ public class HdfsPartition extends CatalogObjectImpl
       return fbFileDescriptor_.fileBlocks(idx);
     }
 
+    public FbFileMetadata getFbFileMetadata() {
+      return fbFileMetadata_;
+    }
+
     public THdfsFileDesc toThrift() {
       THdfsFileDesc fd = new THdfsFileDesc();
       ByteBuffer bb = fbFileDescriptor_.getByteBuffer();
       fd.setFile_desc_data(bb);
+      if (fbFileMetadata_ != null) {
+        fd.setFile_metadata(fbFileMetadata_.getByteBuffer());
+      }
       return fd;
     }
 
diff --git a/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java b/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
index e0af0bb..4978e39 100644
--- a/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
@@ -683,7 +683,10 @@ public class HdfsTable extends Table implements FeFsTable {
 
     List<HdfsPartition.Builder> partBuilders = new ArrayList<>();
     if (msTbl.getPartitionKeysSize() == 0) {
-      Preconditions.checkArgument(msPartitions == null || msPartitions.isEmpty());
+      // Legacy -> Iceberg migrated tables might have HMS partitions (HIVE-25894).
+      if (!IcebergTable.isIcebergTable(msTbl)) {
+        Preconditions.checkArgument(msPartitions == null || msPartitions.isEmpty());
+      }
       // This table has no partition key, which means it has no declared partitions.
       // We model partitions slightly differently to Hive - every file must exist in a
       // partition, so add a single partition with no keys which will get all the
diff --git a/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java b/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
index 7e80f37..e43d819 100644
--- a/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
@@ -1349,6 +1349,9 @@ public class HdfsScanNode extends ScanNode {
             currentOffset, currentLength, partition.getId(), fileDesc.getFileLength(),
             fileDesc.getFileCompression().toThrift(), fileDesc.getModificationTime(),
             partition.getLocation().hashCode()));
+        if (fileDesc.getFbFileMetadata() != null) {
+          scanRange.setFile_metadata(fileDesc.getFbFileMetadata().getByteBuffer());
+        }
         TScanRangeLocationList scanRangeLocations = new TScanRangeLocationList();
         scanRangeLocations.scan_range = scanRange;
         scanRangeLocations.locations = locations;
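The changes above to HdfsPartition.FileDescriptor and HdfsScanNode share one pattern: the extra metadata is optional, the descriptor stays immutable, and the metadata is only set on the outgoing message when it is present. A simplified sketch of that pattern follows. `Descriptor`, `Wire`, and their fields are hypothetical stand-ins for `FileDescriptor` and the Thrift structs, and plain byte arrays are used instead of FlatBuffers, so this is an illustration under those assumptions rather than the code in the patch.

```java
import java.util.Arrays;

public class OptionalMetadataSketch {
  // Hypothetical stand-in for a Thrift struct with an optional field.
  public static class Wire {
    public byte[] descData;
    public byte[] metadata;  // null means "not set"
    public boolean isSetMetadata() { return metadata != null; }
  }

  // Hypothetical stand-in for an immutable file descriptor.
  public static final class Descriptor {
    private final byte[] descData;
    private final byte[] metadata;  // may be null, e.g. for non-Iceberg tables

    public Descriptor(byte[] descData, byte[] metadata) {
      this.descData = descData;
      this.metadata = metadata;
    }

    // Returns a new descriptor that shares descData but carries new metadata.
    public Descriptor cloneWithMetadata(byte[] newMetadata) {
      return new Descriptor(descData, newMetadata);
    }

    public byte[] getMetadata() { return metadata; }

    // Serializes the descriptor; the optional field is only set when present.
    public Wire toWire() {
      Wire w = new Wire();
      w.descData = descData;
      if (metadata != null) w.metadata = metadata;
      return w;
    }

    // Deserializes the descriptor; a missing optional field becomes null.
    public static Descriptor fromWire(Wire w) {
      return new Descriptor(w.descData, w.isSetMetadata() ? w.metadata : null);
    }
  }

  public static void main(String[] args) {
    Descriptor plain = new Descriptor(new byte[] {1, 2}, null);
    Descriptor withMd = plain.cloneWithMetadata(new byte[] {9});
    System.out.println(plain.toWire().isSetMetadata());
    System.out.println(
        Arrays.toString(Descriptor.fromWire(withMd.toWire()).getMetadata()));
  }
}
```

Because the original descriptor is never mutated, descriptors that are already cached for non-Iceberg tables are unaffected by attaching metadata to a clone.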
diff --git a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java b/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
index fb17d8d..8f02d8f 100644
--- a/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
+++ b/fe/src/main/java/org/apache/impala/planner/IcebergScanNode.java
@@ -138,6 +138,9 @@ public class IcebergScanNode extends HdfsScanNode {
               "Cannot load file descriptor for: " + dataFile.path());
         }
         // Add file descriptor to the cache.
+        org.apache.iceberg.Table iceTable = icebergTable_.getIcebergBaseTable();
+        fileDesc = fileDesc.cloneWithFileMetadata(
+            IcebergUtil.createIcebergMetadata(iceTable, dataFile));
         icebergTable_.getPathHashToFileDescMap().put(
             IcebergUtil.getDataFilePathHash(dataFile), fileDesc);
       }
diff --git a/fe/src/main/java/org/apache/impala/util/IcebergUtil.java b/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
index 02d33e0..edab096 100644
--- a/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
+++ b/fe/src/main/java/org/apache/impala/util/IcebergUtil.java
@@ -39,11 +39,19 @@ import com.google.common.hash.Hasher;
 import com.google.common.hash.Hashing;
 import com.google.common.primitives.Ints;
 import com.google.common.primitives.Longs;
+import com.google.flatbuffers.FlatBufferBuilder;
 
 import org.apache.impala.common.Pair;
+import org.apache.impala.fb.FbFileMetadata;
+import org.apache.impala.fb.FbIcebergDataFileFormat;
+import org.apache.impala.fb.FbIcebergMetadata;
+import org.apache.impala.fb.FbIcebergPartitionTransformValue;
+import org.apache.impala.fb.FbIcebergTransformType;
+import org.apache.iceberg.BaseTable;
 import org.apache.iceberg.Transaction;
 import org.apache.iceberg.catalog.TableIdentifier;
 import org.apache.iceberg.DataFile;
+import org.apache.iceberg.FileFormat;
 import org.apache.iceberg.FileScanTask;
 import org.apache.iceberg.TableScan;
 import org.apache.iceberg.expressions.UnboundPredicate;
@@ -66,6 +74,7 @@ import org.apache.impala.analysis.TimeTravelSpec.Kind;
 import org.apache.impala.catalog.Catalog;
 import org.apache.impala.catalog.FeIcebergTable;
 import org.apache.impala.catalog.HdfsFileFormat;
+import org.apache.impala.catalog.HdfsPartition;
 import org.apache.impala.catalog.IcebergTable;
 import org.apache.impala.catalog.TableLoadingException;
 import org.apache.impala.catalog.iceberg.IcebergHadoopCatalog;
@@ -559,11 +568,11 @@ public class IcebergUtil {
   public static org.apache.iceberg.FileFormat fbFileFormatToIcebergFileFormat(
       byte fbFileFormat) throws ImpalaRuntimeException {
     switch (fbFileFormat){
-      case org.apache.impala.fb.FbFileFormat.PARQUET:
+      case org.apache.impala.fb.FbIcebergDataFileFormat.PARQUET:
           return org.apache.iceberg.FileFormat.PARQUET;
       default:
           throw new ImpalaRuntimeException(String.format("Unexpected file format: %s",
-              org.apache.impala.fb.FbFileFormat.name(fbFileFormat)));
+              org.apache.impala.fb.FbIcebergDataFileFormat.name(fbFileFormat)));
     }
   }
 
@@ -823,4 +832,122 @@ public class IcebergUtil {
     }
     return ret;
   }
+
+  /**
+   * Extracts metadata from the Iceberg data file object 'df': the file format of the
+   * data file and the partition information of the partition it belongs to.
+   * The result is a flatbuffer, so it can be passed between machines and processes
+   * without further de/serialization.
+   */
+  public static FbFileMetadata createIcebergMetadata(Table iceTbl, DataFile df) {
+    FlatBufferBuilder fbb = new FlatBufferBuilder(1);
+    int iceOffset = createIcebergMetadata(fbb, iceTbl, df);
+    fbb.finish(FbFileMetadata.createFbFileMetadata(fbb, iceOffset));
+    ByteBuffer bb = fbb.dataBuffer().slice();
+    ByteBuffer compressedBb = ByteBuffer.allocate(bb.capacity());
+    compressedBb.put(bb);
+    return FbFileMetadata.getRootAsFbFileMetadata((ByteBuffer)compressedBb.flip());
+  }
+
+  private static int createIcebergMetadata(FlatBufferBuilder fbb, Table iceTbl,
+      DataFile df) {
+    int partKeysOffset = -1;
+    PartitionSpec spec = iceTbl.specs().get(df.specId());
+    if (spec != null && !spec.fields().isEmpty()) {
+      partKeysOffset = createPartitionKeys(fbb, spec, df);
+    }
+    FbIcebergMetadata.startFbIcebergMetadata(fbb);
+    byte fileFormat = -1;
+    if (df.format() == FileFormat.PARQUET) fileFormat = FbIcebergDataFileFormat.PARQUET;
+    else if (df.format() == FileFormat.ORC) fileFormat = FbIcebergDataFileFormat.ORC;
+    else if (df.format() == FileFormat.AVRO) fileFormat = FbIcebergDataFileFormat.AVRO;
+    if (fileFormat != -1) {
+      FbIcebergMetadata.addFileFormat(fbb, fileFormat);
+    }
+    if (partKeysOffset != -1) {
+      FbIcebergMetadata.addPartitionKeys(fbb, partKeysOffset);
+    }
+    return FbIcebergMetadata.endFbIcebergMetadata(fbb);
+  }
+
+  private static int createPartitionKeys(FlatBufferBuilder fbb, PartitionSpec spec,
+      DataFile df) {
+    Preconditions.checkState(spec.fields().size() == df.partition().size());
+    int[] partitionKeyOffsets = new int[spec.fields().size()];
+    for (int i = 0; i < spec.fields().size(); ++i) {
+      partitionKeyOffsets[i] =
+          createPartitionTransformValue(fbb, spec, df, i);
+    }
+    return FbIcebergMetadata.createPartitionKeysVector(fbb, partitionKeyOffsets);
+  }
+
+  private static int createPartitionTransformValue(FlatBufferBuilder fbb,
+      PartitionSpec spec, DataFile df, int fieldIndex) {
+    PartitionField field = spec.fields().get(fieldIndex);
+    Pair<Byte, Integer> transform = getFbTransform(spec.schema(), field);
+    int valueOffset = -1;
+    if (transform.first != FbIcebergTransformType.VOID) {
+      Object partValue = df.partition().get(fieldIndex, Object.class);
+      valueOffset = fbb.createString(partValue.toString());
+    }
+    FbIcebergPartitionTransformValue.startFbIcebergPartitionTransformValue(fbb);
+    FbIcebergPartitionTransformValue.addTransformType(fbb, transform.first);
+    if (transform.second != null) {
+      FbIcebergPartitionTransformValue.addTransformParam(fbb, transform.second);
+    }
+    if (valueOffset != -1) {
+      FbIcebergPartitionTransformValue.addTransformValue(fbb, valueOffset);
+    }
+    FbIcebergPartitionTransformValue.addSourceId(fbb, field.sourceId());
+    return FbIcebergPartitionTransformValue.endFbIcebergPartitionTransformValue(fbb);
+  }
+
+  private static Pair<Byte, Integer> getFbTransform(Schema schema,
+      PartitionField field) {
+    return PartitionSpecVisitor.visit(
+      schema, field, new PartitionSpecVisitor<Pair<Byte, Integer>>() {
+      @Override
+      public Pair<Byte, Integer> identity(String sourceName, int sourceId) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.IDENTITY, null);
+      }
+
+      @Override
+      public Pair<Byte, Integer> bucket(String sourceName, int sourceId,
+          int numBuckets) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.BUCKET, numBuckets);
+      }
+
+      @Override
+      public Pair<Byte, Integer> truncate(String sourceName, int sourceId,
+          int width) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.TRUNCATE, width);
+      }
+
+      @Override
+      public Pair<Byte, Integer> year(String sourceName, int sourceId) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.YEAR, null);
+      }
+
+      @Override
+      public Pair<Byte, Integer> month(String sourceName, int sourceId) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.MONTH, null);
+      }
+
+      @Override
+      public Pair<Byte, Integer> day(String sourceName, int sourceId) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.DAY, null);
+      }
+
+      @Override
+      public Pair<Byte, Integer> hour(String sourceName, int sourceId) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.HOUR, null);
+      }
+
+      @Override
+      public Pair<Byte, Integer> alwaysNull(int fieldId, String sourceName,
+          int sourceId) {
+        return new Pair<Byte, Integer>(FbIcebergTransformType.VOID, null);
+      }
+    });
+  }
 }
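The buffer handling at the end of createIcebergMetadata(Table, DataFile) can be illustrated without FlatBuffers. The following is a minimal sketch that assumes only `java.nio.ByteBuffer`; the class `TrimBufferSketch` and the helper `copyToExactSize` are hypothetical and not part of the patch. It shows how `slice()` followed by a copy into a buffer of exactly `capacity()` bytes yields a standalone buffer that holds only the remaining payload. Note that, despite the variable name `compressedBb` in the patch, no compression is involved; the copy only drops unused space.

```java
import java.nio.ByteBuffer;

public class TrimBufferSketch {
  // Hypothetical helper: copies the remaining bytes of 'src' into a new buffer
  // whose capacity equals exactly the number of remaining bytes.
  public static ByteBuffer copyToExactSize(ByteBuffer src) {
    ByteBuffer view = src.slice();  // view.capacity() == src.remaining()
    ByteBuffer exact = ByteBuffer.allocate(view.capacity());
    exact.put(view);                // advances exact's position to its limit
    exact.flip();                   // position = 0, limit = payload size
    return exact;
  }

  public static void main(String[] args) {
    // Simulate a builder that fills a larger backing buffer from the end,
    // leaving unused space in front of the payload.
    ByteBuffer backing = ByteBuffer.allocate(64);
    backing.position(60);
    backing.put(new byte[] {1, 2, 3, 4});
    backing.position(60);           // payload is bytes [60, 64)

    ByteBuffer exact = copyToExactSize(backing);
    System.out.println(exact.capacity() + " " + exact.remaining() + " " + exact.get(0));
    // prints "4 4 1"
  }
}
```

Holding the trimmed copy rather than a view also means a long-lived descriptor does not keep the larger backing buffer reachable.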
diff --git a/testdata/data/README b/testdata/data/README
index 98c3bf3..804c5a1 100644
--- a/testdata/data/README
+++ b/testdata/data/README
@@ -693,4 +693,25 @@ Status HdfsParquetTableWriter::WriteFileFooter() {
 +  file_metadata_.schema[1].logicalType.__isset.DECIMAL = false;
 +  file_metadata_.schema[1].__isset.logicalType = false;
 create table my_decimal_tbl (d1 decimal(4,2)) stored as parquet;
-insert into my_decimal_tbl values (cast(0 as decimal(4,2)));
\ No newline at end of file
+insert into my_decimal_tbl values (cast(0 as decimal(4,2)));
+
+iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part:
+iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc:
+Generated by Hive 3.1 + Iceberg 0.11. Then the JSON and AVRO files were manually edited
+to make these tables correspond to an Iceberg table in a HadoopCatalog instead of
+HiveCatalog.
+iceberg_alltypes_part has PARQUET data files, iceberg_alltypes_part_orc has ORC data
+files. They have identity partitions with all the supported data types.
+
+iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution:
+iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc:
+Generated by Hive 3.1 + Iceberg 0.11. Then the JSON and AVRO files were manually edited
+to make these tables correspond to an Iceberg table in a HadoopCatalog instead of
+HiveCatalog.
+iceberg_legacy_partition_schema_evolution has PARQUET data files,
+iceberg_legacy_partition_schema_evolution_orc has ORC data files.
+The tables have the following schema changes since table migration:
+* Partition INT column to BIGINT
+* Partition FLOAT column to DOUBLE
+* Partition DECIMAL(5,3) column to DECIMAL(8,3)
+* Non-partition column has been moved to the end of the schema
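The partition column changes listed above are the kind of evolution that the commit message gives as a reason for storing partition values as human-readable strings. As a rough illustration that assumes nothing beyond the JDK (the class `PartitionValueSketch`, its `ColType` enum, and its `parse` method are hypothetical and are not Impala's scanner code), the same stored string can be interpreted according to whatever type the column has at read time:

```java
import java.math.BigDecimal;

public class PartitionValueSketch {
  // Hypothetical column types, standing in for the table's current schema.
  public enum ColType { INT, BIGINT, FLOAT, DOUBLE, DECIMAL }

  // Interprets the stored string according to the column type at read time.
  // 'scale' is only used for DECIMAL.
  public static Object parse(String stored, ColType type, int scale) {
    switch (type) {
      case INT: return Integer.parseInt(stored);
      case BIGINT: return Long.parseLong(stored);
      case FLOAT: return Float.parseFloat(stored);
      case DOUBLE: return Double.parseDouble(stored);
      case DECIMAL: return new BigDecimal(stored).setScale(scale);
      default: throw new IllegalArgumentException("Unsupported type: " + type);
    }
  }

  public static void main(String[] args) {
    // Value written while the partition column was INT, read back as BIGINT.
    System.out.println(parse("1", ColType.BIGINT, 0));
    // Value written as DECIMAL(5,3), read back as DECIMAL(8,3):
    // same scale, wider precision.
    System.out.println(parse("12.345", ColType.DECIMAL, 3));
  }
}
```

With a binary encoding, the reader would need to know the width of the value at write time; with the string form, only the current column type matters.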
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd-m0.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd-m0.avro
new file mode 100644
index 0000000..046bbba
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd-m0.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro
new file mode 100644
index 0000000..d6d4716
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v1.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v1.metadata.json
new file mode 100644
index 0000000..4f5ad88
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v1.metadata.json
@@ -0,0 +1,164 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "83984c52-5cba-42dd-8e30-8eb0d12b3ba5",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part",
+  "last-updated-ms" : 1645023419407,
+  "last-column-id" : 9,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 2,
+      "name" : "p_bool",
+      "required" : false,
+      "type" : "boolean"
+    }, {
+      "id" : 3,
+      "name" : "p_int",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 4,
+      "name" : "p_bigint",
+      "required" : false,
+      "type" : "long"
+    }, {
+      "id" : 5,
+      "name" : "p_float",
+      "required" : false,
+      "type" : "float"
+    }, {
+      "id" : 6,
+      "name" : "p_double",
+      "required" : false,
+      "type" : "double"
+    }, {
+      "id" : 7,
+      "name" : "p_decimal",
+      "required" : false,
+      "type" : "decimal(6, 3)"
+    }, {
+      "id" : 8,
+      "name" : "p_date",
+      "required" : false,
+      "type" : "date"
+    }, {
+      "id" : 9,
+      "name" : "p_string",
+      "required" : false,
+      "type" : "string"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_bool",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_int",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_bigint",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  }, {
+    "name" : "p_float",
+    "transform" : "identity",
+    "source-id" : 5,
+    "field-id" : 1003
+  }, {
+    "name" : "p_double",
+    "transform" : "identity",
+    "source-id" : 6,
+    "field-id" : 1004
+  }, {
+    "name" : "p_decimal",
+    "transform" : "identity",
+    "source-id" : 7,
+    "field-id" : 1005
+  }, {
+    "name" : "p_date",
+    "transform" : "identity",
+    "source-id" : 8,
+    "field-id" : 1006
+  }, {
+    "name" : "p_string",
+    "transform" : "identity",
+    "source-id" : 9,
+    "field-id" : 1007
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_bool",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_int",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_bigint",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    }, {
+      "name" : "p_float",
+      "transform" : "identity",
+      "source-id" : 5,
+      "field-id" : 1003
+    }, {
+      "name" : "p_double",
+      "transform" : "identity",
+      "source-id" : 6,
+      "field-id" : 1004
+    }, {
+      "name" : "p_decimal",
+      "transform" : "identity",
+      "source-id" : 7,
+      "field-id" : 1005
+    }, {
+      "name" : "p_date",
+      "transform" : "identity",
+      "source-id" : 8,
+      "field-id" : 1006
+    }, {
+      "name" : "p_string",
+      "transform" : "identity",
+      "source-id" : 9,
+      "field-id" : 1007
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "MIGRATED_TO_ICEBERG" : "true",
+    "last_modified_time" : "1645023419",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "parquet",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : -1,
+  "snapshots" : [ ],
+  "snapshot-log" : [ ],
+  "metadata-log" : [ ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v2.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v2.metadata.json
new file mode 100644
index 0000000..ff285b0
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/v2.metadata.json
@@ -0,0 +1,188 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "83984c52-5cba-42dd-8e30-8eb0d12b3ba5",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part",
+  "last-updated-ms" : 1645023419594,
+  "last-column-id" : 9,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 2,
+      "name" : "p_bool",
+      "required" : false,
+      "type" : "boolean"
+    }, {
+      "id" : 3,
+      "name" : "p_int",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 4,
+      "name" : "p_bigint",
+      "required" : false,
+      "type" : "long"
+    }, {
+      "id" : 5,
+      "name" : "p_float",
+      "required" : false,
+      "type" : "float"
+    }, {
+      "id" : 6,
+      "name" : "p_double",
+      "required" : false,
+      "type" : "double"
+    }, {
+      "id" : 7,
+      "name" : "p_decimal",
+      "required" : false,
+      "type" : "decimal(6, 3)"
+    }, {
+      "id" : 8,
+      "name" : "p_date",
+      "required" : false,
+      "type" : "date"
+    }, {
+      "id" : 9,
+      "name" : "p_string",
+      "required" : false,
+      "type" : "string"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_bool",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_int",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_bigint",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  }, {
+    "name" : "p_float",
+    "transform" : "identity",
+    "source-id" : 5,
+    "field-id" : 1003
+  }, {
+    "name" : "p_double",
+    "transform" : "identity",
+    "source-id" : 6,
+    "field-id" : 1004
+  }, {
+    "name" : "p_decimal",
+    "transform" : "identity",
+    "source-id" : 7,
+    "field-id" : 1005
+  }, {
+    "name" : "p_date",
+    "transform" : "identity",
+    "source-id" : 8,
+    "field-id" : 1006
+  }, {
+    "name" : "p_string",
+    "transform" : "identity",
+    "source-id" : 9,
+    "field-id" : 1007
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_bool",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_int",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_bigint",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    }, {
+      "name" : "p_float",
+      "transform" : "identity",
+      "source-id" : 5,
+      "field-id" : 1003
+    }, {
+      "name" : "p_double",
+      "transform" : "identity",
+      "source-id" : 6,
+      "field-id" : 1004
+    }, {
+      "name" : "p_decimal",
+      "transform" : "identity",
+      "source-id" : 7,
+      "field-id" : 1005
+    }, {
+      "name" : "p_date",
+      "transform" : "identity",
+      "source-id" : 8,
+      "field-id" : 1006
+    }, {
+      "name" : "p_string",
+      "transform" : "identity",
+      "source-id" : 9,
+      "field-id" : 1007
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "MIGRATED_TO_ICEBERG" : "true",
+    "last_modified_time" : "1645023419",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "parquet",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "schema.name-mapping.default" : "[ {\n  \"field-id\" : 1,\n  \"names\" : [ \"i\" ]\n}, {\n  \"field-id\" : 2,\n  \"names\" : [ \"p_bool\" ]\n}, {\n  \"field-id\" : 3,\n  \"names\" : [ \"p_int\" ]\n}, {\n  \"field-id\" : 4,\n  \"names\" : [ \"p_bigint\" ]\n}, {\n  \"field-id\" : 5,\n  \"names\" : [ \"p_float\" ]\n}, {\n  \"field-id\" : 6,\n  \"names\" : [ \"p_double\" ]\n}, {\n  \"field-id\" : 7,\n  \"names\" : [ \"p_decimal\" ]\n}, {\n  \"field-id\" : 8,\n  \"names\" : [ \"p_date\" ] [...]
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : 6167994413873848621,
+  "snapshots" : [ {
+    "snapshot-id" : 6167994413873848621,
+    "timestamp-ms" : 1645023419585,
+    "summary" : {
+      "operation" : "append",
+      "added-data-files" : "1",
+      "added-records" : "2",
+      "added-files-size" : "433",
+      "changed-partition-count" : "1",
+      "total-records" : "2",
+      "total-files-size" : "433",
+      "total-data-files" : "1",
+      "total-delete-files" : "0",
+      "total-position-deletes" : "0",
+      "total-equality-deletes" : "0"
+    },
+    "manifest-list" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/metadata/snap-6167994413873848621-1-283c54cb-5a45-4a2c-bca8-4bfa0e61cdbd.avro"
+  } ],
+  "snapshot-log" : [ {
+    "timestamp-ms" : 1645023419585,
+    "snapshot-id" : 6167994413873848621
+  } ],
+  "metadata-log" : [ {
+    "timestamp-ms" : 1645023419407,
+    "metadata-file" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/iceberg_alltypes_part/metadata/v1.metadata.json"
+  } ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0 b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0
new file mode 100644
index 0000000..87ec3f6
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0 differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/db72fbf2-f9f6-4985-8a5f-fd9f632f2c77-m0.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/db72fbf2-f9f6-4985-8a5f-fd9f632f2c77-m0.avro
new file mode 100644
index 0000000..0c09f84
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/db72fbf2-f9f6-4985-8a5f-fd9f632f2c77-m0.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro
new file mode 100644
index 0000000..53ff475
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v1.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v1.metadata.json
new file mode 100644
index 0000000..44b307b
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v1.metadata.json
@@ -0,0 +1,164 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "980a5473-eb57-4f7f-8cc1-612a1a9d46dc",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc",
+  "last-updated-ms" : 1645028769646,
+  "last-column-id" : 9,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 2,
+      "name" : "p_bool",
+      "required" : false,
+      "type" : "boolean"
+    }, {
+      "id" : 3,
+      "name" : "p_int",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 4,
+      "name" : "p_bigint",
+      "required" : false,
+      "type" : "long"
+    }, {
+      "id" : 5,
+      "name" : "p_float",
+      "required" : false,
+      "type" : "float"
+    }, {
+      "id" : 6,
+      "name" : "p_double",
+      "required" : false,
+      "type" : "double"
+    }, {
+      "id" : 7,
+      "name" : "p_decimal",
+      "required" : false,
+      "type" : "decimal(6, 3)"
+    }, {
+      "id" : 8,
+      "name" : "p_date",
+      "required" : false,
+      "type" : "date"
+    }, {
+      "id" : 9,
+      "name" : "p_string",
+      "required" : false,
+      "type" : "string"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_bool",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_int",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_bigint",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  }, {
+    "name" : "p_float",
+    "transform" : "identity",
+    "source-id" : 5,
+    "field-id" : 1003
+  }, {
+    "name" : "p_double",
+    "transform" : "identity",
+    "source-id" : 6,
+    "field-id" : 1004
+  }, {
+    "name" : "p_decimal",
+    "transform" : "identity",
+    "source-id" : 7,
+    "field-id" : 1005
+  }, {
+    "name" : "p_date",
+    "transform" : "identity",
+    "source-id" : 8,
+    "field-id" : 1006
+  }, {
+    "name" : "p_string",
+    "transform" : "identity",
+    "source-id" : 9,
+    "field-id" : 1007
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_bool",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_int",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_bigint",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    }, {
+      "name" : "p_float",
+      "transform" : "identity",
+      "source-id" : 5,
+      "field-id" : 1003
+    }, {
+      "name" : "p_double",
+      "transform" : "identity",
+      "source-id" : 6,
+      "field-id" : 1004
+    }, {
+      "name" : "p_decimal",
+      "transform" : "identity",
+      "source-id" : 7,
+      "field-id" : 1005
+    }, {
+      "name" : "p_date",
+      "transform" : "identity",
+      "source-id" : 8,
+      "field-id" : 1006
+    }, {
+      "name" : "p_string",
+      "transform" : "identity",
+      "source-id" : 9,
+      "field-id" : 1007
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "MIGRATED_TO_ICEBERG" : "true",
+    "last_modified_time" : "1645028769",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "orc",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : -1,
+  "snapshots" : [ ],
+  "snapshot-log" : [ ],
+  "metadata-log" : [ ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v2.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v2.metadata.json
new file mode 100644
index 0000000..e3ba5b0
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v2.metadata.json
@@ -0,0 +1,188 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "980a5473-eb57-4f7f-8cc1-612a1a9d46dc",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc",
+  "last-updated-ms" : 1645028769925,
+  "last-column-id" : 9,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 2,
+      "name" : "p_bool",
+      "required" : false,
+      "type" : "boolean"
+    }, {
+      "id" : 3,
+      "name" : "p_int",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 4,
+      "name" : "p_bigint",
+      "required" : false,
+      "type" : "long"
+    }, {
+      "id" : 5,
+      "name" : "p_float",
+      "required" : false,
+      "type" : "float"
+    }, {
+      "id" : 6,
+      "name" : "p_double",
+      "required" : false,
+      "type" : "double"
+    }, {
+      "id" : 7,
+      "name" : "p_decimal",
+      "required" : false,
+      "type" : "decimal(6, 3)"
+    }, {
+      "id" : 8,
+      "name" : "p_date",
+      "required" : false,
+      "type" : "date"
+    }, {
+      "id" : 9,
+      "name" : "p_string",
+      "required" : false,
+      "type" : "string"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_bool",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_int",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_bigint",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  }, {
+    "name" : "p_float",
+    "transform" : "identity",
+    "source-id" : 5,
+    "field-id" : 1003
+  }, {
+    "name" : "p_double",
+    "transform" : "identity",
+    "source-id" : 6,
+    "field-id" : 1004
+  }, {
+    "name" : "p_decimal",
+    "transform" : "identity",
+    "source-id" : 7,
+    "field-id" : 1005
+  }, {
+    "name" : "p_date",
+    "transform" : "identity",
+    "source-id" : 8,
+    "field-id" : 1006
+  }, {
+    "name" : "p_string",
+    "transform" : "identity",
+    "source-id" : 9,
+    "field-id" : 1007
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_bool",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_int",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_bigint",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    }, {
+      "name" : "p_float",
+      "transform" : "identity",
+      "source-id" : 5,
+      "field-id" : 1003
+    }, {
+      "name" : "p_double",
+      "transform" : "identity",
+      "source-id" : 6,
+      "field-id" : 1004
+    }, {
+      "name" : "p_decimal",
+      "transform" : "identity",
+      "source-id" : 7,
+      "field-id" : 1005
+    }, {
+      "name" : "p_date",
+      "transform" : "identity",
+      "source-id" : 8,
+      "field-id" : 1006
+    }, {
+      "name" : "p_string",
+      "transform" : "identity",
+      "source-id" : 9,
+      "field-id" : 1007
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "MIGRATED_TO_ICEBERG" : "true",
+    "last_modified_time" : "1645028769",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "orc",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "schema.name-mapping.default" : "[ {\n  \"field-id\" : 1,\n  \"names\" : [ \"i\" ]\n}, {\n  \"field-id\" : 2,\n  \"names\" : [ \"p_bool\" ]\n}, {\n  \"field-id\" : 3,\n  \"names\" : [ \"p_int\" ]\n}, {\n  \"field-id\" : 4,\n  \"names\" : [ \"p_bigint\" ]\n}, {\n  \"field-id\" : 5,\n  \"names\" : [ \"p_float\" ]\n}, {\n  \"field-id\" : 6,\n  \"names\" : [ \"p_double\" ]\n}, {\n  \"field-id\" : 7,\n  \"names\" : [ \"p_decimal\" ]\n}, {\n  \"field-id\" : 8,\n  \"names\" : [ \"p_date\" ] [...]
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : 7569365419257304230,
+  "snapshots" : [ {
+    "snapshot-id" : 7569365419257304230,
+    "timestamp-ms" : 1645028769913,
+    "summary" : {
+      "operation" : "append",
+      "added-data-files" : "1",
+      "added-records" : "2",
+      "added-files-size" : "189",
+      "changed-partition-count" : "1",
+      "total-records" : "2",
+      "total-files-size" : "189",
+      "total-data-files" : "1",
+      "total-delete-files" : "0",
+      "total-position-deletes" : "0",
+      "total-equality-deletes" : "0"
+    },
+    "manifest-list" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/snap-7569365419257304230-1-db72fbf2-f9f6-4985-8a5f-fd9f632f2c77.avro"
+  } ],
+  "snapshot-log" : [ {
+    "timestamp-ms" : 1645028769913,
+    "snapshot-id" : 7569365419257304230
+  } ],
+  "metadata-log" : [ {
+    "timestamp-ms" : 1645028769646,
+    "metadata-file" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/v1.metadata.json"
+  } ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/version-hint.text b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/version-hint.text
new file mode 100644
index 0000000..0cfbf08
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/metadata/version-hint.text
@@ -0,0 +1 @@
+2
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0 b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0
new file mode 100644
index 0000000..dc93311
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc/p_bool=true/p_int=1/p_bigint=11/p_float=1.1/p_double=2.222/p_decimal=123.321/p_date=2022-02-22/p_string=impala/000000_0 differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/2d05a7d4-c229-44c3-860e-e77e46e71a19-m0.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/2d05a7d4-c229-44c3-860e-e77e46e71a19-m0.avro
new file mode 100644
index 0000000..1fc8c81
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/2d05a7d4-c229-44c3-860e-e77e46e71a19-m0.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro
new file mode 100644
index 0000000..0dc82f5
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v1.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v1.metadata.json
new file mode 100644
index 0000000..8d0eebb
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v1.metadata.json
@@ -0,0 +1,89 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "40f528e5-19f7-465b-86a6-6943e536b009",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution",
+  "last-updated-ms" : 1645033399611,
+  "last-column-id" : 4,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 2,
+      "name" : "p_int_long",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 3,
+      "name" : "p_float_double",
+      "required" : false,
+      "type" : "float"
+    }, {
+      "id" : 4,
+      "name" : "p_dec_dec",
+      "required" : false,
+      "type" : "decimal(5, 3)"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_int_long",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_float_double",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_dec_dec",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_int_long",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_float_double",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_dec_dec",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "MIGRATED_TO_ICEBERG" : "true",
+    "last_modified_time" : "1645033399",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "parquet",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : -1,
+  "snapshots" : [ ],
+  "snapshot-log" : [ ],
+  "metadata-log" : [ ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v2.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v2.metadata.json
new file mode 100644
index 0000000..8f66e3c
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v2.metadata.json
@@ -0,0 +1,116 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "40f528e5-19f7-465b-86a6-6943e536b009",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution",
+  "last-updated-ms" : 1645033713901,
+  "last-column-id" : 4,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 2,
+      "name" : "p_int_long",
+      "required" : false,
+      "type" : "long",
+      "doc" : "from deserializer"
+    }, {
+      "id" : 3,
+      "name" : "p_float_double",
+      "required" : false,
+      "type" : "double",
+      "doc" : "from deserializer"
+    }, {
+      "id" : 4,
+      "name" : "p_dec_dec",
+      "required" : false,
+      "type" : "decimal(8, 3)",
+      "doc" : "from deserializer"
+    }, {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int",
+      "doc" : "from deserializer"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_int_long",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_float_double",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_dec_dec",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_int_long",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_float_double",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_dec_dec",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "last_modified_time" : "1645033399",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "parquet",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "schema.name-mapping.default" : "[ {\n  \"field-id\" : 1,\n  \"names\" : [ \"i\" ]\n}, {\n  \"field-id\" : 2,\n  \"names\" : [ \"p_int_long\" ]\n}, {\n  \"field-id\" : 3,\n  \"names\" : [ \"p_float_double\" ]\n}, {\n  \"field-id\" : 4,\n  \"names\" : [ \"p_dec_dec\" ]\n} ]",
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : 6654673546382518186,
+  "snapshots" : [ {
+    "snapshot-id" : 6654673546382518186,
+    "timestamp-ms" : 1645033399760,
+    "summary" : {
+      "operation" : "append",
+      "added-data-files" : "2",
+      "added-records" : "2",
+      "added-files-size" : "858",
+      "changed-partition-count" : "2",
+      "total-records" : "2",
+      "total-files-size" : "858",
+      "total-data-files" : "2",
+      "total-delete-files" : "0",
+      "total-position-deletes" : "0",
+      "total-equality-deletes" : "0"
+    },
+    "manifest-list" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/snap-6654673546382518186-1-2d05a7d4-c229-44c3-860e-e77e46e71a19.avro"
+  } ],
+  "snapshot-log" : [ {
+    "timestamp-ms" : 1645033399760,
+    "snapshot-id" : 6654673546382518186
+  } ],
+  "metadata-log" : [ {
+    "timestamp-ms" : 1645033399611,
+    "metadata-file" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/v1.metadata.json"
+  } ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/version-hint.text b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/version-hint.text
new file mode 100644
index 0000000..0cfbf08
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/metadata/version-hint.text
@@ -0,0 +1 @@
+2
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/p_int_long=1/p_float_double=1.1/p_dec_dec=2.718/000000_0 b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/p_int_long=1/p_float_double=1.1/p_dec_dec=2.718/000000_0
new file mode 100644
index 0000000..22cb5d9
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/p_int_long=1/p_float_double=1.1/p_dec_dec=2.718/000000_0 differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/p_int_long=1/p_float_double=1.1/p_dec_dec=3.141/000000_0 b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/p_int_long=1/p_float_double=1.1/p_dec_dec=3.141/000000_0
new file mode 100644
index 0000000..e4e3fad
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution/p_int_long=1/p_float_double=1.1/p_dec_dec=3.141/000000_0 differ
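Annotation (not part of the patch): the v2.metadata.json files above link each partition field to a schema column through "source-id", which refers to the "id" of a field in "schema". The sketch below shows that lookup on a trimmed copy of the iceberg_legacy_partition_schema_evolution metadata. The helper name identity_partition_columns and the trimmed JSON literal are assumptions made for illustration; this is not the code Impala uses to load partition values.

```python
import json

# Trimmed copy of the v2.metadata.json shown above; only the keys used here.
METADATA_JSON = """
{
  "schema": {
    "type": "struct",
    "fields": [
      {"id": 2, "name": "p_int_long", "required": false, "type": "long"},
      {"id": 3, "name": "p_float_double", "required": false, "type": "double"},
      {"id": 4, "name": "p_dec_dec", "required": false, "type": "decimal(8, 3)"},
      {"id": 1, "name": "i", "required": false, "type": "int"}
    ]
  },
  "default-spec-id": 0,
  "partition-specs": [
    {
      "spec-id": 0,
      "fields": [
        {"name": "p_int_long", "transform": "identity", "source-id": 2, "field-id": 1000},
        {"name": "p_float_double", "transform": "identity", "source-id": 3, "field-id": 1001},
        {"name": "p_dec_dec", "transform": "identity", "source-id": 4, "field-id": 1002}
      ]
    }
  ]
}
"""


def identity_partition_columns(metadata):
    """Return (column name, column type) for each identity field of the default spec.

    Hypothetical helper: resolves every partition field's source-id against
    the ids of the schema fields.
    """
    fields_by_id = {field["id"]: field for field in metadata["schema"]["fields"]}
    default_spec = next(
        spec
        for spec in metadata["partition-specs"]
        if spec["spec-id"] == metadata["default-spec-id"]
    )
    return [
        (fields_by_id[part["source-id"]]["name"], fields_by_id[part["source-id"]]["type"])
        for part in default_spec["fields"]
        if part["transform"] == "identity"
    ]


columns = identity_partition_columns(json.loads(METADATA_JSON))
print(columns)
```

The lookup goes through the column id rather than the column position, so it still resolves correctly after column "i" was moved to the end of the schema in v2.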
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/8db62f0e-38e5-434b-94dc-c84210302ad8-m0.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/8db62f0e-38e5-434b-94dc-c84210302ad8-m0.avro
new file mode 100644
index 0000000..5ffdbd7
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/8db62f0e-38e5-434b-94dc-c84210302ad8-m0.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/snap-888589552112488046-1-8db62f0e-38e5-434b-94dc-c84210302ad8.avro b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/snap-888589552112488046-1-8db62f0e-38e5-434b-94dc-c84210302ad8.avro
new file mode 100644
index 0000000..a7eabaa
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/snap-888589552112488046-1-8db62f0e-38e5-434b-94dc-c84210302ad8.avro differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v1.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v1.metadata.json
new file mode 100644
index 0000000..1ff481a
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v1.metadata.json
@@ -0,0 +1,89 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "40f528e5-19f7-465b-86a6-6943e536b009",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc",
+  "last-updated-ms" : 1645033399611,
+  "last-column-id" : 4,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 2,
+      "name" : "p_int_long",
+      "required" : false,
+      "type" : "int"
+    }, {
+      "id" : 3,
+      "name" : "p_float_double",
+      "required" : false,
+      "type" : "float"
+    }, {
+      "id" : 4,
+      "name" : "p_dec_dec",
+      "required" : false,
+      "type" : "decimal(5, 3)"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_int_long",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_float_double",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_dec_dec",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_int_long",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_float_double",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_dec_dec",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "MIGRATED_TO_ICEBERG" : "true",
+    "last_modified_time" : "1645033399",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "orc",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : -1,
+  "snapshots" : [ ],
+  "snapshot-log" : [ ],
+  "metadata-log" : [ ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v2.metadata.json b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v2.metadata.json
new file mode 100644
index 0000000..ab35977
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v2.metadata.json
@@ -0,0 +1,116 @@
+{
+  "format-version" : 1,
+  "table-uuid" : "40f528e5-19f7-465b-86a6-6943e536b009",
+  "location" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc",
+  "last-updated-ms" : 1645033713901,
+  "last-column-id" : 4,
+  "schema" : {
+    "type" : "struct",
+    "fields" : [ {
+      "id" : 2,
+      "name" : "p_int_long",
+      "required" : false,
+      "type" : "long",
+      "doc" : "from deserializer"
+    }, {
+      "id" : 3,
+      "name" : "p_float_double",
+      "required" : false,
+      "type" : "double",
+      "doc" : "from deserializer"
+    }, {
+      "id" : 4,
+      "name" : "p_dec_dec",
+      "required" : false,
+      "type" : "decimal(8, 3)",
+      "doc" : "from deserializer"
+    }, {
+      "id" : 1,
+      "name" : "i",
+      "required" : false,
+      "type" : "int",
+      "doc" : "from deserializer"
+    } ]
+  },
+  "partition-spec" : [ {
+    "name" : "p_int_long",
+    "transform" : "identity",
+    "source-id" : 2,
+    "field-id" : 1000
+  }, {
+    "name" : "p_float_double",
+    "transform" : "identity",
+    "source-id" : 3,
+    "field-id" : 1001
+  }, {
+    "name" : "p_dec_dec",
+    "transform" : "identity",
+    "source-id" : 4,
+    "field-id" : 1002
+  } ],
+  "default-spec-id" : 0,
+  "partition-specs" : [ {
+    "spec-id" : 0,
+    "fields" : [ {
+      "name" : "p_int_long",
+      "transform" : "identity",
+      "source-id" : 2,
+      "field-id" : 1000
+    }, {
+      "name" : "p_float_double",
+      "transform" : "identity",
+      "source-id" : 3,
+      "field-id" : 1001
+    }, {
+      "name" : "p_dec_dec",
+      "transform" : "identity",
+      "source-id" : 4,
+      "field-id" : 1002
+    } ]
+  } ],
+  "default-sort-order-id" : 0,
+  "sort-orders" : [ {
+    "order-id" : 0,
+    "fields" : [ ]
+  } ],
+  "properties" : {
+    "engine.hive.enabled" : "true",
+    "last_modified_time" : "1645033399",
+    "EXTERNAL" : "TRUE",
+    "write.format.default" : "orc",
+    "gc.enabled" : "TRUE",
+    "TRANSLATED_TO_EXTERNAL" : "TRUE",
+    "bucketing_version" : "2",
+    "schema.name-mapping.default" : "[ {\n  \"field-id\" : 1,\n  \"names\" : [ \"i\" ]\n}, {\n  \"field-id\" : 2,\n  \"names\" : [ \"p_int_long\" ]\n}, {\n  \"field-id\" : 3,\n  \"names\" : [ \"p_float_double\" ]\n}, {\n  \"field-id\" : 4,\n  \"names\" : [ \"p_dec_dec\" ]\n} ]",
+    "last_modified_by" : "boroknagyz",
+    "storage_handler" : "org.apache.iceberg.mr.hive.HiveIcebergStorageHandler",
+    "table_type" : "ICEBERG"
+  },
+  "current-snapshot-id" : 6654673546382518186,
+  "snapshots" : [ {
+    "snapshot-id" : 6654673546382518186,
+    "timestamp-ms" : 1645033399760,
+    "summary" : {
+      "operation" : "append",
+      "added-data-files" : "2",
+      "added-records" : "2",
+      "added-files-size" : "858",
+      "changed-partition-count" : "2",
+      "total-records" : "2",
+      "total-files-size" : "858",
+      "total-data-files" : "2",
+      "total-delete-files" : "0",
+      "total-position-deletes" : "0",
+      "total-equality-deletes" : "0"
+    },
+    "manifest-list" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/snap-888589552112488046-1-8db62f0e-38e5-434b-94dc-c84210302ad8.avro"
+  } ],
+  "snapshot-log" : [ {
+    "timestamp-ms" : 1645033399760,
+    "snapshot-id" : 6654673546382518186
+  } ],
+  "metadata-log" : [ {
+    "timestamp-ms" : 1645033399611,
+    "metadata-file" : "/test-warehouse/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/v1.metadata.json"
+  } ]
+}
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/version-hint.text b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/version-hint.text
new file mode 100644
index 0000000..0cfbf08
--- /dev/null
+++ b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/metadata/version-hint.text
@@ -0,0 +1 @@
+2
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/p_int_long=1/p_float_double=1.1/p_dec_dec=2.718/000000_0 b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/p_int_long=1/p_float_double=1.1/p_dec_dec=2.718/000000_0
new file mode 100644
index 0000000..a5c19a0
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/p_int_long=1/p_float_double=1.1/p_dec_dec=2.718/000000_0 differ
diff --git a/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/p_int_long=1/p_float_double=1.1/p_dec_dec=3.141/000000_0 b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/p_int_long=1/p_float_double=1.1/p_dec_dec=3.141/000000_0
new file mode 100644
index 0000000..423367f
Binary files /dev/null and b/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc/p_int_long=1/p_float_double=1.1/p_dec_dec=3.141/000000_0 differ
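Annotation (not part of the patch): the commit message says partition values are stored in their string form so that they can be interpreted under an evolved column type (for example INT -> BIGINT). The sketch below illustrates that idea for the three evolved partition columns of the legacy tables above. The function names and the written_as_float32 flag are hypothetical, and the conversion rules are assumptions for illustration, not Impala's scanner code.

```python
import re
import struct
from decimal import Decimal


def widen_float32(text):
    """Round the parsed value to a 32-bit float, then return it as a 64-bit float."""
    return struct.unpack("<f", struct.pack("<f", float(text)))[0]


def convert_partition_value(text, iceberg_type, written_as_float32=False):
    """Convert a string partition value to the column's current Iceberg type.

    written_as_float32 is a hypothetical flag for values that were written
    while the column was still a FLOAT.
    """
    if iceberg_type in ("int", "long"):
        return int(text)
    if iceberg_type == "double":
        return widen_float32(text) if written_as_float32 else float(text)
    match = re.fullmatch(r"decimal\((\d+),\s*(\d+)\)", iceberg_type)
    if match:
        scale = int(match.group(2))
        # Keep exactly `scale` digits after the decimal point.
        return Decimal(text).quantize(Decimal(10) ** -scale)
    raise ValueError("unsupported type in this sketch: " + iceberg_type)


p_int_long = convert_partition_value("1", "long")
p_float_double = convert_partition_value("1.1", "double", written_as_float32=True)
p_dec_dec = convert_partition_value("2.718", "decimal(8, 3)")
print(p_int_long, p_float_double, p_dec_dec)
```

The widened value of "1.1" is 1.100000023841858, the same number the expected results in iceberg-migrated-tables.test list for p_float_double. It is the 32-bit float nearest to 1.1, shown at double precision.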
diff --git a/testdata/datasets/functional/functional_schema_template.sql b/testdata/datasets/functional/functional_schema_template.sql
index a6e55e1..21d5c16 100644
--- a/testdata/datasets/functional/functional_schema_template.sql
+++ b/testdata/datasets/functional/functional_schema_template.sql
@@ -3179,6 +3179,62 @@ hadoop fs -put -f ${IMPALA_HOME}/testdata/data/iceberg_test/hadoop_catalog/ice/c
 ---- DATASET
 functional
 ---- BASE_TABLE_NAME
+iceberg_alltypes_part
+---- CREATE
+CREATE EXTERNAL TABLE IF NOT EXISTS {db_name}{db_suffix}.{table_name}
+STORED AS ICEBERG
+TBLPROPERTIES('iceberg.catalog'='hadoop.catalog',
+              'iceberg.catalog_location'='/test-warehouse/iceberg_test/hadoop_catalog',
+              'iceberg.table_identifier'='ice.iceberg_alltypes_part');
+---- DEPENDENT_LOAD
+`hadoop fs -mkdir -p /test-warehouse/iceberg_test/hadoop_catalog/ice && \
+hadoop fs -put -f ${IMPALA_HOME}/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part /test-warehouse/iceberg_test/hadoop_catalog/ice
+====
+---- DATASET
+functional
+---- BASE_TABLE_NAME
+iceberg_alltypes_part_orc
+---- CREATE
+CREATE EXTERNAL TABLE IF NOT EXISTS {db_name}{db_suffix}.{table_name}
+STORED AS ICEBERG
+TBLPROPERTIES('write.format.default'='orc', 'iceberg.catalog'='hadoop.catalog',
+              'iceberg.catalog_location'='/test-warehouse/iceberg_test/hadoop_catalog',
+              'iceberg.table_identifier'='ice.iceberg_alltypes_part_orc');
+---- DEPENDENT_LOAD
+`hadoop fs -mkdir -p /test-warehouse/iceberg_test/hadoop_catalog/ice && \
+hadoop fs -put -f ${IMPALA_HOME}/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_alltypes_part_orc /test-warehouse/iceberg_test/hadoop_catalog/ice
+====
+---- DATASET
+functional
+---- BASE_TABLE_NAME
+iceberg_legacy_partition_schema_evolution
+---- CREATE
+CREATE EXTERNAL TABLE IF NOT EXISTS {db_name}{db_suffix}.{table_name}
+STORED AS ICEBERG
+TBLPROPERTIES('iceberg.catalog'='hadoop.catalog',
+              'iceberg.catalog_location'='/test-warehouse/iceberg_test/hadoop_catalog',
+              'iceberg.table_identifier'='ice.iceberg_legacy_partition_schema_evolution');
+---- DEPENDENT_LOAD
+`hadoop fs -mkdir -p /test-warehouse/iceberg_test/hadoop_catalog/ice && \
+hadoop fs -put -f ${IMPALA_HOME}/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution /test-warehouse/iceberg_test/hadoop_catalog/ice
+====
+---- DATASET
+functional
+---- BASE_TABLE_NAME
+iceberg_legacy_partition_schema_evolution_orc
+---- CREATE
+CREATE EXTERNAL TABLE IF NOT EXISTS {db_name}{db_suffix}.{table_name}
+STORED AS ICEBERG
+TBLPROPERTIES('write.format.default'='orc', 'iceberg.catalog'='hadoop.catalog',
+              'iceberg.catalog_location'='/test-warehouse/iceberg_test/hadoop_catalog',
+              'iceberg.table_identifier'='ice.iceberg_legacy_partition_schema_evolution_orc');
+---- DEPENDENT_LOAD
+`hadoop fs -mkdir -p /test-warehouse/iceberg_test/hadoop_catalog/ice && \
+hadoop fs -put -f ${IMPALA_HOME}/testdata/data/iceberg_test/hadoop_catalog/ice/iceberg_legacy_partition_schema_evolution_orc /test-warehouse/iceberg_test/hadoop_catalog/ice
+====
+---- DATASET
+functional
+---- BASE_TABLE_NAME
 airports_orc
 ---- CREATE
 CREATE EXTERNAL TABLE IF NOT EXISTS {db_name}{db_suffix}.{table_name}
diff --git a/testdata/datasets/functional/schema_constraints.csv b/testdata/datasets/functional/schema_constraints.csv
index d34334e..0bb7711 100644
--- a/testdata/datasets/functional/schema_constraints.csv
+++ b/testdata/datasets/functional/schema_constraints.csv
@@ -73,6 +73,10 @@ table_name:iceberg_partitioned, constraint:restrict_to, table_format:parquet/non
 table_name:iceberg_partitioned_orc_external, constraint:restrict_to, table_format:parquet/none/none
 table_name:iceberg_partition_transforms_zorder, constraint:restrict_to, table_format:parquet/none/none
 table_name:iceberg_resolution_test_external, constraint:restrict_to, table_format:parquet/none/none
+table_name:iceberg_alltypes_part, constraint:restrict_to, table_format:parquet/none/none
+table_name:iceberg_alltypes_part_orc, constraint:restrict_to, table_format:parquet/none/none
+table_name:iceberg_legacy_partition_schema_evolution, constraint:restrict_to, table_format:parquet/none/none
+table_name:iceberg_legacy_partition_schema_evolution_orc, constraint:restrict_to, table_format:parquet/none/none
 
 # TODO: Support Avro. Data loading currently fails for Avro because complex types
 # cannot be converted to the corresponding Avro types yet.
diff --git a/testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-tables.test b/testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-tables.test
new file mode 100644
index 0000000..db40813
--- /dev/null
+++ b/testdata/workloads/functional-query/queries/QueryTest/iceberg-migrated-tables.test
@@ -0,0 +1,106 @@
+====
+---- QUERY
+# Read everything from a partitioned table with all the
+# data types that support partitioning, when the underlying
+# data file format is Parquet.
+select * from functional_parquet.iceberg_alltypes_part
+---- RESULTS
+1,true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+2,true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+---- TYPES
+INT, BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, STRING
+====
+---- QUERY
+# Read only the partition columns.
+select p_bool, p_int, p_bigint, p_float,
+       p_double, p_decimal, p_date, p_string
+from functional_parquet.iceberg_alltypes_part;
+---- RESULTS
+true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+---- TYPES
+BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, STRING
+====
+---- QUERY
+# Read everything from a partitioned table with all the
+# data types that support partitioning, when the underlying
+# data file format is ORC.
+select * from functional_parquet.iceberg_alltypes_part_orc
+---- RESULTS
+1,true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+2,true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+---- TYPES
+INT, BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, STRING
+====
+---- QUERY
+# Read only the partition columns.
+select p_bool, p_int, p_bigint, p_float,
+       p_double, p_decimal, p_date, p_string
+from functional_parquet.iceberg_alltypes_part_orc;
+---- RESULTS
+true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+true,1,11,1.100000023841858,2.222,123.321,2022-02-22,'impala'
+---- TYPES
+BOOLEAN, INT, BIGINT, FLOAT, DOUBLE, DECIMAL, DATE, STRING
+====
+---- QUERY
+# Read a migrated partitioned Parquet table that had the following schema
+# changes since table migration:
+# * Partition INT column to BIGINT
+# * Partition FLOAT column to DOUBLE
+# * Partition DECIMAL(5,3) column to DECIMAL(8,3)
+# * Non-partition column has been moved to the end of the schema
+select * from functional_parquet.iceberg_legacy_partition_schema_evolution
+---- RESULTS
+1,1.100000023841858,2.718,2
+1,1.100000023841858,3.141,1
+---- TYPES
+BIGINT, DOUBLE, DECIMAL, INT
+====
+---- QUERY
+# Read only the partition columns.
+select p_int_long, p_float_double, p_dec_dec
+from functional_parquet.iceberg_legacy_partition_schema_evolution;
+---- RESULTS
+1,1.100000023841858,3.141
+1,1.100000023841858,2.718
+---- TYPES
+BIGINT, DOUBLE, DECIMAL
+====
+---- QUERY
+# Read a migrated partitioned ORC table that had the following schema
+# changes since table migration:
+# * Partition INT column to BIGINT
+# * Partition FLOAT column to DOUBLE
+# * Partition DECIMAL(5,3) column to DECIMAL(8,3)
+# * Non-partition column has been moved to the end of the schema
+# Currently this fails due to IMPALA-9410
+select * from functional_parquet.iceberg_legacy_partition_schema_evolution_orc
+---- CATCH
+Parse error in possibly corrupt ORC file
+====
+---- QUERY
+# Read only the partition columns.
+select p_int_long, p_float_double, p_dec_dec
+from functional_parquet.iceberg_legacy_partition_schema_evolution_orc;
+---- RESULTS
+1,1.100000023841858,3.141
+1,1.100000023841858,2.718
+---- TYPES
+BIGINT, DOUBLE, DECIMAL
+====
+---- QUERY
+# Create a table that is identity-partitioned by all of its columns.
+create table only_part_cols (i int, s string)
+partitioned by spec (i, s)
+stored as iceberg;
+insert into only_part_cols values (1, 'i'), (1, 'i'), (2, 's'), (2, 'q');
+select * from only_part_cols;
+---- RESULTS
+1,'i'
+1,'i'
+2,'q'
+2,'s'
+---- TYPES
+INT, STRING
+====
diff --git a/tests/query_test/test_iceberg.py b/tests/query_test/test_iceberg.py
index 45f72d9..7444ff9 100644
--- a/tests/query_test/test_iceberg.py
+++ b/tests/query_test/test_iceberg.py
@@ -99,6 +99,9 @@ class TestIcebergTable(ImpalaTestSuite):
   def test_missing_field_ids(self, vector):
     self.run_test_case('QueryTest/iceberg-missing-field-ids', vector)
 
+  def test_migrated_tables(self, vector, unique_database):
+    self.run_test_case('QueryTest/iceberg-migrated-tables', vector, unique_database)
+
   def test_describe_history(self, vector, unique_database):
     self.run_test_case('QueryTest/iceberg-table-history', vector, use_db=unique_database)