Posted to commits@impala.apache.org by jo...@apache.org on 2019/05/16 21:56:14 UTC

[impala] branch master updated (d4648e8 -> 2e7f689)

This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.


    from d4648e8  IMPALA-4356,IMPALA-7331: codegen all ScalarExprs
     new d873922  Fix condition for starting YARN on USE_CDP_HIVE=true
     new f203dfa  IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3
     new 2e7f689  IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

The 3 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 common/thrift/CatalogObjects.thrift                |  9 +++
 .../hadoop/hive/common/ValidWriteIdList.java       | 74 ++++++++++++++++++++++
 .../org/apache/impala/compat/MetastoreShim.java    | 40 ++++++++++++
 .../org/apache/impala/compat/MetastoreShim.java    | 53 +++++++++++++++-
 .../apache/impala/analysis/StmtMetadataLoader.java | 17 +++++
 .../org/apache/impala/catalog/DataSourceTable.java | 11 +++-
 .../org/apache/impala/catalog/FeCatalogUtils.java  |  2 +
 .../org/apache/impala/catalog/FeFsPartition.java   |  7 ++
 .../java/org/apache/impala/catalog/FeTable.java    | 10 +++
 .../org/apache/impala/catalog/HdfsPartition.java   | 16 +++++
 .../java/org/apache/impala/catalog/HdfsTable.java  |  2 +
 .../main/java/org/apache/impala/catalog/Table.java | 55 ++++++++++++++++
 .../impala/catalog/local/LocalFsPartition.java     |  6 ++
 .../apache/impala/catalog/local/LocalTable.java    | 11 +++-
 .../impala/analysis/StmtMetadataLoaderTest.java    | 38 +++++++++++
 testdata/bin/run-hive-server.sh                    |  2 +-
 testdata/cluster/admin                             |  2 +-
 tests/authorization/test_owner_privileges.py       |  2 +
 tests/common/skip.py                               |  7 +-
 19 files changed, 358 insertions(+), 6 deletions(-)
 create mode 100644 fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java


[impala] 03/03: IMPALA-8438: Store WriteId and ValidWriteId list for table and partition

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 2e7f689e26510a66e701fb7767a6ee2e7a097963
Author: Yongzhi Chen <yc...@cloudera.com>
AuthorDate: Thu May 2 12:24:39 2019 -0400

    IMPALA-8438: Store WriteId and ValidWriteId list for table and partition
    
    These are stored when tables load their metadata from HMS.
    Add MetastoreShim functions to wrap HMS3-only APIs.
    Add valid write ID lists to the query profile through the timeline.
    
    Tests:
    Manually tested against HMS2 and HMS3, using log files to verify.
    Unit tests run against HMS3.
    
    TODO:
    WriteId and valid writeIds could be fetched at other points in time;
    this needs more study.
    
    Profile example:
        Query Compilation: 5s057ms
           - Metadata load started: 63.006ms (63.006ms)
           - Metadata load finished. loaded-tables=2/2...: 4s801ms (4s738ms)
           - Loaded ValidWriteIdLists:
               acid.insert_only_no_partitions:6:9223372036854775807::
               acid.insert_only_with_partitions:3:9223372036854775807::
                 : 4s921ms (120.580ms)
           - Analysis finished: 4s929ms (8.013ms)
    Change-Id: I6edbd64424edf0ba88af110ab8b958a1966b8b54
    Reviewed-on: http://gerrit.cloudera.org:8080/13215
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 common/thrift/CatalogObjects.thrift                |  9 +++
 .../hadoop/hive/common/ValidWriteIdList.java       | 74 ++++++++++++++++++++++
 .../org/apache/impala/compat/MetastoreShim.java    | 40 ++++++++++++
 .../org/apache/impala/compat/MetastoreShim.java    | 53 +++++++++++++++-
 .../apache/impala/analysis/StmtMetadataLoader.java | 17 +++++
 .../org/apache/impala/catalog/DataSourceTable.java | 11 +++-
 .../org/apache/impala/catalog/FeCatalogUtils.java  |  2 +
 .../org/apache/impala/catalog/FeFsPartition.java   |  7 ++
 .../java/org/apache/impala/catalog/FeTable.java    | 10 +++
 .../org/apache/impala/catalog/HdfsPartition.java   | 16 +++++
 .../java/org/apache/impala/catalog/HdfsTable.java  |  2 +
 .../main/java/org/apache/impala/catalog/Table.java | 55 ++++++++++++++++
 .../impala/catalog/local/LocalFsPartition.java     |  6 ++
 .../apache/impala/catalog/local/LocalTable.java    | 11 +++-
 .../impala/analysis/StmtMetadataLoaderTest.java    | 38 +++++++++++
 15 files changed, 348 insertions(+), 3 deletions(-)

diff --git a/common/thrift/CatalogObjects.thrift b/common/thrift/CatalogObjects.thrift
index 4682fb7..64bd4cf 100644
--- a/common/thrift/CatalogObjects.thrift
+++ b/common/thrift/CatalogObjects.thrift
@@ -303,6 +303,9 @@ struct THdfsPartition {
   // Set to true if partition_stats contain intermediate column stats computed via
   // incremental statistics, false otherwise.
   19: optional bool has_incremental_stats
+
+  // For ACID tables, stores the last committed write id.
+  20: optional i64 write_id
 }
 
 // Constant partition ID used for THdfsPartition.prototype_partition below.
@@ -472,6 +475,12 @@ struct TTable {
 
   // Set iff this a kudu table
   13: optional TKuduTable kudu_table
+
+  // Set iff this is an ACID table. Stores the valid write IDs list.
+  // The string is assumed to be created by ValidWriteIdList.writeToString.
+  // For example, a ValidReaderWriteIdList object's format is:
+  // <table_name>:<highwatermark>:<minOpenWriteId>:<open_writeids>:<abort_writeids>
+  14: optional string valid_write_ids
 }
 
 // Represents a database.
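
A minimal sketch (not part of this patch) of how the serialized format
described above behaves, using only classes and methods that appear in this
change. The example string is taken from the profile in the commit message;
the class name ValidWriteIdsFormatDemo is hypothetical.

  import org.apache.hadoop.hive.common.ValidReaderWriteIdList;
  import org.apache.hadoop.hive.common.ValidWriteIdList;

  public class ValidWriteIdsFormatDemo {
    public static void main(String[] args) {
      // <table_name>:<highwatermark>:<minOpenWriteId>:<open_writeids>:<abort_writeids>
      // A minOpenWriteId of Long.MAX_VALUE means no write id is currently open.
      String serialized = "acid.insert_only_no_partitions:6:9223372036854775807::";
      ValidWriteIdList ids = new ValidReaderWriteIdList(serialized);
      System.out.println(ids.getHighWatermark());  // 6
      System.out.println(ids.isWriteIdValid(6));   // true: committed, neither open nor aborted
      System.out.println(ids.writeToString());     // re-serializes to the same format
    }
  }
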
diff --git a/fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java b/fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java
new file mode 100644
index 0000000..1257d1e
--- /dev/null
+++ b/fe/src/compat-hive-2/java/org/apache/hadoop/hive/common/ValidWriteIdList.java
@@ -0,0 +1,74 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+package org.apache.hadoop.hive.common;
+
+/**
+ * ValidWriteIdList is not supported in Hive 2
+ */
+public class ValidWriteIdList {
+  enum RangeResponse {NONE, SOME, ALL};
+
+  public boolean isWriteIdValid(long writeId) {
+    throw new UnsupportedOperationException("isWriteIdValid not supported for "
+        + getClass().getName());
+  }
+
+  public boolean isValidBase(long writeId) {
+    throw new UnsupportedOperationException("isValidBase not supported for "
+        + getClass().getName());
+  }
+
+  public RangeResponse isWriteIdRangeValid(long minWriteId, long maxWriteId) {
+    throw new UnsupportedOperationException("isWriteIdRangeValid not supported for "
+        + getClass().getName());
+  }
+
+  public String writeToString() {
+    throw new UnsupportedOperationException("writeToStringd not supported for "
+        + getClass().getName());
+  }
+
+  public void readFromString(String src) {
+    throw new UnsupportedOperationException("readFromString not supported for "
+        + getClass().getName());
+  }
+
+  public long getHighWatermark() {
+    throw new UnsupportedOperationException("getHighWatermark not supported for "
+        + getClass().getName());
+  }
+
+  public long[] getInvalidWriteIds() {
+    throw new UnsupportedOperationException("getInvalidWriteIds not supported for "
+        + getClass().getName());
+  }
+
+  public boolean isWriteIdAborted(long writeId) {
+    throw new UnsupportedOperationException("isWriteIdAborted not supported for "
+        + getClass().getName());
+  }
+
+  public RangeResponse isWriteIdRangeAborted(long minWriteId, long maxWriteId) {
+    throw new UnsupportedOperationException(
+        "isWriteIdRangeAborted not supported for " + getClass().getName());
+  }
+
+  public Long getMinOpenWriteId() {
+    throw new UnsupportedOperationException("getMinOpenWriteId not supported for "
+        + getClass().getName());
+  }
+}
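
The stub above exists so the frontend compiles against Hive 2 even though
write IDs are a Hive-3 concept; callers are expected to version-guard these
calls so the UnsupportedOperationExceptions are never thrown. A sketch of
that caller-side pattern (assumed, mirroring how HdfsTable and Table use the
shim later in this patch; the helper name is mine):

  import org.apache.hadoop.hive.common.ValidWriteIdList;
  import org.apache.hadoop.hive.metastore.IMetaStoreClient;
  import org.apache.impala.compat.MetastoreShim;
  import org.apache.thrift.TException;

  static String loadValidWriteIdsIfSupported(IMetaStoreClient client,
      String tableFullName) throws TException {
    // On Hive 2 the shim stubs would throw, so bail out by version first.
    if (MetastoreShim.getMajorVersion() <= 2) return null;
    ValidWriteIdList ids = MetastoreShim.fetchValidWriteIds(client, tableFullName);
    return ids == null ? null : ids.writeToString();
  }
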
diff --git a/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java b/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
index 8ceb127..c79dca9 100644
--- a/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
+++ b/fe/src/compat-hive-2/java/org/apache/impala/compat/MetastoreShim.java
@@ -28,6 +28,7 @@ import java.util.List;
 
 import org.apache.hadoop.hive.common.FileUtils;
 import org.apache.hadoop.hive.common.StatsSetupConst;
+import org.apache.hadoop.hive.common.ValidWriteIdList;
 import org.apache.hadoop.hive.ql.metadata.formatting.MetaDataFormatUtils;
 import org.apache.hadoop.hive.conf.HiveConf;
 import org.apache.hadoop.hive.metastore.IMetaStoreClient;
@@ -222,4 +223,43 @@ public class MetastoreShim {
   public static String unescapeSQLString(String normalizedStringLiteral) {
     return BaseSemanticAnalyzer.unescapeSQLString(normalizedStringLiteral);
   }
+
+  /**
+   * This is a Hive-3-only function.
+   */
+  public static ValidWriteIdList fetchValidWriteIds(IMetaStoreClient client,
+      String tableFullName) {
+    throw new UnsupportedOperationException("fetchValidWriteIds not supported");
+  }
+
+  /**
+   * Hive-3-only function.
+   */
+  public static ValidWriteIdList getValidWriteIdListFromString(String validWriteIds) {
+    throw new UnsupportedOperationException(
+        "getValidWriteIdListFromString not supported");
+  }
+
+  /**
+   * Hive-3-only function;
+   * returns -1, meaning undefined.
+   */
+  public static long getWriteIdFromMSPartition(Partition partition) {
+    return -1L;
+  }
+
+  /**
+   * Hive-3-only function;
+   * returns -1, meaning undefined.
+   */
+  public static long getWriteIdFromMSTable(Table msTbl) {
+    return -1L;
+  }
+
+  /**
+   * @return the shim version.
+   */
+  public static long getMajorVersion() {
+    return 2;
+  }
 }
diff --git a/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java b/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
index ee22c2b..ea194a7 100644
--- a/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
+++ b/fe/src/compat-hive-3/java/org/apache/impala/compat/MetastoreShim.java
@@ -26,6 +26,8 @@ import java.util.EnumSet;
 import java.util.List;
 
 import org.apache.hadoop.hive.common.StatsSetupConst;
+import org.apache.hadoop.hive.common.ValidReaderWriteIdList;
+import org.apache.hadoop.hive.common.ValidWriteIdList;
 import org.apache.hadoop.hive.metastore.IMetaStoreClient;
 import org.apache.hadoop.hive.metastore.TableType;
 import org.apache.hadoop.hive.metastore.Warehouse;
@@ -57,9 +59,11 @@ import org.apache.impala.thrift.TMetadataOpRequest;
 import org.apache.impala.thrift.TResultSet;
 import org.apache.thrift.TException;
 
+import com.google.common.base.Preconditions;
+
 /**
  * A wrapper around some of Hive's Metastore API's to abstract away differences
- * between major versions of Hive. This implements the shimmed methods for Hive 2.
+ * between major versions of Hive. This implements the shimmed methods for Hive 3.
  */
 public class MetastoreShim {
   /**
@@ -364,4 +368,51 @@ public class MetastoreShim {
       return sb.toString();
     }
   }
+  /**
+   * Get valid write ids from HMS for the acid table
+   * @param client the client to access HMS
+   * @param tableFullName the name for the table
+   * @return ValidWriteIdList object
+   */
+  public static ValidWriteIdList fetchValidWriteIds(IMetaStoreClient client,
+      String tableFullName) throws TException {
+    return client.getValidWriteIds(tableFullName);
+  }
+
+  /**
+   * Get a ValidWriteIdList object from the given string.
+   * @param validWriteIds a ValidWriteIdList serialized as a String
+   * @return ValidWriteIdList object
+   */
+  public static ValidWriteIdList getValidWriteIdListFromString(String validWriteIds) {
+    Preconditions.checkNotNull(validWriteIds);
+    return new ValidReaderWriteIdList(validWriteIds);
+  }
+
+  /**
+   * Wrapper around the HMS Partition object to get its writeId.
+   * WriteIds were introduced in ACID 2 and are
+   * used to detect changes to the partition.
+   */
+  public static long getWriteIdFromMSPartition(Partition partition) {
+    Preconditions.checkNotNull(partition);
+    return partition.getWriteId();
+  }
+
+  /**
+   * Wrapper around the HMS Table object to get its writeId.
+   * Per-table writeIds were introduced in ACID 2 and are
+   * used to detect changes to the table.
+   */
+  public static long getWriteIdFromMSTable(Table msTbl) {
+    Preconditions.checkNotNull(msTbl);
+    return msTbl.getWriteId();
+  }
+
+  /**
+   * @return the shim major version
+   */
+  public static long getMajorVersion() {
+    return 3;
+  }
 }
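
A sketch (my reading, not code from the patch) of the round trip these shim
functions set up: catalogd fetches the list from HMS, serializes it with
writeToString() into TTable.valid_write_ids, and the receiving side rebuilds
the object with getValidWriteIdListFromString(). "client" and the table name
are placeholders:

  static void roundTrip(IMetaStoreClient client) throws TException {
    ValidWriteIdList fetched =
        MetastoreShim.fetchValidWriteIds(client, "acid.insert_only_no_partitions");
    String wire = fetched.writeToString();  // travels in TTable.valid_write_ids
    ValidWriteIdList parsed = MetastoreShim.getValidWriteIdListFromString(wire);
    // The parsed copy answers validity queries identically to the fetched one.
    assert parsed.getHighWatermark() == fetched.getHighWatermark();
  }
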
diff --git a/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java b/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
index 32902e3..1898da5 100644
--- a/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
+++ b/fe/src/main/java/org/apache/impala/analysis/StmtMetadataLoader.java
@@ -26,9 +26,13 @@ import java.util.Set;
 
 import org.apache.impala.catalog.FeCatalog;
 import org.apache.impala.catalog.FeDb;
+import org.apache.impala.catalog.FeFsTable;
 import org.apache.impala.catalog.FeTable;
 import org.apache.impala.catalog.FeView;
+import org.apache.impala.catalog.HdfsPartition;
+import org.apache.impala.catalog.PrunablePartition;
 import org.apache.impala.common.InternalException;
+import org.apache.impala.compat.MetastoreShim;
 import org.apache.impala.service.Frontend;
 import org.apache.impala.util.EventSequence;
 import org.apache.impala.util.TUniqueIdUtil;
@@ -227,8 +231,21 @@ public class StmtMetadataLoader {
           "loaded-tables=%d/%d load-requests=%d catalog-updates=%d",
           requestedTbls.size(), loadedTbls_.size(), numLoadRequestsSent_,
           numCatalogUpdatesReceived_));
+
+      if (MetastoreShim.getMajorVersion() > 2) {
+        StringBuilder validIdsBuf = new StringBuilder("Loaded ValidWriteIdLists: ");
+        for (FeTable iTbl : loadedTbls_.values()) {
+          validIdsBuf.append("\n");
+          validIdsBuf.append("           ");
+          validIdsBuf.append(iTbl.getValidWriteIds());
+        }
+        validIdsBuf.append("\n");
+        validIdsBuf.append("             ");
+        timeline_.markEvent(validIdsBuf.toString());
+      }
     }
     fe_.getImpaladTableUsageTracker().recordTableUsage(loadedTbls_.keySet());
+
     return new StmtTableCache(catalog, dbs_, loadedTbls_);
   }
 
diff --git a/fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java b/fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
index 7d700b4..ee90c6f 100644
--- a/fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/DataSourceTable.java
@@ -24,7 +24,6 @@ import org.apache.hadoop.hive.metastore.IMetaStoreClient;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.slf4j.Logger;
 import org.slf4j.LoggerFactory;
-
 import org.apache.impala.extdatasource.v1.ExternalDataSource;
 import org.apache.impala.thrift.TCatalogObjectType;
 import org.apache.impala.thrift.TColumn;
@@ -108,6 +107,16 @@ public class DataSourceTable extends Table implements FeDataSourceTable {
     return isSupportedPrimitiveType(colType.getPrimitiveType());
   }
 
+  @Override
+  public long getWriteId() {
+    return -1;
+  }
+
+  @Override
+  public String getValidWriteIds() {
+    return null;
+  }
+
   /**
    * Returns true if the primitive type is supported.
    */
diff --git a/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java b/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
index 7e55d73..8ab8671 100644
--- a/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
+++ b/fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
@@ -330,6 +330,8 @@ public abstract class FeCatalogUtils {
         sd.getBlockSize());
     thriftHdfsPart.setId(part.getId());
     thriftHdfsPart.setLocation(part.getLocationAsThrift());
+    if (part.getWriteId() >= 0)
+      thriftHdfsPart.setWrite_id(part.getWriteId());
     if (type == ThriftObjectType.FULL) {
       thriftHdfsPart.setStats(new TTableStats(part.getNumRows()));
       thriftHdfsPart.setAccess_level(part.getAccessLevel());
diff --git a/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java b/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
index 00c90af..60cbb7a 100644
--- a/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
+++ b/fe/src/main/java/org/apache/impala/catalog/FeFsPartition.java
@@ -167,4 +167,11 @@ public interface FeFsPartition {
    * getPartitionStatsCompressed().
    */
   Map<String, String> getParameters();
+
+  /**
+   * @return the writeId stored in HMS for the partition;
+   * -1 means the writeId is undefined.
+   */
+  long getWriteId();
+
 }
diff --git a/fe/src/main/java/org/apache/impala/catalog/FeTable.java b/fe/src/main/java/org/apache/impala/catalog/FeTable.java
index 4c4c37d..ca56ca7 100644
--- a/fe/src/main/java/org/apache/impala/catalog/FeTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/FeTable.java
@@ -136,4 +136,14 @@ public interface FeTable {
    */
   TTableDescriptor toThriftDescriptor(int tableId, Set<Long> referencedPartitions);
 
+  /**
+   * @return the write id for this table
+   */
+  long getWriteId();
+
+  /**
+   * @return the valid write id list for this table
+   */
+  String getValidWriteIds();
+
 }
diff --git a/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java b/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
index a97d0eb..8832ce8 100644
--- a/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
+++ b/fe/src/main/java/org/apache/impala/catalog/HdfsPartition.java
@@ -41,6 +41,7 @@ import org.apache.impala.common.FileSystemUtil;
 import org.apache.impala.common.ImpalaException;
 import org.apache.impala.common.Pair;
 import org.apache.impala.common.Reference;
+import org.apache.impala.compat.MetastoreShim;
 import org.apache.impala.fb.FbCompression;
 import org.apache.impala.fb.FbFileBlock;
 import org.apache.impala.fb.FbFileDesc;
@@ -613,6 +614,10 @@ public class HdfsPartition implements FeFsPartition, PrunablePartition {
   // True if partitionStats_ has intermediate_col_stats populated.
   private boolean hasIncrementalStats_ ;
 
+  // The last committed write ID which modified this partition.
+  // -1 means writeId_ is irrelevant (not supported).
+  private long writeId_ = -1L;
+
   private HdfsPartition(HdfsTable table,
       org.apache.hadoop.hive.metastore.api.Partition msPartition,
       List<LiteralExpr> partitionKeyValues,
@@ -641,6 +646,9 @@ public class HdfsPartition implements FeFsPartition, PrunablePartition {
     extractAndCompressPartStats();
     // Intern parameters after removing the incremental stats
     hmsParameters_ = CatalogInterners.internParameters(hmsParameters_);
+    if (MetastoreShim.getMajorVersion() > 2 && msPartition != null) {
+      writeId_ = MetastoreShim.getWriteIdFromMSPartition(msPartition);
+    }
   }
 
   public HdfsPartition(HdfsTable table,
@@ -1014,6 +1022,9 @@ public class HdfsPartition implements FeFsPartition, PrunablePartition {
       partition.partitionStats_ = thriftPartition.getPartition_stats();
     }
 
+    partition.writeId_ = thriftPartition.isSetWrite_id() ?
+        thriftPartition.getWrite_id() : -1L;
+
     return partition;
   }
 
@@ -1054,4 +1065,9 @@ public class HdfsPartition implements FeFsPartition, PrunablePartition {
     }
     return 0;
   }
+
+  @Override
+  public long getWriteId() {
+    return writeId_;
+  }
 }
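
The two hunks above, together with the FeCatalogUtils change, establish a
convention worth spelling out: the optional thrift field is only set when
the writeId is defined, and deserialization restores -1 when the field is
unset. A compact sketch of that convention (helper names are mine;
THdfsPartition is the generated class from CatalogObjects.thrift):

  // Serialize: leave the optional field unset when the writeId is undefined.
  static void writeIdToThrift(THdfsPartition thriftPart, long writeId) {
    if (writeId >= 0) thriftPart.setWrite_id(writeId);
  }

  // Deserialize: map "unset" back to the in-memory sentinel -1.
  static long writeIdFromThrift(THdfsPartition thriftPart) {
    return thriftPart.isSetWrite_id() ? thriftPart.getWrite_id() : -1L;
  }
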
diff --git a/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java b/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
index 8ff3445..b4d2531 100644
--- a/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/HdfsTable.java
@@ -921,6 +921,8 @@ public class HdfsTable extends Table implements FeFsTable {
                 MetaStoreUtil.getNullPartitionKeyValue(client).intern();
             loadSchema(msTbl);
             loadAllColumnStats(client);
+            // TODO: writeIds may also be loaded in other code paths.
+            loadValidWriteIdList(client);
         }
         // Load partition and file metadata
         if (reuseMetadata) {
diff --git a/fe/src/main/java/org/apache/impala/catalog/Table.java b/fe/src/main/java/org/apache/impala/catalog/Table.java
index 24d9917..373da49 100644
--- a/fe/src/main/java/org/apache/impala/catalog/Table.java
+++ b/fe/src/main/java/org/apache/impala/catalog/Table.java
@@ -27,12 +27,14 @@ import java.util.Set;
 import java.util.concurrent.atomic.AtomicLong;
 import java.util.concurrent.locks.ReentrantLock;
 
+import org.apache.hadoop.hive.common.ValidWriteIdList;
 import org.apache.hadoop.hive.metastore.IMetaStoreClient;
 import org.apache.hadoop.hive.metastore.TableType;
 import org.apache.hadoop.hive.metastore.api.ColumnStatisticsData;
 import org.apache.hadoop.hive.metastore.api.ColumnStatisticsObj;
 import org.apache.hadoop.hive.metastore.api.FieldSchema;
 import org.apache.impala.analysis.TableName;
+import org.apache.impala.compat.MetastoreShim;
 import org.apache.impala.common.ImpalaRuntimeException;
 import org.apache.impala.common.Metrics;
 import org.apache.impala.common.Pair;
@@ -116,6 +118,9 @@ public abstract class Table extends CatalogObjectImpl implements FeTable {
   // impalad.
   protected long lastUsedTime_;
 
+  // Valid write id list for this table
+  protected String validWriteIds_ = null;
+
   // maximum number of catalog versions to store for in-flight events for this table
   private static final int MAX_NUMBER_OF_INFLIGHT_EVENTS = 10;
 
@@ -269,6 +274,41 @@ public abstract class Table extends CatalogObjectImpl implements FeTable {
   }
 
   /**
+   * Get valid write ids for the acid table.
+   * @param client the client to access HMS
+   * @return the list of valid write IDs for the table
+   */
+  protected String fetchValidWriteIds(IMetaStoreClient client)
+      throws TableLoadingException {
+    String tblFullName = getFullName();
+    if (LOG.isTraceEnabled()) LOG.trace("Get valid writeIds for table: " + tblFullName);
+    String writeIds = null;
+    try {
+      ValidWriteIdList validWriteIds = MetastoreShim.fetchValidWriteIds(client,
+          tblFullName);
+      writeIds = validWriteIds == null ? null : validWriteIds.writeToString();
+    } catch (Exception e) {
+      throw new TableLoadingException(String.format("Error loading ValidWriteIds for " +
+          "table '%s'", getName()), e);
+    }
+    if (LOG.isTraceEnabled()) {
+      LOG.trace("Valid writeIds: " + writeIds);
+    }
+    return writeIds;
+  }
+
+  /**
+   * Sets validWriteIds_ with the valid write ID list fetched from HMS.
+   * @param client the client to access HMS
+   */
+  protected void loadValidWriteIdList(IMetaStoreClient client)
+      throws TableLoadingException {
+    if (MetastoreShim.getMajorVersion() > 2) {
+      validWriteIds_ = fetchValidWriteIds(client);
+    }
+  }
+
+  /**
    * Creates a table of the appropriate type based on the given hive.metastore.api.Table
    * object.
    */
@@ -347,6 +387,8 @@ public abstract class Table extends CatalogObjectImpl implements FeTable {
         TAccessLevel.READ_WRITE;
 
     storedInImpaladCatalogCache_ = true;
+    validWriteIds_ = thriftTable.isSetValid_write_ids() ?
+        thriftTable.getValid_write_ids() : null;
   }
 
   /**
@@ -396,6 +438,9 @@ public abstract class Table extends CatalogObjectImpl implements FeTable {
 
     table.setMetastore_table(getMetaStoreTable());
     table.setTable_stats(tableStats_);
+    if (validWriteIds_ != null) {
+      table.setValid_write_ids(validWriteIds_);
+    }
     return table;
   }
 
@@ -646,4 +691,14 @@ public abstract class Table extends CatalogObjectImpl implements FeTable {
     versionsForInflightEvents_.add(versionNumber);
     return true;
   }
+
+  @Override
+  public long getWriteId() {
+    return MetastoreShim.getWriteIdFromMSTable(msTable_);
+  }
+
+  @Override
+  public String getValidWriteIds() {
+    return validWriteIds_;
+  }
 }
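
These accessors establish the invariant that the new unit test (further
below) asserts: a loaded ACID table's own writeId must be valid within its
stored list. A minimal standalone check, assuming "tbl" is a loaded FeTable
for an ACID table on Hive 3:

  String ids = tbl.getValidWriteIds();
  if (ids != null) {
    boolean covered = MetastoreShim.getValidWriteIdListFromString(ids)
        .isWriteIdValid(tbl.getWriteId());
  }
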
diff --git a/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java b/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
index 140ff6b..5f7ebcc 100644
--- a/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
+++ b/fe/src/main/java/org/apache/impala/catalog/local/LocalFsPartition.java
@@ -34,6 +34,7 @@ import org.apache.impala.catalog.HdfsStorageDescriptor;
 import org.apache.impala.catalog.HdfsStorageDescriptor.InvalidStorageDescriptorException;
 import org.apache.impala.catalog.PartitionStatsUtil;
 import org.apache.impala.common.FileSystemUtil;
+import org.apache.impala.compat.MetastoreShim;
 import org.apache.impala.thrift.TAccessLevel;
 import org.apache.impala.thrift.THdfsPartitionLocation;
 import org.apache.impala.thrift.TPartitionStats;
@@ -218,4 +219,9 @@ public class LocalFsPartition implements FeFsPartition {
   public Map<String, String> getParameters() {
     return msPartition_.getParameters();
   }
+
+  @Override
+  public long getWriteId() {
+    return MetastoreShim.getWriteIdFromMSPartition(msPartition_);
+  }
 }
diff --git a/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java b/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
index c544890..edb0e06 100644
--- a/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
+++ b/fe/src/main/java/org/apache/impala/catalog/local/LocalTable.java
@@ -247,6 +247,16 @@ abstract class LocalTable implements FeTable {
     return tableStats_;
   }
 
+  @Override
+  public long getWriteId() {
+    return -1L;
+  }
+
+  @Override
+  public String getValidWriteIds() {
+    return null;
+  }
+
   protected void loadColumnStats() {
     try {
       List<ColumnStatisticsObj> stats = db_.getCatalog().getMetaProvider()
@@ -325,7 +335,6 @@ abstract class LocalTable implements FeTable {
       return Column.toColumnNames(colsByPos_);
     }
 
-
     private static StructType columnsToStructType(List<Column> cols) {
       List<StructField> fields = Lists.newArrayListWithCapacity(cols.size());
       for (Column col : cols) {
diff --git a/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java b/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
index 65dde6b..b429edc 100644
--- a/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
+++ b/fe/src/test/java/org/apache/impala/analysis/StmtMetadataLoaderTest.java
@@ -25,10 +25,12 @@ import org.apache.impala.catalog.Catalog;
 import org.apache.impala.catalog.FeTable;
 import org.apache.impala.common.ImpalaException;
 import org.apache.impala.common.InternalException;
+import org.apache.impala.compat.MetastoreShim;
 import org.apache.impala.service.Frontend;
 import org.apache.impala.testutil.ImpaladTestCatalog;
 import org.apache.impala.util.EventSequence;
 import org.junit.Assert;
+import org.junit.Ignore;
 import org.junit.Test;
 
 public class StmtMetadataLoaderTest {
@@ -56,6 +58,19 @@ public class StmtMetadataLoaderTest {
     }
   }
 
+  private void testLoadAcidTables(String stmtStr)
+      throws ImpalaException {
+    try (ImpaladTestCatalog catalog = new ImpaladTestCatalog()) {
+      Frontend fe = new Frontend(new NoopAuthorizationFactory(), catalog);
+      StatementBase stmt = Parser.parse(stmtStr);
+      EventSequence timeline = new EventSequence("Test Timeline");
+      StmtMetadataLoader mdLoader =
+          new StmtMetadataLoader(fe, Catalog.DEFAULT_DB, timeline);
+      StmtTableCache stmtTableCache = mdLoader.loadTables(stmt);
+      validateTablesWriteIds(stmtTableCache);
+    }
+  }
+
   private void validateDbs(StmtTableCache stmtTableCache, String[] expectedDbs) {
     String[] actualDbs = new String[stmtTableCache.dbs.size()];
     actualDbs = stmtTableCache.dbs.toArray(actualDbs);
@@ -76,6 +91,17 @@ public class StmtMetadataLoaderTest {
     Assert.assertArrayEquals(expectedTables, actualTables);
   }
 
+  private void validateTablesWriteIds(StmtTableCache stmtTableCache) {
+    for (FeTable t: stmtTableCache.tables.values()) {
+      // TODO: may check whether it is an ACID table in the future.
+      Assert.assertTrue(t.isLoaded());
+      Assert.assertTrue(t.getValidWriteIds() != null);
+      Assert.assertTrue(MetastoreShim.getValidWriteIdListFromString(t.getValidWriteIds())
+          .isWriteIdValid(t.getWriteId()));
+    }
+  }
+
+
   private void validateUncached(StatementBase stmt, Frontend fe,
       int expectedNumLoadRequests, int expectedNumCatalogUpdates,
       String[] expectedDbs, String[] expectedTables) throws InternalException {
@@ -200,4 +226,16 @@ public class StmtMetadataLoaderTest {
     testLoadTables("refresh functional.alltypes partition (year=2009, month=1)", 1, 1,
         new String[] {"default", "functional"}, new String[] {"functional.alltypes"});
   }
+
+  @Ignore
+  @Test
+  public void testTableWriteID() throws ImpalaException {
+    if (MetastoreShim.getMajorVersion() == 2)
+      return;
+    // TODO: this assumes the ACID tables have been created. They will
+    // be available after IMPALA-8439 is checked in, so ignore the
+    // test until then.
+    testLoadAcidTables("select * from acid.insert_only_no_partitions");
+    testLoadAcidTables("select * from acid.insert_only_with_partitions");
+  }
 }


[impala] 02/03: IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit f203dfa983f90624e371b229e1f4f8898b7106d4
Author: Vihang Karajgaonkar <vi...@cloudera.com>
AuthorDate: Tue May 14 17:51:28 2019 -0700

    IMPALA-8369 : Skip test_owner_privileges test when running against Hive-3
    
    Currently, when running with USE_CDP_HIVE=true, the Sentry service's
    sync with HMS is very slow. This is most likely because in HMS-3 the
    notification events are generated using the JSONMessageFactory provided
    by the Metastore, unlike in the HMS-2 setup. When running against HMS-2,
    Sentry provides its own MessageFactory implementation, which has its
    limitations and cannot be used with HMS-3. To fix this, Sentry should
    add support for the out-of-the-box message factory available in Hive-3
    (see SENTRY-2518).
    
    Due to these additional delays from Sentry, test_owner_privileges fails
    because of race conditions between the cached information in the catalog
    and the Sentry server (see IMPALA-8550). This patch disables the test
    when running against HMS-3 until we fix the issues on both the Sentry
    and Impala sides.
    
    Testing done:
    1. Confirmed the test is skipped when using USE_CDP_HIVE=true
    2. Confirmed the test is not skipped when using USE_CDP_HIVE=false
    
    Change-Id: I9f904446f50b5095443bf27b3092a2e3665b76d3
    Reviewed-on: http://gerrit.cloudera.org:8080/13339
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/authorization/test_owner_privileges.py | 2 ++
 tests/common/skip.py                         | 7 ++++++-
 2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tests/authorization/test_owner_privileges.py b/tests/authorization/test_owner_privileges.py
index ea0e4ac..4a183e9 100644
--- a/tests/authorization/test_owner_privileges.py
+++ b/tests/authorization/test_owner_privileges.py
@@ -24,6 +24,7 @@ from os import getenv
 
 from tests.common.sentry_cache_test_suite import SentryCacheTestSuite, TestObject
 from tests.common.test_dimensions import create_uncompressed_text_dimension
+from tests.common.skip import SkipIfHive3
 
 # Sentry long polling frequency to make Sentry refresh not run.
 SENTRY_LONG_POLLING_FREQUENCY_S = 3600
@@ -35,6 +36,7 @@ SENTRY_CONFIG_FILE_OO_NOGRANT = SENTRY_CONFIG_DIR + 'sentry-site_oo_nogrant.xml'
 SENTRY_CONFIG_FILE_NO_OO = SENTRY_CONFIG_DIR + 'sentry-site_no_oo.xml'
 
 
+@SkipIfHive3.sentry_not_supported
 class TestOwnerPrivileges(SentryCacheTestSuite):
   @classmethod
   def add_test_dimensions(cls):
diff --git a/tests/common/skip.py b/tests/common/skip.py
index 8c2bdfb..e1b060a 100644
--- a/tests/common/skip.py
+++ b/tests/common/skip.py
@@ -25,7 +25,7 @@ import pytest
 from functools import partial
 
 from tests.common.environ import (IMPALA_TEST_CLUSTER_PROPERTIES,
-    IS_DOCKERIZED_TEST_CLUSTER, IS_BUGGY_EL6_KERNEL)
+    IS_DOCKERIZED_TEST_CLUSTER, IS_BUGGY_EL6_KERNEL, HIVE_MAJOR_VERSION)
 from tests.common.kudu_test_suite import get_kudu_master_flag
 from tests.util.filesystem_utils import (
     IS_ABFS,
@@ -199,6 +199,11 @@ class SkipIfDockerizedCluster:
       reason="IMPALA-8384: insert ACL tests are broken on dockerised minicluster.")
 
 
+class SkipIfHive3:
+  sentry_not_supported = pytest.mark.skipif(HIVE_MAJOR_VERSION >= 3,
+      reason="Sentry HMS follower does not work with HMS-3. See SENTRY-2518 for details")
+
+
 class SkipIfCatalogV2:
   """Expose decorators as methods so that is_catalog_v2_cluster() can be evaluated lazily
   when needed, instead of whenever this module is imported."""


[impala] 01/03: Fix condition for starting YARN on USE_CDP_HIVE=true

Posted by jo...@apache.org.
This is an automated email from the ASF dual-hosted git repository.

joemcdonnell pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit d873922e717d4ef5b78f13b3cf677281d1c7b502
Author: Joe McDonnell <jo...@cloudera.com>
AuthorDate: Wed May 15 10:23:17 2019 -0700

    Fix condition for starting YARN on USE_CDP_HIVE=true
    
    Dataload on Hive 3 uses YARN, but YARN is not needed for Hive 2
    dataload. The old test '[[ $USE_CDP_HIVE ]]' is true whenever the
    variable is set at all, even to "false", so YARN started for Hive 2
    too. This fixes the condition so that YARN does not start for
    USE_CDP_HIVE=false (Hive 2), and fixes a similar condition for
    manipulating the classpath in testdata/bin/run-hive-server.sh.
    
    Change-Id: If2b0a529eadd5a436f0318229600180f72a27207
    Reviewed-on: http://gerrit.cloudera.org:8080/13343
    Reviewed-by: Todd Lipcon <to...@apache.org>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 testdata/bin/run-hive-server.sh | 2 +-
 testdata/cluster/admin          | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/testdata/bin/run-hive-server.sh b/testdata/bin/run-hive-server.sh
index 473a654..a2f4bba 100755
--- a/testdata/bin/run-hive-server.sh
+++ b/testdata/bin/run-hive-server.sh
@@ -70,7 +70,7 @@ suspend=n,address=30010"
 # CDH Hive metastore scripts do not do so. This is currently to make sure that we can run
 # all the tests including sentry tests
 # TODO: This can be removed when we move to Ranger completely
-if [[ $USE_CDP_HIVE && -n "$SENTRY_HOME" ]]; then
+if [[ "$USE_CDP_HIVE" = "true" && -n "$SENTRY_HOME" ]]; then
   for f in ${SENTRY_HOME}/lib/sentry-binding-hive*.jar; do
     FILE_NAME=$(basename $f)
     # exclude all the hive jars from being included in the classpath since Sentry
diff --git a/testdata/cluster/admin b/testdata/cluster/admin
index 9eafd8c..6a5ea33 100755
--- a/testdata/cluster/admin
+++ b/testdata/cluster/admin
@@ -35,7 +35,7 @@ setup_report_build_error
 : ${INCLUDE_YARN=}
 
 # For Hive 3, we require Yarn for Tez support.
-if [[ $USE_CDP_HIVE ]]; then
+if "$USE_CDP_HIVE"; then
   INCLUDE_YARN=1
 fi