Posted to reviews@spark.apache.org by rezasafi <gi...@git.apache.org> on 2018/08/20 18:53:24 UTC

[GitHub] spark pull request #15673: [SPARK-17992][SQL] Return all partitions from Hiv...

Github user rezasafi commented on a diff in the pull request:

    https://github.com/apache/spark/pull/15673#discussion_r211370552
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -586,17 +587,31 @@ private[client] class Shim_v0_13 extends Shim_v0_12 {
             getAllPartitionsMethod.invoke(hive, table).asInstanceOf[JSet[Partition]]
           } else {
             logDebug(s"Hive metastore filter is '$filter'.")
    +        val tryDirectSqlConfVar = HiveConf.ConfVars.METASTORE_TRY_DIRECT_SQL
    +        val tryDirectSql =
    +          hive.getConf.getBoolean(tryDirectSqlConfVar.varname, tryDirectSqlConfVar.defaultBoolVal)
             try {
    +          // Hive may throw an exception when calling this method in some circumstances, such as
    +          // when filtering on a non-string partition column when the hive config key
    +          // hive.metastore.try.direct.sql is false
               getPartitionsByFilterMethod.invoke(hive, table, filter)
                 .asInstanceOf[JArrayList[Partition]]
             } catch {
    -          case e: InvocationTargetException =>
    -            // SPARK-18167 retry to investigate the flaky test. This should be reverted before
    -            // the release is cut.
    -            val retry = Try(getPartitionsByFilterMethod.invoke(hive, table, filter))
    -            logError("getPartitionsByFilter failed, retry success = " + retry.isSuccess)
    -            logError("all partitions: " + getAllPartitions(hive, table))
    -            throw e
    +          case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] &&
    +              !tryDirectSql =>
    +            logWarning("Caught Hive MetaException attempting to get partition metadata by " +
    +              "filter from Hive. Falling back to fetching all partition metadata, which will " +
    +              "degrade performance. Modifying your Hive metastore configuration to set " +
    +              s"${tryDirectSqlConfVar.varname} to true may resolve this problem.", ex)
    +            // HiveShim clients are expected to handle a superset of the requested partitions
    +            getAllPartitionsMethod.invoke(hive, table).asInstanceOf[JSet[Partition]]
    +          case ex: InvocationTargetException if ex.getCause.isInstanceOf[MetaException] &&
    +              tryDirectSql =>
    +            throw new RuntimeException("Caught Hive MetaException attempting to get partition " +
    --- End diff --
    
    @mallman Sorry to disturb you here, but why is only a warning logged when direct SQL is disabled (`hive.metastore.try.direct.sql` is false)? And why is a `RuntimeException` thrown when direct SQL is enabled, instead of just logging a warning and falling back as in the case where it is disabled?
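
To make the question concrete, here is a minimal, self-contained sketch of the two catch branches as they appear in the diff above. It is not the actual HiveShim code: `MetaException` is a local stand-in for Hive's class, `fetchByFilter` and `fetchAll` are hypothetical placeholders for the reflective metastore calls, and the exception message is a placeholder because the real text is truncated in the diff.

```scala
import java.lang.reflect.InvocationTargetException

object PartitionFallbackSketch {
  // Stand-in for Hive's MetaException (assumption: only the type matters here).
  final class MetaException(msg: String) extends Exception(msg)

  // fetchByFilter and fetchAll are hypothetical stand-ins for
  // getPartitionsByFilterMethod.invoke(...) and getAllPartitionsMethod.invoke(...).
  def partitionsByFilter(
      tryDirectSql: Boolean,
      fetchByFilter: () => Seq[String],
      fetchAll: () => Seq[String]): Seq[String] = {
    try {
      fetchByFilter()
    } catch {
      case ex: InvocationTargetException
          if ex.getCause.isInstanceOf[MetaException] && !tryDirectSql =>
        // Direct SQL disabled: fall back to all partitions (a superset of the request).
        fetchAll()
      case ex: InvocationTargetException
          if ex.getCause.isInstanceOf[MetaException] && tryDirectSql =>
        // Direct SQL enabled: the failure is surfaced instead of being masked by a fallback.
        throw new RuntimeException("placeholder message (real text truncated in the diff)", ex)
    }
  }

  def main(args: Array[String]): Unit = {
    // Simulates a reflective call that fails with a MetaException as its cause.
    val failing: () => Seq[String] =
      () => throw new InvocationTargetException(new MetaException("filter failed"))
    val all: () => Seq[String] = () => Seq("p=1", "p=2")

    // With direct SQL disabled, the fallback result is returned.
    println(partitionsByFilter(tryDirectSql = false, failing, all))
  }
}
```

With `tryDirectSql = true` and the same `failing` function, the sketch throws a `RuntimeException` whose cause is the original `InvocationTargetException`, which is the asymmetry the question is about.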

