You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "parthchandra (via GitHub)" <gi...@apache.org> on 2024/03/13 00:32:11 UTC

[PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

parthchandra opened a new pull request, #45488:
URL: https://github.com/apache/spark/pull/45488

   ### What changes were proposed in this pull request?
   
   This addresses SPARK-47289 and adds a new section in explain plan where Spark extensions can add additional information for end users. The section is included in the output only if the relevant configuration is enabled and if the extension actually adds some new information
   
   ### Why are the changes needed?
   Extensions to Spark can add their own planning rules and sometimes may need to add additional information about how the plan was generated. This is useful for end users in determining if the extensions rules are working as intended.
   
   
   ### Does this PR introduce _any_ user-facing change?
   This PR increases the information logged in the UI in the query plan. The attached screenshot shows output from an extension which provides some of its own operations but does not support some operations. 
   
   <img width="703" alt="Screenshot 2024-03-11 at 10 23 36 AM" src="https://github.com/apache/spark/assets/6529136/88594772-f85e-4fd4-8eac-33017ef0c1c6">
   
   ### How was this patch tested?
   Unit test and manual testing
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1550642869


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +372,26 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    try {
+      val generator = sparkSession.sparkContext.conf.get(EXTENDED_EXPLAIN_PROVIDER.key)
+      try {
+        val extensionClass = Utils.classForName(generator)
+        val extension = extensionClass.getConstructor().newInstance()
+          .asInstanceOf[ExtendedExplainGenerator]

Review Comment:
   Good suggestion. I was not aware of this method. It is much cleaner to use.
   Ironically, we do not use this method to load session extensions!



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +372,26 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    try {
+      val generator = sparkSession.sparkContext.conf.get(EXTENDED_EXPLAIN_PROVIDER.key)

Review Comment:
   It was originally a session config, then after a previous review comment, made it static. Thinking about it again, it seems it might be better to allow it as a session conf. The extended info can then be turned on/off for specific sessions among other cases.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExtendedExplainGenerator.scala:
##########
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+/**
+ * A trait for a session extension to implement that provides addition explain plan
+ * information.
+ */
+trait ExtendedExplainGenerator {

Review Comment:
   Good point. Thanks!



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +372,26 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    try {
+      val generator = sparkSession.sparkContext.conf.get(EXTENDED_EXPLAIN_PROVIDER.key)
+      try {
+        val extensionClass = Utils.classForName(generator)
+        val extension = extensionClass.getConstructor().newInstance()
+          .asInstanceOf[ExtendedExplainGenerator]
+        append("\n== Extended Information ==\n")
+        append(extension.generateExtendedInfo(plan))
+      } catch {
+        case e@(_: ClassCastException |
+                _: ClassNotFoundException |
+                _: NoClassDefFoundError) =>

Review Comment:
   Done



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExtendedExplainGenerator.scala:
##########
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution

Review Comment:
   Done



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1550953116


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +373,23 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    val generators = sparkSession.sessionState.conf.getConf(SQLConf.EXTENDED_EXPLAIN_PROVIDERS)
+      .getOrElse(Seq.empty)
+    val extensions = Utils.loadExtensions(classOf[ExtendedExplainGenerator],
+      generators,
+      sparkSession.sparkContext.conf)
+    if (extensions.nonEmpty) {
+      append("\n== Extended Information ==\n")

Review Comment:
   Since we support more than one `ExtendedExplainGenerator`, shall we add a new method `def title: String` in the trait and included in here like `== Extended Information ($title) ==`?



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +373,23 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    val generators = sparkSession.sessionState.conf.getConf(SQLConf.EXTENDED_EXPLAIN_PROVIDERS)
+      .getOrElse(Seq.empty)
+    val extensions = Utils.loadExtensions(classOf[ExtendedExplainGenerator],
+      generators,
+      sparkSession.sparkContext.conf)
+    if (extensions.nonEmpty) {
+      append("\n== Extended Information ==\n")

Review Comment:
   Since we support more than one `ExtendedExplainGenerator`, shall we add a new method `def title: String` in the trait and included it here like `== Extended Information ($title) ==`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2038654508

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2037419059

   BTW, `connect-client-jvm` seems to start to fail in CIs.
   ```
   Do connect-client-jvm module mima check ...
   finish connect-client-jvm module mima check ...
   ERROR: Comparing Client jar: /__w/spark/spark/connector/connect/client/jvm/target/scala-2.13/spark-connect-client-jvm_2.13-4.0.0-SNAPSHOT.jar and Sql jar: /__w/spark/spark/sql/core/target/scala-2.13/spark-sql_2.13-4.0.0-SNAPSHOT.jar 
   problems with Sql module: 
   interface org.apache.spark.sql.ExtendedExplainGenerator does not have a correspondent in client version
   Exceptions to binary compatibility can be added in 'CheckConnectJvmClientCompatibility#checkMiMaCompatibilityWithSqlModule'
   connect-client-jvm module mima check failed.
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534357046


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1519,7 +1526,7 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
-  val V2_BUCKETING_SHUFFLE_ENABLED =
+ val V2_BUCKETING_SHUFFLE_ENABLED =

Review Comment:
   fixed



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1529,7 +1536,7 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
-   val V2_BUCKETING_ALLOW_JOIN_KEYS_SUBSET_OF_PARTITION_KEYS =
+  val V2_BUCKETING_ALLOW_JOIN_KEYS_SUBSET_OF_PARTITION_KEYS =

Review Comment:
   reverted this change



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1160,7 +1167,7 @@ object SQLConf {
     .intConf
     .createWithDefault(4096)
 
-  val PARQUET_FIELD_ID_WRITE_ENABLED =
+   val PARQUET_FIELD_ID_WRITE_ENABLED =

Review Comment:
   My IDE keeps changing code for me and I haven't figured out what to do to fix it. I'll try to correct these auto-changes manually in future efforts. (Thank you for being patient about this)!



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -861,7 +868,7 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
-  val ADAPTIVE_REBALANCE_PARTITIONS_SMALL_PARTITION_FACTOR =
+  val  ADAPTIVE_REBALANCE_PARTITIONS_SMALL_PARTITION_FACTOR =

Review Comment:
   Done. Really sorry about this reformatting. My IDE keeps changing code I have not modified. 



##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -543,6 +544,33 @@ class SparkSessionExtensionSuite extends SparkFunSuite with SQLHelper with Adapt
       }
     }
   }
+
+  test("explain info") {

Review Comment:
   changed



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4101,15 +4108,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
-  val LEGACY_INTEGER_GROUPING_ID =
+   val LEGACY_INTEGER_GROUPING_ID =
     buildConf("spark.sql.legacy.integerGroupingId")
       .internal()
       .doc("When true, grouping_id() returns int values instead of long values.")
       .version("3.1.0")
       .booleanConf
       .createWithDefault(false)
 
-  val LEGACY_GROUPING_ID_WITH_APPENDED_USER_GROUPBY =
+   val LEGACY_GROUPING_ID_WITH_APPENDED_USER_GROUPBY =

Review Comment:
   fixed



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -333,6 +333,13 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val PLAN_EXPLAIN_EXTENSIONS_ENABLED = buildConf("spark.sql.plan.explain.extensions.enabled")

Review Comment:
   Changed to `spark.sql.enableExtensionInfo`



##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -543,6 +544,33 @@ class SparkSessionExtensionSuite extends SparkFunSuite with SQLHelper with Adapt
       }
     }
   }
+
+  test("explain info") {
+    val extensions = create { extensions =>
+      extensions.injectQueryStagePrepRule { _ => AddTagRule() }
+    }
+    withSession(extensions) { session =>
+      withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+        SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {

Review Comment:
   Indented 2 more spaces



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -648,6 +702,8 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
 object QueryPlan extends PredicateHelper {
   val OP_ID_TAG = TreeNodeTag[Int]("operatorId")
   val CODEGEN_ID_TAG = new TreeNodeTag[Int]("wholeStageCodegenId")
+  val EXPLAIN_PLAN_INFO_SIMPLE = new TreeNodeTag[String]("explainPlanInfoSimple")
+  val EXPLAIN_PLAN_INFO = new TreeNodeTag[String]("explainPlanInfo")

Review Comment:
   Added better comments explaining the reason for the two methods.



##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -583,6 +611,24 @@ case class MyParser(spark: SparkSession, delegate: ParserInterface) extends Pars
     delegate.parseQuery(sqlText)
 }
 
+case class AddTagRule() extends Rule[SparkPlan] {

Review Comment:
   Good suggestion. Changed.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -464,6 +464,60 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     s"($opId) $nodeName$codegenId"
   }
 
+  def extensionsInfo(mode: String = "simple"): String = {
+    mode match {
+      case "simple" =>
+        summaryExtensionsInfoSimple().mkString("\n")
+      case _ =>
+        val sb = new StringBuilder
+        summaryExtensionsInfo(0, sb)
+        sb.toString
+    }
+  }
+
+  def summaryExtensionsInfo(indent: Int, sb: StringBuilder): Unit = {
+    var indentLevel = indent
+    sb.append(
+      getTagValue(QueryPlan.EXPLAIN_PLAN_INFO).map(t => {
+        indentLevel = indentLevel + 1
+        "  " * indentLevel + t + "\n"
+      }).getOrElse("")
+    )
+    if (innerChildren.nonEmpty) {
+      innerChildren.foreach(_.summaryExtensionsInfo(indentLevel, sb))
+    }
+    if (children.nonEmpty) {
+      children.foreach(_.summaryExtensionsInfo(indentLevel, sb))
+    }
+  }
+
+  // Sometimes all that is needed is the root level information
+  // (especially with errors during planning)
+  def summaryExtensionsInfoSimple(): mutable.Set[String] = {
+    val sb = mutable.Set[String]() // don't allow duplicates

Review Comment:
   Change the variable name. Also added an explanation for why dupes are removed in the comment.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4101,15 +4108,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
-  val LEGACY_INTEGER_GROUPING_ID =
+   val LEGACY_INTEGER_GROUPING_ID =

Review Comment:
   fixed



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4864,6 +4871,8 @@ class SQLConf extends Serializable with Logging with SqlApiConf {
 
   def planChangeBatches: Option[String] = getConf(PLAN_CHANGE_LOG_BATCHES)
 
+  def planExplainExtensionsEnabled: Boolean = getConf(PLAN_EXPLAIN_EXTENSIONS_ENABLED)

Review Comment:
   Changed to singular



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -464,6 +464,60 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     s"($opId) $nodeName$codegenId"
   }
 
+  def extensionsInfo(mode: String = "simple"): String = {

Review Comment:
   Good idea. I've moved this method out of this class into `QueryExecution` because it already has the appropriate `ExplainMode` defined (it's a trait though, not an enum, but it is already available and for the same use case).



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2040035343

   Thank you, @parthchandra and @cloud-fan .
   This is much better. :) 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1535858713


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -583,6 +611,24 @@ case class MyParser(spark: SparkSession, delegate: ParserInterface) extends Pars
     delegate.parseQuery(sqlText)
 }
 
+case class ExplainPlanInfoTagRule() extends Rule[SparkPlan] {

Review Comment:
   Let me see if I understand this right - 
   A session extension implementation can implement the trait above and set the implementation in a config. If such an implementation is provided, then explain plan can print it. We can replace the `enableExtensionInfo` flag with this new conf.
   The contents and  formatting of the extended explain string etc, now become part of the session extension and so we can remove the changes in `QueryPlan`. The extension implementors will have to do more work, but likely will have greater flexibility.
   Let me know if I misunderstood. Otherwise I'll go ahead with this change.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-1995049132

   @dongjoon-hyun github actions are enabled in my repository and the branch is based on the latest commit in master. In my repo the ci checks are shown as passing. 
   <img width="1609" alt="Screenshot 2024-03-13 at 10 05 15 AM" src="https://github.com/apache/spark/assets/6529136/3e177782-1700-4262-a330-6def7533837d">
   I've never encountered this situation in github before. Any suggestions? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1549249140


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExtendedExplainGenerator.scala:
##########
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+/**
+ * A trait for a session extension to implement that provides addition explain plan
+ * information.
+ */
+trait ExtendedExplainGenerator {

Review Comment:
   let's add
   ```
   @DeveloperApi
   @Since("4.0.0")
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548386888


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/QueryExecutionSuite.scala:
##########
@@ -16,17 +16,20 @@
  */
 package org.apache.spark.sql.execution
 
+import scala.collection.mutable
 import scala.io.Source
-

Review Comment:
   nit. Could you run `dev/lint-scala`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534127854


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -464,6 +464,60 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     s"($opId) $nodeName$codegenId"
   }
 
+  def extensionsInfo(mode: String = "simple"): String = {

Review Comment:
   Shall we use `enum` instead of `String` type? For me, the following code in test suite looks fragile and error-prone.
   ```scala
   val info = df.queryExecution.executedPlan.extensionsInfo("extended")
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -464,6 +464,60 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     s"($opId) $nodeName$codegenId"
   }
 
+  def extensionsInfo(mode: String = "simple"): String = {

Review Comment:
   Shall we use `enum` instead of `String` type? For me, the following code in test suite looks fragile and error-prone due to the typos.
   ```scala
   val info = df.queryExecution.executedPlan.extensionsInfo("extended")
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1535032201


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -583,6 +611,24 @@ case class MyParser(spark: SparkSession, delegate: ParserInterface) extends Pars
     delegate.parseQuery(sqlText)
 }
 
+case class ExplainPlanInfoTagRule() extends Rule[SparkPlan] {

Review Comment:
   This is really a hacky way to provide extended EXPLAIN info. Can we add a interface like
   ```
   trait ExtendedExplainGenerator {
     def generateExtendedInfo(plan: SparkPlan): String
   }
   ```
   
   Then the config provides the implementation class.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1549261402


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +372,26 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    try {
+      val generator = sparkSession.sparkContext.conf.get(EXTENDED_EXPLAIN_PROVIDER.key)
+      try {
+        val extensionClass = Utils.classForName(generator)
+        val extension = extensionClass.getConstructor().newInstance()
+          .asInstanceOf[ExtendedExplainGenerator]
+        append("\n== Extended Information ==\n")
+        append(extension.generateExtendedInfo(plan))
+      } catch {
+        case e@(_: ClassCastException |
+                _: ClassNotFoundException |
+                _: NoClassDefFoundError) =>

Review Comment:
   how about catch `NonFatal`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548389842


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +371,25 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    sparkSession.sessionState.conf.extendedExplainProvider match {
+      case Some(generator) =>
+        try {
+          val extensionClass = Utils.classForName(generator)
+          val extension = extensionClass.getConstructor().newInstance()
+            .asInstanceOf[ExtendedExplainGenerator]

Review Comment:
   Do you know how many times we need to generate?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548389842


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +371,25 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    sparkSession.sessionState.conf.extendedExplainProvider match {
+      case Some(generator) =>
+        try {
+          val extensionClass = Utils.classForName(generator)
+          val extension = extensionClass.getConstructor().newInstance()
+            .asInstanceOf[ExtendedExplainGenerator]

Review Comment:
   Just a question. Do you know how many times we need to generate?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548482885


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +371,25 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    sparkSession.sessionState.conf.extendedExplainProvider match {
+      case Some(generator) =>

Review Comment:
   Shall we try the following?
   ```scala
   - case Some(generator) =>
   + case Some(generator) if Utils.classIsLoadable(generator) => 
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534110561


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -583,6 +611,24 @@ case class MyParser(spark: SparkSession, delegate: ParserInterface) extends Pars
     delegate.parseQuery(sqlText)
 }
 
+case class AddTagRule() extends Rule[SparkPlan] {

Review Comment:
   `AddTagRule` looks too general in a long term perspective. Shall we use more specific one like `ExplainPlanInfoTagRule`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534165007


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -464,6 +464,60 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     s"($opId) $nodeName$codegenId"
   }
 
+  def extensionsInfo(mode: String = "simple"): String = {
+    mode match {
+      case "simple" =>
+        summaryExtensionsInfoSimple().mkString("\n")
+      case _ =>
+        val sb = new StringBuilder
+        summaryExtensionsInfo(0, sb)
+        sb.toString
+    }
+  }
+
+  def summaryExtensionsInfo(indent: Int, sb: StringBuilder): Unit = {
+    var indentLevel = indent
+    sb.append(
+      getTagValue(QueryPlan.EXPLAIN_PLAN_INFO).map(t => {
+        indentLevel = indentLevel + 1
+        "  " * indentLevel + t + "\n"
+      }).getOrElse("")
+    )
+    if (innerChildren.nonEmpty) {
+      innerChildren.foreach(_.summaryExtensionsInfo(indentLevel, sb))
+    }
+    if (children.nonEmpty) {
+      children.foreach(_.summaryExtensionsInfo(indentLevel, sb))
+    }
+  }
+
+  // Sometimes all that is needed is the root level information
+  // (especially with errors during planning)
+  def summaryExtensionsInfoSimple(): mutable.Set[String] = {
+    val sb = mutable.Set[String]() // don't allow duplicates

Review Comment:
   The name `sb` looks misleading because we use it for StringBiuilder.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2015460813

   CI test failure appears to be unrelated. Not re-running, because this PR will change.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534095844


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1529,7 +1536,7 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
-   val V2_BUCKETING_ALLOW_JOIN_KEYS_SUBSET_OF_PARTITION_KEYS =
+  val V2_BUCKETING_ALLOW_JOIN_KEYS_SUBSET_OF_PARTITION_KEYS =

Review Comment:
   This one looks correct, but let's not mix the style change and functional change in a single PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534165424


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -464,6 +464,60 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
     s"($opId) $nodeName$codegenId"
   }
 
+  def extensionsInfo(mode: String = "simple"): String = {
+    mode match {
+      case "simple" =>
+        summaryExtensionsInfoSimple().mkString("\n")
+      case _ =>
+        val sb = new StringBuilder
+        summaryExtensionsInfo(0, sb)
+        sb.toString
+    }
+  }
+
+  def summaryExtensionsInfo(indent: Int, sb: StringBuilder): Unit = {
+    var indentLevel = indent
+    sb.append(
+      getTagValue(QueryPlan.EXPLAIN_PLAN_INFO).map(t => {
+        indentLevel = indentLevel + 1
+        "  " * indentLevel + t + "\n"
+      }).getOrElse("")
+    )
+    if (innerChildren.nonEmpty) {
+      innerChildren.foreach(_.summaryExtensionsInfo(indentLevel, sb))
+    }
+    if (children.nonEmpty) {
+      children.foreach(_.summaryExtensionsInfo(indentLevel, sb))
+    }
+  }
+
+  // Sometimes all that is needed is the root level information
+  // (especially with errors during planning)
+  def summaryExtensionsInfoSimple(): mutable.Set[String] = {
+    val sb = mutable.Set[String]() // don't allow duplicates

Review Comment:
   Could you explain why we don't allow duplications?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548482885


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +371,25 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    sparkSession.sessionState.conf.extendedExplainProvider match {
+      case Some(generator) =>

Review Comment:
   Shall we try the following?
   ```scala
   - case Some(generator) =>
   + case Some(generator) if Utils.classIsLoadable(generator) => 
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2038764692

   Thank you @cloud-fan @dongjoon-hyun  for the review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1536006062


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -583,6 +611,24 @@ case class MyParser(spark: SparkSession, delegate: ParserInterface) extends Pars
     delegate.parseQuery(sqlText)
 }
 
+case class ExplainPlanInfoTagRule() extends Rule[SparkPlan] {

Review Comment:
   Thank you for the review, @cloud-fan .



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548387498


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExtendedExplainGenerator.scala:
##########
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+/**
+ * A trait for a session extension to implement that provides addition explain plan
+ * information.
+ */
+

Review Comment:
   nit. Remove empty line.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534096953


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4101,15 +4108,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
-  val LEGACY_INTEGER_GROUPING_ID =
+   val LEGACY_INTEGER_GROUPING_ID =

Review Comment:
   New indentation looks incorrect.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4101,15 +4108,15 @@ object SQLConf {
     .booleanConf
     .createWithDefault(false)
 
-  val LEGACY_INTEGER_GROUPING_ID =
+   val LEGACY_INTEGER_GROUPING_ID =
     buildConf("spark.sql.legacy.integerGroupingId")
       .internal()
       .doc("When true, grouping_id() returns int values instead of long values.")
       .version("3.1.0")
       .booleanConf
       .createWithDefault(false)
 
-  val LEGACY_GROUPING_ID_WITH_APPENDED_USER_GROUPBY =
+   val LEGACY_GROUPING_ID_WITH_APPENDED_USER_GROUPBY =

Review Comment:
   ditto.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534092098


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -861,7 +868,7 @@ object SQLConf {
       .booleanConf
       .createWithDefault(true)
 
-  val ADAPTIVE_REBALANCE_PARTITIONS_SMALL_PARTITION_FACTOR =
+  val  ADAPTIVE_REBALANCE_PARTITIONS_SMALL_PARTITION_FACTOR =

Review Comment:
   Could you recover the original code?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534094157


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1519,7 +1526,7 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
-  val V2_BUCKETING_SHUFFLE_ENABLED =
+ val V2_BUCKETING_SHUFFLE_ENABLED =

Review Comment:
   New one looks incorrect to me.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1549251227


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +372,26 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    try {
+      val generator = sparkSession.sparkContext.conf.get(EXTENDED_EXPLAIN_PROVIDER.key)

Review Comment:
   looking at the implementation, seems the config can be a session config instead of static?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534114984


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/QueryPlan.scala:
##########
@@ -648,6 +702,8 @@ abstract class QueryPlan[PlanType <: QueryPlan[PlanType]]
 object QueryPlan extends PredicateHelper {
   val OP_ID_TAG = TreeNodeTag[Int]("operatorId")
   val CODEGEN_ID_TAG = new TreeNodeTag[Int]("wholeStageCodegenId")
+  val EXPLAIN_PLAN_INFO_SIMPLE = new TreeNodeTag[String]("explainPlanInfoSimple")
+  val EXPLAIN_PLAN_INFO = new TreeNodeTag[String]("explainPlanInfo")

Review Comment:
   Could you add some comments about the difference or (definition) for `explainPlanInfoSimple` and `explainPlanInfo`? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534106903


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -543,6 +544,33 @@ class SparkSessionExtensionSuite extends SparkFunSuite with SQLHelper with Adapt
       }
     }
   }
+
+  test("explain info") {
+    val extensions = create { extensions =>
+      extensions.injectQueryStagePrepRule { _ => AddTagRule() }
+    }
+    withSession(extensions) { session =>
+      withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
+        SQLConf.AUTO_BROADCASTJOIN_THRESHOLD.key -> "-1") {

Review Comment:
   We need two more spaces.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534093370


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -1160,7 +1167,7 @@ object SQLConf {
     .intConf
     .createWithDefault(4096)
 
-  val PARQUET_FIELD_ID_WRITE_ENABLED =
+   val PARQUET_FIELD_ID_WRITE_ENABLED =

Review Comment:
   ? Is there a reason for this kind of change. It seems to have multiple instances, @parthchandra .



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534106320


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -543,6 +544,33 @@ class SparkSessionExtensionSuite extends SparkFunSuite with SQLHelper with Adapt
       }
     }
   }
+
+  test("explain info") {

Review Comment:
   ```
   - test("explain info") {
   + test("SPARK-47289: explain info") {
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1552336823


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +373,23 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    val generators = sparkSession.sessionState.conf.getConf(SQLConf.EXTENDED_EXPLAIN_PROVIDERS)
+      .getOrElse(Seq.empty)
+    val extensions = Utils.loadExtensions(classOf[ExtendedExplainGenerator],
+      generators,
+      sparkSession.sparkContext.conf)
+    if (extensions.nonEmpty) {
+      append("\n== Extended Information ==\n")

Review Comment:
   Great idea - done.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548686284


##########
sql/core/src/test/scala/org/apache/spark/sql/execution/QueryExecutionSuite.scala:
##########
@@ -16,17 +16,20 @@
  */
 package org.apache.spark.sql.execution
 
+import scala.collection.mutable
 import scala.io.Source
-

Review Comment:
   Done



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -333,6 +333,13 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val EXTENDED_EXPLAIN_PROVIDER = buildConf("spark.sql.extendedExplainProvider")

Review Comment:
   Moved to static conf



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +371,25 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    sparkSession.sessionState.conf.extendedExplainProvider match {
+      case Some(generator) =>
+        try {
+          val extensionClass = Utils.classForName(generator)
+          val extension = extensionClass.getConstructor().newInstance()
+            .asInstanceOf[ExtendedExplainGenerator]

Review Comment:
   I don't think it gets called too often. It should only get called in explain plan or when the plan is displayed in the UI.
   I can make this a lazy val and load it just once in the class if you think that is preferable.



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExtendedExplainGenerator.scala:
##########
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution
+
+/**
+ * A trait for a session extension to implement that provides addition explain plan
+ * information.
+ */
+

Review Comment:
   removed



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1548488507


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -333,6 +333,13 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val EXTENDED_EXPLAIN_PROVIDER = buildConf("spark.sql.extendedExplainProvider")

Review Comment:
   `StaticSQLConf` might be a better place for this conf. We don't want to change this per query, right?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-1998296127

   Rebased on latest master and tests are passing now


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2038087760

   @dongjoon-hyun thank you for pointing the ci failure out. I've addressed it by excluding the newly added trait from the compatibility check (it is not needed by the client).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534103491


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -333,6 +333,13 @@ object SQLConf {
     .booleanConf
     .createWithDefault(true)
 
+  val PLAN_EXPLAIN_EXTENSIONS_ENABLED = buildConf("spark.sql.plan.explain.extensions.enabled")

Review Comment:
   This introduces 3 new namespaces in Apache spark configurations. We should avoid this as much as possible.
   ```
   spark.sql.plan.*
   spark.sql.plan.explain.*
   spark.sql.plan.explain.extensions.*
   ```
   
   Can we change like `spark.sql.planChangeValidation`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1534168763


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4864,6 +4871,8 @@ class SQLConf extends Serializable with Logging with SqlApiConf {
 
   def planChangeBatches: Option[String] = getConf(PLAN_CHANGE_LOG_BATCHES)
 
+  def planExplainExtensionsEnabled: Boolean = getConf(PLAN_EXPLAIN_EXTENSIONS_ENABLED)

Review Comment:
   May I ask why is this plural, `Extensions`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1549253383


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/QueryExecution.scala:
##########
@@ -369,6 +372,26 @@ class QueryExecution(
     Utils.redact(sparkSession.sessionState.conf.stringRedactionPattern, message)
   }
 
+  def extendedExplainInfo(append: String => Unit, plan: SparkPlan): Unit = {
+    try {
+      val generator = sparkSession.sparkContext.conf.get(EXTENDED_EXPLAIN_PROVIDER.key)
+      try {
+        val extensionClass = Utils.classForName(generator)
+        val extension = extensionClass.getConstructor().newInstance()
+          .asInstanceOf[ExtendedExplainGenerator]

Review Comment:
   Let's call `Utils.loadExtensions`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1538626149


##########
sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala:
##########
@@ -583,6 +611,24 @@ case class MyParser(spark: SparkSession, delegate: ParserInterface) extends Pars
     delegate.parseQuery(sqlText)
 }
 
+case class ExplainPlanInfoTagRule() extends Rule[SparkPlan] {

Review Comment:
   Yes, the EXPLAIN command will call this plugin implementation and append the result to the command output as extended information. Then we don't need to add new APIs in `QueryPlan` and the plugin can also be more flexible as it sees the full plan tree.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #45488: [SPARK-47289][SQL] Allow extensions to log extended information in explain plan
URL: https://github.com/apache/spark/pull/45488


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "parthchandra (via GitHub)" <gi...@apache.org>.
parthchandra commented on PR #45488:
URL: https://github.com/apache/spark/pull/45488#issuecomment-2032746946

   @cloud-fan I've update this PR. Can you please take a look?
   I also moved the unit test to a more appropriate test suite.
   (Sorry it took a while; I was making sure that with this approach extensions would be able to publish their extended info and also be able to build and work with older versions of Spark which do not have the trait.)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47289][SQL] Allow extensions to log extended information in explain plan [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45488:
URL: https://github.com/apache/spark/pull/45488#discussion_r1549250190


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/ExtendedExplainGenerator.scala:
##########
@@ -0,0 +1,26 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.execution

Review Comment:
   This is an internal package. Let's put it in `org.apache.spark.sql`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org