Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/08/15 23:02:43 UTC

[GitHub] [spark] sumeetgajjar commented on a diff in pull request #37505: [SPARK-40067][SQL] Use Table#name() instead of Scan#name() to populate the table name in the BatchScan node in SparkUI

sumeetgajjar commented on code in PR #37505:
URL: https://github.com/apache/spark/pull/37505#discussion_r946206226


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/BatchScanExec.scala:
##########
@@ -37,7 +37,8 @@ case class BatchScanExec(
     @transient scan: Scan,
     runtimeFilters: Seq[Expression],
     keyGroupedPartitioning: Option[Seq[Expression]] = None,
-    ordering: Option[Seq[SortOrder]] = None) extends DataSourceV2ScanExecBase {
+    ordering: Option[Seq[SortOrder]] = None,
+    table: Option[String] = None) extends DataSourceV2ScanExecBase {

Review Comment:
   > since we are adding a new parameter, shall we add `Table` directly? in case we may need to access other methods of `Table` in the future.
   
   I initially added `Table` directly: https://github.com/apache/spark/pull/37505/commits/710944e86c2254708a207855bd65d1ba09463ec5
   But that change caused test failures with `NotSerializableException`.
   I made `CaseInsensitiveStringMap` serializable and re-ran the failing test locally, but the failure persisted.
   So the quick fix was to pass the table name as a `String` instead of a `Table`.
   
   ```
   sbt:spark-avro> testOnly org.apache.spark.sql.avro.AvroV2LogicalTypeSuite -- -z "Logical type: write Decimal with BYTES type"
   .
   .
   [info] AvroV2LogicalTypeSuite:
   [info] - Logical type: write Decimal with BYTES type *** FAILED *** (4 seconds, 704 milliseconds)
   [info]   org.apache.spark.SparkException: Job aborted due to stage failure: Task not serializable: java.io.NotSerializableException: org.apache.spark.sql.util.CaseInsensitiveStringMap
   [info]   at org.apache.spark.scheduler.DAGScheduler.failJobAndIndependentStages(DAGScheduler.scala:2706)
   [info]   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2(DAGScheduler.scala:2642)
   [info]   at org.apache.spark.scheduler.DAGScheduler.$anonfun$abortStage$2$adapted(DAGScheduler.scala:2641)
   ```
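   To illustrate the failure mode described above: Java serialization fails as soon as an object graph reaches a field whose type does not implement `Serializable`, which is exactly what happens when a `Table` (holding a `CaseInsensitiveStringMap`) is embedded in a plan node shipped to executors, while a plain `String` serializes fine. This is a minimal Java sketch (Java rather than Scala, so it is self-contained); `Options`, `ScanWithObject`, and `ScanWithName` are hypothetical stand-ins, not Spark classes.
   
   ```java
   import java.io.*;
   
   // Hypothetical stand-in for CaseInsensitiveStringMap: does NOT implement Serializable.
   class Options {
       final String value;
       Options(String value) { this.value = value; }
   }
   
   // Analogous to embedding a Table in the plan node: the nested
   // non-serializable field makes the whole object unserializable.
   class ScanWithObject implements Serializable {
       final Options options = new Options("avro");
   }
   
   // The quick fix from the comment: keep only the name as a String.
   class ScanWithName implements Serializable {
       final String tableName = "avro_table";
   }
   
   public class SerializationDemo {
       // Returns true if the object survives Java serialization.
       static boolean trySerialize(Object o) {
           try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
               out.writeObject(o);
               return true;
           } catch (NotSerializableException e) {
               return false;
           } catch (IOException e) {
               throw new UncheckedIOException(e);
           }
       }
   
       public static void main(String[] args) {
           System.out.println(trySerialize(new ScanWithObject())); // false
           System.out.println(trySerialize(new ScanWithName()));   // true
       }
   }
   ```
   
   In the actual PR, `@transient` on the field (as used for `scan` in `BatchScanExec`) would also sidestep serialization, at the cost of the field being null on executors.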



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org