Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2019/05/29 15:46:53 UTC

[GitHub] [spark] gengliangwang commented on a change in pull request #24666: [SPARK-27482][SQL][WEBUI] Show BroadcastHashJoinExec numOutputRows statistics info on SparkSQL UI page

URL: https://github.com/apache/spark/pull/24666#discussion_r288638013
 
 

 ##########
 File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala
 ##########
 @@ -46,8 +46,13 @@ case class BroadcastHashJoinExec(
     right: SparkPlan)
   extends BinaryExecNode with HashJoin with CodegenSupport {
 
-  override lazy val metrics = Map(
-    "numOutputRows" -> SQLMetrics.createMetric(sparkContext, "number of output rows"))
+  override lazy val metrics = {
+      Map("numOutputRows" ->
+        SQLMetrics.createMetric(
+          sparkContext,
+          "number of output rows",
+          logicalPlan.map(_.stats.rowCount.map(_.toLong).getOrElse(-1L)).getOrElse(-1L)))
 
 Review comment:
   IIRC, for file sources the logical plan usually carries only sizeInBytes statistics. The estimated row count (stats.rowCount) should therefore be empty for file sources, and the value passed to the metric here would fall back to -1.
   What scenario is this PR targeting?
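
   A minimal sketch of the concern, assuming a simplified stand-in for the plan statistics (the Stats case class and the RowCountFallback object below are hypothetical and only mimic the sizeInBytes / rowCount shape; they are not Spark's actual classes). It shows that the expression in the diff resolves to -1 whenever the row count is absent:

```scala
// Hypothetical stand-in for logical plan statistics; not Spark's real class.
final case class Stats(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

object RowCountFallback {
  // Same outcome as the expression in the diff:
  // use the estimated row count when present, otherwise -1.
  def estimatedRows(stats: Option[Stats]): Long =
    stats.flatMap(_.rowCount).map(_.toLong).getOrElse(-1L)

  def main(args: Array[String]): Unit = {
    // A file source that only reports its size: no row count available.
    val fileSource = Some(Stats(sizeInBytes = BigInt(1024)))
    // A source whose statistics do include a row count.
    val withRowCount = Some(Stats(sizeInBytes = BigInt(1024), rowCount = Some(BigInt(42))))

    println(estimatedRows(fileSource))
    println(estimatedRows(withRowCount))
    println(estimatedRows(None)) // no logical plan linked at all
  }
}
```

   Under these assumptions, a join over plain file sources would show -1 as the estimate unless a row count is available in the statistics.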
