Posted to reviews@spark.apache.org by "rednaxelafx (via GitHub)" <gi...@apache.org> on 2023/05/10 19:49:50 UTC

[GitHub] [spark] rednaxelafx commented on a diff in pull request #41064: [SPARK-43383][SQL] Add `rowCount` statistics to LocalRelation

rednaxelafx commented on code in PR #41064:
URL: https://github.com/apache/spark/pull/41064#discussion_r1190332445


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/LocalRelation.scala:
##########
@@ -79,7 +79,8 @@ case class LocalRelation(
   }
 
   override def computeStats(): Statistics =
-    Statistics(sizeInBytes = EstimationUtils.getSizePerRow(output) * data.length)
+    Statistics(sizeInBytes = EstimationUtils.getSizePerRow(output) * data.length,
+      rowCount = Some(data.size))

Review Comment:
   Post-hoc comment: not a big deal, but I just noticed that `data.length` and `data.size` are used on adjacent lines, and on a `Seq` they mean the same thing. It would have been better to use the same method for the same thing, at least within the same page of code.
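
To illustrate the point, here is a minimal standalone sketch, not the actual Spark code. `Statistics` below is a simplified stand-in for Spark's class, `sizePerRow` stands in for `EstimationUtils.getSizePerRow(output)`, and the `ConsistentLength` object and its `computeStats` helper are hypothetical names used only for this example. It reads the row count once via `length` and reuses it for both fields:

```scala
// Simplified stand-in for Spark's Statistics (the real class has more fields).
final case class Statistics(sizeInBytes: BigInt, rowCount: Option[BigInt] = None)

object ConsistentLength {
  // Hypothetical helper mirroring the shape of LocalRelation.computeStats().
  // `sizePerRow` is an assumed input standing in for
  // EstimationUtils.getSizePerRow(output).
  def computeStats(sizePerRow: BigInt, data: Seq[Any]): Statistics = {
    // Read the element count once, with one method, and reuse it.
    val numRows = BigInt(data.length)
    Statistics(sizeInBytes = sizePerRow * numRows, rowCount = Some(numRows))
  }

  def main(args: Array[String]): Unit = {
    val data: Seq[Any] = Seq("a", "b", "c")
    // On a Seq, `size` and `length` return the same value.
    assert(data.length == data.size)
    println(computeStats(BigInt(8), data))
  }
}
```

Either `length` or `size` would work here; the only suggestion is to pick one and use it for both fields.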



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org