You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "davintjong-db (via GitHub)" <gi...@apache.org> on 2024/01/10 01:55:44 UTC

[PR] [SPARK-46644] Fix add and merge in SQLMetric [spark]

davintjong-db opened a new pull request, #44649:
URL: https://github.com/apache/spark/pull/44649

   ### What changes were proposed in this pull request?
   
   A previous refactor mistakenly used `isValid` for add. Since `defaultValidValue` was always `0`, this didn't cause any correctness issues.
   
   What we really want to do for add (and merge) is `if (isZero) _value = 0`.
   
   ### Why are the changes needed?
   
   There are no correctness errors, but this is confusing and error-prone.
   
   ### Does this PR introduce _any_ user-facing change?
   No.
   
   ### How was this patch tested?
   Running the tests.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan closed pull request #44649: [SPARK-46644] Change add and merge in SQLMetric to use isZero
URL: https://github.com/apache/spark/pull/44649


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44649:
URL: https://github.com/apache/spark/pull/44649#discussion_r1449954211


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala:
##########
@@ -93,8 +91,8 @@ class SQLMetric(
   def +=(v: Long): Unit = add(v)
 
   // _value may be invalid, in many cases being -1. We should not expose it to the user
-  // and instead return defaultValidValue.
-  override def value: Long = if (!isValid) defaultValidValue else _value
+  // and instead return 0.
+  override def value: Long = if (!isValid) 0 else _value

Review Comment:
   can we do `if (isZero) 0 else _value`? or we are worried about negative values due to bugs?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on PR #44649:
URL: https://github.com/apache/spark/pull/44649#issuecomment-1897694199

   thanks, merging to master!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "davintjong-db (via GitHub)" <gi...@apache.org>.
davintjong-db commented on code in PR #44649:
URL: https://github.com/apache/spark/pull/44649#discussion_r1451023537


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala:
##########
@@ -93,8 +91,8 @@ class SQLMetric(
   def +=(v: Long): Unit = add(v)
 
   // _value may be invalid, in many cases being -1. We should not expose it to the user
-  // and instead return defaultValidValue.
-  override def value: Long = if (!isValid) defaultValidValue else _value
+  // and instead return 0.
+  override def value: Long = if (!isValid) 0 else _value

Review Comment:
   Hm, it seems like tests fail because 
   
   
   ```
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "davintjong-db (via GitHub)" <gi...@apache.org>.
davintjong-db commented on PR #44649:
URL: https://github.com/apache/spark/pull/44649#issuecomment-1896500225

   Seem to have the same failure as https://github.com/apache/spark/pull/44025#issuecomment-1842752713


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #44649:
URL: https://github.com/apache/spark/pull/44649#discussion_r1446916177


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala:
##########
@@ -62,8 +62,8 @@ class SQLMetric(
 
   override def merge(other: AccumulatorV2[Long, Long]): Unit = other match {
     case o: SQLMetric =>
-      if (o.isValid) {
-        if (!isValid) _value = defaultValidValue
+      if (!o.isZero) {
+        if (isZero) _value = 0

Review Comment:
   ```suggestion
           if (isZero) _value = defaultValidValue
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "davintjong-db (via GitHub)" <gi...@apache.org>.
davintjong-db commented on code in PR #44649:
URL: https://github.com/apache/spark/pull/44649#discussion_r1446919073


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala:
##########
@@ -62,8 +62,8 @@ class SQLMetric(
 
   override def merge(other: AccumulatorV2[Long, Long]): Unit = other match {
     case o: SQLMetric =>
-      if (o.isValid) {
-        if (!isValid) _value = defaultValidValue
+      if (!o.isZero) {
+        if (isZero) _value = 0

Review Comment:
   This was the issue before - in the case of an optional metric, `defaultValidValue` will be negative. So if we were to add a value to this metric, it would be subtracted by this sentinel value, so we want to set it to `0`.
   
   We want `defaultValidValue` defines what is exposed via `value()`, not where to start counting - that's where I got confused the first time around.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46644] Change add and merge in SQLMetric to use isZero [spark]

Posted by "davintjong-db (via GitHub)" <gi...@apache.org>.
davintjong-db commented on code in PR #44649:
URL: https://github.com/apache/spark/pull/44649#discussion_r1451023537


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/metric/SQLMetrics.scala:
##########
@@ -93,8 +91,8 @@ class SQLMetric(
   def +=(v: Long): Unit = add(v)
 
   // _value may be invalid, in many cases being -1. We should not expose it to the user
-  // and instead return defaultValidValue.
-  override def value: Long = if (!isValid) defaultValidValue else _value
+  // and instead return 0.
+  override def value: Long = if (!isValid) 0 else _value

Review Comment:
   Hm, it seems like tests fail because of a test with a positive initValue. 
   ```
   assert(SQLMetrics.createNanoTimingMetric(sparkContext, name = "m", initValue = 5).value === 5)
   ```
   I don't think we ever use a positive initValue so we could either remove this or stick with the original check `if(_value < 0)`.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org