You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/01 07:36:26 UTC

[GitHub] [spark] zhengruifeng opened a new pull request, #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

zhengruifeng opened a new pull request, #37752:
URL: https://github.com/apache/spark/pull/37752

   ### What changes were proposed in this pull request?
   compared with the scala side, some parameter validations were missing in `pyspark.rdd`
   
   
   ### Why are the changes needed?
   add missing parameter validations in `pyspark.rdd`
   
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   existing testsutes
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng closed pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

Posted by GitBox <gi...@apache.org>.
zhengruifeng closed pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd
URL: https://github.com/apache/spark/pull/37752


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37752:
URL: https://github.com/apache/spark/pull/37752#issuecomment-1236059439

   > Thanks!
   > 
   > Shall we mention that ValueError will be raised for invalid inputs in the`Does this PR introduce any user-facing change` of PR description?
   
   sure, thanks for pointing out it


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on a diff in pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on code in PR #37752:
URL: https://github.com/apache/spark/pull/37752#discussion_r961493174


##########
python/pyspark/rdd.py:
##########
@@ -1039,7 +1039,8 @@ def sample(
         >>> 6 <= rdd.sample(False, 0.1, 81).count() <= 14
         True
         """
-        assert fraction >= 0.0, "Negative fraction value: %s" % fraction
+        if not fraction >= 0:
+            raise ValueError("Fraction must be nonnegative.")

Review Comment:
   `AssertionError` is replaced with a `ValueError`, but I think it maybe trivial to add it to migration-doc



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] xinrong-meng commented on pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

Posted by GitBox <gi...@apache.org>.
xinrong-meng commented on PR #37752:
URL: https://github.com/apache/spark/pull/37752#issuecomment-1235716391

   Thanks! 
   
   Shall we mention that ValueError will be raised for invalid inputs in the`Does this PR introduce any user-facing change` of PR description? 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] Yikun commented on a diff in pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

Posted by GitBox <gi...@apache.org>.
Yikun commented on code in PR #37752:
URL: https://github.com/apache/spark/pull/37752#discussion_r961409260


##########
python/pyspark/rdd.py:
##########
@@ -1039,7 +1039,8 @@ def sample(
         >>> 6 <= rdd.sample(False, 0.1, 81).count() <= 14
         True
         """
-        assert fraction >= 0.0, "Negative fraction value: %s" % fraction
+        if not fraction >= 0:
+            raise ValueError("Fraction must be nonnegative.")

Review Comment:
   Change `assert` to ValueError is right definately, I believe all `assert` in main code (espacailly, use when validation params) should be fix (test mode is okay)
   
   [1] https://mail.python.org/pipermail/python-list/2013-November/810940.html
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhengruifeng commented on pull request #37752: [SPARK-40301][PYTHON] Add parameter validations in pyspark.rdd

Posted by GitBox <gi...@apache.org>.
zhengruifeng commented on PR #37752:
URL: https://github.com/apache/spark/pull/37752#issuecomment-1236270173

   Merged into master, thank you @xinrong-meng @Yikun for reivews!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org