Posted to reviews@spark.apache.org by "allisonwang-db (via GitHub)" <gi...@apache.org> on 2023/10/18 20:18:19 UTC

Re: [PR] [SPARK-45523][Python] Return useful error message if UDTF returns None for any non-nullable column [spark]

allisonwang-db commented on code in PR #43356:
URL: https://github.com/apache/spark/pull/43356#discussion_r1364486288


##########
sql/core/src/test/scala/org/apache/spark/sql/IntegratedUDFTestUtils.scala:
##########
@@ -749,6 +749,363 @@ object IntegratedUDFTestUtils extends SQLHelper {
     val prettyName: String = "Python UDTF whose 'analyze' method sets state and reads it later"
   }
 
+  object TestPythonUDTFInvalidEvalReturnsNoneToNonNullableColumnScalarType extends TestUDTF {
+    val name: String = "TestPythonUDTFInvalidEvalReturnsNoneToNonNullableColumnScalarType"

Review Comment:
   It would be great if we could make this name a bit shorter :) 



##########
python/pyspark/worker.py:
##########
@@ -841,6 +845,63 @@ def _remove_partition_by_exprs(self, arg: Any) -> Any:
             "the query again."
         )
 
+    # Compares each UDTF output row against the output schema for this particular UDTF call,
+    # raising an error if the two are incompatible.
+    def check_output_row_against_schema(row: Any) -> None:

Review Comment:
   @ueshin do you think this will add extra performance overhead if we check this for each output row?
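
For context on the overhead question, here is a minimal sketch of one way a per-row check could be kept cheap. It is not the code from this PR: `Field` and `make_row_checker` are hypothetical names introduced for illustration, the schema is modelled as a plain list rather than a PySpark `StructType`, and only top-level columns are considered. The idea is to compute the non-nullable column positions once per UDTF call, so the per-row work is limited to those positions.

```python
from typing import Any, Callable, List, NamedTuple, Sequence, Tuple


class Field(NamedTuple):
    # Hypothetical stand-in for a schema field; not a PySpark class.
    name: str
    nullable: bool


def make_row_checker(schema: Sequence[Field]) -> Callable[[Sequence[Any]], None]:
    """Build a checker for one UDTF call (illustrative only)."""
    # Done once per call, not once per row: remember which positions reject None.
    non_nullable: List[Tuple[int, str]] = [
        (i, f.name) for i, f in enumerate(schema) if not f.nullable
    ]
    expected_len = len(schema)

    def check(row: Sequence[Any]) -> None:
        # Per-row cost: one length comparison plus one test per non-nullable column.
        if len(row) != expected_len:
            raise ValueError(
                f"expected {expected_len} columns but the row has {len(row)}"
            )
        for i, name in non_nullable:
            if row[i] is None:
                raise ValueError(
                    f"column '{name}' is non-nullable but the row contains None"
                )

    return check


# Usage: build the checker once, then apply it to every output row.
schema = [Field("id", nullable=False), Field("label", nullable=True)]
check = make_row_checker(schema)
check((1, None))  # passes: only "label" is None, and it is nullable
```

Under these assumptions, a schema whose columns are all nullable leaves the inner loop empty, so each row costs only the length comparison. Whether the remaining cost matters for real workloads would still need to be measured.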



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


For additional commands, e-mail: reviews-help@spark.apache.org