Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/19 09:17:16 UTC

[GitHub] [spark] cloud-fan commented on pull request #37634: [SPARK-40199][SQL] Provide useful error when projecting a non-null column encounters null value

cloud-fan commented on PR #37634:
URL: https://github.com/apache/spark/pull/37634#issuecomment-1250772044

   Spark trusts data nullability in many places (expressions, projection generators, optimizer rules, etc.). It would take a lot of effort to improve the error messages in all of these places for data that does not match its declared nullability. We'd better pick a clear scope here.
   
   AFAIK, the common sources of mismatch are data sources and UDFs. We can focus on these two cases only.
   
   For data sources, we can add a Filter node above the data source relation to apply a null check, using the existing `AssertNotNull` expression. For UDFs, we can wrap the UDF expression with `AssertNotNull` to do the null check as well.
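
To make the two proposed check points concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code: `assert_not_null`, `checked_scan`, `checked_udf`, and `NullInNonNullableError` are hypothetical names made up for illustration, and the real change would presumably use Catalyst's `AssertNotNull` expression inside the query plan instead.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List


class NullInNonNullableError(ValueError):
    """Raised when a value declared non-nullable turns out to be null."""


def assert_not_null(value: Any, origin: str) -> Any:
    # Hypothetical stand-in for a not-null assertion: return the value
    # unchanged, or fail with a message that names where the null came from.
    if value is None:
        raise NullInNonNullableError(f"null value found in non-nullable {origin}")
    return value


def checked_scan(
    rows: Iterable[Dict[str, Any]],
    non_nullable: List[str],
    source_name: str,
) -> Iterator[Dict[str, Any]]:
    # Toy analogue of placing a null check directly above a data source:
    # every row is validated before anything downstream can see it.
    for i, row in enumerate(rows):
        for col in non_nullable:
            assert_not_null(
                row.get(col),
                f"column '{col}' of data source '{source_name}' (row {i})",
            )
        yield row


def checked_udf(fn: Callable[..., Any], name: str) -> Callable[..., Any]:
    # Toy analogue of wrapping a UDF whose result is declared non-nullable.
    def wrapper(*args: Any) -> Any:
        return assert_not_null(fn(*args), f"result of UDF '{name}'")

    return wrapper
```

In this sketch the check runs once at each boundary where untrusted values enter, so the code downstream can keep trusting the declared nullability, and the error names the data source or UDF that broke the contract.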


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

