Posted to reviews@spark.apache.org by "rangadi (via GitHub)" <gi...@apache.org> on 2023/12/08 19:18:51 UTC

[PR] [SPARK-46275][3.4][Cherry-pick] Protobuf: Return null in permissive mode when deserialization fails [spark]

rangadi opened a new pull request, #44265:
URL: https://github.com/apache/spark/pull/44265

   This is a cherry-pick of #44214 into the 3.4 branch.
   
   From the original PR:
   
   ### What changes were proposed in this pull request?
   This updates the behavior of the `from_protobuf()` built-in function when the underlying record fails to deserialize.
   
     * **Current behavior**:
       * By default, it throws an error and the query fails. [This part is not changed in the PR]
       * When `mode` is set to 'PERMISSIVE', it returns a non-null struct with each of the inner fields set to null, e.g. `{ "field_a": null, "field_b": null }`.
          * This is inconvenient for users: they cannot tell whether the nulls come from a malformed record or from an input whose fields are actually null. Checking each field for null in a SQL query is also tedious (imagine a SQL query with a struct that has 10 fields).
   
     * **New behavior**
       * When `mode` is set to 'PERMISSIVE' it simply returns `null`.
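
To make the difference concrete, here is a toy model of the two behaviors in plain Python. It is only a sketch and not Spark code: `parse`, `failfast`, `old_permissive`, `new_permissive`, and the comma-separated "wire format" are hypothetical stand-ins for Protobuf deserialization inside `from_protobuf()`, and the two-field schema is taken from the example above.

```python
from typing import Optional

# Hypothetical schema, borrowed from the example struct above.
FIELDS = ("field_a", "field_b")


def parse(raw: bytes) -> dict:
    """Stand-in for Protobuf deserialization (assumption: 'a,b' text format).

    Raises ValueError on malformed input. An empty token models a field
    that is legitimately null in a well-formed record.
    """
    parts = raw.decode("utf-8").split(",")
    if len(parts) != len(FIELDS):
        raise ValueError("malformed record")
    return {name: (value or None) for name, value in zip(FIELDS, parts)}


def failfast(raw: bytes) -> dict:
    """Default behavior: the error propagates and the query fails."""
    return parse(raw)


def old_permissive(raw: bytes) -> dict:
    """Behavior before this change: a non-null struct with all fields null."""
    try:
        return parse(raw)
    except ValueError:
        return {name: None for name in FIELDS}


def new_permissive(raw: bytes) -> Optional[dict]:
    """Behavior after this change: the whole result is null (None)."""
    try:
        return parse(raw)
    except ValueError:
        return None


valid_with_nulls = b","   # well-formed record whose fields are all null
malformed = b"garbage"    # cannot be deserialized

# Old behavior: the two cases are indistinguishable.
print(old_permissive(valid_with_nulls) == old_permissive(malformed))  # True

# New behavior: a single null check identifies the malformed record.
print(new_permissive(valid_with_nulls))  # {'field_a': None, 'field_b': None}
print(new_permissive(malformed))         # None
```

In the model, the old permissive result for a malformed record equals the result for a valid record with null fields, which is the ambiguity described above. With the new behavior, one null check on the returned value is enough, regardless of how many fields the struct has.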
   
   ### Why are the changes needed?
   This makes it easier for users to detect and handle malformed records.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, but this does not change the contract. In fact, it clarifies it.
   
   ### How was this patch tested?
    - Unit tests are updated.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-46275][3.4] Protobuf: Return null in permissive mode when deserialization fails [spark]

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on PR #44265:
URL: https://github.com/apache/spark/pull/44265#issuecomment-1847932270

   Thank you!




Re: [PR] [SPARK-46275][3.4] Protobuf: Return null in permissive mode when deserialization fails [spark]

Posted by "rangadi (via GitHub)" <gi...@apache.org>.
rangadi commented on PR #44265:
URL: https://github.com/apache/spark/pull/44265#issuecomment-1847769153

   @HyukjinKwon please merge this into 3.4 when you get a chance. 




Re: [PR] [SPARK-46275][3.4] Protobuf: Return null in permissive mode when deserialization fails [spark]

Posted by "dongjoon-hyun (via GitHub)" <gi...@apache.org>.
dongjoon-hyun closed pull request #44265: [SPARK-46275][3.4] Protobuf: Return null in permissive mode when deserialization fails
URL: https://github.com/apache/spark/pull/44265

