You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "rangadi (via GitHub)" <gi...@apache.org> on 2023/05/29 02:28:40 UTC

[GitHub] [spark] rangadi commented on a diff in pull request #41343: [SPARK-43799][PYTHON] Add descriptor binary option to Pyspark Protobuf API

rangadi commented on code in PR #41343:
URL: https://github.com/apache/spark/pull/41343#discussion_r1208774517


##########
python/pyspark/sql/protobuf/functions.py:
##########
@@ -61,9 +65,11 @@ def from_protobuf(
         The Protobuf class name when descFilePath parameter is not set.
         E.g. `com.example.protos.ExampleEvent`.
     descFilePath : str, optional
-        The protobuf descriptor file.
+        The Protobuf descriptor file.
     options : dict, optional
         options to control how the protobuf record is parsed.
+    binaryDescriptorSet: bytes, optional
+        The Protobuf `FileDescriptorSet` serialized as binary.

Review Comment:
   Good call. Updated.



##########
python/pyspark/sql/protobuf/functions.py:
##########
@@ -232,6 +274,12 @@ def to_protobuf(
     return Column(jc)
 
 
+def _read_descriptor_set_file(filePath: str) -> bytes:
+    # TODO: Throw structured errors like "PROTOBUF_DESCRIPTOR_FILE_NOT_FOUND" etc.

Review Comment:
   Agree. Added jira ticket and updated the comment.
   Can we throw structured error now in Python? In that case, I can actually fix it in this PR itself. 
   Please point to an example. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org