Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/11/15 12:47:14 UTC

[GitHub] [pulsar] gaoran10 opened a new pull request #12811: [Doc] Add doc for using the Avro JSON schema definition in Python client

gaoran10 opened a new pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811


   ### Motivation
   
   Currently, there is no documentation on how to use an Avro JSON schema definition in the Python client.
   
   ### Modifications
   
   Add documentation for using a custom Avro JSON schema definition in the Python client.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] github-actions[bot] commented on pull request #12811: [Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#issuecomment-968876910


   @gaoran10: Thanks for your contribution. For this PR, do we need to update the docs?
   (The [PR template contains info about doc](https://github.com/apache/pulsar/blob/master/.github/PULL_REQUEST_TEMPLATE.md#documentation), which helps others know more about the changes. Can you provide doc-related info in this and future PR descriptions? Thanks)





[GitHub] [pulsar] gaoran10 commented on a change in pull request #12811: [Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
gaoran10 commented on a change in pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#discussion_r749313147



##########
File path: site2/docs/client-libraries-python.md
##########
@@ -224,6 +224,71 @@ while True:
         consumer.negative_acknowledge(msg)
 ```
 
+### Using an Avro JSON schema definition
+
+You can also use an Avro JSON schema definition to create an `AvroSchema`.
+
+Assume that there is an Avro JSON schema definition file, `company.avsc`, that describes a company:
+
+```json
+{
+    "doc": "this is doc",
+    "namespace": "example.avro",
+    "type": "record",
+    "name": "Company",
+    "fields": [
+        {"name": "name", "type": ["null", "string"]},
+        {"name": "address", "type": ["null", "string"]},
+        {"name": "employees", "type": ["null", {"type": "array", "items": {
+            "type": "record",
+            "name": "Employee",
+            "fields": [
+                {"name": "name", "type": ["null", "string"]},
+                {"name": "age", "type": ["null", "int"]}
+            ]
+        }}]},
+        {"name": "labels", "type": ["null", {"type": "map", "values": "string"}]}
+    ]
+}
+```
+
+You can load the schema definition from a file with `avro.schema` or `fastavro.schema`.
+> For details, refer to [load_schema](https://fastavro.readthedocs.io/en/latest/schema.html#fastavro._schema_py.load_schema) or [Avro Schema](http://avro.apache.org/docs/current/gettingstartedpython.html).
+
+If you use a custom JSON schema definition, you need to produce and consume messages as Python dicts, which differs from using a `Record` definition,
+and the `_record_cls` parameter must be `None` when you create the `AvroSchema` object.
+
+```
+schema_definition = load_schema("examples/company.avsc")
+# schema_definition = avro.schema.parse(open("examples/company.avsc", "rb").read()).to_json()

Review comment:
       Good idea! Thanks.







[GitHub] [pulsar] Anonymitaet commented on pull request #12811: [DO NOT MERGE - WIP][Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
Anonymitaet commented on pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#issuecomment-970118240


   I talked with @gaoran10 just now. We are updating the doc, so please do not merge this PR for now.





[GitHub] [pulsar] Anonymitaet edited a comment on pull request #12811: [DO NOT MERGE - WIP][Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
Anonymitaet edited a comment on pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#issuecomment-970118240


   I talked with @gaoran10 just now. We are updating the doc [here](https://docs.google.com/document/d/1guNdoKa2cGvHMSbvt7pBGlhJq2MetqD7VjxO7Ipn9Uw/edit#), so please do not merge this PR for now.





[GitHub] [pulsar] gaoran10 commented on a change in pull request #12811: [Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
gaoran10 commented on a change in pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#discussion_r749313147




Review comment:
       Good idea! Thanks. I'll fix this.







[GitHub] [pulsar] BewareMyPower commented on a change in pull request #12811: [Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
BewareMyPower commented on a change in pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#discussion_r749310620



##########
File path: site2/docs/client-libraries-python.md
##########
@@ -224,6 +224,71 @@ while True:
         consumer.negative_acknowledge(msg)
 ```
 
+### Using an Avro JSON schema definition
+
+You can also use an Avro JSON schema definition to create an `AvroSchema`.
+
+Assume that there is an Avro JSON schema definition file, `company.avsc`, that describes a company:
+
+```json
+{
+    "doc": "this is doc",
+    "namespace": "example.avro",
+    "type": "record",
+    "name": "Company",
+    "fields": [
+        {"name": "name", "type": ["null", "string"]},
+        {"name": "address", "type": ["null", "string"]},
+        {"name": "employees", "type": ["null", {"type": "array", "items": {
+            "type": "record",
+            "name": "Employee",
+            "fields": [
+                {"name": "name", "type": ["null", "string"]},
+                {"name": "age", "type": ["null", "int"]}
+            ]
+        }}]},
+        {"name": "labels", "type": ["null", {"type": "map", "values": "string"}]}
+    ]
+}
+```
+
+You can load the schema definition from a file with `avro.schema` or `fastavro.schema`.
+> For details, refer to [load_schema](https://fastavro.readthedocs.io/en/latest/schema.html#fastavro._schema_py.load_schema) or [Avro Schema](http://avro.apache.org/docs/current/gettingstartedpython.html).
+
+If you use a custom JSON schema definition, you need to produce and consume messages as Python dicts, which differs from using a `Record` definition,
+and the `_record_cls` parameter must be `None` when you create the `AvroSchema` object.
+
+```
+schema_definition = load_schema("examples/company.avsc")
+# schema_definition = avro.schema.parse(open("examples/company.avsc", "rb").read()).to_json()

Review comment:
       I find this code comment confusing. If you want to distinguish `fastavro` from `avro`, you had better add the full imports, like
   
   ```python
   import fastavro
   
   schema_definition = fastavro.schema.load_schema('examples/company.avsc')
   ```
   
   or
   
   ```python
   import avro.schema
   
   schema_definition = avro.schema.parse(open("examples/company.avsc", "rb").read()).to_json()
   ```
   
   It's better to keep only one of them in the example code. Don't put the other one in a code comment.
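
For readers following the thread, here is a minimal sketch of what the proposed doc means by producing and consuming messages as Python dicts. It uses only the standard library rather than `fastavro` or `avro`, and the schema text is copied from the `company.avsc` example in the quoted diff. `conforms` and the sample `company` values are hypothetical illustrations, not part of `pulsar`, `avro`, or `fastavro`; the helper covers only the Avro types that appear in this schema, treats a missing key as `None`, and ignores the 32-bit range of Avro `int`.

```python
import json

# Schema text copied from the `company.avsc` example quoted in the diff.
COMPANY_AVSC = """
{
    "doc": "this is doc",
    "namespace": "example.avro",
    "type": "record",
    "name": "Company",
    "fields": [
        {"name": "name", "type": ["null", "string"]},
        {"name": "address", "type": ["null", "string"]},
        {"name": "employees", "type": ["null", {"type": "array", "items": {
            "type": "record",
            "name": "Employee",
            "fields": [
                {"name": "name", "type": ["null", "string"]},
                {"name": "age", "type": ["null", "int"]}
            ]
        }}]},
        {"name": "labels", "type": ["null", {"type": "map", "values": "string"}]}
    ]
}
"""


def conforms(value, schema):
    """Hypothetical helper: structural check for the Avro subset used above."""
    if isinstance(schema, list):  # union: any branch may match
        return any(conforms(value, branch) for branch in schema)
    if isinstance(schema, str):  # primitive type name
        if schema == "null":
            return value is None
        if schema == "string":
            return isinstance(value, str)
        if schema == "int":
            # bool is a subclass of int in Python, so exclude it explicitly.
            return isinstance(value, int) and not isinstance(value, bool)
        return False
    kind = schema["type"]
    if kind == "record":
        # Simplification: a missing key is treated as None.
        return isinstance(value, dict) and all(
            conforms(value.get(field["name"]), field["type"])
            for field in schema["fields"]
        )
    if kind == "array":
        return isinstance(value, list) and all(
            conforms(item, schema["items"]) for item in value
        )
    if kind == "map":
        return isinstance(value, dict) and all(
            isinstance(key, str) and conforms(item, schema["values"])
            for key, item in value.items()
        )
    return False


# Parsing the file contents yields a plain dict describing the schema.
schema_definition = json.loads(COMPANY_AVSC)

# A message is a plain dict shaped like the record, not a Record subclass.
company = {
    "name": "company-name",
    "address": "company-address",
    "employees": [
        {"name": "user1", "age": 25},
        {"name": "user2", "age": None},
    ],
    "labels": {"industry": "software"},
}

print(conforms(company, schema_definition))
```

A dict of this shape is what the proposed doc describes sending when `AvroSchema` is created with `_record_cls` set to `None`. The Pulsar call itself is not shown because the quoted diff is truncated before it.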







[GitHub] [pulsar] Anonymitaet commented on pull request #12811: [DO NOT MERGE - WIP][Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
Anonymitaet commented on pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811#issuecomment-979592511


   Hi @gaoran10, we've added the docs in #12914, so I'm closing this PR.





[GitHub] [pulsar] Anonymitaet closed pull request #12811: [DO NOT MERGE - WIP][Doc] Add doc for using the Avro JSON schema definition in Python client

Posted by GitBox <gi...@apache.org>.
Anonymitaet closed pull request #12811:
URL: https://github.com/apache/pulsar/pull/12811


   

