Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2021/04/08 13:54:32 UTC

[GitHub] [pulsar] bmsilva opened a new issue #10174: Record.schema classmethod shouldn't sort the schema fields

bmsilva opened a new issue #10174:
URL: https://github.com/apache/pulsar/issues/10174


   Hi,
   
   I ran into this problem when using AvroSchema to read from Pulsar. The writers are implemented in Java, and I am reading from Python with the pulsar-client. I noticed that the Python schema doesn't respect the declared field order and instead sorts the fields alphabetically. Link to the code:
   
   https://github.com/apache/pulsar/blob/3d7a6bae1d4de4eaa399d45b8f59b92b0f090faa/pulsar-client-cpp/python/pulsar/schema/definition.py#L83
   
   This is a problem because the Java writer doesn't sort the fields, so two bugs occur: first, when subscribing to the topic, Pulsar doesn't validate the schema correctly; then you get an error when calling the `value` method on the received message.
   
   My workaround was to reimplement the `schema` method of the `Record` class.
   
   example:
   ```python
   import pulsar.schema as pschema  # alias used throughout this example

   class MyRecord(pschema.Record):
       c = pschema.String(required=False)
       b = pschema.String(required=False)
       a = pschema.String(required=False)
       
       @classmethod
       def schema(cls):
           schema = {
               'name': str(cls.__name__),
               'type': 'record',
               'fields': []
           }
   
           for name in cls._fields.keys():  # removed "sorted" from here
               field = cls._fields[name]
               field_type = field.schema() if field._required else ['null', field.schema()]
               field_schema = {
                   'name': name,
                   'type': field_type
               }
               # also hit a problem with `required=False`: had to add a `default`
               if not field._required:
                   field_schema['default'] = None
               schema['fields'].append(field_schema)
           return schema
   ```
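
To illustrate why the order matters, here is a minimal, self-contained sketch. It is not Pulsar's or the Avro library's code. It relies on the fact that Avro's binary encoding writes record fields in schema order without field names, and that a string is written as a zigzag varint length followed by UTF-8 bytes. The helpers `encode_long`, `decode_long`, `encode_record` and `decode_record` are hypothetical names for this toy, and the field values are made up.

```python
def encode_long(n: int) -> bytes:
    """Zigzag + variable-length encoding, as Avro uses for int/long."""
    z = (n << 1) ^ (n >> 63)
    out = bytearray()
    while z & ~0x7F:
        out.append((z & 0x7F) | 0x80)
        z >>= 7
    out.append(z)
    return bytes(out)


def decode_long(buf: bytes, pos: int):
    """Inverse of encode_long; returns (value, next_position)."""
    shift = 0
    result = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            break
        shift += 7
    return (result >> 1) ^ -(result & 1), pos


def encode_record(field_order, values):
    """Write string fields back to back, in schema order, with no names."""
    out = b""
    for name in field_order:
        data = values[name].encode("utf-8")
        out += encode_long(len(data)) + data
    return out


def decode_record(field_order, buf):
    """Read string fields assuming they were written in field_order."""
    pos = 0
    record = {}
    for name in field_order:
        length, pos = decode_long(buf, pos)
        record[name] = buf[pos:pos + length].decode("utf-8")
        pos += length
    return record


declared = ["c", "b", "a"]  # order used by the writer
values = {"c": "see", "b": "bee", "a": "ay"}
payload = encode_record(declared, values)

good = decode_record(declared, payload)         # same order as the writer
bad = decode_record(sorted(declared), payload)  # alphabetically sorted order
```

With only the sorted schema available, the reader assigns the first encoded value to `a` instead of `c`, so `bad` silently swaps values. With fields of different types, the mismatch would typically surface as a decoding error instead.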


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] eolivelli edited a comment on issue #10174: Record.schema classmethod shouldn't sort the schema fields

Posted by GitBox <gi...@apache.org>.
eolivelli edited a comment on issue #10174:
URL: https://github.com/apache/pulsar/issues/10174#issuecomment-815853968


   I am not sure how it works in Python, but usually this is the problem with Avro: when you read, you have both the writer schema and the reader schema, so Avro can decode the value using the writer schema and apply the content to the reader-side model.
   
   Pulsar stores the writer schema in the internal schema registry.
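
The writer/reader resolution described above can be sketched as follows. This is a simplified illustration, not the Avro library's implementation: values are first decoded in the writer's field order, then matched to the reader's fields by name, with reader-side defaults filling fields the writer did not send. The names `resolve_record` and `NO_DEFAULT` are hypothetical.

```python
NO_DEFAULT = object()  # sentinel: the reader field declares no default


def resolve_record(writer_fields, reader_fields, decoded_values):
    """Map values decoded in writer order onto the reader's fields by name.

    writer_fields:  field names in the writer schema's order
    reader_fields:  (name, default) pairs in the reader schema's order
    decoded_values: values in the same order as writer_fields
    """
    by_name = dict(zip(writer_fields, decoded_values))
    resolved = {}
    for name, default in reader_fields:
        if name in by_name:
            resolved[name] = by_name[name]
        elif default is not NO_DEFAULT:
            resolved[name] = default  # writer lacks the field: use the default
        else:
            raise ValueError(f"no value and no default for field {name!r}")
    return resolved


# Writer declared c, b, a; the reader lists a, b, c plus an extra optional d.
writer_fields = ["c", "b", "a"]
reader_fields = [("a", None), ("b", None), ("c", None), ("d", None)]
resolved = resolve_record(writer_fields, reader_fields, ["see", "bee", "ay"])
```

Because matching is by name, the reader's field order does not affect the result, provided the bytes were decoded with the writer schema in the first place.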





[GitHub] [pulsar] eolivelli commented on issue #10174: Record.schema classmethod shouldn't sort the schema fields

Posted by GitBox <gi...@apache.org>.
eolivelli commented on issue #10174:
URL: https://github.com/apache/pulsar/issues/10174#issuecomment-815853968


   I am not sure how it works in Python, but usually this is the problem with Avro: when you read, you have both the writer schema and the reader schema, so Avro can decode the value using the writer schema and apply the content to the reader-side model.
   
   Pulsar stores the writer schema in the internal schema registry.





[GitHub] [pulsar] codelipenghui commented on issue #10174: Record.schema classmethod shouldn't sort the schema fields

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #10174:
URL: https://github.com/apache/pulsar/issues/10174#issuecomment-1058891567


   The issue had no activity for 30 days; marking it with the Stale label.

