Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/08/05 05:12:58 UTC

[GitHub] [pulsar] bigbang489 opened a new issue #7752: [Pulsar Client] Is there a way to specify a schema version instead of uploading SchemaInfo to broker?

bigbang489 opened a new issue #7752:
URL: https://github.com/apache/pulsar/issues/7752


   **Is your enhancement request related to a problem? Please describe.**
   When a consumer/producer connects to a broker, it has to upload the schema info (including the JSON Schema) to the broker so that schema compatibility can be checked. If the JSON Schema is large, this becomes inefficient. I know it is only uploaded when the consumer/producer connects to the broker, but in my scenario the client needs to connect and close the connection every time it consumes/produces a message. Is it better if we just specify the schema version instead of the whole SchemaInfo?
   
   **Describe the solution you'd like**
   If the schema version has already been registered, the client only needs to specify the schema version of the messages it is going to consume/produce.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [pulsar] bigbang489 edited a comment on issue #7752: [Pulsar Client] Is there a way to specify a schema version instead of uploading SchemaInfo to broker?

Posted by GitBox <gi...@apache.org>.
bigbang489 edited a comment on issue #7752:
URL: https://github.com/apache/pulsar/issues/7752#issuecomment-670458320


   We are building an adapter to integrate Pulsar with an SAP Integration system. The Sender adapter will consume messages from a topic, and the Receiver adapter will produce messages to a topic. The Receiver adapter is not called at any predictable time. If it is called only once a month, keeping a connection open all that time just to send one message is not a good idea. That is why we decided to close the connection and reconnect to the broker every time.
   Btw, is there any way to tell a Pulsar client (one that has producers only) to close its connection if there has been no message to publish for some amount of time, and to reconnect to the brokers when it needs to send a message again?
   Regarding this point:
   
   > If we only send the schema version, there is no way for us to verify if the client is using the right schema. Because a client can just provide a random schema number and produce the data with a different schema.
   
   There is no guarantee that the encoded data sent to brokers uses the correct schema when clients use their own custom Schema implementation. I think skipping the SchemaInfo check makes sense if the client itself can ensure the correctness of the schema version. This would not only make connection establishment faster, but also allow sending messages with different schema versions without creating a new producer for each version.
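
The idle-close and reconnect-on-demand behaviour asked about above can be approximated at the application level. The following is only a minimal sketch and does not use any Pulsar API: `IdleClosingSender`, `FakeConnection`, and the `connect` factory are hypothetical names introduced for illustration. In a real adapter, `connect` would presumably create the client and producer, and `close_if_idle` would be called from a periodic timer.

```python
import time


class FakeConnection:
    """Stand-in for a real client/producer pair (hypothetical, for illustration)."""

    def __init__(self):
        self.sent = []
        self.closed = False

    def send(self, msg):
        self.sent.append(msg)

    def close(self):
        self.closed = True


class IdleClosingSender:
    """Opens a connection lazily and closes it after a period of inactivity."""

    def __init__(self, connect, idle_timeout_s, clock=time.monotonic):
        self._connect = connect            # factory returning an object with send()/close()
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock                # injectable so the behaviour can be tested
        self._conn = None
        self._last_used = 0.0

    def send(self, msg):
        if self._conn is None:
            # Reconnect on demand; this is where the connection cost is paid again.
            self._conn = self._connect()
        self._conn.send(msg)
        self._last_used = self._clock()

    def close_if_idle(self):
        """Close the connection if it has been unused for at least the timeout."""
        if self._conn is None:
            return False
        if self._clock() - self._last_used >= self._idle_timeout_s:
            self._conn.close()
            self._conn = None
            return True
        return False


if __name__ == "__main__":
    sender = IdleClosingSender(FakeConnection, idle_timeout_s=60.0)
    sender.send("hello")
    print("closed right after sending:", sender.close_if_idle())
```

The trade-off is the one discussed in this thread: every reconnect repeats the connection setup, including the schema upload, so a longer idle timeout means fewer of those round trips.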





[GitHub] [pulsar] sijie commented on issue #7752: [Pulsar Client] Is there a way to specify a schema version instead of uploading SchemaInfo to broker?

Posted by GitBox <gi...@apache.org>.
sijie commented on issue #7752:
URL: https://github.com/apache/pulsar/issues/7752#issuecomment-669633854


   > Is it better if we just specify the schema version instead of whole SchemaInfo?
   
   We send the schema definition when a client connects to a broker, so the broker can verify the compatibility of a schema. If we only send the schema version, there is no way for us to verify if the client is using the right schema. Because a client can just provide a random schema number and produce the data with a different schema.
   
   > but in my scenario, the client needs to connect and close connection everytime it consume/produce message. 
   
   Can you explain a bit more about your use case? I would like to better understand why you need to connect and close the connection every time the client consumes or produces a message. This is an anti-pattern when using Pulsar.
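
The verification argument above can be illustrated with a toy model. This is not Pulsar's broker code: `ToySchemaRegistry` and its methods are hypothetical, and an exact string match stands in for a real compatibility check. The sketch only shows that a version number proves the version exists, while a definition lets the registry compare what the client actually declares.

```python
class ToySchemaRegistry:
    """Toy registry: the list index is the schema version (illustration only)."""

    def __init__(self):
        self._definitions = []

    def register(self, definition):
        """Store a definition and return its version, reusing an existing one."""
        if definition in self._definitions:
            return self._definitions.index(definition)
        self._definitions.append(definition)
        return len(self._definitions) - 1

    def check_by_version(self, version):
        """Version-only check: it can only confirm that the version exists."""
        return 0 <= version < len(self._definitions)

    def find_version(self, definition):
        """Definition check: return the matching version, or None if unknown."""
        if definition in self._definitions:
            return self._definitions.index(definition)
        return None


if __name__ == "__main__":
    registry = ToySchemaRegistry()
    registered = '{"fields": ["id", "name"]}'
    version = registry.register(registered)

    # What a misbehaving client really encodes with; the registry never sees it
    # when only a version number is sent.
    actually_used = '{"fields": ["id"]}'

    print("version-only check passes:", registry.check_by_version(version))
    print("definition check result:", registry.find_version(actually_used))
```

In the version-only path the mismatch goes unnoticed because the registry never receives the definition the client uses. The counterpoint raised in this thread still applies: a client with a custom Schema implementation can declare one definition and encode with another.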
   




