You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2019/11/01 19:25:30 UTC

[GitHub] [flink] derjust commented on issue #8371: [FLINK-9679] - Add AvroSerializationSchema

derjust commented on issue #8371: [FLINK-9679] - Add AvroSerializationSchema
URL: https://github.com/apache/flink/pull/8371#issuecomment-548919899
 
 
   I used this PR fairly successful over a week producing ~50 mil messages in total.
   
   There is only one catch in the PR that needs to be fixed: https://github.com/apache/flink/pull/8371/files#diff-c60754c7ce564f4229c22913d783c339R128
   
   ```java
   int schemaId= schemaCoderProvider.get()
   					.writeSchema(getSchema());
   ```
   
   This creates a new http client each time a message is serialized which creates a significant overhead due to establishing the HTTP connections but also never leveraging the schema caching 
   The `schemaCoderProvider.get()` should be cached/kept local after it has been fetched and used for all subsequent `serialize()` calls.
   In fact, it should be done in the same fashion as in the existing source implementation:
   
   ```java
     @Override
     protected void checkAvroInitialized() {
       super.checkAvroInitialized();
       if (schemaCoder == null) {
         this.schemaCoder = schemaCoderProvider.get();
       }
   
   ```
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services