You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2020/04/08 15:15:41 UTC

[GitHub] [pulsar] shiv4289 opened a new pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

shiv4289 opened a new pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695
 
 
   Fixes #6694 
   
   ### Motivation
   
   If the avro encode for message fails after writing a few bytes, the cursor in the stream is not reset. The following **flush()** that would normally reset the cursor is skipped in the event of an exception.
   
   ### Modifications
   
   Add **flush()** in the finally block.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] sijie commented on a change in pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
sijie commented on a change in pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695#discussion_r407252324
 
 

 ##########
 File path: pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/writer/AvroWriter.java
 ##########
 @@ -47,6 +47,11 @@ public AvroWriter(Schema schema) {
         } catch (Exception e) {
             throw new SchemaSerializationException(e);
         } finally {
+            try {
+                this.encoder.flush();
 
 Review comment:
   (1) seems fine to me.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] skyrocknroll commented on issue #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
skyrocknroll commented on issue #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695#issuecomment-613224840
 
 
   /pulsarbot run-failure-checks

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] codelipenghui commented on issue #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
codelipenghui commented on issue #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695#issuecomment-613252866
 
 
   Ping @sijie please take a look again.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] sijie commented on a change in pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
sijie commented on a change in pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695#discussion_r405687773
 
 

 ##########
 File path: pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/writer/AvroWriter.java
 ##########
 @@ -47,6 +47,11 @@ public AvroWriter(Schema schema) {
         } catch (Exception e) {
             throw new SchemaSerializationException(e);
         } finally {
+            try {
+                this.encoder.flush();
 
 Review comment:
   If this encoder is already flushed, shall we skip flushing it again?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] shiv4289 commented on a change in pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
shiv4289 commented on a change in pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695#discussion_r407197590
 
 

 ##########
 File path: pulsar-client/src/main/java/org/apache/pulsar/client/impl/schema/writer/AvroWriter.java
 ##########
 @@ -47,6 +47,11 @@ public AvroWriter(Schema schema) {
         } catch (Exception e) {
             throw new SchemaSerializationException(e);
         } finally {
+            try {
+                this.encoder.flush();
 
 Review comment:
   Hi @sijie  thanks for reviewing this promptly.
   
   Below are two easy (or minimal code change) alternate option  I see to skip flushing again. Just cosmetic difference. I prefer 1 by the way.
   
   1 .
   ```
       public synchronized byte[] write(T message) {
           byte[] outputBytes = null;
           try {
               writer.write(message, this.encoder);
               outputBytes = this.byteArrayOutputStream.toByteArray();
           } catch (IOException e) {
               throw new SchemaSerializationException(e);
           } finally {
               try {
                   this.encoder.flush();
               } catch (IOException ex) {
                   throw new SchemaSerializationException(e);
               }
               this.byteArrayOutputStream.reset();
           }
           return outputBytes;
       }
   ```
   
   And 2. 
   
   ```
   public synchronized byte[] write(T message) {
       try {
           writer.write(message, this.encoder);
           this.encoder.flush();
           return this.byteArrayOutputStream.toByteArray();
       } catch (IOException e) {
           try {
               this.encoder.flush();
           } catch (IOException ex) {
               throw new SchemaSerializationException(e);
           }
           throw new SchemaSerializationException(e);
       } finally {
           this.byteArrayOutputStream.reset();
       }
   }
   
   ```
   Regarding your suggestions on slack:
   
   > Can flush throw exception?
   
   Yes. Infact any method for the encoder throws an IOException!
   
   > If flush also throws exception, then we might need to know where the exception is thrown and whether do we need to call flush.
   
   Flush is [here](https://github.com/apache/avro/blob/9af218d3c060da103868cdc69d7b39fc8d50dcbc/lang/java/avro/src/main/java/org/apache/avro/io/BufferedBinaryEncoder.java#L84) and I want to call it in case of crash because of [this line](https://github.com/apache/avro/blob/9af218d3c060da103868cdc69d7b39fc8d50dcbc/lang/java/avro/src/main/java/org/apache/avro/io/BufferedBinaryEncoder.java#L98). I understand i need a reset to be precise but there is no way to throw the contents of the stream and reset the cursor to 0.
   
   > Alternatively, we can consider adding a flag for indicating if the flush succeeded or not.
   
   The success or failure is in effect separated by a null pointer exception [here](https://github.com/apache/avro/blob/9af218d3c060da103868cdc69d7b39fc8d50dcbc/lang/java/avro/src/main/java/org/apache/avro/specific/SpecificDatumWriter.java#L68). With that, it is still some sort of a hack to catch a null pointer exception and then update a flag. Also, that would require changing the avro library by the way.
   
   On the other hand, to add a flag on the pulsar side in the current class will actually mean updating it in the catch/finally which again means there is not much use to adding a flag. 
   
   Please let me know if (1) is ok. Happy to know if you have any other suggestion(s). 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] shiv4289 commented on issue #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
shiv4289 commented on issue #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695#issuecomment-611355699
 
 
   > According to #6694, it looks easy to reproduce. So, would you please help add a test for this change?
   
   I have a repeating test case. Will convert to junit and add one shortly. Thanks @codelipenghui . 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

[GitHub] [pulsar] sijie merged pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.

Posted by GitBox <gi...@apache.org>.
sijie merged pull request #6695: [issue 6694][AVRO ENCODE] Reset cursor if message encode fails.
URL: https://github.com/apache/pulsar/pull/6695
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services