Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/08/20 05:38:29 UTC

[GitHub] [pulsar] pouledodue opened a new issue #4982: Pulsar SQL unusable when topic contains malformed message

URL: https://github.com/apache/pulsar/issues/4982
 
 
   **Describe the bug**
   ```
   $ pulsar sql --execute 'select * from pulsar."test/app".topic1' \
    --output-format ALIGNED
   
   Query 20190820_051411_00001_t5ejs failed: Malformed data. Length is negative: -62
   ```
   and, in the Presto server log:
   ```
   2019-08-20T01:14:12.305-0400	ERROR	remote-task-callback-1	com.facebook.presto.execution.StageStateMachine	Stage 20190820_051411_00001_t5ejs.1 failed
   org.apache.avro.AvroRuntimeException: Malformed data. Length is negative: -62
   	at org.apache.avro.io.BinaryDecoder.doReadBytes(BinaryDecoder.java:336)
   	at org.apache.avro.io.BinaryDecoder.readString(BinaryDecoder.java:263)
   	at org.apache.avro.io.ResolvingDecoder.readString(ResolvingDecoder.java:201)
   	at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:422)
   	at org.apache.avro.generic.GenericDatumReader.readString(GenericDatumReader.java:414)
   	at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:181)
   	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
   	at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
   	at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
   	at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
   	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
   	at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
   	at org.apache.pulsar.sql.presto.AvroSchemaHandler.deserialize(AvroSchemaHandler.java:70)
   	at org.apache.pulsar.sql.presto.PulsarRecordCursor.advanceNextPosition(PulsarRecordCursor.java:421)
   	at com.facebook.presto.spi.RecordPageSource.getNextPage(RecordPageSource.java:93)
   	at com.facebook.presto.operator.TableScanOperator.getOutput(TableScanOperator.java:242)
   ```
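
For context, here is a rough sketch of how non-Avro bytes can surface as a negative length. Avro's binary encoding prefixes a string with its length, written as a zigzag-encoded variable-length integer, so a decoder pointed at arbitrary text interprets the leading byte(s) as a length. The helper name `zigzag_decode_varint` is hypothetical (it is not part of Avro or Pulsar), and the sketch assumes the string length is the first value read from the payload, which depends on the topic's schema.

```python
def zigzag_decode_varint(data: bytes) -> int:
    """Decode one zigzag-encoded varint from the start of `data`.

    Illustrative helper only; not an Avro or Pulsar API.
    """
    raw = 0
    shift = 0
    for byte in data:
        # The low 7 bits of each byte carry payload, least significant group first.
        raw |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            # A clear high bit marks the last byte of the varint.
            break
        shift += 7
    else:
        # Loop ended without finding a terminating byte (or data was empty).
        raise ValueError("truncated varint")
    # Zigzag maps 0, 1, 2, 3, ... back to 0, -1, 1, -2, ...
    return (raw >> 1) ^ -(raw & 1)


if __name__ == "__main__":
    # Plain ASCII text read as a length prefix yields an arbitrary,
    # often negative, number.
    print(zigzag_decode_varint(b"{"))
```

Under that assumption, a payload whose first byte is `{` (0x7B) decodes to -62, the same value reported in the error above.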
   
   **To Reproduce**
   Steps to reproduce the behavior:
   1. Create a topic with an Avro schema.
   2. Using the CLI, send a non-Avro message to the topic:
    `pulsar-client produce persistent://test/app/topic1 -m yoshi`  
   3. Query the topic from the CLI:
   `$ pulsar sql --execute 'select * from pulsar."test/app".topic1' \
    --output-format ALIGNED`
   
   **Expected behavior**
   I expect the query to return the correctly formed Avro messages, perhaps with a `<malformed>` marker in place of each malformed message:
   ```
    comment     | __event_time__ |      __publish_time__       | __message_id__ | __sequence_id__ | __producer_name__ | __key__ | __properties__
   -------------+----------------+-----------------------------+----------------+-----------------+-------------------+---------+----------------
    yoshi!      | NULL           | 2019-08-16 23:30:17.405 UTC | (14438,4,0)    |               0 | standalone-7-71   | NULL    | {}
    yoshi!      | NULL           | 2019-08-16 23:15:21.035 UTC | (14438,3,0)    |               0 | standalone-7-70   | NULL    | {}
    <malformed> | NULL           | 2019-08-16 23:14:21.035 UTC | (14438,2,0)    |               0 | standalone-7-70   | NULL    | {}
    yoshi!      | NULL           | 2019-08-16 23:12:21.035 UTC | (14438,1,0)    |               0 | standalone-7-70   | NULL    | {}
   ```
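
The behavior requested above could be sketched roughly as follows: decode each message independently and, when decoding fails, emit a placeholder instead of failing the whole query. This is a language-agnostic illustration in Python, not the Pulsar SQL implementation. The names `decode_tolerantly`, `decode_comment`, and `MALFORMED` are hypothetical, and `decode_comment` is a toy stand-in for an Avro decoder that handles only a single-byte length prefix followed by UTF-8 text.

```python
from typing import Callable, Iterable, Iterator

# Hypothetical placeholder shown in place of a value that could not be decoded.
MALFORMED = "<malformed>"


def decode_comment(payload: bytes) -> str:
    """Toy stand-in for an Avro string decoder (single-byte length prefix only)."""
    if not payload:
        raise ValueError("Malformed data. Empty payload")
    if payload[0] & 0x80:
        raise ValueError("Multi-byte lengths are not supported in this sketch")
    # Zigzag-decode the one-byte length prefix.
    length = (payload[0] >> 1) ^ -(payload[0] & 1)
    if length < 0:
        raise ValueError(f"Malformed data. Length is negative: {length}")
    body = payload[1:1 + length]
    if len(body) != length:
        raise ValueError("Malformed data. Payload is truncated")
    return body.decode("utf-8")


def decode_tolerantly(
    payloads: Iterable[bytes],
    decode: Callable[[bytes], str],
) -> Iterator[str]:
    """Yield one value per payload, substituting MALFORMED on decode errors."""
    for payload in payloads:
        try:
            yield decode(payload)
        except ValueError:
            # One bad message should not abort the scan of the whole topic.
            yield MALFORMED


if __name__ == "__main__":
    topic = [b"\x0cyoshi!", b"yoshi", b"\x0cyoshi!"]
    for value in decode_tolerantly(topic, decode_comment):
        print(value)
```

In the actual connector the equivalent guard would presumably sit around the deserialize call seen in the stack trace and catch the Avro runtime exception; that placement is an assumption, not something confirmed here.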
   
   
