Posted to user@avro.apache.org by kant kodali <ka...@gmail.com> on 2018/08/24 01:09:17 UTC

GenericRecord.toString produces invalid JSON for logical types

Hi All,

I have serialized Avro binary data represented as a byte[], where one of
the fields is a long with the logical type timestamp-millis.

    Timestamp tsp = new Timestamp(1530228588182L);
    Schema schema = SchemaBuilder.builder()
            .record("hello")
            .fields()
            .name("tsp").type(LogicalTypes.timestampMillis().addToSchema(Schema.create(Schema.Type.LONG))).noDefault()
            .endRecord();
    System.out.println(schema.toString());

    GenericRecord genericRecord = new GenericData.Record(schema);
    genericRecord.put("tsp", tsp.getTime());


I serialized the generic record above to a byte[] and tried the two methods
below to deserialize it, but both of them produce invalid JSON.

public static GenericRecord deserialize(final Schema schema, byte[] data) throws IOException {
    final GenericData genericData = new GenericData();
    genericData.addLogicalTypeConversion(new TimeConversions.TimestampConversion());
    genericData.addLogicalTypeConversion(new TimeConversions.TimeConversion());
    try (final InputStream is = new ByteArrayInputStream(data)) {
        final Decoder decoder = DecoderFactory.get().binaryDecoder(is, null);
        final DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema, schema, genericData);
        return reader.read(null, decoder);
    }
}

This produces {"tsp": 2018-06-28T23:29:48.182Z}, which is not valid JSON.

So I also tried the following:

public static GenericRecord deserialize(final Schema schema, byte[] data) throws IOException {
    final GenericData genericData = new GenericData() {
        @Override
        public String toString(Object datum) {
            StringBuilder buffer = new StringBuilder();
            // Since these types are not quoted and produce a malformed JSON string, quote it here.
            if (datum instanceof java.sql.Timestamp || datum instanceof java.sql.Time || datum instanceof java.sql.Date) {
                return buffer.append("\"").append(datum).append("\"").toString();
            }
            return super.toString(datum);
        }
    };
    genericData.addLogicalTypeConversion(new TimeConversions.TimestampConversion());
    genericData.addLogicalTypeConversion(new TimeConversions.TimeConversion());
    try (final InputStream is = new ByteArrayInputStream(data)) {
        final Decoder decoder = DecoderFactory.get().binaryDecoder(is, null);
        final DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema, schema, genericData);
        return reader.read(null, decoder);
    }
}


I still get {"tsp": 2018-06-28T23:29:48.182Z}, which is not valid JSON.

Expected output: {"tsp": "2018-06-28T23:29:48.182Z"}

Any ideas?

Thanks!
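
For reference, the expected output above can be reproduced with the JDK
alone by converting the raw long to an ISO-8601 string and quoting it by
hand. This is only a minimal sketch of the target format, assuming the
field holds epoch milliseconds. The names TimestampJson and toJson are
hypothetical, and no Avro API is involved.

```java
import java.time.Instant;

public class TimestampJson {
    // Hypothetical helper: renders one timestamp-millis field as a JSON
    // object whose value is a quoted ISO-8601 string.
    // Assumes fieldName needs no JSON escaping.
    static String toJson(String fieldName, long epochMillis) {
        String iso = Instant.ofEpochMilli(epochMillis).toString();
        return "{\"" + fieldName + "\": \"" + iso + "\"}";
    }

    public static void main(String[] args) {
        // Same millisecond value as in the message above.
        System.out.println(toJson("tsp", 1530228588182L));
        // prints {"tsp": "2018-06-28T23:29:48.182Z"}
    }
}
```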

Re: GenericRecord.toString produces invalid JSON for logical types

Posted by kant kodali <ka...@gmail.com>.
Hi Dan,

I modified my function a bit (pasted below). Control never seems to enter
my overridden toString method: the print statement inside it never runs,
and nothing shows up in my stdout. Any ideas? I am using Avro 1.8.2.

Thanks!

public static GenericRecord deserializeWithLogicalType(final Schema schema, byte[] data) throws IOException {
    final GenericData genericData = new GenericData() {
        @Override
        public String toString(Object datum) {
            System.out.println("I got invoked");
            // Since these types are not quoted and produce a malformed JSON string, quote it here.
            return "\"" + super.toString(datum) + "\"";
        }
    };
    genericData.addLogicalTypeConversion(new TimeConversions.TimestampConversion());
    genericData.addLogicalTypeConversion(new TimeConversions.TimeConversion());
    try (final InputStream is = new ByteArrayInputStream(data)) {
        final Decoder decoder = DecoderFactory.get().binaryDecoder(is, null);
        final DatumReader<GenericRecord> reader = new GenericDatumReader<>(schema, schema, genericData);
        return reader.read(null, decoder);
    }
}


On Thu, Aug 23, 2018 at 7:52 PM Dan Schmitt <da...@gmail.com> wrote:

> On Thu, Aug 23, 2018 at 9:09 PM, kant kodali <ka...@gmail.com> wrote:
> >                 // Since these types are not quoted and produce a malformed JSON string, quote it here.
> >                 if (datum instanceof java.sql.Timestamp || datum instanceof java.sql.Time || datum instanceof java.sql.Date) {
> >                     return buffer.append("\"").append(datum).append("\"").toString();
> >                 }
> >                 return super.toString(datum);
>
> I suspect datum is always a Long, but you could check by adding logging,
> something like
> Class c = datum.getClass(); System.out.println("Saw class:"+c.getName());
> before the if (you might get lucky and have more info about the
> LogicalType). If it does work the way you are hoping, I bet the class is
> more likely org.joda.time.DateTime than one of the java.sql.* types you
> are looking for:
>
> https://avro.apache.org/docs/1.8.2/api/java/org/apache/avro/data/TimeConversions.TimestampConversion.html
>
> Note: if you do find that datum offers a nice way to tell it's the
> logical type, you likely want to .append(super.toString(datum)) between
> the quotes instead of appending datum directly after the first double
> quote.
>

Re: GenericRecord.toString produces invalid JSON for logical types

Posted by Dan Schmitt <da...@gmail.com>.
On Thu, Aug 23, 2018 at 9:09 PM, kant kodali <ka...@gmail.com> wrote:
>                 // Since these types are not quoted and produce a malformed JSON string, quote it here.
>                 if (datum instanceof java.sql.Timestamp || datum instanceof java.sql.Time || datum instanceof java.sql.Date) {
>                     return buffer.append("\"").append(datum).append("\"").toString();
>                 }
>                 return super.toString(datum);

I suspect datum is always a Long, but you could check by adding logging,
something like
Class c = datum.getClass(); System.out.println("Saw class:"+c.getName());
before the if (you might get lucky and have more info about the
LogicalType). If it does work the way you are hoping, I bet the class is
more likely org.joda.time.DateTime than one of the java.sql.* types you
are looking for:

https://avro.apache.org/docs/1.8.2/api/java/org/apache/avro/data/TimeConversions.TimestampConversion.html

Note: if you do find that datum offers a nice way to tell it's the
logical type, you likely want to .append(super.toString(datum)) between
the quotes instead of appending datum directly after the first double
quote.
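
The check-then-quote idea suggested above can be sketched with plain JDK
types. This is an illustration under stated assumptions, not Avro code:
render stands in for whatever hook ends up receiving each value, and
QuoteLogicalValues is a hypothetical name. It logs the runtime class (the
diagnostic proposed above), leaves numbers and booleans bare, and quotes
everything else.

```java
import java.time.Instant;

public class QuoteLogicalValues {
    // Hypothetical stand-in for a per-value rendering hook; not an Avro API.
    static String render(Object datum) {
        // Diagnostic from the reply: report which class actually arrives.
        System.err.println("Saw class: " + datum.getClass().getName());
        if (datum instanceof Number || datum instanceof Boolean) {
            // Numbers and booleans are valid JSON without quotes.
            return String.valueOf(datum);
        }
        // Anything else (for example a converted timestamp object) is
        // emitted as a JSON string, escaping backslashes and quotes.
        String text = String.valueOf(datum)
                .replace("\\", "\\\\")
                .replace("\"", "\\\"");
        return "\"" + text + "\"";
    }

    public static void main(String[] args) {
        System.out.println(render(1530228588182L));
        System.out.println(render(Instant.ofEpochMilli(1530228588182L)));
    }
}
```

If the logged class turns out to be Long, no conversion was applied to the
value and there is nothing to quote at this level.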