You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@avro.apache.org by "Nandor Kollar (JIRA)" <ji...@apache.org> on 2018/02/19 08:47:03 UTC

[jira] [Commented] (AVRO-2146) getting Expected start-union. Got VALUE_STRING

    [ https://issues.apache.org/jira/browse/AVRO-2146?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16368876#comment-16368876 ] 

Nandor Kollar commented on AVRO-2146:
-------------------------------------

I think your schema includes a union: \{{{"name" : "data_field_value", "type" : ["string"], "doc" : "The value"}}}. Here string is in square brackets, so {{data_field_value}} is a union field (see details in [doc|https://avro.apache.org/docs/1.8.1/spec.html#Unions]).

> getting Expected start-union. Got VALUE_STRING
> ----------------------------------------------
>
>                 Key: AVRO-2146
>                 URL: https://issues.apache.org/jira/browse/AVRO-2146
>             Project: Avro
>          Issue Type: Bug
>    Affects Versions: 1.8.2
>         Environment: error message:
> Exception in thread "main" org.apache.avro.AvroTypeException: Expected start-union. Got VALUE_STRING
>  at org.apache.avro.io.JsonDecoder.error(JsonDecoder.java:698)
>  at org.apache.avro.io.JsonDecoder.readIndex(JsonDecoder.java:441)
>  at org.apache.avro.io.ResolvingDecoder.doAction(ResolvingDecoder.java:290)
>  at org.apache.avro.io.parsing.Parser.advance(Parser.java:88)
>  at org.apache.avro.io.ResolvingDecoder.readIndex(ResolvingDecoder.java:267)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:179)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
>  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.readField(GenericDatumReader.java:232)
>  at org.apache.avro.generic.GenericDatumReader.readRecord(GenericDatumReader.java:222)
>  at org.apache.avro.generic.GenericDatumReader.readWithoutConversion(GenericDatumReader.java:175)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:153)
>  at org.apache.avro.generic.GenericDatumReader.read(GenericDatumReader.java:145)
>  at myJson2Avro.fromJasonToAvro(myJson2Avro.java:81)
>  at myJson2Avro.main(myJson2Avro.java:48)
>            Reporter: laki
>            Priority: Major
>
> Here is the schema, no unions, but getting union error :
>  
> {
>  "type" : "record",
>  "name" : "edm_generic_publisher_avro_schema",
>  "namespace" : "edm.avro",
>  "doc" : "The generic avro schema used by publishers to publish events to the enterprise streaming service",
>  "fields" : [
> {"name" : "event", 
>  "type" : {
>  "type" : "record",
>  "name" : "event_meta_data",
>  "fields" : [
> {"name" : "event_name", "type" : "string", "doc" : "The name of the event. In the CDC, this field is populated with the name of the data base table or segment."}
> ,
> {"name" : "operation_type", "type" : "string", "doc": "The operation or action that triggered the event. e.g., Insert, Update, Delete, etc."}
> ,
> {"name" : "transaction_identifier", "type" : "string", "default" : "NONE", "doc" : "A unique identifier that identifies a unit or work or transaction. Useful in relating multiple events together."}
> ,
> {"name" : "event_publication_timestamp_millis", "type" : "string", "doc": "timestamp when the event was published"}
> ,
>  
> {"name" : "event_publisher", "type" : "string", "doc" : "The system or application that published the event"}
> ,
>  
> {"name" : "event_publisher_identity", "type": "string", "default" : "NONE", "doc": "The identity (user) of the system or application that published the event"}
> ,
>  
> {"name" : "event_timestamp_millis", "type" : "string", "default" : "NONE", "doc": "timestamp when the event occured"}
> ,
>  
> {"name": "event_initiator", "type": "string", "default" : "NONE", "doc" : "The system or application that initiated the event"}
> ,
> {"name": "event_initiator_identity", "type" : "string", "default" : "NONE", "doc": "The system id or application id that initiated the event" }
> ]},
>  "doc" : "The data about the published event"
>  },
>  { "name" : "contents",
>  "type" : {
>  "name": "data_field_groups",
>  "type": "array",
>  "items": {
>  "type": "record",
>  "name": "data_field_group",
>  "fields" : [
> {"name": "data_group_name", "type": "string" }
> ,
>  {
>  "name": "data_fields",
>  "type": {
>  "type": "array",
>  "items": {
>  "name": "data_field",
>  "type": "record",
>  "fields":[
> {"name" : "data_field_name", "type" : "string", "doc" : "The field name"}
> ,
>  
> {"name": "data_field_type", "type": "string", "doc" : "The data type is one of the following values: string, boolean, int, long, float, double or bytes"}
> ,
> {"name" : "data_field_value", "type" : ["string"], "doc" : "The value"}
> ]
>  }
> }
> }
>  ]
>  }
>  },
>  "doc" : "The datafields for the for the published event"
>  } 
>  ]
>  }
> ;
>  
>  
> here is the code that is causing the issue-------------------------------
>  
> static byte[] fromJasonToAvro( InputStream json, Schema schemastr) throws Exception {
>  FileOutputStream outStream;
> InputStream input = json; 
>  DataInputStream din = new DataInputStream(input);
> Schema schema = schemastr;
> Decoder decoder = DecoderFactory.get().jsonDecoder(schema, din);
> DatumReader<Object> reader = new GenericDatumReader<Object>(schema);
>  Object datum = reader.read(null, decoder);
> GenericDatumWriter<Object> w = new GenericDatumWriter<Object>(schema);
>  ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
>  
> Encoder e = EncoderFactory.get().binaryEncoder(outputStream, null);
> w.write(datum, e);
>  e.flush();
>  
>  
> -----------------------------------------------------
>  
> json file:
>  
> {
>  "event": {
>  "event_name": "Customer Phone Number",
>  "operation_type": "Add",
>  "transaction_identifier": "1234567890",
>  "event_publication_timestamp_millis": "1518464452915",
>  "event_publisher": "CIS",
>  "event_publisher_identity": "System",
>  "event_timestamp_millis": "NONE",
>  "event_initiator": "NONE",
>  "event_initiator_identity": "NONE"
>  },
>  "content": [
>  {
>  "data_group_name": "customer_identifier",
>  "data_fields": [
>  {
>  "data_field_name": "company_number",
>  "data_field_type": "string",
>  "data_field_value": "069"
>  },
>  {
>  "data_field_name": "customer_short_name",
>  "data_field_type": "string",
>  "data_field_value": "marshben"
>  },
>  {
>  "data_field_name": "tie_breaker",
>  "data_field_type": "string",
>  "data_field_value": "01"
>  }
>  ]
>  },
>  {
>  "data_group_name": "customer_phone_number",
>  "data_fields": [
>  {
>  "data_field_name": "phone_number",
>  "data_field_type": "string",
>  "data_field_value": "6x2-555-1x12"
>  },
>  {
>  "data_field_name": "phone_type",
>  "data_field_type": "string",
>  "data_field_value": "cell"
>  },
>  {
>  "data_field_name": "primary_indicator",
>  "data_field_type": "string",
>  "data_field_value": "yes"
>  }
>  ]
>  }
>  
>  ]
> }
>  
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)