Posted to issues@nifi.apache.org by "Matt Burgess (Jira)" <ji...@apache.org> on 2021/07/08 22:04:00 UTC

[jira] [Commented] (NIFI-8716) Avro Schema with Array Root Results in AvroRuntimeException

    [ https://issues.apache.org/jira/browse/NIFI-8716?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17377610#comment-17377610 ] 

Matt Burgess commented on NIFI-8716:
------------------------------------

I don't think it is possible to define a single record as a union without having at least one field of the desired union type. For your use case you may need a scripted processor (ExecuteScript, InvokeScriptedProcessor, etc.) that can tell the objects apart and either route them accordingly or set an attribute you can use downstream to pick the correct schema.
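
As a rough illustration of the "tell the objects apart" step only, here is a minimal sketch in plain Python. It is not NiFi scripting API code: the function name pick_schema_name and the "unknown" fallback are hypothetical, and the field names are copied from the schema quoted in the issue below. It assumes each incoming document is a single JSON object whose keys exactly match one of the two records.

```python
import json

# Field names copied from the two records in the quoted schema.
EVENT_FIELDS = {"eventAttr1", "eventAttr2", "eventArrayLines"}
LINE_FIELDS = {"lineAttr1", "lineAttr2"}


def pick_schema_name(document: str) -> str:
    """Return the record name whose fields match the JSON object.

    Hypothetical helper: returns "unknown" when nothing matches, so the
    caller can route such documents to a failure path.
    """
    obj = json.loads(document)
    if not isinstance(obj, dict):
        return "unknown"
    keys = set(obj)
    if keys == EVENT_FIELDS:
        return "MyEvent"
    if keys == LINE_FIELDS:
        return "MyArrayLine"
    return "unknown"
```

Inside a scripted processor, the returned name could be written to a flow file attribute (for example schema.name, assuming the writer looks the schema up by name) so that each object is written with a plain record schema instead of the union.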

> Avro Schema with Array Root Results in AvroRuntimeException
> -----------------------------------------------------------
>
>                 Key: NIFI-8716
>                 URL: https://issues.apache.org/jira/browse/NIFI-8716
>             Project: Apache NiFi
>          Issue Type: Bug
>    Affects Versions: 1.13.2
>         Environment: Openshift 3.11 (using official unofficial docker image)
>            Reporter: Jan Malcomess
>            Priority: Major
>         Attachments: image-2021-06-19-11-38-45-420.png
>
>
> I have a flow set up with a HandleHttpRequest processor that accepts a JSON document and forwards it to a PublishKafkaRecord_2_6 processor. That processor has a JsonTreeReader with the "inferSchema" setting and an AvroRecordSetWriter set up to read the schema from our Confluent Registry.
> When the input schema is of the 'array'-form, i.e.
> [
>  {
>  "name": "MyArrayLine",
>  "type": "record",
>  "namespace": "com.foo.bar",
>  "fields": [
>  {
>  "name": "lineAttr1",
>  "type": "string"
>  },
>  {
>  "name": "lineAttr2",
>  "type": "string"
>  }
>  ]
>  },
>  {
>  "name": "MyEvent",
>  "type": "record", 
>  "namespace": "com.foo.bar",
>  "fields": [
>  {
>  "name": "eventAttr1",
>  "type": "string"
>  },
>  {
>  "name": "eventAttr2",
>  "type": "string"
>  },
>  {
>  "name": "eventArrayLines",
>  "type": {
>  "type": "array",
>  "items": "MyArrayLine"
>  }
>  }
>  ]
>  }
> ]
> NiFi throws an AvroRuntimeException when loading the schema from the registry, with the following stack trace:
> Caused by: org.apache.avro.AvroRuntimeException: Not a named type: [...actual schema here...]
>  at org.apache.avro.Schema.getNamespace(Schema.java:258)
>  at org.apache.nifi.avro.AvroTypeUtil.createSchema(AvroTypeUtil.java:473)
>  at org.apache.nifi.confluent.schemaregistry.client.RestSchemaRegistryClient.createRecordSchema(RestSchemaRegistryClient.java:154)
>  at org.apache.nifi.confluent.schemaregistry.client.RestSchemaRegistryClient.getSchema(RestSchemaRegistryClient.java:99)
>  at com.github.benmanes.caffeine.cache.LocalLoadingCache.lambda$newMappingFunction$2(LocalLoadingCache.java:141)
>  at com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2380)
>  at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
>  at com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2378)
>  at com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2361)
>  at com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
>  at com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:54)
>  at org.apache.nifi.confluent.schemaregistry.client.CachingSchemaRegistryClient.getSchema(CachingSchemaRegistryClient.java:54)
>  at org.apache.nifi.confluent.schemaregistry.ConfluentSchemaRegistry.retrieveSchemaByName(ConfluentSchemaRegistry.java:250)
>  at org.apache.nifi.confluent.schemaregistry.ConfluentSchemaRegistry.retrieveSchema(ConfluentSchemaRegistry.java:269)
>  at sun.reflect.GeneratedMethodAccessor217.invoke(Unknown Source)
>  at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  at java.lang.reflect.Method.invoke(Method.java:498)
>  at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:254)
>  at org.apache.nifi.controller.service.StandardControllerServiceInvocationHandler.invoke(StandardControllerServiceInvocationHandler.java:105)
>  at com.sun.proxy.$Proxy123.retrieveSchema(Unknown Source)
>  at org.apache.nifi.schema.access.SchemaNamePropertyStrategy.getSchema(SchemaNamePropertyStrategy.java:81)
>  ... 24 common frames omitted
>  
> I guess that the code just assumes that a valid schema always has a named type at its root, because the code in AvroTypeUtil in line 473 is this:
> !image-2021-06-19-11-38-45-420.png!
> Please look into this issue, because a schema having an array at its root is perfectly valid (see http://avro.apache.org/docs/current/spec.html#schemas)
> Thank you very much :)
>  
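
For context on the "Not a named type" message in the stack trace above: in the Avro specification a JSON array at the top level of a schema denotes a union, and only record, enum and fixed are named types. The following is a minimal sketch of a pre-check that classifies the root of a schema document before it is handed to code that expects a named type. The function name root_kind and its return labels are hypothetical, and it inspects only the JSON shape rather than using the Avro library.

```python
import json

# Named types according to the Avro specification.
NAMED_TYPES = {"record", "enum", "fixed"}


def root_kind(schema_text: str) -> str:
    """Classify the root of an Avro schema document by its JSON shape.

    Returns "union" for a top-level JSON array, "named" for a record,
    enum or fixed definition, and "other" for everything else
    (primitives, arrays, maps).
    """
    schema = json.loads(schema_text)
    if isinstance(schema, list):
        return "union"
    if isinstance(schema, dict):
        type_name = schema.get("type")
        if isinstance(type_name, str) and type_name in NAMED_TYPES:
            return "named"
    return "other"
```

By this check the schema quoted in the issue has a "union" root, which matches the exception raised from Schema.getNamespace.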



--
This message was sent by Atlassian Jira
(v8.3.4#803005)