Posted to dev@gobblin.apache.org by Tamas Nemeth <ta...@prezi.com.INVALID> on 2018/08/01 14:08:14 UTC

Hive registration fails on avro schema with the latest Gobblin

Hi All,

I'm testing the latest release of Gobblin in our environment and noticed
that I can no longer query the registered Hive tables (they were registered
with native MR compaction -> HiveAvroSerDeManager). A sample select query
fails with the following error:
FAILED: RuntimeException
MetaException(message:org.apache.hadoop.hive.serde2.SerDeException
Encountered exception determining schema. Returning signal schema to
indicate problem: org.codehaus.jackson.JsonParseException: Illegal
character ((CTRL-CHAR, code 1)): only regular white space (\r, \n, \t) is
allowed between tokens
 at [Source: org.apache.hadoop.fs.FSDataInputStream@6be8ce1b; line: 1,
column: 2])

I checked the newly created Avro schema file and it looks like this:
U{"type":"record","name":"AnalyticsError","namespace":"com.prezi.analytics","fields":[{"name":"timestamp","type":"long","doc":"Time
at which event was
created.","default":0},{"name":"error","type":["string","null"],"doc":"Error
message.","default":null},{"name":"event","type":"string","doc":"The
original event which failed on validation"}]}

I think it fails on the first character, which I guess is due to this change:
https://github.com/apache/incubator-gobblin/pull/2355/files#diff-1e025b316af80547dec67d0bd8edd7efL398

In writeSchemaToFile the code used to be:
dos.writeChars(schema.toString());
And it was changed to:
dos.writeUTF(schema.toString());
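
To illustrate why this change could matter, here is a minimal sketch (not
Gobblin code; the class and helper names are hypothetical) comparing the
bytes that java.io.DataOutputStream emits for the same string. writeUTF
prepends a two-byte length before the modified UTF-8 payload, while
writeChars writes two bytes per character with no prefix. Assuming the line
wraps in the schema above stand for single spaces, the schema is 341 bytes
long, and a length prefix of 341 is 0x01 0x55, which would show up as a
control character with code 1 followed by 'U', matching both the error
message and the stray leading character.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

public class SchemaWriteDemo {

    // Hypothetical helper: bytes produced by DataOutputStream.writeUTF
    // (two-byte big-endian length, then modified UTF-8).
    static byte[] viaWriteUTF(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(buf);
            dos.writeUTF(s);
            dos.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: bytes produced by DataOutputStream.writeChars
    // (each char as two bytes, high byte first, no length prefix).
    static byte[] viaWriteChars(String s) {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream dos = new DataOutputStream(buf);
            dos.writeChars(s);
            dos.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Plain UTF-8 bytes with no prefix, shown only for comparison.
    static byte[] viaPlainUtf8(String s) {
        return s.getBytes(StandardCharsets.UTF_8);
    }

    // Render bytes as space-separated two-digit hex.
    static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String json = "{}";
        System.out.println("writeUTF:   " + hex(viaWriteUTF(json)));   // 00 02 7b 7d
        System.out.println("writeChars: " + hex(viaWriteChars(json))); // 00 7b 00 7d
        System.out.println("plain UTF8: " + hex(viaPlainUtf8(json)));  // 7b 7d
    }
}
```

If the schema file is meant to be read as plain JSON text, a reader that
does not call readUTF will see the two length bytes as part of the content.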

Our current Gobblin build, which works fine, was built from an April version
of Gobblin, so I strongly suspect this is the change that caused the issue.

Am I doing something wrong in Hive registration? I only changed the Gobblin
build (from the April one to the latest one) and kept the same config.

Thanks,
Tamas