Posted to dev@hive.apache.org by "Lars Volker (JIRA)" <ji...@apache.org> on 2016/06/23 23:10:16 UTC
[jira] [Created] (HIVE-14086) org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file
Lars Volker created HIVE-14086:
----------------------------------
Summary: org.apache.hadoop.hive.metastore.api.Table does not return columns from Avro schema file
Key: HIVE-14086
URL: https://issues.apache.org/jira/browse/HIVE-14086
Project: Hive
Issue Type: Bug
Components: API
Reporter: Lars Volker
Consider this table, using an external Avro schema file:
{noformat}
CREATE TABLE avro_table
PARTITIONED BY (str_part STRING)
ROW FORMAT SERDE
'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
STORED AS INPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
OUTPUTFORMAT
'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
TBLPROPERTIES (
'avro.schema.url'='hdfs://localhost:20500/tmp/avro.json'
);
{noformat}
This populates the "COLUMNS_V2" metastore table with the correct column information (as per HIVE-6308). The columns of this table can then be queried via the Hive API, for example by calling {{.getSd().getCols()}} on an {{org.apache.hadoop.hive.metastore.api.Table}} object.
Changes to the schema referenced by {{avro.schema.url}}, whether the property is pointed at a different file or the contents of the file change, are reflected in the output of {{describe formatted avro_table}} *but not* in the result of the {{.getSd().getCols()}} API call. It looks like Hive reads the Avro schema file internally but does not expose the information in it via its API.
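To make the mismatch concrete, here is a minimal Python sketch (standard library only) that compares a list of column names, standing in for what {{.getSd().getCols()}} returned, against the top-level fields of an Avro record schema. The function names, the example schema, and the stored column list are all hypothetical and are not taken from the table above; the sketch also assumes column names can be compared case-insensitively and ignores partition columns and nested types.

```python
import json


def avro_top_level_columns(schema_json):
    """Return the top-level field names of an Avro record schema, in order."""
    schema = json.loads(schema_json)
    if not isinstance(schema, dict) or schema.get("type") != "record":
        raise ValueError("expected an Avro record schema")
    return [field["name"] for field in schema["fields"]]


def stale_columns(metastore_cols, schema_json):
    """Report differences between stored column names and the schema file.

    metastore_cols stands in for the names returned by the metastore API;
    the comparison is case-insensitive (an assumption of this sketch).
    """
    stored = [c.lower() for c in metastore_cols]
    current = [c.lower() for c in avro_top_level_columns(schema_json)]
    return {
        "missing_from_metastore": [c for c in current if c not in stored],
        "no_longer_in_schema": [c for c in stored if c not in current],
    }


# Hypothetical data: the schema file gained a field after the table was created.
EXAMPLE_SCHEMA = """
{"type": "record", "name": "example", "fields": [
  {"name": "id", "type": "long"},
  {"name": "name", "type": "string"},
  {"name": "added_later", "type": ["null", "string"], "default": null}
]}
"""
STORED_COLS = ["id", "name"]

if __name__ == "__main__":
    print(stale_columns(STORED_COLS, EXAMPLE_SCHEMA))
```

A non-empty entry in either list would indicate that the stored columns have drifted from the schema file, which is the situation described above.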
Is there a way to obtain the effective Table information via Hive? Would it make sense to fix table retrieval so calls to {{get_table}} return the correct set of columns?