Posted to dev@hive.apache.org by "Wayne Zhu (JIRA)" <ji...@apache.org> on 2014/04/04 17:22:15 UTC

[jira] [Commented] (HIVE-5820) Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema

    [ https://issues.apache.org/jira/browse/HIVE-5820?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13960042#comment-13960042 ] 

Wayne Zhu commented on HIVE-5820:
---------------------------------

I have the same issue on CDH 4.3.0.
$ rpm -qa | grep hive
hive-hbase-0.10.0+121-1.cdh4.3.0.p0.16.el6.noarch
hive-metastore-0.10.0+121-1.cdh4.3.0.p0.16.el6.noarch
hive-jdbc-0.10.0+121-1.cdh4.3.0.p0.16.el6.noarch
hive-0.10.0+121-1.cdh4.3.0.p0.16.el6.noarch

Here are the Hive steps:
hive> CREATE TABLE avro_price_internal
    > PARTITIONED BY (year int, month int, day int, hour int)
    > ROW FORMAT SERDE
    >   'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    > STORED AS INPUTFORMAT
    >   'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    > OUTPUTFORMAT
    >   'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    > TBLPROPERTIES (
    >   'avro.schema.url'='hdfs:///user/wzhu00/camus/camus-avsc/PriceSchema.avsc'
    > );
OK
Time taken: 2.762 seconds
hive> ALTER TABLE avro_price_internal ADD IF NOT EXISTS PARTITION (year=2014, month=03, day=31, hour=19) LOCATION '/user/wzhu00/camus/dest/pricesingle/hourly/2014/03/31/19';
OK
Time taken: 0.595 seconds
hive> select * from avro_price_internal;
OK
Failed with exception java.io.IOException:org.apache.hadoop.hive.serde2.avro.BadSchemaException
Time taken: 0.463 seconds

> Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
> ---------------------------------------------------------------------------------------
>
>                 Key: HIVE-5820
>                 URL: https://issues.apache.org/jira/browse/HIVE-5820
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.10.0
>         Environment: CDH 4.3  Hive 0.10.0+121
>            Reporter: Sergey
>
> Hi, we've created a table:
> {code}
> create table tmp
> comment 'tmp'
> partitioned by (year string, month string, day string, fulldate string)
> row format serde 'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
> stored as
>     inputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
>     outputformat 'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
> location '/user/lol/tmp'
> tblproperties ('avro.schema.literal' =
>     '{"name": "tmp", "doc": "version 0.0.1", "type": "record", "fields": [
>         {"name": "a", "type": "int"},
>         {"name": "b", "type": "int"}
>     ]}'
> )
> {code}
> And we try to query it:
> {code}
> select * from tmp
> {code}
> and we get an exception:
> {code}
> 13/11/14 17:12:15 WARN avro.AvroSerdeUtils: Encountered AvroSerdeException determining schema. Returning signal schema to indicate problem
> org.apache.hadoop.hive.serde2.avro.AvroSerdeException: Neither avro.schema.literal nor avro.schema.url specified, can't determine table schema
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrThrowException(AvroSerdeUtils.java:66)
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerdeUtils.determineSchemaOrReturnErrorSchema(AvroSerdeUtils.java:87)
> 	at org.apache.hadoop.hive.serde2.avro.AvroSerDe.initialize(AvroSerDe.java:59)
> 	at org.apache.hadoop.hive.metastore.MetaStoreUtils.getDeserializer(MetaStoreUtils.java:249)
> 	at org.apache.hadoop.hive.ql.metadata.Partition.getDeserializer(Partition.java:251)
> 	at org.apache.hadoop.hive.ql.metadata.Partition.initialize(Partition.java:217)
> 	at org.apache.hadoop.hive.ql.metadata.Partition.<init>(Partition.java:107)
> 	at org.apache.hadoop.hive.ql.metadata.Hive.getPartitions(Hive.java:1573)
> 	at org.apache.hadoop.hive.ql.optimizer.ppr.PartitionPruner.prune(PartitionPruner.java:190)
> 	at org.apache.hadoop.hive.ql.parse.ParseContext.getPrunedPartitions(ParseContext.java:561)
> 	at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.checkTree(SimpleFetchOptimizer.java:144)
> 	at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.optimize(SimpleFetchOptimizer.java:100)
> 	at org.apache.hadoop.hive.ql.optimizer.SimpleFetchOptimizer.transform(SimpleFetchOptimizer.java:74)
> 	at org.apache.hadoop.hive.ql.optimizer.Optimizer.optimize(Optimizer.java:102)
> 	at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:8200)
> 	at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:258)
> 	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:457)
> 	at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:349)
> 	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.checkedCompile(BeeswaxServiceImpl.java:247)
> 	at com.cloudera.beeswax.BeeswaxServiceImpl$RunningQueryState.compile(BeeswaxServiceImpl.java:200)
> 	at com.cloudera.beeswax.BeeswaxServiceImpl$2.run(BeeswaxServiceImpl.java:830)
> 	at com.cloudera.beeswax.BeeswaxServiceImpl$2.run(BeeswaxServiceImpl.java:823)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:396)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
> 	at com.cloudera.beeswax.BeeswaxServiceImpl.doWithState(BeeswaxServiceImpl.java:772)
> 	at com.cloudera.beeswax.BeeswaxServiceImpl.query(BeeswaxServiceImpl.java:822)
> 	at com.cloudera.beeswax.api.BeeswaxService$Processor$query.getResult(BeeswaxService.java:915)
> 	at com.cloudera.beeswax.api.BeeswaxService$Processor$query.getResult(BeeswaxService.java:899)
> 	at org.apache.thrift.ProcessFunction.process(ProcessFunction.java:39)
> 	at org.apache.thrift.TBaseProcessor.process(TBaseProcessor.java:39)
> 	at org.apache.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:206)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> 	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> 	at java.lang.Thread.run(Thread.java:662)
> 13/11/14 17:12:15 INFO parse.SemanticAnalyzer: Completed plan generation
> 13/11/14 17:12:15 INFO ql.Driver: Semantic Analysis Completed
> {code}
> Here is the describe output:
> {code}
> 0	a	int	from deserializer
> 1	b	int	from deserializer
> 2	year	string	
> 3	month	string	
> 4	day	string	
> 5	fulldate	string	
> 6			
> 7	Detailed Table Information	Table(tableName:tmp, dbName:default, owner:devops, createTime:1384435112, lastAccessTime:0, retention:0, sd:StorageDescriptor(cols:[], location:hdfs://nameservice1/user/fedyakov/tmp, inputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat, outputFormat:org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat, compressed:false, numBuckets:-1, serdeInfo:SerDeInfo(name:null, serializationLib:org.apache.hadoop.hive.serde2.avro.AvroSerDe, parameters:{serialization.format=1}), bucketCols:[], sortCols:[], parameters:{}, skewedInfo:SkewedInfo(skewedColNames:[], skewedColValues:[], skewedColValueLocationMaps:{}), storedAsSubDirectories:false), partitionKeys:[FieldSchema(name:year, type:string, comment:null), FieldSchema(name:month, type:string, comment:null), FieldSchema(name:day, type:string, comment:null), FieldSchema(name:fulldate, type:string, comment:null)], parameters:{numPartitions=1, numFiles=1, avro.schema.literal={"name": "tmp", "doc": "version 0.0.1", "type": "record", "fields": [	
> 8	 {"name": "a", "type": "int"},		
> 9	 {"name": "b", "type": "int"}		
> 10	 ]}, transient_lastDdlTime=1384435137, numRows=0, totalSize=189, rawDataSize=0}, viewOriginalText:null, viewExpandedText:null, tableType:MANAGED_TABLE)	
> {code}
> If we specify a schema file instead of the "embedded" Avro schema, it works.
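
Since the report says a schema file works where the embedded literal does not, one possible way to prepare such a file is sketched below. It takes the schema literal from the CREATE TABLE statement above, checks that it parses as JSON and looks like a record schema, and writes it to a local .avsc file. The function name `write_avsc` and the output file name `tmp.avsc` are hypothetical, chosen only for illustration. Uploading the file to HDFS and pointing 'avro.schema.url' at it are separate steps not shown here, and this sketch does not address the underlying bug.

```python
import json
from pathlib import Path

# Schema literal copied from the CREATE TABLE statement in the report.
SCHEMA_LITERAL = """{"name": "tmp", "doc": "version 0.0.1", "type": "record", "fields": [
    {"name": "a", "type": "int"},
    {"name": "b", "type": "int"}
]}"""


def write_avsc(literal: str, path: str) -> dict:
    """Parse a schema literal and write it to `path` (hypothetical helper)."""
    # json.loads raises ValueError (JSONDecodeError) if the literal is malformed.
    schema = json.loads(literal)
    # Minimal sanity check only; this is not a full Avro schema validation.
    if schema.get("type") != "record" or "fields" not in schema:
        raise ValueError("expected an Avro record schema with a 'fields' list")
    Path(path).write_text(json.dumps(schema, indent=2) + "\n", encoding="utf-8")
    return schema


if __name__ == "__main__":
    schema = write_avsc(SCHEMA_LITERAL, "tmp.avsc")  # assumed local file name
    print([field["name"] for field in schema["fields"]])
```

The JSON check is deliberately shallow: it catches a malformed literal before the file reaches the cluster, but Hive's AvroSerDe remains the authority on whether the schema is acceptable.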


