Posted to issues@carbondata.apache.org by "Prasanna Ravichandran (Jira)" <ji...@apache.org> on 2020/10/07 13:31:00 UTC

[jira] [Updated] (CARBONDATA-3937) Insert into select from another carbon/parquet table is not working in Hive Beeline on a newly created Hive write format carbon table. We are getting a “Database is not set” error.

     [ https://issues.apache.org/jira/browse/CARBONDATA-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Prasanna Ravichandran updated CARBONDATA-3937:
----------------------------------------------
    Description: 
Insert into select from another carbon or parquet table is not working in Hive Beeline on a newly created Hive write format carbon table. The MR job fails with a “Database name is not set” error.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 --execute below queries in spark-beeline;

create table hive_table(id int, name string, scale decimal, country string, salary double);
create table parquet_table(id int, name string, scale decimal, country string, salary double) stored as parquet;
insert into hive_table select 1,"Ram","2.3","India",3500;
select * from hive_table;
insert into parquet_table select 1,"Ram","2.3","India",3500;
select * from parquet_table;

--execute the below query in hive beeline;

insert into hive_carbon select * from parquet_table;

The logs are attached for your reference. However, insert into select from the parquet and hive tables into the carbon table works fine.

 

Error details from the MR job that runs for the Hive query:

Error: java.io.IOException: java.io.IOException: Database name is not set.
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
    at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.io.IOException: Database name is not set.
    at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDatabaseName(CarbonInputFormat.java:841)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:80)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:215)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:205)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
    ... 9 more

  was:
Insert into select from another carbon table is not working on Hive Beeline on a newly create Hive write format carbon table. We are getting “Carbondata files not found error”.

 

Test queries:

 drop table if exists hive_carbon;

create table hive_carbon(id int, name string, scale decimal, country string, salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon select 1,"Ram","2.3","India",3500;

insert into hive_carbon select 2,"Raju","2.4","Russia",3600;

insert into hive_carbon select 3,"Raghu","2.5","China",3700;

insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;

 

drop table if exists hive_carbon2;

create table hive_carbon2(id int, name string, scale decimal, country string, salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';

insert into hive_carbon2 select * from hive_carbon;

select * from hive_carbon;

select * from hive_carbon2;

 

 

Attached the logs for your reference. But the insert into select from the parquet and hive table into carbon table is working fine.

 

Error details in MR job which run through hive query:

Error: java.io.IOException: java.io.IOException: CarbonData file is not present in the table location
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
    at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
    at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
    at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
    at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1729)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
Caused by: java.io.IOException: CarbonData file is not present in the table location
    at org.apache.carbondata.core.util.CarbonUtil.inferSchema(CarbonUtil.java:2141)
    at org.apache.carbondata.core.metadata.schema.SchemaReader.inferSchema(SchemaReader.java:139)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.populateCarbonTable(MapredCarbonInputFormat.java:92)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:104)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:203)
    at org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:192)
    at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
    ... 9 more

        Summary: Insert into select from another carbon/parquet table is not working in Hive Beeline on a newly created Hive write format carbon table. We are getting a “Database is not set” error.  (was: Insert into select from another carbon table is not working on Hive Beeline on a newly create Hive write format - carbon table. We are getting “Carbondata files not found error")

> Insert into select from another carbon/parquet table is not working in Hive Beeline on a newly created Hive write format carbon table. We are getting a “Database is not set” error.
> ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CARBONDATA-3937
>                 URL: https://issues.apache.org/jira/browse/CARBONDATA-3937
>             Project: CarbonData
>          Issue Type: Bug
>          Components: hive-integration
>    Affects Versions: 2.0.0
>            Reporter: Prasanna Ravichandran
>            Priority: Major
>
> Insert into select from another carbon or parquet table is not working in Hive Beeline on a newly created Hive write format carbon table. The MR job fails with a “Database name is not set” error.
>  
> Test queries:
>  drop table if exists hive_carbon;
> create table hive_carbon(id int, name string, scale decimal, country string, salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';
> insert into hive_carbon select 1,"Ram","2.3","India",3500;
> insert into hive_carbon select 2,"Raju","2.4","Russia",3600;
> insert into hive_carbon select 3,"Raghu","2.5","China",3700;
> insert into hive_carbon select 4,"Ravi","2.6","Australia",3800;
>  
> drop table if exists hive_carbon2;
> create table hive_carbon2(id int, name string, scale decimal, country string, salary double) stored by 'org.apache.carbondata.hive.CarbonStorageHandler';
> insert into hive_carbon2 select * from hive_carbon;
> select * from hive_carbon;
> select * from hive_carbon2;
>  
>  --execute below queries in spark-beeline;
> create table hive_table(id int, name string, scale decimal, country string, salary double);
> create table parquet_table(id int, name string, scale decimal, country string, salary double) stored as parquet;
> insert into hive_table select 1,"Ram","2.3","India",3500;
> select * from hive_table;
> insert into parquet_table select 1,"Ram","2.3","India",3500;
> select * from parquet_table;
> --execute the below query in hive beeline;
> insert into hive_carbon select * from parquet_table;
> The logs are attached for your reference. However, insert into select from the parquet and hive tables into the carbon table works fine.
>  
> Error details from the MR job that runs for the Hive query:
> Error: java.io.IOException: java.io.IOException: Database name is not set.
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderCreationException(HiveIOExceptionHandlerChain.java:97)
>     at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderCreationException(HiveIOExceptionHandlerUtil.java:57)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:414)
>     at org.apache.hadoop.hive.ql.io.CombineHiveInputFormat.getRecordReader(CombineHiveInputFormat.java:843)
>     at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.<init>(MapTask.java:175)
>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:444)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:349)
>     at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:175)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1737)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:169)
> Caused by: java.io.IOException: Database name is not set.
>     at org.apache.carbondata.hadoop.api.CarbonInputFormat.getDatabaseName(CarbonInputFormat.java:841)
>     at org.apache.carbondata.hive.MapredCarbonInputFormat.getCarbonTable(MapredCarbonInputFormat.java:80)
>     at org.apache.carbondata.hive.MapredCarbonInputFormat.getQueryModel(MapredCarbonInputFormat.java:215)
>     at org.apache.carbondata.hive.MapredCarbonInputFormat.getRecordReader(MapredCarbonInputFormat.java:205)
>     at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:411)
>     ... 9 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)