Posted to issues@drill.apache.org by "arnab chatterjee (JIRA)" <ji...@apache.org> on 2015/10/05 15:10:27 UTC
[jira] [Updated] (DRILL-3893) Issue with Drill after Hive Alters the Table
[ https://issues.apache.org/jira/browse/DRILL-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
arnab chatterjee updated DRILL-3893:
------------------------------------
Description:
I reproduced this again on another partitioned table with existing data.
Providing some more details. I have enabled verbose mode for errors. Drill is unable to fetch the new column that was introduced. To me this most likely means that it is still picking up stale Hive metadata.
if (!tableColumns.contains(columnName)) {
  if (partitionNames.contains(columnName)) {
    selectedPartitionNames.add(columnName);
  } else {
    throw new ExecutionSetupException(String.format("Column %s does not exist", columnName));
  }
}
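To make the failure mode concrete, here is a minimal, self-contained sketch of the check quoted above. It is not Drill's actual code: the class name, the method signature, and the column names existing_col and part_col are made up for illustration, and IllegalStateException stands in for ExecutionSetupException. It only shows that the same check passes or fails depending on whether the column list it is handed already contains the newly added column.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ColumnCheckSketch {

    // Same logic as the quoted snippet, wrapped in a loop over the projected
    // columns. IllegalStateException is a stand-in (assumption) for Drill's
    // ExecutionSetupException so the sketch has no external dependencies.
    public static List<String> selectPartitionColumns(List<String> tableColumns,
                                                      List<String> partitionNames,
                                                      List<String> projectedColumns) {
        List<String> selectedPartitionNames = new ArrayList<>();
        for (String columnName : projectedColumns) {
            if (!tableColumns.contains(columnName)) {
                if (partitionNames.contains(columnName)) {
                    selectedPartitionNames.add(columnName);
                } else {
                    throw new IllegalStateException(
                        String.format("Column %s does not exist", columnName));
                }
            }
        }
        return selectedPartitionNames;
    }

    public static void main(String[] args) {
        // Hypothetical schema: one regular column, one partition column.
        List<String> partitionNames = Arrays.asList("part_col");
        List<String> staleColumns = Arrays.asList("existing_col");
        List<String> freshColumns = Arrays.asList("existing_col", "testdata");

        // With an up-to-date column list the new column is accepted.
        System.out.println(selectPartitionColumns(
            freshColumns, partitionNames, Arrays.asList("testdata", "part_col")));

        // With a column list read before the ALTER, the same query fails.
        try {
            selectPartitionColumns(staleColumns, partitionNames, Arrays.asList("testdata"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the reader is handed a column list captured before the table was altered, the second call reproduces the "Column testdata does not exist" message seen in the error below.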
select testdata from testtable;
Error: SYSTEM ERROR: ExecutionSetupException: Column testdata does not exist
Fragment 0:0
[Error Id: be5cccba-97f6-4cc4-94e8-c11a4c53c8f4 on x.x.com:9999]
(org.apache.drill.common.exceptions.ExecutionSetupException) Failure while initializing HiveRecordReader: Column testdata does not exist
org.apache.drill.exec.store.hive.HiveRecordReader.init():241
org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Column testdata does not exist
org.apache.drill.exec.store.hive.HiveRecordReader.init():206
org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745 (state=,code=0)
#########################################################
Please note that this is a partitioned table with existing data.
Does Drill cache the metadata somewhere, so that the change is not reflected immediately?
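For illustration only, the sketch below shows how a time-bounded metadata cache would produce exactly this symptom. Whether Drill's Hive plugin caches metadata this way is the open question of this report; the class SchemaCacheSketch, its methods, the 60 second TTL, and the column name existing_col are all hypothetical and are not taken from Drill.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

public class SchemaCacheSketch {

    // A cached column list plus the time at which it was loaded.
    private static final class Entry {
        final List<String> columns;
        final long loadedAtMillis;

        Entry(List<String> columns, long loadedAtMillis) {
            this.columns = columns;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();
    private final Function<String, List<String>> loader; // stands in for a metastore lookup
    private final LongSupplier clock;                     // injectable clock, in milliseconds
    private final long ttlMillis;

    public SchemaCacheSketch(Function<String, List<String>> loader,
                             LongSupplier clock, long ttlMillis) {
        this.loader = loader;
        this.clock = clock;
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached columns, reloading only when the entry is missing or expired.
    public List<String> columnsOf(String table) {
        long now = clock.getAsLong();
        Entry entry = entries.get(table);
        if (entry == null || now - entry.loadedAtMillis >= ttlMillis) {
            entry = new Entry(loader.apply(table), now);
            entries.put(table, entry);
        }
        return entry.columns;
    }

    // Drops the cached entry so the next lookup goes back to the loader.
    public void invalidate(String table) {
        entries.remove(table);
    }

    public static void main(String[] args) {
        Map<String, List<String>> metastore = new HashMap<>();
        metastore.put("testtable", Arrays.asList("existing_col"));
        long[] now = {0L};
        SchemaCacheSketch cache = new SchemaCacheSketch(metastore::get, () -> now[0], 60_000L);

        System.out.println(cache.columnsOf("testtable")); // first load

        // Simulate the Hive-side ALTER that adds a column.
        metastore.put("testtable", Arrays.asList("existing_col", "testdata"));

        now[0] = 30_000L;
        System.out.println(cache.columnsOf("testtable")); // within TTL: old list is served

        now[0] = 60_000L;
        System.out.println(cache.columnsOf("testtable")); // TTL elapsed: new column visible
    }
}
```

In a design like this, a query issued shortly after the ALTER would fail, and the same query would succeed once the entry expires or is invalidated explicitly.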
DRILL CLI
> select x from xx;
Error: SYSTEM ERROR: ExecutionSetupException: Column x does not exist
Fragment 0:0
[Error Id: 62086e22-1341-459e-87ce-430a24cc5119 on x.x.com:999] (state=,code=0)
HIVE CLI
hive> describe formatted x;
OK
# col_name data_type comment
<columns>
was:
The application team reproduced this again on another partitioned table with existing data.
Providing some more details. I have enabled verbose mode for errors. Drill is unable to fetch the new column that was introduced. To me this most likely means that it is still picking up stale Hive metadata.
if (!tableColumns.contains(columnName)) {
  if (partitionNames.contains(columnName)) {
    selectedPartitionNames.add(columnName);
  } else {
    throw new ExecutionSetupException(String.format("Column %s does not exist", columnName));
  }
}
select testdata from allocation_test;
Error: SYSTEM ERROR: ExecutionSetupException: Column testdata does not exist
Fragment 0:0
[Error Id: be5cccba-97f6-4cc4-94e8-c11a4c53c8f4 on redlxd00006.nomura.com:31010]
(org.apache.drill.common.exceptions.ExecutionSetupException) Failure while initializing HiveRecordReader: Column testdata does not exist
org.apache.drill.exec.store.hive.HiveRecordReader.init():241
org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745
Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Column testdata does not exist
org.apache.drill.exec.store.hive.HiveRecordReader.init():206
org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1142
java.util.concurrent.ThreadPoolExecutor$Worker.run():617
java.lang.Thread.run():745 (state=,code=0)
#########################################################
We have hit the issue again. This time the application team added a new column to the table, and once again Drill does not recognize it.
Last time it was an update to an existing column name.
Please note that this is a partitioned table with existing data.
Does Drill cache the metadata somewhere, so that the change is not reflected immediately?
DRILL CLI
> select legaltickettext from parentorders_asha;
Error: SYSTEM ERROR: ExecutionSetupException: Column legaltickettext does not exist
Fragment 0:0
[Error Id: 62086e22-1341-459e-87ce-430a24cc5119 on redlxd00006.nomura.com:31010] (state=,code=0)
HIVE CLI
hive> describe formatted parentorders_asha;
OK
# col_name          data_type   comment
recordtype          string
clientacronym       string
securityid          string
secdesc             string
side                string
orderqty            bigint
pricetype           string
price               double
ordercapacity       string
exdestination       string
cashmargin          string
ordernotes          string
traderid            string
transacttime        timestamp
transacttimens      bigint
commonid            string
omssrc              string
cumqty              bigint
expirytype          string
orderid             string
parentid            string
tradingaccount      string
businessarea        string
traderlocation      string
clientflowtype      string
complianceid        string
targetstrategy      string
expirydatetime      string
listid              string
mergedto            string
legaltikcetref      string
legaltickettext     string

# Partition Information
# col_name          data_type   comment
tradedate           date
ordertradetype      string
> Issue with Drill after Hive Alters the Table
> ---------------------------------------------
>
> Key: DRILL-3893
> URL: https://issues.apache.org/jira/browse/DRILL-3893
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Hive, Storage - Hive
> Affects Versions: 1.0.0, 1.1.0
> Environment: DEV
> Reporter: arnab chatterjee
>