Posted to issues@drill.apache.org by "arnab chatterjee (JIRA)" <ji...@apache.org> on 2015/10/05 15:10:27 UTC
[jira] [Updated] (DRILL-3893) Issue with Drill after Hive Alters the Table

     [ https://issues.apache.org/jira/browse/DRILL-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

arnab chatterjee updated DRILL-3893:
------------------------------------
    Description: 
I reproduced this again on another partitioned table with existing data.

Some more details: I have enabled verbose mode for errors. Drill is unable to fetch the newly introduced column name. To me, the most likely explanation is that it is still picking up stale Hive metadata.


if (!tableColumns.contains(columnName)) {
  if (partitionNames.contains(columnName)) {
    selectedPartitionNames.add(columnName);
  } else {
    throw new ExecutionSetupException(String.format("Column %s does not exist", columnName));
  }
}
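
To make the suspected failure mode concrete, here is a minimal, self-contained sketch of the check quoted above, run against a column list captured before the table was altered. It is not Drill's actual code: the class name, method name, and sample column names are hypothetical, and IllegalStateException stands in for ExecutionSetupException so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StaleColumnCheck {

    // Same shape as the quoted check: a projected column must be either a
    // regular table column or a partition column, otherwise it is rejected.
    static List<String> selectPartitionColumns(Set<String> tableColumns,
                                               Set<String> partitionNames,
                                               List<String> projected) {
        List<String> selectedPartitionNames = new ArrayList<>();
        for (String columnName : projected) {
            if (!tableColumns.contains(columnName)) {
                if (partitionNames.contains(columnName)) {
                    selectedPartitionNames.add(columnName);
                } else {
                    // Stand-in for Drill's ExecutionSetupException.
                    throw new IllegalStateException(
                        String.format("Column %s does not exist", columnName));
                }
            }
        }
        return selectedPartitionNames;
    }

    public static void main(String[] args) {
        // Hypothetical column list as it looked BEFORE the ALTER TABLE.
        Set<String> staleColumns = new HashSet<>(Arrays.asList("orderid", "price"));
        Set<String> partitions = new HashSet<>(Arrays.asList("tradedate"));

        // Existing columns and partition columns pass the check.
        System.out.println(selectPartitionColumns(
            staleColumns, partitions, Arrays.asList("orderid", "tradedate")));

        // The newly added column is in neither set, so the check throws.
        try {
            selectPartitionColumns(staleColumns, partitions, Arrays.asList("testdata"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the column list handed to the reader predates the ALTER, a column that Hive already reports would still be rejected in exactly this way, which matches the error below.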


select testdata from testtable;
Error: SYSTEM ERROR: ExecutionSetupException: Column testdata does not exist

Fragment 0:0

[Error Id: be5cccba-97f6-4cc4-94e8-c11a4c53c8f4 on x.x.com:9999]

  (org.apache.drill.common.exceptions.ExecutionSetupException) Failure while initializing HiveRecordReader: Column testdata does not exist
    org.apache.drill.exec.store.hive.HiveRecordReader.init():241
    org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
    org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
    org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
    org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
    org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745
  Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Column testdata does not exist
    org.apache.drill.exec.store.hive.HiveRecordReader.init():206
    org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
    org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
    org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
    org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
    org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745 (state=,code=0)

#########################################################

Please note that this is a partitioned table with existing data.

Does Drill cache the metadata somewhere, so that the change is not reflected immediately?
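
For illustration only, the sketch below shows how a time-based cache in general would produce this symptom. Whether Drill caches Hive metadata this way is not confirmed here; the class, the loader, the injected clock, and the 60-second expiry are all hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

public class TtlSchemaCache {
    private final Supplier<List<String>> loader; // stands in for a metastore lookup
    private final LongSupplier clockMillis;      // injectable clock, for the demo
    private final long ttlMillis;                // hypothetical expiry interval
    private List<String> cached;
    private long loadedAt;

    TtlSchemaCache(Supplier<List<String>> loader, LongSupplier clockMillis, long ttlMillis) {
        this.loader = loader;
        this.clockMillis = clockMillis;
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached column list, reloading only once the entry has expired.
    List<String> columns() {
        long now = clockMillis.getAsLong();
        if (cached == null || now - loadedAt >= ttlMillis) {
            cached = loader.get();
            loadedAt = now;
        }
        return cached;
    }

    public static void main(String[] args) {
        // Hypothetical table definition held by the metastore.
        AtomicReference<List<String>> hiveTable =
            new AtomicReference<>(Arrays.asList("orderid", "price"));
        long[] now = {0L};
        TtlSchemaCache cache = new TtlSchemaCache(hiveTable::get, () -> now[0], 60_000L);

        cache.columns(); // first lookup populates the cache

        // The table is altered in Hive: a new column is added.
        hiveTable.set(Arrays.asList("orderid", "price", "testdata"));

        now[0] = 30_000L; // before expiry: the old column list is still served
        System.out.println(cache.columns().contains("testdata"));

        now[0] = 60_000L; // at expiry: the entry is reloaded and the column appears
        System.out.println(cache.columns().contains("testdata"));
    }
}
```

If something like this were in play, the new column would become visible after the expiry interval without any other change. In my case the column stays missing, so either the entry is not expiring or the stale definition comes from somewhere else.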

DRILL CLI
> select x from xx;
Error: SYSTEM ERROR: ExecutionSetupException: Column x does not exist

Fragment 0:0

[Error Id: 62086e22-1341-459e-87ce-430a24cc5119 on x.x.com:999] (state=,code=0)

HIVE CLI
hive> describe formatted x;
OK
# col_name              data_type               comment

<columns>


  was:
The application team reproduced this again on another partitioned table with existing data.

Some more details: I have enabled verbose mode for errors. Drill is unable to fetch the newly introduced column name. To me, the most likely explanation is that it is still picking up stale Hive metadata.


if (!tableColumns.contains(columnName)) {
  if (partitionNames.contains(columnName)) {
    selectedPartitionNames.add(columnName);
  } else {
    throw new ExecutionSetupException(String.format("Column %s does not exist", columnName));
  }
}


select testdata from allocation_test;
Error: SYSTEM ERROR: ExecutionSetupException: Column testdata does not exist

Fragment 0:0

[Error Id: be5cccba-97f6-4cc4-94e8-c11a4c53c8f4 on redlxd00006.nomura.com:31010]

  (org.apache.drill.common.exceptions.ExecutionSetupException) Failure while initializing HiveRecordReader: Column testdata does not exist
    org.apache.drill.exec.store.hive.HiveRecordReader.init():241
    org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
    org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
    org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
    org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
    org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745
  Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Column testdata does not exist
    org.apache.drill.exec.store.hive.HiveRecordReader.init():206
    org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
    org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
    org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
    org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
    org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
    org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
    org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
    org.apache.drill.common.SelfCleaningRunnable.run():38
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745 (state=,code=0)

#########################################################

We have hit the issue again. This time the application team added a new column to the table, and once again Drill does not recognize it.
Last time it was a change to an existing column name.

Please note that this is a partitioned table with existing data.

Does Drill cache the metadata somewhere, so that the change is not reflected immediately?

DRILL CLI
> select legaltickettext     from parentorders_asha;
Error: SYSTEM ERROR: ExecutionSetupException: Column legaltickettext          does not exist

Fragment 0:0

[Error Id: 62086e22-1341-459e-87ce-430a24cc5119 on redlxd00006.nomura.com:31010] (state=,code=0)

HIVE CLI
hive> describe formatted parentorders_asha;
OK
# col_name              data_type               comment

recordtype              string
clientacronym           string
securityid              string
secdesc                 string
side                    string
orderqty                bigint
pricetype               string
price                   double
ordercapacity           string
exdestination           string
cashmargin              string
ordernotes              string
traderid                string
transacttime            timestamp
transacttimens          bigint
commonid                string
omssrc                  string
cumqty                  bigint
expirytype              string
orderid                 string
parentid                string
tradingaccount          string
businessarea            string
traderlocation          string
clientflowtype          string
complianceid            string
targetstrategy          string
expirydatetime          string
listid                  string
mergedto                string
legaltikcetref          string
legaltickettext         string

# Partition Information
# col_name              data_type               comment

tradedate               date
ordertradetype          string



>  Issue with Drill after Hive Alters the Table
> ---------------------------------------------
>
>                 Key: DRILL-3893
>                 URL: https://issues.apache.org/jira/browse/DRILL-3893
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Hive, Storage - Hive
>    Affects Versions: 1.0.0, 1.1.0
>         Environment: DEV
>            Reporter: arnab chatterjee



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)