Posted to issues@drill.apache.org by "Jacques Nadeau (JIRA)" <ji...@apache.org> on 2015/10/05 14:45:26 UTC

[jira] [Commented] (DRILL-3893) Issue with Drill after Hive Alters the Table

    [ https://issues.apache.org/jira/browse/DRILL-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14943319#comment-14943319 ] 

Jacques Nadeau commented on DRILL-3893:
---------------------------------------

We do cache Hive metadata for roughly one minute. We should enhance this cache to refresh on a metadata miss.
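
A minimal sketch of what "refresh on a metadata miss" could look like, assuming a simple time-based cache of column names per table. This is not Drill's actual cache code: the class name RefreshOnMissColumnCache, the loader function (standing in for a metastore lookup), and the injected clock are all hypothetical, chosen only to illustrate the idea.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical illustration only; not Drill's actual Hive metadata cache.
class RefreshOnMissColumnCache {

  // Snapshot of a table's column names plus the time it was loaded.
  private static final class Entry {
    final Set<String> columns;
    final long loadedAtMillis;

    Entry(Set<String> columns, long loadedAtMillis) {
      this.columns = columns;
      this.loadedAtMillis = loadedAtMillis;
    }
  }

  private final Map<String, Entry> entries = new ConcurrentHashMap<>();
  private final Function<String, Set<String>> loader; // assumed stand-in for a metastore call
  private final long ttlMillis;
  private final LongSupplier clock; // injected so expiry can be tested without sleeping

  RefreshOnMissColumnCache(Function<String, Set<String>> loader,
                           long ttlMillis,
                           LongSupplier clock) {
    this.loader = loader;
    this.ttlMillis = ttlMillis;
    this.clock = clock;
  }

  // Fetches the current column set and stores a defensive copy of it.
  // Concurrent callers may each reload; that is acceptable for a sketch.
  private Entry load(String table) {
    Entry fresh = new Entry(new HashSet<>(loader.apply(table)), clock.getAsLong());
    entries.put(table, fresh);
    return fresh;
  }

  boolean hasColumn(String table, String column) {
    Entry entry = entries.get(table);
    boolean justLoaded = false;
    if (entry == null || clock.getAsLong() - entry.loadedAtMillis >= ttlMillis) {
      entry = load(table); // normal path: entry absent or older than the TTL
      justLoaded = true;
    }
    if (entry.columns.contains(column)) {
      return true;
    }
    if (justLoaded) {
      return false; // the metadata is already fresh, so the column really is absent
    }
    // Metadata miss: the cached schema may be stale (for example after an
    // ALTER TABLE), so reload once before reporting the column as missing.
    return load(table).columns.contains(column);
  }
}
```

One trade-off of this approach: a query that names a column which truly does not exist triggers a reload on every attempt, so a real implementation would probably want to rate-limit refreshes.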

>  Issue with Drill after Hive Alters the Table
> ---------------------------------------------
>
>                 Key: DRILL-3893
>                 URL: https://issues.apache.org/jira/browse/DRILL-3893
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Functions - Hive, Storage - Hive
>    Affects Versions: 1.0.0, 1.1.0
>         Environment: DEV
>            Reporter: arnab chatterjee
>
> The application team reproduced this again on another partitioned table with existing data.
> Providing some more details. I have enabled verbose mode for errors. Drill is unable to fetch the new column name that was introduced. It seems most likely to me that it is still picking up stale Hive metadata.
> if (!tableColumns.contains(columnName)) {
>   if (partitionNames.contains(columnName)) {
>     selectedPartitionNames.add(columnName);
>   } else {
>     throw new ExecutionSetupException(String.format("Column %s does not exist", columnName));
>   }
> }
> select testdata from allocation_test;
> Error: SYSTEM ERROR: ExecutionSetupException: Column testdata does not exist
> Fragment 0:0
> [Error Id: be5cccba-97f6-4cc4-94e8-c11a4c53c8f4 on redlxd00006.nomura.com:31010]
>   (org.apache.drill.common.exceptions.ExecutionSetupException) Failure while initializing HiveRecordReader: Column testdata does not exist
>     org.apache.drill.exec.store.hive.HiveRecordReader.init():241
>     org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
>     org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
>     org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
>     org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
>     org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
>     org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
>     org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>     java.lang.Thread.run():745
>   Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Column testdata does not exist
>     org.apache.drill.exec.store.hive.HiveRecordReader.init():206
>     org.apache.drill.exec.store.hive.HiveRecordReader.<init>():138
>     org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():58
>     org.apache.drill.exec.store.hive.HiveScanBatchCreator.getBatch():34
>     org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():150
>     org.apache.drill.exec.physical.impl.ImplCreator.getChildren():173
>     org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():106
>     org.apache.drill.exec.physical.impl.ImplCreator.getExec():81
>     org.apache.drill.exec.work.fragment.FragmentExecutor.run():235
>     org.apache.drill.common.SelfCleaningRunnable.run():38
>     java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>     java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>     java.lang.Thread.run():745 (state=,code=0)
> #########################################################
> We have hit the issue again. This time the application team added a new column to the table, and once again Drill does not recognize it.
> Last time it was an update to an existing column name.
> Please note that this is a partitioned table with existing data.
> Does Drill cache the metadata somewhere, so that the change is not reflected immediately?
> DRILL CLI
> > select legaltickettext     from parentorders_asha;
> Error: SYSTEM ERROR: ExecutionSetupException: Column legaltickettext          does not exist
> Fragment 0:0
> [Error Id: 62086e22-1341-459e-87ce-430a24cc5119 on redlxd00006.nomura.com:31010] (state=,code=0)
> HIVE CLI
> hive> describe formatted parentorders_asha;
> OK
> # col_name              data_type               comment
> recordtype              string
> clientacronym           string
> securityid              string
> secdesc                 string
> side                    string
> orderqty                bigint
> pricetype               string
> price                   double
> ordercapacity           string
> exdestination           string
> cashmargin              string
> ordernotes              string
> traderid                string
> transacttime            timestamp
> transacttimens          bigint
> commonid                string
> omssrc                  string
> cumqty                  bigint
> expirytype              string
> orderid                 string
> parentid                string
> tradingaccount          string
> businessarea            string
> traderlocation          string
> clientflowtype          string
> complianceid            string
> targetstrategy          string
> expirydatetime          string
> listid                  string
> mergedto                string
> legaltikcetref          string
> legaltickettext         string
> # Partition Information
> # col_name              data_type               comment
> tradedate               date
> ordertradetype          string


