You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@trafodion.apache.org by "Suresh Subbiah (JIRA)" <ji...@apache.org> on 2016/10/03 20:01:20 UTC

[jira] [Resolved] (TRAFODION-2199) SELECT on Hive partitioned table causes timestamp mismatch error

     [ https://issues.apache.org/jira/browse/TRAFODION-2199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Subbiah resolved TRAFODION-2199.
---------------------------------------
    Resolution: Fixed

> SELECT on Hive partitioned table causes timestamp mismatch error
> ----------------------------------------------------------------
>
>                 Key: TRAFODION-2199
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-2199
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-exe
>    Affects Versions: 2.1-incubating
>            Reporter: Suresh Subbiah
>            Assignee: Suresh Subbiah
>             Fix For: 2.1-incubating
>
>
> During execute of a select query that includes Hive partitioned tables, this error is occasinally seen.
> SQL>execute s;
>  
> *** ERROR[8436] Mismatch detected between compiletime and runtime hive table definitions. [2016-08-28 22:58:14]
> *** WARNING[8597] Statement was automatically retried 1 time(s). Delay before each retry was 0 seconds. See next entry for the error that caused this retry.  [2016-08-28 22:58:14]
> *** WARNING[8436] Mismatch detected between compiletime and runtime hive table definitions. [2016-08-28 22:58:14]
>  
> Workarounds :
> cqd HIVE_DATA_MOD_CHECK 'OFF' ; will avoid the problem.
> Alternatively the setting described here (found by Sandhya) will also avoid the problem
> https://community.cloudera.com/t5/Batch-SQL-Apache-Hive/CDH-5-3-Hive-staging-directory-has-wrong-default-value/td-p/23585
> http://stackoverflow.com/questions/35176190/hive-queries-generating-mismanaged-staging-directories
> Hi Anoop, Sandhya
>  
> In this case both modTSforDir_ and the directory timestamp we get at runtime are the same (1470914707).
> This happened because during compile we do go through all files and get the max timestamp of all files or directory as Anoop said. However this iteration through all files is based on a list of hive_sd_structs. These sd_structs are created by parsing a string we get from the java side. This string is similar to DESCRIBE EXTENDED <table-name> output, with additional info for each partition. Since this string does not contain information about non-partition directories like hive-staging, they are excluded from the compile time calculation. We need to do something similar at runtime. We cannot accept every directory returned by hdfsListDirectory() for a partitioned table as valid.
>  
> As Eric  and Sandhya said I will exclude directory names that include “.hive-staging”. I could also insist that the directory name have a “=” sign in it. This will be checked into Trafodion by tomorrow. I hope the cost of checking so many names/timestamps is not prohibitive. In this case we have about 2000 directory names to deal with. The directory names are rather long. I suppose in the worst case this is about a million comparisons which could add about a millisecond to execution time. Let me know if this is not a good idea.
>  
> Thanks
> Suresh
>  
>  
>  
>  
> At compile time, we get the max time of all files that are part of a table.
> This timestamp is stored in modTSforDir_.
>  
> At runtime, we look at the directory timestamp. If this table got created/moved
> into that directory and changed the timestamp, that will give a ts mismatch.
> This mismatch will not be fixed even with a recompile as the recompile will
> once again get the max of table files.
>  
> Can we tell if the runtime directory timestamp was set to indicate the max of
> all files(hidden or actual) under that dir?
> anoop
>  
>  
> Looks like there are cases where this staging directory is left over after hive uses it internally for it’s own processing. So as Suresh says, this is something our code possibly needs to ignore . Seems a bit messy to ignore files with this name but if we don’t then we could run into this issue…
> Sandhya
>  
>  
> It appears that hive itself creates and uses these staging directories and I see some problems reported about this. 
> Need to look more. It’s not something we directly create ..
>  
> Sandhya
>  
> If this file has a higher timestamp, then should that have changed the timestamp
> of the directory it is under? Or are there cases where it may not, and if there
> are, then how do we get to that situation?
>  
> Is this file created during a hive insert or part of some hive query.
>  
> anoop
>  
>  
> Specially given this is a hidden file, given it starts with .?
> May be logic should just filter hidden files?
> Eric
>  
> Hi,
>  
> Yes I agree with everyone that this is a bug. I debugged this and find that it might be related to a problem Sandhya told me about a few days ago.
>  
> The directory for store_sales table has this timestamp
> ptr_ = 0x7f3c578edc48 "hdfs://n01.trafodion.local:8020/user/hive/warehouse/tpcds.db/store_sales", offset_ = 139897143745608}}
> (gdb) p ((ComTdbOrcScan*)&tdb)->modTSforDir_
> $12 = 1470914707
>  
> However this directory had 1820+ directories underneath it. One of them has a timestamp that is slightly higher
> Name = 0x74e2ce0 "hdfs://n01.trafodion.local:8020/user/hive/warehouse/tpcds.db/store_sales/.hive-staging_hive_2016-08-11_19-25-06_941_7211319284354636719-1", 
> mLastMod = 1470915220,
>  
> Typical directories under store_sales directory have this type of name, with a literal for partition value. This staging directory seem to be something we should ignore. I wonder if we should do that based on name or some other criteria.
> /user/hive/warehouse/tpcds.db/store_sales/ss_sold_date_sk=2450816



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)