Posted to issues@trafodion.apache.org by "Suresh Subbiah (JIRA)" <ji...@apache.org> on 2015/10/08 06:57:26 UTC

[jira] [Assigned] (TRAFODION-793) LP Bug: 1396386 - Incorrect results or error 8442 from hive tables when using query cache and hive data changes

     [ https://issues.apache.org/jira/browse/TRAFODION-793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suresh Subbiah reassigned TRAFODION-793:
----------------------------------------

    Assignee: Suresh Subbiah  (was: Apache Trafodion)

> LP Bug: 1396386 - Incorrect results or error 8442 from hive tables when using query cache and hive data changes
> ---------------------------------------------------------------------------------------------------------------
>
>                 Key: TRAFODION-793
>                 URL: https://issues.apache.org/jira/browse/TRAFODION-793
>             Project: Apache Trafodion
>          Issue Type: Bug
>          Components: sql-cmp
>            Reporter: Hans Zeller
>            Assignee: Suresh Subbiah
>
> Trafodion caches Hive metadata, Hive table statistics, and entire queries on Hive tables. The list of HDFS files to read for a query on a Hive table is stored in the query plan. When the underlying data changes, Trafodion has no reliable way to detect the change, so it returns stale data. In some cases, when HDFS files are removed, we also see an error like this one:
> *** ERROR[8442] Unable to access HDFS interface. Call to ExpLOBInterfaceSelectCursor/open returned error LOB_DATA_FILE_OPEN_ERROR(508). Error detail 0.
> There are several possible solutions to this problem:
> 1. Leave it up to the user to turn off query caching (the current solution). HIVE_METADATA_REFRESH_INTERVAL can also be used to control caching of HDFS file statistics.
> 2. Disable the query cache for queries that access Hive data.
> 3. Improve validation methods for query and metadata caches to detect changes to the underlying HDFS files.
> Test case:
> -- using the Hive shell, create Hive table T1 and enter some data
> select * from hive.hive.T1;
> -- now update the Hive table, e.g. add more data
> select * from hive.hive.T1;
> -- the changes are not seen; we get the cached data
> cqd query_cache '0';
> select * from hive.hive.T1;
> -- now the changes are reflected in the result
> Assigned to LaunchPad User Khaled Bouaziz
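
Option 3 in the description (validating cached entries against the underlying files) could look roughly like the sketch below. It is an illustration only, not Trafodion code: `PlanCache`, `CachedPlan`, `snapshot`, and `compile_fn` are hypothetical names, and a local directory read with `os.scandir` stands in for the HDFS directory of a Hive table. The idea is to store a per-file (size, modification time) snapshot next to each cached plan and recompile whenever the current snapshot differs.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Hypothetical sketch; a local directory stands in for the table's HDFS directory.
FileStamp = Tuple[int, int]  # (size in bytes, modification time in nanoseconds)


def snapshot(table_dir: str) -> Dict[str, FileStamp]:
    """Record size and mtime of every regular file in the table directory."""
    stamps: Dict[str, FileStamp] = {}
    with os.scandir(table_dir) as entries:
        for entry in entries:
            if entry.is_file():
                st = entry.stat()
                stamps[entry.name] = (st.st_size, st.st_mtime_ns)
    return stamps


@dataclass
class CachedPlan:
    plan: str                      # stand-in for a compiled query plan
    stamps: Dict[str, FileStamp]   # files the plan was compiled against


class PlanCache:
    """Query cache that revalidates each entry against the files on disk."""

    def __init__(self) -> None:
        self._entries: Dict[str, CachedPlan] = {}

    def lookup(self, query: str, table_dir: str,
               compile_fn: Callable[[str], str]) -> str:
        current = snapshot(table_dir)
        entry = self._entries.get(query)
        # Added, removed, or modified files all change the snapshot,
        # so any of them forces a recompile instead of reusing a stale plan.
        if entry is None or entry.stamps != current:
            entry = CachedPlan(plan=compile_fn(query), stamps=current)
            self._entries[query] = entry
        return entry.plan
```

The trade-off in this sketch is one directory listing per cache lookup, in exchange for never reusing a plan that references files which have since changed or disappeared.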



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)