Posted to dev@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/05/14 20:53:04 UTC

[jira] [Created] (DRILL-5510) Revisit connection failure recovery in Hive storage plugin

Paul Rogers created DRILL-5510:
----------------------------------

             Summary: Revisit connection failure recovery in Hive storage plugin
                 Key: DRILL-5510
                 URL: https://issues.apache.org/jira/browse/DRILL-5510
             Project: Apache Drill
          Issue Type: Improvement
    Affects Versions: 1.11.0
            Reporter: Paul Rogers


DRILL-5496 describes a problem which occurs when the Hive metastore server is restarted while Drill runs. The solution in that ticket is a work-around: we discard all cached Hive metastore data and rebuild the metadata cache.

The original code tried to be more subtle: detect that the connection has failed, reconnect, and preserve the cache. DRILL-5496 describes the flaws in that approach for the secure connection case.

This ticket asks that we take the time to understand the Hive metadata code and restructure it so that the cache is preserved across connection failures.

Note a subtle issue: when the Hive metastore goes down and comes back up, it may contain different data. Anything could have happened while the server was down: schemas upgraded, one schema replaced with another, etc. So the caching mechanism, if it is to preserve data across reconnects, must handle such changes.

Of course, such changes could also occur within a single connection, so the code should already handle such cases.
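
To make the idea concrete, here is a minimal sketch of a cache that survives a reconnect but still notices changed metadata. It is not Drill's code and not the Hive metastore API: the names CachingMetastore, FakeClient, MetastoreUnavailable, get_version and get_schema are all hypothetical, and the sketch assumes the metastore can report some cheap per-table version token, which may not hold in practice.

```python
class MetastoreUnavailable(Exception):
    """Hypothetical error raised when the metastore connection is lost."""


class FakeClient:
    """Stand-in for a metastore client (hypothetical, for illustration only)."""

    def __init__(self, store):
        self.store = store        # shared dict: table -> (version, schema)
        self.alive = True         # set to False to simulate a dropped connection
        self.schema_fetches = 0   # counts the expensive schema reads

    def get_version(self, table):
        if not self.alive:
            raise MetastoreUnavailable(table)
        return self.store[table][0]

    def get_schema(self, table):
        if not self.alive:
            raise MetastoreUnavailable(table)
        self.schema_fetches += 1
        return self.store[table][1]


class CachingMetastore:
    """Keeps cached schemas across reconnects, revalidating them by version."""

    def __init__(self, connect):
        self._connect = connect   # factory that returns a fresh client
        self._client = connect()
        self._cache = {}          # table -> (version, schema)

    def _call(self, method, *args):
        try:
            return getattr(self._client, method)(*args)
        except MetastoreUnavailable:
            # Reconnect once and retry; the cache is deliberately left intact.
            self._client = self._connect()
            return getattr(self._client, method)(*args)

    def get_schema(self, table):
        # A cheap version check guards every cache hit, so changes made while
        # the server was down (or within one connection) are picked up.
        version = self._call("get_version", table)
        cached = self._cache.get(table)
        if cached is not None and cached[0] == version:
            return cached[1]
        schema = self._call("get_schema", table)
        self._cache[table] = (version, schema)
        return schema
```

The trade-off in this sketch is one extra round trip per lookup for the version check. In exchange, the same check covers both the reconnect case and changes made within a single connection.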



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)