Posted to issues@drill.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2017/11/10 06:23:01 UTC

[jira] [Commented] (DRILL-5089) Skip initializing all enabled storage plugins for every query

    [ https://issues.apache.org/jira/browse/DRILL-5089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247076#comment-16247076 ] 

ASF GitHub Bot commented on DRILL-5089:
---------------------------------------

GitHub user chunhui-shi opened a pull request:

    https://github.com/apache/drill/pull/1032

    DRILL-5089: Dynamically load schema of storage plugin only when needed for every query
    
    For each query, there is no need to load all storage plugins and all workspaces under the file system plugins.
    
    This patch uses DynamicRootSchema as the root schema for Drill, which loads the corresponding storage plugin only when it is needed.
    
    Infoschema reads the full schema information and loads the second-level schemas accordingly.
    
    For workspaces under the same file system, there is no need to create a FileSystem for each workspace.
    
    Use the fs.access API to check permissions; it is available as of HDFS 2.6, except in the Windows + local file system case.
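
The lazy-loading idea described above can be sketched roughly as follows. This is a minimal illustration only, not the code in this pull request: LazyRootSchema, registerPlugin, getSubSchema and loadedPlugins are hypothetical names, and a schema is reduced to a list of table names. The assumption is that each plugin contributes a factory that is invoked only when a query first resolves that plugin's name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

/** Hypothetical stand-in for a root schema that loads plugin schemas on demand. */
class LazyRootSchema {
    // Plugin name -> factory performing the (possibly slow or failing) registration.
    private final Map<String, Supplier<List<String>>> factories = new ConcurrentHashMap<>();
    // Plugin name -> schema that has already been loaded by some lookup.
    private final Map<String, List<String>> loaded = new ConcurrentHashMap<>();

    /** Records how to build a plugin's schema without building it yet. */
    void registerPlugin(String name, Supplier<List<String>> factory) {
        factories.put(name, factory);
    }

    /** Loads the named plugin's schema on first use; returns null for unknown names. */
    List<String> getSubSchema(String name) {
        Supplier<List<String>> factory = factories.get(name);
        if (factory == null) {
            return null;
        }
        // The factory runs at most once per plugin; other plugins are never touched.
        return loaded.computeIfAbsent(name, n -> factory.get());
    }

    /** Names of the plugins whose schemas have actually been loaded so far. */
    Set<String> loadedPlugins() {
        return new TreeSet<>(loaded.keySet());
    }
}

class LazySchemaDemo {
    public static void main(String[] args) {
        LazyRootSchema root = new LazyRootSchema();
        root.registerPlugin("dfs", () -> Arrays.asList("tmp", "root"));
        // Simulates a plugin whose datasource is down; it is only hit if a query names it.
        root.registerPlugin("mssql", () -> {
            throw new IllegalStateException("datasource is down");
        });

        System.out.println(root.getSubSchema("dfs")); // prints [tmp, root]
        System.out.println(root.loadedPlugins());     // prints [dfs]
    }
}
```

In this sketch a query that only touches "dfs" never invokes the factory for the failing "mssql" plugin, which is the behavior the issue asks for.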

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/chunhui-shi/drill DRILL-5089-pull

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1032.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #1032
    
----
commit a381677c59a7371733bae12ad4896b7cc927da5e
Author: chunhui-shi <cs...@maprtech.com>
Date:   2017-11-03T00:06:25Z

    DRILL-5089: Dynamically load schema of storage plugin only when needed for every query
    
    For each query, there is no need to load all storage plugins and all workspaces under the file system plugins.
    
    This patch uses DynamicRootSchema as the root schema for Drill, which loads the corresponding storage plugin only when it is needed.
    
    Infoschema reads the full schema information and loads the second-level schemas accordingly.
    
    For workspaces under the same file system, there is no need to create a FileSystem for each workspace.
    
    Use the fs.access API to check permissions; it is available as of HDFS 2.6, except in the Windows + local file system case.

----


> Skip initializing all enabled storage plugins for every query
> -------------------------------------------------------------
>
>                 Key: DRILL-5089
>                 URL: https://issues.apache.org/jira/browse/DRILL-5089
>             Project: Apache Drill
>          Issue Type: Improvement
>          Components: Query Planning & Optimization
>    Affects Versions: 1.9.0
>            Reporter: Abhishek Girish
>            Assignee: Chunhui Shi
>            Priority: Critical
>
> In a query's lifecycle, an attempt is made to initialize each enabled storage plugin while building the schema tree. This is done regardless of the plugins actually involved in the query.
> Sometimes, when one or more of the enabled storage plugins have issues, either due to misconfiguration or because the underlying datasource is slow or down, the overall query time increases drastically. This is most likely due to the attempt to register schemas from a faulty plugin.
> For example, when a JDBC plugin is configured with SQL Server and the underlying SQL Server database goes down, any Drill query that starts executing from that point on slows down drastically.
> We must skip registering unrelated schemas (and workspaces) for a query.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)