Posted to dev@drill.apache.org by "Abhishek Girish (JIRA)" <ji...@apache.org> on 2015/06/01 19:32:17 UTC

[jira] [Created] (DRILL-3230) Local file system plug-in must be disabled in distributed mode

Abhishek Girish created DRILL-3230:
--------------------------------------

             Summary: Local file system plug-in must be disabled in distributed mode
                 Key: DRILL-3230
                 URL: https://issues.apache.org/jira/browse/DRILL-3230
             Project: Apache Drill
          Issue Type: Bug
          Components: Client - HTTP
            Reporter: Abhishek Girish
            Assignee: Jacques Nadeau


The local file system plug-in (the "file:///" connection string in the dfs storage plug-in) does not behave as expected for either CTAS or file queries when Drill is configured in distributed mode (multiple Drillbits across nodes).

In the case of CTAS, Parquet files are written to a specific node's local file system, depending on which Drillbit the client connects to. If the table is moderate to large in size, Drill may process it in a distributed manner and write data to more than one node, so the data ends up partitioned across different nodes.

In the case of queries, the behavior is just as confusing, because it depends on which Drillbit the client connects to. The results are therefore inconsistent: a query returns only partial results, and which part it returns depends on the Drillbit the client connected to.
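
The CTAS and query behavior described above can be illustrated with a toy model. This is only a sketch and not Drill code: the node names, the `ctas` and `scan` helpers, and the round-robin assignment of rows to nodes are all assumptions made for illustration. The model also simplifies a query to reading just the files that are local to the connected node.

```python
from collections import defaultdict

def ctas(rows, nodes):
    """Toy CTAS: spread rows over nodes; each node writes to its own local disk."""
    local_fs = defaultdict(list)  # node name -> rows stored on that node only
    for i, row in enumerate(rows):
        # Round-robin placement is an assumption, not Drill's actual strategy.
        local_fs[nodes[i % len(nodes)]].append(row)
    return local_fs

def scan(local_fs, connected_node):
    """Toy query: only files local to the connected node are visible."""
    return list(local_fs[connected_node])

nodes = ["node-a", "node-b", "node-c"]  # hypothetical Drillbit hosts
table = list(range(9))                  # nine rows to write with the toy CTAS
fs = ctas(table, nodes)

for node in nodes:
    # Each connection sees a different, partial slice of the table.
    print(node, scan(fs, node))
```

In this model no single connection ever sees the whole table, which matches the partial and connection-dependent results reported here.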

My suggestion is that the local file system plug-in be disabled in distributed mode. With multiple Drillbits and a centralized plug-in for the local file system, consistent behavior cannot be expected.

It should either be disabled when distributed mode is detected, or we could add support for multiple namespaces (using the IP of each node) with local file systems, although that might still not fix all issues. There may also be other ways to resolve this that I am overlooking or not aware of.
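
As a rough sketch of the first option (disable on detection), the check could look like the following. Everything here is hypothetical: `local_fs_allowed` is not an existing Drill function, and treating "more than one Drillbit" as the definition of distributed mode is an assumption based on the description above.

```python
from urllib.parse import urlparse

def local_fs_allowed(connection: str, drillbit_count: int) -> bool:
    """Hypothetical check: reject a local file system connection
    when more than one Drillbit is running."""
    is_local = urlparse(connection).scheme == "file"
    distributed = drillbit_count > 1  # assumed definition of distributed mode
    return not (is_local and distributed)

# A single Drillbit may keep using the local file system.
print(local_fs_allowed("file:///", 1))
# With several Drillbits the local file system connection is refused.
print(local_fs_allowed("file:///", 3))
# A shared file system is unaffected (the host name is a placeholder).
print(local_fs_allowed("hdfs://namenode:8020/", 3))
```

The check keys only on the connection scheme, so it would not cover the multiple-namespace alternative mentioned above.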

Many issues have been reported on the user mailing list in which users observed this kind of inconsistent behavior.




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)