Posted to dev@drill.apache.org by "Daniel Barclay (Drill) (JIRA)" <ji...@apache.org> on 2015/09/21 20:56:04 UTC

[jira] [Created] (DRILL-3815) unknown suffixes .not_json and .json_not treated differently (multi-file case)

Daniel Barclay (Drill) created DRILL-3815:
---------------------------------------------

             Summary: unknown suffixes .not_json and .json_not treated differently (multi-file case)
                 Key: DRILL-3815
                 URL: https://issues.apache.org/jira/browse/DRILL-3815
             Project: Apache Drill
          Issue Type: Bug
          Components: Storage - Other
            Reporter: Daniel Barclay (Drill)
            Assignee: Jacques Nadeau


In scanning a directory subtree used as a table, unknown filename extensions seem to be treated differently depending on whether they're similar to known file extensions.  The behavior suggests that Drill checks whether a file name _contains_ an extension's string rather than whether it _ends_ with it.

For example, given these subtrees with almost identical leaf file names:

{noformat}
$ find /tmp/testext_xx_json/
/tmp/testext_xx_json/
/tmp/testext_xx_json/voter2.not_json
/tmp/testext_xx_json/voter1.json
$ find /tmp/testext_json_xx/
/tmp/testext_json_xx/
/tmp/testext_json_xx/voter1.json
/tmp/testext_json_xx/voter2.json_not
$ 
{noformat}

the results of trying to use them as tables differ:

{noformat}
0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_xx_json`;
Sep 21, 2015 11:41:50 AM org.apache.calcite.sql.validate.SqlValidatorException <init>
...
Error: VALIDATION ERROR: From line 1, column 17 to line 1, column 25: Table 'dfs.tmp.testext_xx_json' not found


[Error Id: 6fe41deb-0e39-43f6-beca-de27b39d276b on dev-linux2:31010] (state=,code=0)
0: jdbc:drill:zk=local> SELECT *   FROM `dfs.tmp`.`testext_json_xx`;
+-----------------------+
|         onecf         |
+-----------------------+
| {"name":"someName1"}  |
| {"name":"someName2"}  |
+-----------------------+
2 rows selected (0.149 seconds)
{noformat}

(Other probing seems to indicate that there is also some sensitivity to whether the extension contains an underscore character.)
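
To make the suspected difference concrete, here is a minimal Java sketch comparing a substring check with a suffix check on the file names above. It is only an illustration of the hypothesis, not Drill's actual matching code; the class name ExtensionMatchSketch and both helper methods are hypothetical, and the sketch does not try to model the underscore sensitivity mentioned above.

```java
public class ExtensionMatchSketch {

    // Hypothetical helper: the suspected behavior, i.e. the file name merely
    // CONTAINS "." + extension somewhere.
    public static boolean matchesBySubstring(String fileName, String extension) {
        return fileName.contains("." + extension);
    }

    // Hypothetical helper: the expected behavior, i.e. the file name ENDS
    // with "." + extension.
    public static boolean matchesBySuffix(String fileName, String extension) {
        return fileName.endsWith("." + extension);
    }

    public static void main(String[] args) {
        // File names taken from the two test directories in this report.
        String[] names = {"voter1.json", "voter2.json_not", "voter2.not_json"};
        for (String name : names) {
            System.out.println(name
                    + "  contains=" + matchesBySubstring(name, "json")
                    + "  endsWith=" + matchesBySuffix(name, "json"));
        }
    }
}
```

Under the substring check, voter2.json_not counts as a JSON file while voter2.not_json does not, which would be consistent with testext_json_xx being readable as a table and testext_xx_json not being found. Under the suffix check, neither unknown suffix would match, so both directories would be treated the same way.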


