Posted to user@drill.apache.org by "Updike, Clark" <Cl...@jhuapl.edu> on 2020/07/23 15:23:47 UTC

HDFS file is listable but not queryable (object not found)

This is in Drill 1.17.  I can use SHOW FILES to list the file I'm targeting, but I cannot query it:

apache drill> show files in hdfs.root.`/tmp/employee.json`;
+---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
|     name      | isDirectory | isFile | length |  owner   |   group    | permissions |       accessTime        |    modificationTime     |
+---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
| employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
+---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
1 row selected (3.039 seconds)


apache drill> select * from hdfs.root.`/tmp/employee.json`;
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
[Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)

The storage plugin uses the standard json config:

    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    },

I can't see any problems on the HDFS side.  Full stack trace is below.

Any ideas what could be causing this behavior?

Thanks, Clark



FULL STACKTRACE:

apache drill> select * from hdfs.root.`/tmp/employee.json`;
Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'


[Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]

  (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
    sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
    sun.reflect.NativeConstructorAccessorImpl.newInstance():62
    sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
    java.lang.reflect.Constructor.newInstance():423
    org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
    org.apache.calcite.sql.SqlUtil.newContextException():824
    org.apache.calcite.sql.SqlUtil.newContextException():809
    org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
    org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
    org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    org.apache.calcite.sql.SqlSelect.validate():216
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
    org.apache.drill.exec.planner.sql.SqlConverter.validate():218
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
    org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
    org.apache.drill.exec.work.foreman.Foreman.runSQL():590
    org.apache.drill.exec.work.foreman.Foreman.run():275
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745
  Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
    sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
    sun.reflect.NativeConstructorAccessorImpl.newInstance():62
    sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
    java.lang.reflect.Constructor.newInstance():423
    org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
    org.apache.calcite.runtime.Resources$ExInst.ex():572
    org.apache.calcite.sql.SqlUtil.newContextException():824
    org.apache.calcite.sql.SqlUtil.newContextException():809
    org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
    org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
    org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
    org.apache.calcite.sql.SqlSelect.validate():216
    org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
    org.apache.drill.exec.planner.sql.SqlConverter.validate():218
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
    org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
    org.apache.drill.exec.work.foreman.Foreman.runSQL():590
    org.apache.drill.exec.work.foreman.Foreman.run():275
    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
    java.lang.Thread.run():745 (state=,code=0)
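
A long trace like the one above is easier to read once it is grouped by package. The following is a minimal sketch, not part of the original post: it parses a handful of frames copied from the trace and counts them per package prefix. The helper names `parse_frames` and `packages` are hypothetical. Every frame falls under Calcite validation or Drill's planner, which suggests the failure is raised while the table name is being resolved during planning rather than while the file is being read.

```python
import re
from collections import Counter

# A few frames copied from the trace above; the full trace could be pasted instead.
TRACE = """\
org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
org.apache.drill.exec.planner.sql.SqlConverter.validate():218
org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
org.apache.drill.exec.work.foreman.Foreman.runSQL():590
"""

# Matches lines shaped like "fully.qualified.Class.method():123".
FRAME = re.compile(r"^\s*([\w.$]+)\.(\w+)\(\):(-?\d+)\s*$")


def parse_frames(trace):
    """Return (class_name, method, line_number) for each frame line (hypothetical helper)."""
    frames = []
    for line in trace.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append((match.group(1), match.group(2), int(match.group(3))))
    return frames


def packages(frames, depth=4):
    """Count frames per package prefix, using the first `depth` dotted components."""
    return Counter(".".join(cls.split(".")[:depth]) for cls, _, _ in frames)


frames = parse_frames(TRACE)
summary = packages(frames)
for prefix, count in summary.items():
    print(prefix, count)
```

None of the sampled frames mention a reader or scan class, so the sketch only narrows down where to look; it does not explain why the lookup fails.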

Re: HDFS file is listable but not queryable (object not found)

Posted by Rafael Jaimes III <ra...@gmail.com>.
Right, but do you need the rest of the config that sits at the top of the
default dfs config? Here's what I assume to be the full config, taken from my
1.17 dfs config (with the other formats deleted):

{
  "type": "file",
  "connection": "file:///",
  "config": null,
  "workspaces": {
    "tmp": {
      "location": "/tmp",
      "writable": true,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    },
    "root": {
      "location": "/",
      "writable": false,
      "defaultInputFormat": null,
      "allowAccessOutsideWorkspace": false
    }
  },
  "formats": {
    "json": {
      "type": "json",
      "extensions": [
        "json"
      ]
    }
  },
  "enabled": true
}
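
A config like the one above can be sanity-checked before running a query. The following is a minimal sketch under stated assumptions: `describe` is a hypothetical helper, not a Drill API, and the JSON is a trimmed copy of the config shown above. It reports which filesystem the `connection` points at, whether the workspace exists, and which format entries claim the file's extension. Note that `file:///` refers to the local filesystem; a plugin that targets a Hadoop cluster would normally carry an `hdfs://` connection string instead.

```python
import json
from urllib.parse import urlparse

# Trimmed copy of the config shown above (assumption: the real plugin has more fields).
PLUGIN_JSON = """
{
  "type": "file",
  "connection": "file:///",
  "workspaces": {
    "root": {"location": "/", "writable": false}
  },
  "formats": {
    "json": {"type": "json", "extensions": ["json"]}
  },
  "enabled": true
}
"""


def describe(plugin, workspace, path):
    """Summarize how a plugin config relates to `path` (hypothetical helper)."""
    scheme = urlparse(plugin["connection"]).scheme
    ws = plugin["workspaces"].get(workspace)
    # Extension is whatever follows the last dot, lower-cased; empty if there is no dot.
    ext = path.rsplit(".", 1)[-1].lower() if "." in path else ""
    matching = [
        name
        for name, fmt in plugin["formats"].items()
        if ext in fmt.get("extensions", [])
    ]
    return {
        "filesystem": scheme,
        "workspace_found": ws is not None,
        "workspace_location": ws["location"] if ws else None,
        "formats_for_extension": matching,
        "enabled": plugin.get("enabled", False),
    }


report = describe(json.loads(PLUGIN_JSON), "root", "/tmp/employee.json")
for key, value in report.items():
    print(f"{key}: {value}")
```

Running the same check against the exported `hdfs` plugin JSON would show whether the `formats` block and the `root` workspace are actually present in that plugin.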

- Rafael

On Thu, Jul 23, 2020 at 11:37 AM Charles Givre <cg...@gmail.com> wrote:

> Rafael,
> Clark is using the filesystem plugin to query a Hadoop cluster.  It seems
> weird that you can enumerate the files in a directory but when you try to
> query that file, it breaks...
> -- C

Re: HDFS file is listable but not queryable (object not found)

Posted by Charles Givre <cg...@gmail.com>.
Rafael, 
Clark is using the filesystem plugin to query a Hadoop cluster.  It seems weird that you can enumerate the files in a directory but when you try to query that file, it breaks... 
-- C



> On Jul 23, 2020, at 11:35 AM, Rafael Jaimes III <ra...@gmail.com> wrote:
> 
> Hi all,
> 
> It looks like the file is 644 already which should be good.
> I'm confused why the schema is called hdfs. dfs is a pre-built schema for
> HDFS and querying against flat files such as .json as you're trying to do.
> The default config for dfs also has a lot more content than what you
> pasted. Can you use the default and try again?
> 
> Hope this helps,
> Rafael


Re: HDFS file is listable but not queryable (object not found)

Posted by Rafael Jaimes III <ra...@gmail.com>.
Hi all,

It looks like the file is 644 already which should be good.
I'm confused why the schema is called hdfs. dfs is a pre-built schema for
HDFS and querying against flat files such as .json as you're trying to do.
The default config for dfs also has a lot more content than what you
pasted. Can you use the default and try again?

Hope this helps,
Rafael


On Thu, Jul 23, 2020 at 11:30 AM Charles Givre <cg...@gmail.com> wrote:

> Hi Clark,
> That's strange.  My initial thought is that this could be a permission
> issue.  However, it might also be that Drill isn't finding the file for
> some reason.
>
> Could you try:
>
> SELECT *
> FROM hdfs.`<full hdfs path to file>`
>
> Best,
> --- C
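
The suggestion quoted above can be turned into a short list of query variants to try one after another. This is a minimal sketch: `candidate_queries` is a hypothetical helper, and the plugin name, workspace, and path are simply the ones from the original post. The variants differ only in whether the workspace is named and whether the path is absolute or relative to the workspace location.

```python
def candidate_queries(plugin, workspace, path):
    """Return SELECT variants for one file (hypothetical helper; names are illustrative)."""
    relative = path.lstrip("/")
    return [
        # Plugin only, absolute path (the form suggested above).
        f"SELECT * FROM {plugin}.`{path}` LIMIT 5",
        # Plugin plus workspace, absolute path (the form that failed in the original post).
        f"SELECT * FROM {plugin}.{workspace}.`{path}` LIMIT 5",
        # Plugin plus workspace, path relative to the workspace location.
        f"SELECT * FROM {plugin}.{workspace}.`{relative}` LIMIT 5",
    ]


queries = candidate_queries("hdfs", "root", "/tmp/employee.json")
for sql in queries:
    print(sql)
```

Comparing which variants succeed would help separate a path-resolution problem from a format-matching problem.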
>
>
> > On Jul 23, 2020, at 11:23 AM, Updike, Clark <Cl...@jhuapl.edu>
> wrote:
> >
> > This is in 1.17.  I can use SHOW FILES to list the file I'm targeting,
> but I cannot query it:
> >
> > apache drill> show files in hdfs.root.`/tmp/employee.json`;
> >
> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> > |     name      | isDirectory | isFile | length |  owner   |   group
> | permissions |       accessTime        |    modificationTime     |
> >
> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> > | employee.json | false       | true   | 474630 | me       | supergroup
> | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
> >
> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> > 1 row selected (3.039 seconds)
> >
> >
> > apache drill> select * from hdfs.root.`/tmp/employee.json`;
> > Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
> Object '/tmp/employee.json' not found within 'hdfs.root'
> > [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
> >
> > The storage plugin uses the standard json config:
> >
> >    "json": {
> >      "type": "json",
> >      "extensions": [
> >        "json"
> >      ]
> >    },
> >
> > I can't see any problems on the HDFS side.  Full stack trace is below.
> >
> > Any ideas what could be causing this behavior?
> >
> > Thanks, Clark
> >
> >
> >
> > FULL STACKTRACE:
> >
> > apache drill> select * from hdfs.root.`/tmp/employee.json`;
> > Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18:
> Object '/tmp/employee.json' not found within 'hdfs.root'
> >
> >
> > [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
> >
> >  (org.apache.calcite.runtime.CalciteContextException) From line 1,
> column 15 to line 1, column 18: Object '/tmp/employee.json' not found
> within 'hdfs.root'
> >    sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> >    sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> >    sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> >    java.lang.reflect.Constructor.newInstance():423
> >    org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
> >    org.apache.calcite.sql.SqlUtil.newContextException():824
> >    org.apache.calcite.sql.SqlUtil.newContextException():809
> >
> org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
> >    org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
> >    org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
> >    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
> >
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
> >
> org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
> >    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
> >    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >    org.apache.calcite.sql.SqlSelect.validate():216
> >
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
> >    org.apache.drill.exec.planner.sql.SqlConverter.validate():218
> >
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
> >
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
> >
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
> >    org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> >    org.apache.drill.exec.work.foreman.Foreman.run():275
> >    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> >    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> >    java.lang.Thread.run():745
> >  Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
> >    sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
> >    sun.reflect.NativeConstructorAccessorImpl.newInstance():62
> >    sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
> >    java.lang.reflect.Constructor.newInstance():423
> >    org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
> >    org.apache.calcite.runtime.Resources$ExInst.ex():572
> >    org.apache.calcite.sql.SqlUtil.newContextException():824
> >    org.apache.calcite.sql.SqlUtil.newContextException():809
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
> >    org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
> >    org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
> >    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
> >    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
> >    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
> >    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
> >    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
> >    org.apache.calcite.sql.SqlSelect.validate():216
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
> >    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
> >    org.apache.drill.exec.planner.sql.SqlConverter.validate():218
> >    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
> >    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
> >    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
> >    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
> >    org.apache.drill.exec.work.foreman.Foreman.runSQL():590
> >    org.apache.drill.exec.work.foreman.Foreman.run():275
> >    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
> >    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
> >    java.lang.Thread.run():745 (state=,code=0)
>
>

Re: HDFS file is listable but not queryable (object not found)

Posted by Charles Givre <cg...@gmail.com>.
Hi Clark, 
That's strange.  My initial thought is that this could be a permission issue.  However, it might also be that Drill isn't finding the file for some reason. 
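
As a rough way to sanity-check the permission theory, here is a minimal Python sketch that reads the permission string from the SHOW FILES row the way basic owner/group/other rules would. The function name can_read and the user name "drill" are hypothetical (the thread does not say which user the Drillbit runs as), and the sketch ignores HDFS ACLs, the HDFS superuser, and execute bits on parent directories, so treat it as an illustration only.

```python
def can_read(permissions, owner, group, user, user_groups=()):
    """Return True if `user` has the read bit under plain owner/group/other rules.

    Simplification: ignores HDFS ACLs, the superuser, and the execute bits
    required on parent directories.
    """
    if len(permissions) != 9:
        raise ValueError("expected a 9-character string such as 'rw-r--r--'")

    owner_bits = permissions[0:3]
    group_bits = permissions[3:6]
    other_bits = permissions[6:9]

    # Exactly one class applies: owner first, then group, then other.
    if user == owner:
        bits = owner_bits
    elif group in user_groups:
        bits = group_bits
    else:
        bits = other_bits

    return bits[0] == "r"


# Owner, group, and permissions are taken from the SHOW FILES output;
# "drill" is an assumed name for the user the Drillbit runs as.
print(can_read("rw-r--r--", "me", "supergroup", "drill"))
```

With rw-r--r-- the "other" class has the read bit, so under these simplified rules the file itself would be readable by any user; a permission problem, if there is one, would more likely sit on a parent directory or in an ACL.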

Could you try:

SELECT * 
FROM hdfs.`<full hdfs path to file>`

Best,
--- C


> On Jul 23, 2020, at 11:23 AM, Updike, Clark <Cl...@jhuapl.edu> wrote:
> 
> This is in 1.17.  I can use SHOW FILES to list the file I'm targeting, but I cannot query it:
> 
> apache drill> show files in hdfs.root.`/tmp/employee.json`;
> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> |     name      | isDirectory | isFile | length |  owner   |   group    | permissions |       accessTime        |    modificationTime     |
> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> | employee.json | false       | true   | 474630 | me       | supergroup | rw-r--r--   | 2020-07-23 10:53:15.055 | 2020-07-23 10:53:15.387 |
> +---------------+-------------+--------+--------+----------+------------+-------------+-------------------------+-------------------------+
> 1 row selected (3.039 seconds)
> 
> 
> apache drill> select * from hdfs.root.`/tmp/employee.json`;
> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> [Error Id: 3b833622-4fac-4ecc-becd-118291cd8560 ] (state=,code=0)
> 
> The storage plugin uses the standard json config:
> 
>    "json": {
>      "type": "json",
>      "extensions": [
>        "json"
>      ]
>    },
> 
> I can't see any problems on the HDFS side.  Full stack trace is below.
> 
> Any ideas what could be causing this behavior?
> 
> Thanks, Clark
> 
> 
> 
> FULL STACKTRACE:
> 
> apache drill> select * from hdfs.root.`/tmp/employee.json`;
> Error: VALIDATION ERROR: From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
> 
> 
> [Error Id: 69c8ffc0-4933-4008-a786-85ad623578ea ]
> 
>  (org.apache.calcite.runtime.CalciteContextException) From line 1, column 15 to line 1, column 18: Object '/tmp/employee.json' not found within 'hdfs.root'
>    sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
>    sun.reflect.NativeConstructorAccessorImpl.newInstance():62
>    sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
>    java.lang.reflect.Constructor.newInstance():423
>    org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
>    org.apache.calcite.sql.SqlUtil.newContextException():824
>    org.apache.calcite.sql.SqlUtil.newContextException():809
>    org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
>    org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
>    org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
>    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
>    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
>    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
>    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
>    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>    org.apache.calcite.sql.SqlSelect.validate():216
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
>    org.apache.drill.exec.planner.sql.SqlConverter.validate():218
>    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
>    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
>    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
>    org.apache.drill.exec.work.foreman.Foreman.runSQL():590
>    org.apache.drill.exec.work.foreman.Foreman.run():275
>    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>    java.lang.Thread.run():745
>  Caused By (org.apache.calcite.sql.validate.SqlValidatorException) Object '/tmp/employee.json' not found within 'hdfs.root'
>    sun.reflect.NativeConstructorAccessorImpl.newInstance0():-2
>    sun.reflect.NativeConstructorAccessorImpl.newInstance():62
>    sun.reflect.DelegatingConstructorAccessorImpl.newInstance():45
>    java.lang.reflect.Constructor.newInstance():423
>    org.apache.calcite.runtime.Resources$ExInstWithCause.ex():463
>    org.apache.calcite.runtime.Resources$ExInst.ex():572
>    org.apache.calcite.sql.SqlUtil.newContextException():824
>    org.apache.calcite.sql.SqlUtil.newContextException():809
>    org.apache.calcite.sql.validate.SqlValidatorImpl.newValidationError():4805
>    org.apache.calcite.sql.validate.IdentifierNamespace.resolveImpl():127
>    org.apache.calcite.sql.validate.IdentifierNamespace.validateImpl():177
>    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3109
>    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom():3091
>    org.apache.drill.exec.planner.sql.SqlConverter$DrillValidator.validateFrom():298
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect():3363
>    org.apache.calcite.sql.validate.SelectNamespace.validateImpl():60
>    org.apache.calcite.sql.validate.AbstractNamespace.validate():84
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace():995
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery():955
>    org.apache.calcite.sql.SqlSelect.validate():216
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression():930
>    org.apache.calcite.sql.validate.SqlValidatorImpl.validate():637
>    org.apache.drill.exec.planner.sql.SqlConverter.validate():218
>    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode():665
>    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert():199
>    org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan():172
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan():282
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan():162
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan():127
>    org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan():92
>    org.apache.drill.exec.work.foreman.Foreman.runSQL():590
>    org.apache.drill.exec.work.foreman.Foreman.run():275
>    java.util.concurrent.ThreadPoolExecutor.runWorker():1142
>    java.util.concurrent.ThreadPoolExecutor$Worker.run():617
>    java.lang.Thread.run():745 (state=,code=0)