Posted to user@drill.apache.org by Tim Harper <ti...@gmail.com> on 2015/09/24 20:32:00 UTC

Querying local filesystem directory: How do I access the filename?

With Apache Drill, I'm able to query a directory of JSON files just fine
by running:

select * from file.`/path/to/data` t;

All of the JSON files are selected, and the data comes back as I'd expect.
However, no fields are returned describing which file each row came from.
I'd like to be able to use that information in my query as well.

I apologize if this is clearly documented somewhere. I've looked and I'm
having a hard time finding it.

This is the configuration for my file storage plugin:

{
  "type": "file",
  "enabled": true,
  "connection": "file:///",
  "workspaces": {
    "scimmia": {
      "location": "/home/scimmia",
      "writable": false,
      "defaultInputFormat": null
    }
  },
  "formats": {
    "json": {
      "type": "json"
    }
  }
}

Re: Querying local filesystem directory: How do I access the filename?

Posted by Daniel Barclay <db...@maprtech.com>.
DRILL-3425 doesn't seem to be about adding filename information to the directory name information that Drill already provides.

Also, other users have requested the addition of the file's simple name to the pathname segments currently available through dir0, dir1, etc.  See
DRILL-3474 (Filename should be an available column when querying a directory) <https://issues.apache.org/jira/browse/DRILL-3474>.
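
As a rough sketch of what that request would enable: assuming a Drill release in which DRILL-3474 has been implemented and exposes an implicit column named filename (later Drill documentation describes such a column, but treat its availability in your version as an assumption), a query along these lines would report how many rows came from each file. The path is the placeholder from the original question.

```sql
-- Assumption: the running Drill version provides an implicit `filename` column.
-- Implicit columns are typically not part of `select *`, so name it explicitly.
select t.filename, count(*) as row_count
from file.`/path/to/data` t
group by t.filename;
```

On a release without that column, the reference would most likely either fail or be read as an ordinary, absent JSON field, so it is worth checking the result against a directory whose contents are known.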

Daniel

-- 
Daniel Barclay
MapR Technologies


Re: Querying local filesystem directory: How do I access the filename?

Posted by Christopher Matta <cm...@mapr.com>.
It seems this was brought up in DRILL-3425
<https://issues.apache.org/jira/browse/DRILL-3425> and identified as
something that was not in the design spec.

Chris Matta
cmatta@mapr.com
215-701-3146

Re: Querying local filesystem directory: How do I access the filename?

Posted by Tim Harper <ti...@gmail.com>.
So, if the files are in subdirectories, I DO get dir0, but if I query a
single directory with a handful of files, there is no way (it seems) to see
any file name information.
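
To illustrate the dir0 behavior described above, here is a minimal sketch. It assumes a hypothetical layout in which /path/to/data contains subdirectories such as 2015 and 2016, each holding JSON files; those directory names are made up for the example.

```sql
-- Hypothetical layout: /path/to/data/2015/*.json and /path/to/data/2016/*.json
-- dir0 holds the name of the first-level subdirectory each row was read from,
-- so it can be used to restrict the query to one subdirectory.
select *
from file.`/path/to/data` t
where t.dir0 = '2015';
```

With all of the files sitting directly in /path/to/data there is no subdirectory level, so dir0 gives nothing to distinguish one file from another, which is the gap this thread is about.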
