Posted to user@drill.apache.org by Edmon Begoli <eb...@gmail.com> on 2015/08/20 16:52:49 UTC

Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

I have a large number of .txt files that are individually tarballed (i.e.
compressed from .txt to .txt.tar.gz).
(I received them like this.)

Each .txt file is actually a PSV (pipe-separated values) file but, for some
reason, was saved with a .txt extension.

I have created a custom 'txt' storage configuration, and I can query an
uncompressed .txt file, structured as PSV, without a problem:

select * from dfs.`/<my path here>/<myfile>_08072015.txt`;

When I try to query a compressed file, I get an error:

0: jdbc:drill:zk=local> select * from dfs.`/<my path here>/<myfile>_08072015.txt.tar.gz`;

Aug 20, 2015 10:43:01 AM org.apache.calcite.sql.validate.SqlValidatorException <init>
SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 'dfs./<my path here>/<myfile>_08072015.txt.tar.gz' not found

Aug 20, 2015 10:43:01 AM org.apache.calcite.runtime.CalciteException <init>
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 15 to line 1, column 17: Table 'dfs/<my path here>/<myfile>_08072015.txt.tar.gz' not found

Error: PARSE ERROR: From line 1, column 15 to line 1, column 17: Table 'dfs./<my path here>/<myfile>_08072015.txt.tar.gz' not found

[Error Id: fdba6efb-d256-437c-8355-ce09f087537c on 192.168.1.16:31010] (state=,code=0)

0: jdbc:drill:zk=local>


Is it possible to fix this, or is Drill just not able to recognize a custom
format inside a tar/gz-compressed file?

Please advise.

Re: Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

Posted by Edmon Begoli <eb...@gmail.com>.
I just replied to your email. I was able to fix it by changing the accepted
extensions in the format configuration, not by renaming the files.

On Thu, Aug 20, 2015 at 11:44 AM, Kristine Hahn <kh...@maprtech.com> wrote:

> Or, use _08072015.txt.gz  instead of _08072015.txt.tar.gz as shown in
>
> http://drill.apache.org/docs/querying-plain-text-files/#query-the-gz-file-directly/
>
> Kristine Hahn
> Sr. Technical Writer
> 415-497-8107 @krishahn skype:krishahn

Re: Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

Posted by Kristine Hahn <kh...@maprtech.com>.
Or, use _08072015.txt.gz instead of _08072015.txt.tar.gz, as shown in
http://drill.apache.org/docs/querying-plain-text-files/#query-the-gz-file-directly/

Kristine Hahn
Sr. Technical Writer
415-497-8107 @krishahn skype:krishahn
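
For anyone who would rather convert the archives than change the storage configuration: renaming alone does not remove the tar layer, because a gunzipped tarball still starts with a tar header block ahead of the file contents. A minimal Python sketch of repacking each single-file .txt.tar.gz as a plain .txt.gz follows. The function name repack_tar_gz and the one-file-per-archive check are assumptions made for illustration; they do not come from the Drill docs.

```python
import gzip
import shutil
import tarfile
from pathlib import Path


def repack_tar_gz(src: Path, dest_dir: Path) -> Path:
    """Write the single file inside `src` (a .tar.gz) to dest_dir as <name>.gz."""
    with tarfile.open(src, mode="r:gz") as archive:
        # Only regular files count; directories and links are ignored.
        members = [m for m in archive.getmembers() if m.isfile()]
        if len(members) != 1:
            raise ValueError(
                f"{src}: expected exactly one file, found {len(members)}"
            )
        member = members[0]
        # Drop any directory prefix stored in the archive.
        dest = dest_dir / (Path(member.name).name + ".gz")
        extracted = archive.extractfile(member)
        # Stream the contents so large files are never held in memory.
        with extracted, gzip.open(dest, "wb") as out:
            shutil.copyfileobj(extracted, out)
    return dest
```

Calling repack_tar_gz(Path("<myfile>_08072015.txt.tar.gz"), Path("out")) would be expected to leave out/<myfile>_08072015.txt.gz, assuming the archive member is named <myfile>_08072015.txt.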


On Thu, Aug 20, 2015 at 8:29 AM, Kristine Hahn <kh...@maprtech.com> wrote:

> To be more exact re: the pointer to the doc, the renaming example is in
> step 3 of
> http://drill.apache.org/docs/querying-plain-text-files/#download-and-set-up-the-data
> .
>
> Kristine Hahn
> Sr. Technical Writer
> 415-497-8107 @krishahn skype:krishahn

Re: Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

Posted by Andries Engelbrecht <ae...@maprtech.com>.
Interesting that it took gz as an extension type. I have used tar as an extension type for tar.gz files and it worked fine.

—Andries
> On Aug 20, 2015, at 8:45 AM, Edmon Begoli <eb...@gmail.com> wrote:
> 
> Actually, it looks like adding extensions to the storage format solves
> the problem:
> 
> "formats": {
>    "psv": {
>      "type": "text",
>      "extensions": [
>        "tbl",
>        "txt",
>        "gz"
>      ],
> 
> This worked for me.

Re: Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

Posted by Edmon Begoli <eb...@gmail.com>.
Actually, it looks like adding extensions to the storage format solves
the problem:

"formats": {
    "psv": {
      "type": "text",
      "extensions": [
        "tbl",
        "txt",
        "gz"
      ],

This worked for me.
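
The snippet above is cut off after the extensions list. As a rough illustration only, the same change can be expressed as a small Python helper that adds extensions to one format entry of a storage plugin configuration held as a dict. The helper name add_extensions is hypothetical, and the "delimiter": "|" key in the sample is an assumption based on the stock psv format rather than something shown in this thread.

```python
import copy
import json


def add_extensions(plugin_config: dict, format_name: str, extensions: list) -> dict:
    """Return a copy of plugin_config with extra extensions on one format."""
    updated = copy.deepcopy(plugin_config)  # leave the caller's dict untouched
    fmt = updated["formats"][format_name]
    for ext in extensions:
        if ext not in fmt["extensions"]:  # skip extensions already listed
            fmt["extensions"].append(ext)
    return updated


# Sample psv entry. The "delimiter" key is an assumption: the snippet in the
# thread ends right after the "extensions" list.
config = {
    "formats": {
        "psv": {"type": "text", "extensions": ["tbl"], "delimiter": "|"}
    }
}

print(json.dumps(add_extensions(config, "psv", ["txt", "gz"]), indent=2))
```

The printed JSON could then be compared against, or pasted into, the formats section of the storage configuration.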

On Thu, Aug 20, 2015 at 11:29 AM, Kristine Hahn <kh...@maprtech.com> wrote:

> To be more exact re: the pointer to the doc, the renaming example is in
> step 3 of
>
> http://drill.apache.org/docs/querying-plain-text-files/#download-and-set-up-the-data
> .
>
> Kristine Hahn
> Sr. Technical Writer
> 415-497-8107 @krishahn skype:krishahn

Re: Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

Posted by Kristine Hahn <kh...@maprtech.com>.
To be more exact re: the pointer to the doc, the renaming example is in
step 3 of
http://drill.apache.org/docs/querying-plain-text-files/#download-and-set-up-the-data
.

Kristine Hahn
Sr. Technical Writer
415-497-8107 @krishahn skype:krishahn


On Thu, Aug 20, 2015 at 8:27 AM, Kristine Hahn <kh...@maprtech.com> wrote:

> You need to rename the tar/gz to use tbl extension to query PSV if you use
> the default dfs storage configuration. See TSV example on
> http://drill.apache.org/docs/querying-plain-text-files/#example-of-querying-a-tsv-file
> .
>
> Kristine Hahn
> Sr. Technical Writer
> 415-497-8107 @krishahn skype:krishahn

Re: Error when querying .gz file that contains single text file (actually psv, but with txt extension) - Drill 1.1.0

Posted by Kristine Hahn <kh...@maprtech.com>.
You need to rename the tar/gz file to use the tbl extension to query PSV if you
use the default dfs storage configuration. See the TSV example at
http://drill.apache.org/docs/querying-plain-text-files/#example-of-querying-a-tsv-file

Kristine Hahn
Sr. Technical Writer
415-497-8107 @krishahn skype:krishahn

