Posted to dev@drill.apache.org by "Sharma, Tapan" <ta...@hp.com> on 2016/02/23 23:53:09 UTC

failing to drill zips of jsons

Hi Gang,

I read the following email: http://mail-archives.apache.org/mod_mbox/drill-dev/201412.mbox/%3CCAMpYv7By21Rj1nW5KUhbwN-19mZjPE4Np3Kzzu294tExR96hSQ@mail.gmail.com%3E

I was trying to query a zip of JSON files, and it fails with a validation error. Any clue what might be wrong?


taps@ubuntu:~/data/temp$ file json.zip
json.zip: Zip archive data, at least v1.0 to extract

0: jdbc:drill:zk=local> select count(*) from dfs.`/home/taps/data/temp/json.zip`;
Feb 21, 2016 2:35:26 PM org.apache.calcite.sql.validate.SqlValidatorException <init>
SEVERE: org.apache.calcite.sql.validate.SqlValidatorException: Table 'dfs./home/taps/data/temp/json.zip' not found
Feb 21, 2016 2:35:26 PM org.apache.calcite.runtime.CalciteException <init>
SEVERE: org.apache.calcite.runtime.CalciteContextException: From line 1, column 22 to line 1, column 24: Table 'dfs./home/taps/data/temp/json.zip' not found
Error: VALIDATION ERROR: From line 1, column 22 to line 1, column 24: Table 'dfs./home/taps/data/temp/json.zip' not found


[Error Id: f734308a-8c5b-4140-b813-08572f9a54c1 on ubuntu:31010] (state=,code=0)

If I unzip the JSONs and run the query, it works. I unzipped them in the very same temp directory and moved the json.zip file away.
0: jdbc:drill:zk=local> select count(*) from dfs.`/home/taps/data/temp/`;
+---------+
| EXPR$0  |
+---------+
| 284     |
+---------+
1 row selected (0.632 seconds)


Thanks,
Tapan

Re: failing to drill zips of jsons

Posted by Jason Altekruse <al...@gmail.com>.
Drill needs to know what format is stored underneath the compression. By
default, this is conveyed with a compound extension (I don't know if there
is an accepted term for this practice).

You should be able to read the file if you name it data.json.zip.
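
To illustrate what I mean by a compound extension: the last suffix names
the compression and the suffix before it names the data format. The sketch
below is only an illustration in Python, not Drill's actual code; the
function name split_compound_extension and the set of compression suffixes
are assumptions made up for the example.

```python
from pathlib import PurePosixPath

# Assumed set of compression suffixes, for illustration only.
COMPRESSION_SUFFIXES = {".zip", ".gz", ".bz2"}


def split_compound_extension(filename):
    """Return (data_format, compression) guessed from the file name.

    Either element is None when it cannot be determined from the name.
    """
    suffixes = PurePosixPath(filename).suffixes
    compression = None
    # The trailing suffix is treated as the compression, if recognized.
    if suffixes and suffixes[-1].lower() in COMPRESSION_SUFFIXES:
        compression = suffixes.pop().lower().lstrip(".")
    # Whatever suffix remains at the end names the underlying data format.
    data_format = suffixes[-1].lower().lstrip(".") if suffixes else None
    return data_format, compression


print(split_compound_extension("json.zip"))       # (None, 'zip')
print(split_compound_extension("data.json.zip"))  # ('json', 'zip')
```

With json.zip there is no suffix left to name the format once .zip is
taken off, whereas data.json.zip leaves .json behind.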
