Posted to issues@drill.apache.org by "Aman Sinha (JIRA)" <ji...@apache.org> on 2014/12/20 21:16:14 UTC
[jira] [Created] (DRILL-1906) Parquet reader error when reading a subdirectory
Aman Sinha created DRILL-1906:
---------------------------------
Summary: Parquet reader error when reading a subdirectory
Key: DRILL-1906
URL: https://issues.apache.org/jira/browse/DRILL-1906
Project: Apache Drill
Issue Type: Bug
Reporter: Aman Sinha
I am not sure whether this is a regression, but on the current master branch Drill is unable to read a directory when both it and one of its subdirectories contain Parquet files. It tries to read a Parquet footer for the subdirectory itself instead of recursing into it. The same layout works fine with JSON files.
For example, here's my directory structure:
{code}
ls -lR /tmp/foo1
-rw-r--r-- 1 asinha wheel 132 Dec 20 11:10 0_0_0.parquet
drwxr-xr-x 3 asinha wheel 102 Dec 20 09:54 foo2
/tmp/foo1/foo2:
-rw-r--r-- 1 asinha wheel 132 Dec 16 16:14 0_0_0.parquet
{code}
Here's the failure and stack trace:
{code}
0: jdbc:drill:zk=local> select * from foo1;
Query failed: Query failed: Unexpected exception during fragment initialization: Internal error: Error while applying rule DrillTableRule, args [rel#660:EnumerableTableAccessRel.ENUMERABLE.ANY([]).[](table=[dfs, tmp, foo1])]
<skip>
Caused by: java.io.IOException: Could not read footer: java.io.IOException: Could not read footer for file DeprecatedRawLocalFileStatus{path=file:/tmp/foo1/foo2; isDirectory=true; modification_time=1419098040000; access_time=0; owner=; group=; permission=rwxrwxrwx; isSymlink=false}
at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:195) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.8.0-SNAPSHOT]
at parquet.hadoop.ParquetFileReader.readAllFootersInParallel(ParquetFileReader.java:208) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.8.0-SNAPSHOT]
at parquet.hadoop.ParquetFileReader.readFooters(ParquetFileReader.java:224) ~[parquet-hadoop-1.5.1-drill-r4.jar:0.8.0-SNAPSHOT]
at org.apache.drill.exec.store.parquet.ParquetGroupScan.readFooter(ParquetGroupScan.java:208) ~[drill-java-exec-0.8.0-SNAPSHOT-rebuffed.jar:0.8.0-SNAPSHOT]
{code}
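To illustrate the behavior I would expect, here is a minimal sketch of expanding a directory into its Parquet files before any footers are read. This is not Drill's code and not a proposed patch: the class name ParquetFileLister and its methods are hypothetical, and it uses plain java.nio.file on the local file system rather than the Hadoop FileSystem API that Drill goes through. The only point is that directories are descended into and never handed to the footer reader.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, for illustration only (not part of Drill or Parquet).
final class ParquetFileLister {

    // Returns every regular file under root whose name ends in ".parquet".
    // Directories are walked recursively and are never included in the result,
    // so a footer reader consuming this list only ever sees files.
    static List<Path> listParquetFiles(Path root) throws IOException {
        List<Path> files = new ArrayList<>();
        collect(root, files);
        return files;
    }

    private static void collect(Path path, List<Path> out) throws IOException {
        if (Files.isDirectory(path)) {
            // Recurse into the directory instead of treating it as a data file.
            try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
                for (Path child : children) {
                    collect(child, out);
                }
            }
        } else if (path.getFileName().toString().endsWith(".parquet")) {
            out.add(path);
        }
    }
}
```

With the directory structure above, calling ParquetFileLister.listParquetFiles on /tmp/foo1 should return the two 0_0_0.parquet paths and not /tmp/foo1/foo2 itself. The order of the returned paths is not guaranteed.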
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)