Posted to issues@drill.apache.org by "benj (JIRA)" <ji...@apache.org> on 2019/04/26 09:45:00 UTC
[jira] [Created] (DRILL-7219) Ignore hidden file problems
benj created DRILL-7219:
---------------------------
Summary: Ignore hidden file problems
Key: DRILL-7219
URL: https://issues.apache.org/jira/browse/DRILL-7219
Project: Apache Drill
Issue Type: Bug
Components: Storage - JSON, Storage - Parquet, Storage - Text & CSV
Affects Versions: 1.15.0
Reporter: benj
Drill seems to apply different hidden-file filtering rules depending on the file type.
* *Parquet*: hidden files (names starting with ".") are filtered out +whether+ we query the directory or the files with *
{code:java}
/* DirPqt
|--sub1.pqt
|--sub2.pqt
|--.sub3.pqt
*/
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirPqt`);
=> 2
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirPqt/*`);
=> 2
/* It's possible to request the hidden file */
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirPqt/.*`);
=> 1
/* But I don't know how to request visible and hidden files simultaneously (except with a UNION) */
{code}
* *CSV, JSON*: whether hidden files (names starting with ".") are filtered out +depends+ on whether we query the directory or the files
{code:java}
/* DirCSVH
|--sub1.csvh
|--sub2.csvh
|--.sub3.csvh
*/
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirCSVH`);
=> 2
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirCSVH/*`);
=> 3
/* As with Parquet, it's possible to request the hidden file */
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirCSVH/.*`);
=> 1
/* It's also possible to request only the visible files */
SELECT count(*) FROM (SELECT DISTINCT filename FROM ....`DirCSVH/[^.]*`);
=> 2
/* But I don't know how to request visible and hidden files simultaneously (except with a UNION) */
{code}
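The UNION workaround mentioned in the comments above might look like the following for the Parquet directory. This is only a sketch: it reuses the two globs from the queries above and keeps the workspace prefix elided as .... exactly as in those queries.
{code:sql}
/* Visible files come from the * glob, hidden files from the .* glob */
SELECT count(*) FROM (
  SELECT DISTINCT filename FROM ....`DirPqt/*`
  UNION
  SELECT DISTINCT filename FROM ....`DirPqt/.*`
);
{code}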
Some existing issues deal with hidden files, for example DRILL-2424.
But I did not find any description of this filtering in the documentation. I found that hidden file names start with "." or "_", but maybe there are other cases?
It's a little strange that the filtering rules differ depending on the file type.
It's also impractical not to be able to simply say whether or not we want hidden files, for example with something like:
{code:java}
SELECT * FROM ....`MyDir/[.]?*`;
{code}