Posted to issues@drill.apache.org by "Lucian Poth (JIRA)" <ji...@apache.org> on 2017/05/24 18:15:04 UTC

[jira] [Created] (DRILL-5535) Paging Problem with Querying Directories

Lucian Poth created DRILL-5535:
----------------------------------

             Summary: Paging Problem with Querying Directories
                 Key: DRILL-5535
                 URL: https://issues.apache.org/jira/browse/DRILL-5535
             Project: Apache Drill
          Issue Type: Bug
          Components: Functions - Drill
    Affects Versions: 1.10.0
         Environment: Debian 8
Hadoop with Kerberos security
            Reporter: Lucian Poth


The problem occurs with the following Drill query:
 "SELECT * FROM <<mySource>>
WHERE (dir0='Test1' AND dir1='TestDataSourceID1') 
   OR (dir0='Test2' AND dir1='TestDataSourceID2')  
LIMIT 2 OFFSET 0"

If this query is run twice, which file ends up in the result is random. So if I build a query to page through my results, I cannot tell which source a given page came from.
Differing schemas make this worse: if file1 contains the columns a, b, c and file2 contains the columns b, c, d, the first part of the results will, for example, contain the columns a, b, c, while the second part will contain a, b, c, d with a filled with null.

The example on your webpage (https://drill.apache.org/docs/querying-directories/) queries specific columns and orders the result without any paging, so I am wondering whether this problem only occurs when using the star in the query.
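
For comparison, a variant of the query with an explicit column list and an ORDER BY might look like the following. This is only a sketch: b and c are assumed to be the columns common to both files in the example above, and it is untested against Drill 1.10.0. In SQL generally, LIMIT/OFFSET without ORDER BY gives no guarantee about which rows are returned, so some fixed ordering is needed before paging can be repeatable.

```sql
-- Sketch only: b and c are assumed to exist in both files.
-- Selecting dir0 and dir1 shows which directory each row came from.
SELECT dir0, dir1, b, c
FROM <<mySource>>
WHERE (dir0='Test1' AND dir1='TestDataSourceID1')
   OR (dir0='Test2' AND dir1='TestDataSourceID2')
ORDER BY dir0, dir1, b, c  -- fixed sort order so LIMIT/OFFSET pages can repeat
LIMIT 2 OFFSET 0
```

Pages are only fully repeatable if the ORDER BY columns distinguish every row; rows that tie on all of them can still come back in either order.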



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)