Posted to issues@drill.apache.org by "Charles Givre (JIRA)" <ji...@apache.org> on 2019/06/24 18:43:01 UTC

[jira] [Created] (DRILL-7308) Incorrect Metadata from text file queries

Charles Givre created DRILL-7308:
------------------------------------

             Summary: Incorrect Metadata from text file queries
                 Key: DRILL-7308
                 URL: https://issues.apache.org/jira/browse/DRILL-7308
             Project: Apache Drill
          Issue Type: Bug
          Components: Metadata
    Affects Versions: 1.17.0
            Reporter: Charles Givre
         Attachments: domains.csvh

I'm noticing some strange behavior with the newest version of Drill. If you query a CSV file with a header row (.csvh), the response includes the following metadata:
 
SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
 
{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": [
    "domain"
  ],
  "rows": [
    {
      "domain": "thedataist.com"
    }
  ],
  "metadata": [
    "VARCHAR(0, 0)",
    "VARCHAR(0, 0)"
  ],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}
 
 
There are two issues here:
1.  VARCHAR is now reported with a precision, as VARCHAR(0, 0).
2.  The metadata array has two entries for a single column, twice as many as there should be.
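
Both symptoms can be checked mechanically against the response shown above. The following is a minimal sketch in Python, assuming the response is available as parsed JSON; the function name check_metadata is hypothetical and not part of Drill:

```python
import json
import re

# Response body copied from the report above (rows and columns condensed).
RESPONSE = json.loads("""
{
  "queryId": "22eee85f-c02c-5878-9735-091d18788061",
  "columns": ["domain"],
  "rows": [{"domain": "thedataist.com"}],
  "metadata": ["VARCHAR(0, 0)", "VARCHAR(0, 0)"],
  "queryState": "COMPLETED",
  "attemptedAutoLimit": 0
}
""")


def check_metadata(response):
    """Return a list of human-readable problems found in the metadata.

    Hypothetical helper for illustration only.
    """
    problems = []
    columns = response.get("columns", [])
    metadata = response.get("metadata", [])

    # Issue 2: there should be exactly one metadata entry per column.
    if len(metadata) != len(columns):
        problems.append(
            f"{len(columns)} column(s) but {len(metadata)} metadata entries"
        )

    # Issue 1: VARCHAR is reported with parameters in parentheses.
    for entry in metadata:
        if re.fullmatch(r"VARCHAR\(.*\)", entry):
            problems.append(f"VARCHAR reported with parameters: {entry}")

    return problems


for problem in check_metadata(RESPONSE):
    print(problem)
```

For the response above, the check flags the column/metadata count mismatch once and the parameterized VARCHAR once per metadata entry.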
 
Additionally, if you query a regular CSV file, where the fields are not extracted into named columns, you get the following:
 
  "rows": [
    {
      "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]"
    }
  ],
  "metadata": [
    "VARCHAR(0, 0)",
    "VARCHAR(0, 0)"
  ],
 


