Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2019/07/07 01:09:00 UTC
[jira] [Commented] (DRILL-7308) Incorrect Metadata from text file queries
[ https://issues.apache.org/jira/browse/DRILL-7308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16879779#comment-16879779 ]
Paul Rogers commented on DRILL-7308:
------------------------------------
It turns out that the change mentioned above *does not* work. The incorrect REST code above uses whether the precision is set as a proxy for whether it is non-zero. I tried modifying the {{PrimitiveColumnMetadata}} class to not set the precision when it is zero, but this caused some TPC-H tests to fail. It seems that other code relies on the precision being set, even if it is zero.
So, the only solution is to fix the REST code as described above; we can't work around the problem by mucking with other parts of Drill.
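To illustrate the distinction between "set" and "non-zero", here is a minimal, self-contained sketch. It is not the Drill REST code: {{TypeLabelSketch}}, {{ColumnType}}, {{labelBySetFlag}} and {{labelByValue}} are hypothetical names invented for this example, and a null {{Integer}} is assumed to stand in for "not set".

```java
// Hypothetical sketch, not Drill code. It only shows why "is set" and
// "is non-zero" give different type labels when precision is set to zero.
class TypeLabelSketch {

    // Stand-in for a column's type info; a null field means "not set".
    static final class ColumnType {
        final String name;
        final Integer precision;
        final Integer scale;

        ColumnType(String name, Integer precision, Integer scale) {
            this.name = name;
            this.precision = precision;
            this.scale = scale;
        }
    }

    // Problematic pattern: appends the precision whenever it is set,
    // even when the value is zero.
    static String labelBySetFlag(ColumnType type) {
        StringBuilder label = new StringBuilder(type.name);
        if (type.precision != null) {
            label.append('(').append(type.precision);
            if (type.scale != null) {
                label.append(", ").append(type.scale);
            }
            label.append(')');
        }
        return label.toString();
    }

    // Assumed fix: appends the precision only when it is set and greater than zero.
    static String labelByValue(ColumnType type) {
        StringBuilder label = new StringBuilder(type.name);
        if (type.precision != null && type.precision > 0) {
            label.append('(').append(type.precision);
            if (type.scale != null) {
                label.append(", ").append(type.scale);
            }
            label.append(')');
        }
        return label.toString();
    }

    public static void main(String[] args) {
        // Precision and scale are set, but both are zero, as in the text reader case.
        ColumnType varchar = new ColumnType("VARCHAR", 0, 0);
        System.out.println(labelBySetFlag(varchar)); // VARCHAR(0, 0)
        System.out.println(labelByValue(varchar));   // VARCHAR
    }
}
```

With a check on the value rather than on the set flag, the metadata producer can keep setting a zero precision (which other code appears to rely on) while the label shown to REST clients stays a plain VARCHAR.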
> Incorrect Metadata from text file queries
> -----------------------------------------
>
> Key: DRILL-7308
> URL: https://issues.apache.org/jira/browse/DRILL-7308
> Project: Apache Drill
> Issue Type: Bug
> Components: Metadata
> Affects Versions: 1.17.0
> Reporter: Charles Givre
> Priority: Major
> Attachments: Screen Shot 2019-06-24 at 3.16.40 PM.png, domains.csvh
>
>
> I'm noticing some strange behavior with the newest version of Drill. If you query a CSV file, you get the following metadata:
> {code:sql}
> SELECT * FROM dfs.test.`domains.csvh` LIMIT 1
> {code}
> {code:json}
> {
> "queryId": "22eee85f-c02c-5878-9735-091d18788061",
> "columns": [
> "domain"
> ],
> "rows": [
> { "domain": "thedataist.com" } ],
> "metadata": [
> "VARCHAR(0, 0)",
> "VARCHAR(0, 0)"
> ],
> "queryState": "COMPLETED",
> "attemptedAutoLimit": 0
> }
> {code}
> There are two issues here:
> 1. VARCHAR now has precision
> 2. The metadata array has twice as many entries as there are columns.
> Additionally, if you query a regular CSV, without the columns extracted, you get the following:
> {code:json}
> "rows": [
> {
> "columns": "[\"ACCT_NUM\",\"PRODUCT\",\"MONTH\",\"REVENUE\"]" }
> ],
> "metadata": [
> "VARCHAR(0, 0)",
> "VARCHAR(0, 0)"
> ],
> {code}