Posted to issues@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/05/30 00:04:04 UTC
[jira] [Commented] (DRILL-5553) SELECT *, columns produces nonsense results
[ https://issues.apache.org/jira/browse/DRILL-5553?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16028672#comment-16028672 ]
Paul Rogers commented on DRILL-5553:
------------------------------------
Repeat the exercise on a CSV file without headers, using a CSV storage plugin configured to not read headers.
Input file:
{code}
10,foo,bar
{code}
Query:
{code}
SELECT *, columns FROM `dfs.data`.`example.csv`
{code}
Actual results:
{code}
1 row(s):
columns,columns0
["10","foo","bar"],["10","foo","bar"]
{code}
Schema:
{code}
columns(VARCHAR:REPEATED),
columns0(VARCHAR:REPEATED)
{code}
Expected an error to be raised for this condition.
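To make the expected behavior concrete, here is a minimal sketch of such a check, written in Python purely for illustration. It is not Drill code: the names {{validate_projection}} and {{ProjectionError}} are hypothetical, and the sketch assumes the projection list arrives as plain strings and that {{columns}} is matched case-insensitively.

```python
# Hypothetical sketch: none of these names come from the Drill code base.
WILDCARD = "*"
COLUMNS = "columns"


class ProjectionError(ValueError):
    """Raised when a projection list mixes incompatible column references."""


def validate_projection(select_items):
    """Reject '*' or 'columns' when combined with any other column reference.

    select_items: column references as written in the SELECT clause.
    Returns the stripped list if it is acceptable, otherwise raises.
    """
    names = [item.strip() for item in select_items]
    # Assumption: 'columns' is compared case-insensitively.
    special = [n for n in names if n == WILDCARD or n.lower() == COLUMNS]
    if special and len(names) > 1:
        raise ProjectionError(
            "Cannot combine %s with other column references: %s"
            % (" and ".join(repr(s) for s in special), ", ".join(names))
        )
    return names
```

Under these assumptions, {{validate_projection(["*", "columns"])}} raises {{ProjectionError}}, while {{["*"]}}, {{["columns"]}}, and {{["a", "b"]}} each pass on their own.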
> SELECT *, columns produces nonsense results
> -------------------------------------------
>
> Key: DRILL-5553
> URL: https://issues.apache.org/jira/browse/DRILL-5553
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.10.0
> Reporter: Paul Rogers
> Priority: Minor
>
> Consider the case discussed in DRILL-5551. Create a slight variation.
> Input file: CSV with headers:
> {code}
> a,b,c
> 10,foo,bar
> {code}
> As in DRILL-5550, the CSV plugin is configured to use headers.
> Run this (admittedly strange) query:
> {code}
> SELECT *, columns FROM `dfs.data`.`example.csv`
> {code}
> The resulting schema is:
> {code}
> BatchSchema [fields=[
> a(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)],
> b(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)],
> c(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)],
> columns(INT:OPTIONAL) [$bits$(UINT1:REQUIRED), columns(INT:OPTIONAL)]],
> selectionVector=NONE]
> {code}
> To make it easier to read:
> {code}
> a(VARCHAR:REQUIRED),
> b(VARCHAR:REQUIRED),
> c(VARCHAR:REQUIRED),
> columns(INT:OPTIONAL)
> {code}
> In DRILL-5551, {{columns}} changes meaning from an array of columns to a blank normal column. Here, it changes meaning again, to a nullable INT (our normal "placeholder" for missing columns).
> Expected:
> 1. That, per DRILL-5552, no other column reference can occur with "*".
> 2. If item 1 is not fixed, that the scanner (or text reader) forbid the use of either "*" or "columns" with other column references.