Posted to dev@drill.apache.org by "Paul Rogers (JIRA)" <ji...@apache.org> on 2017/05/29 23:01:04 UTC

[jira] [Created] (DRILL-5553) SELECT *, columns produces nonsense results

Paul Rogers created DRILL-5553:
----------------------------------

             Summary: SELECT *, columns produces nonsense results
                 Key: DRILL-5553
                 URL: https://issues.apache.org/jira/browse/DRILL-5553
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.10.0
            Reporter: Paul Rogers
            Priority: Minor


Consider the case discussed in DRILL-5551. Create a slight variation. 

Input file: CSV with headers:

{code}
a,b,c
10,foo,bar
{code}

As in DRILL-5550, the CSV plugin is configured to use headers.

Run this (admittedly strange) query:

{code}
SELECT *, columns FROM `dfs.data.example.csv`
{code}

The resulting schema is:

{code}
BatchSchema [fields=[
a(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)], 
b(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)], 
c(VARCHAR:REQUIRED) [$offsets$(UINT4:REQUIRED)], 
columns(INT:OPTIONAL) [$bits$(UINT1:REQUIRED), columns(INT:OPTIONAL)]], 
selectionVector=NONE]
{code}

To make it easier to read:

{code}
a(VARCHAR:REQUIRED),
b(VARCHAR:REQUIRED),
c(VARCHAR:REQUIRED),
columns(INT:OPTIONAL)
{code}

In DRILL-5551, {{columns}} changes meaning from an array of columns to a blank normal column. Here, it changes meaning again, this time to a nullable INT (our normal "placeholder" for missing columns).

Expected:

1. That, per DRILL-5552, no other column reference can occur with "*".
2. If item 1 is not fixed, that the scanner (or text reader) forbid the use of either "*" or "columns" with other column references.
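
The check described in item 2 could be sketched roughly as follows. This is only an illustration, not Drill's actual scanner or text reader code: the class {{ProjectionCheck}} and the method {{isAllowed}} are hypothetical names, and the sketch assumes the reader receives the projection list as plain top-level column names, compared case-insensitively.

```java
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical sketch of the projection check proposed above.
 * Not part of Drill; names and structure are assumptions.
 */
final class ProjectionCheck {
  private ProjectionCheck() {}

  /**
   * Returns true if the projection list is acceptable: "*" may not be
   * combined with any other reference, and "columns" may not be combined
   * with named columns.
   */
  static boolean isAllowed(List<String> projection) {
    boolean hasStar = false;
    boolean hasColumns = false;
    boolean hasOther = false;
    for (String name : projection) {
      String n = name.trim();
      if (n.equals("*")) {
        hasStar = true;
      } else if (n.equalsIgnoreCase("columns")) {
        // Assumption: column names are matched without regard to case.
        hasColumns = true;
      } else {
        hasOther = true;
      }
    }
    if (hasStar && (hasColumns || hasOther)) {
      return false; // "*" mixed with another column reference
    }
    if (hasColumns && hasOther) {
      return false; // "columns" mixed with named columns
    }
    return true;
  }

  public static void main(String[] args) {
    // The query from this report: SELECT *, columns
    System.out.println(isAllowed(Arrays.asList("*", "columns"))); // false
  }
}
```

With a check along these lines, the query in this report would fail with a clear error rather than silently producing a nullable INT column named {{columns}}.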





