Posted to dev@drill.apache.org by "Raymond Wong (JIRA)" <ji...@apache.org> on 2017/11/20 21:16:01 UTC

[jira] [Created] (DRILL-5982) CTAS creates parquet files with inconsistent nullable column

Raymond Wong created DRILL-5982:
-----------------------------------

             Summary: CTAS creates parquet files with inconsistent nullable column
                 Key: DRILL-5982
                 URL: https://issues.apache.org/jira/browse/DRILL-5982
             Project: Apache Drill
          Issue Type: Bug
    Affects Versions: 1.11.0
         Environment: Windows 10
            Reporter: Raymond Wong


The steps below create two Parquet tables with CTAS. One CTAS statement uses a MySQL table as its data source; the other uses {{(Values(1))}}. Both files have the same schema: the same column names and data types.

The Parquet file created from the MySQL data source has nullable columns, whereas the file created from {{(Values(1))}} has non-nullable columns.

{quote}
DROP TABLE dfs.tmp.table1;
CREATE TABLE dfs.tmp.table1 AS
SELECT 'CA' AS state, CAST(1 AS BIGINT) AS id
FROM `mysql_dw_reporting.datawarehouse1`.DW_Qualbe_Cust_And_CustPay
LIMIT 1;

DROP TABLE dfs.tmp.table2;
CREATE TABLE dfs.tmp.table2 AS
SELECT 'NY' AS state, CAST(2 AS BIGINT) AS id
FROM (Values(1))
;
{quote}

This inconsistency prevents SQL window functions from being applied across the Parquet tables. Querying table1 and table2 together with a window function produces the following error:

{quote}
SELECT id, FIRST_VALUE(state) OVER (PARTITION BY id) AS state
FROM dfs.tmp.`table*`

SQL Error: UNSUPPORTED_OPERATION ERROR: Sort doesn't currently support sorts with changing schemas
{quote}
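
To help isolate the problem, the same window query can be run against each table on its own. This is only a diagnostic sketch: if the error comes from the nullable versus non-nullable mismatch between the two files, each query below would be expected to succeed individually, but that outcome is an assumption and has not been confirmed in this report.

{quote}
-- Diagnostic sketch: same window query, one table at a time.
-- Assumption (not verified here): each query succeeds on its own
-- because only a single schema is read.
SELECT id, FIRST_VALUE(state) OVER (PARTITION BY id) AS state
FROM dfs.tmp.table1;

SELECT id, FIRST_VALUE(state) OVER (PARTITION BY id) AS state
FROM dfs.tmp.table2;
{quote}

If both queries succeed while the {{table*}} query fails, that would point to the schema difference between the two files rather than to the window function itself.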






