Posted to issues@impala.apache.org by "Zoram Thanga (JIRA)" <ji...@apache.org> on 2018/01/16 20:49:00 UTC

[jira] [Resolved] (IMPALA-6307) A CTAS query fails with error: AnalysisException: Duplicate column name:

     [ https://issues.apache.org/jira/browse/IMPALA-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zoram Thanga resolved IMPALA-6307.
----------------------------------
       Resolution: Fixed
    Fix Version/s: Impala 2.12.0

> A CTAS query fails with error: AnalysisException: Duplicate column name: <columnName>
> -------------------------------------------------------------------------------------
>
>                 Key: IMPALA-6307
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6307
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 2.10.0, Impala 2.11.0
>            Reporter: Zoram Thanga
>            Assignee: Zoram Thanga
>            Priority: Critical
>              Labels: regression
>             Fix For: Impala 2.12.0
>
>
> The following query triggers the exception:
> CREATE TABLE foo partitioned by (year) AS
> WITH TMP AS (
>   SELECT a.timestamp_col, a.year FROM functional.alltypes a
>   LEFT JOIN functional.alltypes b
>   ON b.timestamp_col BETWEEN a.timestamp_col AND a.timestamp_col
> )
> SELECT a.timestamp_col, a.year FROM TMP a;
> The exception is thrown from TableDef::analyzeColumnDefs():
>  
> {code:java}
> private void analyzeColumnDefs(Analyzer analyzer) throws AnalysisException {
>     Set<String> colNames = Sets.newHashSet();
>     for (ColumnDef colDef: columnDefs_) {
>       colDef.analyze(analyzer);
>       if (!colNames.add(colDef.getColName().toLowerCase())) {
>         throw new AnalysisException("Duplicate column name: " + colDef.getColName());
>       }
>       if (!isKuduTable() && colDef.hasKuduOptions()) {
>         throw new AnalysisException(String.format("Unsupported column options for " +
>             "file format '%s': '%s'", getFileFormat().name(), colDef.toString()));
>       }
>     }
>     for (ColumnDef colDef: getPartitionColumnDefs()) {
>       colDef.analyze(analyzer);
>       if (!colDef.getType().supportsTablePartitioning()) {
>         throw new AnalysisException(
>             String.format("Type '%s' is not supported as partition-column type " +
>                 "in column: %s", colDef.getType().toSql(), colDef.getColName()));
>       }
>       if (!colNames.add(colDef.getColName().toLowerCase())) {
>         throw new AnalysisException("Duplicate column name: " + colDef.getColName()); // THROWS HERE
>       }
>     }
> }
> {code}
> The column duplication happens for "year" because it's in both columnDefs_ and dataLayout_::partitionColDefs_.
> The issue does not reproduce if we replace BETWEEN in the JOIN clause with the equivalent "b.timestamp_col > a.timestamp_col AND b.timestamp_col < a.timestamp_col".
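
The duplicate check quoted above can be reproduced in isolation. The following is a minimal sketch, not Impala code and not the fix: the class DuplicateColumnCheck and the method findDuplicate are hypothetical names, and plain strings stand in for ColumnDef objects. It only illustrates why a name present in both columnDefs_ and the partition column definitions trips the second loop.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical stand-in for the check in TableDef.analyzeColumnDefs().
public class DuplicateColumnCheck {

  // Returns the first column name seen twice (case-insensitively) across
  // both lists, or null if every name is unique.
  static String findDuplicate(List<String> columnDefs, List<String> partitionColDefs) {
    Set<String> colNames = new HashSet<>();
    for (String name : columnDefs) {
      // Set.add returns false when the element is already present.
      if (!colNames.add(name.toLowerCase(Locale.ROOT))) return name;
    }
    for (String name : partitionColDefs) {
      // Same set as above, so a partition column that also appears in
      // columnDefs is reported here, as in the "THROWS HERE" line.
      if (!colNames.add(name.toLowerCase(Locale.ROOT))) return name;
    }
    return null;
  }

  public static void main(String[] args) {
    // State described in the report: "year" is in both lists.
    System.out.println(findDuplicate(List.of("timestamp_col", "year"), List.of("year")));
    // Assumed non-failing state: "year" appears only as a partition column.
    System.out.println(findDuplicate(List.of("timestamp_col"), List.of("year")));
  }
}
```

Run as written, the first call reports "year" and the second reports no duplicate. The second input is only an assumption about what a non-failing state looks like; the report does not say how the column lists should be populated.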


