Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/02/10 02:52:59 UTC

[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672112#action_12672112 ] 

Zheng Shao commented on HIVE-285:
---------------------------------

{code} 
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF t1 a)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF a r) r) (TOK_SELEXPR (TOK_COLREF a c) c) (TOK_SELEXPR (TOK_COLREF a v) v)))) (TOK_QUERY (TOK_FROM (TOK_TABREF t2 b)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF b r) r) (TOK_SELEXPR (TOK_COLREF b c) c) (TOK_SELEXPR (+ 0 (TOK_COLREF b v)) v))))) s)) (TOK_INSERT (TOK_DESTINATION (TOK_TAB t)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF s r)) (TOK_SELEXPR (TOK_COLREF s c)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_COLREF s v)))) (TOK_GROUPBY (TOK_COLREF s r) (TOK_COLREF s c))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        null-subquery2:s-subquery2:b
            Select Operator
              expressions:
                    expr: r
                    type: string
                    expr: c
                    type: string
                    expr: (UDFToDouble(0) + UDFToDouble(v))
                    type: double
                Reduce Output Operator
                  key expressions:
                        expr: 0
                        type: string
                        expr: 1
                        type: string
                  sort order: ++
                  Map-reduce partition columns:
                        expr: rand()
                        type: double
                  tag: -1
                  value expressions:
                        expr: 2
                        type: string
        null-subquery1:s-subquery1:a
            Select Operator
              expressions:
                    expr: r
                    type: string
                    expr: c
                    type: string
                    expr: v
                    type: string
                Reduce Output Operator
                  key expressions:
                        expr: 0
                        type: string
                        expr: 1
                        type: string
                  sort order: ++
                  Map-reduce partition columns:
                        expr: rand()
                        type: double
                  tag: -1
                  value expressions:
                        expr: 2
                        type: string
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: sum(UDFToDouble(VALUE.0))
          keys:
                expr: KEY.0
                type: string
                expr: KEY.1
                type: string
          mode: partial1
          File Output Operator
            compressed: true
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                name: binary_table

  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        /tmp/hive-zshao/20312763/835544260.10001
          Reduce Output Operator
            key expressions:
                  expr: 0
                  type: string
                  expr: 1
                  type: string
            sort order: ++
            Map-reduce partition columns:
                  expr: 0
                  type: string
                  expr: 1
                  type: string
            tag: -1
            value expressions:
                  expr: 2
                  type: double
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: sum(VALUE.0)
          keys:
                expr: KEY.0
                type: string
                expr: KEY.1
                type: string
          mode: final
          Select Operator
            expressions:
                  expr: 0
                  type: string
                  expr: 1
                  type: string
                  expr: 2
                  type: double
            File Output Operator
              compressed: true
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                  serde: org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe
                  name: t

  Stage: Stage-0
    Move Operator
      tables:
            replace: true
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe
                name: t
{code} 


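Note that the Select Operator for the second operand (alias b) emits v as a double, while the Reduce Output Operator below it still declares value column 2 as a string: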

{code}
                    expr: (UDFToDouble(0) + UDFToDouble(v))
                    type: double
                    ...
                  value expressions:
                        expr: 2
                        type: string
{code}
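
One possible way to sidestep the mismatch until the conversion is inserted correctly is to make both UNION ALL operands produce the same type for v. The query below is only a sketch and has not been verified on 0.2.0 or 0.3.0: it assumes that applying "0 + ..." to a.v yields a double on the first operand just as it does for b.v in the plan above, so that no string conversion is needed at all. The "--" lines are annotations for the reader and may need to be removed before running.

```sql
-- Hypothetical workaround (untested): apply the same numeric expression
-- to v in both operands so that both sides of the UNION ALL are double.
EXPLAIN INSERT OVERWRITE TABLE t
    SELECT s.r, s.c, sum(s.v) FROM
    (
      -- Assumption: "0 + a.v" is typed double, as "0 + b.v" is in the plan above.
      SELECT a.r AS r, a.c AS c, 0 + a.v AS v FROM t1 a
      UNION ALL
      SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
    ) s
    GROUP BY s.r, s.c;
```

If this behaves as assumed, the value expression in both Reduce Output Operators should be reported as double rather than string in the EXPLAIN output.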


> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both t1 (alias a) and t2 (alias b) have 3 string columns: r, c, and v.
> The query compiles successfully but fails at runtime.
> "Explain" shows that the two union-all operands have different output types, which are converged to STRING. However, no UDFToString is inserted for "0 + b.v AS v", so the SerDe fails: it expects a String but is passed a Double.
