You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Zheng Shao (JIRA)" <ji...@apache.org> on 2009/02/10 02:50:59 UTC

[jira] Created: (HIVE-285) UNION ALL does not allow different types in the same column

UNION ALL does not allow different types in the same column
-----------------------------------------------------------

                 Key: HIVE-285
                 URL: https://issues.apache.org/jira/browse/HIVE-285
             Project: Hadoop Hive
          Issue Type: Bug
          Components: Query Processor
    Affects Versions: 0.2.0, 0.3.0
            Reporter: Zheng Shao


{code}
explain INSERT OVERWRITE TABLE t
    SELECT s.r, s.c, sum(s.v) FROM
    (
      SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
      UNION ALL
      SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
    ) s
    GROUP BY s.r, s.c;
{code}

Both a and b have 3 string columns: r, c, and v.

It compiled successfully but failed during runtime.
"Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.



-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy reassigned HIVE-285:
-------------------------------------

    Assignee: Raghotham Murthy

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Prasad Chakka (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680935#action_12680935 ] 

Prasad Chakka commented on HIVE-285:
------------------------------------

i think it will be double. users can use cast to a specific type

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680803#action_12680803 ] 

Raghotham Murthy commented on HIVE-285:
---------------------------------------

type conversion in UNION will lead to confusing behavior. 

for a query like the following on two tables R(a int) and S(a string)

select a from R
union all
select a from S

What will the schema of the result of the join be? int or string?

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-285:
----------------------------------

    Attachment: hive-285.1.patch

Union should reject mismatched schemas. genUnionPlan was not checking for types (only number and names of columns). I have now added checking for types as well. Also added a negative test case.

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Namit Jain updated HIVE-285:
----------------------------

      Resolution: Fixed
    Hadoop Flags: [Reviewed]
          Status: Resolved  (was: Patch Available)

committed. Thanks Raghu

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch, hive-285.2.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Joydeep Sen Sarma (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680911#action_12680911 ] 

Joydeep Sen Sarma commented on HIVE-285:
----------------------------------------

+1

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Zheng Shao (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12672112#action_12672112 ] 

Zheng Shao commented on HIVE-285:
---------------------------------

{code} 
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_SUBQUERY (TOK_UNION (TOK_QUERY (TOK_FROM (TOK_TABREF t1 a)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF a r) r) (TOK_SELEXPR (TOK_COLREF a c) c) (TOK_SELEXPR (TOK_COLREF a v) v)))) (TOK_QUERY (TOK_FROM (TOK_TABREF t2 b)) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF b r) r) (TOK_SELEXPR (TOK_COLREF b c) c) (TOK_SELEXPR (+ 0 (TOK_COLREF b v)) v))))) s)) (TOK_INSERT (TOK_DESTINATION (TOK_TAB t)) (TOK_SELECT (TOK_SELEXPR (TOK_COLREF s r)) (TOK_SELEXPR (TOK_COLREF s c)) (TOK_SELEXPR (TOK_FUNCTION sum (TOK_COLREF s v)))) (TOK_GROUPBY (TOK_COLREF s r) (TOK_COLREF s c))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-0 depends on stages: Stage-2

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        null-subquery2:s-subquery2:b
            Select Operator
              expressions:
                    expr: r
                    type: string
                    expr: c
                    type: string
                    expr: (UDFToDouble(0) + UDFToDouble(v))
                    type: double
                Reduce Output Operator
                  key expressions:
                        expr: 0
                        type: string
                        expr: 1
                        type: string
                  sort order: ++
                  Map-reduce partition columns:
                        expr: rand()
                        type: double
                  tag: -1
                  value expressions:
                        expr: 2
                        type: string
        null-subquery1:s-subquery1:a
            Select Operator
              expressions:
                    expr: r
                    type: string
                    expr: c
                    type: string
                    expr: v
                    type: string
                Reduce Output Operator
                  key expressions:
                        expr: 0
                        type: string
                        expr: 1
                        type: string
                  sort order: ++
                  Map-reduce partition columns:
                        expr: rand()
                        type: double
                  tag: -1
                  value expressions:
                        expr: 2
                        type: string
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: sum(UDFToDouble(VALUE.0))
          keys:
                expr: KEY.0
                type: string
                expr: KEY.1
                type: string
          mode: partial1
          File Output Operator
            compressed: true
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                name: binary_table

  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        /tmp/hive-zshao/20312763/835544260.10001
          Reduce Output Operator
            key expressions:
                  expr: 0
                  type: string
                  expr: 1
                  type: string
            sort order: ++
            Map-reduce partition columns:
                  expr: 0
                  type: string
                  expr: 1
                  type: string
            tag: -1
            value expressions:
                  expr: 2
                  type: double
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: sum(VALUE.0)
          keys:
                expr: KEY.0
                type: string
                expr: KEY.1
                type: string
          mode: final
          Select Operator
            expressions:
                  expr: 0
                  type: string
                  expr: 1
                  type: string
                  expr: 2
                  type: double
            File Output Operator
              compressed: true
              table:
                  input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                  output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                  serde: org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe
                  name: t

  Stage: Stage-0
    Move Operator
      tables:
            replace: true
            table:
                input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                output format: org.apache.hadoop.mapred.SequenceFileOutputFormat
                serde: org.apache.hadoop.hive.serde2.MetadataTypedColumnsetSerDe
                name: t
{code} 


Note that:

{code}
                    expr: (UDFToDouble(0) + UDFToDouble(v))
                    type: double
                    ...
                  value expressions:
                        expr: 2
                        type: string
{code}


> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Adam Kramer (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12778503#action_12778503 ] 

Adam Kramer commented on HIVE-285:
----------------------------------

I am the guy asking for type conversion.

I think that the right thing to do in all cases with a UNION ALL command is to convert mismatched types to STRING.

The main argument for this is that the output of a UNION ALL is never actually used for anything--it must be wrapped in a SELECT statement which will then make use of the columns. If the SELECT statement selects the UNIONed item as a string (say, for output to local directory or the terminal) everything is converted to string anyway--or if not, it's used in a way that would cause one or both of the original column types to be converted in some way...and conversion to STRING and back does not lose information.

Certainly, data should stay in the appropriate type if possible, though. 

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch, hive-285.2.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-285:
----------------------------------

    Attachment: hive-285.2.patch

Looks like some other test was creating a table named t2 with a different schema. Dropping the table now.

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch, hive-285.2.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Raghotham Murthy updated HIVE-285:
----------------------------------

    Fix Version/s: 0.3.0
           Status: Patch Available  (was: Open)

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Raghotham Murthy (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12682974#action_12682974 ] 

Raghotham Murthy commented on HIVE-285:
---------------------------------------

ping

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12683110#action_12683110 ] 

Namit Jain commented on HIVE-285:
---------------------------------

running tests now to commit

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch, hive-285.2.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Ashish Thusoo (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12680979#action_12680979 ] 

Ashish Thusoo commented on HIVE-285:
------------------------------------

Just checked this on Oracle and mysql with Rama..

Oracle does not do any implict conversion and gives a type mismatch.

on the other hand

mysql does an implicit converion to varbinary which I think is equivalent to blob.

Given that we do not have blobs as data types I am perfectly fine with throwing a type mismatch for now.

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HIVE-285) UNION ALL does not allow different types in the same column

Posted by "Namit Jain (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HIVE-285?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12677149#action_12677149 ] 

Namit Jain commented on HIVE-285:
---------------------------------

+1

maybe, we can file a extension later on if someone asks for a type conversion in UNION

> UNION ALL does not allow different types in the same column
> -----------------------------------------------------------
>
>                 Key: HIVE-285
>                 URL: https://issues.apache.org/jira/browse/HIVE-285
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 0.2.0, 0.3.0
>            Reporter: Zheng Shao
>            Assignee: Raghotham Murthy
>             Fix For: 0.3.0
>
>         Attachments: hive-285.1.patch
>
>
> {code}
> explain INSERT OVERWRITE TABLE t
>     SELECT s.r, s.c, sum(s.v) FROM
>     (
>       SELECT a.r AS r, a.c AS c, a.v AS v FROM t1 a
>       UNION ALL
>       SELECT b.r AS r, b.c AS c, 0 + b.v AS v FROM t2 b
>     ) s
>     GROUP BY s.r, s.c;
> {code}
> Both a and b have 3 string columns: r, c, and v.
> It compiled successfully but failed during runtime.
> "Explain" shows that the plan for the 2 union-all operands have different output types that are converged to STRING, but there is no UDFToString inserted for "0 + b.v AS v" and as a result, SerDe was failing because it expects a String but is passed a Double.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.