Posted to issues@spark.apache.org by "Bruce Robbins (JIRA)" <ji...@apache.org> on 2019/01/23 18:33:00 UTC

[jira] [Created] (SPARK-26707) Insert into table with single struct column fails

Bruce Robbins created SPARK-26707:
-------------------------------------

             Summary: Insert into table with single struct column fails
                 Key: SPARK-26707
                 URL: https://issues.apache.org/jira/browse/SPARK-26707
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.4.0, 2.3.2, 2.2.3, 3.0.0
            Reporter: Bruce Robbins


This works:
{noformat}
scala> sql("select named_struct('d1', 123) c1, 12 c2").write.format("parquet").saveAsTable("structtbl2")

scala> sql("show create table structtbl2").show(truncate=false)
+---------------------------------------------------------------------------+
|createtab_stmt                                                             |
+---------------------------------------------------------------------------+
|CREATE TABLE `structtbl2` (`c1` STRUCT<`d1`: INT>, `c2` INT)
USING parquet
|
+---------------------------------------------------------------------------+

scala> sql("insert into structtbl2 values (struct(789), 17)")
res2: org.apache.spark.sql.DataFrame = []

scala> sql("select * from structtbl2").show
+-----+---+
|   c1| c2|
+-----+---+
|[789]| 17|
|[123]| 12|
+-----+---+
scala>
{noformat}
However, if the table's only column is the struct column, the same style of insert fails:
{noformat}
scala> sql("select named_struct('d1', 123) c1").write.format("parquet").saveAsTable("structtbl1")

scala> sql("show create table structtbl1").show(truncate=false)
+-----------------------------------------------------------------+
|createtab_stmt                                                   |
+-----------------------------------------------------------------+
|CREATE TABLE `structtbl1` (`c1` STRUCT<`d1`: INT>)
USING parquet
|
+-----------------------------------------------------------------+

scala> sql("insert into structtbl1 values (struct(789))")
org.apache.spark.sql.AnalysisException: cannot resolve '`col1`' due to data type mismatch: cannot cast int to struct<d1:int>;;
'InsertIntoHadoopFsRelationCommand file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1, false, Parquet, Map(path -> file:/Users/brobbins/github/spark_upstream/spark-warehouse/structtbl1), Append, CatalogTable(
...etc...
{noformat}
I can work around it by using a named_struct as the value:
{noformat}
scala> sql("insert into structtbl1 values (named_struct('d1',789))")
res7: org.apache.spark.sql.DataFrame = []

scala> sql("select * from structtbl1").show
+-----+
|   c1|
+-----+
|[789]|
|[123]|
+-----+

scala>
{noformat}
My guess is that I just don't understand how structs work, but maybe this is a bug.
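Judging from the error message above, the analyzer seems to infer a single int column (col1) for the inline table, as if struct(789) were read as the VALUES row constructor rather than as a struct value; named_struct apparently does not get collapsed that way. Assuming that is what is happening, routing the row through a SELECT instead of VALUES might be another workaround (untested sketch):

{noformat}
scala> // Untested guess: SELECT resolves the struct expression as an ordinary
scala> // column expression, avoiding the VALUES row-constructor path
scala> sql("insert into structtbl1 select named_struct('d1', 789)")
{noformat}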



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org