Posted to user@spark.apache.org by Du Li <li...@yahoo-inc.com.INVALID> on 2014/09/27 01:48:34 UTC

SparkSQL: map type MatchError when inserting into Hive table

Hi,

I was loading data into a partitioned table on Spark 1.1.0
beeline-thriftserver. The table has complex data types such as
map<string,string> and array<map<string,string>>. The query is like
"insert overwrite table a partition (...) select ..." and the select clause
worked if run separately. However, when running the insert query, there was
an error as follows.

The source code of Cast.scala seems to only handle the primitive data
types, which is perhaps why the MatchError was thrown.

I wonder whether this is still a work in progress, or whether I should do it
differently.

Thanks,
Du


----
scala.MatchError: MapType(StringType,StringType,true) (of class org.apache.spark.sql.catalyst.types.MapType)
        org.apache.spark.sql.catalyst.expressions.Cast.cast$lzycompute(Cast.scala:247)
        org.apache.spark.sql.catalyst.expressions.Cast.cast(Cast.scala:247)
        org.apache.spark.sql.catalyst.expressions.Cast.eval(Cast.scala:263)
        org.apache.spark.sql.catalyst.expressions.Alias.eval(namedExpressions.scala:84)
        org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:66)
        org.apache.spark.sql.catalyst.expressions.InterpretedMutableProjection.apply(Projection.scala:50)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
        org.apache.spark.sql.hive.execution.InsertIntoHiveTable.org$apache$spark$sql$hive$execution$InsertIntoHiveTable$$writeToFile$1(InsertIntoHiveTable.scala:149)
        org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$1.apply(InsertIntoHiveTable.scala:158)
        org.apache.spark.sql.hive.execution.InsertIntoHiveTable$$anonfun$saveAsHiveFile$1.apply(InsertIntoHiveTable.scala:158)
        org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:62)
        org.apache.spark.scheduler.Task.run(Task.scala:54)
        org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
        java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        java.lang.Thread.run(Thread.java:722)

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@spark.apache.org
For additional commands, e-mail: user-help@spark.apache.org

Re: SparkSQL: map type MatchError when inserting into Hive table

Posted by Du Li <li...@yahoo-inc.com.INVALID>.

It might be a problem when inserting into a partitioned table. It worked
fine when the target table was unpartitioned.

Can you confirm this?

Thanks,
Du




Re: SparkSQL: map type MatchError when inserting into Hive table

Posted by Cheng Lian <li...@gmail.com>.

Would you mind providing the DDL of this partitioned table together
with the query you tried? The stack trace suggests that the query was
trying to cast a map into something else, which is not supported in
Spark SQL. I also doubt that Hive supports casting a complex type to
some other type.



Re: SparkSQL: map type MatchError when inserting into Hive table

Posted by Du Li <li...@yahoo-inc.com.INVALID>.

It turned out to be a bug in my code. In the select clause, the list of fields
was misaligned with the schema of the target table. As a consequence, the map
data couldn’t be cast to some other type in the schema.

Thanks anyway.
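
For readers who hit the same MatchError: Hive's INSERT ... SELECT pairs select expressions with target columns by position, not by name, so a reordered select list can put a map value where the table expects, for example, a string. The sketch below is a minimal, hypothetical pre-flight check written in Python. It is not part of Spark or Hive, and the column names, the type strings, and the helpers `normalize` and `find_misaligned` are all assumptions made for illustration.

```python
from typing import List, Tuple

# (column name, Hive type string); both are illustrative, not from the thread
Column = Tuple[str, str]


def normalize(hive_type: str) -> str:
    """Lower-case a type string and drop whitespace,
    so 'map<string, string>' compares equal to 'map<string,string>'."""
    return "".join(hive_type.lower().split())


def find_misaligned(select_cols: List[Column], table_cols: List[Column]) -> List[str]:
    """Compare a select list with a target table schema position by position
    and return one message per mismatch (empty list if everything lines up)."""
    problems = []
    if len(select_cols) != len(table_cols):
        problems.append(
            f"column count differs: select has {len(select_cols)}, "
            f"table has {len(table_cols)}"
        )
    # zip stops at the shorter list, so only positions present on both sides are compared
    for pos, ((s_name, s_type), (t_name, t_type)) in enumerate(zip(select_cols, table_cols)):
        if normalize(s_type) != normalize(t_type):
            problems.append(
                f"position {pos}: select '{s_name}' is {normalize(s_type)}, "
                f"table '{t_name}' expects {normalize(t_type)}"
            )
    return problems


# Hypothetical target table schema (partition columns omitted for brevity)
target_table = [
    ("id", "string"),
    ("attrs", "map<string,string>"),
    ("events", "array<map<string,string>>"),
]

# Hypothetical select list with the first two fields swapped by mistake
select_list = [
    ("attrs", "map<string, string>"),
    ("id", "string"),
    ("events", "array<map<string,string>>"),
]

for problem in find_misaligned(select_list, target_table):
    print(problem)
```

With the swapped select list above, the check reports positions 0 and 1. In the situation described in this thread, a mismatch of that kind is where an attempt to cast the map column to another type would come from.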


