Posted to issues@spark.apache.org by "Herman van Hovell (JIRA)" <ji...@apache.org> on 2016/08/17 16:05:21 UTC

[jira] [Updated] (SPARK-17108) BIGINT and INT comparison failure in spark sql

     [ https://issues.apache.org/jira/browse/SPARK-17108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Herman van Hovell updated SPARK-17108:
--------------------------------------
    Description: 
I have a Hive table with the following definition:
{noformat}
create table testforerror (
    my_column MAP<BIGINT, ARRAY<String>>
);
{noformat}
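For reproducibility (the original load statements are not part of the report), rows like the ones shown next can be produced with a statement along these lines; this is a sketch that assumes Hive 0.13+, where a FROM-less SELECT is allowed:
{noformat}
hive> -- Sketch only: one way to insert a single map-typed row.
hive> -- 11001L is a BIGINT literal; array(...) builds the ARRAY<STRING> value.
hive> INSERT INTO TABLE testforerror
    > SELECT map(11001L, array('0034111000a4WaAAA2'));
{noformat}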
The table contains the following records:
{noformat}
hive> select * from testforerror;
OK
{11001:["0034111000a4WaAAA2"]}
{11001:["0034111000orWiWAAU"]}
{11001:["","0034111000VgrHdAAJ"]}
{11001:["0034110000cS4rDAAS"]}
{12001:["0037110001a7ofsAAA"]}
Time taken: 0.067 seconds, Fetched: 5 row(s)
{noformat}
I have a query that filters records by a key of my_column. The query is as follows:
{noformat}
select * from testforerror where my_column[11001] is not null;
{noformat}
This query executes fine from the hive/beeline shell and produces the following records:
{noformat}
hive> select * from testforerror where my_column[11001] is not null;
OK
{11001:["0034111000a4WaAAA2"]}
{11001:["0034111000orWiWAAU"]}
{11001:["","0034111000VgrHdAAJ"]}
{11001:["0034110000cS4rDAAS"]}
Time taken: 2.224 seconds, Fetched: 4 row(s)
{noformat}
However, I get an error when executing the same query from the Spark sqlContext. The following is the error message:
{noformat}
scala> val errorquery = "select * from testforerror where my_column[11001] is not null"
errorquery: String = select * from testforerror where my_column[11001] is not null

scala> sqlContext.sql(errorquery).show()
org.apache.spark.sql.AnalysisException: cannot resolve 'my_column[11001]' due to data type mismatch: argument 2 requires bigint type, however, '11001' is of int type.; line 1 pos 43
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:65)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
{noformat}
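Spark's analyzer, unlike Hive, does not implicitly widen the INT literal 11001 to BIGINT when indexing the map. A workaround that should avoid the mismatch (a sketch added for illustration, not verified on this exact build) is to cast the literal key to BIGINT explicitly:
{noformat}
scala> // Sketch: cast the literal key so it matches the MAP<BIGINT, ...> key type.
scala> val fixedquery = "select * from testforerror where my_column[cast(11001 as bigint)] is not null"
scala> sqlContext.sql(fixedquery).show()
{noformat}
An equivalent DataFrame-style filter that passes a Long literal (11001L) as the map key should work for the same reason.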

  was:
I have a Hive table with the following definition:

create table testforerror (
    my_column MAP<BIGINT, ARRAY<String>>
);
The table contains the following records:

hive> select * from testforerror;
OK
{11001:["0034111000a4WaAAA2"]}
{11001:["0034111000orWiWAAU"]}
{11001:["","0034111000VgrHdAAJ"]}
{11001:["0034110000cS4rDAAS"]}
{12001:["0037110001a7ofsAAA"]}
Time taken: 0.067 seconds, Fetched: 5 row(s)
I have a query that filters records by a key of my_column. The query is as follows:

select * from testforerror where my_column[11001] is not null;
This query executes fine from the hive/beeline shell and produces the following records:

hive> select * from testforerror where my_column[11001] is not null;
OK
{11001:["0034111000a4WaAAA2"]}
{11001:["0034111000orWiWAAU"]}
{11001:["","0034111000VgrHdAAJ"]}
{11001:["0034110000cS4rDAAS"]}
Time taken: 2.224 seconds, Fetched: 4 row(s)
However, I get an error when executing the same query from the Spark sqlContext. The following is the error message:

scala> val errorquery = "select * from testforerror where my_column[11001] is not null"
errorquery: String = select * from testforerror where my_column[11001] is not null

scala> sqlContext.sql(errorquery).show()
org.apache.spark.sql.AnalysisException: cannot resolve 'my_column[11001]' due to data type mismatch: argument 2 requires bigint type, however, '11001' is of int type.; line 1 pos 43
    at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:65)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)


> BIGINT and INT comparison failure in spark sql
> ----------------------------------------------
>
>                 Key: SPARK-17108
>                 URL: https://issues.apache.org/jira/browse/SPARK-17108
>             Project: Spark
>          Issue Type: Bug
>            Reporter: Sai Krishna Kishore Beathanabhotla
>
> I have a Hive table with the following definition:
> {noformat}
> create table testforerror (
>     my_column MAP<BIGINT, ARRAY<String>>
> );
> {noformat}
> The table contains the following records:
> {noformat}
> hive> select * from testforerror;
> OK
> {11001:["0034111000a4WaAAA2"]}
> {11001:["0034111000orWiWAAU"]}
> {11001:["","0034111000VgrHdAAJ"]}
> {11001:["0034110000cS4rDAAS"]}
> {12001:["0037110001a7ofsAAA"]}
> Time taken: 0.067 seconds, Fetched: 5 row(s)
> {noformat}
> I have a query that filters records by a key of my_column. The query is as follows:
> {noformat}
> select * from testforerror where my_column[11001] is not null;
> {noformat}
> This query executes fine from the hive/beeline shell and produces the following records:
> {noformat}
> hive> select * from testforerror where my_column[11001] is not null;
> OK
> {11001:["0034111000a4WaAAA2"]}
> {11001:["0034111000orWiWAAU"]}
> {11001:["","0034111000VgrHdAAJ"]}
> {11001:["0034110000cS4rDAAS"]}
> Time taken: 2.224 seconds, Fetched: 4 row(s)
> {noformat}
> However, I get an error when executing the same query from the Spark sqlContext. The following is the error message:
> {noformat}
> scala> val errorquery = "select * from testforerror where my_column[11001] is not null"
> errorquery: String = select * from testforerror where my_column[11001] is not null
> scala> sqlContext.sql(errorquery).show()
> org.apache.spark.sql.AnalysisException: cannot resolve 'my_column[11001]' due to data type mismatch: argument 2 requires bigint type, however, '11001' is of int type.; line 1 pos 43
>     at org.apache.spark.sql.catalyst.analysis.package$AnalysisErrorAt.failAnalysis(package.scala:42)
>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:65)
>     at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1$$anonfun$apply$2.applyOrElse(CheckAnalysis.scala:57)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformUp$1.apply(TreeNode.scala:335)
>     at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:69)
>     at org.apache.spark.sql.catalyst.trees.TreeNode.transformUp(TreeNode.scala:334)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$5.apply(TreeNode.scala:332)
>     at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:281)
>     at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
>     at scala.collection.Iterator$class.foreach(Iterator.scala:727)
>     at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
>     at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
>     at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org