Posted to issues@spark.apache.org by "Matthew Walton (JIRA)" <ji...@apache.org> on 2017/06/22 19:21:00 UTC

[jira] [Updated] (SPARK-21179) Unable to return Hive INT data type into Spark via Hive JDBC driver: Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.

     [ https://issues.apache.org/jira/browse/SPARK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew Walton updated SPARK-21179:
-----------------------------------
    Affects Version/s: 2.1.1
              Summary: Unable to return Hive INT data type into Spark via Hive JDBC driver:  Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.    (was: Unable to return Hive INT data type into Spark SQL via Hive JDBC driver:  Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.  )

> Unable to return Hive INT data type into Spark via Hive JDBC driver:  Caused by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.  
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-21179
>                 URL: https://issues.apache.org/jira/browse/SPARK-21179
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Shell, SQL
>    Affects Versions: 1.6.0, 2.0.0, 2.1.1
>         Environment: OS:  Linux
> HDP version: 2.5.0.1-60
> Hive version: 1.2.1
> Spark version: 2.0.0.2.5.0.1-60
> JDBC: latest Hortonworks Hive JDBC driver
>            Reporter: Matthew Walton
>
> I'm trying to fetch data into Spark SQL using a JDBC connection to Hive.  Unfortunately, when I query data that resides in an INT column I get the following error:
> 17/06/22 12:14:37 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.  
> Steps to reproduce:
> 1) In Hive, create a simple table with an INT column and insert some data (I used the SQuirreL SQL Client with the Hortonworks JDBC driver):
> create table wh2.hivespark (country_id int, country_name string);
> insert into wh2.hivespark values (1, 'USA');
> 2) Copy the Hortonworks Hive JDBC driver to the machine where you will run the Spark shell
> 3) Start the Spark shell, loading the Hortonworks Hive JDBC driver jar files:
> ./spark-shell --jars /home/spark/jdbc/hortonworkshive/HiveJDBC41.jar,/home/spark/jdbc/hortonworkshive/TCLIServiceClient.jar,/home/spark/jdbc/hortonworkshive/commons-codec-1.3.jar,/home/spark/jdbc/hortonworkshive/commons-logging-1.1.1.jar,/home/spark/jdbc/hortonworkshive/hive_metastore.jar,/home/spark/jdbc/hortonworkshive/hive_service.jar,/home/spark/jdbc/hortonworkshive/httpclient-4.1.3.jar,/home/spark/jdbc/hortonworkshive/httpcore-4.1.3.jar,/home/spark/jdbc/hortonworkshive/libfb303-0.9.0.jar,/home/spark/jdbc/hortonworkshive/libthrift-0.9.0.jar,/home/spark/jdbc/hortonworkshive/log4j-1.2.14.jar,/home/spark/jdbc/hortonworkshive/ql.jar,/home/spark/jdbc/hortonworkshive/slf4j-api-1.5.11.jar,/home/spark/jdbc/hortonworkshive/slf4j-log4j12-1.5.11.jar,/home/spark/jdbc/hortonworkshive/zookeeper-3.4.6.jar
> 4) In the Spark shell, load the data from Hive using the JDBC driver (a possible workaround based on this call is sketched after the stack trace below):
> val hivespark = spark.read.format("jdbc").options(Map("url" -> "jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=1;user=hdfs;password=hdfs","dbtable" -> "wh2.hivespark")).option("driver","com.simba.hive.jdbc41.HS2Driver").option("user","hdfs").option("password","hdfs").load()
> 5) In the Spark shell, try to display the data:
> hivespark.show()
> At this point you should see the error:
> scala> hivespark.show()
> 17/06/22 12:14:37 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.
>         at com.simba.hiveserver2.exceptions.ExceptionConverter.toSQLException(Unknown Source)
>         at com.simba.hiveserver2.utilities.conversion.TypeConverter.toInt(Unknown Source)
>         at com.simba.hiveserver2.jdbc.common.SForwardResultSet.getInt(Unknown Source)
>         at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.getNext(JDBCRDD.scala:437)
>         at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.hasNext(JDBCRDD.scala:535)
>         at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>         at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>         at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:246)
>         at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
>         at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
>         at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
>         at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>         at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>         at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>         at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>         at org.apache.spark.scheduler.Task.run(Task.scala:85)
>         at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
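> A possible workaround sketch for the load in step 4 (untested here; it relies on Spark's JDBC source accepting a subquery in place of a table name for the dbtable option, and the alias hivespark_str is only illustrative): cast the INT column to string on the Hive side so the driver never has to convert the value to int.
> // Push the cast into the query sent through the driver; Spark then sees a string column.
> val hivespark_str = spark.read.format("jdbc")
>   .option("url", "jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=1")
>   .option("driver", "com.simba.hive.jdbc41.HS2Driver")
>   .option("user", "hdfs")
>   .option("password", "hdfs")
>   .option("dbtable", "(select cast(country_id as string) as country_id, country_name from wh2.hivespark) hivespark_str")
>   .load()
> hivespark_str.show()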
> Note: I also tested this issue using a JDBC driver from Progress DataDirect and I see a similar error message, so this does not appear to be driver-specific.
> scala> hivespark.show()
> 17/06/22 12:07:59 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
> java.sql.SQLException: [DataDirect][Hive JDBC Driver]Value can not be converted to requested type.
> Also, if I query this table directly from the SQuirreL SQL Client there is no error.
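> For comparison, a minimal stand-alone JDBC check of the same column (a sketch assuming the same driver jar, URL, and credentials as in step 4) exercises the same getInt call that fails inside JDBCRDD:
> import java.sql.DriverManager
> // Load the Simba/Hortonworks HiveServer2 driver used in step 4.
> Class.forName("com.simba.hive.jdbc41.HS2Driver")
> val conn = DriverManager.getConnection("jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=1", "hdfs", "hdfs")
> try {
>   val rs = conn.createStatement().executeQuery("select country_id, country_name from wh2.hivespark")
>   while (rs.next()) {
>     // getInt(1) is the same conversion JDBCRDD performs for the INT column.
>     println(s"${rs.getInt(1)}  ${rs.getString(2)}")
>   }
> } finally {
>   conn.close()
> }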



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org