Posted to dev@phoenix.apache.org by "Eric Belanger (JIRA)" <ji...@apache.org> on 2017/02/20 14:18:44 UTC

[jira] [Commented] (PHOENIX-3687) Spark application using JDBC doesn't load tables with array columns

    [ https://issues.apache.org/jira/browse/PHOENIX-3687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15874590#comment-15874590 ] 

Eric Belanger commented on PHOENIX-3687:
----------------------------------------

I forgot to mention that I tested this case with Spark 1.6.1 and 2.0.

> Spark application using JDBC doesn't load tables with array columns
> -------------------------------------------------------------------
>
>                 Key: PHOENIX-3687
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3687
>             Project: Phoenix
>          Issue Type: Bug
>    Affects Versions: 4.9.0
>            Reporter: Eric Belanger
>
> Given this table:
> {code}
> CREATE TABLE TEST_TABLE (
>   ID CHAR(36),
>   ARR_VALUES VARCHAR ARRAY,
>   CONSTRAINT PK PRIMARY KEY (ID)
> );
> {code}
> When running this Spark application:
> {code}
>     SparkSession session = SparkSession.builder()
>         .appName("Simple Application")
>         .getOrCreate();
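>     // load() below throws SQLException "Unsupported type 2003" when the table has an ARRAY column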
>     session
>         .read()
>         .format("jdbc")
>         .option("url", "jdbc:phoenix:localhost")
>         .option("driver", "org.apache.phoenix.jdbc.PhoenixDriver")
>         .option("dbtable", "TEST_TABLE")
>         .load();
> {code}
> I receive this error:
> {code}
> Exception in thread "main" java.sql.SQLException: Unsupported type 2003
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.org$apache$spark$sql$execution$datasources$jdbc$JdbcUtils$$getCatalystType(JdbcUtils.scala:209)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$5.apply(JdbcUtils.scala:246)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anonfun$5.apply(JdbcUtils.scala:246)
> 	at scala.Option.getOrElse(Option.scala:121)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.getSchema(JdbcUtils.scala:245)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$.resolveTable(JDBCRDD.scala:64)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JDBCRelation.<init>(JDBCRelation.scala:113)
> 	at org.apache.spark.sql.execution.datasources.jdbc.JdbcRelationProvider.createRelation(JdbcRelationProvider.scala:45)
> 	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:330)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:152)
> 	at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:125)
> 	at com.cakemail.spark.SimpleApp.main(SimpleApp.java:24)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:738)
> 	at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187)
> 	at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212)
> 	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126)
> 	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
> {code}
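> For reference, type code 2003 is java.sql.Types.ARRAY, which Spark's generic JDBC source (JdbcUtils.getCatalystType, per the stack trace) does not map to a Catalyst type. The Phoenix driver itself reads arrays fine over plain JDBC, which a minimal sketch like the following can confirm (assuming the table above and a Phoenix quorum on localhost; the class name and sample row are illustrative only):
> {code}
> import java.sql.Array;
> import java.sql.Connection;
> import java.sql.DriverManager;
> import java.sql.ResultSet;
> import java.sql.Statement;
>
> public class PhoenixArrayJdbcCheck {
>   public static void main(String[] args) throws Exception {
>     try (Connection conn = DriverManager.getConnection("jdbc:phoenix:localhost");
>          Statement stmt = conn.createStatement()) {
>       // Write one row with an array value (Phoenix autocommit is off by default)
>       stmt.executeUpdate(
>           "UPSERT INTO TEST_TABLE (ID, ARR_VALUES) VALUES ('id-1', ARRAY['a', 'b'])");
>       conn.commit();
>       try (ResultSet rs = stmt.executeQuery("SELECT ID, ARR_VALUES FROM TEST_TABLE")) {
>         while (rs.next()) {
>           // getArray succeeds here; the failure is in Spark's JDBC type
>           // mapping, not in the Phoenix driver
>           Array arr = rs.getArray("ARR_VALUES");
>           String[] values = (String[]) arr.getArray();
>           System.out.println(rs.getString("ID") + " -> " + String.join(",", values));
>         }
>       }
>     }
>   }
> }
> {code}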
> Removing the array column resolves the problem. I also tried different array types, but none of them worked. A possible workaround until the JDBC path supports arrays is described below.
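> The phoenix-spark integration should map Phoenix ARRAY columns to Spark SQL array types. A minimal sketch, assuming the phoenix-spark jar is on the classpath and a ZooKeeper quorum at localhost:2181:
> {code}
> // Read through the Phoenix DataSource instead of the generic "jdbc" source
> Dataset<Row> df = session
>     .read()
>     .format("org.apache.phoenix.spark")
>     .option("table", "TEST_TABLE")
>     .option("zkUrl", "localhost:2181")
>     .load();
> df.printSchema(); // ARR_VALUES should arrive as array<string>
> {code}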


