Posted to issues@spark.apache.org by "Dongjoon Hyun (JIRA)" <ji...@apache.org> on 2019/01/05 00:42:00 UTC

[jira] [Comment Edited] (SPARK-26540) Requirement failed when reading numeric arrays from PostgreSQL

    [ https://issues.apache.org/jira/browse/SPARK-26540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16734708#comment-16734708 ] 

Dongjoon Hyun edited comment on SPARK-26540 at 1/5/19 12:41 AM:
----------------------------------------------------------------

For me, this is neither a correctness issue nor a regression. Could you check an older Spark version like 2.0/2.1? I would guess this is also not a data loss issue.


was (Author: dongjoon):
For me, this is neither a correctness issue nor a regression. Could you check an older Spark version like 2.0/2.1?

> Requirement failed when reading numeric arrays from PostgreSQL
> --------------------------------------------------------------
>
>                 Key: SPARK-26540
>                 URL: https://issues.apache.org/jira/browse/SPARK-26540
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.2.2, 2.3.2, 2.4.0
>            Reporter: Takeshi Yamamuro
>            Priority: Major
>
> This bug was reported in spark-user: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-jdbc-postgres-numeric-array-td34280.html
> To reproduce this:
> {code}
> // Creates a table in a PostgreSQL shell
> postgres=# CREATE TABLE t (v numeric[], d  numeric);
> CREATE TABLE
> postgres=# INSERT INTO t VALUES('{1111.222,2222.332}', 222.4555);
> INSERT 0 1
> postgres=# SELECT * FROM t;
>           v          |    d     
> ---------------------+----------
>  {1111.222,2222.332} | 222.4555
> (1 row)
> postgres=# \d t
>         Table "public.t"
>  Column |   Type    | Modifiers 
> --------+-----------+-----------
>  v      | numeric[] | 
>  d      | numeric   | 
> // Then, read the table in Spark
> ./bin/spark-shell --jars=postgresql-42.2.4.jar -v
> scala> import java.util.Properties
> scala> val options = new Properties();
> scala> options.setProperty("driver", "org.postgresql.Driver")
> scala> options.setProperty("user", "maropu")
> scala> options.setProperty("password", "")
> scala> val pgTable = spark.read.jdbc("jdbc:postgresql:postgres", "t", options)
> scala> pgTable.printSchema
> root
>  |-- v: array (nullable = true)
>  |    |-- element: decimal(0,0) (containsNull = true)
>  |-- d: decimal(38,18) (nullable = true)
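> // Note the element type decimal(0,0): for an unconstrained numeric[] the
> // PostgreSQL JDBC driver reports precision 0 and scale 0, and the Postgres
> // dialect maps that metadata straight to DecimalType(0, 0), so any
> // non-empty value exceeds the declared precision. The scalar column d is
> // unaffected because JdbcUtils already falls back to decimal(38,18) when
> // the reported precision is 0.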
> scala> pgTable.show
> 19/01/05 09:16:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.lang.IllegalArgumentException: requirement failed: Decimal precision 4 exceeds max precision 0
> 	at scala.Predef$.require(Predef.scala:281)
> 	at org.apache.spark.sql.types.Decimal.set(Decimal.scala:116)
> 	at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:465)
> ...
> {code}
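> A workaround that may help until the dialect handles this case is to override the inferred types with the JDBC customSchema option (a sketch; the option exists since Spark 2.3, and DECIMAL(38,18) below is just one reasonable choice):
> {code}
> // Override the inferred decimal(0,0) element type with an explicit DDL
> // schema so the row conversion uses a workable precision/scale. Column
> // names match the table created above.
> val pgTable = spark.read
>   .option("customSchema", "v ARRAY<DECIMAL(38,18)>, d DECIMAL(38,18)")
>   .jdbc("jdbc:postgresql:postgres", "t", options)
> pgTable.show
> {code}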
> I looked over the related code, and I think we need more logic to handle numeric arrays here:
> https://github.com/apache/spark/blob/2a30deb85ae4e42c5cbc936383dd5c3970f4a74f/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L41
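> As an illustrative sketch (not a tested patch), the toCatalystType helper in the linked file could special-case a reported precision of 0, falling back to Spark's system default decimal the same way JdbcUtils.getCatalystType already does for the scalar case (which is why d comes out as decimal(38,18) above):
> {code}
> // Sketch only: keep the bounded mapping when a real precision is present,
> // and fall back to DecimalType.SYSTEM_DEFAULT, i.e. decimal(38,18), when
> // PostgreSQL reports precision 0 for an unconstrained numeric.
> case "numeric" | "decimal" if precision > 0 =>
>   Some(DecimalType.bounded(precision, scale))
> case "numeric" | "decimal" =>
>   Some(DecimalType.SYSTEM_DEFAULT)
> {code}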
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org