Posted to issues@spark.apache.org by "Takeshi Yamamuro (JIRA)" <ji...@apache.org> on 2019/01/05 00:19:00 UTC
[jira] [Updated] (SPARK-26540) Requirement failed when reading numeric arrays from PostgreSQL
[ https://issues.apache.org/jira/browse/SPARK-26540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Takeshi Yamamuro updated SPARK-26540:
-------------------------------------
Description:
This bug was reported in spark-user: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-jdbc-postgres-numeric-array-td34280.html
To reproduce this:
{code}
// Creates a table in a PostgreSQL shell
postgres=# CREATE TABLE t (v numeric[], d numeric);
CREATE TABLE
postgres=# INSERT INTO t VALUES('{1111.222,2222.332}', 222.4555);
INSERT 0 1
postgres=# SELECT * FROM t;
v | d
---------------------+----------
{1111.222,2222.332} | 222.4555
(1 row)
postgres=# \d t
Table "public.t"
Column | Type | Modifiers
--------+-----------+-----------
v | numeric[] |
d | numeric |
// Then, reads it in Spark
./bin/spark-shell --jars=postgresql-42.2.4.jar -v
scala> import java.util.Properties
scala> val options = new Properties();
scala> options.setProperty("driver", "org.postgresql.Driver")
scala> options.setProperty("user", "maropu")
scala> options.setProperty("password", "")
scala> val pgTable = spark.read.jdbc("jdbc:postgresql:postgres", "t", options)
scala> pgTable.printSchema
root
|-- v: array (nullable = true)
| |-- element: decimal(0,0) (containsNull = true)
|-- d: decimal(38,18) (nullable = true)
scala> pgTable.show
19/01/05 09:16:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: requirement failed: Decimal precision 4 exceeds max precision 0
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.sql.types.Decimal.set(Decimal.scala:116)
at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:465)
...
{code}
I looked over the related code, and I think we need more logic to handle numeric arrays:
https://github.com/apache/spark/blob/2a30deb85ae4e42c5cbc936383dd5c3970f4a74f/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L41
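For an unbounded {{numeric}} column, the driver reports precision 0 and scale 0 for the array element type, which Spark turns into the invalid {{decimal(0,0)}} shown in the schema above. A minimal sketch of one possible fallback, under the assumption that the fix belongs in the dialect's type mapping (the object and method names below are illustrative, not the actual {{PostgresDialect}} API):

```scala
// Hypothetical sketch (not the actual PostgresDialect API): pick a
// Catalyst decimal type for a PostgreSQL numeric[] element. For an
// unbounded "numeric", JDBC reports precision 0 / scale 0, which
// produces the invalid decimal(0,0) above; falling back to Spark's
// system default decimal(38,18) avoids the "max precision 0" failure.
object NumericArrayMapping {
  final case class DecimalSpec(precision: Int, scale: Int)

  // Matches DecimalType.SYSTEM_DEFAULT, which Spark already uses for
  // the scalar "numeric" column d in the schema above.
  val SystemDefault: DecimalSpec = DecimalSpec(38, 18)

  def elementDecimal(reportedPrecision: Int, reportedScale: Int): DecimalSpec =
    if (reportedPrecision == 0) SystemDefault // unbounded numeric
    else DecimalSpec(reportedPrecision, reportedScale)
}
```

With this fallback, the element type for {{v}} would come out as {{decimal(38,18)}}, consistent with how the scalar column {{d}} is already mapped.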
> Requirement failed when reading numeric arrays from PostgreSQL
> --------------------------------------------------------------
>
> Key: SPARK-26540
> URL: https://issues.apache.org/jira/browse/SPARK-26540
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.2, 2.3.2, 2.4.0
> Reporter: Takeshi Yamamuro
> Priority: Major
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org