Posted to issues@spark.apache.org by "Takeshi Yamamuro (JIRA)" <ji...@apache.org> on 2019/01/05 00:19:00 UTC
[jira] [Updated] (SPARK-26540) Requirement failed when reading numeric arrays from PostgreSQL
[ https://issues.apache.org/jira/browse/SPARK-26540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Takeshi Yamamuro updated SPARK-26540:
-------------------------------------
Description:
This bug was reported in spark-user: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-jdbc-postgres-numeric-array-td34280.html
To reproduce this:
{code}
// Creates a table in a PostgreSQL shell
postgres=# CREATE TABLE t (v numeric[], d numeric);
CREATE TABLE
postgres=# INSERT INTO t VALUES('{1111.222,2222.332}', 222.4555);
INSERT 0 1
postgres=# SELECT * FROM t;
v | d
---------------------+----------
{1111.222,2222.332} | 222.4555
(1 row)
postgres=# \d t
Table "public.t"
Column | Type | Modifiers
--------+-----------+-----------
v | numeric[] |
d | numeric |
// Then, reads it in Spark
./bin/spark-shell --jars=postgresql-42.2.4.jar -v
scala> import java.util.Properties
scala> val options = new Properties();
scala> options.setProperty("driver", "org.postgresql.Driver")
scala> options.setProperty("user", "maropu")
scala> options.setProperty("password", "")
scala> val pgTable = spark.read.jdbc("jdbc:postgresql:postgres", "t", options)
scala> pgTable.printSchema
root
|-- v: array (nullable = true)
| |-- element: decimal(0,0) (containsNull = true)
|-- d: decimal(38,18) (nullable = true)
scala> pgTable.show
19/01/05 09:16:34 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
java.lang.IllegalArgumentException: requirement failed: Decimal precision 4 exceeds max precision 0
at scala.Predef$.require(Predef.scala:281)
at org.apache.spark.sql.types.Decimal.set(Decimal.scala:116)
at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:465)
...
{code}
I looked over the related code, and I think we need more logic to handle numeric arrays:
https://github.com/apache/spark/blob/2a30deb85ae4e42c5cbc936383dd5c3970f4a74f/sql/core/src/main/scala/org/apache/spark/sql/jdbc/PostgresDialect.scala#L41
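For an unbounded {{numeric}} column, the driver reports precision 0 and scale 0 for the array element type, which Spark turns into the invalid {{decimal(0,0)}} shown in the schema above. A minimal sketch of one possible fallback, under the assumption that the fix belongs in the dialect's type mapping (the object and method names below are illustrative, not the actual {{PostgresDialect}} API):

```scala
// Hypothetical sketch (not the actual PostgresDialect API): pick a
// Catalyst decimal type for a PostgreSQL numeric[] element. For an
// unbounded "numeric", JDBC reports precision 0 / scale 0, which
// produces the invalid decimal(0,0) above; falling back to Spark's
// system default decimal(38,18) avoids the "max precision 0" failure.
object NumericArrayMapping {
  final case class DecimalSpec(precision: Int, scale: Int)

  // Matches DecimalType.SYSTEM_DEFAULT, which Spark already uses for
  // the scalar "numeric" column d in the schema above.
  val SystemDefault: DecimalSpec = DecimalSpec(38, 18)

  def elementDecimal(reportedPrecision: Int, reportedScale: Int): DecimalSpec =
    if (reportedPrecision == 0) SystemDefault // unbounded numeric
    else DecimalSpec(reportedPrecision, reportedScale)
}
```

With this fallback, the element type for {{v}} would come out as {{decimal(38,18)}}, consistent with how the scalar column {{d}} is already mapped.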
> Requirement failed when reading numeric arrays from PostgreSQL
> --------------------------------------------------------------
>
> Key: SPARK-26540
> URL: https://issues.apache.org/jira/browse/SPARK-26540
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.2, 2.3.2, 2.4.0
> Reporter: Takeshi Yamamuro
> Priority: Major
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org