Posted to jira@arrow.apache.org by "Jonathan Swenson (Jira)" <ji...@apache.org> on 2022/04/30 21:20:00 UTC

[jira] [Created] (ARROW-16427) [Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable decimal precision / scale

Jonathan Swenson created ARROW-16427:
----------------------------------------

             Summary: [Java] jdbcToArrowVectors / sqlToArrowVectorIterator fails to handle variable decimal precision / scale
                 Key: ARROW-16427
                 URL: https://issues.apache.org/jira/browse/ARROW-16427
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Jonathan Swenson


 

Against PostgreSQL, a simple SQL query that produces numeric values can return a JDBC result set whose BigDecimal values vary in precision and scale from row to row.

 

 
{code:java}
SELECT value FROM (
  SELECT 1000000000000000.01 AS "value" 
  UNION SELECT 1000000000300.0000001
) a {code}
 

 

The PostgreSQL JDBC driver produces a result set that looks like the following:

 
|| ||value||precision||scale||
|metadata|N/A|0|0|
|row 1|1000000000000000.01|18|2|
|row 2|1000000000300.0000001|20|7|
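The per-row precision and scale in the table can be reproduced with plain java.math.BigDecimal, with no database involved. The snippet below is a minimal sketch; the class name DecimalShapeDemo and the helper describe are made up for illustration and are not part of Arrow or the JDBC driver.

```java
import java.math.BigDecimal;

public class DecimalShapeDemo {
    /** Hypothetical helper: renders a value together with its precision and scale. */
    public static String describe(BigDecimal value) {
        return value.toPlainString()
            + " precision=" + value.precision()
            + " scale=" + value.scale();
    }

    public static void main(String[] args) {
        // The two literals from the UNION query above.
        System.out.println(describe(new BigDecimal("1000000000000000.01")));
        System.out.println(describe(new BigDecimal("1000000000300.0000001")));
    }
}
```

Each BigDecimal carries its own precision and scale, so a single 0/0 value from the column metadata cannot describe both rows.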

 

 

Even a result set that returns a single value may contain numeric values whose precision / scale do not match the precision / scale in the ResultSetMetaData.

 
{code:java}
SELECT AVG(one) from (
  SELECT 1000000000000000.01 as "one" 
  UNION select 1000000000300.0000001
) a {code}
This produces a result set that looks like this:

 

 
|| ||value||precision||scale||
|metadata|N/A|0|0|
|row 1|500500000000150.0050001|22|7|

 

When the result set is processed with the simple jdbcToArrowVectors (or sqlToArrowVectorIterator), setting the values extracted from the result set into the DecimalVector fails:

 
{code:java}
val calendar = JdbcToArrowUtils.getUtcCalendar()
val schema = JdbcToArrowUtils.jdbcToArrowSchema(rs.metaData, calendar)
val root = VectorSchemaRoot.create(schema, RootAllocator())
val vectors = JdbcToArrowUtils.jdbcToArrowVectors(rs, root, calendar) {code}
 

Error:

 
{code:java}
Exception in thread "main" java.lang.IndexOutOfBoundsException: index: 0, length: 1 (expected: range(0, 0))
    at org.apache.arrow.memory.ArrowBuf.checkIndexD(ArrowBuf.java:318)
    at org.apache.arrow.memory.ArrowBuf.chk(ArrowBuf.java:305)
    at org.apache.arrow.memory.ArrowBuf.getByte(ArrowBuf.java:507)
    at org.apache.arrow.vector.BitVectorHelper.setBit(BitVectorHelper.java:85)
    at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:354)
    at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
    at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
    at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:369)
    at org.apache.arrow.adapter.jdbc.JdbcToArrowUtils.jdbcToArrowVectors(JdbcToArrowUtils.java:321) {code}
 

Using `sqlToArrowVectorIterator` also fails with an error when setting data into the vector (this requires a little bit of trickery to force creation of the package-private configuration):

 
{code:java}
Exception in thread "main" java.lang.RuntimeException: Error occurred while getting next schema root.
    at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:179)
    at com.acme.dataformat.ArrowResultSetProcessor.processResultSet(ArrowResultSetProcessor.kt:31)
    at com.acme.AppKt.main(App.kt:54)
    at com.acme.AppKt.main(App.kt)
Caused by: java.lang.RuntimeException: Error occurred while consuming data.
    at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:121)
    at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.load(ArrowVectorIterator.java:153)
    at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.next(ArrowVectorIterator.java:175)
    ... 3 more
Caused by: java.lang.UnsupportedOperationException: BigDecimal scale must equal that in the Arrow vector: 7 != 0
    at org.apache.arrow.vector.util.DecimalUtility.checkPrecisionAndScale(DecimalUtility.java:95)
    at org.apache.arrow.vector.DecimalVector.set(DecimalVector.java:355)
    at org.apache.arrow.adapter.jdbc.consumer.DecimalConsumer$NullableDecimalConsumer.consume(DecimalConsumer.java:61)
    at org.apache.arrow.adapter.jdbc.consumer.CompositeJdbcConsumer.consume(CompositeJdbcConsumer.java:46)
    at org.apache.arrow.adapter.jdbc.ArrowVectorIterator.consumeData(ArrowVectorIterator.java:113)
    ... 5 more {code}
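The innermost cause can be illustrated without Arrow on the classpath. The sketch below is a hypothetical stand-in for a strict scale check, modeled only on the exception message in the trace; ScaleCheckSketch and requireScale are invented names, and this is not Arrow's actual implementation.

```java
import java.math.BigDecimal;

public class ScaleCheckSketch {
    /**
     * Hypothetical strict check: rejects any value whose scale differs
     * from the scale the target vector was declared with.
     */
    public static void requireScale(BigDecimal value, int vectorScale) {
        if (value.scale() != vectorScale) {
            throw new UnsupportedOperationException(
                "BigDecimal scale must equal that in the Arrow vector: "
                    + value.scale() + " != " + vectorScale);
        }
    }

    public static void main(String[] args) {
        // The AVG result from above, checked against the scale of 0
        // reported by the result set metadata.
        BigDecimal row = new BigDecimal("500500000000150.0050001");
        try {
            requireScale(row, 0);
        } catch (UnsupportedOperationException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a vector scale taken from metadata that reports 0, any row with fractional digits fails a check of this shape.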
 

 

Is there a recommended course of action to represent a variable precision / scale decimal vector? In any case it does not seem possible to convert JDBC data that uses these numeric types when they come in this form. 
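One possible direction, sketched here under assumptions rather than as a recommended fix: if the target column were declared up front with a fixed type such as DECIMAL(38, 10) (an illustrative choice, not something the adapter does with this metadata), each value could be rescaled before being written. The class DecimalRescaleSketch, the method rescale, and the HALF_UP rounding mode are all assumptions made for the example.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;

public class DecimalRescaleSketch {
    /**
     * Hypothetical helper: forces a value to the given scale (rounding
     * HALF_UP if digits must be dropped) and rejects it if the result
     * needs more digits than the target precision allows.
     */
    public static BigDecimal rescale(BigDecimal value, int precision, int scale) {
        BigDecimal scaled = value.setScale(scale, RoundingMode.HALF_UP);
        if (scaled.precision() > precision) {
            throw new ArithmeticException(
                "value needs precision " + scaled.precision()
                    + " but target allows " + precision);
        }
        return scaled;
    }

    public static void main(String[] args) {
        // Assumed target type for the example: DECIMAL(38, 10).
        BigDecimal a = rescale(new BigDecimal("1000000000000000.01"), 38, 10);
        BigDecimal b = rescale(new BigDecimal("1000000000300.0000001"), 38, 10);
        System.out.println(a.toPlainString() + " scale=" + a.scale());
        System.out.println(b.toPlainString() + " scale=" + b.scale());
    }
}
```

The trade-off is that the scale has to be chosen before the data is seen: values with more fractional digits than the chosen scale are rounded, and values with too many integer digits are rejected.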

 

It seems like Oracle's JDBC driver also returns metadata with a 0,0 precision / scale and other dialects might do the same. 

 


