Posted to issues@arrow.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2018/03/15 20:36:00 UTC

[jira] [Commented] (ARROW-1780) JDBC Adapter for Apache Arrow

    [ https://issues.apache.org/jira/browse/ARROW-1780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401077#comment-16401077 ] 

ASF GitHub Bot commented on ARROW-1780:
---------------------------------------

atuldambalkar opened a new pull request #1759: ARROW-1780 - (WIP) JDBC Adapter to convert Relational Data objects to Arrow Data Format Vector Objects
URL: https://github.com/apache/arrow/pull/1759
 
 
   This code enhancement converts JDBC ResultSet relational objects to Arrow columnar data Vector objects. The code is under the directory "java/adapter/jdbc/src/main".
   
   The API has the following static methods in the class org.apache.arrow.adapter.jdbc.JdbcToArrow -
   
   public static VectorSchemaRoot sqlToArrow(Connection connection, String query)
   public static ArrowDataFetcher jdbcArrowDataFetcher(Connection connection, String tableName) 
   
   The utility uses the following data mapping to convert JDBC/SQL data types to Arrow data types -
   CHAR --> ArrowType.Utf8
   NCHAR --> ArrowType.Utf8
   VARCHAR --> ArrowType.Utf8
   NVARCHAR --> ArrowType.Utf8
   LONGVARCHAR --> ArrowType.Utf8
   LONGNVARCHAR --> ArrowType.Utf8
   NUMERIC --> ArrowType.Decimal(precision, scale)
   DECIMAL --> ArrowType.Decimal(precision, scale)
   BIT --> ArrowType.Bool
   TINYINT --> ArrowType.Int(8, signed)
   SMALLINT --> ArrowType.Int(16, signed)
   INTEGER --> ArrowType.Int(32, signed)
   BIGINT --> ArrowType.Int(64, signed)
   REAL --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
   FLOAT --> ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)
   DOUBLE --> ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)
   BINARY --> ArrowType.Binary
   VARBINARY --> ArrowType.Binary
   LONGVARBINARY --> ArrowType.Binary
   DATE --> ArrowType.Date(DateUnit.MILLISECOND)
   TIME --> ArrowType.Time(TimeUnit.MILLISECOND, 32)
   TIMESTAMP --> ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)
   CLOB --> ArrowType.Utf8
   BLOB --> ArrowType.Binary
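
   The mapping above can be read as a lookup keyed on the java.sql.Types constants. Below is a minimal sketch of that lookup, not the code in the pull request: the class name JdbcTypeMappingSketch and the method arrowTypeFor are hypothetical, and the Arrow types are returned as descriptive strings (an assumption made so the example needs only the JDK rather than the Arrow library).

```java
import java.sql.Types;

// Illustrative sketch only. The class and method names are hypothetical, and
// Arrow types are represented as strings so the example depends on the JDK alone.
final class JdbcTypeMappingSketch {

    // Returns a description of the Arrow type listed above for a java.sql.Types code.
    // precision and scale are only used for NUMERIC and DECIMAL.
    static String arrowTypeFor(int jdbcType, int precision, int scale) {
        switch (jdbcType) {
            case Types.CHAR:
            case Types.NCHAR:
            case Types.VARCHAR:
            case Types.NVARCHAR:
            case Types.LONGVARCHAR:
            case Types.LONGNVARCHAR:
            case Types.CLOB:
                return "ArrowType.Utf8";
            case Types.NUMERIC:
            case Types.DECIMAL:
                return "ArrowType.Decimal(" + precision + ", " + scale + ")";
            case Types.BIT:
                return "ArrowType.Bool";
            case Types.TINYINT:
                return "ArrowType.Int(8, signed)";
            case Types.SMALLINT:
                return "ArrowType.Int(16, signed)";
            case Types.INTEGER:
                return "ArrowType.Int(32, signed)";
            case Types.BIGINT:
                return "ArrowType.Int(64, signed)";
            case Types.REAL:
            case Types.FLOAT:
                return "ArrowType.FloatingPoint(FloatingPointPrecision.SINGLE)";
            case Types.DOUBLE:
                return "ArrowType.FloatingPoint(FloatingPointPrecision.DOUBLE)";
            case Types.BINARY:
            case Types.VARBINARY:
            case Types.LONGVARBINARY:
            case Types.BLOB:
                return "ArrowType.Binary";
            case Types.DATE:
                return "ArrowType.Date(DateUnit.MILLISECOND)";
            case Types.TIME:
                return "ArrowType.Time(TimeUnit.MILLISECOND, 32)";
            case Types.TIMESTAMP:
                return "ArrowType.Timestamp(TimeUnit.MILLISECOND, timezone=null)";
            default:
                // Types not in the list above are rejected rather than guessed.
                throw new UnsupportedOperationException("Unmapped JDBC type: " + jdbcType);
        }
    }

    public static void main(String[] args) {
        System.out.println(arrowTypeFor(Types.VARCHAR, 0, 0));
        System.out.println(arrowTypeFor(Types.DECIMAL, 10, 2));
    }
}
```

   In a real adapter the precision and scale would presumably come from ResultSetMetaData.getPrecision and getScale for the column in question, and the return value would be an actual ArrowType instance.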
   
   JUnit test cases are under java/adapter/jdbc/src/test. The test cases use an H2 in-memory database. 
   
   I am still working on adding and automating additional test cases. 



> JDBC Adapter for Apache Arrow
> -----------------------------
>
>                 Key: ARROW-1780
>                 URL: https://issues.apache.org/jira/browse/ARROW-1780
>             Project: Apache Arrow
>          Issue Type: New Feature
>            Reporter: Atul Dambalkar
>            Assignee: Atul Dambalkar
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> At a high level, the JDBC Adapter will allow upstream apps to query RDBMS data over JDBC and get the JDBC objects converted to Arrow objects/structures. The upstream utility can then work with Arrow objects/structures with the usual performance benefits. The utility will be very similar to the C++ implementation of "Convert a vector of row-wise data into an Arrow table" as described here - https://arrow.apache.org/docs/cpp/md_tutorials_row_wise_conversion.html
> The utility will read data from an RDBMS and convert it into Arrow objects/structures, so from that perspective it only reads from the RDBMS. Whether the utility can push Arrow objects to an RDBMS needs to be discussed and is out of scope for this utility for now. 
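
The row-wise to columnar conversion described in the quoted issue can be illustrated with a small, dependency-free sketch. Everything here is an assumption for illustration: plain Java lists stand in for a JDBC ResultSet (rows) and for Arrow vectors (columns), and the names RowToColumnSketch and toColumns are hypothetical, not part of the adapter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: lists stand in for a JDBC ResultSet (row-wise input)
// and for Arrow vectors (column-wise output). Not the adapter's implementation.
final class RowToColumnSketch {

    // Pivots row-wise records into one list per column.
    static List<List<Object>> toColumns(List<Object[]> rows, int columnCount) {
        List<List<Object>> columns = new ArrayList<>();
        for (int c = 0; c < columnCount; c++) {
            columns.add(new ArrayList<>());
        }
        // Walk the rows once, appending each cell to its column.
        for (Object[] row : rows) {
            for (int c = 0; c < columnCount; c++) {
                columns.get(c).add(row[c]);
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        List<Object[]> rows = Arrays.asList(
            new Object[] {1, "alice"},
            new Object[] {2, "bob"});
        System.out.println(toColumns(rows, 2));
    }
}
```

An actual adapter would iterate a ResultSet with next() and write each cell into a typed Arrow vector instead of a generic list; the single pass over the rows is the part this sketch is meant to show.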


