Posted to issues@arrow.apache.org by "Chendi.Xue (Jira)" <ji...@apache.org> on 2019/09/27 02:15:00 UTC
[jira] [Created] (ARROW-6720) [JAVA][C++]Support Parquet Read and Write in Java
Chendi.Xue created ARROW-6720:
---------------------------------
Summary: [JAVA][C++]Support Parquet Read and Write in Java
Key: ARROW-6720
URL: https://issues.apache.org/jira/browse/ARROW-6720
Project: Apache Arrow
Issue Type: New Feature
Components: C++, Java
Affects Versions: 0.15.0
Reporter: Chendi.Xue
Fix For: 0.15.0
We added a new Java interface to support reading and writing Parquet from HDFS or a local file.
The motivation is that when loading and dumping Parquet data in Java today, we can only use row-based put and get methods. Since Arrow already has a C++ implementation to load and dump Parquet, we wrapped that code as Java APIs.
After testing, we noticed that in our workload performance improved by more than 2x compared with row-based load and dump, so we would like to contribute the code to Arrow.
Since this is a fully independent change, it requires no changes to existing Arrow code. We added two folders: java/adapter/parquet and cpp/src/jni/parquet
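To make the row-based versus columnar distinction above concrete, here is a minimal Java sketch. It does not use the Arrow or Parquet APIs and is not the code contributed in this issue; the Row and ColumnBatch types and the helper methods are hypothetical names invented purely for illustration. It only shows the two access patterns side by side: one object per row read through a getter-style field, versus one primitive array per column scanned as a batch.

```java
import java.util.ArrayList;
import java.util.List;

class RowVsColumnSketch {
    // Hypothetical row record: one object per row, as in row-based put/get access.
    static final class Row {
        final int id;
        final double value;

        Row(int id, double value) {
            this.id = id;
            this.value = value;
        }
    }

    // Hypothetical column batch: one primitive array per column.
    static final class ColumnBatch {
        final int[] ids;
        final double[] values;

        ColumnBatch(int[] ids, double[] values) {
            if (ids.length != values.length) {
                throw new IllegalArgumentException("columns must have equal length");
            }
            this.ids = ids;
            this.values = values;
        }
    }

    // Builds the row-based representation from the same column data.
    static List<Row> toRows(int[] ids, double[] values) {
        List<Row> rows = new ArrayList<>(ids.length);
        for (int i = 0; i < ids.length; i++) {
            rows.add(new Row(ids[i], values[i]));
        }
        return rows;
    }

    // Row-based access: visit each row object and read one field from it.
    static double sumRowBased(List<Row> rows) {
        double total = 0.0;
        for (Row row : rows) {
            total += row.value;
        }
        return total;
    }

    // Column-based access: scan a single contiguous array for the whole batch.
    static double sumColumnBased(ColumnBatch batch) {
        double total = 0.0;
        for (double v : batch.values) {
            total += v;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] ids = {1, 2, 3};
        double[] values = {1.5, 2.5, 3.0};
        System.out.println(sumRowBased(toRows(ids, values)));
        System.out.println(sumColumnBased(new ColumnBatch(ids, values)));
    }
}
```

Both methods compute the same result; the difference is only in how the data is laid out and traversed. Any performance difference in a real workload would depend on the actual reader implementation and is not demonstrated by this sketch.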
--
This message was sent by Atlassian Jira
(v8.3.4#803005)