Posted to dev@parquet.apache.org by "ASF GitHub Bot (Jira)" <ji...@apache.org> on 2023/05/23 14:21:00 UTC

[jira] [Commented] (PARQUET-2212) Add ByteBuffer api for decryptors to allow direct memory to be decrypted

    [ https://issues.apache.org/jira/browse/PARQUET-2212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17725432#comment-17725432 ] 

ASF GitHub Bot commented on PARQUET-2212:
-----------------------------------------

shangxinli commented on PR #1008:
URL: https://github.com/apache/parquet-mr/pull/1008#issuecomment-1559507176

   LGTM




> Add ByteBuffer api for decryptors to allow direct memory to be decrypted
> ------------------------------------------------------------------------
>
>                 Key: PARQUET-2212
>                 URL: https://issues.apache.org/jira/browse/PARQUET-2212
>             Project: Parquet
>          Issue Type: Improvement
>          Components: parquet-mr
>    Affects Versions: 1.12.3
>            Reporter: Parth Chandra
>            Assignee: Parth Chandra
>            Priority: Major
>             Fix For: 1.14.0
>
>
> BlockCipher.Decryptor currently provides only a decrypt API that takes a byte array:
> {code:java}
> byte[] decrypt(byte[] lengthAndCiphertext, byte[] AAD);{code}
> A parquet reader that uses the DirectByteBufferAllocator has to incur the cost of copying the data into a byte array (and sometimes back to a DirectByteBuffer) to decrypt data.
> This proposes adding a new API that accepts ByteBuffer as input and avoids the data copy.
> {code:java}
> ByteBuffer decrypt(ByteBuffer from, byte[] AAD);{code}
> The decryption in ColumnChunkPageReadStore can also be updated to use the ByteBuffer-based API if the buffer is a DirectByteBuffer. If the buffer is a HeapByteBuffer, we can continue to use the byte array API, since accessing the underlying byte array does not incur a copy.
> Also, some investigation has shown that decryption with ByteBuffers cannot use hardware acceleration in JVMs before JDK 17. In those cases, overall decryption is faster with byte arrays, even after incurring the overhead of making a copy.
> The proposal, then, is to enable the ByteBuffer API for DirectByteBuffers only, and only if the JDK is JDK 17 or higher or the user explicitly configures it.
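
For illustration, the selection rule described above could be sketched roughly as follows. This is not the parquet-mr implementation: the class DecryptPathSketch, its method names, and the userForced flag are hypothetical, and the sketch uses plain javax.crypto AES-GCM while ignoring Parquet's own framing (length prefix, nonce placement). It only shows the ByteBuffer path being taken for direct buffers on JDK 17+ (or when forced), with the byte array path used otherwise.

```java
import java.nio.ByteBuffer;
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Illustrative only: none of these names come from parquet-mr. */
class DecryptPathSketch {

  /** Rule from the proposal: direct buffers only, and only on JDK 17+ or when forced. */
  public static boolean useByteBufferPath(ByteBuffer buf, int jdkFeature, boolean userForced) {
    return buf.isDirect() && (jdkFeature >= 17 || userForced);
  }

  /** Parses "1.8" as 8 and "17" as 17. */
  public static int jdkFeatureVersion() {
    String v = System.getProperty("java.specification.version");
    return Integer.parseInt(v.startsWith("1.") ? v.substring(2) : v);
  }

  private static Cipher gcm(int mode, byte[] key, byte[] nonce, byte[] aad)
      throws GeneralSecurityException {
    Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
    c.init(mode, new SecretKeySpec(key, "AES"), new GCMParameterSpec(128, nonce));
    if (aad != null) {
      c.updateAAD(aad);
    }
    return c;
  }

  /** Helper so the sketch can be exercised end to end. */
  public static byte[] encrypt(byte[] key, byte[] nonce, byte[] plain, byte[] aad) {
    try {
      return gcm(Cipher.ENCRYPT_MODE, key, nonce, aad).doFinal(plain);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Decrypts the remaining bytes of 'from' (ciphertext followed by the GCM tag). */
  public static ByteBuffer decrypt(
      byte[] key, byte[] nonce, ByteBuffer from, byte[] aad, boolean userForced) {
    try {
      Cipher c = gcm(Cipher.DECRYPT_MODE, key, nonce, aad);
      if (useByteBufferPath(from, jdkFeatureVersion(), userForced)) {
        // ByteBuffer path: no intermediate byte[] copy of the input.
        ByteBuffer out = ByteBuffer.allocateDirect(c.getOutputSize(from.remaining()));
        c.doFinal(from, out);
        out.flip();
        return out;
      }
      if (from.hasArray()) {
        // Heap buffer: hand the backing array to the cipher without copying it.
        int offset = from.arrayOffset() + from.position();
        return ByteBuffer.wrap(c.doFinal(from.array(), offset, from.remaining()));
      }
      // Direct buffer on the byte array path: pay for one copy out of direct memory.
      byte[] copy = new byte[from.remaining()];
      from.get(copy);
      return ByteBuffer.wrap(c.doFinal(copy));
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }
}
```

Passing the JDK feature version into useByteBufferPath as a parameter keeps the rule itself easy to test on any JVM; how the real reader detects the JDK and exposes the user override is outside what this sketch assumes.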



--
This message was sent by Atlassian Jira
(v8.20.10#820010)