Posted to commits@orc.apache.org by do...@apache.org on 2023/03/09 20:51:29 UTC

[orc] branch main updated: ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading dictionary stream bigger then dictionary

This is an automated email from the ASF dual-hosted git repository.

dongjoon pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/orc.git


The following commit(s) were added to refs/heads/main by this push:
     new 8cf9057fc ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading dictionary stream bigger then dictionary
8cf9057fc is described below

commit 8cf9057fc498f977125be3b721daf2170330b3f9
Author: Zoltan Ratkai <zr...@cloudera.com>
AuthorDate: Thu Mar 9 12:51:22 2023 -0800

    ORC-1384: Fix `ArrayIndexOutOfBoundsException` when reading dictionary stream bigger then dictionary
    
    ### What changes were proposed in this pull request?
    Avoid `ArrayIndexOutOfBoundsException` when the dictionary stream is bigger than the dictionary. Compare the dictionary size with the available input and read only the minimum of the two.
    
    ### Why are the changes needed?
    In Hive, when reading with LLAP, data is read in 4 kB blocks, which leads to `ArrayIndexOutOfBoundsException` when the dictionary is smaller than the data that was read.
    
    ### How was this patch tested?
    It was tested with Hive's qtest, since the necessary subclasses are not available here.
    
    Closes #1431 from zratkai/ORC-1384.
    
    Lead-authored-by: Zoltan Ratkai <zr...@cloudera.com>
    Co-authored-by: Dongjoon Hyun <do...@apache.org>
    Signed-off-by: Dongjoon Hyun <do...@apache.org>
---
 java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java b/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
index ecc02fb8d..2a2adf50d 100644
--- a/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
+++ b/java/core/src/java/org/apache/orc/impl/TreeReaderFactory.java
@@ -2292,10 +2292,15 @@ public class TreeReaderFactory {
           int dictionaryBufferSize = dictionaryOffsets[dictionaryOffsets.length - 1];
           dictionaryBuffer = new byte[dictionaryBufferSize];
           int pos = 0;
-          int chunkSize = in.available();
-          byte[] chunkBytes = new byte[chunkSize];
+          // check if dictionary size is smaller than available stream size
+          // to avoid ArrayIndexOutOfBoundsException
+          int readSize = Math.min(in.available(), dictionaryBufferSize);
+          byte[] chunkBytes = new byte[readSize];
           while (pos < dictionaryBufferSize) {
-            int currentLength = in.read(chunkBytes, 0, chunkSize);
+            int currentLength = in.read(chunkBytes, 0, readSize);
+            // check if dictionary size is smaller than available stream size
+            // to avoid ArrayIndexOutOfBoundsException
+            currentLength = Math.min(currentLength, dictionaryBufferSize - pos);
             System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
             pos += currentLength;
           }
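
For readers who want to try the bounded-read idea outside of ORC, here is a minimal stand-alone sketch of the same loop. The class name `BoundedDictionaryRead` and the method `readDictionary` are hypothetical and are not part of `TreeReaderFactory`. The end-of-stream check and the minimum chunk size of one byte are extra guards assumed for this sketch; the commit itself does not add them.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical stand-alone class for illustration; not part of ORC.
final class BoundedDictionaryRead {

  // Copies exactly dictionaryBufferSize bytes from the stream into a new
  // buffer, even when the stream offers more bytes than the dictionary needs.
  static byte[] readDictionary(InputStream in, int dictionaryBufferSize)
      throws IOException {
    byte[] dictionaryBuffer = new byte[dictionaryBufferSize];
    int pos = 0;
    // Cap the chunk at the dictionary size, as in the patch. The lower bound
    // of 1 is an assumption of this sketch so that available() == 0 cannot
    // produce zero-length reads that never make progress.
    int readSize = Math.max(1, Math.min(in.available(), dictionaryBufferSize));
    byte[] chunkBytes = new byte[readSize];
    while (pos < dictionaryBufferSize) {
      int currentLength = in.read(chunkBytes, 0, readSize);
      if (currentLength < 0) {
        // Assumption of this sketch: fail loudly if the stream ends early.
        throw new EOFException("stream ended after " + pos + " bytes");
      }
      // Never copy past the end of the dictionary buffer, as in the patch.
      currentLength = Math.min(currentLength, dictionaryBufferSize - pos);
      System.arraycopy(chunkBytes, 0, dictionaryBuffer, pos, currentLength);
      pos += currentLength;
    }
    return dictionaryBuffer;
  }

  public static void main(String[] args) throws IOException {
    // A 4 kB block that holds more bytes than the 10-byte dictionary.
    byte[] block = new byte[4096];
    for (int i = 0; i < block.length; i++) {
      block[i] = (byte) i;
    }
    byte[] dictionary = readDictionary(new ByteArrayInputStream(block), 10);
    System.out.println(dictionary.length); // prints 10
  }
}
```

Note that when the last chunk is clamped, the loop has already consumed up to `readSize` bytes from the stream, so a caller that keeps reading from the same stream afterwards should not assume it is positioned at the end of the dictionary.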