Posted to commits@arrow.apache.org by ma...@apache.org on 2024/01/02 04:35:54 UTC

(arrow) branch main updated: GH-39413: [C++][Parquet] Vectorize decode plain on FLBA (#39414)

This is an automated email from the ASF dual-hosted git repository.

maplefu pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/arrow.git


The following commit(s) were added to refs/heads/main by this push:
     new 98f677af3c GH-39413: [C++][Parquet] Vectorize decode plain on FLBA (#39414)
98f677af3c is described below

commit 98f677af3c281680b95093ceeab084b3e57e180a
Author: Hattonuri <53...@users.noreply.github.com>
AuthorDate: Tue Jan 2 07:35:48 2024 +0300

    GH-39413: [C++][Parquet] Vectorize decode plain on FLBA (#39414)
    
    
    
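    ### Rationale for this change
    The plain decoder for FLBA (`FixedLenByteArray`) is not vectorized, so this parsing can be made faster: https://godbolt.org/z/xWeb93xjW
    
    ### What changes are included in this PR?
    Each output pointer is now computed as `data + i * type_length` instead of advancing `data` and decrementing `data_size` inside the loop.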
    
    ### Are these changes tested?
    Yes, by unit tests.
    
    ### Are there any user-facing changes?
    
    * Closes: #39413
    
    Authored-by: Dmitry Stasenko <dm...@pinely.com>
    Signed-off-by: mwish <ma...@gmail.com>
---
 cpp/src/parquet/encoding.cc | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/cpp/src/parquet/encoding.cc b/cpp/src/parquet/encoding.cc
index 9ad1ee6efc..840efa12cc 100644
--- a/cpp/src/parquet/encoding.cc
+++ b/cpp/src/parquet/encoding.cc
@@ -1080,9 +1080,7 @@ inline int DecodePlain<FixedLenByteArray>(const uint8_t* data, int64_t data_size
     ParquetException::EofException();
   }
   for (int i = 0; i < num_values; ++i) {
-    out[i].ptr = data;
-    data += type_length;
-    data_size -= type_length;
+    out[i].ptr = data + i * type_length;
   }
   return static_cast<int>(bytes_to_decode);
 }
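
The two loop shapes in the hunk above can be compared in isolation. The following is a minimal sketch, not the Arrow code itself: `Flba`, `DecodeSequential`, `DecodeIndexed`, and `SameResult` are hypothetical names introduced here for illustration, and `Flba` only stands in for `FixedLenByteArray` under the assumption that it holds a single pointer. It shows that the indexed form yields the same pointers while dropping the dependency on a pointer advanced by the previous iteration, which is presumably what makes the loop easier for the compiler to vectorize.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for parquet::FixedLenByteArray (assumed: one pointer).
struct Flba {
  const uint8_t* ptr;
};

// Old shape: every iteration reads the pointer advanced by the previous one.
inline void DecodeSequential(const uint8_t* data, int type_length,
                             int num_values, Flba* out) {
  for (int i = 0; i < num_values; ++i) {
    out[i].ptr = data;
    data += type_length;
  }
}

// New shape: every address is computed from the loop index alone.
inline void DecodeIndexed(const uint8_t* data, int type_length,
                          int num_values, Flba* out) {
  for (int i = 0; i < num_values; ++i) {
    out[i].ptr = data + i * type_length;
  }
}

// Returns true when both loops produce identical pointers for this layout.
inline bool SameResult(int type_length, int num_values) {
  std::vector<uint8_t> buffer(static_cast<std::size_t>(type_length) *
                              static_cast<std::size_t>(num_values));
  std::vector<Flba> a(num_values), b(num_values);
  DecodeSequential(buffer.data(), type_length, num_values, a.data());
  DecodeIndexed(buffer.data(), type_length, num_values, b.data());
  for (int i = 0; i < num_values; ++i) {
    if (a[i].ptr != b[i].ptr) return false;
  }
  return true;
}
```

Calling `SameResult(16, 1000)`, for example, checks the equivalence for 1000 values of 16 bytes each. Whether a given compiler actually vectorizes the indexed loop depends on the target and optimization flags, so the generated assembly is worth inspecting, as the Compiler Explorer link in the commit message does.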