Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2021/11/03 04:17:32 UTC

[GitHub] [orc] guiyanakuang opened a new pull request #958: Fix the implementation of copy data blocks during LZO decompression

guiyanakuang opened a new pull request #958:
URL: https://github.com/apache/orc/pull/958


   
   ### What changes were proposed in this pull request?
   This PR fixes how data blocks are copied during LZO decompression.
   
   `*reinterpret_cast<int64_t*>(output) = *reinterpret_cast<int64_t*>(matchAddress);` can lead to unexpected behavior: in the failing test cases it does not appear to behave as an atomic operation.
   
   This PR replaces the statement above with `memcpy`.
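   
   As a minimal sketch of the idea (the helper name `copyLong` and the demo buffer are illustrative assumptions, not code from `LzoDecompressor.cc`), a fixed-size `memcpy` copies the bytes without assuming that `output` or `matchAddress` is aligned for `int64_t` or points at an `int64_t` object:
   
   ```c++
   #include <cassert>
   #include <cstdint>
   #include <cstring>
   
   // Hypothetical helper (not from the ORC source): copy 8 bytes between
   // char buffers without assuming alignment or a particular object type.
   inline void copyLong(char* dst, const char* src) {
     std::memcpy(dst, src, sizeof(std::int64_t));
   }
   
   // Small self-check: source and destination sit at odd offsets and do
   // not overlap, which std::memcpy requires.
   inline void copyLongDemo() {
     char buf[32] = {};
     std::memcpy(buf + 1, "abcdefgh", 8);
     copyLong(buf + 11, buf + 1);
     assert(std::memcmp(buf + 11, "abcdefgh", 8) == 0);
   }
   ```
   
   Note that `memcpy` requires the two ranges not to overlap, which the sketch respects by construction.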
   
   ### Why are the changes needed?
   This fixes a bug in LZO decompression.
   
   ### How was this patch tested?
   Passes the CI checks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] wgtmac commented on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
wgtmac commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960478909


   > > Do we have [one remaining comment](https://github.com/apache/orc/pull/958#discussion_r742486944) to address?
   > 
   > @wgtmac Do you want to use a wrapped bit_cast in this PR instead of memcpy? In LzoDecompressor.cc, though, the code is only meant to copy bytes; no cast is intended.
   
   I am OK not to address the wrapper of bit_cast in this patch.





[GitHub] [orc] wgtmac commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
wgtmac commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742486944



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       The compiler may optimize the `memcpy` call. BTW, should we add a `bit_cast` wrapper that uses `memcpy` before C++20 and the native `std::bit_cast` when C++20 is available?
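   
   For illustration only, such a wrapper might look like the sketch below. The `sketch` namespace and the demo function are hypothetical names, and the fallback assumes `To` is default-constructible, a restriction `std::bit_cast` itself does not have.
   
   ```c++
   #include <cassert>
   #include <cstdint>
   #include <cstring>
   #include <type_traits>
   #if __cplusplus >= 202002L
   #include <bit>  // may define __cpp_lib_bit_cast
   #endif
   
   namespace sketch {  // hypothetical namespace, not part of ORC
   
   #if defined(__cpp_lib_bit_cast)
     using std::bit_cast;  // native C++20 implementation
   #else
     // Pre-C++20 fallback: copy the object representation with memcpy.
     template <typename To, typename From>
     To bit_cast(const From& from) noexcept {
       static_assert(sizeof(To) == sizeof(From), "sizes must match");
       static_assert(std::is_trivially_copyable<To>::value,
                     "To must be trivially copyable");
       static_assert(std::is_trivially_copyable<From>::value,
                     "From must be trivially copyable");
       To to;
       std::memcpy(&to, &from, sizeof(To));
       return to;
     }
   #endif
   
     // Round-trips a float through its 32-bit representation.
     inline void bitCastDemo() {
       const float value = 1.5f;
       const std::uint32_t bits = bit_cast<std::uint32_t>(value);
       assert(bit_cast<float>(bits) == value);
       assert(bit_cast<std::uint32_t>(std::int32_t{-1}) == 0xffffffffu);
     }
   
   }  // namespace sketch
   ```
   
   With a modern optimizing compiler the `memcpy` fallback is typically lowered to a plain register move, so the wrapper mainly documents intent rather than changing the generated code.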




[GitHub] [orc] dongjoon-hyun commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742425822



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       The combination of `reinterpret_cast` + `assignment` looks cheaper than `memcpy` function invocation. I'm wondering if we need to pay some performance penalty here.







[GitHub] [orc] dongjoon-hyun commented on pull request #958: ORC-1041: Use `memcpy` during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-961175957


   Also, cc @williamhyun .





[GitHub] [orc] dongjoon-hyun commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742425222



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       `reinterpret_cast` is a compile-time directive while `memcpy` is a run-time operation. I'm wondering if there is a potential performance regression.







[GitHub] [orc] dongjoon-hyun commented on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960294387


   cc @wgtmac and @williamhyun 





[GitHub] [orc] dongjoon-hyun edited a comment on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun edited a comment on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960294387


   cc @wgtmac , @stiga-huang , and @williamhyun 


[GitHub] [orc] dongjoon-hyun commented on pull request #958: ORC-1041: Use `memcpy` during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-962695811


   This is backported to branch-1.7 for 1.7.2.


[GitHub] [orc] guiyanakuang commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742530817



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       No problem, I've updated the pr description







[GitHub] [orc] dongjoon-hyun merged pull request #958: ORC-1041: Use `memcpy` during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun merged pull request #958:
URL: https://github.com/apache/orc/pull/958


   


[GitHub] [orc] dongjoon-hyun commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742523083



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       Thanks. Could you put this investigation result into the PR description?







[GitHub] [orc] dongjoon-hyun commented on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960294069


   Thank you for making a PR, @guiyanakuang .


[GitHub] [orc] guiyanakuang commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742480546



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       > The combination of `reinterpret_cast` + `assignment` looks cheaper than `memcpy` function invocation. I'm wondering if we need to pay some performance penalty here.
   
   I'll run some performance tests later. `reinterpret_cast` + `assignment` works directly on registers, while `memcpy` is usually used for larger copies, so I'm not sure yet whether there is a performance loss.
   







[GitHub] [orc] guiyanakuang commented on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960466664


   > Do we have [one remaining comment](https://github.com/apache/orc/pull/958#discussion_r742486944) to address?
   
   @wgtmac Do you want to use a wrapped bit_cast in this PR instead of memcpy?
   In LzoDecompressor.cc, though, the code is only meant to copy bytes; no cast is intended.





[GitHub] [orc] dongjoon-hyun commented on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960463604


   Do we have [one remaining comment](https://github.com/apache/orc/pull/958#discussion_r742486944) to address?





[GitHub] [orc] guiyanakuang commented on pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #958:
URL: https://github.com/apache/orc/pull/958#issuecomment-960482274


   > I am OK not to address the wrapper of bit_cast in this patch.
   
   @wgtmac Thanks for the review and approval
   





[GitHub] [orc] guiyanakuang commented on a change in pull request #958: ORC-1041: Fix the implementation of copy data blocks during LZO decompression

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #958:
URL: https://github.com/apache/orc/pull/958#discussion_r742507137



##########
File path: c++/src/LzoDecompressor.cc
##########
@@ -312,13 +312,11 @@ namespace orc {
               output += SIZE_OF_INT;
               matchAddress += increment32;
 
-              *reinterpret_cast<int32_t*>(output) =
-                *reinterpret_cast<int32_t*>(matchAddress);
+              memcpy(output, matchAddress, SIZE_OF_INT);

Review comment:
       @wgtmac You are right: the compiler does optimize `memcpy`. The performance of the two approaches is similar across compilers; with older compiler versions, the expanded assignment is faster.
   I agree with wrapping a bit_cast function for binary copies between different types.
   
   @dongjoon-hyun So I don't think there is any performance loss here compared to the original.
   
   ![WX20211104-104627](https://user-images.githubusercontent.com/4069905/140250827-6282739b-c060-43fa-b348-87ede15129fc.png)
   ![WX20211104-105010](https://user-images.githubusercontent.com/4069905/140250854-cf6da388-18d8-42f0-8cd6-18468633acc3.png)
   ![WX20211104-105348](https://user-images.githubusercontent.com/4069905/140250863-6c99cfcb-0b72-4ee0-a6b0-ac31344ac771.png)
   
   ```c++
   #include <benchmark/benchmark.h>
   
   #include <cstddef>
   #include <cstdint>
   #include <cstring>
   #include <vector>
   
   // Each benchmark fills the buffer by repeatedly copying the previous
   // 8 bytes forward. The buffer size is assumed to be a multiple of 8.
   // Link against benchmark_main or add BENCHMARK_MAIN() to run standalone.
   
   static void use_memcpy(benchmark::State& state) {
     auto size = state.range(0);
     // std::vector replaces the non-standard variable-length array.
     std::vector<char> buf(static_cast<std::size_t>(size));
     for (int i = 0; i < 8; ++i) {
       buf[i] = 'a';
     }
     for (auto _ : state) {
       char *output = buf.data() + 8;
       char *matchAddress = buf.data();
       char *matchOutputLimit = buf.data() + size;
       while (output < matchOutputLimit) {
         std::memcpy(output, matchAddress, 8);
         matchAddress += 8;
         output += 8;
       }
     }
   }
   
   static void use_expanded_assignment(benchmark::State& state) {
     auto size = state.range(0);
     std::vector<char> buf(static_cast<std::size_t>(size));
     for (int i = 0; i < 8; ++i) {
       buf[i] = 'a';
     }
     for (auto _ : state) {
       char *output = buf.data() + 8;
       char *matchAddress = buf.data();
       char *matchOutputLimit = buf.data() + size;
       while (output < matchOutputLimit) {
         // Read from matchAddress; reading from matchOutputLimit would be
         // past the end of the buffer.
         output[0] = *matchAddress;
         output[1] = *(matchAddress + 1);
         output[2] = *(matchAddress + 2);
         output[3] = *(matchAddress + 3);
         output[4] = *(matchAddress + 4);
         output[5] = *(matchAddress + 5);
         output[6] = *(matchAddress + 6);
         output[7] = *(matchAddress + 7);
         matchAddress += 8;
         output += 8;
       }
     }
   }
   
   static void use_reinterpret_assignment(benchmark::State& state) {
     auto size = state.range(0);
     std::vector<char> buf(static_cast<std::size_t>(size));
     for (int i = 0; i < 8; ++i) {
       buf[i] = 'a';
     }
     for (auto _ : state) {
       char *output = buf.data() + 8;
       char *matchAddress = buf.data();
       char *matchOutputLimit = buf.data() + size;
       while (output < matchOutputLimit) {
         // Type-punned access, kept only for comparison with the original code.
         *reinterpret_cast<int64_t*>(output) =
                   *reinterpret_cast<int64_t*>(matchAddress);
         matchAddress += 8;
         output += 8;
       }
     }
   }
   
   BENCHMARK(use_memcpy)->Arg(100000);
   
   BENCHMARK(use_expanded_assignment)->Arg(100000);
   
   BENCHMARK(use_reinterpret_assignment)->Arg(100000);
   ```



