You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@orc.apache.org by GitBox <gi...@apache.org> on 2021/09/27 09:08:18 UTC

[GitHub] [orc] guiyanakuang opened a new pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

guiyanakuang opened a new pull request #917:
URL: https://github.com/apache/orc/pull/917


   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. File a JIRA issue first and use it as a prefix of your PR title, e.g., `ORC-001: Fix ABC`.
     2. Use your PR title to summarize what this PR proposes instead of describing the problem.
     3. Make PR title and description complete because these will be the permanent commit log.
     4. If possible, provide a concise and reproducible example to reproduce the issue for a faster review.
     5. If the PR is unfinished, use GitHub PR Draft feature.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If there is a discussion in the mailing list, please add the link.
   -->
   Fix overflow detection code for C++ int64_t / java long.
   > ORC-338 Workaround C++ compiler bug in xcode 9.3 by removing an inline function.
   As I fixed the implementation. The current update function can be inline.
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   Fix bug.
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   -->
   Pass the CIs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-951665955


   > Hi, @guiyanakuang . Could you make a backport to branch-1.6/1.7?
   
   No problem, I'll make it in the next two days when I have free time.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930796494


   Yes, because the sum in the statistics may be wrong and the user may take it as the correct conclusion. Do you think this is Major or Minor bug?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-931871138


   > We may just need to do this -
   > 
   > *result = x * y; return (x == 0 || y == 0 || *result / x == y);
   
   `if (((value < 0 ? -value : value) | repetitions) >> 31 != 0 && r / repetitions != value)`
   The current is a copy of the JDK implementation, which intends that values in the range [-2^31, 2^31] can be short-circuited to determine that no overflow will occur. It is also assumed that this range is used more frequently.
   
   Thanks to @xndai for the advice, so I've done some simple benchmarking in https://quick-bench.com/.
   
   They perform about the same under higher versions of the compiler, and it seems that the current implementation is better under lower versions. 
   
   BM_A:  current impl
   BM_B:  advice impl
   BM_C:  __builtin_mul_overflow
   
   ![image](https://user-images.githubusercontent.com/4069905/135560227-ac184361-ff43-463e-9a95-9e6d6c2f02ad.png)
   ![image](https://user-images.githubusercontent.com/4069905/135560615-76c904a6-dbf8-4f58-a410-5a34368d9460.png)
   ![image](https://user-images.githubusercontent.com/4069905/135560675-49dd3d6e-41b0-4511-91d2-e17db29d3762.png)
   
   
   
   ```c++
   static bool multiplyExactA(int64_t value, int64_t repetitions, int64_t* result) {
     int64_t r = value * repetitions;
     if (((value < 0 ? -value : value) | repetitions) >> 31 != 0 && r / repetitions != value) {
       return false;
     }
     *result = r;
     return true;
   }
   
   static bool multiplyExactB(int64_t x, int64_t y, int64_t* result) {
     int64_t r =  x * y;
     if (x == 0 || y == 0 || r / y == x) {
       *result = r;
       return true;
     }
     return false;
   }
   
   static void BM_A(benchmark::State& state) {
     auto x = state.range(0);
     int64_t value;
     for (auto _ : state) {
       for (int64_t i = std::numeric_limits<int64_t>::min(); i < std::numeric_limits<int64_t>::max(); i += 1) {
         for (int64_t j = 2; j <= x; j += 1) {
           multiplyExactA(i, j, &value);
         }
       }
     }
   }
   
   static void BM_B(benchmark::State& state) {
     auto x = state.range(0);
     int64_t value;
     for (auto _ : state) {
       for (int64_t i = std::numeric_limits<int64_t>::min(); i < std::numeric_limits<int64_t>::max(); i += 1) {
         for (int64_t j = 2; j <= x; j += 1) {
           multiplyExactB(i, j, &value);
         }
       }
     }
   }
   
   static void BM_C(benchmark::State& state) {
     auto x = state.range(0);
     int64_t value;
     for (auto _ : state) {
       for (int64_t i = std::numeric_limits<int64_t>::min(); i < std::numeric_limits<int64_t>::max(); i += 1) {
         for (int64_t j = 2; j <= x; j += 1) {
           __builtin_mul_overflow(i, j, &value);
         }
       }
     }
   }
   
   BENCHMARK(BM_A)->Arg(512);
   
   BENCHMARK(BM_B)->Arg(512);
   
   BENCHMARK(BM_C)->Arg(512);
   ```


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717327598



##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       @wgtmac I made some attempts, compatibility with various compilers seemed difficult and I switched to rewriting to C++ following the JDK's algorithm.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] dongjoon-hyun commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717202883



##########
File path: java/core/src/test/org/apache/orc/TestColumnStatistics.java
##########
@@ -50,6 +51,16 @@
  */
 public class TestColumnStatistics {
 
+  @Test
+  public void testLongSumOverflow() {
+    TypeDescription schema = TypeDescription.createInt();
+    ColumnStatisticsImpl stats = ColumnStatisticsImpl.create(schema);
+    stats.updateInteger(1, 1);
+    assertTrue(((IntegerColumnStatistics) stats).isSumDefined());
+    stats.updateInteger(Long.MAX_VALUE, 3);

Review comment:
       Thank you for adding this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930687353


   > Sorry for my late reply. Thanks for the investigation @guiyanakuang and @dongjoon-hyun for the review. I am fine with the impl from jdk. I have left one comment. Any thoughts?
   
   @wgtmac. Good advice, I've pushed for a new commit. How do you feel about it?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] pgaref commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
pgaref commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-927908938


   Thanks for the PR @guiyanakuang  -- can we please add a test demonstrating the issue?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] dongjoon-hyun commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-951646616


   Hi, @guiyanakuang .
   Could you make a backport to branch-1.6/1.7?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] dongjoon-hyun commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930800560


   I believe we need backports.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] dongjoon-hyun commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717202883



##########
File path: java/core/src/test/org/apache/orc/TestColumnStatistics.java
##########
@@ -50,6 +51,16 @@
  */
 public class TestColumnStatistics {
 
+  @Test
+  public void testLongSumOverflow() {
+    TypeDescription schema = TypeDescription.createInt();
+    ColumnStatisticsImpl stats = ColumnStatisticsImpl.create(schema);
+    stats.updateInteger(1, 1);
+    assertTrue(((IntegerColumnStatistics) stats).isSumDefined());
+    stats.updateInteger(Long.MAX_VALUE, 3);

Review comment:
       Thank you for adding this.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717771143



##########
File path: c++/include/orc/orc-config.hh.in
##########
@@ -75,4 +75,38 @@
   }
 #endif
 
+#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
+  namespace orc {
+    /**
+     * Compute value * repetitions, return false if overflow, return true otherwise
+     * and save the result at the address pointed to by result
+     * imitates the jdk Math.multiplyExact implementation
+     * but this method makes the assumption that repetitions > 1
+     */
+    static bool multiplyExact(int64_t value, int64_t repetitions, int64_t* result) {
+      int64_t r = value * repetitions;
+      if (((value < 0 ? -value : value) | repetitions) >> 31 != 0 && r / repetitions != value) {
+        return false;
+      }
+      *result = r;
+      return true;
+    }

Review comment:
       Hi @dongjoon-hyun @wgtmac. Finally passed the test. But it needs a strict review.
   The window environment overflow check I've mimicked the JDK implementation. If there is a better way, please let me know.
   In fact it doesn't work well on clang (the compiler has some unknown behaviour that I don't know about)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-927926777


   > Thanks for the PR @guiyanakuang -- can we please add a test demonstrating the issue?
   
   You're right, I'll add test cases later. : )


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] wgtmac commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
wgtmac commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717192800



##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       I think Windows will complain about it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717197131



##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       @wgtmac Sorry, I'm still a C++ novice. Let me see how compatible

##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       https://docs.microsoft.com/en-us/windows/win32/api/intsafe/nf-intsafe-longadd
   https://docs.microsoft.com/en-us/windows/win32/api/intsafe/nf-intsafe-longmult
   ```c++
   #define __builtin_mul_overflow LongMult
   #define __builtin_add_overflow LongAdd
   ```
   @wgtmac Can you give me some advice, I have very little practice in C++. I don't have a Win machine to hand, so I can't verify if this is possible at the moment.

##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       @wgtmac I made some attempts, compatibility with various compilers seemed difficult and I switched to rewriting to C++ following the JDK's algorithm.

##########
File path: c++/include/orc/orc-config.hh.in
##########
@@ -75,4 +75,38 @@
   }
 #endif
 
+#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
+  namespace orc {
+    /**
+     * Compute value * repetitions, return false if overflow, return true otherwise
+     * and save the result at the address pointed to by result
+     * imitates the jdk Math.multiplyExact implementation
+     * but this method makes the assumption that repetitions > 1
+     */
+    static bool multiplyExact(int64_t value, int64_t repetitions, int64_t* result) {
+      int64_t r = value * repetitions;
+      if (((value < 0 ? -value : value) | repetitions) >> 31 != 0 && r / repetitions != value) {
+        return false;
+      }
+      *result = r;
+      return true;
+    }

Review comment:
       Hi @dongjoon-hyun @wgtmac. Finally passed the test. But it needs a strict review.
   The window environment overflow check I've mimicked the JDK implementation. If there is a better way, please let me know.
   In fact it doesn't work well on clang (the compiler has some unknown behaviour that I don't know about)




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] xndai commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
xndai commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-931504804


   We may just need to do this -
   
   *result = x * y;
   return (x == 0 || y == 0 || *result / x == y);


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717203373



##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       https://docs.microsoft.com/en-us/windows/win32/api/intsafe/nf-intsafe-longadd
   https://docs.microsoft.com/en-us/windows/win32/api/intsafe/nf-intsafe-longmult
   ```c++
   #define __builtin_mul_overflow LongMult
   #define __builtin_add_overflow LongAdd
   ```
   @wgtmac Can you give me some advice, I have very little practice in C++. I don't have a Win machine to hand, so I can't verify if this is possible at the moment.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] wgtmac commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
wgtmac commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r718126855



##########
File path: c++/include/orc/orc-config.hh.in
##########
@@ -75,4 +75,38 @@
   }
 #endif
 
+#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
+  namespace orc {
+    /**
+     * Compute value * repetitions, return false if overflow, return true otherwise
+     * and save the result at the address pointed to by result
+     * imitates the jdk Math.multiplyExact implementation
+     * but this method makes the assumption that repetitions > 1
+     */
+    static bool multiplyExact(int64_t value, int64_t repetitions, int64_t* result) {
+      int64_t r = value * repetitions;
+      if (((value < 0 ? -value : value) | repetitions) >> 31 != 0 && r / repetitions != value) {
+        return false;
+      }
+      *result = r;
+      return true;
+    }

Review comment:
       `#ifdef _MSC_VER` is enough to assert that it is on Windows. You may also leverage cmake_modules/CheckSourceCompiles.cmake to figure out if the gcc builtin functions are available or not. BTW, orc-config.hh.in is to address capability of older c++ standards. If would be better to handle these kinds of functions in Adaptor.hh.in.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] wgtmac commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
wgtmac commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r718126855



##########
File path: c++/include/orc/orc-config.hh.in
##########
@@ -75,4 +75,38 @@
   }
 #endif
 
+#if defined(WIN32) || defined(_WIN32) || defined(__WIN32__) || defined(__NT__)
+  namespace orc {
+    /**
+     * Compute value * repetitions, return false if overflow, return true otherwise
+     * and save the result at the address pointed to by result
+     * imitates the jdk Math.multiplyExact implementation
+     * but this method makes the assumption that repetitions > 1
+     */
+    static bool multiplyExact(int64_t value, int64_t repetitions, int64_t* result) {
+      int64_t r = value * repetitions;
+      if (((value < 0 ? -value : value) | repetitions) >> 31 != 0 && r / repetitions != value) {
+        return false;
+      }
+      *result = r;
+      return true;
+    }

Review comment:
       `#ifdef _MSC_VER` is enough to assert that it is on Windows. You may also leverage cmake_modules/CheckSourceCompiles.cmake to figure out if the gcc builtin functions are available or not. BTW, orc-config.hh.in is to address capability of older c++ standards. If would be better to handle these kinds of functions in Adaptor.hh.in.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930807686


   I think so too


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930793550


   @dongjoon-hyun @wgtmac A small query, do you think this fix needs to be merged into 1.6 and 1.7?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] wgtmac merged pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
wgtmac merged pull request #917:
URL: https://github.com/apache/orc/pull/917


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930704378


   Many thanks to @wgtmac , @dongjoon-hyun  and @pgaref  for reviews and suggestions.  I also learned a lot about C++ in the process of implementing this pr, and it was great to contribute to open source and improve myself. 😄 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] dongjoon-hyun commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930707551


   Thank you so much, @guiyanakuang and @wgtmac !


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717197131



##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       @wgtmac Sorry, I'm still a C++ novice. Let me see how compatible




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] wgtmac commented on a change in pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
wgtmac commented on a change in pull request #917:
URL: https://github.com/apache/orc/pull/917#discussion_r717192800



##########
File path: c++/src/Statistics.hh
##########
@@ -964,7 +964,23 @@ namespace orc {
       _stats.setSum(sum);
     }
 
-    void update(int64_t value, int repetitions);
+    void update(int64_t value, int repetitions) {
+      _stats.updateMinMax(value);
+
+      if (_stats.hasSum()) {
+        if (repetitions > 1) {
+          _stats.setHasSum(__builtin_mul_overflow(value, repetitions, &value) == 0);

Review comment:
       I think Windows will complain about it.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] guiyanakuang commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
guiyanakuang commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-927722022


   ```cmd
   CUSTOMBUILD : error : downloading 'ftp://cygwin.osuosl.org/pub/cygwin/noarch/release/tzdata/tzdata-2020e-1.tar.xz' failed [C:\projects\orc\build\tzdata_ep.vcxproj]
              status_code: 78
              status_string: "Remote file not found"
              log:
              --- LOG BEGIN ---
              timeout on name lookup is not supported
   ```
   https://cygwin.osuosl.org/noarch/release/tzdata/
   Hi @dongjoon-hyun. It looks like the archive tzdata-2020e-1.tar.xz has been deleted and tzdata-2021b-1-src.tar.xz was created two days ago, do we need to update this third party toolchain.
   ![image](https://user-images.githubusercontent.com/4069905/134888938-7f810c06-3efa-4e1d-b620-c59aee6779d0.png)
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [orc] dongjoon-hyun commented on pull request #917: ORC-1008: Fix overflow detection code for C++ int64_t / java long

Posted by GitBox <gi...@apache.org>.
dongjoon-hyun commented on pull request #917:
URL: https://github.com/apache/orc/pull/917#issuecomment-930793927


   At least 1.7?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@orc.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org