Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/10/15 22:57:49 UTC

[GitHub] [arrow] emkornfield opened a new pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

emkornfield opened a new pull request #8475:
URL: https://github.com/apache/arrow/pull/8475


   This provides sufficient coverage to support round trips between C++ and Java.  There are still some gaps in Python.  Based on review, I will open JIRAs to track missing functionality (e.g. Parquet support in C++).  Marking as draft until I can triage CI failures, but early feedback is welcome.
   
   Open questions I have:
   
   [C++] 
   * Should we retain the logic in the decimal() factory function that adjusts the type based on scale/precision, take an explicit argument, or keep it as an alias for decimal128?
   
   [Java]
   * Naming:  Would Decimal256 be better than BigDecimal?
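
To make the first C++ question concrete, here is a minimal sketch, written in Java, of a factory that picks the bit width from the precision. It assumes the usual limits of 38 decimal digits for a 128-bit value and 76 for a 256-bit value. The class and method names are hypothetical and are not part of Arrow.

```java
/** Hypothetical sketch, not Arrow's API: choose a decimal bit width from the precision. */
final class DecimalWidthSketch {
  // Assumed limits: the most decimal digits that always fit in a signed
  // 128-bit and a signed 256-bit integer, respectively.
  static final int MAX_PRECISION_128 = 38;
  static final int MAX_PRECISION_256 = 76;

  /** Returns 128 when the precision fits in 128 bits, otherwise 256. */
  static int bitWidthFor(int precision) {
    if (precision < 1 || precision > MAX_PRECISION_256) {
      throw new IllegalArgumentException("precision out of range: " + precision);
    }
    return precision <= MAX_PRECISION_128 ? 128 : 256;
  }

  public static void main(String[] args) {
    System.out.println(bitWidthFor(10));
    System.out.println(bitWidthFor(39));
  }
}
```

The alternative raised in the question, keeping decimal() as a plain alias for decimal128, would instead reject any precision above 38 and leave the 256-bit type to an explicit factory.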


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508795520



##########
File path: cpp/src/arrow/array/builder_decimal.h
##########
@@ -58,6 +58,35 @@ class ARROW_EXPORT Decimal128Builder : public FixedSizeBinaryBuilder {
   std::shared_ptr<Decimal128Type> decimal_type_;
 };
 
+class ARROW_EXPORT Decimal256Builder : public FixedSizeBinaryBuilder {
+ public:
+  using TypeClass = Decimal256Type;
+
+  explicit Decimal256Builder(const std::shared_ptr<DataType>& type,
+                             MemoryPool* pool = default_memory_pool());
+
+  using FixedSizeBinaryBuilder::Append;
+  using FixedSizeBinaryBuilder::AppendValues;
+  using FixedSizeBinaryBuilder::Reset;
+
+  Status Append(Decimal256 val);

Review comment:
       nice catch.  agreed.




----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509872357



##########
File path: java/vector/src/main/codegen/templates/UnionListWriter.java
##########
@@ -224,6 +249,27 @@ public void writeBigEndianBytesToDecimal(byte[] value, ArrowType arrowType){
     writer.setPosition(writer.idx() + 1);
   }
 
+  public void writeDecimal256(int start, ArrowBuf buffer, ArrowType arrowType) {
+    writer.writeDecimal256(start, buffer, arrowType);

Review comment:
       Should the type of start be long?
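
The concern is easier to see with numbers. A small sketch, assuming a byte offset is computed as an index times the 32-byte width of one 256-bit value (the class and method names are made up for illustration): in int arithmetic the product wraps silently once the index reaches 2^26, while widening to long first does not.

```java
/** Hypothetical sketch: why byte offsets into a large buffer should be long. */
final class OffsetOverflowSketch {
  static final int BYTE_WIDTH = 32; // bytes in one 256-bit value

  // int arithmetic: the product wraps around for index >= 2^26.
  static int offsetAsInt(int index) {
    return index * BYTE_WIDTH;
  }

  // Widening to long before multiplying keeps the full result.
  static long offsetAsLong(int index) {
    return (long) index * BYTE_WIDTH;
  }

  public static void main(String[] args) {
    int index = 1 << 26;
    System.out.println(offsetAsInt(index));
    System.out.println(offsetAsLong(index));
  }
}
```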




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507826076



##########
File path: cpp/src/arrow/type_traits.h
##########
@@ -614,7 +635,7 @@ template <typename T>
 using is_list_type =
     std::integral_constant<bool, std::is_same<T, ListType>::value ||
                                      std::is_same<T, LargeListType>::value ||
-                                     std::is_same<T, FixedSizeListType>::valuae>;
+                                     std::is_same<T, FixedSizeListType>::value>;

Review comment:
       :-)




----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508273170



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;
     PlatformDependent.putLong(addressOfValue, value);
     final long padValue = Long.signum(value) == -1 ? -1L : 0L;
     PlatformDependent.putLong(addressOfValue + Long.BYTES, padValue);
   }
 
+  /**
+   * Write value to the buffer extending it to 32 bytes at the given index. 
+   */
+  public static void writeLongToArrowBufBigDecimal(long value, ArrowBuf bytebuf, int index) {
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 32;
+    PlatformDependent.putLong(addressOfValue, value);
+    final long padValue = Long.signum(value) == -1 ? -1L : 0L;
+    PlatformDependent.putLong(addressOfValue + Long.BYTES, padValue);
+    PlatformDependent.putLong(addressOfValue + 2 * Long.BYTES, padValue);
+    PlatformDependent.putLong(addressOfValue + 3 * Long.BYTES, padValue);

Review comment:
       We should write 4 longs in total, so the last putLong is not needed?
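
For reference, a standalone sketch of the same sign-extension idea, using java.nio.ByteBuffer rather than PlatformDependent so that it runs without Arrow or Netty on the classpath. A 32-byte slot holds four 8-byte longs: one for the value and three for padding. The class and helper names are hypothetical.

```java
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical sketch: sign-extend a long into a 32-byte little-endian slot. */
final class SignExtendSketch {
  static byte[] toLittleEndian256(long value) {
    ByteBuffer buf = ByteBuffer.allocate(32).order(ByteOrder.LITTLE_ENDIAN);
    // All ones for negative values, all zeros otherwise (two's complement).
    long pad = value < 0 ? -1L : 0L;
    buf.putLong(0, value); // least significant 8 bytes
    buf.putLong(8, pad);
    buf.putLong(16, pad);
    buf.putLong(24, pad);  // most significant 8 bytes
    return buf.array();
  }

  // Decode the slot for checking: BigInteger expects big-endian bytes.
  static BigInteger fromLittleEndian(byte[] le) {
    byte[] be = new byte[le.length];
    for (int i = 0; i < le.length; i++) {
      be[i] = le[le.length - 1 - i];
    }
    return new BigInteger(be);
  }

  public static void main(String[] args) {
    System.out.println(fromLittleEndian(toLittleEndian256(-5L)));
    System.out.println(fromLittleEndian(toLittleEndian256(Long.MAX_VALUE)));
  }
}
```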




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509436836



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;

Review comment:
       I reverted this to its original contents and inlined the writeLong method for BigDecimal into BigDecimal256.

##########
File path: java/vector/src/main/codegen/templates/ArrowType.java
##########
@@ -165,7 +165,20 @@ public final T visit(${type.name?remove_ending("_")} type) {
     ${fieldType} ${field.name};
     </#list>
 
+
+    <#if type.name == "Decimal">
+    // Needed to support golden file integration tests.
+    @JsonCreator
+    public static Decimal createDecimal128(

Review comment:
       yes, you are right, thanks for your attention to detail.

##########
File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
##########
@@ -75,7 +75,7 @@ public void endList() {
 
   <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
     <#assign fields = minor.fields!type.fields />
-  <#if minor.class != "Decimal">
+  <#if minor.class != "Decimal" && minor.class != "BigDecimal">

Review comment:
       nice catch.




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508730344



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];
+    for (int i = 0; i < intValues.length / 2; i++) {
+      intValues[i] = 1 << i + 1;
+      intValues[2 * i] = -1 * (1 << i + 1);
+    }
+  }
+
+  private int scale = 3;
+
+  private BufferAllocator allocator;
+
+  @Before
+  public void init() {
+    allocator = new DirtyRootAllocator(Long.MAX_VALUE, (byte) 100);
+  }
+
+  @After
+  public void terminate() throws Exception {
+    allocator.close();
+  }
+
+  @Test
+  public void testValuesWriteRead() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(10, scale, 256), allocator);) {
+
+      try (Decimal256Vector oldConstructor = new Decimal256Vector("decimal", allocator, 10, scale);) {
+        assertEquals(decimalVector.getField().getType(), oldConstructor.getField().getType());
+      }
+
+      decimalVector.allocateNew();
+      BigDecimal[] values = new BigDecimal[intValues.length];
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(intValues[i]), scale);
+        values[i] = decimal;
+        decimalVector.setSafe(i, decimal);
+      }
+
+      decimalVector.setValueCount(intValues.length);
+
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals("unexpected data at index: " + i, values[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void testDecimal256DifferentScaleAndPrecision() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(4, 2, 256), allocator);) {
+      decimalVector.allocateNew();
+
+      // test Decimal256 with different scale
+      boolean hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(0), 3);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;

Review comment:
       done. also updated these to use assertThrows.
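
As a rough illustration of what that change looks like: JUnit 4.13 provides Assert.assertThrows(Class, ThrowingRunnable), which replaces the hasError/try/catch/finally pattern in the hunk above. The sketch below uses a small local stand-in (expectThrows) so that it runs without JUnit, and a hypothetical setWithScaleCheck in place of Decimal256Vector.setSafe.

```java
import java.math.BigDecimal;
import java.math.BigInteger;

/** Hypothetical sketch of the assertThrows pattern; not the PR's actual test code. */
final class AssertThrowsSketch {
  interface ThrowingRunnable {
    void run() throws Throwable;
  }

  // Local stand-in for JUnit's Assert.assertThrows, so the sketch is self-contained.
  static <T extends Throwable> T expectThrows(Class<T> expected, ThrowingRunnable body) {
    try {
      body.run();
    } catch (Throwable t) {
      if (expected.isInstance(t)) {
        return expected.cast(t);
      }
      throw new AssertionError("unexpected exception type: " + t.getClass().getName(), t);
    }
    throw new AssertionError("expected " + expected.getName() + " but nothing was thrown");
  }

  // Stand-in for a vector setter that rejects a value whose scale differs.
  static void setWithScaleCheck(BigDecimal value, int vectorScale) {
    if (value.scale() != vectorScale) {
      throw new UnsupportedOperationException("scale mismatch");
    }
  }

  public static void main(String[] args) {
    BigDecimal wrongScale = new BigDecimal(BigInteger.ZERO, 3);
    UnsupportedOperationException e = expectThrows(
        UnsupportedOperationException.class, () -> setWithScaleCheck(wrongScale, 2));
    System.out.println("caught: " + e.getMessage());
  }
}
```

Compared with the boolean flag, this form fails with a clear message when nothing is thrown and does not need a finally block.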




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509017441



##########
File path: cpp/src/arrow/array/validate.cc
##########
@@ -64,6 +64,13 @@ struct ValidateArrayVisitor {
     return Status::OK();
   }
 
+  Status Visit(const Decimal256Array& array) {

Review comment:
       Yes, I tried to make this work, but at the moment it seems it would make this change bigger than I would feel comfortable with.  There are a lot of type_traits with confusing hierarchies (is_primitive and is_binary_like would both include Decimal, and SFINAE doesn't work out well), so it would be an intrusive change.




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507818888



##########
File path: cpp/src/arrow/ipc/metadata_internal.cc
##########
@@ -236,7 +236,8 @@ static inline TimeUnit::type FromFlatbufferUnit(flatbuf::TimeUnit unit) {
   return TimeUnit::SECOND;
 }
 
-constexpr int32_t kDecimalBitWidth = 128;
+constexpr int32_t kDecimalBitWidth128 = 128;
+constexpr int32_t kDecimalBitWidth256 = 256;

Review comment:
       I'm frankly not sure those pleonastic constants are useful.




----------------------------------------------------------------

[GitHub] [arrow] github-actions[bot] commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

github-actions[bot] commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-709635438


   https://issues.apache.org/jira/browse/ARROW-9747


----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-713339080


   @pitrou I think I addressed or responded to your comments.  I think I might need to rebase to get CI testing, but I'm going to hold off doing that if you want to take another look at how I handled your comments.


----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-714223728


   @liyafan82 thanks for the typo catches.  I addressed the comments.  Any other concerns?
   
   


----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508791783



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {}
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];

Review comment:
       for some reason this is how "make format" wants it to be
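
For readers following the Negate() logic in the hunk above, here is a small sketch of the same two's complement negation over four 64-bit limbs, written in Java with a long[] whose element 0 is least significant. It is an illustration under those assumptions, not the C++ implementation itself.

```java
/** Hypothetical sketch: two's complement negation of a multi-limb integer. */
final class NegateSketch {
  // limbs[0] is the least significant 64 bits.
  static long[] negate(long[] limbs) {
    long[] out = new long[limbs.length];
    long carry = 1; // negation is "invert all bits, then add one"
    for (int i = 0; i < limbs.length; i++) {
      out[i] = ~limbs[i] + carry;
      // The carry propagates only while the addition wrapped around to zero.
      if (out[i] != 0) {
        carry = 0;
      }
    }
    return out;
  }

  public static void main(String[] args) {
    long[] minusOne = negate(new long[] {1, 0, 0, 0});
    System.out.println(java.util.Arrays.toString(minusOne));
  }
}
```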




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507820725



##########
File path: cpp/src/arrow/python/decimal.cc
##########
@@ -145,8 +146,9 @@ Status DecimalFromPythonDecimal(PyObject* python_decimal, const DecimalType& arr
   return DecimalFromStdString(string, arrow_type, out);
 }
 
-Status DecimalFromPyObject(PyObject* obj, const DecimalType& arrow_type,
-                           Decimal128* out) {
+template <typename ArrowDecimal>
+Status InternalDecimalFromPyObject(PyObject* obj, const DecimalType& arrow_type,
+                                   ArrowDecimal* out) {

Review comment:
       In the implementation below you should call `InternalDecimalFromPythonDecimal<ArrowDecimal>` instead of `DecimalFromPythonDecimal`.




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508693097



##########
File path: cpp/src/parquet/arrow/reader_internal.cc
##########
@@ -645,7 +645,9 @@ static Status DecimalIntegerTransfer(RecordReader* reader, MemoryPool* pool,
 template <typename ParquetType>
 Status TransferDecimal(RecordReader* reader, MemoryPool* pool,
                        const std::shared_ptr<DataType>& type, Datum* out) {
-  DCHECK_EQ(type->id(), ::arrow::Type::DECIMAL);
+  if (type->id() != ::arrow::Type::DECIMAL128) {
+    return Status::Invalid("Only reading decimal128 types is currently supported");

Review comment:
       done.




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509362379



##########
File path: cpp/src/arrow/type.cc
##########
@@ -131,6 +133,7 @@ std::string ToString(Type::type id) {
     TO_STRING_CASE(FLOAT)
     TO_STRING_CASE(DOUBLE)
     TO_STRING_CASE(DECIMAL)

Review comment:
       Ok. Maybe we can deprecate the alias at some point?




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508732622



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];
+    for (int i = 0; i < intValues.length / 2; i++) {
+      intValues[i] = 1 << i + 1;
+      intValues[2 * i] = -1 * (1 << i + 1);
+    }
+  }
+
+  private int scale = 3;
+
+  private BufferAllocator allocator;
+
+  @Before
+  public void init() {
+    allocator = new DirtyRootAllocator(Long.MAX_VALUE, (byte) 100);
+  }
+
+  @After
+  public void terminate() throws Exception {
+    allocator.close();
+  }
+
+  @Test
+  public void testValuesWriteRead() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(10, scale, 256), allocator);) {
+
+      try (Decimal256Vector oldConstructor = new Decimal256Vector("decimal", allocator, 10, scale);) {
+        assertEquals(decimalVector.getField().getType(), oldConstructor.getField().getType());
+      }
+
+      decimalVector.allocateNew();
+      BigDecimal[] values = new BigDecimal[intValues.length];
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(intValues[i]), scale);
+        values[i] = decimal;
+        decimalVector.setSafe(i, decimal);
+      }
+
+      decimalVector.setValueCount(intValues.length);
+
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals("unexpected data at index: " + i, values[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void testDecimal256DifferentScaleAndPrecision() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(4, 2, 256), allocator);) {
+      decimalVector.allocateNew();
+
+      // test Decimal256 with different scale
+      boolean hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(0), 3);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+
+      // test BigDecimal with larger precision than initialized
+      hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(12345), 2);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+    }
+  }
+
+  @Test
+  public void testWriteBigEndian() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(38, 18, 256), allocator);) {
+      decimalVector.allocateNew();
+      BigDecimal decimal1 = new BigDecimal("123456789.000000000000000000");
+      BigDecimal decimal2 = new BigDecimal("11.123456789123456789");
+      BigDecimal decimal3 = new BigDecimal("1.000000000000000000");
+      BigDecimal decimal4 = new BigDecimal("0.111111111000000000");
+      BigDecimal decimal5 = new BigDecimal("987654321.123456789000000000");
+      BigDecimal decimal6 = new BigDecimal("222222222222.222222222000000000");
+      BigDecimal decimal7 = new BigDecimal("7777777777777.666666667000000000");
+      BigDecimal decimal8 = new BigDecimal("1212121212.343434343000000000");

Review comment:
       nulls aren't supported (this is consistent with Decimal).




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508062401



##########
File path: cpp/src/arrow/type.cc
##########
@@ -131,6 +133,7 @@ std::string ToString(Type::type id) {
     TO_STRING_CASE(FLOAT)
     TO_STRING_CASE(DOUBLE)
     TO_STRING_CASE(DECIMAL)

Review comment:
       This one should.  We have DECIMAL for backwards compatibility; I think the remaining places where it is used are places we will need to update to support Decimal256.  By leaving them as DECIMAL we can find them easily by commenting out the alias.  Does this sound reasonable?




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509021509



##########
File path: cpp/src/arrow/array/validate.cc
##########
@@ -64,6 +64,13 @@ struct ValidateArrayVisitor {
     return Status::OK();
   }
 
+  Status Visit(const Decimal256Array& array) {

Review comment:
       FWIW, https://github.com/apache/arrow/pull/8417/files is probably what some of it would look like but I haven't reviewed it fully.





[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508730745



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];
+    for (int i = 0; i < intValues.length / 2; i++) {
+      intValues[i] = 1 << i + 1;
+      intValues[2 * i] = -1 * (1 << i + 1);
+    }
+  }
+
+  private int scale = 3;
+
+  private BufferAllocator allocator;
+
+  @Before
+  public void init() {
+    allocator = new DirtyRootAllocator(Long.MAX_VALUE, (byte) 100);
+  }
+
+  @After
+  public void terminate() throws Exception {
+    allocator.close();
+  }
+
+  @Test
+  public void testValuesWriteRead() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(10, scale, 256), allocator);) {
+
+      try (Decimal256Vector oldConstructor = new Decimal256Vector("decimal", allocator, 10, scale);) {
+        assertEquals(decimalVector.getField().getType(), oldConstructor.getField().getType());
+      }
+
+      decimalVector.allocateNew();
+      BigDecimal[] values = new BigDecimal[intValues.length];
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(intValues[i]), scale);
+        values[i] = decimal;
+        decimalVector.setSafe(i, decimal);
+      }
+
+      decimalVector.setValueCount(intValues.length);
+
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals("unexpected data at index: " + i, values[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void testDecimal256DifferentScaleAndPrecision() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(4, 2, 256), allocator);) {
+      decimalVector.allocateNew();
+
+      // test Decimal256 with different scale
+      boolean hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(0), 3);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+
+      // test BigDecimal with larger precision than initialized
+      hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(12345), 2);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+    }
+  }
+
+  @Test
+  public void testWriteBigEndian() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(38, 18, 256), allocator);) {
+      decimalVector.allocateNew();
+      BigDecimal decimal1 = new BigDecimal("123456789.000000000000000000");
+      BigDecimal decimal2 = new BigDecimal("11.123456789123456789");
+      BigDecimal decimal3 = new BigDecimal("1.000000000000000000");
+      BigDecimal decimal4 = new BigDecimal("0.111111111000000000");
+      BigDecimal decimal5 = new BigDecimal("987654321.123456789000000000");
+      BigDecimal decimal6 = new BigDecimal("222222222222.222222222000000000");
+      BigDecimal decimal7 = new BigDecimal("7777777777777.666666667000000000");
+      BigDecimal decimal8 = new BigDecimal("1212121212.343434343000000000");
+
+      byte[] decimalValue1 = decimal1.unscaledValue().toByteArray();
+      byte[] decimalValue2 = decimal2.unscaledValue().toByteArray();
+      byte[] decimalValue3 = decimal3.unscaledValue().toByteArray();
+      byte[] decimalValue4 = decimal4.unscaledValue().toByteArray();
+      byte[] decimalValue5 = decimal5.unscaledValue().toByteArray();
+      byte[] decimalValue6 = decimal6.unscaledValue().toByteArray();
+      byte[] decimalValue7 = decimal7.unscaledValue().toByteArray();
+      byte[] decimalValue8 = decimal8.unscaledValue().toByteArray();
+
+      decimalVector.setBigEndian(0, decimalValue1);
+      decimalVector.setBigEndian(1, decimalValue2);
+      decimalVector.setBigEndian(2, decimalValue3);
+      decimalVector.setBigEndian(3, decimalValue4);
+      decimalVector.setBigEndian(4, decimalValue5);
+      decimalVector.setBigEndian(5, decimalValue6);
+      decimalVector.setBigEndian(6, decimalValue7);
+      decimalVector.setBigEndian(7, decimalValue8);
+
+      decimalVector.setValueCount(8);
+      assertEquals(8, decimalVector.getValueCount());
+      assertEquals(decimal1, decimalVector.getObject(0));
+      assertEquals(decimal2, decimalVector.getObject(1));
+      assertEquals(decimal3, decimalVector.getObject(2));
+      assertEquals(decimal4, decimalVector.getObject(3));
+      assertEquals(decimal5, decimalVector.getObject(4));
+      assertEquals(decimal6, decimalVector.getObject(5));
+      assertEquals(decimal7, decimalVector.getObject(6));
+      assertEquals(decimal8, decimalVector.getObject(7));
+    }
+  }
+
+  @Test
+  public void testLongReadWrite() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(38, 0, 256), allocator)) {
+      decimalVector.allocateNew();
+
+      long[] longValues = {0L, -2L, Long.MAX_VALUE, Long.MIN_VALUE, 187L};
+
+      for (int i = 0; i < longValues.length; ++i) {
+        decimalVector.set(i, longValues[i]);
+      }
+
+      decimalVector.setValueCount(longValues.length);
+
+      for (int i = 0; i < longValues.length; ++i) {
+        assertEquals(new BigDecimal(longValues[i]), decimalVector.getObject(i));
+      }
+    }
+  }
+
+
+  @Test
+  public void testBigDecimalReadWrite() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(38, 9, 256), allocator);) {
+      decimalVector.allocateNew();
+      BigDecimal decimal1 = new BigDecimal("123456789.000000000");
+      BigDecimal decimal2 = new BigDecimal("11.123456789");
+      BigDecimal decimal3 = new BigDecimal("1.000000000");
+      BigDecimal decimal4 = new BigDecimal("-0.111111111");
+      BigDecimal decimal5 = new BigDecimal("-987654321.123456789");
+      BigDecimal decimal6 = new BigDecimal("-222222222222.222222222");
+      BigDecimal decimal7 = new BigDecimal("7777777777777.666666667");
+      BigDecimal decimal8 = new BigDecimal("1212121212.343434343");
+
+      decimalVector.set(0, decimal1);
+      decimalVector.set(1, decimal2);
+      decimalVector.set(2, decimal3);
+      decimalVector.set(3, decimal4);
+      decimalVector.set(4, decimal5);
+      decimalVector.set(5, decimal6);
+      decimalVector.set(6, decimal7);
+      decimalVector.set(7, decimal8);
+
+      decimalVector.setValueCount(8);
+      assertEquals(8, decimalVector.getValueCount());
+      assertEquals(decimal1, decimalVector.getObject(0));
+      assertEquals(decimal2, decimalVector.getObject(1));
+      assertEquals(decimal3, decimalVector.getObject(2));
+      assertEquals(decimal4, decimalVector.getObject(3));
+      assertEquals(decimal5, decimalVector.getObject(4));
+      assertEquals(decimal6, decimalVector.getObject(5));
+      assertEquals(decimal7, decimalVector.getObject(6));
+      assertEquals(decimal8, decimalVector.getObject(7));
+    }
+  }
+
+  /**
+   * Test {@link Decimal256Vector#setBigEndian(int, byte[])} which takes BE layout input and stores in LE layout.
+   * Cases to cover: input byte array in different lengths in range [1-16] and negative values.
+   */
+  @Test
+  public void decimalBE2LE() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(23, 2, 256), allocator)) {
+      decimalVector.allocateNew();
+
+      BigInteger[] testBigInts = new BigInteger[] {
+          new BigInteger("0"),
+          new BigInteger("-1"),
+          new BigInteger("23"),
+          new BigInteger("234234"),
+          new BigInteger("-234234234"),
+          new BigInteger("234234234234"),
+          new BigInteger("-56345345345345"),
+          new BigInteger("2982346298346289346293467923465345634500"), // converts to 16+ byte array
+          new BigInteger("-389457298347598237459832459823434653600"), // converts to 16+ byte array
+          new BigInteger("-345345"),
+          new BigInteger("754533")
+      };
+
+      int insertionIdx = 0;
+      insertionIdx++; // insert a null
+      for (BigInteger val : testBigInts) {
+        decimalVector.setBigEndian(insertionIdx++, val.toByteArray());
+      }
+      insertionIdx++; // insert a null
+      // insert a zero length buffer
+      decimalVector.setBigEndian(insertionIdx++, new byte[0]);
+
+      // Try inserting a buffer larger than 33 bytes and expect a failure
+      try {
+        decimalVector.setBigEndian(insertionIdx, new byte[33]);
+        fail("above statement should have failed");
+      } catch (IllegalArgumentException ex) {
+        assertTrue(ex.getMessage().equals("Invalid decimal value length. Valid length in [1 - 32], got 33"));
+      }
+      decimalVector.setValueCount(insertionIdx);
+
+      // retrieve values and check if they are correct
+      int outputIdx = 0;
+      assertTrue(decimalVector.isNull(outputIdx++));
+      for (BigInteger expected : testBigInts) {
+        final BigDecimal actual = decimalVector.getObject(outputIdx++);
+        assertEquals(expected, actual.unscaledValue());
+      }
+      assertTrue(decimalVector.isNull(outputIdx++));
+      assertEquals(BigInteger.valueOf(0), decimalVector.getObject(outputIdx).unscaledValue());
+    }
+  }
+
+  @Test
+  public void setUsingArrowBufOfLEInts() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(5, 2, 256), allocator);
+         ArrowBuf buf = allocator.buffer(8);) {
+      decimalVector.allocateNew();
+
+      // add a positive value equivalent to 705.32
+      int val = 70532;
+      buf.setInt(0, val);
+      decimalVector.setSafe(0, 0, buf, 4);
+
+      // add a -ve value equivalent to -705.32
+      val = -70532;
+      buf.setInt(4, val);
+      decimalVector.setSafe(1, 4, buf, 4);
+
+      decimalVector.setValueCount(2);
+
+      BigDecimal [] expectedValues = new BigDecimal[] {BigDecimal.valueOf(705.32), BigDecimal
+              .valueOf(-705.32)};
+      for (int i = 0; i < 2; i ++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals(expectedValues[i], value);
+      }
+    }
+
+  }
+
+  @Test
+  public void setUsingArrowLongLEBytes() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(18, 0, 256), allocator);
+         ArrowBuf buf = allocator.buffer(16);) {
+      decimalVector.allocateNew();
+
+      long val = Long.MAX_VALUE;
+      buf.setLong(0, val);
+      decimalVector.setSafe(0, 0, buf, 8);
+
+      val = Long.MIN_VALUE;
+      buf.setLong(8, val);
+      decimalVector.setSafe(1, 8, buf, 8);
+
+      decimalVector.setValueCount(2);
+
+      BigDecimal [] expectedValues = new BigDecimal[] {BigDecimal.valueOf(Long.MAX_VALUE), BigDecimal
+              .valueOf(Long.MIN_VALUE)};
+      for (int i = 0; i < 2; i ++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals(expectedValues[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void setUsingArrowBufOfBEBytes() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(5, 2, 256), allocator);
+         ArrowBuf buf = allocator.buffer(9);) {
+      BigDecimal [] expectedValues = new BigDecimal[] {BigDecimal.valueOf(705.32), BigDecimal
+              .valueOf(-705.32), BigDecimal.valueOf(705.32)};
+      verifyWritingArrowBufWithBigEndianBytes(decimalVector, buf, expectedValues, 3);
+    }
+
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(43, 2, 256), allocator);
+         ArrowBuf buf = allocator.buffer(45);) {
+      BigDecimal[] expectedValues = new BigDecimal[] {new BigDecimal("29823462983462893462934679234653450000000.63"),
+                                                      new BigDecimal("-2982346298346289346293467923465345.63"),
+                                                      new BigDecimal("2982346298346289346293467923465345.63")};
+      verifyWritingArrowBufWithBigEndianBytes(decimalVector, buf, expectedValues, 15);
+    }
+  }
+
+  private void verifyWritingArrowBufWithBigEndianBytes(Decimal256Vector decimalVector,
+                                                       ArrowBuf buf, BigDecimal[] expectedValues,
+                                                       int length) {
+    decimalVector.allocateNew();
+    for (int i = 0; i < expectedValues.length; i++) {
+      byte []bigEndianBytes = expectedValues[i].unscaledValue().toByteArray();

Review comment:
       done.





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509370051



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -254,67 +254,125 @@ BasicDecimal128& BasicDecimal128::operator>>=(uint32_t bits) {
 
 namespace {
 
-// TODO: Remove this guard once it's used by BasicDecimal256
-#ifndef ARROW_USE_NATIVE_INT128
-// This method losslessly multiplies x and y into a 128 bit unsigned integer
-// whose high bits will be stored in hi and low bits in lo.
-void ExtendAndMultiplyUint64(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
+// Convenience wrapper type over 128 bit unsigned integers. We opt not to
+// replace the uint128_t type in int128_internal.h because it would require
+// significantly more implementation work to be done. This class merely
+// provides the minimum necessary set of functions to perform 128+ bit
+// multiplication operations when there may or may not be native support.
 #ifdef ARROW_USE_NATIVE_INT128
-  const __uint128_t r = static_cast<__uint128_t>(x) * y;
-  *lo = r & kInt64Mask;
-  *hi = r >> 64;
+struct uint128_t {
+  uint128_t() {}
+  uint128_t(uint64_t hi, uint64_t lo) : val_((static_cast<__uint128_t>(hi) << 64) | lo) {}
+  explicit uint128_t(const BasicDecimal128& decimal) {
+    val_ = (static_cast<__uint128_t>(decimal.high_bits()) << 64) | decimal.low_bits();
+  }
+
+  uint64_t hi() { return val_ >> 64; }
+  uint64_t lo() { return val_ & kInt64Mask; }
+
+  uint128_t& operator+=(const uint128_t& other) {
+    val_ += other.val_;
+    return *this;
+  }
+
+  uint128_t& operator*=(const uint128_t& other) {
+    val_ *= other.val_;
+    return *this;
+  }
+
+  __uint128_t val_;
+};
+
 #else
-  // If we can't use a native fallback, perform multiplication
+// Multiply two 64 bit word components into a 128 bit result, with high bits
+// stored in hi and low bits in lo.
+inline void ExtendAndMultiply(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
+  // Perform multiplication on two 64 bit words x and y into a 128 bit result
   // by splitting up x and y into 32 bit high/low bit components,
   // allowing us to represent the multiplication as
   // x * y = x_lo * y_lo + x_hi * y_lo * 2^32 + y_hi * x_lo * 2^32
-  // + x_hi * y_hi * 2^64.
+  // + x_hi * y_hi * 2^64
   //
-  // Now, consider the final output as lo_lo || lo_hi || hi_lo || hi_hi.
+  // Now, consider the final output as lo_lo || lo_hi || hi_lo || hi_hi
   // Therefore,
   // lo_lo is (x_lo * y_lo)_lo,
   // lo_hi is ((x_lo * y_lo)_hi + (x_hi * y_lo)_lo + (x_lo * y_hi)_lo)_lo,
   // hi_lo is ((x_hi * y_hi)_lo + (x_hi * y_lo)_hi + (x_lo * y_hi)_hi)_hi,
   // hi_hi is (x_hi * y_hi)_hi
-  const uint64_t x_lo = x & kIntMask;
-  const uint64_t y_lo = y & kIntMask;
+  const uint64_t x_lo = x & kInt32Mask;
+  const uint64_t y_lo = y & kInt32Mask;
   const uint64_t x_hi = x >> 32;
   const uint64_t y_hi = y >> 32;
 
   const uint64_t t = x_lo * y_lo;
-  const uint64_t t_lo = t & kIntMask;
+  const uint64_t t_lo = t & kInt32Mask;
   const uint64_t t_hi = t >> 32;
 
   const uint64_t u = x_hi * y_lo + t_hi;
-  const uint64_t u_lo = u & kIntMask;
+  const uint64_t u_lo = u & kInt32Mask;
   const uint64_t u_hi = u >> 32;
 
   const uint64_t v = x_lo * y_hi + u_lo;
   const uint64_t v_hi = v >> 32;
 
   *hi = x_hi * y_hi + u_hi + v_hi;
-  *lo = (v << 32) | t_lo;
-#endif
+  *lo = (v << 32) + t_lo;
 }
-#endif
 
-void MultiplyUint128(uint64_t x_hi, uint64_t x_lo, uint64_t y_hi, uint64_t y_lo,
-                     uint64_t* hi, uint64_t* lo) {
-#ifdef ARROW_USE_NATIVE_INT128
-  const __uint128_t x = (static_cast<__uint128_t>(x_hi) << 64) | x_lo;
-  const __uint128_t y = (static_cast<__uint128_t>(y_hi) << 64) | y_lo;
-  const __uint128_t r = x * y;
-  *lo = r & kInt64Mask;
-  *hi = r >> 64;
-#else
-  // To perform 128 bit multiplication without a native fallback
-  // we first perform lossless 64 bit multiplication of the low
-  // bits, and then add x_hi * y_lo and x_lo * y_hi to the high
-  // bits. Note that we can skip adding x_hi * y_hi because it
-  // always will be over 128 bits.
-  ExtendAndMultiplyUint64(x_lo, y_lo, hi, lo);
-  *hi += (x_hi * y_lo) + (x_lo * y_hi);
+struct uint128_t {
+  uint128_t() {}
+  uint128_t(uint64_t hi, uint64_t lo) : hi_(hi), lo_(lo) {}
+  explicit uint128_t(const BasicDecimal128& decimal) {
+    hi_ = decimal.high_bits();
+    lo_ = decimal.low_bits();
+  }
+
+  uint64_t hi() const { return hi_; }
+  uint64_t lo() const { return lo_; }
+
+  uint128_t& operator+=(const uint128_t& other) {
+    // To deduce the carry bit, we perform "65 bit" addition on the low bits and
+    // seeing if the resulting high bit is 1. This is accomplished by shifting the
+    // low bits to the right by 1 (chopping off the lowest bit), then adding 1 if the
+    // result of adding the two chopped bits would have produced a carry.
+    uint64_t carry = (((lo_ & other.lo_) & 1) + (lo_ >> 1) + (other.lo_ >> 1)) >> 63;
+    hi_ += other.hi_ + carry;
+    lo_ += other.lo_;
+    return *this;
+  }
+
+  uint128_t& operator*=(const uint128_t& other) {
+    uint128_t r;
+    ExtendAndMultiply(lo_, other.lo_, &r.hi_, &r.lo_);
+    r.hi_ += (hi_ * other.lo_) + (lo_ * other.hi_);
+    *this = r;
+    return *this;
+  }
+
+  uint64_t hi_;
+  uint64_t lo_;
+};
 #endif
+
+// Multiplies two N * 64 bit unsigned integer types, represented by a uint64_t
+// array into a same sized output. Elements in the array should be in
+// little endian order, and output will be the same. Overflow in multiplication
+// will result in the lower N * 64 bits of the result being set.
+template <int N>
+inline void MultiplyUnsignedArray(const std::array<uint64_t, N>& lh,
+                                  const std::array<uint64_t, N>& rh,
+                                  std::array<uint64_t, N>* result) {
+  for (int j = 0; j < N; ++j) {
+    uint64_t carry = 0;
+    for (int i = 0; i < N - j; ++i) {
+      uint128_t tmp(lh[i]);
+      tmp *= uint128_t(rh[j]);

Review comment:
       I may be misunderstanding, but I don't see a `uint128_t(uint64_t)` constructor?
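
For reference, a minimal sketch of what a single-argument constructor could look like on the portable (non-native) path. The `U128` name, its public fields, and `MultiplyWords` are assumptions made for illustration, not the PR's `uint128_t`; the multiplication follows the 32-bit split described in the quoted comments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical portable 128-bit unsigned wrapper (illustration only).
struct U128 {
  uint64_t hi = 0;
  uint64_t lo = 0;

  U128() = default;
  U128(uint64_t hi_bits, uint64_t lo_bits) : hi(hi_bits), lo(lo_bits) {}
  // Single-argument constructor: zero-extends one 64-bit word, which is
  // what an expression like `U128 tmp(word)` would need.
  explicit U128(uint64_t value) : hi(0), lo(value) {}

  U128& operator+=(const U128& other) {
    const uint64_t sum = lo + other.lo;
    // Unsigned addition wrapped around iff the sum is smaller than an operand.
    const uint64_t carry = sum < lo ? 1 : 0;
    hi += other.hi + carry;
    lo = sum;
    return *this;
  }
};

// Full 64 x 64 -> 128 bit product built from 32-bit halves.
inline U128 MultiplyWords(uint64_t x, uint64_t y) {
  const uint64_t mask = 0xFFFFFFFFULL;
  const uint64_t x_lo = x & mask;
  const uint64_t x_hi = x >> 32;
  const uint64_t y_lo = y & mask;
  const uint64_t y_hi = y >> 32;

  const uint64_t t = x_lo * y_lo;
  const uint64_t u = x_hi * y_lo + (t >> 32);   // cannot overflow 64 bits
  const uint64_t v = x_lo * y_hi + (u & mask);  // cannot overflow 64 bits

  return U128(x_hi * y_hi + (u >> 32) + (v >> 32), (v << 32) | (t & mask));
}
```

Marking the constructor `explicit` is a design choice in this sketch: it avoids silently promoting a plain `uint64_t` wherever a `U128` is expected.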





[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508781965



##########
File path: cpp/src/arrow/python/decimal.cc
##########
@@ -145,8 +146,9 @@ Status DecimalFromPythonDecimal(PyObject* python_decimal, const DecimalType& arr
   return DecimalFromStdString(string, arrow_type, out);
 }
 
-Status DecimalFromPyObject(PyObject* obj, const DecimalType& arrow_type,
-                           Decimal128* out) {
+template <typename ArrowDecimal>
+Status InternalDecimalFromPyObject(PyObject* obj, const DecimalType& arrow_type,
+                                   ArrowDecimal* out) {

Review comment:
       done.





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507836798



##########
File path: cpp/src/arrow/util/decimal_test.cc
##########
@@ -26,6 +26,7 @@
 #include <vector>
 
 #include <gtest/gtest.h>
+#include <boost/multiprecision/cpp_int.hpp>

Review comment:
       Ah, I see that we also use `int256_t`...





[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508697896



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;
     PlatformDependent.putLong(addressOfValue, value);
     final long padValue = Long.signum(value) == -1 ? -1L : 0L;
     PlatformDependent.putLong(addressOfValue + Long.BYTES, padValue);
   }
 
+  /**
+   * Write value to the buffer extending it to 32 bytes at the given index. 
+   */
+  public static void writeLongToArrowBufBigDecimal(long value, ArrowBuf bytebuf, int index) {
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 32;
+    PlatformDependent.putLong(addressOfValue, value);
+    final long padValue = Long.signum(value) == -1 ? -1L : 0L;
+    PlatformDependent.putLong(addressOfValue + Long.BYTES, padValue);
+    PlatformDependent.putLong(addressOfValue + 2 * Long.BYTES, padValue);
+    PlatformDependent.putLong(addressOfValue + 3 * Long.BYTES, padValue);

Review comment:
       I might be miscounting or misunderstanding, but I only count 4 putLong calls here?
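
The arithmetic behind the count: 32 bytes / 8 bytes per long = 4 words, one for the value and three for the sign padding. A minimal C++ sketch of the same sign extension follows; `SignExtendTo256` is a hypothetical helper written for illustration, not part of the PR, and it returns the words rather than writing them into a buffer.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Sign-extend a 64-bit value to 256 bits, returned as four 64-bit words in
// little-endian word order (word 0 is the least significant).
inline std::array<uint64_t, 4> SignExtendTo256(int64_t value) {
  // Negative values are padded with all-one words, others with zero words.
  const uint64_t pad = value < 0 ? ~uint64_t{0} : uint64_t{0};
  return std::array<uint64_t, 4>{{static_cast<uint64_t>(value), pad, pad, pad}};
}
```

Each of the four words corresponds to one 8-byte store, which matches the four `putLong` calls in the quoted Java helper.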





[GitHub] [arrow] emkornfield commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-714893875


   Thanks for the reviews @liyafan82 and @pitrou.  I rebased, and I'll merge when green and open up some follow-up JIRAs.



[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507834391



##########
File path: cpp/src/arrow/util/decimal_benchmark.cc
##########
@@ -191,6 +206,7 @@ static void BinaryBitOp(benchmark::State& state) {  // NOLINT non-const referenc
 BENCHMARK(FromString);
 BENCHMARK(ToString);
 BENCHMARK(BinaryMathOp);

Review comment:
       Rename this to `BinaryMathOp128`?





[GitHub] [arrow] liyafan82 commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-714226000


   > @liyafan82 thanks for the typo catches. I addressed the comments. Any other concerns?
   
   Sorry for my delayed review. Hopefully, I will finish the second pass today. 



[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506046740



##########
File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
##########
@@ -106,6 +106,28 @@ public void writeBigEndianBytesToDecimal(byte[] value, ArrowType arrowType) {
   public void writeBigEndianBytesToDecimal(byte[] value) {
     getWriter(MinorType.DECIMAL).writeBigEndianBytesToDecimal(value);
   }
+  <#elseif minor.class == "BigDecimal">
+  @Override
+  public void write(BigDecimalHolder holder) {
+    getWriter(MinorType.BIGDECIMAL).write(holder);
+  }
+
+  public void writeBigDecimal(int start, ArrowBuf buffer, ArrowType arrowType) {
+    getWriter(MinorType.BIGDECIMAL).writeBigDecimal(start, buffer, arrowType);
+  }
+
+  public void writeBigDecimal(int start, ArrowBuf buffer) {

Review comment:
       The type of `start` should be `long`, as it represents the start offset in the ArrowBuf, just like in https://github.com/apache/arrow/pull/8455





[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508866742



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];
+#endif
+}
+
+BasicDecimal256& BasicDecimal256::operator*=(const BasicDecimal256& right) {
+  // Since the max value of BasicDecimal256 is supposed to be 1e76 - 1 and the
+  // min the negation taking the absolute values here should always be safe.
+  const bool negate = Sign() != right.Sign();
+  BasicDecimal256 x = BasicDecimal256::Abs(*this);
+  BasicDecimal256 y = BasicDecimal256::Abs(right);
+
+  uint128_t r_hi;
+  uint128_t r_lo;
+  std::array<uint64_t, 4> res{0, 0, 0, 0};
+  MultiplyUnsignedArray<4>(x.little_endian_array_, y.little_endian_array_, &res);
+  little_endian_array_ = res;
+  if (negate) {
+    Negate();
+  }
+  return *this;
+}
+
+DecimalStatus BasicDecimal256::Rescale(int32_t original_scale, int32_t new_scale,
+                                       BasicDecimal256* out) const {
+  if (original_scale == new_scale) {
+    return DecimalStatus::kSuccess;
+  }
+  // TODO: implement.
+  return DecimalStatus::kRescaleDataLoss;
+}
+
+BasicDecimal256 operator*(const BasicDecimal256& left, const BasicDecimal256& right) {
+  BasicDecimal256 result = left;
+  result *= right;
+  return result;
+}
+
+bool operator==(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return left.little_endian_array() == right.little_endian_array();
+}
+
+bool operator!=(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return left.little_endian_array() != right.little_endian_array();
+}
+
+bool operator<(const BasicDecimal256& left, const BasicDecimal256& right) {
+  const std::array<uint64_t, 4>& lhs = left.little_endian_array();
+  const std::array<uint64_t, 4>& rhs = right.little_endian_array();
+  return lhs[3] != rhs[3]
+             ? static_cast<int64_t>(lhs[3]) < static_cast<int64_t>(rhs[3])
+             : lhs[2] != rhs[2] ? lhs[2] < rhs[2]
+                                : lhs[1] != rhs[1] ? lhs[1] < rhs[1] : lhs[0] < rhs[0];
+}
+
+bool operator<=(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return !operator>(left, right);

Review comment:
       I put ==, != and the indirections as inline methods.  operator< seems complicated enough not to inline.  I'm leaving Decimal128 as is for now (I think that is what these were initially modelled after).
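To illustrate the comparison discussed above, here is a minimal sketch of a signed less-than over four little-endian 64-bit limbs. It is not the code from the PR: `Limbs` and `LessThan` are hypothetical names, and the loop is just one way to restate the nested conditional in the hunk. The only assumption is the storage layout the diff shows (least significant limb first, two's complement).

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for BasicDecimal256's storage: four 64-bit limbs,
// least significant first, read as a 256-bit two's complement integer.
using Limbs = std::array<uint64_t, 4>;

// Signed less-than. The top limb carries the sign, so it is compared as
// int64_t; the remaining limbs are compared as unsigned values.
inline bool LessThan(const Limbs& lhs, const Limbs& rhs) {
  if (lhs[3] != rhs[3]) {
    return static_cast<int64_t>(lhs[3]) < static_cast<int64_t>(rhs[3]);
  }
  for (int i = 2; i >= 0; --i) {
    if (lhs[i] != rhs[i]) return lhs[i] < rhs[i];
  }
  return false;  // All limbs equal.
}
```

With this shape, `<=`, `>` and `>=` can each forward to `LessThan`, which is why only the one non-trivial operator needs an out-of-line body.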




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507812828



##########
File path: cpp/src/arrow/array/array_test.cc
##########
@@ -2426,7 +2433,43 @@ TEST_P(DecimalTest, WithNulls) {
   this->TestCreate(precision, draw, valid_bytes, 2);
 }
 
-INSTANTIATE_TEST_SUITE_P(DecimalTest, DecimalTest, ::testing::Range(1, 38));
+INSTANTIATE_TEST_SUITE_P(Decimal128Test, Decimal128Test, ::testing::Range(1, 38));
+
+using Decimal256Test = DecimalTest<Decimal256Type>;
+
+TEST_P(Decimal256Test, NoNulls) {
+  int32_t precision = GetParam();
+  std::vector<Decimal256> draw = {Decimal256(1), Decimal256(-2), Decimal256(2389),
+                                  Decimal256(4), Decimal256(-12348)};
+  std::vector<uint8_t> valid_bytes = {true, true, true, true, true};
+  this->TestCreate(precision, draw, valid_bytes, 0);
+  this->TestCreate(precision, draw, valid_bytes, 2);
+}
+
+TEST_P(Decimal256Test, WithNulls) {
+  int32_t precision = GetParam();
+  std::vector<Decimal256> draw = {Decimal256(1), Decimal256(2),  Decimal256(-1),
+                                  Decimal256(4), Decimal256(-1), Decimal256(1),
+                                  Decimal256(2)};
+  Decimal256 big;  // (pow(2, 255) - 1) / pow(10, 38)
+  ASSERT_OK_AND_ASSIGN(big,
+                       Decimal256::FromString("578960446186580977117854925043439539266."
+                                              "34992332820282019728792003956564819967"));
+  draw.push_back(big);
+
+  Decimal256 big_negative;  // -pow(2, 255) / pow(10, 38)
+  ASSERT_OK_AND_ASSIGN(big_negative,
+                       Decimal256::FromString("-578960446186580977117854925043439539266."
+                                              "34992332820282019728792003956564819968"));
+  draw.push_back(big_negative);
+
+  std::vector<uint8_t> valid_bytes = {true, true, false, true, false,
+                                      true, true, true,  true};
+  this->TestCreate(precision, draw, valid_bytes, 0);
+  this->TestCreate(precision, draw, valid_bytes, 2);
+}
+
+INSTANTIATE_TEST_SUITE_P(Decimal256Test, Decimal256Test, ::testing::Range(1, 76));

Review comment:
       Do we really want to test every precision between 1 and 76? (note the same comment applies to `Decimal128Test` above).
   
   I'm concerned about the readability of test output here.
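One possible way to act on this concern is sketched below; it is not what the PR does. GoogleTest's `::testing::Range(begin, end)` excludes `end`, so `Range(1, 76)` generates precisions 1 through 75. The helper below is hypothetical: `RepresentativePrecisions` and its choice of lower bound, midpoint and upper bound are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: choose a few boundary precisions instead of
// enumerating every value up to max_precision.
inline std::vector<int32_t> RepresentativePrecisions(int32_t max_precision) {
  assert(max_precision >= 1);
  std::vector<int32_t> out = {1};
  const int32_t mid = max_precision / 2;
  if (mid > 1) out.push_back(mid);
  if (max_precision > 1) out.push_back(max_precision);
  return out;
}
```

A vector like this could be passed to `::testing::ValuesIn(...)` when instantiating the suite, which keeps the test output to a handful of lines per type.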




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506789848



##########
File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
##########
@@ -106,6 +106,28 @@ public void writeBigEndianBytesToDecimal(byte[] value, ArrowType arrowType) {
   public void writeBigEndianBytesToDecimal(byte[] value) {
     getWriter(MinorType.DECIMAL).writeBigEndianBytesToDecimal(value);
   }
+  <#elseif minor.class == "BigDecimal">
+  @Override
+  public void write(BigDecimalHolder holder) {
+    getWriter(MinorType.BIGDECIMAL).write(holder);
+  }
+
+  public void writeBigDecimal(int start, ArrowBuf buffer, ArrowType arrowType) {

Review comment:
       done.

##########
File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
##########
@@ -106,6 +106,28 @@ public void writeBigEndianBytesToDecimal(byte[] value, ArrowType arrowType) {
   public void writeBigEndianBytesToDecimal(byte[] value) {
     getWriter(MinorType.DECIMAL).writeBigEndianBytesToDecimal(value);
   }
+  <#elseif minor.class == "BigDecimal">
+  @Override
+  public void write(BigDecimalHolder holder) {
+    getWriter(MinorType.BIGDECIMAL).write(holder);
+  }
+
+  public void writeBigDecimal(int start, ArrowBuf buffer, ArrowType arrowType) {
+    getWriter(MinorType.BIGDECIMAL).writeBigDecimal(start, buffer, arrowType);
+  }
+
+  public void writeBigDecimal(int start, ArrowBuf buffer) {

Review comment:
       Weird, I thought I had changed these; maybe I failed to commit them.

##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -32,24 +32,26 @@
   private DecimalUtility() {}
 
   public static final int DECIMAL_BYTE_LENGTH = 16;
-  public static final byte [] zeroes = new byte[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
-  public static final byte [] minus_one = new byte[] {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
+  public static final byte [] zeroes = new byte[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+                                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
+  public static final byte [] minus_one = new byte[] {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+                                                      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
 
   /**
    * Read an ArrowType.Decimal at the given value index in the ArrowBuf and convert to a BigDecimal
    * with the given scale.
    */
-  public static BigDecimal getBigDecimalFromArrowBuf(ArrowBuf bytebuf, int index, int scale) {
-    byte[] value = new byte[DECIMAL_BYTE_LENGTH];
+  public static BigDecimal getBigDecimalFromArrowBuf(ArrowBuf bytebuf, int index, int scale, int byteWidth) {
+    byte[] value = new byte[byteWidth];
     byte temp;
-    final int startIndex = index * DECIMAL_BYTE_LENGTH;
+    final long startIndex = index * byteWidth;

Review comment:
       done.
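A note on the `startIndex` line in the hunk above, with a small sketch. Declaring the result as a 64-bit type does not by itself widen the multiplication: in Java, `index * byteWidth` with two `int` operands is still computed in 32 bits and only then converted, so one operand has to be widened first, e.g. `(long) index * byteWidth`. The same rule holds in C++, which is the language used for the sketch. `ByteOffset` is a hypothetical helper, not part of Arrow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper mirroring `index * byteWidth` from the hunk above.
inline int64_t ByteOffset(int32_t index, int32_t byte_width) {
  assert(index >= 0 && byte_width > 0);
  // Widen one operand first so the product is computed in 64 bits.
  return static_cast<int64_t>(index) * byte_width;
}
```

For example, an index of 100,000,000 with a 32-byte width gives an offset of 3,200,000,000, which does not fit in a signed 32-bit integer.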

##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -294,10 +294,10 @@ public StructVector getStruct() {
     }
     return ${uncappedName}Vector;
   }
-  <#if minor.class == "Decimal">
+  <#if minor.class?ends_with("Decimal")>
   public ${name}Vector get${name}Vector() {
     if (${uncappedName}Vector == null) {
-      throw new IllegalArgumentException("No Decimal Vector present. Provide ArrowType argument to create a new vector");
+      throw new IllegalArgumentException("No Decimal ${uncappedName} present. Provide ArrowType argument to create a new vector");

Review comment:
       yes, I think so.  Nice catch.

##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -32,24 +32,26 @@
   private DecimalUtility() {}
 
   public static final int DECIMAL_BYTE_LENGTH = 16;
-  public static final byte [] zeroes = new byte[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
-  public static final byte [] minus_one = new byte[] {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
+  public static final byte [] zeroes = new byte[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+                                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
+  public static final byte [] minus_one = new byte[] {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+                                                      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
 
   /**
    * Read an ArrowType.Decimal at the given value index in the ArrowBuf and convert to a BigDecimal
    * with the given scale.
    */
-  public static BigDecimal getBigDecimalFromArrowBuf(ArrowBuf bytebuf, int index, int scale) {
-    byte[] value = new byte[DECIMAL_BYTE_LENGTH];
+  public static BigDecimal getBigDecimalFromArrowBuf(ArrowBuf bytebuf, int index, int scale, int byteWidth) {
+    byte[] value = new byte[byteWidth];
     byte temp;
-    final int startIndex = index * DECIMAL_BYTE_LENGTH;
+    final long startIndex = index * byteWidth;

Review comment:
       done.




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507836439



##########
File path: cpp/src/arrow/util/decimal_test.cc
##########
@@ -26,6 +26,7 @@
 #include <vector>
 
 #include <gtest/gtest.h>
+#include <boost/multiprecision/cpp_int.hpp>

Review comment:
       Can we include `arrow/util/int128_internal.h` instead?




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509362542



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];

Review comment:
       Ah :-/




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507811608



##########
File path: cpp/src/arrow/array/builder_decimal.h
##########
@@ -58,6 +58,35 @@ class ARROW_EXPORT Decimal128Builder : public FixedSizeBinaryBuilder {
   std::shared_ptr<Decimal128Type> decimal_type_;
 };
 
+class ARROW_EXPORT Decimal256Builder : public FixedSizeBinaryBuilder {
+ public:
+  using TypeClass = Decimal256Type;
+
+  explicit Decimal256Builder(const std::shared_ptr<DataType>& type,
+                             MemoryPool* pool = default_memory_pool());
+
+  using FixedSizeBinaryBuilder::Append;
+  using FixedSizeBinaryBuilder::AppendValues;
+  using FixedSizeBinaryBuilder::Reset;
+
+  Status Append(Decimal256 val);

Review comment:
       I'm not sure `Decimal256` will be passed in registers, given that it's large. Perhaps take `const Decimal256& value` instead?
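The suggestion above can be sketched as follows. This is an illustration under stated assumptions, not the Arrow builder: `Decimal256Like` and `BuilderSketch` are hypothetical stand-ins, and the only property relied on is that the value type holds four 64-bit limbs (32 bytes), as in the diff.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 32-byte value type standing in for Decimal256.
struct Decimal256Like {
  std::array<uint64_t, 4> limbs;
};
static_assert(sizeof(Decimal256Like) == 32, "expected four 64-bit limbs");

// Hypothetical builder: Append takes the value by const reference, so the
// caller passes an address rather than copying 32 bytes at the call site.
class BuilderSketch {
 public:
  void Append(const Decimal256Like& value) { values_.push_back(value); }

  std::size_t length() const { return values_.size(); }

  const Decimal256Like& at(std::size_t i) const {
    assert(i < values_.size());
    return values_[i];
  }

 private:
  std::vector<Decimal256Like> values_;
};
```

Whether by-value or by-reference is faster in practice depends on the ABI and on inlining, so this is a question for the benchmark rather than something the sketch settles.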




----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508283490



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];
+    for (int i = 0; i < intValues.length / 2; i++) {
+      intValues[i] = 1 << i + 1;
+      intValues[2 * i] = -1 * (1 << i + 1);
+    }
+  }
+
+  private int scale = 3;
+
+  private BufferAllocator allocator;
+
+  @Before
+  public void init() {
+    allocator = new DirtyRootAllocator(Long.MAX_VALUE, (byte) 100);
+  }
+
+  @After
+  public void terminate() throws Exception {
+    allocator.close();
+  }
+
+  @Test
+  public void testValuesWriteRead() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(10, scale, 256), allocator);) {
+
+      try (Decimal256Vector oldConstructor = new Decimal256Vector("decimal", allocator, 10, scale);) {
+        assertEquals(decimalVector.getField().getType(), oldConstructor.getField().getType());
+      }
+
+      decimalVector.allocateNew();
+      BigDecimal[] values = new BigDecimal[intValues.length];
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(intValues[i]), scale);
+        values[i] = decimal;
+        decimalVector.setSafe(i, decimal);
+      }
+
+      decimalVector.setValueCount(intValues.length);
+
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals("unexpected data at index: " + i, values[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void testDecimal256DifferentScaleAndPrecision() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(4, 2, 256), allocator);) {
+      decimalVector.allocateNew();
+
+      // test Decimal256 with different scale
+      boolean hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(0), 3);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+
+      // test BigDecimal with larger precision than initialized
+      hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(12345), 2);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+    }
+  }
+
+  @Test
+  public void testWriteBigEndian() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(38, 18, 256), allocator);) {
+      decimalVector.allocateNew();
+      BigDecimal decimal1 = new BigDecimal("123456789.000000000000000000");
+      BigDecimal decimal2 = new BigDecimal("11.123456789123456789");
+      BigDecimal decimal3 = new BigDecimal("1.000000000000000000");
+      BigDecimal decimal4 = new BigDecimal("0.111111111000000000");
+      BigDecimal decimal5 = new BigDecimal("987654321.123456789000000000");
+      BigDecimal decimal6 = new BigDecimal("222222222222.222222222000000000");
+      BigDecimal decimal7 = new BigDecimal("7777777777777.666666667000000000");
+      BigDecimal decimal8 = new BigDecimal("1212121212.343434343000000000");

Review comment:
       Maybe we should also check nulls here?




----------------------------------------------------------------

[GitHub] [arrow] jorisvandenbossche commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

jorisvandenbossche commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-716377181


   This might have "broken" the spark integration builds: https://github.com/ursa-labs/crossbow/runs/1304128112
   
   ```
   Error: ] /spark/sql/catalyst/src/main/scala/org/apache/spark/sql/util/ArrowUtils.scala:47: not enough arguments for constructor Decimal: (x$1: Int, x$2: Int, x$3: Int)org.apache.arrow.vector.types.pojo.ArrowType.Decimal.
   Unspecified value parameter x$3.
   ```
   
   (now I am not familiar enough with spark to know what kind of "broken" it is, but in any case the integration build is failing)


----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-709773856


   Maybe some CI failures can be fixed by referencing https://github.com/apache/arrow/pull/8455


----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-714356823


   The Java changes look good to me. Thank you @emkornfield 


----------------------------------------------------------------

[GitHub] [arrow] Luminarys commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

Luminarys commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507913864



##########
File path: cpp/src/arrow/util/decimal_benchmark.cc
##########
@@ -148,6 +148,21 @@ static void BinaryMathOp(benchmark::State& state) {  // NOLINT non-const referen
   state.SetItemsProcessed(state.iterations() * kValueSize);
 }
 
+static void BinaryMathOp256(benchmark::State& state) {  // NOLINT non-const reference
+  std::vector<BasicDecimal256> v1, v2;
+  for (uint64_t x = 0; x < kValueSize; x++) {
+    v1.push_back(BasicDecimal256({100 + x, 100 + x, 100 + x, 100 + x}));
+    v2.push_back(BasicDecimal256({200 + x, 200 + x, 200 + x, 200 + x}));
+  }
+
+  for (auto _ : state) {
+    for (int x = 0; x < kValueSize; x += 5) {
+      benchmark::DoNotOptimize(v1[x + 2] * v2[x + 2]);

Review comment:
       As of right now we've only added support for operator*. As we add more operators, I think this benchmark can be expanded to reach parity with the other benchmark.
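For readers who want to see what a 256-bit `operator*` has to do, here is a minimal sketch of schoolbook multiplication over four 64-bit limbs, truncated to 256 bits. It is not the PR's `MultiplyUnsignedArray`, whose body is not shown in this thread. `Limbs`, `MulWide` and `MultiplyTruncated` are hypothetical names, and the 64x64 step uses the same 32-bit splitting idea as the non-native fallback in basic_decimal.cc.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Limbs = std::array<uint64_t, 4>;  // Least significant limb first.

// 64x64 -> 128 bit multiply using 32-bit halves (no native 128-bit type).
inline void MulWide(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
  assert(hi != nullptr && lo != nullptr);
  const uint64_t mask = 0xFFFFFFFFULL;
  const uint64_t x_lo = x & mask, x_hi = x >> 32;
  const uint64_t y_lo = y & mask, y_hi = y >> 32;
  const uint64_t t = x_lo * y_lo;
  const uint64_t u = x_hi * y_lo + (t >> 32);
  const uint64_t v = x_lo * y_hi + (u & mask);
  *hi = x_hi * y_hi + (u >> 32) + (v >> 32);
  *lo = (v << 32) | (t & mask);
}

// Row-by-row schoolbook multiply; partial products that land at or beyond
// limb 4 are dropped, so the result is the product modulo 2^256.
inline Limbs MultiplyTruncated(const Limbs& a, const Limbs& b) {
  Limbs r = {0, 0, 0, 0};
  for (std::size_t i = 0; i < 4; ++i) {
    uint64_t carry = 0;
    for (std::size_t j = 0; i + j < 4; ++j) {
      uint64_t hi, lo;
      MulWide(a[i], b[j], &hi, &lo);
      lo += carry;
      hi += (lo < carry);  // Carry out of the low word.
      lo += r[i + j];
      hi += (lo < r[i + j]);
      r[i + j] = lo;
      carry = hi;
    }
  }
  return r;
}
```

Because the arithmetic is modulo 2^256, multiplying the all-ones pattern (two's complement -1) by itself yields 1. A signed multiply like the one in the diff takes absolute values first, multiplies, and negates the result when the signs differ.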




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507837493



##########
File path: cpp/src/parquet/arrow/reader_internal.cc
##########
@@ -645,7 +645,9 @@ static Status DecimalIntegerTransfer(RecordReader* reader, MemoryPool* pool,
 template <typename ParquetType>
 Status TransferDecimal(RecordReader* reader, MemoryPool* pool,
                        const std::shared_ptr<DataType>& type, Datum* out) {
-  DCHECK_EQ(type->id(), ::arrow::Type::DECIMAL);
+  if (type->id() != ::arrow::Type::DECIMAL128) {
+    return Status::Invalid("Only reading decimal128 types is currently supported");

Review comment:
       `Status::NotImplemented` instead?
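The distinction behind this suggestion can be shown with a small stand-in. Arrow's real `Status` class has both `Status::Invalid` and `Status::NotImplemented` factories; everything in the sketch below (`MiniStatus`, `Code`, `TypeId`, `CheckDecimalTransfer`) is a hypothetical simplification so the example compiles without Arrow headers.

```cpp
#include <string>

// Hypothetical, simplified status codes (not Arrow's StatusCode enum).
enum class Code { kOk, kInvalid, kNotImplemented };

struct MiniStatus {
  Code code;
  std::string message;
  bool ok() const { return code == Code::kOk; }
};

// Hypothetical type ids standing in for ::arrow::Type::DECIMAL128 etc.
enum class TypeId { kDecimal128, kDecimal256 };

// The input is well formed but the code path does not exist yet, so the
// error is reported as "not implemented" rather than "invalid".
inline MiniStatus CheckDecimalTransfer(TypeId id) {
  if (id != TypeId::kDecimal128) {
    return {Code::kNotImplemented,
            "Only reading decimal128 types is currently supported"};
  }
  return {Code::kOk, ""};
}
```

Callers can then tell "this data is wrong" apart from "this feature is missing" by checking the code instead of parsing the message.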




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508887213



##########
File path: cpp/src/arrow/c/bridge_test.cc
##########
@@ -740,6 +741,7 @@ TEST_F(TestArrayExport, Primitive) {
   TestPrimitive(large_utf8(), R"(["foo", "bar", null])");
 
   TestPrimitive(decimal(16, 4), R"(["1234.5670", null])");
+  TestPrimitive(decimal256(16, 4), R"(["1234.5670", null])");

Review comment:
       Done; added additional schema tests and a round-trip test.




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508791339



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -254,67 +254,125 @@ BasicDecimal128& BasicDecimal128::operator>>=(uint32_t bits) {
 
 namespace {
 
-// TODO: Remove this guard once it's used by BasicDecimal256
-#ifndef ARROW_USE_NATIVE_INT128
-// This method losslessly multiplies x and y into a 128 bit unsigned integer
-// whose high bits will be stored in hi and low bits in lo.
-void ExtendAndMultiplyUint64(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
+// Convenience wrapper type over 128 bit unsigned integers. We opt not to
+// replace the uint128_t type in int128_internal.h because it would require
+// significantly more implementation work to be done. This class merely
+// provides the minimum necessary set of functions to perform 128+ bit
+// multiplication operations when there may or may not be native support.
 #ifdef ARROW_USE_NATIVE_INT128
-  const __uint128_t r = static_cast<__uint128_t>(x) * y;
-  *lo = r & kInt64Mask;
-  *hi = r >> 64;
+struct uint128_t {
+  uint128_t() {}
+  uint128_t(uint64_t hi, uint64_t lo) : val_((static_cast<__uint128_t>(hi) << 64) | lo) {}
+  explicit uint128_t(const BasicDecimal128& decimal) {
+    val_ = (static_cast<__uint128_t>(decimal.high_bits()) << 64) | decimal.low_bits();
+  }
+
+  uint64_t hi() { return val_ >> 64; }
+  uint64_t lo() { return val_ & kInt64Mask; }
+
+  uint128_t& operator+=(const uint128_t& other) {
+    val_ += other.val_;
+    return *this;
+  }
+
+  uint128_t& operator*=(const uint128_t& other) {
+    val_ *= other.val_;
+    return *this;
+  }
+
+  __uint128_t val_;
+};
+
 #else
-  // If we can't use a native fallback, perform multiplication
+// Multiply two 64 bit word components into a 128 bit result, with high bits
+// stored in hi and low bits in lo.
+inline void ExtendAndMultiply(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
+  // Perform multiplication on two 64 bit words x and y into a 128 bit result
   // by splitting up x and y into 32 bit high/low bit components,
   // allowing us to represent the multiplication as
   // x * y = x_lo * y_lo + x_hi * y_lo * 2^32 + y_hi * x_lo * 2^32
-  // + x_hi * y_hi * 2^64.
+  // + x_hi * y_hi * 2^64
   //
-  // Now, consider the final output as lo_lo || lo_hi || hi_lo || hi_hi.
+  // Now, consider the final output as lo_lo || lo_hi || hi_lo || hi_hi
   // Therefore,
   // lo_lo is (x_lo * y_lo)_lo,
   // lo_hi is ((x_lo * y_lo)_hi + (x_hi * y_lo)_lo + (x_lo * y_hi)_lo)_lo,
   // hi_lo is ((x_hi * y_hi)_lo + (x_hi * y_lo)_hi + (x_lo * y_hi)_hi)_hi,
   // hi_hi is (x_hi * y_hi)_hi
-  const uint64_t x_lo = x & kIntMask;
-  const uint64_t y_lo = y & kIntMask;
+  const uint64_t x_lo = x & kInt32Mask;
+  const uint64_t y_lo = y & kInt32Mask;
   const uint64_t x_hi = x >> 32;
   const uint64_t y_hi = y >> 32;
 
   const uint64_t t = x_lo * y_lo;
-  const uint64_t t_lo = t & kIntMask;
+  const uint64_t t_lo = t & kInt32Mask;
   const uint64_t t_hi = t >> 32;
 
   const uint64_t u = x_hi * y_lo + t_hi;
-  const uint64_t u_lo = u & kIntMask;
+  const uint64_t u_lo = u & kInt32Mask;
   const uint64_t u_hi = u >> 32;
 
   const uint64_t v = x_lo * y_hi + u_lo;
   const uint64_t v_hi = v >> 32;
 
   *hi = x_hi * y_hi + u_hi + v_hi;
-  *lo = (v << 32) | t_lo;
-#endif
+  *lo = (v << 32) + t_lo;
 }
-#endif
 
-void MultiplyUint128(uint64_t x_hi, uint64_t x_lo, uint64_t y_hi, uint64_t y_lo,
-                     uint64_t* hi, uint64_t* lo) {
-#ifdef ARROW_USE_NATIVE_INT128
-  const __uint128_t x = (static_cast<__uint128_t>(x_hi) << 64) | x_lo;
-  const __uint128_t y = (static_cast<__uint128_t>(y_hi) << 64) | y_lo;
-  const __uint128_t r = x * y;
-  *lo = r & kInt64Mask;
-  *hi = r >> 64;
-#else
-  // To perform 128 bit multiplication without a native fallback
-  // we first perform lossless 64 bit multiplication of the low
-  // bits, and then add x_hi * y_lo and x_lo * y_hi to the high
-  // bits. Note that we can skip adding x_hi * y_hi because it
-  // always will be over 128 bits.
-  ExtendAndMultiplyUint64(x_lo, y_lo, hi, lo);
-  *hi += (x_hi * y_lo) + (x_lo * y_hi);
+struct uint128_t {
+  uint128_t() {}
+  uint128_t(uint64_t hi, uint64_t lo) : hi_(hi), lo_(lo) {}
+  explicit uint128_t(const BasicDecimal128& decimal) {
+    hi_ = decimal.high_bits();
+    lo_ = decimal.low_bits();
+  }
+
+  uint64_t hi() const { return hi_; }
+  uint64_t lo() const { return lo_; }
+
+  uint128_t& operator+=(const uint128_t& other) {
+    // To deduce the carry bit, we perform "65 bit" addition on the low bits and
+    // seeing if the resulting high bit is 1. This is accomplished by shifting the
+    // low bits to the right by 1 (chopping off the lowest bit), then adding 1 if the
+    // result of adding the two chopped bits would have produced a carry.
+    uint64_t carry = (((lo_ & other.lo_) & 1) + (lo_ >> 1) + (other.lo_ >> 1)) >> 63;
+    hi_ += other.hi_ + carry;
+    lo_ += other.lo_;
+    return *this;
+  }
+
+  uint128_t& operator*=(const uint128_t& other) {
+    uint128_t r;
+    ExtendAndMultiply(lo_, other.lo_, &r.hi_, &r.lo_);
+    r.hi_ += (hi_ * other.lo_) + (lo_ * other.hi_);
+    *this = r;
+    return *this;
+  }
+
+  uint64_t hi_;
+  uint64_t lo_;
+};
 #endif
+
+// Multiplies two N * 64 bit unsigned integer types, represented by a uint64_t
+// array into a same sized output. Elements in the array should be in
+// little endian order, and output will be the same. Overflow in multiplication
+// will result in the lower N * 64 bits of the result being set.
+template <int N>
+inline void MultiplyUnsignedArray(const std::array<uint64_t, N>& lh,
+                                  const std::array<uint64_t, N>& rh,
+                                  std::array<uint64_t, N>* result) {
+  for (int j = 0; j < N; ++j) {
+    uint64_t carry = 0;
+    for (int i = 0; i < N - j; ++i) {
+      uint128_t tmp(lh[i]);
+      tmp *= uint128_t(rh[j]);

Review comment:
       I made an explicit constructor for this case on the struct above.
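
For readers following the non-native branch in the hunk above, here is a minimal standalone sketch of the same idea: build a full 64x64 -> 128 bit product from 32-bit halves, then add two 128-bit values with a carry. The names `U128`, `MulU64` and `AddU128` are hypothetical and are not Arrow's. The carry is detected here with a plain wraparound comparison, which is a swapped-in alternative to the shift-based "65 bit" trick used in the diff.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the PR's non-native uint128_t; not Arrow's type.
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

// Full 64x64 -> 128 bit product using only 64-bit arithmetic.
inline U128 MulU64(uint64_t x, uint64_t y) {
  const uint64_t kMask32 = 0xFFFFFFFFULL;
  const uint64_t x_lo = x & kMask32;
  const uint64_t x_hi = x >> 32;
  const uint64_t y_lo = y & kMask32;
  const uint64_t y_hi = y >> 32;

  // Each partial product is below 2^64, and adding one 32-bit carry-in
  // to it still fits in 64 bits, so none of these sums can wrap.
  const uint64_t t = x_lo * y_lo;
  const uint64_t u = x_hi * y_lo + (t >> 32);
  const uint64_t v = x_lo * y_hi + (u & kMask32);

  U128 r;
  r.hi = x_hi * y_hi + (u >> 32) + (v >> 32);
  r.lo = (v << 32) | (t & kMask32);
  return r;
}

// 128-bit addition modulo 2^128. The low-word sum wrapped around
// exactly when it is smaller than one of its operands.
inline U128 AddU128(U128 a, U128 b) {
  U128 r;
  r.lo = a.lo + b.lo;
  r.hi = a.hi + b.hi + (r.lo < a.lo ? 1 : 0);
  return r;
}
```

Calling `MulU64(x, y)` with two words and feeding the results into `AddU128` is enough to accumulate one column of a schoolbook multi-word multiplication.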





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507823242



##########
File path: cpp/src/arrow/type.cc
##########
@@ -131,6 +133,7 @@ std::string ToString(Type::type id) {
     TO_STRING_CASE(FLOAT)
     TO_STRING_CASE(DOUBLE)
     TO_STRING_CASE(DECIMAL)

Review comment:
       Shouldn't this be changed to `DECIMAL128`?
   (in general, do a search for `DECIMAL` in all the C++ code; this may catch some overlooked instances)





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507816678



##########
File path: cpp/src/arrow/c/bridge_test.cc
##########
@@ -740,6 +741,7 @@ TEST_F(TestArrayExport, Primitive) {
   TestPrimitive(large_utf8(), R"(["foo", "bar", null])");
 
   TestPrimitive(decimal(16, 4), R"(["1234.5670", null])");
+  TestPrimitive(decimal256(16, 4), R"(["1234.5670", null])");

Review comment:
       Can you also add import tests?





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507832075



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];
+#endif
+}
+
+BasicDecimal256& BasicDecimal256::operator*=(const BasicDecimal256& right) {
+  // Since the max value of BasicDecimal256 is supposed to be 1e76 - 1 and the
+  // min the negation taking the absolute values here should always be safe.
+  const bool negate = Sign() != right.Sign();
+  BasicDecimal256 x = BasicDecimal256::Abs(*this);
+  BasicDecimal256 y = BasicDecimal256::Abs(right);
+
+  uint128_t r_hi;
+  uint128_t r_lo;
+  std::array<uint64_t, 4> res{0, 0, 0, 0};
+  MultiplyUnsignedArray<4>(x.little_endian_array_, y.little_endian_array_, &res);
+  little_endian_array_ = res;
+  if (negate) {
+    Negate();
+  }
+  return *this;
+}
+
+DecimalStatus BasicDecimal256::Rescale(int32_t original_scale, int32_t new_scale,
+                                       BasicDecimal256* out) const {
+  if (original_scale == new_scale) {
+    return DecimalStatus::kSuccess;
+  }
+  // TODO: implement.
+  return DecimalStatus::kRescaleDataLoss;
+}
+
+BasicDecimal256 operator*(const BasicDecimal256& left, const BasicDecimal256& right) {
+  BasicDecimal256 result = left;
+  result *= right;
+  return result;
+}
+
+bool operator==(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return left.little_endian_array() == right.little_endian_array();
+}
+
+bool operator!=(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return left.little_endian_array() != right.little_endian_array();
+}
+
+bool operator<(const BasicDecimal256& left, const BasicDecimal256& right) {
+  const std::array<uint64_t, 4>& lhs = left.little_endian_array();
+  const std::array<uint64_t, 4>& rhs = right.little_endian_array();
+  return lhs[3] != rhs[3]
+             ? static_cast<int64_t>(lhs[3]) < static_cast<int64_t>(rhs[3])
+             : lhs[2] != rhs[2] ? lhs[2] < rhs[2]
+                                : lhs[1] != rhs[1] ? lhs[1] < rhs[1] : lhs[0] < rhs[0];
+}
+
+bool operator<=(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return !operator>(left, right);

Review comment:
       This will go through another indirection, so why not `return !operator<(right, left)`?
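
A minimal sketch of that suggestion, assuming the value is stored as four little-endian 64-bit words in two's complement, as in the hunk above. `Words`, `Less`, `LessEqual`, `Greater` and `GreaterEqual` are hypothetical free functions used for illustration, not the PR's operators. Every other comparison is expressed directly through the single less-than, so no call goes through a second wrapper.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for BasicDecimal256's storage: four 64-bit words,
// least significant first, read as one two's complement integer.
using Words = std::array<uint64_t, 4>;

// Signed less-than: the most significant word carries the sign and is
// compared as signed; the remaining words are compared as unsigned.
inline bool Less(const Words& l, const Words& r) {
  if (l[3] != r[3]) {
    return static_cast<int64_t>(l[3]) < static_cast<int64_t>(r[3]);
  }
  for (int i = 2; i >= 0; --i) {
    if (l[i] != r[i]) return l[i] < r[i];
  }
  return false;  // all words equal
}

// The other three comparisons each call Less exactly once.
inline bool LessEqual(const Words& l, const Words& r) { return !Less(r, l); }
inline bool Greater(const Words& l, const Words& r) { return Less(r, l); }
inline bool GreaterEqual(const Words& l, const Words& r) { return !Less(l, r); }
```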





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508282258



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];
+    for (int i = 0; i < intValues.length / 2; i++) {
+      intValues[i] = 1 << i + 1;
+      intValues[2 * i] = -1 * (1 << i + 1);
+    }
+  }
+
+  private int scale = 3;
+
+  private BufferAllocator allocator;
+
+  @Before
+  public void init() {
+    allocator = new DirtyRootAllocator(Long.MAX_VALUE, (byte) 100);
+  }
+
+  @After
+  public void terminate() throws Exception {
+    allocator.close();
+  }
+
+  @Test
+  public void testValuesWriteRead() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(10, scale, 256), allocator);) {
+
+      try (Decimal256Vector oldConstructor = new Decimal256Vector("decimal", allocator, 10, scale);) {
+        assertEquals(decimalVector.getField().getType(), oldConstructor.getField().getType());
+      }
+
+      decimalVector.allocateNew();
+      BigDecimal[] values = new BigDecimal[intValues.length];
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(intValues[i]), scale);
+        values[i] = decimal;
+        decimalVector.setSafe(i, decimal);
+      }
+
+      decimalVector.setValueCount(intValues.length);
+
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals("unexpected data at index: " + i, values[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void testDecimal256DifferentScaleAndPrecision() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(4, 2, 256), allocator);) {
+      decimalVector.allocateNew();
+
+      // test Decimal256 with different scale
+      boolean hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(0), 3);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;

Review comment:
       It would be nice to validate the error message here.





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509940146



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,356 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.jupiter.api.Assertions.assertThrows;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];

Review comment:
       nit: would it be better to rename it to `longValues`?





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507816678



##########
File path: cpp/src/arrow/c/bridge_test.cc
##########
@@ -740,6 +741,7 @@ TEST_F(TestArrayExport, Primitive) {
   TestPrimitive(large_utf8(), R"(["foo", "bar", null])");
 
   TestPrimitive(decimal(16, 4), R"(["1234.5670", null])");
+  TestPrimitive(decimal256(16, 4), R"(["1234.5670", null])");

Review comment:
       Can you also add import and/or roundtrip tests?





[GitHub] [arrow] liyafan82 commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-709692075


   IMO, Decimal256 is better, as it avoids confusion with `java.math.BigDecimal`.



[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509870508



##########
File path: java/vector/src/main/codegen/templates/ArrowType.java
##########
@@ -165,7 +165,20 @@ public final T visit(${type.name?remove_ending("_")} type) {
     ${fieldType} ${field.name};
     </#list>
 
+
+    <#if type.name == "Decimal">
+    // Needed to support golden file integration tests.
+    @JsonCreator
+    public static Decimal createDecimal128(

Review comment:
       If we allow bit widths other than 128, shouldn't the method name be `createDecimal`?





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506046302



##########
File path: java/vector/src/main/codegen/data/ValueVectorTypes.tdd
##########
@@ -125,10 +141,11 @@
           maxPrecisionDigits: 38, nDecimalDigits: 4, friendlyType: "BigDecimal",
           typeParams: [ {name: "scale", type: "int"}, { name: "precision", type: "int"}],
           arrowType: "org.apache.arrow.vector.types.pojo.ArrowType.Decimal",
-          fields: [{name: "start", type: "int"}, {name: "buffer", type: "ArrowBuf"}]
+          fields: [{name: "start", type: "long"}, {name: "buffer", type: "ArrowBuf"}]

Review comment:
       I am also fixing this bug in another PR (https://github.com/apache/arrow/pull/8455)
   It may involve 





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506046479



##########
File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
##########
@@ -106,6 +106,28 @@ public void writeBigEndianBytesToDecimal(byte[] value, ArrowType arrowType) {
   public void writeBigEndianBytesToDecimal(byte[] value) {
     getWriter(MinorType.DECIMAL).writeBigEndianBytesToDecimal(value);
   }
+  <#elseif minor.class == "BigDecimal">
+  @Override
+  public void write(BigDecimalHolder holder) {
+    getWriter(MinorType.BIGDECIMAL).write(holder);
+  }
+
+  public void writeBigDecimal(int start, ArrowBuf buffer, ArrowType arrowType) {

Review comment:
       Shouldn't the type of `start` be `long`?





[GitHub] [arrow] emkornfield closed pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield closed pull request #8475:
URL: https://github.com/apache/arrow/pull/8475


   



[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508781568



##########
File path: cpp/src/arrow/ipc/metadata_internal.cc
##########
@@ -236,7 +236,8 @@ static inline TimeUnit::type FromFlatbufferUnit(flatbuf::TimeUnit unit) {
   return TimeUnit::SECOND;
 }
 
-constexpr int32_t kDecimalBitWidth = 128;
+constexpr int32_t kDecimalBitWidth128 = 128;
+constexpr int32_t kDecimalBitWidth256 = 256;

Review comment:
       Removing them.





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507813280



##########
File path: cpp/src/arrow/array/array_test.cc
##########
@@ -2426,7 +2433,43 @@ TEST_P(DecimalTest, WithNulls) {
   this->TestCreate(precision, draw, valid_bytes, 2);
 }
 
-INSTANTIATE_TEST_SUITE_P(DecimalTest, DecimalTest, ::testing::Range(1, 38));
+INSTANTIATE_TEST_SUITE_P(Decimal128Test, Decimal128Test, ::testing::Range(1, 38));
+
+using Decimal256Test = DecimalTest<Decimal256Type>;
+
+TEST_P(Decimal256Test, NoNulls) {
+  int32_t precision = GetParam();
+  std::vector<Decimal256> draw = {Decimal256(1), Decimal256(-2), Decimal256(2389),
+                                  Decimal256(4), Decimal256(-12348)};
+  std::vector<uint8_t> valid_bytes = {true, true, true, true, true};
+  this->TestCreate(precision, draw, valid_bytes, 0);
+  this->TestCreate(precision, draw, valid_bytes, 2);
+}
+
+TEST_P(Decimal256Test, WithNulls) {
+  int32_t precision = GetParam();
+  std::vector<Decimal256> draw = {Decimal256(1), Decimal256(2),  Decimal256(-1),
+                                  Decimal256(4), Decimal256(-1), Decimal256(1),
+                                  Decimal256(2)};
+  Decimal256 big;  // (pow(2, 255) - 1) / pow(10, 38)
+  ASSERT_OK_AND_ASSIGN(big,
+                       Decimal256::FromString("578960446186580977117854925043439539266."
+                                              "34992332820282019728792003956564819967"));
+  draw.push_back(big);
+
+  Decimal256 big_negative;  // -pow(2, 255) / pow(10, 38)
+  ASSERT_OK_AND_ASSIGN(big_negative,
+                       Decimal256::FromString("-578960446186580977117854925043439539266."
+                                              "34992332820282019728792003956564819968"));
+  draw.push_back(big_negative);
+
+  std::vector<uint8_t> valid_bytes = {true, true, false, true, false,
+                                      true, true, true,  true};
+  this->TestCreate(precision, draw, valid_bytes, 0);
+  this->TestCreate(precision, draw, valid_bytes, 2);
+}
+
+INSTANTIATE_TEST_SUITE_P(Decimal256Test, Decimal256Test, ::testing::Range(1, 76));

Review comment:
       Something like `::testing::Values(1, 2, 5, 10, 75, 76)` sounds sufficient (untested).





[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508693175



##########
File path: cpp/src/arrow/util/decimal_benchmark.cc
##########
@@ -191,6 +206,7 @@ static void BinaryBitOp(benchmark::State& state) {  // NOLINT non-const referenc
 BENCHMARK(FromString);
 BENCHMARK(ToString);
 BENCHMARK(BinaryMathOp);

Review comment:
       done.





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509040668



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;

Review comment:
       I think it should be OK, as the classes are in the same module?

##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;
     PlatformDependent.putLong(addressOfValue, value);
     final long padValue = Long.signum(value) == -1 ? -1L : 0L;
     PlatformDependent.putLong(addressOfValue + Long.BYTES, padValue);
   }
 
+  /**
+   * Write value to the buffer extending it to 32 bytes at the given index. 
+   */
+  public static void writeLongToArrowBufBigDecimal(long value, ArrowBuf bytebuf, int index) {
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 32;
+    PlatformDependent.putLong(addressOfValue, value);
+    final long padValue = Long.signum(value) == -1 ? -1L : 0L;
+    PlatformDependent.putLong(addressOfValue + Long.BYTES, padValue);
+    PlatformDependent.putLong(addressOfValue + 2 * Long.BYTES, padValue);
+    PlatformDependent.putLong(addressOfValue + 3 * Long.BYTES, padValue);

Review comment:
       Sorry. It is my mistake. 
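
For context on the hunk above: widening a 64-bit value to 32 bytes is a sign extension, where the low word holds the value and the three upper words are filled with all ones for negatives and all zeros otherwise. The idea is language-independent; below is a minimal C++ sketch of it, assuming little-endian word order. `SignExtendTo256` is a hypothetical name and is not part of Arrow.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Widen a signed 64-bit value into four 64-bit words (32 bytes), least
// significant word first. The upper words repeat the sign bit, so the
// result is the same number in 256-bit two's complement.
inline std::array<uint64_t, 4> SignExtendTo256(int64_t value) {
  const uint64_t pad = value < 0 ? ~uint64_t{0} : uint64_t{0};
  return {{static_cast<uint64_t>(value), pad, pad, pad}};
}
```

The Java helper in the diff does the same thing with one `putLong` for the value followed by three `putLong` calls for the pad word.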





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506185964



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/BigDecimalVector.java
##########
@@ -0,0 +1,549 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.apache.arrow.vector.NullCheckingForGet.NULL_CHECKING_ENABLED;
+
+import java.math.BigDecimal;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.complex.impl.BigDecimalReaderImpl;
+import org.apache.arrow.vector.complex.reader.FieldReader;
+import org.apache.arrow.vector.holders.BigDecimalHolder;
+import org.apache.arrow.vector.holders.NullableBigDecimalHolder;
+import org.apache.arrow.vector.types.Types.MinorType;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.apache.arrow.vector.types.pojo.Field;
+import org.apache.arrow.vector.types.pojo.FieldType;
+import org.apache.arrow.vector.util.DecimalUtility;
+import org.apache.arrow.vector.util.TransferPair;
+
+import io.netty.util.internal.PlatformDependent;
+
+/**
+ * BigDecimalVector implements a fixed width vector (32 bytes) of
+ * decimal values which could be null. A validity buffer (bit vector) is
+ * maintained to track which elements in the vector are null.
+ */
+public final class BigDecimalVector extends BaseFixedWidthVector {
+  public static final byte TYPE_WIDTH = 32;
+  private final FieldReader reader;
+
+  private final int precision;
+  private final int scale;
+
+  /**
+   * Instantiate a BigDecimalVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param name name of the vector
+   * @param allocator allocator for memory management.
+   */
+  public BigDecimalVector(String name, BufferAllocator allocator,
+                               int precision, int scale) {
+    this(name, FieldType.nullable(new ArrowType.Decimal(precision, scale, /*bitWidth=*/TYPE_WIDTH * 8)), allocator);
+  }
+
+  /**
+   * Instantiate a BigDecimalVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param name name of the vector
+   * @param fieldType type of Field materialized by this vector
+   * @param allocator allocator for memory management.
+   */
+  public BigDecimalVector(String name, FieldType fieldType, BufferAllocator allocator) {
+    this(new Field(name, fieldType, null), allocator);
+  }
+
+  /**
+   * Instantiate a BigDecimalVector. This doesn't allocate any memory for
+   * the data in vector.
+   *
+   * @param field field materialized by this vector
+   * @param allocator allocator for memory management.
+   */
+  public BigDecimalVector(Field field, BufferAllocator allocator) {
+    super(field, allocator, TYPE_WIDTH);
+    ArrowType.Decimal arrowType = (ArrowType.Decimal) field.getFieldType().getType();
+    reader = new BigDecimalReaderImpl(BigDecimalVector.this);
+    this.precision = arrowType.getPrecision();
+    this.scale = arrowType.getScale();
+  }
+
+  /**
+   * Get a reader that supports reading values from this vector.
+   *
+   * @return Field Reader for this vector
+   */
+  @Override
+  public FieldReader getReader() {
+    return reader;
+  }
+
+  /**
+   * Get minor type for this vector. The vector holds values belonging
+   * to a particular type.
+   *
+   * @return {@link org.apache.arrow.vector.types.Types.MinorType}
+   */
+  @Override
+  public MinorType getMinorType() {
+    return MinorType.BIGDECIMAL;
+  }
+
+
+  /*----------------------------------------------------------------*
+   |                                                                |
+   |          vector value retrieval methods                        |
+   |                                                                |
+   *----------------------------------------------------------------*/
+
+
+  /**
+   * Get the element at the given index from the vector.
+   *
+   * @param index   position of element
+   * @return element at given index
+   */
+  public ArrowBuf get(int index) throws IllegalStateException {
+    if (NULL_CHECKING_ENABLED && isSet(index) == 0) {
+      throw new IllegalStateException("Value at index is null");
+    }
+    return valueBuffer.slice((long) index * TYPE_WIDTH, TYPE_WIDTH);
+  }
+
+  /**
+   * Get the element at the given index from the vector and
+   * sets the state in holder. If element at given index
+   * is null, holder.isSet will be zero.
+   *
+   * @param index   position of element
+   */
+  public void get(int index, NullableBigDecimalHolder holder) {
+    if (isSet(index) == 0) {
+      holder.isSet = 0;
+      return;
+    }
+    holder.isSet = 1;
+    holder.buffer = valueBuffer;
+    holder.precision = precision;
+    holder.scale = scale;
+    holder.start = index * TYPE_WIDTH;

Review comment:
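       `index` should be cast to long before the multiplication (as in `get(int index)` above); otherwise `index * TYPE_WIDTH` is computed in 32-bit arithmetic and can overflow.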





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506270123



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -32,24 +32,26 @@
   private DecimalUtility() {}
 
   public static final int DECIMAL_BYTE_LENGTH = 16;
-  public static final byte [] zeroes = new byte[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
-  public static final byte [] minus_one = new byte[] {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
+  public static final byte [] zeroes = new byte[] {0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
+                                                   0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0};
+  public static final byte [] minus_one = new byte[] {-1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1,
+                                                      -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1, -1};
 
   /**
    * Read an ArrowType.Decimal at the given value index in the ArrowBuf and convert to a BigDecimal
    * with the given scale.
    */
-  public static BigDecimal getBigDecimalFromArrowBuf(ArrowBuf bytebuf, int index, int scale) {
-    byte[] value = new byte[DECIMAL_BYTE_LENGTH];
+  public static BigDecimal getBigDecimalFromArrowBuf(ArrowBuf bytebuf, int index, int scale, int byteWidth) {
+    byte[] value = new byte[byteWidth];
     byte temp;
-    final int startIndex = index * DECIMAL_BYTE_LENGTH;
+    final long startIndex = index * byteWidth;

Review comment:
       Maybe we need a cast here.
   
   Otherwise, it first multiplies two (32-bit) integers and then promotes the result to a long.
   If the multiplication overflows, it just promotes the overflowed value to a long, which is useless. 
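
To illustrate the point about promotion order, here is a minimal, self-contained sketch. The class and method names (`OffsetOverflowDemo`, `offsetWithoutCast`, `offsetWithCast`) are hypothetical and are not part of the PR; the sketch only shows how Java evaluates `int * int` before widening to `long`.

```java
final class OffsetOverflowDemo {
  // int * int is evaluated in 32-bit arithmetic; the (possibly wrapped)
  // result is only widened to long afterwards.
  static long offsetWithoutCast(int index, int byteWidth) {
    return index * byteWidth;
  }

  // Casting one operand to long first forces a 64-bit multiplication.
  static long offsetWithCast(int index, int byteWidth) {
    return (long) index * byteWidth;
  }

  public static void main(String[] args) {
    int index = 1 << 27;  // 134,217,728th value slot
    int byteWidth = 32;   // width of a 256-bit decimal in bytes
    System.out.println(offsetWithoutCast(index, byteWidth)); // prints 0 (wrapped)
    System.out.println(offsetWithCast(index, byteWidth));    // prints 4294967296
  }
}
```

With a 32-byte width, the uncast product already wraps once the byte offset reaches 2 GiB, so the cast matters for large vectors rather than for typical test sizes.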





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508289499



##########
File path: java/vector/src/test/java/org/apache/arrow/vector/TestDecimal256Vector.java
##########
@@ -0,0 +1,364 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.arrow.vector;
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+import java.math.BigDecimal;
+import java.math.BigInteger;
+
+import org.apache.arrow.memory.ArrowBuf;
+import org.apache.arrow.memory.BufferAllocator;
+import org.apache.arrow.vector.types.pojo.ArrowType;
+import org.junit.After;
+import org.junit.Before;
+import org.junit.Test;
+
+public class TestDecimal256Vector {
+
+  private static long[] intValues;
+
+  static {
+    intValues = new long[60];
+    for (int i = 0; i < intValues.length / 2; i++) {
+      intValues[i] = 1 << i + 1;
+      intValues[2 * i] = -1 * (1 << i + 1);
+    }
+  }
+
+  private int scale = 3;
+
+  private BufferAllocator allocator;
+
+  @Before
+  public void init() {
+    allocator = new DirtyRootAllocator(Long.MAX_VALUE, (byte) 100);
+  }
+
+  @After
+  public void terminate() throws Exception {
+    allocator.close();
+  }
+
+  @Test
+  public void testValuesWriteRead() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(10, scale, 256), allocator);) {
+
+      try (Decimal256Vector oldConstructor = new Decimal256Vector("decimal", allocator, 10, scale);) {
+        assertEquals(decimalVector.getField().getType(), oldConstructor.getField().getType());
+      }
+
+      decimalVector.allocateNew();
+      BigDecimal[] values = new BigDecimal[intValues.length];
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(intValues[i]), scale);
+        values[i] = decimal;
+        decimalVector.setSafe(i, decimal);
+      }
+
+      decimalVector.setValueCount(intValues.length);
+
+      for (int i = 0; i < intValues.length; i++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals("unexpected data at index: " + i, values[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void testDecimal256DifferentScaleAndPrecision() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(4, 2, 256), allocator);) {
+      decimalVector.allocateNew();
+
+      // test Decimal256 with different scale
+      boolean hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(0), 3);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+
+      // test BigDecimal with larger precision than initialized
+      hasError = false;
+      try {
+        BigDecimal decimal = new BigDecimal(BigInteger.valueOf(12345), 2);
+        decimalVector.setSafe(0, decimal);
+      } catch (UnsupportedOperationException ue) {
+        hasError = true;
+      } finally {
+        assertTrue(hasError);
+      }
+    }
+  }
+
+  @Test
+  public void testWriteBigEndian() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(38, 18, 256), allocator);) {
+      decimalVector.allocateNew();
+      BigDecimal decimal1 = new BigDecimal("123456789.000000000000000000");
+      BigDecimal decimal2 = new BigDecimal("11.123456789123456789");
+      BigDecimal decimal3 = new BigDecimal("1.000000000000000000");
+      BigDecimal decimal4 = new BigDecimal("0.111111111000000000");
+      BigDecimal decimal5 = new BigDecimal("987654321.123456789000000000");
+      BigDecimal decimal6 = new BigDecimal("222222222222.222222222000000000");
+      BigDecimal decimal7 = new BigDecimal("7777777777777.666666667000000000");
+      BigDecimal decimal8 = new BigDecimal("1212121212.343434343000000000");
+
+      byte[] decimalValue1 = decimal1.unscaledValue().toByteArray();
+      byte[] decimalValue2 = decimal2.unscaledValue().toByteArray();
+      byte[] decimalValue3 = decimal3.unscaledValue().toByteArray();
+      byte[] decimalValue4 = decimal4.unscaledValue().toByteArray();
+      byte[] decimalValue5 = decimal5.unscaledValue().toByteArray();
+      byte[] decimalValue6 = decimal6.unscaledValue().toByteArray();
+      byte[] decimalValue7 = decimal7.unscaledValue().toByteArray();
+      byte[] decimalValue8 = decimal8.unscaledValue().toByteArray();
+
+      decimalVector.setBigEndian(0, decimalValue1);
+      decimalVector.setBigEndian(1, decimalValue2);
+      decimalVector.setBigEndian(2, decimalValue3);
+      decimalVector.setBigEndian(3, decimalValue4);
+      decimalVector.setBigEndian(4, decimalValue5);
+      decimalVector.setBigEndian(5, decimalValue6);
+      decimalVector.setBigEndian(6, decimalValue7);
+      decimalVector.setBigEndian(7, decimalValue8);
+
+      decimalVector.setValueCount(8);
+      assertEquals(8, decimalVector.getValueCount());
+      assertEquals(decimal1, decimalVector.getObject(0));
+      assertEquals(decimal2, decimalVector.getObject(1));
+      assertEquals(decimal3, decimalVector.getObject(2));
+      assertEquals(decimal4, decimalVector.getObject(3));
+      assertEquals(decimal5, decimalVector.getObject(4));
+      assertEquals(decimal6, decimalVector.getObject(5));
+      assertEquals(decimal7, decimalVector.getObject(6));
+      assertEquals(decimal8, decimalVector.getObject(7));
+    }
+  }
+
+  @Test
+  public void testLongReadWrite() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(38, 0, 256), allocator)) {
+      decimalVector.allocateNew();
+
+      long[] longValues = {0L, -2L, Long.MAX_VALUE, Long.MIN_VALUE, 187L};
+
+      for (int i = 0; i < longValues.length; ++i) {
+        decimalVector.set(i, longValues[i]);
+      }
+
+      decimalVector.setValueCount(longValues.length);
+
+      for (int i = 0; i < longValues.length; ++i) {
+        assertEquals(new BigDecimal(longValues[i]), decimalVector.getObject(i));
+      }
+    }
+  }
+
+
+  @Test
+  public void testBigDecimalReadWrite() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(38, 9, 256), allocator);) {
+      decimalVector.allocateNew();
+      BigDecimal decimal1 = new BigDecimal("123456789.000000000");
+      BigDecimal decimal2 = new BigDecimal("11.123456789");
+      BigDecimal decimal3 = new BigDecimal("1.000000000");
+      BigDecimal decimal4 = new BigDecimal("-0.111111111");
+      BigDecimal decimal5 = new BigDecimal("-987654321.123456789");
+      BigDecimal decimal6 = new BigDecimal("-222222222222.222222222");
+      BigDecimal decimal7 = new BigDecimal("7777777777777.666666667");
+      BigDecimal decimal8 = new BigDecimal("1212121212.343434343");
+
+      decimalVector.set(0, decimal1);
+      decimalVector.set(1, decimal2);
+      decimalVector.set(2, decimal3);
+      decimalVector.set(3, decimal4);
+      decimalVector.set(4, decimal5);
+      decimalVector.set(5, decimal6);
+      decimalVector.set(6, decimal7);
+      decimalVector.set(7, decimal8);
+
+      decimalVector.setValueCount(8);
+      assertEquals(8, decimalVector.getValueCount());
+      assertEquals(decimal1, decimalVector.getObject(0));
+      assertEquals(decimal2, decimalVector.getObject(1));
+      assertEquals(decimal3, decimalVector.getObject(2));
+      assertEquals(decimal4, decimalVector.getObject(3));
+      assertEquals(decimal5, decimalVector.getObject(4));
+      assertEquals(decimal6, decimalVector.getObject(5));
+      assertEquals(decimal7, decimalVector.getObject(6));
+      assertEquals(decimal8, decimalVector.getObject(7));
+    }
+  }
+
+  /**
+   * Test {@link Decimal256Vector#setBigEndian(int, byte[])} which takes BE layout input and stores in LE layout.
+   * Cases to cover: input byte array in different lengths in range [1-16] and negative values.
+   */
+  @Test
+  public void decimalBE2LE() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+        new ArrowType.Decimal(23, 2, 256), allocator)) {
+      decimalVector.allocateNew();
+
+      BigInteger[] testBigInts = new BigInteger[] {
+          new BigInteger("0"),
+          new BigInteger("-1"),
+          new BigInteger("23"),
+          new BigInteger("234234"),
+          new BigInteger("-234234234"),
+          new BigInteger("234234234234"),
+          new BigInteger("-56345345345345"),
+          new BigInteger("2982346298346289346293467923465345634500"), // converts to 16+ byte array
+          new BigInteger("-389457298347598237459832459823434653600"), // converts to 16+ byte array
+          new BigInteger("-345345"),
+          new BigInteger("754533")
+      };
+
+      int insertionIdx = 0;
+      insertionIdx++; // insert a null
+      for (BigInteger val : testBigInts) {
+        decimalVector.setBigEndian(insertionIdx++, val.toByteArray());
+      }
+      insertionIdx++; // insert a null
+      // insert a zero length buffer
+      decimalVector.setBigEndian(insertionIdx++, new byte[0]);
+
+      // Try inserting a buffer larger than 33 bytes and expect a failure
+      try {
+        decimalVector.setBigEndian(insertionIdx, new byte[33]);
+        fail("above statement should have failed");
+      } catch (IllegalArgumentException ex) {
+        assertTrue(ex.getMessage().equals("Invalid decimal value length. Valid length in [1 - 32], got 33"));
+      }
+      decimalVector.setValueCount(insertionIdx);
+
+      // retrieve values and check if they are correct
+      int outputIdx = 0;
+      assertTrue(decimalVector.isNull(outputIdx++));
+      for (BigInteger expected : testBigInts) {
+        final BigDecimal actual = decimalVector.getObject(outputIdx++);
+        assertEquals(expected, actual.unscaledValue());
+      }
+      assertTrue(decimalVector.isNull(outputIdx++));
+      assertEquals(BigInteger.valueOf(0), decimalVector.getObject(outputIdx).unscaledValue());
+    }
+  }
+
+  @Test
+  public void setUsingArrowBufOfLEInts() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(5, 2, 256), allocator);
+         ArrowBuf buf = allocator.buffer(8);) {
+      decimalVector.allocateNew();
+
+      // add a positive value equivalent to 705.32
+      int val = 70532;
+      buf.setInt(0, val);
+      decimalVector.setSafe(0, 0, buf, 4);
+
+      // add a -ve value equivalent to -705.32
+      val = -70532;
+      buf.setInt(4, val);
+      decimalVector.setSafe(1, 4, buf, 4);
+
+      decimalVector.setValueCount(2);
+
+      BigDecimal [] expectedValues = new BigDecimal[] {BigDecimal.valueOf(705.32), BigDecimal
+              .valueOf(-705.32)};
+      for (int i = 0; i < 2; i ++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals(expectedValues[i], value);
+      }
+    }
+
+  }
+
+  @Test
+  public void setUsingArrowLongLEBytes() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(18, 0, 256), allocator);
+         ArrowBuf buf = allocator.buffer(16);) {
+      decimalVector.allocateNew();
+
+      long val = Long.MAX_VALUE;
+      buf.setLong(0, val);
+      decimalVector.setSafe(0, 0, buf, 8);
+
+      val = Long.MIN_VALUE;
+      buf.setLong(8, val);
+      decimalVector.setSafe(1, 8, buf, 8);
+
+      decimalVector.setValueCount(2);
+
+      BigDecimal [] expectedValues = new BigDecimal[] {BigDecimal.valueOf(Long.MAX_VALUE), BigDecimal
+              .valueOf(Long.MIN_VALUE)};
+      for (int i = 0; i < 2; i ++) {
+        BigDecimal value = decimalVector.getObject(i);
+        assertEquals(expectedValues[i], value);
+      }
+    }
+  }
+
+  @Test
+  public void setUsingArrowBufOfBEBytes() {
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(5, 2, 256), allocator);
+         ArrowBuf buf = allocator.buffer(9);) {
+      BigDecimal [] expectedValues = new BigDecimal[] {BigDecimal.valueOf(705.32), BigDecimal
+              .valueOf(-705.32), BigDecimal.valueOf(705.32)};
+      verifyWritingArrowBufWithBigEndianBytes(decimalVector, buf, expectedValues, 3);
+    }
+
+    try (Decimal256Vector decimalVector = TestUtils.newVector(Decimal256Vector.class, "decimal",
+            new ArrowType.Decimal(43, 2, 256), allocator);
+         ArrowBuf buf = allocator.buffer(45);) {
+      BigDecimal[] expectedValues = new BigDecimal[] {new BigDecimal("29823462983462893462934679234653450000000.63"),
+                                                      new BigDecimal("-2982346298346289346293467923465345.63"),
+                                                      new BigDecimal("2982346298346289346293467923465345.63")};
+      verifyWritingArrowBufWithBigEndianBytes(decimalVector, buf, expectedValues, 15);
+    }
+  }
+
+  private void verifyWritingArrowBufWithBigEndianBytes(Decimal256Vector decimalVector,
+                                                       ArrowBuf buf, BigDecimal[] expectedValues,
+                                                       int length) {
+    decimalVector.allocateNew();
+    for (int i = 0; i < expectedValues.length; i++) {
+      byte []bigEndianBytes = expectedValues[i].unscaledValue().toByteArray();

Review comment:
       The spacing in `byte []bigEndianBytes` looks off here.





[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508272146



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;

Review comment:
       Would it be better to use DecimalVector.BYTE_WIDTH here instead of the literal 16?





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507814509



##########
File path: cpp/src/arrow/array/validate.cc
##########
@@ -64,6 +64,13 @@ struct ValidateArrayVisitor {
     return Status::OK();
   }
 
+  Status Visit(const Decimal256Array& array) {

Review comment:
       Hmm, we could have a `BaseDecimalArray` class like we already have `BaseListArray` and `BaseBinaryArray`.





[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507830440



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];

Review comment:
       Nit: wrong indentation here.





[GitHub] [arrow] liyafan82 commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-716391341


   > This might have "broken" the spark integration builds: https://github.com/ursa-labs/crossbow/runs/1304128112
   > 
   > ```
   > Error: ] /spark/sql/catalyst/src/main/scala/org/apache/spark/sql/util/ArrowUtils.scala:47: not enough arguments for constructor Decimal: (x$1: Int, x$2: Int, x$3: Int)org.apache.arrow.vector.types.pojo.ArrowType.Decimal.
   > Unspecified value parameter x$3.
   > ```
   > 
   > (now I am not familiar enough with spark to know what kind of "broken" it is, but in any case the integration build is failing)
   
   @jorisvandenbossche Thanks for reporting the problem.
   The failure was caused by adding a new parameter to the `ArrowType.Decimal` constructor. Maybe we can fix it by restoring the previous two-argument constructor and marking it as deprecated.
   Let's open an issue for it. 
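
A rough sketch of what that could look like. `DecimalTypeSketch` is a hypothetical stand-in and not the real `ArrowType.Decimal` class, and defaulting the old constructor to 128 bits is an assumption about the intended behavior, not something decided in this thread.

```java
final class DecimalTypeSketch {
  private final int precision;
  private final int scale;
  private final int bitWidth;

  /**
   * Old two-argument form, kept so existing callers still compile.
   * Assumption: callers of the old form expect a 128-bit decimal.
   *
   * @deprecated use the constructor that takes an explicit bit width
   */
  @Deprecated
  DecimalTypeSketch(int precision, int scale) {
    this(precision, scale, 128);
  }

  /** New three-argument form with an explicit bit width. */
  DecimalTypeSketch(int precision, int scale, int bitWidth) {
    this.precision = precision;
    this.scale = scale;
    this.bitWidth = bitWidth;
  }

  int getPrecision() {
    return precision;
  }

  int getScale() {
    return scale;
  }

  int getBitWidth() {
    return bitWidth;
  }
}
```

Downstream code that still passes only precision and scale would keep compiling, with a deprecation warning, while new code can pass 256 explicitly.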



[GitHub] [arrow] liyafan82 commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-712676570


   > @liyafan82 I renamed to Decimal256, let me know if you have other comments on the Java side. @pitrou I'm still working through your comments.
   
   Made a first pass, and it mostly looks good.


----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508698381



##########
File path: java/vector/src/main/java/org/apache/arrow/vector/util/DecimalUtility.java
##########
@@ -119,34 +121,47 @@ public static boolean checkPrecisionAndScale(int decimalPrecision, int decimalSc
    * UnsupportedOperationException if the decimal size is greater than the Decimal vector byte
    * width.
    */
-  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index) {
+  public static void writeBigDecimalToArrowBuf(BigDecimal value, ArrowBuf bytebuf, int index, int byteWidth) {
     final byte[] bytes = value.unscaledValue().toByteArray();
-    writeByteArrayToArrowBufHelper(bytes, bytebuf, index);
+    writeByteArrayToArrowBufHelper(bytes, bytebuf, index, byteWidth);
   }
 
   /**
    * Write the given long to the ArrowBuf at the given value index.
    */
   public static void writeLongToArrowBuf(long value, ArrowBuf bytebuf, int index) {
-    final long addressOfValue = bytebuf.memoryAddress() + (long) index * DECIMAL_BYTE_LENGTH;
+    final long addressOfValue = bytebuf.memoryAddress() + (long) index * 16;

Review comment:
       I think that creates a circular dependency between classes?




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-712610825


   @liyafan82 I renamed to Decimal256, let me know if you have other comments on the Java side.  @pitrou I'm still working through your comments.


----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509364544



##########
File path: cpp/src/arrow/array/validate.cc
##########
@@ -64,6 +64,13 @@ struct ValidateArrayVisitor {
     return Status::OK();
   }
 
+  Status Visit(const Decimal256Array& array) {

Review comment:
       Ok.
   




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-714905593


   Mac CI failures seem unrelated. Going to merge.


----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507832411



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];
+#endif
+}
+
+BasicDecimal256& BasicDecimal256::operator*=(const BasicDecimal256& right) {
+  // Since the max value of BasicDecimal256 is supposed to be 1e76 - 1 and the
+  // min the negation taking the absolute values here should always be safe.
+  const bool negate = Sign() != right.Sign();
+  BasicDecimal256 x = BasicDecimal256::Abs(*this);
+  BasicDecimal256 y = BasicDecimal256::Abs(right);
+
+  uint128_t r_hi;
+  uint128_t r_lo;
+  std::array<uint64_t, 4> res{0, 0, 0, 0};
+  MultiplyUnsignedArray<4>(x.little_endian_array_, y.little_endian_array_, &res);
+  little_endian_array_ = res;
+  if (negate) {
+    Negate();
+  }
+  return *this;
+}
+
+DecimalStatus BasicDecimal256::Rescale(int32_t original_scale, int32_t new_scale,
+                                       BasicDecimal256* out) const {
+  if (original_scale == new_scale) {
+    return DecimalStatus::kSuccess;
+  }
+  // TODO: implement.
+  return DecimalStatus::kRescaleDataLoss;
+}
+
+BasicDecimal256 operator*(const BasicDecimal256& left, const BasicDecimal256& right) {
+  BasicDecimal256 result = left;
+  result *= right;
+  return result;
+}
+
+bool operator==(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return left.little_endian_array() == right.little_endian_array();
+}
+
+bool operator!=(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return left.little_endian_array() != right.little_endian_array();
+}
+
+bool operator<(const BasicDecimal256& left, const BasicDecimal256& right) {
+  const std::array<uint64_t, 4>& lhs = left.little_endian_array();
+  const std::array<uint64_t, 4>& rhs = right.little_endian_array();
+  return lhs[3] != rhs[3]
+             ? static_cast<int64_t>(lhs[3]) < static_cast<int64_t>(rhs[3])
+             : lhs[2] != rhs[2] ? lhs[2] < rhs[2]
+                                : lhs[1] != rhs[1] ? lhs[1] < rhs[1] : lhs[0] < rhs[0];
+}
+
+bool operator<=(const BasicDecimal256& left, const BasicDecimal256& right) {
+  return !operator>(left, right);

Review comment:
       Also, these indirections may be put in the `.h` as inline methods.
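
A rough sketch of what that suggestion could look like, assuming a hypothetical four-word value type `Wide256` in place of `BasicDecimal256`: only `operator<` carries real logic, and the other comparisons are one-line `inline` wrappers that can live in the header.

```cpp
#include <array>
#include <cstdint>

// Hypothetical stand-in for a 256-bit value; words[3] is the most significant
// word and is the only one compared as signed.
struct Wide256 {
  std::array<uint64_t, 4> words;
};

inline bool operator<(const Wide256& left, const Wide256& right) {
  if (left.words[3] != right.words[3]) {
    return static_cast<int64_t>(left.words[3]) <
           static_cast<int64_t>(right.words[3]);
  }
  for (int i = 2; i >= 0; --i) {
    if (left.words[i] != right.words[i]) return left.words[i] < right.words[i];
  }
  return false;  // all words equal
}

// Thin indirections: small enough to be defined inline in a header.
inline bool operator>(const Wide256& left, const Wide256& right) {
  return right < left;
}
inline bool operator<=(const Wide256& left, const Wide256& right) {
  return !(left > right);
}
inline bool operator>=(const Wide256& left, const Wide256& right) {
  return !(left < right);
}
```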




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-714233095


   I believe the archery test failure is unrelated; I opened https://issues.apache.org/jira/browse/ARROW-10367 to track it.


----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#issuecomment-713654633


   I posted two follow-up comments, but generally the C++ changes look good to me.
   I see that the CSV reader hasn't been updated; feel free to open a JIRA and I can do it later.


----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509864656



##########
File path: java/vector/src/main/codegen/templates/AbstractPromotableFieldWriter.java
##########
@@ -75,7 +75,7 @@ public void endList() {
 
   <#list vv.types as type><#list type.minor as minor><#assign name = minor.class?cap_first />
     <#assign fields = minor.fields!type.fields />
-  <#if minor.class != "Decimal">
+  <#if minor.class != "Decimal" && minor.class != "BigDecimal">

Review comment:
       BigDecimal -> Decimal256?




----------------------------------------------------------------

[GitHub] [arrow] liyafan82 commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

liyafan82 commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r506095989



##########
File path: java/vector/src/main/codegen/templates/UnionVector.java
##########
@@ -294,10 +294,10 @@ public StructVector getStruct() {
     }
     return ${uncappedName}Vector;
   }
-  <#if minor.class == "Decimal">
+  <#if minor.class?ends_with("Decimal")>
   public ${name}Vector get${name}Vector() {
     if (${uncappedName}Vector == null) {
-      throw new IllegalArgumentException("No Decimal Vector present. Provide ArrowType argument to create a new vector");
+      throw new IllegalArgumentException("No Decimal ${uncappedName} present. Provide ArrowType argument to create a new vector");

Review comment:
       It should be "No ${uncappedName} vector present ..." ?  




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507810468



##########
File path: cpp/src/arrow/array/array_dict_test.cc
##########
@@ -857,40 +857,48 @@ TEST(TestDecimalDictionaryBuilder, Basic) {
   ASSERT_TRUE(expected.Equals(result));
 }
 
-TEST(TestDecimalDictionaryBuilder, DoubleTableSize) {
-  const auto& decimal_type = arrow::decimal(21, 0);
+TEST(TestDecimal128DictionaryBuilder, Basic) {
+  TestDecimalDictionaryBuilderBasic<Decimal128>(arrow::decimal128(2, 0));
+}
+
+TEST(TestDecimal256DictionaryBuilder, Basic) {
+  TestDecimalDictionaryBuilderBasic<Decimal256>(arrow::decimal256(76, 0));
+}
 
+void TestDecimalDictionaryBuilderDoubleTableSize(
+    std::shared_ptr<DataType> decimal_type, FixedSizeBinaryBuilder& decimal_builder) {
   // Build the dictionary Array
   DictionaryBuilder<FixedSizeBinaryType> dict_builder(decimal_type);
 
   // Build expected data
-  Decimal128Builder decimal_builder(decimal_type);
   Int16Builder int_builder;
 
   // Fill with 1024 different values
   for (int64_t i = 0; i < 1024; i++) {
-    const uint8_t bytes[] = {0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             12,
-                             12,
-                             static_cast<uint8_t>(i / 128),
-                             static_cast<uint8_t>(i % 128)};
+    // Decimal256Builder takes 32 bytes, while Decimal128Builder takes only the first 16
+    // bytes.
+    const uint8_t bytes[32] = {0,

Review comment:
       Are we sure the remaining bytes will be zeroed?




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508791783



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -775,4 +833,119 @@ int32_t BasicDecimal128::CountLeadingBinaryZeros() const {
   }
 }
 
+#if ARROW_LITTLE_ENDIAN
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[0],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[3]})) {}
+#else
+BasicDecimal256::BasicDecimal256(const uint8_t* bytes)
+    : little_endian_array_(
+          std::array<uint64_t, 4>({reinterpret_cast<const uint64_t*>(bytes)[3],
+                                   reinterpret_cast<const uint64_t*>(bytes)[2],
+                                   reinterpret_cast<const uint64_t*>(bytes)[1],
+                                   reinterpret_cast<const uint64_t*>(bytes)[0]})) {
+#endif
+
+BasicDecimal256& BasicDecimal256::Negate() {
+  uint64_t carry = 1;
+  for (uint64_t& elem : little_endian_array_) {
+    elem = ~elem + carry;
+    carry &= (elem == 0);
+  }
+  return *this;
+}
+
+BasicDecimal256& BasicDecimal256::Abs() { return *this < 0 ? Negate() : *this; }
+
+BasicDecimal256 BasicDecimal256::Abs(const BasicDecimal256& in) {
+  BasicDecimal256 result(in);
+  return result.Abs();
+}
+
+std::array<uint8_t, 32> BasicDecimal256::ToBytes() const {
+  std::array<uint8_t, 32> out{{0}};
+  ToBytes(out.data());
+  return out;
+}
+
+void BasicDecimal256::ToBytes(uint8_t* out) const {
+  DCHECK_NE(out, nullptr);
+#if ARROW_LITTLE_ENDIAN
+  reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[0];
+  reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[1];
+  reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[2];
+  reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[3];
+#else
+    reinterpret_cast<int64_t*>(out)[0] = little_endian_array_[3];
+    reinterpret_cast<int64_t*>(out)[1] = little_endian_array_[2];
+    reinterpret_cast<int64_t*>(out)[2] = little_endian_array_[1];
+    reinterpret_cast<int64_t*>(out)[3] = little_endian_array_[0];

Review comment:
       fixed




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507834219



##########
File path: cpp/src/arrow/util/decimal_benchmark.cc
##########
@@ -148,6 +148,21 @@ static void BinaryMathOp(benchmark::State& state) {  // NOLINT non-const referen
   state.SetItemsProcessed(state.iterations() * kValueSize);
 }
 
+static void BinaryMathOp256(benchmark::State& state) {  // NOLINT non-const reference
+  std::vector<BasicDecimal256> v1, v2;
+  for (uint64_t x = 0; x < kValueSize; x++) {
+    v1.push_back(BasicDecimal256({100 + x, 100 + x, 100 + x, 100 + x}));
+    v2.push_back(BasicDecimal256({200 + x, 200 + x, 200 + x, 200 + x}));
+  }
+
+  for (auto _ : state) {
+    for (int x = 0; x < kValueSize; x += 5) {
+      benchmark::DoNotOptimize(v1[x + 2] * v2[x + 2]);

Review comment:
       Why only this line? Ideally we would do the same operations as in `BinaryMathOp`.




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508777323



##########
File path: cpp/src/arrow/array/array_dict_test.cc
##########
@@ -857,40 +857,48 @@ TEST(TestDecimalDictionaryBuilder, Basic) {
   ASSERT_TRUE(expected.Equals(result));
 }
 
-TEST(TestDecimalDictionaryBuilder, DoubleTableSize) {
-  const auto& decimal_type = arrow::decimal(21, 0);
+TEST(TestDecimal128DictionaryBuilder, Basic) {
+  TestDecimalDictionaryBuilderBasic<Decimal128>(arrow::decimal128(2, 0));
+}
+
+TEST(TestDecimal256DictionaryBuilder, Basic) {
+  TestDecimalDictionaryBuilderBasic<Decimal256>(arrow::decimal256(76, 0));
+}
 
+void TestDecimalDictionaryBuilderDoubleTableSize(
+    std::shared_ptr<DataType> decimal_type, FixedSizeBinaryBuilder& decimal_builder) {
   // Build the dictionary Array
   DictionaryBuilder<FixedSizeBinaryType> dict_builder(decimal_type);
 
   // Build expected data
-  Decimal128Builder decimal_builder(decimal_type);
   Int16Builder int_builder;
 
   // Fill with 1024 different values
   for (int64_t i = 0; i < 1024; i++) {
-    const uint8_t bytes[] = {0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             0,
-                             12,
-                             12,
-                             static_cast<uint8_t>(i / 128),
-                             static_cast<uint8_t>(i % 128)};
+    // Decimal256Builder takes 32 bytes, while Decimal128Builder takes only the first 16
+    // bytes.
+    const uint8_t bytes[32] = {0,

Review comment:
       According to https://en.cppreference.com/w/c/language/array_initialization, it should be.
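
A small sketch of the rule being cited, relying only on standard C++: when a braced initializer lists fewer elements than the array holds, the remaining elements are zero-initialized. The helper name is hypothetical and exists only for this illustration.

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper: builds a 32-byte buffer from a partial initializer list.
inline std::array<uint8_t, 32> MakePartiallyInitialized(uint8_t a, uint8_t b) {
  // Only the first two elements are listed; the other 30 are zero-initialized.
  const uint8_t bytes[32] = {a, b};
  std::array<uint8_t, 32> out{};
  for (int i = 0; i < 32; ++i) {
    out[i] = bytes[i];
  }
  return out;
}
```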




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r509419492



##########
File path: cpp/src/arrow/type.cc
##########
@@ -131,6 +133,7 @@ std::string ToString(Type::type id) {
     TO_STRING_CASE(FLOAT)
     TO_STRING_CASE(DOUBLE)
     TO_STRING_CASE(DECIMAL)

Review comment:
       Yes, that is the intent. I'll be opening a bunch of JIRAs to track down usages and remove them. Partial list so far:
   - CSV
   - Ruby/Gobj bindings
   - Implementation for Parquet
   - Finish Python implementation (rescaling is needed)
   - Gandiva 
   - Computation kernels (in particular casts)
   
   Likely some others ...




----------------------------------------------------------------

[GitHub] [arrow] pitrou commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

pitrou commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r507828684



##########
File path: cpp/src/arrow/util/basic_decimal.cc
##########
@@ -254,67 +254,125 @@ BasicDecimal128& BasicDecimal128::operator>>=(uint32_t bits) {
 
 namespace {
 
-// TODO: Remove this guard once it's used by BasicDecimal256
-#ifndef ARROW_USE_NATIVE_INT128
-// This method losslessly multiplies x and y into a 128 bit unsigned integer
-// whose high bits will be stored in hi and low bits in lo.
-void ExtendAndMultiplyUint64(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
+// Convenience wrapper type over 128 bit unsigned integers. We opt not to
+// replace the uint128_t type in int128_internal.h because it would require
+// significantly more implementation work to be done. This class merely
+// provides the minimum necessary set of functions to perform 128+ bit
+// multiplication operations when there may or may not be native support.
 #ifdef ARROW_USE_NATIVE_INT128
-  const __uint128_t r = static_cast<__uint128_t>(x) * y;
-  *lo = r & kInt64Mask;
-  *hi = r >> 64;
+struct uint128_t {
+  uint128_t() {}
+  uint128_t(uint64_t hi, uint64_t lo) : val_((static_cast<__uint128_t>(hi) << 64) | lo) {}
+  explicit uint128_t(const BasicDecimal128& decimal) {
+    val_ = (static_cast<__uint128_t>(decimal.high_bits()) << 64) | decimal.low_bits();
+  }
+
+  uint64_t hi() { return val_ >> 64; }
+  uint64_t lo() { return val_ & kInt64Mask; }
+
+  uint128_t& operator+=(const uint128_t& other) {
+    val_ += other.val_;
+    return *this;
+  }
+
+  uint128_t& operator*=(const uint128_t& other) {
+    val_ *= other.val_;
+    return *this;
+  }
+
+  __uint128_t val_;
+};
+
 #else
-  // If we can't use a native fallback, perform multiplication
+// Multiply two 64 bit word components into a 128 bit result, with high bits
+// stored in hi and low bits in lo.
+inline void ExtendAndMultiply(uint64_t x, uint64_t y, uint64_t* hi, uint64_t* lo) {
+  // Perform multiplication on two 64 bit words x and y into a 128 bit result
   // by splitting up x and y into 32 bit high/low bit components,
   // allowing us to represent the multiplication as
   // x * y = x_lo * y_lo + x_hi * y_lo * 2^32 + y_hi * x_lo * 2^32
-  // + x_hi * y_hi * 2^64.
+  // + x_hi * y_hi * 2^64
   //
-  // Now, consider the final output as lo_lo || lo_hi || hi_lo || hi_hi.
+  // Now, consider the final output as lo_lo || lo_hi || hi_lo || hi_hi
   // Therefore,
   // lo_lo is (x_lo * y_lo)_lo,
   // lo_hi is ((x_lo * y_lo)_hi + (x_hi * y_lo)_lo + (x_lo * y_hi)_lo)_lo,
   // hi_lo is ((x_hi * y_hi)_lo + (x_hi * y_lo)_hi + (x_lo * y_hi)_hi)_hi,
   // hi_hi is (x_hi * y_hi)_hi
-  const uint64_t x_lo = x & kIntMask;
-  const uint64_t y_lo = y & kIntMask;
+  const uint64_t x_lo = x & kInt32Mask;
+  const uint64_t y_lo = y & kInt32Mask;
   const uint64_t x_hi = x >> 32;
   const uint64_t y_hi = y >> 32;
 
   const uint64_t t = x_lo * y_lo;
-  const uint64_t t_lo = t & kIntMask;
+  const uint64_t t_lo = t & kInt32Mask;
   const uint64_t t_hi = t >> 32;
 
   const uint64_t u = x_hi * y_lo + t_hi;
-  const uint64_t u_lo = u & kIntMask;
+  const uint64_t u_lo = u & kInt32Mask;
   const uint64_t u_hi = u >> 32;
 
   const uint64_t v = x_lo * y_hi + u_lo;
   const uint64_t v_hi = v >> 32;
 
   *hi = x_hi * y_hi + u_hi + v_hi;
-  *lo = (v << 32) | t_lo;
-#endif
+  *lo = (v << 32) + t_lo;
 }
-#endif
 
-void MultiplyUint128(uint64_t x_hi, uint64_t x_lo, uint64_t y_hi, uint64_t y_lo,
-                     uint64_t* hi, uint64_t* lo) {
-#ifdef ARROW_USE_NATIVE_INT128
-  const __uint128_t x = (static_cast<__uint128_t>(x_hi) << 64) | x_lo;
-  const __uint128_t y = (static_cast<__uint128_t>(y_hi) << 64) | y_lo;
-  const __uint128_t r = x * y;
-  *lo = r & kInt64Mask;
-  *hi = r >> 64;
-#else
-  // To perform 128 bit multiplication without a native fallback
-  // we first perform lossless 64 bit multiplication of the low
-  // bits, and then add x_hi * y_lo and x_lo * y_hi to the high
-  // bits. Note that we can skip adding x_hi * y_hi because it
-  // always will be over 128 bits.
-  ExtendAndMultiplyUint64(x_lo, y_lo, hi, lo);
-  *hi += (x_hi * y_lo) + (x_lo * y_hi);
+struct uint128_t {
+  uint128_t() {}
+  uint128_t(uint64_t hi, uint64_t lo) : hi_(hi), lo_(lo) {}
+  explicit uint128_t(const BasicDecimal128& decimal) {
+    hi_ = decimal.high_bits();
+    lo_ = decimal.low_bits();
+  }
+
+  uint64_t hi() const { return hi_; }
+  uint64_t lo() const { return lo_; }
+
+  uint128_t& operator+=(const uint128_t& other) {
+    // To deduce the carry bit, we perform "65 bit" addition on the low bits and
+    // seeing if the resulting high bit is 1. This is accomplished by shifting the
+    // low bits to the right by 1 (chopping off the lowest bit), then adding 1 if the
+    // result of adding the two chopped bits would have produced a carry.
+    uint64_t carry = (((lo_ & other.lo_) & 1) + (lo_ >> 1) + (other.lo_ >> 1)) >> 63;
+    hi_ += other.hi_ + carry;
+    lo_ += other.lo_;
+    return *this;
+  }
+
+  uint128_t& operator*=(const uint128_t& other) {
+    uint128_t r;
+    ExtendAndMultiply(lo_, other.lo_, &r.hi_, &r.lo_);
+    r.hi_ += (hi_ * other.lo_) + (lo_ * other.hi_);
+    *this = r;
+    return *this;
+  }
+
+  uint64_t hi_;
+  uint64_t lo_;
+};
 #endif
+
+// Multiplies two N * 64 bit unsigned integer types, represented by a uint64_t
+// array into a same sized output. Elements in the array should be in
+// little endian order, and output will be the same. Overflow in multiplication
+// will result in the lower N * 64 bits of the result being set.
+template <int N>
+inline void MultiplyUnsignedArray(const std::array<uint64_t, N>& lh,
+                                  const std::array<uint64_t, N>& rh,
+                                  std::array<uint64_t, N>* result) {
+  for (int j = 0; j < N; ++j) {
+    uint64_t carry = 0;
+    for (int i = 0; i < N - j; ++i) {
+      uint128_t tmp(lh[i]);
+      tmp *= uint128_t(rh[j]);

Review comment:
       I don't see a `uint128_t(uint64_t)` constructor, so will this go through the `uint128_t(const BasicDecimal128&)` constructor?
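
For context, a sketch of how such an expression can compile without a `uint64_t` constructor. It assumes the decimal type has an implicit constructor from integers; whether `BasicDecimal128` actually does is an assumption here. `Dec128Like` and `Wide128` are hypothetical stand-ins, not Arrow types.

```cpp
#include <cstdint>

// Hypothetical stand-in for a decimal type with an implicit integer constructor.
struct Dec128Like {
  Dec128Like(uint64_t value) : high(0), low(value) {}  // implicit on purpose
  int64_t high;
  uint64_t low;
};

// Hypothetical stand-in for the wrapper: no constructor takes a bare uint64_t.
struct Wide128 {
  Wide128() : hi(0), lo(0) {}
  Wide128(uint64_t hi_in, uint64_t lo_in) : hi(hi_in), lo(lo_in) {}
  explicit Wide128(const Dec128Like& d)
      : hi(static_cast<uint64_t>(d.high)), lo(d.low) {}
  uint64_t hi;
  uint64_t lo;
};

// Compiles: the uint64_t argument is implicitly converted to a Dec128Like
// temporary, which is then passed to the explicit Wide128(const Dec128Like&)
// constructor.
inline Wide128 FromWord(uint64_t word) { return Wide128(word); }
```

Under those assumptions the result is correct but takes a detour through a temporary; a dedicated `uint64_t` constructor would make the intent explicit.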




----------------------------------------------------------------

[GitHub] [arrow] emkornfield commented on a change in pull request #8475: ARROW-9747: [Java][C++] Initial Support for 256-bit Decimals

Posted by GitBox <gi...@apache.org>.

emkornfield commented on a change in pull request #8475:
URL: https://github.com/apache/arrow/pull/8475#discussion_r508779161



##########
File path: cpp/src/arrow/array/array_test.cc
##########
@@ -2426,7 +2433,43 @@ TEST_P(DecimalTest, WithNulls) {
   this->TestCreate(precision, draw, valid_bytes, 2);
 }
 
-INSTANTIATE_TEST_SUITE_P(DecimalTest, DecimalTest, ::testing::Range(1, 38));
+INSTANTIATE_TEST_SUITE_P(Decimal128Test, Decimal128Test, ::testing::Range(1, 38));
+
+using Decimal256Test = DecimalTest<Decimal256Type>;
+
+TEST_P(Decimal256Test, NoNulls) {
+  int32_t precision = GetParam();
+  std::vector<Decimal256> draw = {Decimal256(1), Decimal256(-2), Decimal256(2389),
+                                  Decimal256(4), Decimal256(-12348)};
+  std::vector<uint8_t> valid_bytes = {true, true, true, true, true};
+  this->TestCreate(precision, draw, valid_bytes, 0);
+  this->TestCreate(precision, draw, valid_bytes, 2);
+}
+
+TEST_P(Decimal256Test, WithNulls) {
+  int32_t precision = GetParam();
+  std::vector<Decimal256> draw = {Decimal256(1), Decimal256(2),  Decimal256(-1),
+                                  Decimal256(4), Decimal256(-1), Decimal256(1),
+                                  Decimal256(2)};
+  Decimal256 big;  // (pow(2, 255) - 1) / pow(10, 38)
+  ASSERT_OK_AND_ASSIGN(big,
+                       Decimal256::FromString("578960446186580977117854925043439539266."
+                                              "34992332820282019728792003956564819967"));
+  draw.push_back(big);
+
+  Decimal256 big_negative;  // -pow(2, 255) / pow(10, 38)
+  ASSERT_OK_AND_ASSIGN(big_negative,
+                       Decimal256::FromString("-578960446186580977117854925043439539266."
+                                              "34992332820282019728792003956564819968"));
+  draw.push_back(big_negative);
+
+  std::vector<uint8_t> valid_bytes = {true, true, false, true, false,
+                                      true, true, true,  true};
+  this->TestCreate(precision, draw, valid_bytes, 0);
+  this->TestCreate(precision, draw, valid_bytes, 2);
+}
+
+INSTANTIATE_TEST_SUITE_P(Decimal256Test, Decimal256Test, ::testing::Range(1, 76));

Review comment:
       Testing every value between 1 and 38 for Decimal128 appears to be the previous [behavior](https://github.com/apache/arrow/pull/8475/files#diff-0d86cf45e6dbc2db7c716f28ac6f7c6a1d938f5dc91611a2394f8b250c178026L2429). I think these tests are fairly lightweight, but I'll update for Decimal256.




----------------------------------------------------------------