You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/06/14 07:57:22 UTC

[GitHub] [arrow] kou opened a new pull request, #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

kou opened a new pull request, #13377:
URL: https://github.com/apache/arrow/pull/13377

   This requires bigdecimal 3.1.0 or later for BigDecimal#scale.
   
   Arrow::Array.new([BigDecimal]) detects the max precision and scale
   from BigDecimals and creates suitable Arrow::Decimal{128,256}DataType
   automatically.
   
   This also truncates given BigDecimal when the specified
   Arrow::Decimal{128,256}DataType doesn't have enough and scale. This
   still doesn't check precision. If an user specifies data that have too
   much precision, the data are used as-is.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou commented on PR #13377:
URL: https://github.com/apache/arrow/pull/13377#issuecomment-1163778275

   +1


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] mrkn commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
mrkn commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r901225314


##########
ruby/red-arrow/lib/arrow/array-builder.rb:
##########
@@ -121,15 +126,28 @@ def detect_builder_info(value, builder_info)
             detected: true,
           }
         when BigDecimal
-          if value.to_arrow.is_a?(Decimal128)
-            {
-              builder: Decimal128ArrayBuilder.new,
-            }
-          else
+          builder_info ||= {}
+          if builder_info[:builder]
             {
-              builder: Decimal256ArrayBuilder.new,
+              builder: StringArrayBuilder.new,
               detected: true,
             }
+          else
+            precision = [builder_info[:precision] || 0, value.precision].max
+            scale = [builder_info[:scale] || 0, value.scale].max

Review Comment:
   I see.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r898577939


##########
ruby/red-arrow/lib/arrow/array-builder.rb:
##########
@@ -121,15 +126,28 @@ def detect_builder_info(value, builder_info)
             detected: true,
           }
         when BigDecimal
-          if value.to_arrow.is_a?(Decimal128)
-            {
-              builder: Decimal128ArrayBuilder.new,
-            }
-          else
+          builder_info ||= {}
+          if builder_info[:builder]
             {
-              builder: Decimal256ArrayBuilder.new,
+              builder: StringArrayBuilder.new,
               detected: true,
             }
+          else
+            precision = [builder_info[:precision] || 0, value.precision].max
+            scale = [builder_info[:scale] || 0, value.scale].max

Review Comment:
   Could you provide an example case?
   `Arrow::Array.new([BigDecimal("1.1"), BigDecimal("11.11")])`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou merged pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou merged PR #13377:
URL: https://github.com/apache/arrow/pull/13377


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r903355360


##########
ruby/red-arrow/test/test-array-builder.rb:
##########
@@ -68,6 +68,56 @@ def assert_build(builder_class, raw_array)
                      ])
       end
 
+      test("decimal + string") do
+        raw_array = [BigDecimal("10.1"), "10.1"]
+        array = Arrow::ArrayBuilder.build(raw_array)
+        assert_equal(raw_array.collect(&:to_s), array.to_a)
+      end
+
+      test("decimal128") do
+        values = [
+          BigDecimal("10.1"),
+          BigDecimal("1.11"),
+          BigDecimal("1"),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal128DataType.new(3, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("10.1"),
+                         BigDecimal("1.11"),
+                         BigDecimal("1"),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("decimal256") do
+        values = [
+          BigDecimal("1" * 40 + ".1"),
+          BigDecimal("1" * 38 + ".11"),
+          BigDecimal("1" * 37),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal256DataType.new(41, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("1" * 40 + ".1"),
+                         BigDecimal("1" * 38 + ".11"),
+                         BigDecimal("1" * 37),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+

Review Comment:
   I've added a validation for `BigDecimal::NAN` and `BigDecimal::INFINITY` to `Arrow::Decimal{128,256}ArrayBuilder`. If users append `BigDecimal::NAN` or `BigDecimal::INFINITY`, `FloatDomainError` is raised.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] mrkn commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
mrkn commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r901053447


##########
ruby/red-arrow/lib/arrow/array-builder.rb:
##########
@@ -121,15 +126,28 @@ def detect_builder_info(value, builder_info)
             detected: true,
           }
         when BigDecimal
-          if value.to_arrow.is_a?(Decimal128)
-            {
-              builder: Decimal128ArrayBuilder.new,
-            }
-          else
+          builder_info ||= {}
+          if builder_info[:builder]
             {
-              builder: Decimal256ArrayBuilder.new,
+              builder: StringArrayBuilder.new,
               detected: true,
             }
+          else
+            precision = [builder_info[:precision] || 0, value.precision].max
+            scale = [builder_info[:scale] || 0, value.scale].max

Review Comment:
   I misreading the code.  It's OK now.  But, I guess we may need to permit to pass precision and scale to `Arrow::Array.new` as optional arguments like`Arrow::Array.new([BigDecimal("1.1"), BigDecimal("11.11")], scale: 1)`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] mrkn commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
mrkn commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r901225084


##########
ruby/red-arrow/test/test-array-builder.rb:
##########
@@ -68,6 +68,56 @@ def assert_build(builder_class, raw_array)
                      ])
       end
 
+      test("decimal + string") do
+        raw_array = [BigDecimal("10.1"), "10.1"]
+        array = Arrow::ArrayBuilder.build(raw_array)
+        assert_equal(raw_array.collect(&:to_s), array.to_a)
+      end
+
+      test("decimal128") do
+        values = [
+          BigDecimal("10.1"),
+          BigDecimal("1.11"),
+          BigDecimal("1"),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal128DataType.new(3, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("10.1"),
+                         BigDecimal("1.11"),
+                         BigDecimal("1"),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("decimal256") do
+        values = [
+          BigDecimal("1" * 40 + ".1"),
+          BigDecimal("1" * 38 + ".11"),
+          BigDecimal("1" * 37),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal256DataType.new(41, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("1" * 40 + ".1"),
+                         BigDecimal("1" * 38 + ".11"),
+                         BigDecimal("1" * 37),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+

Review Comment:
   OK, It makes sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r901197288


##########
ruby/red-arrow/test/test-array-builder.rb:
##########
@@ -68,6 +68,56 @@ def assert_build(builder_class, raw_array)
                      ])
       end
 
+      test("decimal + string") do
+        raw_array = [BigDecimal("10.1"), "10.1"]
+        array = Arrow::ArrayBuilder.build(raw_array)
+        assert_equal(raw_array.collect(&:to_s), array.to_a)
+      end
+
+      test("decimal128") do
+        values = [
+          BigDecimal("10.1"),
+          BigDecimal("1.11"),
+          BigDecimal("1"),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal128DataType.new(3, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("10.1"),
+                         BigDecimal("1.11"),
+                         BigDecimal("1"),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("decimal256") do
+        values = [
+          BigDecimal("1" * 40 + ".1"),
+          BigDecimal("1" * 38 + ".11"),
+          BigDecimal("1" * 37),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal256DataType.new(41, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("1" * 40 + ".1"),
+                         BigDecimal("1" * 38 + ".11"),
+                         BigDecimal("1" * 37),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+

Review Comment:
   We should build an Apache Arrow array with `Arrow::ArrayBuilder.build` instead of raising an exception. I'll use `Arrow::StringArray` for the case.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13377:
URL: https://github.com/apache/arrow/pull/13377#issuecomment-1154855972

   :warning: Ticket **has not been started in JIRA**, please click 'Start Progress'.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] github-actions[bot] commented on pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #13377:
URL: https://github.com/apache/arrow/pull/13377#issuecomment-1154855949

   https://issues.apache.org/jira/browse/ARROW-14518


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] mrkn commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
mrkn commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r901053764


##########
ruby/red-arrow/test/test-array-builder.rb:
##########
@@ -68,6 +68,56 @@ def assert_build(builder_class, raw_array)
                      ])
       end
 
+      test("decimal + string") do
+        raw_array = [BigDecimal("10.1"), "10.1"]
+        array = Arrow::ArrayBuilder.build(raw_array)
+        assert_equal(raw_array.collect(&:to_s), array.to_a)
+      end
+
+      test("decimal128") do
+        values = [
+          BigDecimal("10.1"),
+          BigDecimal("1.11"),
+          BigDecimal("1"),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal128DataType.new(3, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("10.1"),
+                         BigDecimal("1.11"),
+                         BigDecimal("1"),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("decimal256") do
+        values = [
+          BigDecimal("1" * 40 + ".1"),
+          BigDecimal("1" * 38 + ".11"),
+          BigDecimal("1" * 37),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal256DataType.new(41, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("1" * 40 + ".1"),
+                         BigDecimal("1" * 38 + ".11"),
+                         BigDecimal("1" * 37),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+

Review Comment:
   Throwing `FloatDomainError`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r901196693


##########
ruby/red-arrow/lib/arrow/array-builder.rb:
##########
@@ -121,15 +126,28 @@ def detect_builder_info(value, builder_info)
             detected: true,
           }
         when BigDecimal
-          if value.to_arrow.is_a?(Decimal128)
-            {
-              builder: Decimal128ArrayBuilder.new,
-            }
-          else
+          builder_info ||= {}
+          if builder_info[:builder]
             {
-              builder: Decimal256ArrayBuilder.new,
+              builder: StringArrayBuilder.new,
               detected: true,
             }
+          else
+            precision = [builder_info[:precision] || 0, value.precision].max
+            scale = [builder_info[:scale] || 0, value.scale].max

Review Comment:
   If users know an expected data type, they should use `Arrow::XXXArray.new` instead of `Arrow::Array.new` such as `Arrow::Decimal128Array.new({scale: ..., precision: ...}, [BigDecimal(...). ...])`.
   
   `Arrow::Array.new` may not build a `Arrow::Decimal128Array`. In the case, the given optional arguments are just ignored. It may confuse users.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] kou commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
kou commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r898578609


##########
ruby/red-arrow/test/test-array-builder.rb:
##########
@@ -68,6 +68,56 @@ def assert_build(builder_class, raw_array)
                      ])
       end
 
+      test("decimal + string") do
+        raw_array = [BigDecimal("10.1"), "10.1"]
+        array = Arrow::ArrayBuilder.build(raw_array)
+        assert_equal(raw_array.collect(&:to_s), array.to_a)
+      end
+
+      test("decimal128") do
+        values = [
+          BigDecimal("10.1"),
+          BigDecimal("1.11"),
+          BigDecimal("1"),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal128DataType.new(3, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("10.1"),
+                         BigDecimal("1.11"),
+                         BigDecimal("1"),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("decimal256") do
+        values = [
+          BigDecimal("1" * 40 + ".1"),
+          BigDecimal("1" * 38 + ".11"),
+          BigDecimal("1" * 37),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal256DataType.new(41, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("1" * 40 + ".1"),
+                         BigDecimal("1" * 38 + ".11"),
+                         BigDecimal("1" * 37),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+

Review Comment:
   Oh, I didn't notice them.
   What is the expected behavior with them? It seems that decimal classes in C++ can't represent `NAN` and `INFINITY`.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [arrow] mrkn commented on a diff in pull request #13377: ARROW-14518: [Ruby] Add support for Arrow::Array.new([BigDecimal])

Posted by GitBox <gi...@apache.org>.
mrkn commented on code in PR #13377:
URL: https://github.com/apache/arrow/pull/13377#discussion_r898144157


##########
ruby/red-arrow/lib/arrow/array-builder.rb:
##########
@@ -121,15 +126,28 @@ def detect_builder_info(value, builder_info)
             detected: true,
           }
         when BigDecimal
-          if value.to_arrow.is_a?(Decimal128)
-            {
-              builder: Decimal128ArrayBuilder.new,
-            }
-          else
+          builder_info ||= {}
+          if builder_info[:builder]
             {
-              builder: Decimal256ArrayBuilder.new,
+              builder: StringArrayBuilder.new,
               detected: true,
             }
+          else
+            precision = [builder_info[:precision] || 0, value.precision].max
+            scale = [builder_info[:scale] || 0, value.scale].max

Review Comment:
   These 2 lines don't permit overwriting the value's precision and the scale with smaller values by specifying the precision and the scale in `builder_info`.  Is it intentional?



##########
ruby/red-arrow/test/test-array-builder.rb:
##########
@@ -68,6 +68,56 @@ def assert_build(builder_class, raw_array)
                      ])
       end
 
+      test("decimal + string") do
+        raw_array = [BigDecimal("10.1"), "10.1"]
+        array = Arrow::ArrayBuilder.build(raw_array)
+        assert_equal(raw_array.collect(&:to_s), array.to_a)
+      end
+
+      test("decimal128") do
+        values = [
+          BigDecimal("10.1"),
+          BigDecimal("1.11"),
+          BigDecimal("1"),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal128DataType.new(3, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("10.1"),
+                         BigDecimal("1.11"),
+                         BigDecimal("1"),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+
+      test("decimal256") do
+        values = [
+          BigDecimal("1" * 40 + ".1"),
+          BigDecimal("1" * 38 + ".11"),
+          BigDecimal("1" * 37),
+        ]
+        array = Arrow::Array.new(values)
+        data_type = Arrow::Decimal256DataType.new(41, 2)
+        assert_equal({
+                       data_type: data_type,
+                       values: [
+                         BigDecimal("1" * 40 + ".1"),
+                         BigDecimal("1" * 38 + ".11"),
+                         BigDecimal("1" * 37),
+                       ],
+                     },
+                     {
+                       data_type: array.value_data_type,
+                       values: array.to_a,
+                     })
+      end
+

Review Comment:
   Is it unnecessary to test with `BigDecimal::NAN` and `BigDecimal::INFINITY`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org