You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by "berkaysynnada (via GitHub)" <gi...@apache.org> on 2023/11/21 10:51:30 UTC

[I] Negatively Resulting Interval Subtractions [arrow-rs]

berkaysynnada opened a new issue, #5105:
URL: https://github.com/apache/arrow-rs/issues/5105

   **Describe the bug**
   <!--
   A clear and concise description of what the bug is.
   -->
   
   When subtracting `IntervalDayTimeType` instances, `make_value()` function of `IntervalDayTimeType` 
   
   https://github.com/apache/arrow-rs/blob/df69ef57d055453c399fa925ad315d19211d7ab2/arrow-array/src/types.rs#L849C4-L868C6
   
   creates the new instance of difference. However, this function cannot properly handle the negative inputs (When the day or millisecond part of the difference is negative).
   
   **To Reproduce**
   <!--
   Steps to reproduce the behavior:
   -->
   
   `let a = IntervalDayTimeArray::from(vec![1]);`
   `let b = IntervalDayTimeArray::from(vec![2]);`
   `let result = sub_wrapping(&a, &b).unwrap();`
   `assert_eq!(result.as_ref(), &IntervalDayTimeArray::from(vec![-1]));`
   
   It currently results as `PrimitiveArray<Interval(DayTime)>[4294967295] `
   
   **Expected behavior**
   <!--
   A clear and concise description of what you expected to happen.
   -->
   
   It should give `PrimitiveArray<Interval(DayTime)>[-1] `
   
   **Additional context**
   <!--
   Add any other context about the problem here.
   -->
   
   I believe the same problem exists for `IntervalMonthDayNanoType`'s `make_value()` implementation.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1824424257

   I think we have 2 datatypes carrying risk now; decimals and intervals. Since decimals have a total order and intervals don't (I mean they cannot be ordered without some assumptions), we should follow different paths:
   1) Decimals: They can continue as other ordinary types: https://github.com/apache/arrow-rs/blob/200e8c80084442d9579e00967e407cd83191565d/arrow-ord/src/cmp.rs#L207-L221 We would just need another struct that implements ArrayOrd with the capability of decimal comparison. 
   
   2) Interval: They need their own `apply()` function like `apply_partial()` (alternative to: https://github.com/apache/arrow-rs/blob/200e8c80084442d9579e00967e407cd83191565d/arrow-ord/src/cmp.rs#L289-L297). The logic would be specialized to intervals, but would use the same vectorized operations, possibly with some duplications. We cannot use ArrayOrd for intervals since there exist values of Interval A and Interval B, we can get both A<B=false and B<A=false.
   
   What do you think of such support?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1824256778

   Hi @tustvold. Currently, intervals having more than one field (`DayTime` and `MonthDayNano`) may give incorrect results as you can imagine (especially if one of the least-significant parts is a negative value). This behavior may cause silent bugs. Do you think we should open an issue and resolve this? It seems a little dangerous to keep it this way. Since `ArrayOrd::is_lt` returns a boolean, I think we can't have special handling that always stays on the safe side.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1820845139

   > I should fix my problem by fixing DF ScalarValue
   
   FWIW I don't think intervals have a well-behaved ordering relation, the arrow-ord kernels don't support them


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1824261977

   I'm not familiar with this ArrayOrd construction but if it is solely parameterised on the ArrowNativeType and not the DataType it is likely incorrect for more than just intervals. Decimals or timestamps with timezones come to mind.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1820774365

   I think this is correct as the returned value should be 0 days and -1 millis, which when packed into an i64 is equal to `u32::MAX` or `4_294_967_295u32`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1824528007

   > Looking at the linked code, decimals are supported provided they have the same precision and scale.
   
   Yes, I couldn't realize precision and scale are embedded in the type. However, I believe we can still implement what I suggested for intervals. I am planning to provide a small example to illustrate this.
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada closed issue #5105: Negatively Resulting Interval Subtractions
URL: https://github.com/apache/arrow-rs/issues/5105


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "tustvold (via GitHub)" <gi...@apache.org>.
tustvold commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1824441950

   I'm not entirely sure what is being proposed here, it is insufficient for ordering of kernels to be solely based on their generic types, but the arrow-ord kernels don't do this, as they check the DataType.
   
   Looking at the linked code, decimals are supported provided they have the same precision and scale.
   
   As for intervals I didn't think we supported them, however, looking at the code it will currently order them based on comparison of the i32/i64/i128. This is potentially surprising, but it is a total order and may be the best we can do here... :sweat_smile: 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Re: [I] Negatively Resulting Interval Subtractions [arrow-rs]

Posted by "berkaysynnada (via GitHub)" <gi...@apache.org>.
berkaysynnada commented on issue #5105:
URL: https://github.com/apache/arrow-rs/issues/5105#issuecomment-1820837811

   > I think this is correct as the returned value should be 0 days and -1 millis, which when packed into an i64 is equal to `u32::MAX` or `4_294_967_295`.
   > 
   > `-1_i64` would correspond to `-1 days -1 millis`
   
   Oh yes, you are right. I should fix my problem by fixing DF ScalarValue `partial_cmp`s. Thanks 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org