Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2022/04/12 14:24:39 UTC
[GitHub] [arrow-rs] alamb opened a new pull request, #1546: Add CI check for full validation mode
alamb opened a new pull request, #1546:
URL: https://github.com/apache/arrow-rs/pull/1546
# Which issue does this PR close?
Closes https://github.com/apache/arrow-rs/issues/1544
# Rationale for this change
We have a full validation mode that checks the full contents of an array when it is created, but it is deliberately not called in many places in arrow for performance reasons: https://docs.rs/arrow/11.1.0/arrow/array/struct.ArrayData.html#method.validate_full
This leads to two possible issues:
1. Lack of test coverage of arrays constructed within arrow
2. Possible lack of test coverage of the validation routine itself
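To illustrate why a full check is more expensive than a basic one, here is a minimal sketch that uses a simplified stand-in for a string array. `StringData` and its fields are hypothetical and are not arrow's `ArrayData`; only the split between a cheap `validate` and an exhaustive `validate_full` mirrors the idea described above.

```rust
/// Hypothetical, simplified stand-in for a variable-length string array.
/// Element `i` occupies bytes `offsets[i]..offsets[i + 1]` of `values`.
pub struct StringData {
    pub len: usize,
    pub offsets: Vec<i32>,
    pub values: Vec<u8>,
}

impl StringData {
    /// Cheap check: constant time, looks only at buffer sizes.
    pub fn validate(&self) -> Result<(), String> {
        if self.offsets.len() != self.len + 1 {
            return Err(format!(
                "expected {} offsets, found {}",
                self.len + 1,
                self.offsets.len()
            ));
        }
        Ok(())
    }

    /// Full check: linear time, walks every offset and every byte.
    pub fn validate_full(&self) -> Result<(), String> {
        self.validate()?;
        for i in 0..self.len {
            let (start, end) = (self.offsets[i], self.offsets[i + 1]);
            // Offsets must be non-negative and non-decreasing.
            if start < 0 || end < start {
                return Err(format!("element {i}: invalid offsets {start}..{end}"));
            }
            let (start, end) = (start as usize, end as usize);
            // The byte range must lie inside the values buffer.
            let bytes = self.values.get(start..end).ok_or_else(|| {
                format!("element {i}: offsets {start}..{end} exceed the values buffer")
            })?;
            // Each element must be valid UTF-8.
            if std::str::from_utf8(bytes).is_err() {
                return Err(format!("element {i}: invalid UTF-8"));
            }
        }
        Ok(())
    }
}
```

In this sketch, data with out-of-order offsets or invalid UTF-8 passes `validate` but is rejected by `validate_full`, which is the kind of problem a code path that skips the full check would never surface.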
# What changes are included in this PR?
1. Add a `force_validate` feature flag that forces the full validation check for *all* array creations (defaults to off)
2. Add a new CI check that runs with this flag on
# Are there any user-facing changes?
New optional feature `force_validate`
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#discussion_r848738135
##########
arrow/src/compute/kernels/filter.rs:
##########
@@ -1692,6 +1692,9 @@ mod tests {
}
#[test]
+ // Fails when validation enabled
Review Comment:
:cry:
[GitHub] [arrow-rs] alamb commented on pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
alamb commented on PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#issuecomment-1099082157
I will try and polish this up shortly
[GitHub] [arrow-rs] codecov-commenter commented on pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#issuecomment-1096842544
# [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1546?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
> Merging [#1546](https://codecov.io/gh/apache/arrow-rs/pull/1546?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (19dc05a) into [master](https://codecov.io/gh/apache/arrow-rs/commit/68038f595b62202906d9a6235575b3a236c09546?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (68038f5) will **decrease** coverage by `0.00%`.
> The diff coverage is `0.00%`.
> :exclamation: Current head 19dc05a differs from pull request most recent head 58e3c28. Consider uploading reports for the commit 58e3c28 to get more accurate results
```diff
@@            Coverage Diff             @@
##           master    #1546      +/-   ##
==========================================
- Coverage   82.82%   82.82%   -0.01%
==========================================
  Files         190      190
  Lines       54941    54943       +2
==========================================
- Hits        45507    45506       -1
- Misses       9434     9437       +3
```
| [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/1546?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
|---|---|---|
| [arrow/src/array/array\_binary.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2JpbmFyeS5ycw==) | `92.93% <ø> (ø)` | |
| [arrow/src/array/array\_boolean.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2Jvb2xlYW4ucnM=) | `93.18% <ø> (ø)` | |
| [arrow/src/array/array\_list.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X2xpc3QucnM=) | `95.52% <ø> (ø)` | |
| [arrow/src/array/array\_primitive.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2FycmF5X3ByaW1pdGl2ZS5ycw==) | `94.85% <ø> (ø)` | |
| [arrow/src/array/data.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L2RhdGEucnM=) | `82.98% <0.00%> (-0.15%)` | :arrow_down: |
| [arrow/src/datatypes/datatype.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9kYXRhdHlwZS5ycw==) | `66.40% <0.00%> (-0.40%)` | :arrow_down: |
| [arrow/src/datatypes/field.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2RhdGF0eXBlcy9maWVsZC5ycw==) | `53.79% <0.00%> (-0.31%)` | :arrow_down: |
| [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/1546/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `93.56% <0.00%> (+0.18%)` | :arrow_up: |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1546?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/1546?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [68038f5...58e3c28](https://codecov.io/gh/apache/arrow-rs/pull/1546?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
[GitHub] [arrow-rs] viirya commented on a diff in pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
viirya commented on code in PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#discussion_r850005874
##########
arrow/src/compute/kernels/filter.rs:
##########
@@ -1692,6 +1692,9 @@ mod tests {
}
#[test]
+ // Fails when validation enabled
Review Comment:
Going to fix it in #1567
[GitHub] [arrow-rs] alamb commented on a diff in pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#discussion_r848881395
##########
arrow/src/compute/kernels/filter.rs:
##########
@@ -1692,6 +1692,9 @@ mod tests {
}
#[test]
+ // Fails when validation enabled
Review Comment:
At least we know there is a problem, so it is progress 🤷. We'll get it fixed.
[GitHub] [arrow-rs] alamb commented on a diff in pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
alamb commented on code in PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#discussion_r848523045
##########
arrow/src/array/data.rs:
##########
@@ -286,15 +286,20 @@ impl ArrayData {
             Some(null_count) => null_count,
         };
         let null_bitmap = null_bit_buffer.map(Bitmap::from);
-        Self {
+        let new_self = Self {
             data_type,
             len,
             null_count,
             offset,
             buffers,
             child_data,
             null_bitmap,
-        }
+        };
+
+        // Provide a force_validate mode
+        #[cfg(feature = "force_validate")]
+        new_self.validate_full().unwrap();
Review Comment:
Here is the validation call
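The gating pattern in the diff above can be sketched in isolation as follows. This is a simplified illustration, not the arrow implementation: `Data`, its fields, and the check inside `validate_full` are hypothetical, and the constructor is written as a safe function only to keep the example short.

```rust
/// Hypothetical, simplified array data: `len` logical elements backed by `values`.
pub struct Data {
    pub len: usize,
    pub values: Vec<i64>,
}

impl Data {
    /// Exhaustive consistency check (here: the buffer must hold at least `len` values).
    pub fn validate_full(&self) -> Result<(), String> {
        if self.values.len() < self.len {
            return Err(format!(
                "len is {} but only {} values are present",
                self.len,
                self.values.len()
            ));
        }
        Ok(())
    }

    /// Builds `Data` without checking it; the caller is trusted to pass consistent input.
    pub fn new_unchecked(len: usize, values: Vec<i64>) -> Self {
        let new_self = Self { len, values };

        // This statement is compiled in only when the crate is built with the
        // `force_validate` feature, so default builds pay no validation cost.
        #[cfg(feature = "force_validate")]
        new_self.validate_full().unwrap();

        new_self
    }
}
```

For the `cfg` attribute to take effect under Cargo, the crate would need to declare the feature (for example `force_validate = []` under `[features]` in `Cargo.toml`) and be built or tested with `--features force_validate`. With the feature off, `new_unchecked` returns whatever it was given; with it on, inconsistent input panics at construction time, which is what lets a CI run expose invalid arrays.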
[GitHub] [arrow-rs] alamb merged pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
alamb merged PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546
[GitHub] [arrow-rs] tustvold commented on a diff in pull request #1546: Add CI check for full validation mode
Posted by GitBox <gi...@apache.org>.
tustvold commented on code in PR #1546:
URL: https://github.com/apache/arrow-rs/pull/1546#discussion_r848735135
##########
arrow/src/array/data.rs:
##########
@@ -286,15 +286,20 @@ impl ArrayData {
             Some(null_count) => null_count,
         };
         let null_bitmap = null_bit_buffer.map(Bitmap::from);
-        Self {
+        let new_self = Self {
             data_type,
             len,
             null_count,
             offset,
             buffers,
             child_data,
             null_bitmap,
-        }
+        };
+
+        // Provide a force_validate mode
+        #[cfg(feature = "force_validate")]
+        new_self.validate_full().unwrap();
Review Comment:
:heart: