Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/14 20:56:31 UTC
[GitHub] [arrow] seddonm1 opened a new pull request #8918: ARROW-10907: [Rust] Correct date64 behavior
seddonm1 opened a new pull request #8918:
URL: https://github.com/apache/arrow/pull/8918
This PR fixes the UTF8 -> Date64 cast so that it parses strings in the format `%Y-%m-%dT%H:%M:%S`, rather than parsing `%Y-%m-%d` and filling in a `00:00:00` time component.
It aligns with https://github.com/apache/arrow/pull/8913.
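To make the described behavior concrete, here is a minimal std-only sketch of turning a `%Y-%m-%dT%H:%M:%S` string into a Date64 value (milliseconds since the Unix epoch). It is not the code in this PR, which uses chrono; the names `utf8_to_date64`, `parse_fields`, and `days_from_civil` are hypothetical, and the validation is deliberately lenient.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (Howard Hinnant's `days_from_civil` algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    // Treat January and February as the end of the previous year.
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400); // year of era, 0..=399
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1; // day of (shifted) year
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy; // day of era
    era * 146_097 + doe - 719_468
}

/// Split `s` on `sep` into exactly three integer fields.
fn parse_fields(s: &str, sep: char) -> Option<[i64; 3]> {
    let mut it = s.split(sep);
    let a = it.next()?.parse::<i64>().ok()?;
    let b = it.next()?.parse::<i64>().ok()?;
    let c = it.next()?.parse::<i64>().ok()?;
    if it.next().is_some() {
        return None;
    }
    Some([a, b, c])
}

/// Hypothetical helper: parse `YYYY-MM-DDTHH:MM:SS` into milliseconds
/// since the Unix epoch. A date-only string is rejected (returns `None`)
/// because the time component is required.
/// Assumption: days are only checked against 1..=31, not per month.
fn utf8_to_date64(s: &str) -> Option<i64> {
    let (date, time) = s.split_once('T')?;
    let [y, mo, d] = parse_fields(date, '-')?;
    let [h, mi, sec] = parse_fields(time, ':')?;
    let in_range = (1..=12).contains(&mo)
        && (1..=31).contains(&d)
        && (0..24).contains(&h)
        && (0..60).contains(&mi)
        && (0..60).contains(&sec);
    if !in_range {
        return None;
    }
    let secs = days_from_civil(y, mo, d) * 86_400 + h * 3_600 + mi * 60 + sec;
    Some(secs * 1_000)
}
```

With this sketch, `utf8_to_date64("2020-12-14T20:56:31")` yields `Some(1_607_979_391_000)`, while the date-only string `"2020-12-14"` yields `None`.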
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [arrow] seddonm1 commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
seddonm1 commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543648282
##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
     Ok(Arc::new(builder.finish()) as ArrayRef)
 }
 Date64(DateUnit::Millisecond) => {
-    use chrono::{NaiveDate, NaiveTime};
-    let zero_time = NaiveTime::from_hms(0, 0, 0);
+    use chrono::NaiveDateTime;
     let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
     let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
     for i in 0..string_array.len() {
         if string_array.is_null(i) {
             builder.append_null()?;
         } else {
-            match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-            {
-                Ok(date) => builder.append_value(
-                    date.and_time(zero_time).timestamp_millis() as i64,
-                )?,
+            match NaiveDateTime::parse_from_str(
+                string_array.value(i),
Review comment:
I have updated the PR with your suggested changes. Thank you very much; this helps me learn.
[GitHub] [arrow] alamb commented on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-745587393
Thanks @Dandandan and @seddonm1 👍
[GitHub] [arrow] Dandandan commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543485838
##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
     Ok(Arc::new(builder.finish()) as ArrayRef)
 }
 Date64(DateUnit::Millisecond) => {
-    use chrono::{NaiveDate, NaiveTime};
-    let zero_time = NaiveTime::from_hms(0, 0, 0);
+    use chrono::NaiveDateTime;
     let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
     let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
     for i in 0..string_array.len() {
         if string_array.is_null(i) {
             builder.append_null()?;
         } else {
-            match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-            {
-                Ok(date) => builder.append_value(
-                    date.and_time(zero_time).timestamp_millis() as i64,
-                )?,
+            match NaiveDateTime::parse_from_str(
+                string_array.value(i),
Review comment:
See the latest commits in #8913. There is a default parser for the same format that avoids re-parsing the format string for each value (so it is faster) and also supports a decimal fraction of seconds.
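The comment does not name the parser, so the following is only a std-only sketch of the two properties it mentions: a fixed layout that needs no format string to be interpreted per value, and an optional decimal fraction of seconds. The names `parse_timestamp_millis`, `digits`, and `days_from_civil` are hypothetical, and this is not chrono's or Arrow's implementation.

```rust
/// Days since 1970-01-01 for a proleptic Gregorian date
/// (Howard Hinnant's `days_from_civil` algorithm).
fn days_from_civil(y: i64, m: i64, d: i64) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = y.div_euclid(400);
    let yoe = y.rem_euclid(400);
    let mp = (m + 9) % 12; // month index with March = 0
    let doy = (153 * mp + 2) / 5 + d - 1;
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146_097 + doe - 719_468
}

/// Read exactly `n` ASCII digits starting at byte offset `start`.
fn digits(b: &[u8], start: usize, n: usize) -> Option<i64> {
    let mut v = 0i64;
    for &c in b.get(start..start + n)? {
        if !c.is_ascii_digit() {
            return None;
        }
        v = v * 10 + i64::from(c - b'0');
    }
    Some(v)
}

/// Hypothetical fixed-layout parser for `YYYY-MM-DDTHH:MM:SS[.fraction]`.
/// Field positions are hard-coded, so nothing is re-interpreted per value.
/// Fraction digits beyond milliseconds are truncated.
/// Assumption: days are only checked against 1..=31, not per month.
fn parse_timestamp_millis(s: &str) -> Option<i64> {
    let b = s.as_bytes();
    if b.len() < 19
        || b[4] != b'-'
        || b[7] != b'-'
        || b[10] != b'T'
        || b[13] != b':'
        || b[16] != b':'
    {
        return None;
    }
    let (y, mo, d) = (digits(b, 0, 4)?, digits(b, 5, 2)?, digits(b, 8, 2)?);
    let (h, mi, sec) = (digits(b, 11, 2)?, digits(b, 14, 2)?, digits(b, 17, 2)?);
    if !(1..=12).contains(&mo) || !(1..=31).contains(&d) || h > 23 || mi > 59 || sec > 59 {
        return None;
    }
    let mut millis: i64 = 0;
    if b.len() > 19 {
        // Anything after the seconds must be '.' followed by at least one digit.
        if b[19] != b'.' || b.len() == 20 {
            return None;
        }
        let frac = &b[20..];
        if !frac.iter().all(u8::is_ascii_digit) {
            return None;
        }
        // Keep the first three fraction digits, right-padding with zeros.
        let mut scale: i64 = 100;
        for &c in frac.iter().take(3) {
            millis += i64::from(c - b'0') * scale;
            scale /= 10;
        }
    }
    let secs = days_from_civil(y, mo, d) * 86_400 + h * 3_600 + mi * 60 + sec;
    Some(secs * 1_000 + millis)
}
```

For example, `parse_timestamp_millis("2020-12-14T20:56:31.5")` yields `Some(1_607_979_391_500)`, and the same string without the fraction yields `Some(1_607_979_391_000)`.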
[GitHub] [arrow] github-actions[bot] commented on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-744710560
https://issues.apache.org/jira/browse/ARROW-10907
[GitHub] [arrow] alamb closed pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
alamb closed pull request #8918:
URL: https://github.com/apache/arrow/pull/8918
[GitHub] [arrow] seddonm1 commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
seddonm1 commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543636737
##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
     Ok(Arc::new(builder.finish()) as ArrayRef)
 }
 Date64(DateUnit::Millisecond) => {
-    use chrono::{NaiveDate, NaiveTime};
-    let zero_time = NaiveTime::from_hms(0, 0, 0);
+    use chrono::NaiveDateTime;
     let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
     let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
     for i in 0..string_array.len() {
         if string_array.is_null(i) {
             builder.append_null()?;
         } else {
-            match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-            {
-                Ok(date) => builder.append_value(
-                    date.and_time(zero_time).timestamp_millis() as i64,
-                )?,
+            match NaiveDateTime::parse_from_str(
+                string_array.value(i),
Review comment:
Thanks @Dandandan, will update.
[GitHub] [arrow] codecov-io edited a comment on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-744711262
# [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=h1) Report
> Merging [#8918](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=desc) (1863c08) into [master](https://codecov.io/gh/apache/arrow/commit/408e5be0bc3533d27e28793f669839904d3a76d9?el=desc) (408e5be) will **decrease** coverage by `23.45%`.
> The diff coverage is `100.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8918/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #8918 +/- ##
===========================================
- Coverage 75.35% 51.90% -23.46%
===========================================
Files 177 173 -4
Lines 40821 31082 -9739
===========================================
- Hits 30762 16132 -14630
- Misses 10059 14950 +4891
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/datafusion/src/physical\_plan/expressions.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2V4cHJlc3Npb25zLnJz) | `0.00% <ø> (ø)` | |
| [rust/arrow/src/compute/kernels/cast.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2Nhc3QucnM=) | `96.21% <100.00%> (-0.13%)` | :arrow_down: |
| [rust/parquet/src/column/page.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vcGFnZS5ycw==) | `0.00% <0.00%> (-98.69%)` | :arrow_down: |
| [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
| [rust/parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9hcnJvd193cml0ZXIucnM=) | `0.00% <0.00%> (-97.34%)` | :arrow_down: |
| [rust/parquet/src/basic.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iYXNpYy5ycw==) | `0.00% <0.00%> (-97.27%)` | :arrow_down: |
| [rust/parquet/src/file/properties.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3Byb3BlcnRpZXMucnM=) | `0.00% <0.00%> (-95.73%)` | :arrow_down: |
| [rust/parquet/src/file/serialized\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3NlcmlhbGl6ZWRfcmVhZGVyLnJz) | `0.00% <0.00%> (-95.62%)` | :arrow_down: |
| [rust/parquet/src/file/writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3dyaXRlci5ycw==) | `0.00% <0.00%> (-95.11%)` | :arrow_down: |
| [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `0.00% <0.00%> (-94.54%)` | :arrow_down: |
| ... and [44 more](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=footer). Last update [cbb1ed5...1863c08](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
[GitHub] [arrow] codecov-io commented on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64
Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-744711262
# [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=h1) Report
> Merging [#8918](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=desc) (332c94f) into [master](https://codecov.io/gh/apache/arrow/commit/7f3794cb7cb295e353442b5914e404b83915cf88?el=desc) (7f3794c) will **decrease** coverage by `23.79%`.
> The diff coverage is `100.00%`.
[![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8918/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree)
```diff
@@ Coverage Diff @@
## master #8918 +/- ##
===========================================
- Coverage 75.69% 51.90% -23.80%
===========================================
Files 181 173 -8
Lines 41011 31082 -9929
===========================================
- Hits 31043 16132 -14911
- Misses 9968 14950 +4982
```
| [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [rust/arrow/src/compute/kernels/cast.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2Nhc3QucnM=) | `96.21% <100.00%> (-0.13%)` | :arrow_down: |
| [rust/parquet/src/column/page.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vcGFnZS5ycw==) | `0.00% <0.00%> (-98.69%)` | :arrow_down: |
| [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
| [rust/parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9hcnJvd193cml0ZXIucnM=) | `0.00% <0.00%> (-97.34%)` | :arrow_down: |
| [rust/parquet/src/basic.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iYXNpYy5ycw==) | `0.00% <0.00%> (-97.27%)` | :arrow_down: |
| [rust/parquet/src/file/properties.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3Byb3BlcnRpZXMucnM=) | `0.00% <0.00%> (-95.73%)` | :arrow_down: |
| [rust/parquet/src/file/serialized\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3NlcmlhbGl6ZWRfcmVhZGVyLnJz) | `0.00% <0.00%> (-95.62%)` | :arrow_down: |
| [rust/parquet/src/file/writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3dyaXRlci5ycw==) | `0.00% <0.00%> (-95.11%)` | :arrow_down: |
| [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `0.00% <0.00%> (-94.54%)` | :arrow_down: |
| [rust/parquet/src/file/statistics.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3N0YXRpc3RpY3MucnM=) | `0.00% <0.00%> (-94.41%)` | :arrow_down: |
| ... and [49 more](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree-more) | |
------
[Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=continue).
> **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
> `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
> Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=footer). Last update [7f3794c...332c94f](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).