Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/14 20:56:31 UTC

[GitHub] [arrow] seddonm1 opened a new pull request #8918: ARROW-10907: [Rust] Correct date64 behavior

seddonm1 opened a new pull request #8918:
URL: https://github.com/apache/arrow/pull/8918


   This PR fixes the UTF8 -> Date64 cast so that it parses the full `%Y-%m-%dT%H:%M:%S` timestamp, rather than parsing only `%Y-%m-%d` and filling in a `00:00:00` time component.
   
   It aligns with https://github.com/apache/arrow/pull/8913.
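
To make the conversion concrete: a Date64 value stores milliseconds since the Unix epoch, so a string such as `1970-01-01T00:00:01` should map to `1000`. The PR itself relies on chrono's `NaiveDateTime`; the sketch below deliberately avoids external crates and hand-rolls the same arithmetic, so `utf8_to_date64_millis` and its helpers are hypothetical names, not part of Arrow or chrono. It also assumes seconds are limited to 0..=59.

```rust
/// Days from 1970-01-01 to the given proleptic Gregorian date
/// (the widely used "days from civil" arithmetic).
fn days_from_civil(year: i64, month: i64, day: i64) -> i64 {
    // Count years from March so the leap day falls at the end of the year.
    let y = if month <= 2 { year - 1 } else { year };
    let era = y.div_euclid(400);
    let year_of_era = y.rem_euclid(400);
    let shifted_month = (month + 9) % 12; // March = 0, ..., February = 11
    let day_of_year = (153 * shifted_month + 2) / 5 + day - 1;
    let day_of_era = year_of_era * 365 + year_of_era / 4 - year_of_era / 100 + day_of_year;
    era * 146_097 + day_of_era - 719_468
}

/// Number of days in the given month, accounting for leap years.
fn days_in_month(year: i64, month: i64) -> i64 {
    match month {
        4 | 6 | 9 | 11 => 30,
        2 if year % 4 == 0 && (year % 100 != 0 || year % 400 == 0) => 29,
        2 => 28,
        _ => 31,
    }
}

/// Hypothetical helper: parses `YYYY-MM-DDTHH:MM:SS` into milliseconds since
/// the Unix epoch, the unit a Date64 value stores. Returns `None` otherwise.
fn utf8_to_date64_millis(s: &str) -> Option<i64> {
    let b = s.as_bytes();
    let separators_ok = b.len() == 19
        && b[4] == b'-' && b[7] == b'-' && b[10] == b'T'
        && b[13] == b':' && b[16] == b':';
    if !separators_ok {
        return None;
    }
    // Parse one all-digit field; `get` also rejects non-ASCII boundaries.
    let field = |range: std::ops::Range<usize>| -> Option<i64> {
        let text = s.get(range)?;
        if text.bytes().all(|c| c.is_ascii_digit()) {
            text.parse().ok()
        } else {
            None
        }
    };
    let (year, month, day) = (field(0..4)?, field(5..7)?, field(8..10)?);
    let (hour, minute, second) = (field(11..13)?, field(14..16)?, field(17..19)?);
    if !(1..=12).contains(&month) || day < 1 || day > days_in_month(year, month) {
        return None;
    }
    if hour > 23 || minute > 59 || second > 59 {
        return None;
    }
    let seconds = ((days_from_civil(year, month, day) * 24 + hour) * 60 + minute) * 60 + second;
    Some(seconds * 1000)
}
```

In this sketch, `utf8_to_date64_millis("1970-01-01T00:00:01")` yields `Some(1000)`, while a date-only string such as `"2020-12-14"` yields `None` because the time component is now required.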


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] seddonm1 commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
seddonm1 commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543648282



##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
                 Ok(Arc::new(builder.finish()) as ArrayRef)
             }
             Date64(DateUnit::Millisecond) => {
-                use chrono::{NaiveDate, NaiveTime};
-                let zero_time = NaiveTime::from_hms(0, 0, 0);
+                use chrono::NaiveDateTime;
                 let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
                 let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
                 for i in 0..string_array.len() {
                     if string_array.is_null(i) {
                         builder.append_null()?;
                     } else {
-                        match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-                        {
-                            Ok(date) => builder.append_value(
-                                date.and_time(zero_time).timestamp_millis() as i64,
-                            )?,
+                        match NaiveDateTime::parse_from_str(
+                            string_array.value(i),

Review comment:
       I have updated the PR with your suggested changes. Thank you very much, this helps me learn.







[GitHub] [arrow] alamb commented on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-745587393


   Thanks @Dandandan and @seddonm1 👍





[GitHub] [arrow] Dandandan commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543485838



##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
                 Ok(Arc::new(builder.finish()) as ArrayRef)
             }
             Date64(DateUnit::Millisecond) => {
-                use chrono::{NaiveDate, NaiveTime};
-                let zero_time = NaiveTime::from_hms(0, 0, 0);
+                use chrono::NaiveDateTime;
                 let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
                 let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
                 for i in 0..string_array.len() {
                     if string_array.is_null(i) {
                         builder.append_null()?;
                     } else {
-                        match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-                        {
-                            Ok(date) => builder.append_value(
-                                date.and_time(zero_time).timestamp_millis() as i64,
-                            )?,
+                        match NaiveDateTime::parse_from_str(
+                            string_array.value(i),

Review comment:
       See the latest commits in #8913. There is a default parser that avoids re-parsing the format string for each value (faster) and also supports a decimal fraction of seconds.







[GitHub] [arrow] github-actions[bot] commented on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-744710560


   https://issues.apache.org/jira/browse/ARROW-10907





[GitHub] [arrow] alamb closed pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #8918:
URL: https://github.com/apache/arrow/pull/8918


   





[GitHub] [arrow] seddonm1 commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
seddonm1 commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543636737



##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
                 Ok(Arc::new(builder.finish()) as ArrayRef)
             }
             Date64(DateUnit::Millisecond) => {
-                use chrono::{NaiveDate, NaiveTime};
-                let zero_time = NaiveTime::from_hms(0, 0, 0);
+                use chrono::NaiveDateTime;
                 let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
                 let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
                 for i in 0..string_array.len() {
                     if string_array.is_null(i) {
                         builder.append_null()?;
                     } else {
-                        match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-                        {
-                            Ok(date) => builder.append_value(
-                                date.and_time(zero_time).timestamp_millis() as i64,
-                            )?,
+                        match NaiveDateTime::parse_from_str(
+                            string_array.value(i),

Review comment:
       Thanks @Dandandan, will update.










[GitHub] [arrow] codecov-io edited a comment on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-744711262


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=h1) Report
   > Merging [#8918](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=desc) (1863c08) into [master](https://codecov.io/gh/apache/arrow/commit/408e5be0bc3533d27e28793f669839904d3a76d9?el=desc) (408e5be) will **decrease** coverage by `23.45%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8918/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master    #8918       +/-   ##
   ===========================================
   - Coverage   75.35%   51.90%   -23.46%     
   ===========================================
     Files         177      173        -4     
     Lines       40821    31082     -9739     
   ===========================================
   - Hits        30762    16132    -14630     
   - Misses      10059    14950     +4891     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/datafusion/src/physical\_plan/expressions.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9kYXRhZnVzaW9uL3NyYy9waHlzaWNhbF9wbGFuL2V4cHJlc3Npb25zLnJz) | `0.00% <ø> (ø)` | |
   | [rust/arrow/src/compute/kernels/cast.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2Nhc3QucnM=) | `96.21% <100.00%> (-0.13%)` | :arrow_down: |
   | [rust/parquet/src/column/page.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vcGFnZS5ycw==) | `0.00% <0.00%> (-98.69%)` | :arrow_down: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
   | [rust/parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9hcnJvd193cml0ZXIucnM=) | `0.00% <0.00%> (-97.34%)` | :arrow_down: |
   | [rust/parquet/src/basic.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iYXNpYy5ycw==) | `0.00% <0.00%> (-97.27%)` | :arrow_down: |
   | [rust/parquet/src/file/properties.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3Byb3BlcnRpZXMucnM=) | `0.00% <0.00%> (-95.73%)` | :arrow_down: |
   | [rust/parquet/src/file/serialized\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3NlcmlhbGl6ZWRfcmVhZGVyLnJz) | `0.00% <0.00%> (-95.62%)` | :arrow_down: |
   | [rust/parquet/src/file/writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3dyaXRlci5ycw==) | `0.00% <0.00%> (-95.11%)` | :arrow_down: |
   | [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `0.00% <0.00%> (-94.54%)` | :arrow_down: |
   | ... and [44 more](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=footer). Last update [cbb1ed5...1863c08](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] Dandandan commented on a change in pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
Dandandan commented on a change in pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#discussion_r543485838



##########
File path: rust/arrow/src/compute/kernels/cast.rs
##########
@@ -401,19 +401,18 @@ pub fn cast(array: &ArrayRef, to_type: &DataType) -> Result<ArrayRef> {
                 Ok(Arc::new(builder.finish()) as ArrayRef)
             }
             Date64(DateUnit::Millisecond) => {
-                use chrono::{NaiveDate, NaiveTime};
-                let zero_time = NaiveTime::from_hms(0, 0, 0);
+                use chrono::NaiveDateTime;
                 let string_array = array.as_any().downcast_ref::<StringArray>().unwrap();
                 let mut builder = PrimitiveBuilder::<Date64Type>::new(string_array.len());
                 for i in 0..string_array.len() {
                     if string_array.is_null(i) {
                         builder.append_null()?;
                     } else {
-                        match NaiveDate::parse_from_str(string_array.value(i), "%Y-%m-%d")
-                        {
-                            Ok(date) => builder.append_value(
-                                date.and_time(zero_time).timestamp_millis() as i64,
-                            )?,
+                        match NaiveDateTime::parse_from_str(
+                            string_array.value(i),

Review comment:
       See the latest commits in #8913. There is a default parser for the same format that avoids re-parsing the format string for each value (faster) and also supports a decimal fraction of seconds.
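
The "decimal fraction" mentioned here is presumably a fractional-seconds suffix such as `2020-12-14T20:56:31.5`. Because Date64 stores milliseconds, at most three fractional digits can affect the result. The sketch below is a dependency-free illustration of handling that suffix, not chrono's implementation: `split_fraction_millis` is a hypothetical name, and truncating (rather than rounding) digits beyond the third is an assumption made here.

```rust
/// Hypothetical helper: splits an optional decimal fraction off a timestamp
/// string and converts it to whole milliseconds, truncating extra digits.
/// Returns the part before the '.' together with the milliseconds.
fn split_fraction_millis(s: &str) -> Option<(&str, i64)> {
    match s.split_once('.') {
        // No fraction present: the whole string is the base timestamp.
        None => Some((s, 0)),
        Some((base, frac)) => {
            // Reject an empty fraction ("...31.") or non-digit characters.
            if frac.is_empty() || !frac.bytes().all(|c| c.is_ascii_digit()) {
                return None;
            }
            // Pad or truncate to exactly three digits:
            // ".5" becomes 500 and ".123456" becomes 123.
            let mut millis = 0i64;
            for i in 0..3 {
                let digit = frac.as_bytes().get(i).map_or(0, |&c| i64::from(c - b'0'));
                millis = millis * 10 + digit;
            }
            Some((base, millis))
        }
    }
}
```

The returned base string can then be parsed as a whole-second timestamp, with the returned milliseconds added to the result.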







[GitHub] [arrow] codecov-io commented on pull request #8918: ARROW-10907: [Rust] Fix Cast UTF8 to Date64

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #8918:
URL: https://github.com/apache/arrow/pull/8918#issuecomment-744711262


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=h1) Report
   > Merging [#8918](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=desc) (332c94f) into [master](https://codecov.io/gh/apache/arrow/commit/7f3794cb7cb295e353442b5914e404b83915cf88?el=desc) (7f3794c) will **decrease** coverage by `23.79%`.
   > The diff coverage is `100.00%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/8918/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree)
   
   ```diff
   @@             Coverage Diff             @@
   ##           master    #8918       +/-   ##
   ===========================================
   - Coverage   75.69%   51.90%   -23.80%     
   ===========================================
     Files         181      173        -8     
     Lines       41011    31082     -9929     
   ===========================================
   - Hits        31043    16132    -14911     
   - Misses       9968    14950     +4982     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/arrow/src/compute/kernels/cast.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY29tcHV0ZS9rZXJuZWxzL2Nhc3QucnM=) | `96.21% <100.00%> (-0.13%)` | :arrow_down: |
   | [rust/parquet/src/column/page.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9jb2x1bW4vcGFnZS5ycw==) | `0.00% <0.00%> (-98.69%)` | :arrow_down: |
   | [rust/parquet/src/record/api.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9yZWNvcmQvYXBpLnJz) | `0.00% <0.00%> (-98.15%)` | :arrow_down: |
   | [rust/parquet/src/arrow/arrow\_writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9hcnJvd193cml0ZXIucnM=) | `0.00% <0.00%> (-97.34%)` | :arrow_down: |
   | [rust/parquet/src/basic.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9iYXNpYy5ycw==) | `0.00% <0.00%> (-97.27%)` | :arrow_down: |
   | [rust/parquet/src/file/properties.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3Byb3BlcnRpZXMucnM=) | `0.00% <0.00%> (-95.73%)` | :arrow_down: |
   | [rust/parquet/src/file/serialized\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3NlcmlhbGl6ZWRfcmVhZGVyLnJz) | `0.00% <0.00%> (-95.62%)` | :arrow_down: |
   | [rust/parquet/src/file/writer.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3dyaXRlci5ycw==) | `0.00% <0.00%> (-95.11%)` | :arrow_down: |
   | [rust/parquet/src/arrow/record\_reader.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9hcnJvdy9yZWNvcmRfcmVhZGVyLnJz) | `0.00% <0.00%> (-94.54%)` | :arrow_down: |
   | [rust/parquet/src/file/statistics.rs](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9maWxlL3N0YXRpc3RpY3MucnM=) | `0.00% <0.00%> (-94.41%)` | :arrow_down: |
   | ... and [49 more](https://codecov.io/gh/apache/arrow/pull/8918/diff?src=pr&el=tree-more) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=footer). Last update [7f3794c...332c94f](https://codecov.io/gh/apache/arrow/pull/8918?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   

