Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/01/13 23:10:10 UTC

[GitHub] [arrow] nealrichardson opened a new pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

nealrichardson opened a new pull request #9203:
URL: https://github.com/apache/arrow/pull/9203


   I confirmed in R that the date column is detected correctly. I've also written a test, but I need to rebuild before I can run it, so it may need more work.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] pitrou commented on pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #9203:
URL: https://github.com/apache/arrow/pull/9203#issuecomment-760097655


   Please update the docs for https://arrow.apache.org/docs/cpp/csv.html#data-types and https://arrow.apache.org/docs/python/csv.html





[GitHub] [arrow] jorisvandenbossche commented on pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
jorisvandenbossche commented on pull request #9203:
URL: https://github.com/apache/arrow/pull/9203#issuecomment-760007070


   I added a commit that updates the tests for the new behaviour, *in case* we decide we are OK with that.
   
   Generally, I think we should do the best inference from Arrow's point of view, which means inferring a date type for a date string.
   
   The only reason I can see not to do it is that, for people converting the data to pandas afterwards, dates are not that well supported in pandas (at this point). That said, there is a `to_pandas(..., date_as_object=False)` keyword that a user can specify to still get a datetime64 dtype in pandas instead of datetime.date objects.





[GitHub] [arrow] github-actions[bot] commented on pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9203:
URL: https://github.com/apache/arrow/pull/9203#issuecomment-759816313


   https://issues.apache.org/jira/browse/ARROW-11247





[GitHub] [arrow] nealrichardson commented on a change in pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on a change in pull request #9203:
URL: https://github.com/apache/arrow/pull/9203#discussion_r556938141



##########
File path: cpp/src/arrow/csv/column_builder_test.cc
##########
@@ -15,13 +15,14 @@
 // specific language governing permissions and limitations
 // under the License.
 
+#include "arrow/csv/column_builder.h"
+
+#include <gtest/gtest.h>
+

Review comment:
       My editor runs clang-format, and I don't know why it moved this include.







[GitHub] [arrow] pitrou commented on pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
pitrou commented on pull request #9203:
URL: https://github.com/apache/arrow/pull/9203#issuecomment-760254982


   Travis-CI build: https://travis-ci.com/github/pitrou/arrow/builds/213173530
   





[GitHub] [arrow] pitrou closed pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
pitrou closed pull request #9203:
URL: https://github.com/apache/arrow/pull/9203


   





[GitHub] [arrow] nealrichardson commented on pull request #9203: ARROW-11247: [C++] Infer date32 columns in CSV

Posted by GitBox <gi...@apache.org>.
nealrichardson commented on pull request #9203:
URL: https://github.com/apache/arrow/pull/9203#issuecomment-759830864


   Some Python tests are failing with this change. I think the tests should be updated where they assume dates will be parsed (suboptimally, IMO) as timestamps, but @jorisvandenbossche, maybe you can review and weigh in.

