You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/05/19 08:56:20 UTC

[GitHub] [arrow-rs] kazuk opened a new pull request #326: allow to read non-standard CSV

kazuk opened a new pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326


   # Which issue does this PR close?
   
   Closes #315.
   
   # Rationale for this change
    
   
   # What changes are included in this PR?
   
   `Reader::from_reader`  split into `Reader::build_csv_reader`, `Reader::from_csv_reader`.
   
   `Reader::build_csv_reader` builds `csv_crate::Reader`,with all CSV reader options.
   `Rader::from_csv_reader` builds `arrow::csv::reader::Reader` for `csv_crate::Reader`.
   
   add fn `infer_file_schema_with_csv_options` .
   
    change  `infer_file_schema` calls  `infer_file_schema_with_csv_options` with default options.
   
   add fn `infer_reader_schema_with_csv_options`.
     change `infer_reader_schema` calls `infer_reader_schema` with default options.
   
   add `escape` `quote` `terminator` to `ReaderBuilder`
   
   `ReaderBuilder::build` uses added options.
   
   
   # Are there any user-facing changes?
   
   currently minimized API change.
   
   - add fn `ReaderBuilder::escape`
   - add fn `ReaderBuilder::quote`
   - add fn `ReaderBuilder::terminator`
   
    please concider make public for fn `infer_file_schema_with_csv_options`, `infer_reader_schema_with_csv_options` , `Reader::build_csv_reader` , `Reader::from_csv_reader`.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] kazuk commented on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
kazuk commented on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-846677229


   Thank you for review.
   
   > The only thing that I think is missing from this PR is some basic tests -- specifically to show everything hooked together properly and the options get to the csv reader.
   
   I add a basic tests for detects feature broken.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb edited a comment on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
alamb edited a comment on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-847231889


   The MIRI CI check is not related to this PR (edit https://github.com/apache/arrow-rs/issues/345)
   
   However, the lint failure seems to be: https://github.com/apache/arrow-rs/pull/326/checks?check_run_id=2658268180
   
   I think it can be resolved by running `cargo fmt` locally and then pushing the result. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] kazuk commented on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
kazuk commented on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-847347801


   oh!, Thanks.
   
   I run `cargo fmt` and push.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-847231889


   The MIRI CI check is not related to this PR (I'll try and look at it shortly)
   
   However, the lint failure seems to be: https://github.com/apache/arrow-rs/pull/326/checks?check_run_id=2658268180
   
   I think it can be resolved by running `cargo fmt` locally and then pushing the result. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-848286561


   Thanks again @kazuk 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] codecov-commenter commented on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
codecov-commenter commented on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-846188424


   # [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#326](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (4bf2e0c) into [master](https://codecov.io/gh/apache/arrow-rs/commit/c863a2c44bffa5c092a49e07910d5e9225483193?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (c863a2c) will **decrease** coverage by `0.02%`.
   > The diff coverage is `66.66%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow-rs/pull/326/graphs/tree.svg?width=650&height=150&src=pr&token=pq9V9qWZ1N&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master     #326      +/-   ##
   ==========================================
   - Coverage   82.52%   82.50%   -0.03%     
   ==========================================
     Files         162      162              
     Lines       44007    44036      +29     
   ==========================================
   + Hits        36316    36331      +15     
   - Misses       7691     7705      +14     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [arrow/src/csv/reader.rs](https://codecov.io/gh/apache/arrow-rs/pull/326/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2Nzdi9yZWFkZXIucnM=) | `86.97% <66.66%> (-1.39%)` | :arrow_down: |
   | [arrow/src/array/transform/boolean.rs](https://codecov.io/gh/apache/arrow-rs/pull/326/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-YXJyb3cvc3JjL2FycmF5L3RyYW5zZm9ybS9ib29sZWFuLnJz) | `76.92% <0.00%> (-7.70%)` | :arrow_down: |
   | [parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow-rs/pull/326/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-cGFycXVldC9zcmMvZW5jb2RpbmdzL2VuY29kaW5nLnJz) | `94.85% <0.00%> (-0.20%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [c863a2c...4bf2e0c](https://codecov.io/gh/apache/arrow-rs/pull/326?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb merged pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
alamb merged pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326


   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow-rs] alamb commented on pull request #326: allow to read non-standard CSV

Posted by GitBox <gi...@apache.org>.
alamb commented on pull request #326:
URL: https://github.com/apache/arrow-rs/pull/326#issuecomment-848286414


   The MIRI failure is unrelated to this PR: https://github.com/apache/arrow-rs/issues/345


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org