Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2020/12/25 09:20:54 UTC

[GitHub] [arrow] Dandandan opened a new pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Dandandan opened a new pull request #9010:
URL: https://github.com/apache/arrow/pull/9010


   Some performance improvements for the CSV writer:
   
   * Use `lexical-core` to format numeric types
   * Allow setting the batch size in `convert` (slightly faster reading)
   * Avoid allocating a `Vec`
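
The allocation point in the list above can be illustrated with a minimal sketch. This is not the code from the PR: `format_row` and `rows_to_csv` are hypothetical names, and the standard library's `write!` stands in for `lexical-core` so that the example has no dependencies. It only shows the general technique of formatting numbers into one reused buffer instead of allocating a new `String` per value.

```rust
use std::fmt::Write;

/// Hypothetical helper (not from the PR): formats one row of integers as a
/// comma-separated line into `line`, reusing the caller's buffer.
fn format_row(values: &[i64], line: &mut String) {
    // `clear` keeps the existing capacity, so later rows rarely reallocate.
    line.clear();
    for (i, v) in values.iter().enumerate() {
        if i > 0 {
            line.push(',');
        }
        // Formats the number directly into the buffer; no temporary String.
        write!(line, "{}", v).expect("writing to a String cannot fail");
    }
}

/// Hypothetical driver: serializes all rows, one line per row.
fn rows_to_csv(rows: &[Vec<i64>]) -> String {
    let mut out = String::new();
    let mut line = String::new(); // allocated once, reused for every row
    for row in rows {
        format_row(row, &mut line);
        out.push_str(&line);
        out.push('\n');
    }
    out
}
```

Calling `rows_to_csv(&[vec![1, -2], vec![30]])` yields two lines, `1,-2` and `30`. The per-value alternative, `v.to_string()`, allocates once for every field, which is the kind of cost the list item refers to.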
   
   
   PR: `cargo run --release --bin tpch -- convert --input /home/danielheres/Code/gdd/arrow/rust/benchmarks/tpch-dbgen --output ./output --format csv -s 20000`
   Timings for the orders and lineitem tables, respectively:
   
   ```
   Conversion completed in 2050 ms
   Conversion completed in 16955 ms
   ```
   Master: `cargo run --release --bin tpch -- convert --input /home/danielheres/Code/gdd/arrow/rust/benchmarks/tpch-dbgen --output ./output --format csv`
   
   ```
   Conversion completed in 2336 ms
   Conversion completed in 19070 ms
   ```
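
To put the timings above in relative terms, here is a small sketch. `reduction_percent` is a hypothetical helper, not part of the benchmark. Note that the PR run also passes `-s 20000` (presumably the new batch size setting), so the difference reflects all the changes together, and each figure comes from a single reported run.

```rust
/// Hypothetical helper: percentage by which `after_ms` is lower than `before_ms`.
fn reduction_percent(before_ms: f64, after_ms: f64) -> f64 {
    (before_ms - after_ms) / before_ms * 100.0
}

/// Reductions for the two reported conversions, in the order listed above.
fn reported_reductions() -> (f64, f64) {
    (
        reduction_percent(2336.0, 2050.0),   // first timing: master vs. PR
        reduction_percent(19070.0, 16955.0), // second timing: master vs. PR
    )
}
```

With the reported numbers this works out to a reduction of roughly 12% for the first conversion and roughly 11% for the second.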


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] codecov-io edited a comment on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221207


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=h1) Report
   > Merging [#9010](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=desc) (0fda55a) into [master](https://codecov.io/gh/apache/arrow/commit/a4f7c4a2acda874b3d6eb2eb4c986e7c3267c755?el=desc) (a4f7c4a) will **decrease** coverage by `0.00%`.
   > The diff coverage is `88.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9010/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9010      +/-   ##
   ==========================================
   - Coverage   82.87%   82.87%   -0.01%     
   ==========================================
     Files         201      201              
     Lines       49739    49746       +7     
   ==========================================
   + Hits        41220    41225       +5     
   - Misses       8519     8521       +2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/benchmarks/src/bin/tpch.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9iZW5jaG1hcmtzL3NyYy9iaW4vdHBjaC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY3N2L3dyaXRlci5ycw==) | `79.38% <100.00%> (+0.55%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/fixed\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2ZpeGVkX2JpbmFyeS5ycw==) | `78.94% <0.00%> (-5.27%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=footer). Last update [a4f7c4a...8f3d28c](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] alamb closed pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
alamb closed pull request #9010:
URL: https://github.com/apache/arrow/pull/9010


   





[GitHub] [arrow] codecov-io commented on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
codecov-io commented on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221207


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=h1) Report
   > Merging [#9010](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=desc) (2676383) into [master](https://codecov.io/gh/apache/arrow/commit/a4f7c4a2acda874b3d6eb2eb4c986e7c3267c755?el=desc) (a4f7c4a) will **decrease** coverage by `0.00%`.
   > The diff coverage is `89.47%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9010/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9010      +/-   ##
   ==========================================
   - Coverage   82.87%   82.87%   -0.01%     
   ==========================================
     Files         201      201              
     Lines       49739    49748       +9     
   ==========================================
   + Hits        41220    41227       +7     
   - Misses       8519     8521       +2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/benchmarks/src/bin/tpch.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9iZW5jaG1hcmtzL3NyYy9iaW4vdHBjaC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY3N2L3dyaXRlci5ycw==) | `79.56% <100.00%> (+0.73%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/fixed\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2ZpeGVkX2JpbmFyeS5ycw==) | `78.94% <0.00%> (-5.27%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=footer). Last update [a4f7c4a...e4d5b5d](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] codecov-io edited a comment on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221207


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=h1) Report
   > Merging [#9010](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=desc) (e4d5b5d) into [master](https://codecov.io/gh/apache/arrow/commit/a4f7c4a2acda874b3d6eb2eb4c986e7c3267c755?el=desc) (a4f7c4a) will **decrease** coverage by `0.00%`.
   > The diff coverage is `89.47%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9010/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree)
   
   ```diff
   @@            Coverage Diff             @@
   ##           master    #9010      +/-   ##
   ==========================================
   - Coverage   82.87%   82.87%   -0.01%     
   ==========================================
     Files         201      201              
     Lines       49739    49748       +9     
   ==========================================
   + Hits        41220    41227       +7     
   - Misses       8519     8521       +2     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/benchmarks/src/bin/tpch.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9iZW5jaG1hcmtzL3NyYy9iaW4vdHBjaC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY3N2L3dyaXRlci5ycw==) | `79.56% <100.00%> (+0.73%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/fixed\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2ZpeGVkX2JpbmFyeS5ycw==) | `78.94% <0.00%> (-5.27%)` | :arrow_down: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=footer). Last update [a4f7c4a...e4d5b5d](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] codecov-io edited a comment on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221207


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=h1) Report
   > Merging [#9010](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=desc) (aaa696f) into [master](https://codecov.io/gh/apache/arrow/commit/a4f7c4a2acda874b3d6eb2eb4c986e7c3267c755?el=desc) (a4f7c4a) will **increase** coverage by `0.00%`.
   > The diff coverage is `88.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9010/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #9010   +/-   ##
   =======================================
     Coverage   82.87%   82.87%           
   =======================================
     Files         201      201           
     Lines       49739    49746    +7     
   =======================================
   + Hits        41220    41226    +6     
   - Misses       8519     8520    +1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/benchmarks/src/bin/tpch.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9iZW5jaG1hcmtzL3NyYy9iaW4vdHBjaC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY3N2L3dyaXRlci5ycw==) | `79.38% <100.00%> (+0.55%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/fixed\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2ZpeGVkX2JpbmFyeS5ycw==) | `78.94% <0.00%> (-5.27%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `95.43% <0.00%> (+0.19%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=footer). Last update [a4f7c4a...4f2a72a](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] codecov-io edited a comment on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221207


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=h1) Report
   > Merging [#9010](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=desc) (3e808e1) into [master](https://codecov.io/gh/apache/arrow/commit/a4f7c4a2acda874b3d6eb2eb4c986e7c3267c755?el=desc) (a4f7c4a) will **increase** coverage by `0.00%`.
   > The diff coverage is `88.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9010/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #9010   +/-   ##
   =======================================
     Coverage   82.87%   82.87%           
   =======================================
     Files         201      201           
     Lines       49739    49746    +7     
   =======================================
   + Hits        41220    41226    +6     
   - Misses       8519     8520    +1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/benchmarks/src/bin/tpch.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9iZW5jaG1hcmtzL3NyYy9iaW4vdHBjaC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY3N2L3dyaXRlci5ycw==) | `79.38% <100.00%> (+0.55%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/fixed\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2ZpeGVkX2JpbmFyeS5ycw==) | `78.94% <0.00%> (-5.27%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `95.43% <0.00%> (+0.19%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=footer). Last update [a4f7c4a...3e808e1](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] codecov-io edited a comment on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
codecov-io edited a comment on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221207


   # [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=h1) Report
   > Merging [#9010](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=desc) (8f3d28c) into [master](https://codecov.io/gh/apache/arrow/commit/a4f7c4a2acda874b3d6eb2eb4c986e7c3267c755?el=desc) (a4f7c4a) will **increase** coverage by `0.00%`.
   > The diff coverage is `88.23%`.
   
   [![Impacted file tree graph](https://codecov.io/gh/apache/arrow/pull/9010/graphs/tree.svg?width=650&height=150&src=pr&token=LpTCFbqVT1)](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree)
   
   ```diff
   @@           Coverage Diff           @@
   ##           master    #9010   +/-   ##
   =======================================
     Coverage   82.87%   82.87%           
   =======================================
     Files         201      201           
     Lines       49739    49746    +7     
   =======================================
   + Hits        41220    41226    +6     
   - Misses       8519     8520    +1     
   ```
   
   
   | [Impacted Files](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=tree) | Coverage Δ | |
   |---|---|---|
   | [rust/benchmarks/src/bin/tpch.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9iZW5jaG1hcmtzL3NyYy9iaW4vdHBjaC5ycw==) | `0.00% <0.00%> (ø)` | |
   | [rust/arrow/src/csv/writer.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvY3N2L3dyaXRlci5ycw==) | `79.38% <100.00%> (+0.55%)` | :arrow_up: |
   | [rust/arrow/src/array/transform/fixed\_binary.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9hcnJvdy9zcmMvYXJyYXkvdHJhbnNmb3JtL2ZpeGVkX2JpbmFyeS5ycw==) | `78.94% <0.00%> (-5.27%)` | :arrow_down: |
   | [rust/parquet/src/encodings/encoding.rs](https://codecov.io/gh/apache/arrow/pull/9010/diff?src=pr&el=tree#diff-cnVzdC9wYXJxdWV0L3NyYy9lbmNvZGluZ3MvZW5jb2RpbmcucnM=) | `95.43% <0.00%> (+0.19%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=continue).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=footer). Last update [a4f7c4a...8f3d28c](https://codecov.io/gh/apache/arrow/pull/9010?src=pr&el=lastupdated). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments).
   





[GitHub] [arrow] github-actions[bot] commented on pull request #9010: ARROW-11033 [Rust] Csv writing performance improvements

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #9010:
URL: https://github.com/apache/arrow/pull/9010#issuecomment-751221626


   https://issues.apache.org/jira/browse/ARROW-11033

