Posted to github@arrow.apache.org by GitBox <gi...@apache.org> on 2021/12/20 10:30:10 UTC

[GitHub] [arrow] thisisnic opened a new pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

thisisnic opened a new pull request #11992:
URL: https://github.com/apache/arrow/pull/11992


   This PR switches to using the asynchronous scanner by default when reading datasets.  I've tested it locally on a large dataset (2.5 GB of CSV files) and it does resolve the original issue, but because of the size of the files involved I wasn't sure I could easily write tests for it.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@arrow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [arrow] ursabot edited a comment on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-1004287116


   Benchmark runs are scheduled for baseline = cb1897ee0d20cfbaad1d879573362ce29c3e11b0 and contender = 762fad5e5d1499b20db81a75cbc448c1ef6fca03. 762fad5e5d1499b20db81a75cbc448c1ef6fca03 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8bb812d0b6da4e698e453f1aae4722ed...26840705a4014ff89d68cba86a054384/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/ca7f4801807d46769210fafec4ee8e19...0de313df39a24e76889a45f29adcba12/)
   [Failed] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/ffa64dd8e33e4a7fa34c84d5f25455ae...54d39cec7f5d409eb4d09570713c8110/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] ursabot edited a comment on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-1004287116


   Benchmark runs are scheduled for baseline = cb1897ee0d20cfbaad1d879573362ce29c3e11b0 and contender = 762fad5e5d1499b20db81a75cbc448c1ef6fca03. 762fad5e5d1499b20db81a75cbc448c1ef6fca03 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8bb812d0b6da4e698e453f1aae4722ed...26840705a4014ff89d68cba86a054384/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/ca7f4801807d46769210fafec4ee8e19...0de313df39a24e76889a45f29adcba12/)
   [Failed] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/ffa64dd8e33e4a7fa34c84d5f25455ae...54d39cec7f5d409eb4d09570713c8110/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] jonkeane commented on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-998211413


   @ursabot please benchmark lang=R





[GitHub] [arrow] ursabot commented on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
ursabot commented on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-1004287116


   Benchmark runs are scheduled for baseline = cb1897ee0d20cfbaad1d879573362ce29c3e11b0 and contender = 762fad5e5d1499b20db81a75cbc448c1ef6fca03. 762fad5e5d1499b20db81a75cbc448c1ef6fca03 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Scheduled] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8bb812d0b6da4e698e453f1aae4722ed...26840705a4014ff89d68cba86a054384/)
   [Scheduled] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/ca7f4801807d46769210fafec4ee8e19...0de313df39a24e76889a45f29adcba12/)
   [Scheduled] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/ffa64dd8e33e4a7fa34c84d5f25455ae...54d39cec7f5d409eb4d09570713c8110/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] jonkeane commented on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
jonkeane commented on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-998219847


   @ursabot please benchmark





[GitHub] [arrow] ursabot edited a comment on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
ursabot edited a comment on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-1004287116


   Benchmark runs are scheduled for baseline = cb1897ee0d20cfbaad1d879573362ce29c3e11b0 and contender = 762fad5e5d1499b20db81a75cbc448c1ef6fca03. 762fad5e5d1499b20db81a75cbc448c1ef6fca03 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
   Conbench compare runs links:
   [Finished :arrow_down:0.0% :arrow_up:0.0%] [ec2-t3-xlarge-us-east-2](https://conbench.ursa.dev/compare/runs/8bb812d0b6da4e698e453f1aae4722ed...26840705a4014ff89d68cba86a054384/)
   [Failed :arrow_down:1.35% :arrow_up:0.0%] [ursa-i9-9960x](https://conbench.ursa.dev/compare/runs/ca7f4801807d46769210fafec4ee8e19...0de313df39a24e76889a45f29adcba12/)
   [Failed] [ursa-thinkcentre-m75q](https://conbench.ursa.dev/compare/runs/ffa64dd8e33e4a7fa34c84d5f25455ae...54d39cec7f5d409eb4d09570713c8110/)
   Supported benchmarks:
   ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python. Runs only benchmarks with cloud = True
   ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
   ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java
   





[GitHub] [arrow] thisisnic commented on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
thisisnic commented on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-998006454


   @ursabot please benchmark lang=r





[GitHub] [arrow] github-actions[bot] commented on pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #11992:
URL: https://github.com/apache/arrow/pull/11992#issuecomment-997798077









[GitHub] [arrow] thisisnic closed pull request #11992: ARROW-14653: [R] head() hangs on CSV datasets > 600MB

Posted by GitBox <gi...@apache.org>.
thisisnic closed pull request #11992:
URL: https://github.com/apache/arrow/pull/11992


   

