Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2022/05/09 11:00:16 UTC

[GitHub] [incubator-doris] Lchangliang opened a new pull request, #9472: [Enhancement] Improve parquet reader

Lchangliang opened a new pull request, #9472:
URL: https://github.com/apache/incubator-doris/pull/9472

   # Proposed changes
   
   Issue Number: close #xxx
   
   ## Problem Summary:
   
   1. Modify the parameters passed to Arrow's interface: use pre_buffer/threads to speed up Parquet file reading.
   2. Add a prefetch worker that caches batches.
   
   Comparison:
   
   Table:
   Duplicate table
   3 buckets
   1 replica
   
   Parquet file size:
   12.09G
   153,600,000 rows
   
   load_param:
   "send_batch_parallelism" = "4"
   
   Before:
   10.5 MB/s
   134383 rows/s
   
   After:
   52.2 MB/s
   648101 rows/s
   
   ## Checklist(Required)
   
   1. Does it affect the original behavior: (Yes/No/I Don't know)
   2. Has unit tests been added: (Yes/No/No Need)
   3. Has document been added or modified: (Yes/No/No Need)
   4. Does it need to update dependencies: (Yes/No)
   5. Are there any changes that cannot be rolled back: (Yes/No)
   
   ## Further comments
   
   If this is a relatively large or complex change, kick off the discussion at [dev@doris.apache.org](mailto:dev@doris.apache.org) by explaining why you chose the solution you did and what alternatives you considered, etc...
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9472: [Enhancement] improve parquet reader via arrow's prefetch and multi thread

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9472:
URL: https://github.com/apache/incubator-doris/pull/9472#issuecomment-1128741877

   PR approved by anyone and no changes requested.




[GitHub] [incubator-doris] github-actions[bot] commented on pull request #9472: [Enhancement] improve parquet reader via arrow's prefetch and multi thread

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #9472:
URL: https://github.com/apache/incubator-doris/pull/9472#issuecomment-1128845692

   PR approved by at least one committer and no changes requested.




[GitHub] [incubator-doris] yiguolei merged pull request #9472: [Enhancement] improve parquet reader via arrow's prefetch and multi thread

Posted by GitBox <gi...@apache.org>.
yiguolei merged PR #9472:
URL: https://github.com/apache/incubator-doris/pull/9472

