You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@doris.apache.org by GitBox <gi...@apache.org> on 2021/03/05 03:47:36 UTC

[GitHub] [incubator-doris] HappenLee opened a new issue #5467: [Feature] Support Parallel Merge Sort in Exchange Node

HappenLee opened a new issue #5467:
URL: https://github.com/apache/incubator-doris/issues/5467


   ## Motivation
   Currently, Doris supports merging and sorting the data sent by the underlying sorted nodes on the exchange node to ensure that the final data is in order. This is a multi-channel merge process, but when the amount of underlying data is too large and there are too many sending nodes, the single thread merge sort will become the bottleneck of query performance.
   
   So,We Should Support Parallel Merge in Exchange Node to speed up query. 
   ## Implementation
   
   This picture show the design of parallel merge
   
   ![image.png](https://upload-images.jianshu.io/upload_images/8552201-d8337a1ef0026560.png?imageMogr2/auto-orient/strip%7CimageView2/2/w/1240)
   
   
    There we chose parallel merge, we should make thread execute more parallel to minimized the computation of top merger
    TopMerger: have child merger to supplier data
    ChildMerger: have sender queue to supplier data, each merger start a thread to merge data firstly
    SenderQueue: the data from other node
    Before parallel merge, if we have 81 sender queue, data is 1000, the computation is 1000 * log(81)
    After parallel merge, the computation is MAX(1000 * log(2), 500 * log(41))
   
   
   #### How many child merger and thread
    Now we only support max 3 merge child, each child use 1 thread, because:
    we have N _sender_queue, M merge child. the best way is log(N / M) = M * log(M)
    So if N = 8, M = 2
     	 N = 81, M = 3
            N = 1024, M = 4
    normally the N is lower than 1024, so we chose 8 <= N < 81, M = 2
                                                                                        N >= 81, M = 3
   
   ## Test
   
   3000kw data test `order by` 
   
   | Merge Sort | Parallel Merge Sort|
   | :-----: | :----: |
   | 3.3s | 1.49s |


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org


[GitHub] [incubator-doris] morningman closed issue #5467: [Feature] Support Parallel Merge Sort in Exchange Node

Posted by GitBox <gi...@apache.org>.
morningman closed issue #5467:
URL: https://github.com/apache/incubator-doris/issues/5467


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@doris.apache.org
For additional commands, e-mail: commits-help@doris.apache.org