Posted to commits@druid.apache.org by GitBox <gi...@apache.org> on 2018/07/27 19:14:02 UTC

[GitHub] a2l007 opened a new issue #6057: Broker sends sequential requests to the historical for union queries

URL: https://github.com/apache/incubator-druid/issues/6057
 
 
   While analyzing groupBy queries on union datasources, we noticed that when a single historical holds segments for more than one of the datasources in a query, the broker sends multiple requests to that historical, but it sends them serially rather than in parallel.
   For example, running the following union query:
   
   ```
   {
     "queryType": "groupBy",
     "dataSource": {
       "type": "union",
       "dataSources": [
         {
           "type": "table",
           "name": "tableA"
         },
         {
           "type": "table",
           "name": "tableB"
         },
         {
           "type": "table",
           "name": "tableC"
         }
       ]
     },
     "intervals": {
       "type": "LegacySegmentSpec",
       "intervals": [
         "2018-07-23T00:00:00.000Z\/2018-07-24T00:00:00.000Z"
       ]
     },
     "virtualColumns": [
       
     ],
     "filter": null,
     "granularity": "DAY",
     "dimensions": [
       {
         "type": "default",
         "dimension": "acc_id"
       }
     ],
     "aggregations": [
       {
         "fieldName": "clicks",
         "name": "clicks",
         "type": "longSum"
       }
     ],
     "postAggregations": [
       
     ],
     "having": null,
     "limitSpec": {
       "type": "NoopLimitSpec"
     },
     "descending": false
   }
   ```
   generates logs such as:
   ```
   2018-07-24T14:50:06,500 INFO [qtp1256384385-231[groupBy_q_test_1]] com.metamx.http.client.pool.ChannelResourceFactory - Generating: https://tier.historical.foo.com:4443
   2018-07-24T14:50:40,366 INFO [qtp1256384385-231[groupBy_q_test_1]] com.metamx.http.client.pool.ChannelResourceFactory - Generating: https://tier.historical.foo.com:4443
   ```
   
   This historical contains segments for datasources `tableA` and `tableB`, so the broker sends it two requests. As the timestamps show, the second request starts roughly 34 seconds after the first.
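
   For reference, the gap between the two log lines above can be computed directly from their timestamps. This is only a small sketch: the class and method names (`LogGap`, `gapMillis`) are made up for illustration, and the pattern assumes the comma-separated millisecond format seen in the excerpt.

   ```java
   import java.time.Duration;
   import java.time.LocalDateTime;
   import java.time.format.DateTimeFormatter;

   // Hypothetical helper, not part of Druid.
   class LogGap {
       // Timestamp layout as it appears in the log excerpt: comma before the milliseconds.
       private static final DateTimeFormatter FMT =
           DateTimeFormatter.ofPattern("uuuu-MM-dd'T'HH:mm:ss,SSS");

       // Milliseconds elapsed between two log timestamps.
       static long gapMillis(String first, String second) {
           LocalDateTime a = LocalDateTime.parse(first, FMT);
           LocalDateTime b = LocalDateTime.parse(second, FMT);
           return Duration.between(a, b).toMillis();
       }

       public static void main(String[] args) {
           long gap = gapMillis("2018-07-24T14:50:06,500", "2018-07-24T14:50:40,366");
           System.out.println(gap + " ms"); // prints "33866 ms"
       }
   }
   ```
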
   [BrokerServerView](https://github.com/apache/incubator-druid/blob/master/server/src/main/java/io/druid/client/BrokerServerView.java#L291) confirms this behaviour: at any instant, a given query can have only a single request in flight to a historical. For union queries, this delays query execution.
   Before I investigate whether this can be parallelized, I wanted to check with the community: is this already a known issue with union queries, or is it actually a bug?
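
   To make the question concrete, here is a generic sketch of the difference between issuing the per-datasource requests serially and issuing them concurrently. It is not Druid's broker code: `UnionFanOut`, `serial`, `concurrent`, and `fakeRequest` are hypothetical names, and `fakeRequest` just sleeps to stand in for a call to a historical.

   ```java
   import java.util.ArrayList;
   import java.util.Arrays;
   import java.util.List;
   import java.util.concurrent.CompletableFuture;
   import java.util.concurrent.ExecutorService;
   import java.util.concurrent.Executors;
   import java.util.function.Function;

   // Illustration only; none of these names exist in Druid.
   class UnionFanOut {
       // Serial: each request starts only after the previous one has returned.
       static <R> List<R> serial(List<String> dataSources, Function<String, R> request) {
           List<R> results = new ArrayList<>();
           for (String ds : dataSources) {
               results.add(request.apply(ds));
           }
           return results;
       }

       // Concurrent: all requests are submitted up front, then joined in input order.
       static <R> List<R> concurrent(List<String> dataSources,
                                     Function<String, R> request,
                                     ExecutorService pool) {
           List<CompletableFuture<R>> futures = new ArrayList<>();
           for (String ds : dataSources) {
               futures.add(CompletableFuture.supplyAsync(() -> request.apply(ds), pool));
           }
           List<R> results = new ArrayList<>();
           for (CompletableFuture<R> f : futures) {
               results.add(f.join());
           }
           return results;
       }

       // Stand-in for a request to a historical (assumption: 200 ms per call).
       static String fakeRequest(String dataSource) {
           try {
               Thread.sleep(200);
           } catch (InterruptedException e) {
               Thread.currentThread().interrupt();
           }
           return dataSource + ":done";
       }

       public static void main(String[] args) {
           List<String> dataSources = Arrays.asList("tableA", "tableB");
           ExecutorService pool = Executors.newFixedThreadPool(2);
           try {
               long t0 = System.nanoTime();
               serial(dataSources, UnionFanOut::fakeRequest);
               long t1 = System.nanoTime();
               concurrent(dataSources, UnionFanOut::fakeRequest, pool);
               long t2 = System.nanoTime();
               System.out.printf("serial: %d ms, concurrent: %d ms%n",
                   (t1 - t0) / 1_000_000, (t2 - t1) / 1_000_000);
           } finally {
               pool.shutdown();
           }
       }
   }
   ```

   With two datasources on the same historical, the serial variant takes roughly the sum of the two request times, while the concurrent variant takes roughly the longer of the two. Whether the broker can safely do the latter is exactly the open question here.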
   
