Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/07/05 05:01:05 UTC

[GitHub] [incubator-pinot] kishoreg opened a new issue #5658: star-tree latency issue

kishoreg opened a new issue #5658:
URL: https://github.com/apache/incubator-pinot/issues/5658


   Query:
   
   ```
   select
      platform_id,
      customer_id,
      sum(clicks) as "clicks",
      sum(impressions) as "impressions",
      sum(cost_usd_micros) / 1000000.00 as "cost_usd_amount"
   from
      pinot.default.metrics
   where
      platform_id = 13
      and utc_date >= date('2020-06-05')
      and utc_date < date('2020-07-05')
   group by
      platform_id,
      customer_id
   limit 10000000000
   ```
   
   with star tree config
   ```
   {
               "dimensionsSplitOrder": [
                 "utc_date",
                 "platform_id",
                 "customer_id",
                 "account_id",
                 "campaign_id",
                 "promotion_id"
               ],
               "skipStarNodeCreationForDimensions": [
               ],
               "functionColumnPairs": [
                 "SUM__insertions",
                 "SUM__impressions",
                 "SUM__clicks",
                 "SUM__cost_usd_micros"
               ]
             },
   ```
   
   Response
   ```
       "numServersQueried": 2,
       "numServersResponded": 2,
       "numSegmentsQueried": 324,
       "numSegmentsProcessed": 301,
       "numSegmentsMatched": 299,
       "numConsumingSegmentsQueried": 1,
       "numDocsScanned": 810244,
       "numEntriesScannedInFilter": 0,
       "numEntriesScannedPostFilter": 4051220,
       "numGroupsLimitReached": false,
       "totalDocs": 304059555,
       "timeUsedMs": 460,
       "segmentStatistics": [],
       "traceInfo": {},
   ```
   
   Changing `max_leaf_records` to 1 makes the query slower:
   
   ```
       "numServersQueried": 2,
       "numServersResponded": 2,
       "numSegmentsQueried": 442,
       "numSegmentsProcessed": 309,
       "numSegmentsMatched": 309,
       "numConsumingSegmentsQueried": 1,
       "numDocsScanned": 280283,
       "numEntriesScannedInFilter": 0,
       "numEntriesScannedPostFilter": 1401415,
       "numGroupsLimitReached": false,
       "totalDocs": 419437982,
       "timeUsedMs": 2555,
       "segmentStatistics": [],
       "traceInfo": {},
   ```
   
   Note that `numDocsScanned` dropped as expected (810244 to 280283), but latency increased from 460 ms to 2555 ms.
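
The comparison above can be checked with a few lines of arithmetic. This is only a sketch: the values are copied from the two responses, and the dictionary and function names are made up for illustration.

```python
# Values copied from the two query responses in this issue.
before = {"numDocsScanned": 810244,
          "numEntriesScannedPostFilter": 4051220,
          "timeUsedMs": 460}
after = {"numDocsScanned": 280283,
         "numEntriesScannedPostFilter": 1401415,
         "timeUsedMs": 2555}


def entries_per_doc(stats):
    """Post-filter entries read per scanned document."""
    return stats["numEntriesScannedPostFilter"] / stats["numDocsScanned"]


# How many times fewer documents were scanned after the change.
docs_ratio = before["numDocsScanned"] / after["numDocsScanned"]
# How many times slower the query became after the change.
latency_ratio = after["timeUsedMs"] / before["timeUsedMs"]

print("entries per doc:", entries_per_doc(before), entries_per_doc(after))
print("docs scanned reduced by a factor of", round(docs_ratio, 2))
print("latency increased by a factor of", round(latency_ratio, 2))
```

Both runs read exactly 5 entries per scanned document, which would be consistent with the five columns the query touches (an assumption, not something the response states). So the per-document work looks unchanged, and the slowdown is not explained by the scan counters alone.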
   
   @Jackie-Jiang, @mayankshriv why would this happen?
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #5658: star-tree latency issue

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #5658:
URL: https://github.com/apache/incubator-pinot/issues/5658#issuecomment-654048897


   Both `numSegmentsQueried` (324 to 442) and `totalDocs` (about 304M to 419M) increased, so the two runs did not query the same data.
   Also, even though `numDocsScanned` dropped from about 810K to 280K, the bottleneck is most likely IO, since each server needs to process ~150 segments. Changing `max_leaf_records` to 1 could significantly increase the size of each segment, which increases the amount of data to load and therefore the latency.
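
To put numbers on the difference between the two runs, here is a rough sketch using the values from the two responses in the issue. The helper names are hypothetical, and the per-server figure assumes segments are spread evenly across the two servers, which the responses do not confirm.

```python
# Values copied from the two query responses in the issue.
before = {"numSegmentsQueried": 324, "numSegmentsProcessed": 301,
          "totalDocs": 304059555, "numServersQueried": 2}
after = {"numSegmentsQueried": 442, "numSegmentsProcessed": 309,
         "totalDocs": 419437982, "numServersQueried": 2}


def growth(key):
    """Relative increase of a counter between the two runs."""
    return after[key] / before[key] - 1


def segments_per_server(stats):
    """Processed segments per server, assuming an even split (an assumption)."""
    return stats["numSegmentsProcessed"] / stats["numServersQueried"]


for key in ("numSegmentsQueried", "totalDocs"):
    print(f"{key}: {growth(key):+.1%}")

print("segments per server:", segments_per_server(before), segments_per_server(after))
```

Both the queried segment count and `totalDocs` grow by more than a third between the runs, so the latency numbers are not directly comparable. The number of processed segments per server stays at roughly 150 in both cases.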

