Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2020/07/05 05:01:05 UTC
[GitHub] [incubator-pinot] kishoreg opened a new issue #5658: star-tree latency issue
kishoreg opened a new issue #5658:
URL: https://github.com/apache/incubator-pinot/issues/5658
Query:
```
select
platform_id,
customer_id,
sum(clicks) as "clicks",
sum(impressions) as "impressions",
sum(cost_usd_micros) / 1000000.00 as "cost_usd_amount"
from
pinot.default.metrics
where
platform_id = 13
and utc_date >= date('2020-06-05')
and utc_date < date('2020-07-05')
group by
platform_id,
customer_id limit 10000000000
```
with star tree config
```
{
"dimensionsSplitOrder": [
"utc_date",
"platform_id",
"customer_id",
"account_id",
"campaign_id",
"promotion_id"
],
"skipStarNodeCreationForDimensions": [
],
"functionColumnPairs": [
"SUM__insertions",
"SUM__impressions",
"SUM__clicks",
"SUM__cost_usd_micros"
]
},
```
Response
```
"numServersQueried": 2,
"numServersResponded": 2,
"numSegmentsQueried": 324,
"numSegmentsProcessed": 301,
"numSegmentsMatched": 299,
"numConsumingSegmentsQueried": 1,
"numDocsScanned": 810244,
"numEntriesScannedInFilter": 0,
"numEntriesScannedPostFilter": 4051220,
"numGroupsLimitReached": false,
"totalDocs": 304059555,
"timeUsedMs": 460,
"segmentStatistics": [],
"traceInfo": {},
```
Changing `max_leaf_records` to 1 makes the query slower:
```
"numServersQueried": 2,
"numServersResponded": 2,
"numSegmentsQueried": 442,
"numSegmentsProcessed": 309,
"numSegmentsMatched": 309,
"numConsumingSegmentsQueried": 1,
"numDocsScanned": 280283,
"numEntriesScannedInFilter": 0,
"numEntriesScannedPostFilter": 1401415,
"numGroupsLimitReached": false,
"totalDocs": 419437982,
"timeUsedMs": 2555,
"segmentStatistics": [],
"traceInfo": {},
```
Note that `numDocsScanned` dropped as expected (from 810244 to 280283), but latency increased from 460 ms to 2555 ms.
@Jackie-Jiang, @mayankshriv why would this happen?
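For reference, the two responses can be compared side by side. This is only a rough sketch: the numbers are copied from the responses above, while the names `run_first`, `run_second`, `change`, and `entries_per_doc` are made up for illustration and are not part of Pinot.

```python
# Figures copied from the two broker responses quoted in this issue.
run_first = {
    "numSegmentsQueried": 324,
    "numDocsScanned": 810244,
    "numEntriesScannedPostFilter": 4051220,
    "totalDocs": 304059555,
    "timeUsedMs": 460,
}
run_second = {  # after changing max_leaf_records to 1
    "numSegmentsQueried": 442,
    "numDocsScanned": 280283,
    "numEntriesScannedPostFilter": 1401415,
    "totalDocs": 419437982,
    "timeUsedMs": 2555,
}


def change(key):
    """Ratio of the second run's value to the first run's value."""
    return run_second[key] / run_first[key]


def entries_per_doc(run):
    """Entries read after filtering, per scanned document."""
    return run["numEntriesScannedPostFilter"] / run["numDocsScanned"]


for key in run_first:
    print(f"{key}: x{change(key):.2f}")

print("entries per doc (first run):", entries_per_doc(run_first))
print("entries per doc (second run):", entries_per_doc(run_second))
```

Both runs read exactly 5 entries per scanned document, which would be consistent with the five columns the query touches. The segment count and `totalDocs` also differ between the two runs, so the latency figures are not measured on identical data.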
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org
[GitHub] [incubator-pinot] Jackie-Jiang commented on issue #5658: star-tree latency issue
Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #5658:
URL: https://github.com/apache/incubator-pinot/issues/5658#issuecomment-654048897
Both `numSegments` and `totalDocs` increased, meaning this is not the same data.
Also, even though `numDocsScanned` dropped from about 810K to 280K, the bottleneck is likely IO, since each server needs to process ~150 segments. Changing `max_leaf_records` to 1 could significantly increase the size of each segment, which increases the amount of data to load and therefore the latency.
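The ~150 segments per server figure can be reproduced from the responses in the issue. The sketch below assumes segments are spread evenly across the two servers, which the responses do not confirm; the helper names are hypothetical and not a Pinot API.

```python
def segments_per_server(num_segments_processed, num_servers):
    # Assumption: segments are distributed evenly across servers.
    return num_segments_processed / num_servers


def docs_per_matched_segment(num_docs_scanned, num_segments_matched):
    # Average number of documents scanned in each segment that matched.
    return num_docs_scanned / num_segments_matched


# (numSegmentsProcessed, numSegmentsMatched, numDocsScanned, numServersQueried)
runs = {
    "first run": (301, 299, 810244, 2),
    "max_leaf_records = 1": (309, 309, 280283, 2),
}

for name, (processed, matched, docs, servers) in runs.items():
    print(
        f"{name}: {segments_per_server(processed, servers):.1f} segments/server, "
        f"{docs_per_matched_segment(docs, matched):.0f} docs/matched segment"
    )
```

With only a few thousand documents scanned per segment in either run, the per-segment scan work is small, so the time spent loading segment data is a plausible place for the latency to go.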