Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2021/05/26 08:15:08 UTC

[GitHub] [incubator-pinot] Eywek opened a new issue #6982: Avoid timeouts when using a large offset limit

Eywek opened a new issue #6982:
URL: https://github.com/apache/incubator-pinot/issues/6982


   Follow up from https://apache-pinot.slack.com/archives/C011C9JHN7R/p1621587669055200
   
   ---------------
   
   We're having an issue with LIMIT on a table with about 4.5 million rows. When I run this query:
   ```sql
   SELECT * FROM datasource_609bc4f74e3c000300131110 ORDER BY "timestamp" ASC LIMIT 100000,10
   ```
   I get a result in ~2.5s, and the query response stats show totalDocs=4794306, which is fine.
   
   But when I run this one (offset 1,000,000 instead of 100,000):
   ```sql
   SELECT * FROM datasource_609bc4f74e3c000300131110 ORDER BY "timestamp" ASC LIMIT 1000000,10
   ```
   I get no rows after ~10s, and totalDocs is only 569840 because the Pinot servers time out:
   <img width="360" alt="Capture d’écran 2021-05-25 à 10 00 34" src="https://user-images.githubusercontent.com/6900054/119625595-a6a95e00-be0a-11eb-81e6-439b24345d94.png">
   
   
   We have a hybrid table with segment pruning by time. Each machine has 16G of heap and around 32G available for mmapped files. Our segments each contain ~390k documents.
   
   How can we solve this issue without increasing timeouts?
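
One possible direction, not confirmed anywhere in this thread, is keyset pagination: instead of a large offset, filter on the last "timestamp" value returned by the previous page, so each query only needs the next few rows. The sketch below builds such a query. The helper name `keyset_page_query` and the example timestamp are hypothetical, and it assumes "timestamp" is a numeric epoch column; rows that share the same timestamp as the last row of a page would be skipped by the strict `>` comparison.

```python
def keyset_page_query(table, last_timestamp=None, page_size=10):
    """Build the query for the next page (hypothetical helper, not a Pinot API).

    last_timestamp is the "timestamp" value of the last row of the previous
    page, or None for the first page. It is assumed to be a numeric epoch.
    """
    if not isinstance(page_size, int) or page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    # No filter on the first page; afterwards, resume strictly after the
    # last value seen instead of using LIMIT offset,count.
    where = ""
    if last_timestamp is not None:
        where = f' WHERE "timestamp" > {int(last_timestamp)}'
    return f'SELECT * FROM {table}{where} ORDER BY "timestamp" ASC LIMIT {page_size}'


# Example with an illustrative (made-up) timestamp value.
print(keyset_page_query("datasource_609bc4f74e3c000300131110", 1621587669055))
```

Because the filter is on the time column, this kind of query may also benefit from the time-based segment pruning mentioned above, though that has not been measured here.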


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


