Posted to commits@pinot.apache.org by "shwin (via GitHub)" <gi...@apache.org> on 2023/07/03 04:43:25 UTC

[GitHub] [pinot] shwin commented on issue #10986: Count discrepancy among `in`/`=` and `<` queries using timestamp index

shwin commented on issue #10986:
URL: https://github.com/apache/pinot/issues/10986#issuecomment-1617292660

   @Jackie-Jiang yep! My latest table returns _3_ segments for the `>=/<` query and _2_ for the `=` query. The table has 3000 segments overall.
   
   For the `>=/<` query:
   ```
   "exceptions": [],
     "numServersQueried": 2,
     "numServersResponded": 2,
     "numSegmentsQueried": 3000,
     "numSegmentsProcessed": 3000,
     "numSegmentsMatched": 3,
     "numConsumingSegmentsQueried": 0,
     "numConsumingSegmentsProcessed": 0,
     "numConsumingSegmentsMatched": 0,
     "numDocsScanned": 3110482,
     "numEntriesScannedInFilter": 0,
     "numEntriesScannedPostFilter": 6220964,
     "numGroupsLimitReached": false,
     "totalDocs": 3994721240,
     "timeUsedMs": 44,
     "offlineThreadCpuTimeNs": 0,
     "realtimeThreadCpuTimeNs": 0,
     "offlineSystemActivitiesCpuTimeNs": 0,
     "realtimeSystemActivitiesCpuTimeNs": 0,
     "offlineResponseSerializationCpuTimeNs": 0,
     "realtimeResponseSerializationCpuTimeNs": 0,
     "offlineTotalCpuTimeNs": 0,
     "realtimeTotalCpuTimeNs": 0,
     "segmentStatistics": [],
     "traceInfo": {},
     "minConsumingFreshnessTimeMs": 0,
     "numSegmentsPrunedByBroker": 0,
     "numSegmentsPrunedByServer": 0,
     "numSegmentsPrunedInvalid": 0,
     "numSegmentsPrunedByLimit": 0,
     "numSegmentsPrunedByValue": 0,
     "explainPlanNumEmptyFilterSegments": 0,
     "explainPlanNumMatchAllFilterSegments": 0,
     "numRowsResultSet": 3
   ```
   
   So specifically:
   ```
     "numSegmentsProcessed": 3000,
     "numSegmentsMatched": 3,
   ```
   
   
   For the `=` query:
   ```
   "numServersQueried": 2,
     "numServersResponded": 2,
     "numSegmentsQueried": 3000,
     "numSegmentsProcessed": 3000,
     "numSegmentsMatched": 2,
     "numConsumingSegmentsQueried": 0,
     "numConsumingSegmentsProcessed": 0,
     "numConsumingSegmentsMatched": 0,
     "numDocsScanned": 2287773,
     "numEntriesScannedInFilter": 0,
     "numEntriesScannedPostFilter": 4575546,
     "numGroupsLimitReached": false,
     "totalDocs": 3994721240,
     "timeUsedMs": 561,
     "offlineThreadCpuTimeNs": 0,
     "realtimeThreadCpuTimeNs": 0,
     "offlineSystemActivitiesCpuTimeNs": 0,
     "realtimeSystemActivitiesCpuTimeNs": 0,
     "offlineResponseSerializationCpuTimeNs": 0,
     "realtimeResponseSerializationCpuTimeNs": 0,
     "offlineTotalCpuTimeNs": 0,
     "realtimeTotalCpuTimeNs": 0,
     "segmentStatistics": [],
     "traceInfo": {},
     "minConsumingFreshnessTimeMs": 0,
     "numSegmentsPrunedByBroker": 0,
     "numSegmentsPrunedByServer": 0,
     "numSegmentsPrunedInvalid": 0,
     "numSegmentsPrunedByLimit": 0,
     "numSegmentsPrunedByValue": 0,
     "explainPlanNumEmptyFilterSegments": 0,
     "explainPlanNumMatchAllFilterSegments": 0,
     "numRowsResultSet": 2
   ```
   
   So specifically:
   ```
     "numSegmentsProcessed": 3000,
     "numSegmentsMatched": 2,
   ```
   
   
   In both cases I'm a little surprised we're processing all 3000 segments. I guess that's because we're querying the generated `timestamp$DAY` columns instead of just the main timestamp column? If I do a between query on just my timestamp column (e.g. `blocktimestamp >= something AND blockTimestamp < something`), I get 3 segments processed.
   
   So, in any case, I guess pruning isn't the cause here, since `numSegmentsProcessed` is identical in both queries.
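
To make the comparison easier to eyeball, here is a minimal Python sketch that diffs the two sets of response stats. The numbers are copied from the responses above; the `diff_stats` helper and the variable names are just illustrative, not part of Pinot. The "missing docs" figure assumes the extra documents all sit in the one segment the `=` query does not match, which I haven't verified.

```python
# Stats copied from the two broker responses above (only the fields of interest).
range_query = {
    "numSegmentsQueried": 3000,
    "numSegmentsProcessed": 3000,
    "numSegmentsMatched": 3,
    "numDocsScanned": 3110482,
    "numEntriesScannedInFilter": 0,
    "numEntriesScannedPostFilter": 6220964,
    "timeUsedMs": 44,
    "numRowsResultSet": 3,
}
equals_query = {
    "numSegmentsQueried": 3000,
    "numSegmentsProcessed": 3000,
    "numSegmentsMatched": 2,
    "numDocsScanned": 2287773,
    "numEntriesScannedInFilter": 0,
    "numEntriesScannedPostFilter": 4575546,
    "timeUsedMs": 561,
    "numRowsResultSet": 2,
}


def diff_stats(a, b):
    """Return {field: (a_value, b_value)} for shared fields whose values differ."""
    return {k: (a[k], b[k]) for k in a if k in b and a[k] != b[k]}


differences = diff_stats(range_query, equals_query)
for field, (range_value, equals_value) in differences.items():
    print(f"{field}: >=/< -> {range_value}, = -> {equals_value}")

# Docs the `=` query does not scan. Assumption: they all belong to the
# single segment that the `=` query fails to match.
missing_docs = range_query["numDocsScanned"] - equals_query["numDocsScanned"]
print("docs not scanned by the = query:", missing_docs)
```

`numSegmentsQueried` and `numSegmentsProcessed` drop out of the diff, while `numSegmentsMatched`, `numDocsScanned`, `numEntriesScannedPostFilter`, `timeUsedMs` and `numRowsResultSet` all differ, which is consistent with the mismatch happening at match time rather than during pruning.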
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

