Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/01/11 21:11:13 UTC

[GitHub] [pinot] lksvenoy-r7 opened a new issue #7994: RealtimeProvisioningHelper does not account for deepStore

lksvenoy-r7 opened a new issue #7994:
URL: https://github.com/apache/pinot/issues/7994


   The RealtimeProvisioningHelper assumes that all segments must be held in memory on the realtime host. This isn't the case if you are pushing completed segments to deep store. It would be nice if we could tell the RealtimeProvisioningHelper that deep store is enabled so that it can account for this.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] mcvsubbu commented on issue #7994: RealtimeProvisioningHelper does not account for deepStore

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #7994:
URL: https://github.com/apache/pinot/issues/7994#issuecomment-1010368708


   @lksvenoy-r7 thanks for filing the issue. One question: all completed realtime segments are indeed stored in deep store. The question is whether a copy also needs to be kept in memory (on the server) or not.
   Currently, there is an argument `retentionHours` that can be provided, which says how many hours of data are typically queried. (See https://docs.pinot.apache.org/operators/operating-pinot/tuning/realtime#realtimeprovisioninghelper)
   
   If you only query the last N hours most of the time, and do not mind the additional latency of paging for the other queries, you can specify this while invoking the command.
   
   Does that help?




[GitHub] [pinot] mcvsubbu commented on issue #7994: RealtimeProvisioningHelper does not account for deepStore

Posted by GitBox <gi...@apache.org>.
mcvsubbu commented on issue #7994:
URL: https://github.com/apache/pinot/issues/7994#issuecomment-1011300929


   OK, I see where the confusion arises. We will address it.




[GitHub] [pinot] lksvenoy-r7 commented on issue #7994: RealtimeProvisioningHelper does not account for deepStore

Posted by GitBox <gi...@apache.org>.
lksvenoy-r7 commented on issue #7994:
URL: https://github.com/apache/pinot/issues/7994#issuecomment-1010813990


   @mcvsubbu that definitely helps, and it is what I have been doing. It is misleading, though, because retentionHours should correlate directly with the retention of the table. If a query causes a server to load segments from deep store, memory usage is going to be higher. Separating those two concepts, table retention and deep store, would give a more accurate picture of how much memory is used during consumption and how much is used when querying N time buckets of data. Perhaps I am misunderstanding, but let me know what your thoughts are.
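
   To make the distinction concrete, here is a rough sketch of the arithmetic. It is not the RealtimeProvisioningHelper's actual formula: the function name, its parameters, and all the numbers are hypothetical, and it assumes every partition on a host completes one fixed-size segment every `hours_per_segment` hours.

```python
import math


def estimate_host_memory_gb(partitions_per_host, consuming_segment_gb,
                            completed_segment_gb, hours_per_segment,
                            resident_hours):
    """Return (consuming_gb, completed_gb) for one host.

    Hypothetical model: memory for consuming segments depends only on the
    number of partitions, while memory for completed segments depends on
    how many hours of data are kept resident on the server.
    """
    # Completed segments per partition needed to cover the resident window.
    resident_segments = math.ceil(resident_hours / hours_per_segment)
    consuming_gb = partitions_per_host * consuming_segment_gb
    completed_gb = partitions_per_host * resident_segments * completed_segment_gb
    return consuming_gb, completed_gb


# Sizing for the full table retention (72 hours, hypothetical).
full_retention = estimate_host_memory_gb(4, 0.5, 0.25, 6, 72)

# Sizing for only the hours that are typically queried (12 hours, hypothetical).
queried_only = estimate_host_memory_gb(4, 0.5, 0.25, 6, 12)

print(full_retention)
print(queried_only)
```

   In this model the consuming part stays the same in both cases and only the completed part changes with the window, which is why reporting the two separately would be clearer than overloading a single retention value.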

