Posted to commits@pinot.apache.org by GitBox <gi...@apache.org> on 2022/04/05 15:59:09 UTC

[GitHub] [pinot] MrNeocore opened a new issue, #8469: [Feature request] Notify when a new OFFLINE segment is query-able

MrNeocore opened a new issue, #8469:
URL: https://github.com/apache/pinot/issues/8469

   Following Slack discussion: https://apache-pinot.slack.com/archives/C011C9JHN7R/p1649059683439889
   
   **Summary:**
   In some cases, there is a need to know when / if a new OFFLINE segment is ready for query.
   For example:
   1. A new segment is pushed to Pinot's controller via a `SegmentCreationAndTarPush` Job.
   2. The job ends as soon as this segment is copied over to the Controller (as far as I know).
   3. Some services may be interested in the fact that new data is available (e.g. to update some materialized view); we can notify them using a pub/sub system, for example.
   4. Those services then query Pinot for the updated data.
   5. As long as Pinot's servers are not done downloading the new segment from the deep store, they serve "stale" data.
   
   => The root issue comes from the fact that the business-level notify message is tied to the end of the `SegmentCreationAndTarPush` Job and not the true availability of new data through Pinot.
   
   **Proposal**
   1. Implement a callback mechanism in Pinot that is invoked as soon as the data is ready for query _(By @snleee)_
   - Could be a generic Java function callback
   - Could be a message being sent on a pub/sub system
   - Could be a call to some external REST API
   
   2. Implement a `waitForDataAvailability` option (poor name) in Pinot's push jobs.
   
   3. Other?
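
For illustration, proposals 1 and 2 can be combined into one small helper: block until a readiness check passes, then fire a callback. This is only a sketch of the idea, not an existing Pinot option. The names `wait_for_data_availability`, `is_ready` and `on_ready` are hypothetical, and the readiness check is left to the caller.

```python
import time
from typing import Callable, Optional


def wait_for_data_availability(
    is_ready: Callable[[], bool],
    on_ready: Optional[Callable[[], None]] = None,
    timeout_s: float = 300.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll is_ready() until it returns True or timeout_s elapses.

    Hypothetical helper: is_ready is whatever readiness check the caller
    trusts, and on_ready stands in for the notification hook (a pub/sub
    publish, a REST call, ...). Returns True if the data became ready.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            if on_ready is not None:
                on_ready()  # notify downstream services only once data is ready
            return True
        if clock() >= deadline:
            return False  # caller decides how to handle the timeout
        sleep(poll_interval_s)
```

A push job could call this right after the upload step, so that the business-level notification is sent from `on_ready` rather than at the end of the upload.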
   
   **Current workarounds**
   1. Wait for a fixed time: brittle solution
   2. Wait for IdealState (IS) == ExternalView (EV)  _(By @mayankshriv)_
   3. Poll the segment's status (requires knowing its name)
   4. Other?
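
Workaround 2 can be sketched as a comparison of the two views fetched from the controller. The sketch below assumes the controller serves `GET /tables/{tableName}/idealstate` and `GET /tables/{tableName}/externalview`, and that each response holds a segment -> {server: state} map under an `"OFFLINE"` key; verify both against your Pinot version. The function names are hypothetical.

```python
import json
from typing import Dict, List
from urllib.request import urlopen

# segment name -> {server instance -> state}, e.g. {"seg_0": {"Server_1": "ONLINE"}}
SegmentMap = Dict[str, Dict[str, str]]


def fetch_table_view(controller_url: str, table: str, view: str) -> SegmentMap:
    """Fetch "idealstate" or "externalview" for the OFFLINE table.

    Assumption: the response is JSON with the segment map under "OFFLINE".
    """
    with urlopen(f"{controller_url}/tables/{table}/{view}") as resp:
        body = json.load(resp)
    return body.get("OFFLINE") or {}


def unconverged_segments(ideal_state: SegmentMap, external_view: SegmentMap) -> List[str]:
    """Return segments whose external view does not yet match the ideal state."""
    return sorted(
        segment
        for segment, replicas in ideal_state.items()
        if external_view.get(segment, {}) != replicas
    )
```

An empty result from `unconverged_segments` means IS == EV for the table. It says nothing about whether the brokers have already picked up the change in their routing.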


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: commits-unsubscribe@pinot.apache.org
For additional commands, e-mail: commits-help@pinot.apache.org


[GitHub] [pinot] Jackie-Jiang commented on issue #8469: [Feature request] Notify when a new OFFLINE segment is query-able

Posted by GitBox <gi...@apache.org>.
Jackie-Jiang commented on issue #8469:
URL: https://github.com/apache/pinot/issues/8469#issuecomment-1090601570

   To know whether a new segment is ready for query, the information should come from the broker. Here are the steps for a new segment to become queryable:
   1. Segment uploaded to the deep store via controller (or metadata push)
   2. Server gets notified to download and load the segment
   3. Controller collects server status and updates table's external view
   4. Broker listens for the external view change and includes the segment in its routing. The segment is queryable only when it is loaded on an enabled server that is not disabled for routing.
   
   Would it solve the problem if we added an API to the broker that returns whether a given segment is ready for query (and all queryable segments for a given table)? We could also add the API to the controller, which would redirect to the broker.
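
The check such an API would perform can be sketched from the rule in step 4: a segment is queryable when at least one replica is ONLINE on a server the broker still routes to. This is not Pinot's broker code. The function names, the `routable_servers` input (enabled servers that are not disabled for routing) and the shape of the external view map are assumptions made for illustration.

```python
from typing import Dict, Iterable, List

# segment name -> {server instance -> state}
SegmentMap = Dict[str, Dict[str, str]]


def queryable_segments(external_view: SegmentMap, routable_servers: Iterable[str]) -> List[str]:
    """Return segments with at least one ONLINE replica on a routable server."""
    routable = set(routable_servers)
    return sorted(
        segment
        for segment, replicas in external_view.items()
        if any(
            state == "ONLINE" and server in routable
            for server, state in replicas.items()
        )
    )


def is_segment_queryable(
    external_view: SegmentMap, routable_servers: Iterable[str], segment: str
) -> bool:
    """Answer the per-segment question using the same rule."""
    return segment in queryable_segments(external_view, routable_servers)
```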

