Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2019/11/02 07:33:01 UTC

[GitHub] [pulsar] candlerb opened a new issue #5537: Reader method to locate message ID by event time

URL: https://github.com/apache/pulsar/issues/5537
 
 
   **Is your feature request related to a problem? Please describe.**
   I would like to be able to initialize a reader using an event time.  Typical use cases:
   * I have a long-lived, persistent topic containing system logs, and I want to do an ad-hoc query saying "show me logs starting from 1 hour ago"
   * I have a reader which is storing its messageID state in a database, and I want to rewind it to a particular point in time
   * I am developing reader code and I want to test it across a subset of recent messages, not from the very beginning of the topic.
   
   Currently when you create a "reader", you can initialize it to start at either the earliest message or the end of the topic - there are special sentinel message IDs for those cases.  You can also initialize it to any existing message ID on a topic, but to do that, you must already know a valid message ID.
   
   **Describe the solution you'd like**
   An API call which, given a topic and event time, returns the message ID of the first message after that time.
   
   This functionality must already exist internally in Pulsar because the admin API has a `reset-cursor` call for subscriptions.  I would like that internal search for event time to be exposed.
   
   I suspect this would end up in the REST API rather than the client protocol.  Ideally it would be accessible to clients without admin privileges.  I found some work (#2964 / #2981) that lets unprivileged consumers make subscription-related admin calls on their own subscriptions, so I'd like the new call to be covered by the same mechanism.
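   
   To make the requested semantics concrete, here is a minimal sketch in Python.  It is not Pulsar code: the topic is modelled as a hypothetical in-memory list of `(message_id, event_time_ms)` pairs assumed to be ordered by event time, and `message_id_after` and the ID strings are illustrative names only.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical stand-in for a topic: (message_id, event_time_ms) pairs,
# assumed to be sorted by event time. Not a real Pulsar data structure.
Log = List[Tuple[str, int]]


def message_id_after(log: Log, event_time_ms: int) -> Optional[str]:
    """Return the ID of the first message strictly after event_time_ms.

    Returns None when no message is newer than the given time.
    """
    times = [t for _, t in log]
    # bisect_right skips every entry whose time is <= event_time_ms.
    i = bisect_right(times, event_time_ms)
    return log[i][0] if i < len(log) else None


if __name__ == "__main__":
    log = [("12:0", 1000), ("12:1", 2000), ("12:2", 3000)]
    print(message_id_after(log, 1500))
```

   The returned ID could then be passed to a reader as its start position, in the same way as any other known message ID.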
   
   **Describe alternatives you've considered**
   For topic X, I could write a separate index topic X1 that records a message ID and timestamp, say one entry for every 10,000 messages in X.  I could then scan X1 for the last timestamp before the time of interest, and read X forward from the corresponding message ID.  This would have to be duplicated for every topic.
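   
   The index-topic lookup can be sketched roughly as follows.  This is only an illustration under simplifying assumptions: X is a plain list of timestamps in increasing order, a list position stands in for the message ID, and `build_index`, `index_start` and `seek` are hypothetical helper names.

```python
from bisect import bisect_left
from typing import List, Optional, Tuple

# One index entry: (timestamp_ms, position in X). Positions stand in for
# message IDs in this simplified model.
Index = List[Tuple[int, int]]


def build_index(topic: List[int], stride: int) -> Index:
    """Emit one index entry for every `stride` messages in X."""
    return [(topic[p], p) for p in range(0, len(topic), stride)]


def index_start(index: Index, target_ms: int) -> int:
    """Position of the last index entry strictly before target_ms, else 0."""
    times = [t for t, _ in index]
    i = bisect_left(times, target_ms) - 1
    return index[i][1] if i >= 0 else 0


def seek(topic: List[int], index: Index, target_ms: int) -> Optional[int]:
    """Scan X forward from the indexed position to the first message after target_ms."""
    for pos in range(index_start(index, target_ms), len(topic)):
        if topic[pos] > target_ms:
            return pos
    return None


if __name__ == "__main__":
    x = [100, 200, 300, 400, 500, 600, 700]
    x1 = build_index(x, stride=3)
    print(seek(x, x1, 450))
```

   The forward scan touches at most about one stride of messages after the indexed position, which is the trade-off between index size and read cost.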
   
   I could create a temporary subscription, use reset-cursor on it, and then either read using a consumer, or read one message and use its message ID to initialize a reader.  Creating and destroying a subscription just for that seems excessive.
   
   **Additional context**
   This would provide feature parity with Kafka: `getOffsetsByTimes` / `offsetsForTimes`
