You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pulsar.apache.org by Rajan Dhabalia <rd...@apache.org> on 2023/06/20 07:03:30 UTC

Re: [DISCUSS] PIP-268: Add support of topic stats/stats-internal using client api

Hi,

>> >> And for the Admin API, we have several supported params
> getStats(Map<StatsOption,
Object>)/getInternalStats(Map<InternalStatsOption, Object>)
> I will update the PIP with this API change and please let me know if
there is any other suggestion.

I have updated PIP with API change to address concerns of supported params.

Thanks,
Rajan



On Mon, May 15, 2023 at 3:57 PM Rajan Dhabalia <rd...@apache.org> wrote:

> Hi Penghui,
>
>
> >> And for the Admin API, we have several supported params
> Good point regarding the stats/internal-stats params. I think we can pass
> option-map into below GET API for stats so, we can easily enhance if more
> options would be added in future and each option data-type can be
> documented into javadoc and can also be validated at client side before
> sending to server.
> getStats(Map<StatsOption,
> Object>)/getInternalStats(Map<InternalStatsOption, Object>)
> I will update the PIP with this API change and please let me know if there
> is any other suggestion.
>
> >> How do we handle the stats/internal-stats of a partitioned topic?
> PIP already covers this scenario. Please check
> "TopicStatsInfo/TopicInternalStatsInfo" definition. TopicStatsProvider
> should be able to handle partition/non-partitioned requests with the same
> api definition and TopicStatsProvider returns stats-info
> (TopicStatsInfo/TopicInternalStatsInfo) with map of aggregated partition
> response and user can retrieve stats for each partition/topic in case of
> partition/mulit-topic consumer and for non-partition topic it will just
> have entry for that particular non-partition topic but application will not
> have different impl or different api definition for these usecases.
>
> >> Do we need a compatibility section to clarify the behavior of the new
> client pointing to an old broker? What is the exact error that users will
> get if the broker doesn't support the new protocol?
> Generally we don't consider version incompatibility with client version
> upgrade with lower server version. Therefore, as you can see we have never
> handled it for any such feature for any client lib. Generally if a server
> finds any unknown protocol the server ignores it and the client should get
> a timeout exception. So, either we can add server side unknown protocol
> exceptions or we can continue this behavior for now.
>
> Thanks,
> Rajan
>
> On Mon, May 15, 2023 at 6:09 AM PengHui Li <pe...@apache.org> wrote:
>
>> I have no strong objection. Just a few questions.
>>
>> How do we handle the stats/internal-stats of a partitioned topic?
>> As I understand, the client side will combine the stats/internal-stats
>> of the partitions? It's better to clarify in the proposal.
>>
>> Do we need a compatibility section to clarify the behavior of the new
>> client
>> pointing to an old broker? What is the exact error that users will get if
>> the
>> broker doesn't support the new protocol?
>>
>> And for the Admin API, we have several supported params
>>
>> Options for getStats:
>> - getPreciseBacklog
>> - subscriptionBacklogSize
>> - getEarliestTimeInBacklog
>>
>> Options for getInternalStats:
>> - metadata
>>
>> Do we need to add to the binary protocol to make it consistent with the
>> REST API?
>> And the params supported in getStats API are not supported in
>> getInternalStats API.
>> We'd better have a separate message in the wire protocol.
>>
>> Looking further, any other APIs that are supported in Admin API should be
>> considered
>> to add to binary protocol? I mean, we'd better have a strategy to avoid
>> moving many
>> APIs to binary protocol. For example:
>>
>> - Tenant/Namespace level REST APIs, no
>> - Topic policies APIs, no
>> - Topic's internal info, maybe
>> - Expire messages, no
>> - Topic's properties, no
>> - ...
>>
>> Regards,
>> Penghui
>>
>> On Mon, May 15, 2023 at 3:47 PM Rajan Dhabalia <rd...@apache.org>
>> wrote:
>>
>> > Well useful fields really depend on the applications and different
>> usecases
>> > require different fields and it can eventually end up covering all.
>> > Therefore, we should try to create a protocol which can transport
>> > stats/internal and client should be able to represent that data to
>> > applications in a standard format which should be similar to admin-api
>> > response to maintain consistency. Therefore, this PIP provides client
>> side
>> > API where applications can retrieve and access stats/internal in the
>> same
>> > format as admin-api response to have seamless experience with generic
>> > client-server protocol.
>> >
>> > Thanks,
>> > Rajan
>> >
>> > On Mon, May 15, 2023 at 12:25 AM Enrico Olivelli <eo...@gmail.com>
>> > wrote:
>> >
>> > > I understand the need for this PIP and I support it, but I have some
>> > > questions/open points.
>> > >
>> > > I wonder if we should define a smaller subset of the stats/internal
>> > > stats to serve to the clients using this new Request/Response.
>> > > Maybe it is better to serve only the minimal amount of data that is
>> > useful.
>> > >
>> > > If we have a well defined list of stats to serve with this purpose we
>> > > can save us from serving the JSON, that is less efficient,
>> > > we can declare them as protobuf fields and deal better with protocol
>> > > changes and client bindings.
>> > >
>> > >
>> > > Enrico
>> > >
>> > > Il giorno lun 15 mag 2023 alle ore 02:35 Rajan Dhabalia
>> > > <rd...@apache.org> ha scritto:
>> > > >
>> > > > There are multiple factors tcp would be better than http when you
>> are
>> > > > considering performance and scale due to lower overhead, faster
>> > parsing,
>> > > > more control over connection, congestion control, etc and that
>> could be
>> > > > reasons why we have tcp binary protocol for producers/consumers
>> > > > communication with server and this will be an extension for those
>> > > producers
>> > > > and consumers applications which would like to periodically or
>> > on-demand
>> > > > retrieve stats from large number of application nodes and we would
>> like
>> > > to
>> > > > let those producers/consumers retrieve it and use it in application
>> > > without
>> > > > worrying about scaling issues at admin apis.
>> > > >
>> > > > Thanks,
>> > > > Rajan
>> > > >
>> > > > On Sun, May 14, 2023 at 6:47 AM Asaf Mesika <as...@gmail.com>
>> > > wrote:
>> > > >
>> > > > > If I dive into the exact details of the cause of the
>> > > > > performance implications in current Admin HTTP API:
>> > > > >
>> > > > > Do you think the root cause of the performance is the Jersey
>> > > implementation
>> > > > > of `AsyncResponse.resume(stats)` which takes a thread from a
>> thread
>> > > pool,
>> > > > > serialize the object and then performs a blocking I/O write of the
>> > JSON
>> > > > > string?
>> > > > >
>> > > > > If I compared that with Netty HTTP, it would write the same String
>> > > using
>> > > > > async I/O and not blocking a thread.
>> > > > >
>> > > > > Given the normally large response size of those objects, the
>> response
>> > > > > headers or request headers are negligible in terms of performance
>> > > impact.
>> > > > >
>> > > > > In terms of accepting connection, both Netty and Jetty has async
>> IO
>> > > > > implementation.
>> > > > >
>> > > > > Compared Jetty and Jersey with Netty based binary TCP, if we end
>> up
>> > > writing
>> > > > > a JSON string, the only difference I see is the blocking I/O of
>> > > writing the
>> > > > > response.
>> > > > >
>> > > > > WDYT?
>> > > > >
>> > > > >
>> > > > > On Fri, May 12, 2023 at 7:29 AM Rajan Dhabalia <
>> rdhabalia@apache.org
>> > >
>> > > > > wrote:
>> > > > >
>> > > > > > Communicating over binary protocol is more scalable and
>> performant
>> > > than
>> > > > > > HTTP. Admin API over http has a long history of bottleneck and
>> > > > > performance
>> > > > > > issues which could also sometimes be a bottleneck for lookup
>> > > requests and
>> > > > > > that was the reason we introduced lookup over binary protocol as
>> > > well. We
>> > > > > > have multiple usecases which require fetching stats with
>> relatively
>> > > > > higher
>> > > > > > rate and definitely we would like to avoid it over http which
>> could
>> > > be a
>> > > > > > bottleneck for those applications or could be for others.
>> > > > > > This PIP doesn't mention security so, let's not misinterpret the
>> > > > > usecases.
>> > > > > > Sometimes, pulsar is deployed behind the proxy and potentially
>> used
>> > > SNI
>> > > > > > routing proxy which can't be used as http proxy and we would
>> like
>> > to
>> > > let
>> > > > > > users access stats for a given topic using the same
>> broker-service
>> > > url
>> > > > > > rather than having separate http endpoints. So, this api
>> addresses
>> > > scale,
>> > > > > > performance, and use accessibility in pulsar.
>> > > > > >
>> > > > > > Thanks,
>> > > > > > Rajan
>> > > > > >
>> > > > > > On Thu, May 11, 2023 at 6:24 AM Asaf Mesika <
>> asaf.mesika@gmail.com
>> > >
>> > > > > wrote:
>> > > > > >
>> > > > > > > Before I dive into the PIP, I have several questions on the
>> > > background
>> > > > > > > provided below:
>> > > > > > >
>> > > > > > >
>> > > > > > > On Tue, May 9, 2023 at 9:08 AM Rajan Dhabalia <
>> > > rdhabalia@apache.org>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > Hi,
>> > > > > > > >
>> > > > > > > > Right now, Pulsar provides the topic's stats and
>> stats-internal
>> > > over
>> > > > > > HTTP
>> > > > > > > > admin API, and this stats data is used by user applications
>> and
>> > > also
>> > > > > by
>> > > > > > > > Pulsar internal components such as Pulsar-functions to
>> derive
>> > the
>> > > > > > certain
>> > > > > > > > states of the applications.
>> > > > > > > > for example, there are use cases where the application
>> wants to
>> > > check
>> > > > > > the
>> > > > > > > > topic's backlog, subscription's state (readPosition, list of
>> > > > > > > > subscriptions), numberOfEntriesSinceFirstNotAckedMessage,
>> etc
>> > to
>> > > > > > > bootstrap
>> > > > > > > > the application or handle the application’s resiliency and
>> > state
>> > > > > > > > dynamically. Applications can retrieve this stats
>> information
>> > by
>> > > > > using
>> > > > > > > the
>> > > > > > > > broker’s admin HTTP APIs.
>> > > > > > > >
>> > > > > > > > However, stats retrieval over HTTP API doesn’t work well in
>> use
>> > > cases
>> > > > > > > when
>> > > > > > > > users would like to access this API at a higher scale when a
>> > > large
>> > > > > > number
>> > > > > > > > of application nodes would like to use it over HTTP which
>> could
>> > > > > > overload
>> > > > > > > > brokers and sometimes makes broker irresponsive and impact
>> > admin
>> > > API
>> > > > > > > > performance. It also becomes difficult when Pulsar is
>> deployed
>> > > in the
>> > > > > > > cloud
>> > > > > > > > behind the SNI proxy and applications also want to access
>> > > large-scale
>> > > > > > > stats
>> > > > > > > > information periodically over different HTTP ports. Instead
>> it
>> > > would
>> > > > > be
>> > > > > > > > better if applications can fetch stats over on the same
>> binary
>> > > > > protocol
>> > > > > > > for
>> > > > > > > > scalability and accessibility reasons.
>> > > > > > > >
>> > > > > > >
>> > > > > > > Why do you think using a binary protocol over HTTP would make
>> > more
>> > > > > > > performant to respond to multiple calls at once?
>> > > > > > > Same question but for the security issue - why do you think
>> the
>> > > HTTP
>> > > > > port
>> > > > > > > of admin API is harder to access than the binary protocol
>> port?
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > >
>> > > > > > > >
>> > > > > > > > Therefore, there are multiple use cases where
>> producer/consumer
>> > > > > > > > applications need stats information for topics using the
>> client
>> > > > > library
>> > > > > > > > over binary protocol. Hence, this PIP introduces client API
>> for
>> > > > > > producers
>> > > > > > > > and consumers to access topic stats/internal-stats
>> information
>> > > which
>> > > > > > can
>> > > > > > > be
>> > > > > > > > used by applications as needed.
>> > > > > > > >
>> > > > > > > > Please visit and review the PIP:
>> > > > > > > > https://github.com/apache/pulsar/issues/20265
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > Thanks,
>> > > > > > > >
>> > > > > > > > Rajan
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > >
>> >
>>
>