You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@ignite.apache.org by Pavel Tupitsyn <pt...@apache.org> on 2021/05/18 08:47:02 UTC

Re: IEP-68: Thin Client Data Streamer

Igniters,

I've implemented a "stateful" approach (see IEP) in .NET Thin Client:
client-side flush adds data to server-side streamer, but does not flush it
immediately.
It provides ~35% perf improvement over stateless approach (where every
client request opens a new streamer, adds data, closes streamer).

However, in this case we lose server-side buffered data if the server node
fails, which seems to be a serious issue.
On the client we don't even know which part of data was lost, so we can't
re-send this data.

Therefore, a stateless approach seems to be the only proper way:
once a client request completes, the data is guaranteed to be in the cache,
and failed requests (connection loss, node failure) can be retried.
This simplifies both client and server implementations.

Thoughts?

On Tue, Mar 9, 2021 at 6:32 PM Pavel Tupitsyn <pt...@apache.org> wrote:

> Alex,
>
> I did not include this for two reasons:
>
> 1) Existing thick API behavior is very confusing and counter-intuitive:
> it returns a Future that won't complete unless you call more APIs.
>
> I've seen multiple users doing this:
> await streamer.Add(1, 1);
> await streamer.Add(2, 2);
>
> This code hangs - the first Add will never complete, because the batch is
> not sent,
> and the batch won't be sent because we are awaiting the first Add.
>
> 2) I'm not sure what's the use case for this feature - why do we want to
> know
> when the flush happens?
>
> We can come up with a better API, but should we bother at all?
>
> On Fri, Mar 5, 2021 at 6:09 PM Alex Plehanov <pl...@gmail.com>
> wrote:
>
>> Pavel,
>>
>> IEP looks good to me overall. I have only one concern: currently, for
>> thick
>> client "add" method we return the future which completes when the current
>> batch is actually added to the cache, even if "flush" method is not
>> invoked
>> explicitly. For thin client with the proposed protocol we can't provide
>> such functionality without explicit "flush". Perhaps we should notify
>> client when batch actually flushed and send this notification when some
>> "require notification" flag is sent with the request. WDYT?
>>
>> пт, 5 мар. 2021 г. в 17:03, Ivan Daschinsky <iv...@gmail.com>:
>>
>> > Agree, this is completely offtopic here.
>> >
>> > пт, 5 мар. 2021 г. в 17:02, Pavel Tupitsyn <pt...@apache.org>:
>> >
>> > > > custom DSL
>> > > This is a topic for 3.0 - would you like to start a separate thread?
>> > >
>> > > On Fri, Mar 5, 2021 at 4:54 PM Ivan Daschinsky <iv...@gmail.com>
>> > > wrote:
>> > >
>> > > > I suppose, that the best variantl -- custom DSL similar to MongoDB
>> > query
>> > > > language.
>> > > > I understand that implementing this is much harder, but users want
>> this
>> > > > variant.
>> > > >
>> > > > пт, 5 мар. 2021 г. в 16:52, Pavel Tupitsyn <pt...@apache.org>:
>> > > >
>> > > > > > I suppose this is of questional usability, same for existing
>> > > > > > implementation for ScanQuery and ContinousQuery
>> > > > >
>> > > > > One way or another, we need to be able to run custom logic
>> > > > > on the server side from thin clients.
>> > > > >
>> > > > > Our current direction can be seen as "Java is our DSL":
>> > > > > write server-side logic in Java, deploy to servers, call from thin
>> > > > clients.
>> > > > >
>> > > > > This approach is already used for Services, Compute, ScanQuery,
>> > > > > ContinuousQuery.
>> > > > >
>> > > > > If you have a better idea in mind, please share.
>> > > > >
>> > > > > On Fri, Mar 5, 2021 at 4:12 PM Ivan Daschinsky <
>> ivandasch@gmail.com>
>> > > > > wrote:
>> > > > >
>> > > > > > >>> - Corresponding types should exist on server nodes
>> > > > > > Ok, but I suppose this is of questional usability, same for
>> > existing
>> > > > > > implementation for ScanQuery and ContinousQuery.
>> > > > > > For queries it's probably ok to add some custom filters and put
>> > them
>> > > in
>> > > > > > servers' classpathes, but I can hardly imagine
>> > > > > > somebody wants to create custom Receiver this way.
>> > > > > >
>> > > > > > пт, 5 мар. 2021 г. в 15:54, Pavel Tupitsyn <
>> ptupitsyn@apache.org>:
>> > > > > >
>> > > > > > > > How do you suggests to serialize this object?
>> > > > > > >
>> > > > > > > Like a normal binary object. This scenario already exists for
>> > > > ScanQuery
>> > > > > > and
>> > > > > > > ContinuousQuery filters.
>> > > > > > > - It works for Java and .NET; potentially for other platforms
>> > > > > > > - Corresponding types should exist on server nodes
>> > > > > > >
>> > > > > > > E.g. if a Java class `MyFilter { String containsText }` is
>> loaded
>> > > on
>> > > > > > server
>> > > > > > > nodes,
>> > > > > > > a Python thin client can easily write a corresponding
>> > BinaryObject
>> > > > with
>> > > > > > one
>> > > > > > > field to achieve server-side filtering.
>> > > > > > >
>> > > > > > >
>> > > > > > > > I also suppose, that closing should be done wia
>> > OP_RESOURCE_CLOSE
>> > > > > > >
>> > > > > > > Thanks, forgot to mention this - updated the IEP.
>> > > > > > >
>> > > > > > >
>> > > > > > > On Fri, Mar 5, 2021 at 3:42 PM Ivan Daschinsky <
>> > > ivandasch@gmail.com>
>> > > > > > > wrote:
>> > > > > > >
>> > > > > > > > I also suppose, that closing should be done wia
>> > > OP_RESOURCE_CLOSE.
>> > > > > > It'is
>> > > > > > > > more consistent and resembles with existing cursor api.
>> > > > > > > >
>> > > > > > > > пт, 5 мар. 2021 г. в 15:37, Ivan Daschinsky <
>> > ivandasch@gmail.com
>> > > >:
>> > > > > > > >
>> > > > > > > > > >> Both proposed requests have a Flush flag - please see
>> > > Details
>> > > > > > > sections
>> > > > > > > > > in the IEP.
>> > > > > > > > > Ok, sorry, I missed this.
>> > > > > > > > > >> StreamReceiver is public API [1]
>> > > > > > > > > I know it. But this "object" should contains custom logic
>> for
>> > > > > putting
>> > > > > > > > data
>> > > > > > > > > in cache. How do you suggests to serialize this object?
>> > > > > > > > > Implement custom classloader for java? What about other
>> > > > platforms?
>> > > > > > > > >
>> > > > > > > > > I suppose that we should not add this field, because it is
>> > > > > confusing.
>> > > > > > > > > Everything will work for local unit tests and only for
>> java.
>> > > > > > > > > But in real environment this will not work at all.
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > пт, 5 мар. 2021 г. в 15:23, Pavel Tupitsyn <
>> > > ptupitsyn@apache.org
>> > > > >:
>> > > > > > > > >
>> > > > > > > > >> Ivan,
>> > > > > > > > >>
>> > > > > > > > >> > Receiver is mostly internal stuff
>> > > > > > > > >> StreamReceiver is public API [1]
>> > > > > > > > >>
>> > > > > > > > >> > STREAMER_FLUSH_REQUEST should be added too
>> > > > > > > > >> Both proposed requests have a Flush flag - please see
>> > Details
>> > > > > > sections
>> > > > > > > > in
>> > > > > > > > >> the IEP.
>> > > > > > > > >> When user code calls client-side Flush method, we send
>> the
>> > > > current
>> > > > > > > > >> client-side batch with that flag enabled,
>> > > > > > > > >> and flush server-side batches too.
>> > > > > > > > >>
>> > > > > > > > >> A separate request for that does not seem to be
>> necessary,
>> > or
>> > > > am I
>> > > > > > > > missing
>> > > > > > > > >> some different use case?
>> > > > > > > > >>
>> > > > > > > > >> [1]
>> > > > > > > > >>
>> > > > > > > > >>
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/stream/package-summary.html
>> > > > > > > > >>
>> > > > > > > > >>
>> > > > > > > > >> On Fri, Mar 5, 2021 at 2:50 PM Ivan Daschinsky <
>> > > > > ivandasch@gmail.com
>> > > > > > >
>> > > > > > > > >> wrote:
>> > > > > > > > >>
>> > > > > > > > >> > IMHO, also STREAMER_FLUSH_REQUEST should be added too.
>> > > Client
>> > > > > can
>> > > > > > > > flush
>> > > > > > > > >> > large batches instead of closing resource.
>> > > > > > > > >> > IMHO this is preferable than creating streamer per
>> batch.
>> > > > > > > > >> >
>> > > > > > > > >> > пт, 5 мар. 2021 г. в 14:48, Ivan Daschinsky <
>> > > > > ivandasch@gmail.com
>> > > > > > >:
>> > > > > > > > >> >
>> > > > > > > > >> > > Pavel, thank you for your effort.
>> > > > > > > > >> > >
>> > > > > > > > >> > > BTW, are you sure that receiver should be a param of
>> > > > > > > > >> > > STREAMER_START_REQUEST?
>> > > > > > > > >> > > Receiver is mostly internal stuff and I can hardly
>> > imagine
>> > > > why
>> > > > > > > > >> > > someone would decide to change defaults.
>> > > > > > > > >> > >
>> > > > > > > > >> > > пт, 5 мар. 2021 г. в 13:01, Pavel Tupitsyn <
>> > > > > > ptupitsyn@apache.org
>> > > > > > > >:
>> > > > > > > > >> > >
>> > > > > > > > >> > >> Igor,
>> > > > > > > > >> > >>
>> > > > > > > > >> > >> As per our private conversation, I'll try to provide
>> > some
>> > > > > perf
>> > > > > > > > >> > >> measurements
>> > > > > > > > >> > >> for those 4 variants, and more detailed descriptions
>> > too.
>> > > > > > > > >> > >>
>> > > > > > > > >> > >> Thanks!
>> > > > > > > > >> > >>
>> > > > > > > > >> > >> On Fri, Mar 5, 2021 at 12:55 PM Igor Sapego <
>> > > > > > isapego@apache.org>
>> > > > > > > > >> wrote:
>> > > > > > > > >> > >>
>> > > > > > > > >> > >> > Pavel,
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >> > I've checked the IEP and I like it. The only thing
>> > that
>> > > > > > seems a
>> > > > > > > > bit
>> > > > > > > > >> > >> > confusing to me
>> > > > > > > > >> > >> > is that there are 4 different variants for clients
>> > but
>> > > > > there
>> > > > > > > are
>> > > > > > > > >> cons
>> > > > > > > > >> > >> and
>> > > > > > > > >> > >> > pros for
>> > > > > > > > >> > >> > different variants. Maybe at least few sentences
>> > should
>> > > > be
>> > > > > > > > written
>> > > > > > > > >> > here
>> > > > > > > > >> > >> to
>> > > > > > > > >> > >> > give developers who are not familiar with
>> > DataStreamer
>> > > > some
>> > > > > > > ideas
>> > > > > > > > >> and
>> > > > > > > > >> > >> make
>> > > > > > > > >> > >> > it easier for them to choose.
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >> > Best Regards,
>> > > > > > > > >> > >> > Igor
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >> > On Thu, Mar 4, 2021 at 3:14 PM Pavel Tupitsyn <
>> > > > > > > > >> ptupitsyn@apache.org>
>> > > > > > > > >> > >> > wrote:
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >> > > Igniters,
>> > > > > > > > >> > >> > >
>> > > > > > > > >> > >> > > Please review the IEP [1] and let me know what
>> you
>> > > > think.
>> > > > > > > > >> > >> > >
>> > > > > > > > >> > >> > > [1]
>> > > > > > > > >> > >> > >
>> > > > > > > > >> > >> > >
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >>
>> > > > > > > > >> >
>> > > > > > > > >>
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > >
>> > > >
>> > >
>> >
>> https://cwiki.apache.org/confluence/display/IGNITE/IEP-68%3A+Thin+Client+Data+Streamer
>> > > > > > > > >> > >> > >
>> > > > > > > > >> > >> >
>> > > > > > > > >> > >>
>> > > > > > > > >> > >
>> > > > > > > > >> > >
>> > > > > > > > >> > > --
>> > > > > > > > >> > > Sincerely yours, Ivan Daschinskiy
>> > > > > > > > >> > >
>> > > > > > > > >> >
>> > > > > > > > >> >
>> > > > > > > > >> > --
>> > > > > > > > >> > Sincerely yours, Ivan Daschinskiy
>> > > > > > > > >> >
>> > > > > > > > >>
>> > > > > > > > >
>> > > > > > > > >
>> > > > > > > > > --
>> > > > > > > > > Sincerely yours, Ivan Daschinskiy
>> > > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > --
>> > > > > > > > Sincerely yours, Ivan Daschinskiy
>> > > > > > > >
>> > > > > > >
>> > > > > >
>> > > > > >
>> > > > > > --
>> > > > > > Sincerely yours, Ivan Daschinskiy
>> > > > > >
>> > > > >
>> > > >
>> > > >
>> > > > --
>> > > > Sincerely yours, Ivan Daschinskiy
>> > > >
>> > >
>> >
>> >
>> > --
>> > Sincerely yours, Ivan Daschinskiy
>> >
>>
>

Re: IEP-68: Thin Client Data Streamer

Posted by Pavel Tupitsyn <pt...@apache.org>.
Changes are ready for review:

https://issues.apache.org/jira/browse/IGNITE-14187
https://github.com/apache/ignite/pull/8847

On Tue, May 18, 2021 at 12:03 PM Igor Sapego <is...@gridgain.com> wrote:

> Pavel,
>
> +1 from me for a stateless approach. We could definitely invent something
> here to make a stateful approach working but that would make this feature
> too complex and error-prone, IMO. It makes sense to file an improvement
> ticket with benchmark results and maybe code draft if we decide to move
> this way in future though.
>
> Best Regards,
> Igor
>
>
> On Tue, May 18, 2021 at 11:47 AM Pavel Tupitsyn <pt...@apache.org>
> wrote:
>
> > Igniters,
> >
> > I've implemented a "stateful" approach (see IEP) in .NET Thin Client:
> > client-side flush adds data to server-side streamer, but does not flush
> it
> > immediately.
> > It provides ~35% perf improvement over stateless approach (where every
> > client request opens a new streamer, adds data, closes streamer).
> >
> > However, in this case we lose server-side buffered data if the server
> node
> > fails, which seems to be a serious issue.
> > On the client we don't even know which part of data was lost, so we can't
> > re-send this data.
> >
> > Therefore, a stateless approach seems to be the only proper way:
> > once a client request completes, the data is guaranteed to be in the
> cache,
> > and failed requests (connection loss, node failure) can be retried.
> > This simplifies both client and server implementations.
> >
> > Thoughts?
> >
> > On Tue, Mar 9, 2021 at 6:32 PM Pavel Tupitsyn <pt...@apache.org>
> > wrote:
> >
> > > Alex,
> > >
> > > I did not include this for two reasons:
> > >
> > > 1) Existing thick API behavior is very confusing and counter-intuitive:
> > > it returns a Future that won't complete unless you call more APIs.
> > >
> > > I've seen multiple users doing this:
> > > await streamer.Add(1, 1);
> > > await streamer.Add(2, 2);
> > >
> > > This code hangs - the first Add will never complete, because the batch
> is
> > > not sent,
> > > and the batch won't be sent because we are awaiting the first Add.
> > >
> > > 2) I'm not sure what's the use case for this feature - why do we want
> to
> > > know
> > > when the flush happens?
> > >
> > > We can come up with a better API, but should we bother at all?
> > >
> > > On Fri, Mar 5, 2021 at 6:09 PM Alex Plehanov <pl...@gmail.com>
> > > wrote:
> > >
> > >> Pavel,
> > >>
> > >> IEP looks good to me overall. I have only one concern: currently, for
> > >> thick
> > >> client "add" method we return the future which completes when the
> > current
> > >> batch is actually added to the cache, even if "flush" method is not
> > >> invoked
> > >> explicitly. For thin client with the proposed protocol we can't
> provide
> > >> such functionality without explicit "flush". Perhaps we should notify
> > >> client when batch actually flushed and send this notification when
> some
> > >> "require notification" flag is sent with the request. WDYT?
> > >>
> > >> пт, 5 мар. 2021 г. в 17:03, Ivan Daschinsky <iv...@gmail.com>:
> > >>
> > >> > Agree, this is completely offtopic here.
> > >> >
> > >> > пт, 5 мар. 2021 г. в 17:02, Pavel Tupitsyn <pt...@apache.org>:
> > >> >
> > >> > > > custom DSL
> > >> > > This is a topic for 3.0 - would you like to start a separate
> thread?
> > >> > >
> > >> > > On Fri, Mar 5, 2021 at 4:54 PM Ivan Daschinsky <
> ivandasch@gmail.com
> > >
> > >> > > wrote:
> > >> > >
> > >> > > > I suppose, that the best variantl -- custom DSL similar to
> MongoDB
> > >> > query
> > >> > > > language.
> > >> > > > I understand that implementing this is much harder, but users
> want
> > >> this
> > >> > > > variant.
> > >> > > >
> > >> > > > пт, 5 мар. 2021 г. в 16:52, Pavel Tupitsyn <
> ptupitsyn@apache.org
> > >:
> > >> > > >
> > >> > > > > > I suppose this is of questional usability, same for existing
> > >> > > > > > implementation for ScanQuery and ContinousQuery
> > >> > > > >
> > >> > > > > One way or another, we need to be able to run custom logic
> > >> > > > > on the server side from thin clients.
> > >> > > > >
> > >> > > > > Our current direction can be seen as "Java is our DSL":
> > >> > > > > write server-side logic in Java, deploy to servers, call from
> > thin
> > >> > > > clients.
> > >> > > > >
> > >> > > > > This approach is already used for Services, Compute,
> ScanQuery,
> > >> > > > > ContinuousQuery.
> > >> > > > >
> > >> > > > > If you have a better idea in mind, please share.
> > >> > > > >
> > >> > > > > On Fri, Mar 5, 2021 at 4:12 PM Ivan Daschinsky <
> > >> ivandasch@gmail.com>
> > >> > > > > wrote:
> > >> > > > >
> > >> > > > > > >>> - Corresponding types should exist on server nodes
> > >> > > > > > Ok, but I suppose this is of questional usability, same for
> > >> > existing
> > >> > > > > > implementation for ScanQuery and ContinousQuery.
> > >> > > > > > For queries it's probably ok to add some custom filters and
> > put
> > >> > them
> > >> > > in
> > >> > > > > > servers' classpathes, but I can hardly imagine
> > >> > > > > > somebody wants to create custom Receiver this way.
> > >> > > > > >
> > >> > > > > > пт, 5 мар. 2021 г. в 15:54, Pavel Tupitsyn <
> > >> ptupitsyn@apache.org>:
> > >> > > > > >
> > >> > > > > > > > How do you suggests to serialize this object?
> > >> > > > > > >
> > >> > > > > > > Like a normal binary object. This scenario already exists
> > for
> > >> > > > ScanQuery
> > >> > > > > > and
> > >> > > > > > > ContinuousQuery filters.
> > >> > > > > > > - It works for Java and .NET; potentially for other
> > platforms
> > >> > > > > > > - Corresponding types should exist on server nodes
> > >> > > > > > >
> > >> > > > > > > E.g. if a Java class `MyFilter { String containsText }` is
> > >> loaded
> > >> > > on
> > >> > > > > > server
> > >> > > > > > > nodes,
> > >> > > > > > > a Python thin client can easily write a corresponding
> > >> > BinaryObject
> > >> > > > with
> > >> > > > > > one
> > >> > > > > > > field to achieve server-side filtering.
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > > I also suppose, that closing should be done wia
> > >> > OP_RESOURCE_CLOSE
> > >> > > > > > >
> > >> > > > > > > Thanks, forgot to mention this - updated the IEP.
> > >> > > > > > >
> > >> > > > > > >
> > >> > > > > > > On Fri, Mar 5, 2021 at 3:42 PM Ivan Daschinsky <
> > >> > > ivandasch@gmail.com>
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > I also suppose, that closing should be done wia
> > >> > > OP_RESOURCE_CLOSE.
> > >> > > > > > It'is
> > >> > > > > > > > more consistent and resembles with existing cursor api.
> > >> > > > > > > >
> > >> > > > > > > > пт, 5 мар. 2021 г. в 15:37, Ivan Daschinsky <
> > >> > ivandasch@gmail.com
> > >> > > >:
> > >> > > > > > > >
> > >> > > > > > > > > >> Both proposed requests have a Flush flag - please
> see
> > >> > > Details
> > >> > > > > > > sections
> > >> > > > > > > > > in the IEP.
> > >> > > > > > > > > Ok, sorry, I missed this.
> > >> > > > > > > > > >> StreamReceiver is public API [1]
> > >> > > > > > > > > I know it. But this "object" should contains custom
> > logic
> > >> for
> > >> > > > > putting
> > >> > > > > > > > data
> > >> > > > > > > > > in cache. How do you suggests to serialize this
> object?
> > >> > > > > > > > > Implement custom classloader for java? What about
> other
> > >> > > > platforms?
> > >> > > > > > > > >
> > >> > > > > > > > > I suppose that we should not add this field, because
> it
> > is
> > >> > > > > confusing.
> > >> > > > > > > > > Everything will work for local unit tests and only for
> > >> java.
> > >> > > > > > > > > But in real environment this will not work at all.
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > пт, 5 мар. 2021 г. в 15:23, Pavel Tupitsyn <
> > >> > > ptupitsyn@apache.org
> > >> > > > >:
> > >> > > > > > > > >
> > >> > > > > > > > >> Ivan,
> > >> > > > > > > > >>
> > >> > > > > > > > >> > Receiver is mostly internal stuff
> > >> > > > > > > > >> StreamReceiver is public API [1]
> > >> > > > > > > > >>
> > >> > > > > > > > >> > STREAMER_FLUSH_REQUEST should be added too
> > >> > > > > > > > >> Both proposed requests have a Flush flag - please see
> > >> > Details
> > >> > > > > > sections
> > >> > > > > > > > in
> > >> > > > > > > > >> the IEP.
> > >> > > > > > > > >> When user code calls client-side Flush method, we
> send
> > >> the
> > >> > > > current
> > >> > > > > > > > >> client-side batch with that flag enabled,
> > >> > > > > > > > >> and flush server-side batches too.
> > >> > > > > > > > >>
> > >> > > > > > > > >> A separate request for that does not seem to be
> > >> necessary,
> > >> > or
> > >> > > > am I
> > >> > > > > > > > missing
> > >> > > > > > > > >> some different use case?
> > >> > > > > > > > >>
> > >> > > > > > > > >> [1]
> > >> > > > > > > > >>
> > >> > > > > > > > >>
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/stream/package-summary.html
> > >> > > > > > > > >>
> > >> > > > > > > > >>
> > >> > > > > > > > >> On Fri, Mar 5, 2021 at 2:50 PM Ivan Daschinsky <
> > >> > > > > ivandasch@gmail.com
> > >> > > > > > >
> > >> > > > > > > > >> wrote:
> > >> > > > > > > > >>
> > >> > > > > > > > >> > IMHO, also STREAMER_FLUSH_REQUEST should be added
> > too.
> > >> > > Client
> > >> > > > > can
> > >> > > > > > > > flush
> > >> > > > > > > > >> > large batches instead of closing resource.
> > >> > > > > > > > >> > IMHO this is preferable than creating streamer per
> > >> batch.
> > >> > > > > > > > >> >
> > >> > > > > > > > >> > пт, 5 мар. 2021 г. в 14:48, Ivan Daschinsky <
> > >> > > > > ivandasch@gmail.com
> > >> > > > > > >:
> > >> > > > > > > > >> >
> > >> > > > > > > > >> > > Pavel, thank you for your effort.
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > BTW, are you sure that receiver should be a param
> > of
> > >> > > > > > > > >> > > STREAMER_START_REQUEST?
> > >> > > > > > > > >> > > Receiver is mostly internal stuff and I can
> hardly
> > >> > imagine
> > >> > > > why
> > >> > > > > > > > >> > > someone would decide to change defaults.
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > пт, 5 мар. 2021 г. в 13:01, Pavel Tupitsyn <
> > >> > > > > > ptupitsyn@apache.org
> > >> > > > > > > >:
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > >> Igor,
> > >> > > > > > > > >> > >>
> > >> > > > > > > > >> > >> As per our private conversation, I'll try to
> > provide
> > >> > some
> > >> > > > > perf
> > >> > > > > > > > >> > >> measurements
> > >> > > > > > > > >> > >> for those 4 variants, and more detailed
> > descriptions
> > >> > too.
> > >> > > > > > > > >> > >>
> > >> > > > > > > > >> > >> Thanks!
> > >> > > > > > > > >> > >>
> > >> > > > > > > > >> > >> On Fri, Mar 5, 2021 at 12:55 PM Igor Sapego <
> > >> > > > > > isapego@apache.org>
> > >> > > > > > > > >> wrote:
> > >> > > > > > > > >> > >>
> > >> > > > > > > > >> > >> > Pavel,
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >> > I've checked the IEP and I like it. The only
> > thing
> > >> > that
> > >> > > > > > seems a
> > >> > > > > > > > bit
> > >> > > > > > > > >> > >> > confusing to me
> > >> > > > > > > > >> > >> > is that there are 4 different variants for
> > clients
> > >> > but
> > >> > > > > there
> > >> > > > > > > are
> > >> > > > > > > > >> cons
> > >> > > > > > > > >> > >> and
> > >> > > > > > > > >> > >> > pros for
> > >> > > > > > > > >> > >> > different variants. Maybe at least few
> sentences
> > >> > should
> > >> > > > be
> > >> > > > > > > > written
> > >> > > > > > > > >> > here
> > >> > > > > > > > >> > >> to
> > >> > > > > > > > >> > >> > give developers who are not familiar with
> > >> > DataStreamer
> > >> > > > some
> > >> > > > > > > ideas
> > >> > > > > > > > >> and
> > >> > > > > > > > >> > >> make
> > >> > > > > > > > >> > >> > it easier for them to choose.
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >> > Best Regards,
> > >> > > > > > > > >> > >> > Igor
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >> > On Thu, Mar 4, 2021 at 3:14 PM Pavel Tupitsyn
> <
> > >> > > > > > > > >> ptupitsyn@apache.org>
> > >> > > > > > > > >> > >> > wrote:
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >> > > Igniters,
> > >> > > > > > > > >> > >> > >
> > >> > > > > > > > >> > >> > > Please review the IEP [1] and let me know
> what
> > >> you
> > >> > > > think.
> > >> > > > > > > > >> > >> > >
> > >> > > > > > > > >> > >> > > [1]
> > >> > > > > > > > >> > >> > >
> > >> > > > > > > > >> > >> > >
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >>
> > >> > > > > > > > >> >
> > >> > > > > > > > >>
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > >
> > >> >
> > >>
> >
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-68%3A+Thin+Client+Data+Streamer
> > >> > > > > > > > >> > >> > >
> > >> > > > > > > > >> > >> >
> > >> > > > > > > > >> > >>
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> > > --
> > >> > > > > > > > >> > > Sincerely yours, Ivan Daschinskiy
> > >> > > > > > > > >> > >
> > >> > > > > > > > >> >
> > >> > > > > > > > >> >
> > >> > > > > > > > >> > --
> > >> > > > > > > > >> > Sincerely yours, Ivan Daschinskiy
> > >> > > > > > > > >> >
> > >> > > > > > > > >>
> > >> > > > > > > > >
> > >> > > > > > > > >
> > >> > > > > > > > > --
> > >> > > > > > > > > Sincerely yours, Ivan Daschinskiy
> > >> > > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > --
> > >> > > > > > > > Sincerely yours, Ivan Daschinskiy
> > >> > > > > > > >
> > >> > > > > > >
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > --
> > >> > > > > > Sincerely yours, Ivan Daschinskiy
> > >> > > > > >
> > >> > > > >
> > >> > > >
> > >> > > >
> > >> > > > --
> > >> > > > Sincerely yours, Ivan Daschinskiy
> > >> > > >
> > >> > >
> > >> >
> > >> >
> > >> > --
> > >> > Sincerely yours, Ivan Daschinskiy
> > >> >
> > >>
> > >
> >
>

Re: IEP-68: Thin Client Data Streamer

Posted by Igor Sapego <is...@gridgain.com>.
Pavel,

+1 from me for a stateless approach. We could definitely invent something
here to make a stateful approach working but that would make this feature
too complex and error-prone, IMO. It makes sense to file an improvement
ticket with benchmark results and maybe code draft if we decide to move
this way in future though.

Best Regards,
Igor


On Tue, May 18, 2021 at 11:47 AM Pavel Tupitsyn <pt...@apache.org>
wrote:

> Igniters,
>
> I've implemented a "stateful" approach (see IEP) in .NET Thin Client:
> client-side flush adds data to server-side streamer, but does not flush it
> immediately.
> It provides ~35% perf improvement over stateless approach (where every
> client request opens a new streamer, adds data, closes streamer).
>
> However, in this case we lose server-side buffered data if the server node
> fails, which seems to be a serious issue.
> On the client we don't even know which part of data was lost, so we can't
> re-send this data.
>
> Therefore, a stateless approach seems to be the only proper way:
> once a client request completes, the data is guaranteed to be in the cache,
> and failed requests (connection loss, node failure) can be retried.
> This simplifies both client and server implementations.
>
> Thoughts?
>
> On Tue, Mar 9, 2021 at 6:32 PM Pavel Tupitsyn <pt...@apache.org>
> wrote:
>
> > Alex,
> >
> > I did not include this for two reasons:
> >
> > 1) Existing thick API behavior is very confusing and counter-intuitive:
> > it returns a Future that won't complete unless you call more APIs.
> >
> > I've seen multiple users doing this:
> > await streamer.Add(1, 1);
> > await streamer.Add(2, 2);
> >
> > This code hangs - the first Add will never complete, because the batch is
> > not sent,
> > and the batch won't be sent because we are awaiting the first Add.
> >
> > 2) I'm not sure what's the use case for this feature - why do we want to
> > know
> > when the flush happens?
> >
> > We can come up with a better API, but should we bother at all?
> >
> > On Fri, Mar 5, 2021 at 6:09 PM Alex Plehanov <pl...@gmail.com>
> > wrote:
> >
> >> Pavel,
> >>
> >> IEP looks good to me overall. I have only one concern: currently, for
> >> thick
> >> client "add" method we return the future which completes when the
> current
> >> batch is actually added to the cache, even if "flush" method is not
> >> invoked
> >> explicitly. For thin client with the proposed protocol we can't provide
> >> such functionality without explicit "flush". Perhaps we should notify
> >> client when batch actually flushed and send this notification when some
> >> "require notification" flag is sent with the request. WDYT?
> >>
> >> пт, 5 мар. 2021 г. в 17:03, Ivan Daschinsky <iv...@gmail.com>:
> >>
> >> > Agree, this is completely offtopic here.
> >> >
> >> > пт, 5 мар. 2021 г. в 17:02, Pavel Tupitsyn <pt...@apache.org>:
> >> >
> >> > > > custom DSL
> >> > > This is a topic for 3.0 - would you like to start a separate thread?
> >> > >
> >> > > On Fri, Mar 5, 2021 at 4:54 PM Ivan Daschinsky <ivandasch@gmail.com
> >
> >> > > wrote:
> >> > >
> >> > > > I suppose, that the best variantl -- custom DSL similar to MongoDB
> >> > query
> >> > > > language.
> >> > > > I understand that implementing this is much harder, but users want
> >> this
> >> > > > variant.
> >> > > >
> >> > > > пт, 5 мар. 2021 г. в 16:52, Pavel Tupitsyn <ptupitsyn@apache.org
> >:
> >> > > >
> >> > > > > > I suppose this is of questional usability, same for existing
> >> > > > > > implementation for ScanQuery and ContinousQuery
> >> > > > >
> >> > > > > One way or another, we need to be able to run custom logic
> >> > > > > on the server side from thin clients.
> >> > > > >
> >> > > > > Our current direction can be seen as "Java is our DSL":
> >> > > > > write server-side logic in Java, deploy to servers, call from
> thin
> >> > > > clients.
> >> > > > >
> >> > > > > This approach is already used for Services, Compute, ScanQuery,
> >> > > > > ContinuousQuery.
> >> > > > >
> >> > > > > If you have a better idea in mind, please share.
> >> > > > >
> >> > > > > On Fri, Mar 5, 2021 at 4:12 PM Ivan Daschinsky <
> >> ivandasch@gmail.com>
> >> > > > > wrote:
> >> > > > >
> >> > > > > > >>> - Corresponding types should exist on server nodes
> >> > > > > > Ok, but I suppose this is of questional usability, same for
> >> > existing
> >> > > > > > implementation for ScanQuery and ContinousQuery.
> >> > > > > > For queries it's probably ok to add some custom filters and
> put
> >> > them
> >> > > in
> >> > > > > > servers' classpathes, but I can hardly imagine
> >> > > > > > somebody wants to create custom Receiver this way.
> >> > > > > >
> >> > > > > > пт, 5 мар. 2021 г. в 15:54, Pavel Tupitsyn <
> >> ptupitsyn@apache.org>:
> >> > > > > >
> >> > > > > > > > How do you suggests to serialize this object?
> >> > > > > > >
> >> > > > > > > Like a normal binary object. This scenario already exists
> for
> >> > > > ScanQuery
> >> > > > > > and
> >> > > > > > > ContinuousQuery filters.
> >> > > > > > > - It works for Java and .NET; potentially for other
> platforms
> >> > > > > > > - Corresponding types should exist on server nodes
> >> > > > > > >
> >> > > > > > > E.g. if a Java class `MyFilter { String containsText }` is
> >> loaded
> >> > > on
> >> > > > > > server
> >> > > > > > > nodes,
> >> > > > > > > a Python thin client can easily write a corresponding
> >> > BinaryObject
> >> > > > with
> >> > > > > > one
> >> > > > > > > field to achieve server-side filtering.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > > I also suppose, that closing should be done wia
> >> > OP_RESOURCE_CLOSE
> >> > > > > > >
> >> > > > > > > Thanks, forgot to mention this - updated the IEP.
> >> > > > > > >
> >> > > > > > >
> >> > > > > > > On Fri, Mar 5, 2021 at 3:42 PM Ivan Daschinsky <
> >> > > ivandasch@gmail.com>
> >> > > > > > > wrote:
> >> > > > > > >
> >> > > > > > > > I also suppose, that closing should be done wia
> >> > > OP_RESOURCE_CLOSE.
> >> > > > > > It'is
> >> > > > > > > > more consistent and resembles with existing cursor api.
> >> > > > > > > >
> >> > > > > > > > пт, 5 мар. 2021 г. в 15:37, Ivan Daschinsky <
> >> > ivandasch@gmail.com
> >> > > >:
> >> > > > > > > >
> >> > > > > > > > > >> Both proposed requests have a Flush flag - please see
> >> > > Details
> >> > > > > > > sections
> >> > > > > > > > > in the IEP.
> >> > > > > > > > > Ok, sorry, I missed this.
> >> > > > > > > > > >> StreamReceiver is public API [1]
> >> > > > > > > > > I know it. But this "object" should contains custom
> logic
> >> for
> >> > > > > putting
> >> > > > > > > > data
> >> > > > > > > > > in cache. How do you suggests to serialize this object?
> >> > > > > > > > > Implement custom classloader for java? What about other
> >> > > > platforms?
> >> > > > > > > > >
> >> > > > > > > > > I suppose that we should not add this field, because it
> is
> >> > > > > confusing.
> >> > > > > > > > > Everything will work for local unit tests and only for
> >> java.
> >> > > > > > > > > But in real environment this will not work at all.
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > пт, 5 мар. 2021 г. в 15:23, Pavel Tupitsyn <
> >> > > ptupitsyn@apache.org
> >> > > > >:
> >> > > > > > > > >
> >> > > > > > > > >> Ivan,
> >> > > > > > > > >>
> >> > > > > > > > >> > Receiver is mostly internal stuff
> >> > > > > > > > >> StreamReceiver is public API [1]
> >> > > > > > > > >>
> >> > > > > > > > >> > STREAMER_FLUSH_REQUEST should be added too
> >> > > > > > > > >> Both proposed requests have a Flush flag - please see
> >> > Details
> >> > > > > > sections
> >> > > > > > > > in
> >> > > > > > > > >> the IEP.
> >> > > > > > > > >> When user code calls client-side Flush method, we send
> >> the
> >> > > > current
> >> > > > > > > > >> client-side batch with that flag enabled,
> >> > > > > > > > >> and flush server-side batches too.
> >> > > > > > > > >>
> >> > > > > > > > >> A separate request for that does not seem to be
> >> necessary,
> >> > or
> >> > > > am I
> >> > > > > > > > missing
> >> > > > > > > > >> some different use case?
> >> > > > > > > > >>
> >> > > > > > > > >> [1]
> >> > > > > > > > >>
> >> > > > > > > > >>
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/stream/package-summary.html
> >> > > > > > > > >>
> >> > > > > > > > >>
> >> > > > > > > > >> On Fri, Mar 5, 2021 at 2:50 PM Ivan Daschinsky <
> >> > > > > ivandasch@gmail.com
> >> > > > > > >
> >> > > > > > > > >> wrote:
> >> > > > > > > > >>
> >> > > > > > > > >> > IMHO, also STREAMER_FLUSH_REQUEST should be added
> too.
> >> > > Client
> >> > > > > can
> >> > > > > > > > flush
> >> > > > > > > > >> > large batches instead of closing resource.
> >> > > > > > > > >> > IMHO this is preferable than creating streamer per
> >> batch.
> >> > > > > > > > >> >
> >> > > > > > > > >> > пт, 5 мар. 2021 г. в 14:48, Ivan Daschinsky <
> >> > > > > ivandasch@gmail.com
> >> > > > > > >:
> >> > > > > > > > >> >
> >> > > > > > > > >> > > Pavel, thank you for your effort.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > BTW, are you sure that receiver should be a param
> of
> >> > > > > > > > >> > > STREAMER_START_REQUEST?
> >> > > > > > > > >> > > Receiver is mostly internal stuff and I can hardly
> >> > imagine
> >> > > > why
> >> > > > > > > > >> > > someone would decide to change defaults.
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > пт, 5 мар. 2021 г. в 13:01, Pavel Tupitsyn <
> >> > > > > > ptupitsyn@apache.org
> >> > > > > > > >:
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >> Igor,
> >> > > > > > > > >> > >>
> >> > > > > > > > >> > >> As per our private conversation, I'll try to
> provide
> >> > some
> >> > > > > perf
> >> > > > > > > > >> > >> measurements
> >> > > > > > > > >> > >> for those 4 variants, and more detailed
> descriptions
> >> > too.
> >> > > > > > > > >> > >>
> >> > > > > > > > >> > >> Thanks!
> >> > > > > > > > >> > >>
> >> > > > > > > > >> > >> On Fri, Mar 5, 2021 at 12:55 PM Igor Sapego <
> >> > > > > > isapego@apache.org>
> >> > > > > > > > >> wrote:
> >> > > > > > > > >> > >>
> >> > > > > > > > >> > >> > Pavel,
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >> > I've checked the IEP and I like it. The only
> thing
> >> > that
> >> > > > > > seems a
> >> > > > > > > > bit
> >> > > > > > > > >> > >> > confusing to me
> >> > > > > > > > >> > >> > is that there are 4 different variants for
> clients
> >> > but
> >> > > > > there
> >> > > > > > > are
> >> > > > > > > > >> cons
> >> > > > > > > > >> > >> and
> >> > > > > > > > >> > >> > pros for
> >> > > > > > > > >> > >> > different variants. Maybe at least few sentences
> >> > should
> >> > > > be
> >> > > > > > > > written
> >> > > > > > > > >> > here
> >> > > > > > > > >> > >> to
> >> > > > > > > > >> > >> > give developers who are not familiar with
> >> > DataStreamer
> >> > > > some
> >> > > > > > > ideas
> >> > > > > > > > >> and
> >> > > > > > > > >> > >> make
> >> > > > > > > > >> > >> > it easier for them to choose.
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >> > Best Regards,
> >> > > > > > > > >> > >> > Igor
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >> > On Thu, Mar 4, 2021 at 3:14 PM Pavel Tupitsyn <
> >> > > > > > > > >> ptupitsyn@apache.org>
> >> > > > > > > > >> > >> > wrote:
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >> > > Igniters,
> >> > > > > > > > >> > >> > >
> >> > > > > > > > >> > >> > > Please review the IEP [1] and let me know what
> >> you
> >> > > > think.
> >> > > > > > > > >> > >> > >
> >> > > > > > > > >> > >> > > [1]
> >> > > > > > > > >> > >> > >
> >> > > > > > > > >> > >> > >
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >>
> >> > > > > > > > >> >
> >> > > > > > > > >>
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > >
> >> > > >
> >> > >
> >> >
> >>
> https://cwiki.apache.org/confluence/display/IGNITE/IEP-68%3A+Thin+Client+Data+Streamer
> >> > > > > > > > >> > >> > >
> >> > > > > > > > >> > >> >
> >> > > > > > > > >> > >>
> >> > > > > > > > >> > >
> >> > > > > > > > >> > >
> >> > > > > > > > >> > > --
> >> > > > > > > > >> > > Sincerely yours, Ivan Daschinskiy
> >> > > > > > > > >> > >
> >> > > > > > > > >> >
> >> > > > > > > > >> >
> >> > > > > > > > >> > --
> >> > > > > > > > >> > Sincerely yours, Ivan Daschinskiy
> >> > > > > > > > >> >
> >> > > > > > > > >>
> >> > > > > > > > >
> >> > > > > > > > >
> >> > > > > > > > > --
> >> > > > > > > > > Sincerely yours, Ivan Daschinskiy
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > --
> >> > > > > > > > Sincerely yours, Ivan Daschinskiy
> >> > > > > > > >
> >> > > > > > >
> >> > > > > >
> >> > > > > >
> >> > > > > > --
> >> > > > > > Sincerely yours, Ivan Daschinskiy
> >> > > > > >
> >> > > > >
> >> > > >
> >> > > >
> >> > > > --
> >> > > > Sincerely yours, Ivan Daschinskiy
> >> > > >
> >> > >
> >> >
> >> >
> >> > --
> >> > Sincerely yours, Ivan Daschinskiy
> >> >
> >>
> >
>