Posted to user@ignite.apache.org by narges saleh <sn...@gmail.com> on 2020/02/02 21:33:50 UTC

DataStreamer as a Service

Hi All,

Is there a problem with running the data streamer as a service, instantiated
in the init() method? Or with loading the data via a JDBC connection with
streaming mode enabled?
In either case, the deployment is affinity-based.

thanks.
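
For reference, a minimal sketch of the pattern being asked about: an Ignite service that opens a data streamer in init() and closes it in cancel(). The class name, cache name, and placeholder load loop are illustrative assumptions, not code from this thread.

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

// Hypothetical service that owns a data streamer for an existing "events" cache.
public class StreamerService implements Service {
    @IgniteInstanceResource
    private Ignite ignite;                            // injected by Ignite on deployment

    private IgniteDataStreamer<Long, String> streamer;

    @Override public void init(ServiceContext ctx) {
        streamer = ignite.dataStreamer("events");     // created once, when the service starts
        streamer.autoFlushFrequency(1_000);           // flush buffered entries every second
    }

    @Override public void execute(ServiceContext ctx) throws Exception {
        // Placeholder load loop: a real service would consume an external source here.
        long key = 0;
        while (!ctx.isCancelled()) {
            streamer.addData(key, "payload-" + key);
            key++;
            Thread.sleep(10);
        }
    }

    @Override public void cancel(ServiceContext ctx) {
        streamer.close();                             // flushes remaining data and releases resources
    }
}

Such a service could be deployed with ignite.services().deployNodeSingleton("streamerService", new StreamerService()), which matches the node-singleton deployment discussed later in the thread.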

Re: DataStreamer as a Service

Posted by narges saleh <sn...@gmail.com>.
Understood. Thank you for the feedback.


Re: DataStreamer as a Service

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

The Data Streamer can be used on a server node all right; however, it is still a
"client" operation, i.e., it batches some data locally and only then sends it
to the server nodes, including itself.

Regards,
-- 
Ilya Kasnacheev



Re: DataStreamer as a Service

Posted by narges saleh <sn...@gmail.com>.
Hi,
I am not sure I follow the relationship between per-partition batching and
client-side capabilities. Does this mean that the data streamer cannot do
per-partition batching on the server side, for example in the service grid?

I understand that low-intensity streaming defeats the purpose of batching
(whether client-side or not), but my case is long-lived, high-intensity data
traffic.

thanks.


Re: DataStreamer as a Service

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

In the case of long-lived, low-intensity streaming, the Data Streamer will not
be able to utilize its client-side per-partition batching capabilities,
instead being just a wrapper over cache update operations, which are
available as part of the Cache API.

Regards,
-- 
Ilya Kasnacheev
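
As a rough sketch of the Cache API fallback mentioned above, assuming a hypothetical "events" cache of Long keys and String values:

import java.util.Map;
import java.util.TreeMap;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.Ignition;

// Sketch: writing low-intensity traffic through the plain Cache API instead of a streamer.
public class CacheApiWriter {
    public static void main(String[] args) {
        Ignite ignite = Ignition.start();                        // node with default configuration
        IgniteCache<Long, String> cache = ignite.getOrCreateCache("events");

        // Individual updates for truly low-intensity traffic.
        cache.put(1L, "event-1");

        // Small batches can still be grouped; ordered keys help avoid deadlocks
        // if the cache happens to be transactional.
        Map<Long, String> batch = new TreeMap<>();
        batch.put(2L, "event-2");
        batch.put(3L, "event-3");
        cache.putAll(batch);
    }
}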



Re: DataStreamer as a Service

Posted by Denis Magda <dm...@apache.org>.
Ilya,

I don't quite understand why the data streamer is not suitable as a
long-running solution. Please don't mislead; instead, list out the specific
limitations. I don't see anything wrong with having an open data streamer
that transfers data to Ignite in real time.

Narges, if the streamer crashes, then your service/app needs to resend the
records that were not acknowledged. You could probably utilize Kafka Connect
here, which keeps track of committed/pending records.

-
Denis
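
A minimal sketch of the resend-on-failure idea, assuming a hypothetical "events" cache; the acknowledgement is taken from the future returned by addData() plus an explicit flush(), and the retry loop is purely illustrative:

import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.lang.IgniteFuture;

// Illustrative helper: stream one batch and report whether it was acknowledged,
// so the caller can resend it after a streamer failure.
public class AcknowledgedBatchLoader {

    static boolean loadBatch(Ignite ignite, Map<Long, String> batch) {
        try (IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("events")) {
            IgniteFuture<?> ack = streamer.addData(batch); // completes once the entries are written
            streamer.flush();                              // push the buffered entries out now
            ack.get();                                     // wait for the acknowledgement
            return true;
        }
        catch (RuntimeException e) {
            return false;                                  // not acknowledged: resend the batch
        }
    }

    static void loadWithRetry(Ignite ignite, Map<Long, String> batch) {
        // Resending the same keys is safe here: allowOverwrite is false by default,
        // so entries that were already loaded are not overwritten.
        while (!loadBatch(ignite, batch)) {
            // retry until the batch is acknowledged; a real client would back off or give up
        }
    }
}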



Re: DataStreamer as a Service

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

I think these benefits are imaginary. You will have to worry about the service
more than about a data streamer, which may be recreated at any time.

Regards,
-- 
Ilya Kasnacheev



Re: DataStreamer as a Service

Posted by narges saleh <sn...@gmail.com>.
Thanks Ilya.
I have to listen to these bursts of data, which arrive every few seconds,
meaning almost constant bursts of data from different data sources.
The main reason that the service grid is appealing to me is its resiliency;
I don't have to worry about it. With a client-side streamer, I will have to
deploy it, keep it up and running, and load/rebalance it.


Re: DataStreamer as a Service

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

I don't see why you would deploy it as a service; it sounds like you will have
to send more data over the network. If you have to pull batches in, then a
service should work. I recommend re-acquiring the data streamer for each batch.

Please note that the Data Streamer is very scalable, so it is preferable to
tune it rather than trying to use more than one streamer.

Regards,
-- 
Ilya Kasnacheev
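
A sketch of the per-batch acquisition Ilya recommends, with a single tuned streamer; the cache name and tuning values are placeholders, not recommendations:

import java.util.Map;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

// Sketch: acquire a streamer for each incoming batch and let try-with-resources flush and close it.
public class PerBatchStreaming {

    static void streamBatch(Ignite ignite, Map<Long, String> batch) {
        try (IgniteDataStreamer<Long, String> streamer = ignite.dataStreamer("events")) {
            // Tune the single streamer rather than creating several of them.
            streamer.perNodeBufferSize(1024);         // entries buffered per node before sending
            streamer.perNodeParallelOperations(8);    // concurrent batches in flight per node
            streamer.addData(batch);
        }                                             // close() flushes whatever is still buffered
    }
}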



Re: DataStreamer as a Service

Posted by narges saleh <sn...@gmail.com>.
Hi Ilya
The data comes in huge batches of records (each burst can be up to 50-100 MB,
which I plan to spread across multiple streamers), so the streamer seems to be
the way to go. Also, I don't want to establish a JDBC connection each time.
So, if the streamer is the way to go, is it feasible to deploy it as a
service?
thanks.


Re: DataStreamer as a Service

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Contrary to its name, the data streamer is not actually suitable for
long-lived, low-intensity streaming. What it's good for is burst loading of a
large amount of data in a short period of time.

If your data arrives in large batches, you can use a Data Streamer for each
batch. If not, better to use the Cache API.

If you are worried that the plain Cache API is slow, but you also want failure
resilience, there's a catch-22: the only way to make something resilient is
to put it into a cache :)

Regards,
-- 
Ilya Kasnacheev



Re: DataStreamer as a Service

Posted by narges saleh <sn...@gmail.com>.
Hi,
But services are by definition long-lived, right? Here is my layout: the data
is continuously generated and sent to the streamer services (via a JDBC
connection with the SET STREAMING ON option), deployed, say, as a node
singleton (actually also deployed as microservices), to load the data into the
caches. The streamers flush data based on timers.
If the streamer crashes before the buffer is flushed, the client catches
the exception and resends the batch. Any issue with this layout?

thanks.
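
The JDBC streaming path referred to above (SET STREAMING ON) would look roughly like this with the thin driver; the host, table, and columns are made-up examples:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.Statement;

// Sketch of loading data through the JDBC thin driver with streaming mode enabled.
public class JdbcStreamingLoader {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:ignite:thin://127.0.0.1");
             Statement cmd = conn.createStatement()) {

            // Switch the connection into streaming mode: inserts go through a data streamer.
            cmd.execute("SET STREAMING ON");

            try (PreparedStatement ps =
                     conn.prepareStatement("INSERT INTO events(id, payload) VALUES (?, ?)")) {
                for (long id = 0; id < 10_000; id++) {
                    ps.setLong(1, id);
                    ps.setString(2, "payload-" + id);
                    ps.executeUpdate();               // buffered, not applied immediately
                }
            }

            // Turning streaming off (or closing the connection) flushes the buffered data.
            cmd.execute("SET STREAMING OFF");
        }
    }
}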


Re: DataStreamer as a Service

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

It is not recommended to have long-lived data streamers; it's best to
acquire one when it is needed.

If you have to keep a data streamer around, don't forget to flush() it. This
way you don't have to worry about its queue.

Regards,
-- 
Ilya Kasnacheev
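
If a streamer is kept around, the timer-based flushing described earlier in the thread could be as simple as the following sketch; the interval, cache name, and executor wiring are assumptions for illustration:

import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;

// Sketch: a long-lived streamer whose buffer is flushed on a timer.
public class TimedFlushStreamer implements AutoCloseable {
    private final IgniteDataStreamer<Long, String> streamer;
    private final ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();

    public TimedFlushStreamer(Ignite ignite) {
        streamer = ignite.dataStreamer("events");
        // Push whatever is buffered every 5 seconds so entries never sit in the queue for long.
        timer.scheduleAtFixedRate(streamer::flush, 5, 5, TimeUnit.SECONDS);
    }

    public void add(long key, String value) {
        streamer.addData(key, value);
    }

    @Override public void close() {
        timer.shutdownNow();
        streamer.close();   // flushes the remainder before releasing resources
    }
}

Alternatively, IgniteDataStreamer.autoFlushFrequency(long) gives the same behavior without a separate timer.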



Re: DataStreamer as a Service

Posted by narges saleh <sn...@gmail.com>.
Hi,
My specific question/concern is with regard to the state of the streamer
when it runs as a service, i.e., when it crashes and gets redeployed.
Specifically, what happens to the data?
I have a similar question with regard to the state of a continuous query
when it is deployed as a service: what happens to the data in the
listener's queue?

thanks.
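
For the continuous-query part of the question, a minimal sketch of a service that keeps a continuous query registered for a hypothetical "events" cache; the listener body is illustrative only:

import javax.cache.Cache;
import javax.cache.event.CacheEntryEvent;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteCache;
import org.apache.ignite.cache.query.ContinuousQuery;
import org.apache.ignite.cache.query.QueryCursor;
import org.apache.ignite.resources.IgniteInstanceResource;
import org.apache.ignite.services.Service;
import org.apache.ignite.services.ServiceContext;

// Hypothetical service that keeps a continuous query registered for the "events" cache.
public class ListenerService implements Service {
    @IgniteInstanceResource
    private Ignite ignite;

    private QueryCursor<Cache.Entry<Long, String>> cursor;

    @Override public void init(ServiceContext ctx) {
        IgniteCache<Long, String> cache = ignite.getOrCreateCache("events");

        ContinuousQuery<Long, String> qry = new ContinuousQuery<>();
        qry.setLocalListener(events -> {
            for (CacheEntryEvent<? extends Long, ? extends String> e : events)
                System.out.println("update: " + e.getKey() + " -> " + e.getValue());
        });

        cursor = cache.query(qry);   // the listener stays registered until the cursor is closed
    }

    @Override public void execute(ServiceContext ctx) {
        // Nothing to do here: the continuous query delivers updates to the local listener.
    }

    @Override public void cancel(ServiceContext ctx) {
        cursor.close();              // deregisters the listener; updates after this are not queued
    }
}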


Re: DataStreamer as a Service

Posted by Mikael <mi...@telia.com>.
Hi!

Not as far as I know. I have a number of services using streamers
without any problems. Do you have any specific problem with it?

Mikael

