You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@ignite.apache.org by narges saleh <sn...@gmail.com> on 2020/10/02 10:49:45 UTC

Continuous Query

Hi All,
 If I want to watch for a rolling timestamp pattern in all the records that
get inserted to all my caches, is it more efficient to use timer based jobs
(that checks all the records in some interval) or  continuous queries that
locally filter on the pattern? These records can get inserted in any order
and some can arrive with delays.
An example is to watch for all the records whose timestamp ends in 50, if
the timestamp is in the format yyyy-mm-dd hh:mi.

thanks

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

Thanks Raymond for the pointer.

On Wed, Oct 7, 2020 at 4:39 PM Raymond Wilson <ra...@trimble.com>
wrote:

> It's possible to configure the continuous query to return only keys in the
> cache that are stored on the local node.
>
> On the C# client we do it like this:
>
> var _queryHandle = queueCache.QueryContinuous
>             (qry: new ContinuousQuery<[..KeyType..],
> [...ValueType...]>(listener) {Local = true},
>               initialQry: new ScanQuery< [..KeyType..], [...ValueType...]
> > {Local = true});
>
> On Thu, Oct 8, 2020 at 9:53 AM narges saleh <sn...@gmail.com> wrote:
>
>> Thanks for the detailed explanation.
>>
>> I don't know how efficient it would be if you have to filter each record
>> one by one and then update each record, three times, to keep track of the
>> status, if you're dealing with millions of records each hour, even if the
>> cache is partitioned. I guess I will need to benchmark this.  thanks again.
>>
>> On Wed, Oct 7, 2020 at 12:00 PM Denis Magda <dm...@apache.org> wrote:
>>
>>> I recalled a complete solution. That's what you would need to do if
>>> decide to process records in real-time with continuous queries in the
>>> *fault-tolerant fashion* (using pseudo-code rather than actual Ignite APIs).
>>>
>>> First. You need to add a flag field to the record's class that keeps the
>>> current processing status. Like that:
>>>
>>> MyRecord {
>>> int id;
>>> Date created;
>>> byte status; //0 - not processed, 1 - being processed withing a
>>> continuous query filter, 2 - processed by the filter, all the logic
>>> successfully completed
>>> }
>>>
>>> Second. The continuous query filter (that will be executed on nodes that
>>> store a copy of a record) needs to have the following structure.
>>>
>>> @IgniteAsyncCallback
>>> filterMethod(MyRecords updatedRecord) {
>>>
>>> if (isThisNodePrimaryNodeForTheRecord(updatedRecord)) { // execute on a
>>> primary node only
>>>     updatedRecord.status = 1 // setting the flag to signify the
>>> processing is started.
>>>
>>>     //execute your business logic
>>>
>>>     updatedRecord.status = 2 // finished processing
>>> }
>>>   return false; //you don't want a notification to be sent to the client
>>> application or another node that deployed the continuous query
>>> }
>>>
>>> Third. If any node leaves the cluster or the whole cluster is restarted,
>>> then you need to execute your custom for all the records with status=0 or
>>> status=1. To do that you can broadcast a compute task:
>>>
>>> // Application side
>>>
>>> int[] unprocessedRecords = "select id from MyRecord where status < 2;"
>>>
>>> IgniteCompute.affinityRun(idsOfUnprocessedRecords,
>>> taskWithMyCustomLogic); //the task will be executed only on the nodes that
>>> store the records
>>>
>>> // Server side
>>>
>>> taskWithMyCustomLogic() {
>>>     updatedRecord.status = 1 // setting the flag to signify the
>>> processing is started.
>>>
>>>     //execute your business logic
>>>
>>>     updatedRecord.status = 2 // finished processing
>>> }
>>>
>>>
>>> That's it. So, the third step already requires you to have a compute
>>> task that would run the calculation in case of failures. Thus, if the
>>> real-time aspect of the processing is not crucial right now, then you can
>>> start with the batch-based approach by running a compute task once at a
>>> time and then introduce the continuous queries-based improvement whenever
>>> is needed. You decide. Hope it helps.
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Wed, Oct 7, 2020 at 9:19 AM Denis Magda <dm...@apache.org> wrote:
>>>
>>>> So, a lesson for the future, would the continuous query approach still
>>>>> be preferable if the calculation involves the cache with continuous query
>>>>> and say a look up table? For example, if I want to see the country in the
>>>>> cache employee exists in the list of the countries that I am interested in.
>>>>
>>>>
>>>> You can access other caches from within the filter but the logic has to
>>>> be executed asynchronously to avoid deadlocks:
>>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>
>>>> Also, what do I need to do if I want the filter for the continuous
>>>>> query to execute on the cache on the local node only? Say, I have the
>>>>> continuous query deployed as singleton service on each node, to capture
>>>>> certain changes to a cache on the local node.
>>>>
>>>>
>>>> <https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>
>>>>
>>>> The filter will be deployed and executed on every server node. The
>>>> filter is executed only on a server node that owns a record that is being
>>>> modified and passed into a filter. Hmm, it's also said that the filter can
>>>> be executed on a backup node. Check if it's true, and then you need to add
>>>> a special check into the filter that would allow executing the logic only
>>>> if it's the primary node:
>>>>
>>>> https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter
>>>>
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Wed, Oct 7, 2020 at 4:39 AM narges saleh <sn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Also, what do I need to do if I want the filter for the continuous
>>>>> query to execute on the cache on the local node only? Say, I have the
>>>>> continuous query deployed as singleton service on each node, to capture
>>>>> certain changes to a cache on the local node.
>>>>>
>>>>> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <sn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Thank you, Denis.
>>>>>> So, a lesson for the future, would the continuous query approach
>>>>>> still be preferable if the calculation involves the cache with continuous
>>>>>> query and say a look up table? For example, if I want to see the country in
>>>>>> the cache employee exists in the list of the countries that I am interested
>>>>>> in.
>>>>>>
>>>>>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:
>>>>>>
>>>>>>> Thanks
>>>>>>>
>>>>>>> Then, I would consider the continuous queries based solution as long
>>>>>>> as the records can be updated in real-time:
>>>>>>>
>>>>>>>    - You can process the records on the fly and don't need to come
>>>>>>>    up with any batch task.
>>>>>>>    - The continuous query filter will be executed once on a node
>>>>>>>    that stores the record's primary copy. If the primary node fails in the
>>>>>>>    middle of the filter's calculation execution, then the filter will be
>>>>>>>    executed on a backup node. So, you will not lose any updates but might need
>>>>>>>    to introduce some logic/flag that confirms the calculation is not executed
>>>>>>>    twice for a single record (this can happen if the primary node failed in
>>>>>>>    the middle of the calculation execution and then the backup node picked up
>>>>>>>    and started executing the calculation from scratch).
>>>>>>>    - Updates of other tables or records from within the continuous
>>>>>>>    query filter must go through an async thread pool. You need to use
>>>>>>>    IgniteAsyncCallback annotation for that:
>>>>>>>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>>>>
>>>>>>> Alternatively, you can always run the calculation in the
>>>>>>> batch-fashion:
>>>>>>>
>>>>>>>    - Run a compute task once in a while
>>>>>>>    - Read all the latest records that satisfy the requests with SQL
>>>>>>>    or any other APIs
>>>>>>>    - Complete the calculation, mark already processed records just
>>>>>>>    in case if everything is failed in the middle and you need to run the
>>>>>>>    calculation from scratch
>>>>>>>
>>>>>>>
>>>>>>> -
>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Denis
>>>>>>>>  The calculation itself doesn't involve an update or read of
>>>>>>>> another record, but based on the outcome of the calculation, the process
>>>>>>>> might make changes in some other tables.
>>>>>>>>
>>>>>>>> thanks.
>>>>>>>>
>>>>>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Good. Another clarification:
>>>>>>>>>
>>>>>>>>>    - Does that calculation change the state of the record
>>>>>>>>>    (updates any fields)?
>>>>>>>>>    - Does the calculation read or update any other records?
>>>>>>>>>
>>>>>>>>> -
>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The latter; the server needs to perform some calculations on the
>>>>>>>>>> data without sending any notification to the app.
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> And after you detect a record that satisfies the condition, do
>>>>>>>>>>> you need to send any notification to the application? Or is it more like a
>>>>>>>>>>> server detects and does some calculation logically without updating the app.
>>>>>>>>>>>
>>>>>>>>>>> -
>>>>>>>>>>> Denis
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <
>>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>>>>>>> system with big delays.
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> What are your requirements? Do you need to process the records
>>>>>>>>>>>>> as soon as they are put into the cluster?
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Use continuous queries if you need to be notified in
>>>>>>>>>>>>>>> real-time, i.e. 1) a record is inserted, 2) the continuous filter confirms
>>>>>>>>>>>>>>> the record's time satisfies your condition, 3) the continuous queries
>>>>>>>>>>>>>>> notifies your application that does require processing.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> -
>>>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <
>>>>>>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all
>>>>>>>>>>>>>>>> the records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>>>>>>> An example is to watch for all the records whose timestamp
>>>>>>>>>>>>>>>> ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> --
>>>>>>>>>>>>> -
>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>
> --
> <http://www.trimble.com/>
> Raymond Wilson
> Solution Architect, Civil Construction Software Systems (CCSS)
> 11 Birmingham Drive | Christchurch, New Zealand
> +64-21-2013317 Mobile
> raymond_wilson@trimble.com
>
>
> <https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>
>

Re: Continuous Query

Posted by Raymond Wilson <ra...@trimble.com>.

It's possible to configure the continuous query to return only keys in the
cache that are stored on the local node.

On the C# client we do it like this:

var _queryHandle = queueCache.QueryContinuous
            (qry: new ContinuousQuery<[..KeyType..],
[...ValueType...]>(listener) {Local = true},
              initialQry: new ScanQuery< [..KeyType..], [...ValueType...]
> {Local = true});

On Thu, Oct 8, 2020 at 9:53 AM narges saleh <sn...@gmail.com> wrote:

> Thanks for the detailed explanation.
>
> I don't know how efficient it would be if you have to filter each record
> one by one and then update each record, three times, to keep track of the
> status, if you're dealing with millions of records each hour, even if the
> cache is partitioned. I guess I will need to benchmark this.  thanks again.
>
> On Wed, Oct 7, 2020 at 12:00 PM Denis Magda <dm...@apache.org> wrote:
>
>> I recalled a complete solution. That's what you would need to do if
>> decide to process records in real-time with continuous queries in the
>> *fault-tolerant fashion* (using pseudo-code rather than actual Ignite APIs).
>>
>> First. You need to add a flag field to the record's class that keeps the
>> current processing status. Like that:
>>
>> MyRecord {
>> int id;
>> Date created;
>> byte status; //0 - not processed, 1 - being processed withing a
>> continuous query filter, 2 - processed by the filter, all the logic
>> successfully completed
>> }
>>
>> Second. The continuous query filter (that will be executed on nodes that
>> store a copy of a record) needs to have the following structure.
>>
>> @IgniteAsyncCallback
>> filterMethod(MyRecords updatedRecord) {
>>
>> if (isThisNodePrimaryNodeForTheRecord(updatedRecord)) { // execute on a
>> primary node only
>>     updatedRecord.status = 1 // setting the flag to signify the
>> processing is started.
>>
>>     //execute your business logic
>>
>>     updatedRecord.status = 2 // finished processing
>> }
>>   return false; //you don't want a notification to be sent to the client
>> application or another node that deployed the continuous query
>> }
>>
>> Third. If any node leaves the cluster or the whole cluster is restarted,
>> then you need to execute your custom for all the records with status=0 or
>> status=1. To do that you can broadcast a compute task:
>>
>> // Application side
>>
>> int[] unprocessedRecords = "select id from MyRecord where status < 2;"
>>
>> IgniteCompute.affinityRun(idsOfUnprocessedRecords,
>> taskWithMyCustomLogic); //the task will be executed only on the nodes that
>> store the records
>>
>> // Server side
>>
>> taskWithMyCustomLogic() {
>>     updatedRecord.status = 1 // setting the flag to signify the
>> processing is started.
>>
>>     //execute your business logic
>>
>>     updatedRecord.status = 2 // finished processing
>> }
>>
>>
>> That's it. So, the third step already requires you to have a compute task
>> that would run the calculation in case of failures. Thus, if the real-time
>> aspect of the processing is not crucial right now, then you can start with
>> the batch-based approach by running a compute task once at a time and then
>> introduce the continuous queries-based improvement whenever is needed. You
>> decide. Hope it helps.
>>
>>
>> -
>> Denis
>>
>>
>> On Wed, Oct 7, 2020 at 9:19 AM Denis Magda <dm...@apache.org> wrote:
>>
>>> So, a lesson for the future, would the continuous query approach still
>>>> be preferable if the calculation involves the cache with continuous query
>>>> and say a look up table? For example, if I want to see the country in the
>>>> cache employee exists in the list of the countries that I am interested in.
>>>
>>>
>>> You can access other caches from within the filter but the logic has to
>>> be executed asynchronously to avoid deadlocks:
>>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>
>>> Also, what do I need to do if I want the filter for the continuous query
>>>> to execute on the cache on the local node only? Say, I have the continuous
>>>> query deployed as singleton service on each node, to capture certain
>>>> changes to a cache on the local node.
>>>
>>>
>>> <https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>
>>>
>>> The filter will be deployed and executed on every server node. The
>>> filter is executed only on a server node that owns a record that is being
>>> modified and passed into a filter. Hmm, it's also said that the filter can
>>> be executed on a backup node. Check if it's true, and then you need to add
>>> a special check into the filter that would allow executing the logic only
>>> if it's the primary node:
>>>
>>> https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Wed, Oct 7, 2020 at 4:39 AM narges saleh <sn...@gmail.com>
>>> wrote:
>>>
>>>> Also, what do I need to do if I want the filter for the continuous
>>>> query to execute on the cache on the local node only? Say, I have the
>>>> continuous query deployed as singleton service on each node, to capture
>>>> certain changes to a cache on the local node.
>>>>
>>>> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <sn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Thank you, Denis.
>>>>> So, a lesson for the future, would the continuous query approach still
>>>>> be preferable if the calculation involves the cache with continuous query
>>>>> and say a look up table? For example, if I want to see the country in the
>>>>> cache employee exists in the list of the countries that I am interested in.
>>>>>
>>>>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:
>>>>>
>>>>>> Thanks
>>>>>>
>>>>>> Then, I would consider the continuous queries based solution as long
>>>>>> as the records can be updated in real-time:
>>>>>>
>>>>>>    - You can process the records on the fly and don't need to come
>>>>>>    up with any batch task.
>>>>>>    - The continuous query filter will be executed once on a node
>>>>>>    that stores the record's primary copy. If the primary node fails in the
>>>>>>    middle of the filter's calculation execution, then the filter will be
>>>>>>    executed on a backup node. So, you will not lose any updates but might need
>>>>>>    to introduce some logic/flag that confirms the calculation is not executed
>>>>>>    twice for a single record (this can happen if the primary node failed in
>>>>>>    the middle of the calculation execution and then the backup node picked up
>>>>>>    and started executing the calculation from scratch).
>>>>>>    - Updates of other tables or records from within the continuous
>>>>>>    query filter must go through an async thread pool. You need to use
>>>>>>    IgniteAsyncCallback annotation for that:
>>>>>>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>>>
>>>>>> Alternatively, you can always run the calculation in the
>>>>>> batch-fashion:
>>>>>>
>>>>>>    - Run a compute task once in a while
>>>>>>    - Read all the latest records that satisfy the requests with SQL
>>>>>>    or any other APIs
>>>>>>    - Complete the calculation, mark already processed records just
>>>>>>    in case if everything is failed in the middle and you need to run the
>>>>>>    calculation from scratch
>>>>>>
>>>>>>
>>>>>> -
>>>>>> Denis
>>>>>>
>>>>>>
>>>>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Denis
>>>>>>>  The calculation itself doesn't involve an update or read of another
>>>>>>> record, but based on the outcome of the calculation, the process might make
>>>>>>> changes in some other tables.
>>>>>>>
>>>>>>> thanks.
>>>>>>>
>>>>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Good. Another clarification:
>>>>>>>>
>>>>>>>>    - Does that calculation change the state of the record (updates
>>>>>>>>    any fields)?
>>>>>>>>    - Does the calculation read or update any other records?
>>>>>>>>
>>>>>>>> -
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>
>>>>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The latter; the server needs to perform some calculations on the
>>>>>>>>> data without sending any notification to the app.
>>>>>>>>>
>>>>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> And after you detect a record that satisfies the condition, do
>>>>>>>>>> you need to send any notification to the application? Or is it more like a
>>>>>>>>>> server detects and does some calculation logically without updating the app.
>>>>>>>>>>
>>>>>>>>>> -
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <
>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>>>>>> system with big delays.
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> What are your requirements? Do you need to process the records
>>>>>>>>>>>> as soon as they are put into the cluster?
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Use continuous queries if you need to be notified in
>>>>>>>>>>>>>> real-time, i.e. 1) a record is inserted, 2) the continuous filter confirms
>>>>>>>>>>>>>> the record's time satisfies your condition, 3) the continuous queries
>>>>>>>>>>>>>> notifies your application that does require processing.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -
>>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <
>>>>>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all
>>>>>>>>>>>>>>> the records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>>>>>> An example is to watch for all the records whose timestamp
>>>>>>>>>>>>>>> ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> -
>>>>>>>>>>>> Denis
>>>>>>>>>>>>
>>>>>>>>>>>>

-- 
<http://www.trimble.com/>
Raymond Wilson
Solution Architect, Civil Construction Software Systems (CCSS)
11 Birmingham Drive | Christchurch, New Zealand
+64-21-2013317 Mobile
raymond_wilson@trimble.com

<https://worksos.trimble.com/?utm_source=Trimble&utm_medium=emailsign&utm_campaign=Launch>

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

Thanks for the detailed explanation.

I don't know how efficient it would be if you have to filter each record
one by one and then update each record, three times, to keep track of the
status, if you're dealing with millions of records each hour, even if the
cache is partitioned. I guess I will need to benchmark this.  thanks again.

On Wed, Oct 7, 2020 at 12:00 PM Denis Magda <dm...@apache.org> wrote:

> I recalled a complete solution. That's what you would need to do if decide
> to process records in real-time with continuous queries in the
> *fault-tolerant fashion* (using pseudo-code rather than actual Ignite APIs).
>
> First. You need to add a flag field to the record's class that keeps the
> current processing status. Like that:
>
> MyRecord {
> int id;
> Date created;
> byte status; //0 - not processed, 1 - being processed withing a
> continuous query filter, 2 - processed by the filter, all the logic
> successfully completed
> }
>
> Second. The continuous query filter (that will be executed on nodes that
> store a copy of a record) needs to have the following structure.
>
> @IgniteAsyncCallback
> filterMethod(MyRecords updatedRecord) {
>
> if (isThisNodePrimaryNodeForTheRecord(updatedRecord)) { // execute on a
> primary node only
>     updatedRecord.status = 1 // setting the flag to signify the processing
> is started.
>
>     //execute your business logic
>
>     updatedRecord.status = 2 // finished processing
> }
>   return false; //you don't want a notification to be sent to the client
> application or another node that deployed the continuous query
> }
>
> Third. If any node leaves the cluster or the whole cluster is restarted,
> then you need to execute your custom for all the records with status=0 or
> status=1. To do that you can broadcast a compute task:
>
> // Application side
>
> int[] unprocessedRecords = "select id from MyRecord where status < 2;"
>
> IgniteCompute.affinityRun(idsOfUnprocessedRecords, taskWithMyCustomLogic);
> //the task will be executed only on the nodes that store the records
>
> // Server side
>
> taskWithMyCustomLogic() {
>     updatedRecord.status = 1 // setting the flag to signify the processing
> is started.
>
>     //execute your business logic
>
>     updatedRecord.status = 2 // finished processing
> }
>
>
> That's it. So, the third step already requires you to have a compute task
> that would run the calculation in case of failures. Thus, if the real-time
> aspect of the processing is not crucial right now, then you can start with
> the batch-based approach by running a compute task once at a time and then
> introduce the continuous queries-based improvement whenever is needed. You
> decide. Hope it helps.
>
>
> -
> Denis
>
>
> On Wed, Oct 7, 2020 at 9:19 AM Denis Magda <dm...@apache.org> wrote:
>
>> So, a lesson for the future, would the continuous query approach still be
>>> preferable if the calculation involves the cache with continuous query and
>>> say a look up table? For example, if I want to see the country in the cache
>>> employee exists in the list of the countries that I am interested in.
>>
>>
>> You can access other caches from within the filter but the logic has to
>> be executed asynchronously to avoid deadlocks:
>> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>
>> Also, what do I need to do if I want the filter for the continuous query
>>> to execute on the cache on the local node only? Say, I have the continuous
>>> query deployed as singleton service on each node, to capture certain
>>> changes to a cache on the local node.
>>
>>
>> <https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>
>>
>> The filter will be deployed and executed on every server node. The filter
>> is executed only on a server node that owns a record that is being modified
>> and passed into a filter. Hmm, it's also said that the filter can be
>> executed on a backup node. Check if it's true, and then you need to add a
>> special check into the filter that would allow executing the logic only if
>> it's the primary node:
>>
>> https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter
>>
>>
>> -
>> Denis
>>
>>
>> On Wed, Oct 7, 2020 at 4:39 AM narges saleh <sn...@gmail.com> wrote:
>>
>>> Also, what do I need to do if I want the filter for the continuous query
>>> to execute on the cache on the local node only? Say, I have the continuous
>>> query deployed as singleton service on each node, to capture certain
>>> changes to a cache on the local node.
>>>
>>> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <sn...@gmail.com>
>>> wrote:
>>>
>>>> Thank you, Denis.
>>>> So, a lesson for the future, would the continuous query approach still
>>>> be preferable if the calculation involves the cache with continuous query
>>>> and say a look up table? For example, if I want to see the country in the
>>>> cache employee exists in the list of the countries that I am interested in.
>>>>
>>>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:
>>>>
>>>>> Thanks
>>>>>
>>>>> Then, I would consider the continuous queries based solution as long
>>>>> as the records can be updated in real-time:
>>>>>
>>>>>    - You can process the records on the fly and don't need to come up
>>>>>    with any batch task.
>>>>>    - The continuous query filter will be executed once on a node that
>>>>>    stores the record's primary copy. If the primary node fails in the middle
>>>>>    of the filter's calculation execution, then the filter will be executed on
>>>>>    a backup node. So, you will not lose any updates but might need to
>>>>>    introduce some logic/flag that confirms the calculation is not executed
>>>>>    twice for a single record (this can happen if the primary node failed in
>>>>>    the middle of the calculation execution and then the backup node picked up
>>>>>    and started executing the calculation from scratch).
>>>>>    - Updates of other tables or records from within the continuous
>>>>>    query filter must go through an async thread pool. You need to use
>>>>>    IgniteAsyncCallback annotation for that:
>>>>>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>>
>>>>> Alternatively, you can always run the calculation in the batch-fashion:
>>>>>
>>>>>    - Run a compute task once in a while
>>>>>    - Read all the latest records that satisfy the requests with SQL
>>>>>    or any other APIs
>>>>>    - Complete the calculation, mark already processed records just in
>>>>>    case if everything is failed in the middle and you need to run the
>>>>>    calculation from scratch
>>>>>
>>>>>
>>>>> -
>>>>> Denis
>>>>>
>>>>>
>>>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Denis
>>>>>>  The calculation itself doesn't involve an update or read of another
>>>>>> record, but based on the outcome of the calculation, the process might make
>>>>>> changes in some other tables.
>>>>>>
>>>>>> thanks.
>>>>>>
>>>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:
>>>>>>
>>>>>>> Good. Another clarification:
>>>>>>>
>>>>>>>    - Does that calculation change the state of the record (updates
>>>>>>>    any fields)?
>>>>>>>    - Does the calculation read or update any other records?
>>>>>>>
>>>>>>> -
>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The latter; the server needs to perform some calculations on the
>>>>>>>> data without sending any notification to the app.
>>>>>>>>
>>>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> And after you detect a record that satisfies the condition, do you
>>>>>>>>> need to send any notification to the application? Or is it more like a
>>>>>>>>> server detects and does some calculation logically without updating the app.
>>>>>>>>>
>>>>>>>>> -
>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>>>>> system with big delays.
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> What are your requirements? Do you need to process the records
>>>>>>>>>>> as soon as they are put into the cluster?
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Use continuous queries if you need to be notified in
>>>>>>>>>>>>> real-time, i.e. 1) a record is inserted, 2) the continuous filter confirms
>>>>>>>>>>>>> the record's time satisfies your condition, 3) the continuous queries
>>>>>>>>>>>>> notifies your application that does require processing.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> -
>>>>>>>>>>>>> Denis
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <
>>>>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all
>>>>>>>>>>>>>> the records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>>>>> An example is to watch for all the records whose timestamp
>>>>>>>>>>>>>> ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> -
>>>>>>>>>>> Denis
>>>>>>>>>>>
>>>>>>>>>>>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

I recalled a complete solution. That's what you would need to do if decide
to process records in real-time with continuous queries in the
*fault-tolerant fashion* (using pseudo-code rather than actual Ignite APIs).

First. You need to add a flag field to the record's class that keeps the
current processing status. Like that:

MyRecord {
int id;
Date created;
byte status; //0 - not processed, 1 - being processed withing a
continuous query filter, 2 - processed by the filter, all the logic
successfully completed
}

Second. The continuous query filter (that will be executed on nodes that
store a copy of a record) needs to have the following structure.

@IgniteAsyncCallback
filterMethod(MyRecords updatedRecord) {

if (isThisNodePrimaryNodeForTheRecord(updatedRecord)) { // execute on a
primary node only
    updatedRecord.status = 1 // setting the flag to signify the processing
is started.

    //execute your business logic

    updatedRecord.status = 2 // finished processing
}
  return false; //you don't want a notification to be sent to the client
application or another node that deployed the continuous query
}

Third. If any node leaves the cluster or the whole cluster is restarted,
then you need to execute your custom for all the records with status=0 or
status=1. To do that you can broadcast a compute task:

// Application side

int[] unprocessedRecords = "select id from MyRecord where status < 2;"

IgniteCompute.affinityRun(idsOfUnprocessedRecords, taskWithMyCustomLogic);
//the task will be executed only on the nodes that store the records

// Server side

taskWithMyCustomLogic() {
    updatedRecord.status = 1 // setting the flag to signify the processing
is started.

    //execute your business logic

    updatedRecord.status = 2 // finished processing
}


That's it. So, the third step already requires you to have a compute task
that would run the calculation in case of failures. Thus, if the real-time
aspect of the processing is not crucial right now, then you can start with
the batch-based approach by running a compute task once at a time and then
introduce the continuous queries-based improvement whenever is needed. You
decide. Hope it helps.


-
Denis


On Wed, Oct 7, 2020 at 9:19 AM Denis Magda <dm...@apache.org> wrote:

> So, a lesson for the future, would the continuous query approach still be
>> preferable if the calculation involves the cache with continuous query and
>> say a look up table? For example, if I want to see the country in the cache
>> employee exists in the list of the countries that I am interested in.
>
>
> You can access other caches from within the filter but the logic has to be
> executed asynchronously to avoid deadlocks:
> https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>
> Also, what do I need to do if I want the filter for the continuous query
>> to execute on the cache on the local node only? Say, I have the continuous
>> query deployed as singleton service on each node, to capture certain
>> changes to a cache on the local node.
>
>
> <https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>
>
> The filter will be deployed and executed on every server node. The filter
> is executed only on a server node that owns a record that is being modified
> and passed into a filter. Hmm, it's also said that the filter can be
> executed on a backup node. Check if it's true, and then you need to add a
> special check into the filter that would allow executing the logic only if
> it's the primary node:
>
> https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter
>
>
> -
> Denis
>
>
> On Wed, Oct 7, 2020 at 4:39 AM narges saleh <sn...@gmail.com> wrote:
>
>> Also, what do I need to do if I want the filter for the continuous query
>> to execute on the cache on the local node only? Say, I have the continuous
>> query deployed as singleton service on each node, to capture certain
>> changes to a cache on the local node.
>>
>> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <sn...@gmail.com> wrote:
>>
>>> Thank you, Denis.
>>> So, a lesson for the future, would the continuous query approach still
>>> be preferable if the calculation involves the cache with continuous query
>>> and say a look up table? For example, if I want to see the country in the
>>> cache employee exists in the list of the countries that I am interested in.
>>>
>>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:
>>>
>>>> Thanks
>>>>
>>>> Then, I would consider the continuous queries based solution as long as
>>>> the records can be updated in real-time:
>>>>
>>>>    - You can process the records on the fly and don't need to come up
>>>>    with any batch task.
>>>>    - The continuous query filter will be executed once on a node that
>>>>    stores the record's primary copy. If the primary node fails in the middle
>>>>    of the filter's calculation execution, then the filter will be executed on
>>>>    a backup node. So, you will not lose any updates but might need to
>>>>    introduce some logic/flag that confirms the calculation is not executed
>>>>    twice for a single record (this can happen if the primary node failed in
>>>>    the middle of the calculation execution and then the backup node picked up
>>>>    and started executing the calculation from scratch).
>>>>    - Updates of other tables or records from within the continuous
>>>>    query filter must go through an async thread pool. You need to use
>>>>    IgniteAsyncCallback annotation for that:
>>>>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>>
>>>> Alternatively, you can always run the calculation in the batch-fashion:
>>>>
>>>>    - Run a compute task once in a while
>>>>    - Read all the latest records that satisfy the requests with SQL or
>>>>    any other APIs
>>>>    - Complete the calculation, mark already processed records just in
>>>>    case if everything is failed in the middle and you need to run the
>>>>    calculation from scratch
>>>>
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Denis
>>>>>  The calculation itself doesn't involve an update or read of another
>>>>> record, but based on the outcome of the calculation, the process might make
>>>>> changes in some other tables.
>>>>>
>>>>> thanks.
>>>>>
>>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:
>>>>>
>>>>>> Good. Another clarification:
>>>>>>
>>>>>>    - Does that calculation change the state of the record (updates
>>>>>>    any fields)?
>>>>>>    - Does the calculation read or update any other records?
>>>>>>
>>>>>> -
>>>>>> Denis
>>>>>>
>>>>>>
>>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The latter; the server needs to perform some calculations on the
>>>>>>> data without sending any notification to the app.
>>>>>>>
>>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> And after you detect a record that satisfies the condition, do you
>>>>>>>> need to send any notification to the application? Or is it more like a
>>>>>>>> server detects and does some calculation logically without updating the app.
>>>>>>>>
>>>>>>>> -
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>>>> system with big delays.
>>>>>>>>>
>>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> What are your requirements? Do you need to process the records as
>>>>>>>>>> soon as they are put into the cluster?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>>
>>>>>>>>>>>> Use continuous queries if you need to be notified in real-time,
>>>>>>>>>>>> i.e. 1) a record is inserted, 2) the continuous filter confirms the
>>>>>>>>>>>> record's time satisfies your condition, 3) the continuous queries notifies
>>>>>>>>>>>> your application that does require processing.
>>>>>>>>>>>>
>>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> -
>>>>>>>>>>>> Denis
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <
>>>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>>>> An example is to watch for all the records whose timestamp
>>>>>>>>>>>>> ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>>
>>>>>>>>>>>>> thanks
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> -
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

>
> So, a lesson for the future, would the continuous query approach still be
> preferable if the calculation involves the cache with continuous query and
> say a look up table? For example, if I want to see the country in the cache
> employee exists in the list of the countries that I am interested in.


You can access other caches from within the filter but the logic has to be
executed asynchronously to avoid deadlocks:
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html

Also, what do I need to do if I want the filter for the continuous query to
> execute on the cache on the local node only? Say, I have the continuous
> query deployed as singleton service on each node, to capture certain
> changes to a cache on the local node.

<https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html>

The filter will be deployed and executed on every server node. The filter
is executed only on a server node that owns a record that is being modified
and passed into a filter. Hmm, it's also said that the filter can be
executed on a backup node. Check if it's true, and then you need to add a
special check into the filter that would allow executing the logic only if
it's the primary node:
https://ignite.apache.org/docs/latest/key-value-api/continuous-queries#remote-filter


-
Denis


On Wed, Oct 7, 2020 at 4:39 AM narges saleh <sn...@gmail.com> wrote:

> Also, what do I need to do if I want the filter for the continuous query
> to execute on the cache on the local node only? Say, I have the continuous
> query deployed as singleton service on each node, to capture certain
> changes to a cache on the local node.
>
> On Wed, Oct 7, 2020 at 5:54 AM narges saleh <sn...@gmail.com> wrote:
>
>> Thank you, Denis.
>> So, a lesson for the future, would the continuous query approach still be
>> preferable if the calculation involves the cache with continuous query and
>> say a look up table? For example, if I want to see the country in the cache
>> employee exists in the list of the countries that I am interested in.
>>
>> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:
>>
>>> Thanks
>>>
>>> Then, I would consider the continuous queries based solution as long as
>>> the records can be updated in real-time:
>>>
>>>    - You can process the records on the fly and don't need to come up
>>>    with any batch task.
>>>    - The continuous query filter will be executed once on a node that
>>>    stores the record's primary copy. If the primary node fails in the middle
>>>    of the filter's calculation execution, then the filter will be executed on
>>>    a backup node. So, you will not lose any updates but might need to
>>>    introduce some logic/flag that confirms the calculation is not executed
>>>    twice for a single record (this can happen if the primary node failed in
>>>    the middle of the calculation execution and then the backup node picked up
>>>    and started executing the calculation from scratch).
>>>    - Updates of other tables or records from within the continuous
>>>    query filter must go through an async thread pool. You need to use
>>>    IgniteAsyncCallback annotation for that:
>>>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>>
>>> Alternatively, you can always run the calculation in the batch-fashion:
>>>
>>>    - Run a compute task once in a while
>>>    - Read all the latest records that satisfy the requests with SQL or
>>>    any other APIs
>>>    - Complete the calculation, mark already processed records just in
>>>    case if everything is failed in the middle and you need to run the
>>>    calculation from scratch
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com>
>>> wrote:
>>>
>>>> Denis
>>>>  The calculation itself doesn't involve an update or read of another
>>>> record, but based on the outcome of the calculation, the process might make
>>>> changes in some other tables.
>>>>
>>>> thanks.
>>>>
>>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:
>>>>
>>>>> Good. Another clarification:
>>>>>
>>>>>    - Does that calculation change the state of the record (updates
>>>>>    any fields)?
>>>>>    - Does the calculation read or update any other records?
>>>>>
>>>>> -
>>>>> Denis
>>>>>
>>>>>
>>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The latter; the server needs to perform some calculations on the data
>>>>>> without sending any notification to the app.
>>>>>>
>>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>>>>>>
>>>>>>> And after you detect a record that satisfies the condition, do you
>>>>>>> need to send any notification to the application? Or is it more like a
>>>>>>> server detects and does some calculation logically without updating the app.
>>>>>>>
>>>>>>> -
>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>>> system with big delays.
>>>>>>>>
>>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> What are your requirements? Do you need to process the records as
>>>>>>>>> soon as they are put into the cluster?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Narges,
>>>>>>>>>>>
>>>>>>>>>>> Use continuous queries if you need to be notified in real-time,
>>>>>>>>>>> i.e. 1) a record is inserted, 2) the continuous filter confirms the
>>>>>>>>>>> record's time satisfies your condition, 3) the continuous queries notifies
>>>>>>>>>>> your application that does require processing.
>>>>>>>>>>>
>>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>>> process records together with some delay.
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> -
>>>>>>>>>>> Denis
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <
>>>>>>>>>>> snarges124@gmail.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi All,
>>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>>> An example is to watch for all the records whose timestamp ends
>>>>>>>>>>>> in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>>
>>>>>>>>>>>> thanks
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> --
>>>>>>>>> -
>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

Also, what do I need to do if I want the filter for the continuous query to
execute on the cache on the local node only? Say, I have the continuous
query deployed as singleton service on each node, to capture certain
changes to a cache on the local node.

On Wed, Oct 7, 2020 at 5:54 AM narges saleh <sn...@gmail.com> wrote:

> Thank you, Denis.
> So, a lesson for the future, would the continuous query approach still be
> preferable if the calculation involves the cache with continuous query and
> say a look up table? For example, if I want to see the country in the cache
> employee exists in the list of the countries that I am interested in.
>
> On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:
>
>> Thanks
>>
>> Then, I would consider the continuous queries based solution as long as
>> the records can be updated in real-time:
>>
>>    - You can process the records on the fly and don't need to come up
>>    with any batch task.
>>    - The continuous query filter will be executed once on a node that
>>    stores the record's primary copy. If the primary node fails in the middle
>>    of the filter's calculation execution, then the filter will be executed on
>>    a backup node. So, you will not lose any updates but might need to
>>    introduce some logic/flag that confirms the calculation is not executed
>>    twice for a single record (this can happen if the primary node failed in
>>    the middle of the calculation execution and then the backup node picked up
>>    and started executing the calculation from scratch).
>>    - Updates of other tables or records from within the continuous query
>>    filter must go through an async thread pool. You need to use
>>    IgniteAsyncCallback annotation for that:
>>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>>
>> Alternatively, you can always run the calculation in the batch-fashion:
>>
>>    - Run a compute task once in a while
>>    - Read all the latest records that satisfy the requests with SQL or
>>    any other APIs
>>    - Complete the calculation, mark already processed records just in
>>    case if everything is failed in the middle and you need to run the
>>    calculation from scratch
>>
>>
>> -
>> Denis
>>
>>
>> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com> wrote:
>>
>>> Denis
>>>  The calculation itself doesn't involve an update or read of another
>>> record, but based on the outcome of the calculation, the process might make
>>> changes in some other tables.
>>>
>>> thanks.
>>>
>>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:
>>>
>>>> Good. Another clarification:
>>>>
>>>>    - Does that calculation change the state of the record (updates any
>>>>    fields)?
>>>>    - Does the calculation read or update any other records?
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>>> wrote:
>>>>
>>>>> The latter; the server needs to perform some calculations on the data
>>>>> without sending any notification to the app.
>>>>>
>>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>>>>>
>>>>>> And after you detect a record that satisfies the condition, do you
>>>>>> need to send any notification to the application? Or is it more like a
>>>>>> server detects and does some calculation logically without updating the app.
>>>>>>
>>>>>> -
>>>>>> Denis
>>>>>>
>>>>>>
>>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> The detection should happen at most a couple of minutes after a
>>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>>> node. But some records with the current timestamp might show up in the
>>>>>>> system with big delays.
>>>>>>>
>>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> What are your requirements? Do you need to process the records as
>>>>>>>> soon as they are put into the cluster?
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Thank you Dennis for the reply.
>>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>>> approach impose a lot more overhead?
>>>>>>>>>
>>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Narges,
>>>>>>>>>>
>>>>>>>>>> Use continuous queries if you need to be notified in real-time,
>>>>>>>>>> i.e. 1) a record is inserted, 2) the continuous filter confirms the
>>>>>>>>>> record's time satisfies your condition, 3) the continuous queries notifies
>>>>>>>>>> your application that does require processing.
>>>>>>>>>>
>>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>>> process records together with some delay.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> -
>>>>>>>>>> Denis
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>>>>>>>> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi All,
>>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>>> An example is to watch for all the records whose timestamp ends
>>>>>>>>>>> in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>>
>>>>>>>>>>> thanks
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> -
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

Thank you, Denis.
So, a lesson for the future, would the continuous query approach still be
preferable if the calculation involves the cache with continuous query and
say a look up table? For example, if I want to see the country in the cache
employee exists in the list of the countries that I am interested in.

On Tue, Oct 6, 2020 at 4:11 PM Denis Magda <dm...@apache.org> wrote:

> Thanks
>
> Then, I would consider the continuous queries based solution as long as
> the records can be updated in real-time:
>
>    - You can process the records on the fly and don't need to come up
>    with any batch task.
>    - The continuous query filter will be executed once on a node that
>    stores the record's primary copy. If the primary node fails in the middle
>    of the filter's calculation execution, then the filter will be executed on
>    a backup node. So, you will not lose any updates but might need to
>    introduce some logic/flag that confirms the calculation is not executed
>    twice for a single record (this can happen if the primary node failed in
>    the middle of the calculation execution and then the backup node picked up
>    and started executing the calculation from scratch).
>    - Updates of other tables or records from within the continuous query
>    filter must go through an async thread pool. You need to use
>    IgniteAsyncCallback annotation for that:
>    https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html
>
> Alternatively, you can always run the calculation in the batch-fashion:
>
>    - Run a compute task once in a while
>    - Read all the latest records that satisfy the requests with SQL or
>    any other APIs
>    - Complete the calculation, mark already processed records just in
>    case if everything is failed in the middle and you need to run the
>    calculation from scratch
>
>
> -
> Denis
>
>
> On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com> wrote:
>
>> Denis
>>  The calculation itself doesn't involve an update or read of another
>> record, but based on the outcome of the calculation, the process might make
>> changes in some other tables.
>>
>> thanks.
>>
>> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:
>>
>>> Good. Another clarification:
>>>
>>>    - Does that calculation change the state of the record (updates any
>>>    fields)?
>>>    - Does the calculation read or update any other records?
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com>
>>> wrote:
>>>
>>>> The latter; the server needs to perform some calculations on the data
>>>> without sending any notification to the app.
>>>>
>>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>>>>
>>>>> And after you detect a record that satisfies the condition, do you
>>>>> need to send any notification to the application? Or is it more like a
>>>>> server detects and does some calculation logically without updating the app.
>>>>>
>>>>> -
>>>>> Denis
>>>>>
>>>>>
>>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> The detection should happen at most a couple of minutes after a
>>>>>> record is inserted in the cache but all the detections are local to the
>>>>>> node. But some records with the current timestamp might show up in the
>>>>>> system with big delays.
>>>>>>
>>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>
>>>>>> wrote:
>>>>>>
>>>>>>> What are your requirements? Do you need to process the records as
>>>>>>> soon as they are put into the cluster?
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Thank you Dennis for the reply.
>>>>>>>> From the perspective of performance/resource overhead and
>>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>>> approach impose a lot more overhead?
>>>>>>>>
>>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi Narges,
>>>>>>>>>
>>>>>>>>> Use continuous queries if you need to be notified in real-time,
>>>>>>>>> i.e. 1) a record is inserted, 2) the continuous filter confirms the
>>>>>>>>> record's time satisfies your condition, 3) the continuous queries notifies
>>>>>>>>> your application that does require processing.
>>>>>>>>>
>>>>>>>>> The jobs are better for a batching use case when it's ok to
>>>>>>>>> process records together with some delay.
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> -
>>>>>>>>> Denis
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>>> Hi All,
>>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>>> An example is to watch for all the records whose timestamp ends
>>>>>>>>>> in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>>
>>>>>>>>>> thanks
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>
>>>>>>> --
>>>>>>> -
>>>>>>> Denis
>>>>>>>
>>>>>>>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

Thanks

Then, I would consider the continuous queries based solution as long as the
records can be updated in real-time:

   - You can process the records on the fly and don't need to come up with
   any batch task.
   - The continuous query filter will be executed once on a node that
   stores the record's primary copy. If the primary node fails in the middle
   of the filter's calculation execution, then the filter will be executed on
   a backup node. So, you will not lose any updates but might need to
   introduce some logic/flag that confirms the calculation is not executed
   twice for a single record (this can happen if the primary node failed in
   the middle of the calculation execution and then the backup node picked up
   and started executing the calculation from scratch).
   - Updates of other tables or records from within the continuous query
   filter must go through an async thread pool. You need to use
   IgniteAsyncCallback annotation for that:
   https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/lang/IgniteAsyncCallback.html

Alternatively, you can always run the calculation in the batch-fashion:

   - Run a compute task once in a while
   - Read all the latest records that satisfy the requests with SQL or any
   other APIs
   - Complete the calculation, mark already processed records just in case
   if everything is failed in the middle and you need to run the calculation
   from scratch


-
Denis


On Mon, Oct 5, 2020 at 8:33 PM narges saleh <sn...@gmail.com> wrote:

> Denis
>  The calculation itself doesn't involve an update or read of another
> record, but based on the outcome of the calculation, the process might make
> changes in some other tables.
>
> thanks.
>
> On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:
>
>> Good. Another clarification:
>>
>>    - Does that calculation change the state of the record (updates any
>>    fields)?
>>    - Does the calculation read or update any other records?
>>
>> -
>> Denis
>>
>>
>> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com> wrote:
>>
>>> The latter; the server needs to perform some calculations on the data
>>> without sending any notification to the app.
>>>
>>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>>>
>>>> And after you detect a record that satisfies the condition, do you need
>>>> to send any notification to the application? Or is it more like a server
>>>> detects and does some calculation logically without updating the app.
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>>> wrote:
>>>>
>>>>> The detection should happen at most a couple of minutes after a record
>>>>> is inserted in the cache but all the detections are local to the node. But
>>>>> some records with the current timestamp might show up in the system with
>>>>> big delays.
>>>>>
>>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:
>>>>>
>>>>>> What are your requirements? Do you need to process the records as
>>>>>> soon as they are put into the cluster?
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Thank you Dennis for the reply.
>>>>>>> From the perspective of performance/resource overhead and
>>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>>> approach impose a lot more overhead?
>>>>>>>
>>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi Narges,
>>>>>>>>
>>>>>>>> Use continuous queries if you need to be notified in real-time,
>>>>>>>> i.e. 1) a record is inserted, 2) the continuous filter confirms the
>>>>>>>> record's time satisfies your condition, 3) the continuous queries notifies
>>>>>>>> your application that does require processing.
>>>>>>>>
>>>>>>>> The jobs are better for a batching use case when it's ok to process
>>>>>>>> records together with some delay.
>>>>>>>>
>>>>>>>>
>>>>>>>> -
>>>>>>>> Denis
>>>>>>>>
>>>>>>>>
>>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>>> Hi All,
>>>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>>> An example is to watch for all the records whose timestamp ends in
>>>>>>>>> 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>>
>>>>>>>>> thanks
>>>>>>>>>
>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> -
>>>>>> Denis
>>>>>>
>>>>>>

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

Denis
 The calculation itself doesn't involve an update or read of another
record, but based on the outcome of the calculation, the process might make
changes in some other tables.

thanks.

On Mon, Oct 5, 2020 at 7:04 PM Denis Magda <dm...@apache.org> wrote:

> Good. Another clarification:
>
>    - Does that calculation change the state of the record (updates any
>    fields)?
>    - Does the calculation read or update any other records?
>
> -
> Denis
>
>
> On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com> wrote:
>
>> The latter; the server needs to perform some calculations on the data
>> without sending any notification to the app.
>>
>> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>>
>>> And after you detect a record that satisfies the condition, do you need
>>> to send any notification to the application? Or is it more like a server
>>> detects and does some calculation logically without updating the app.
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>>> wrote:
>>>
>>>> The detection should happen at most a couple of minutes after a record
>>>> is inserted in the cache but all the detections are local to the node. But
>>>> some records with the current timestamp might show up in the system with
>>>> big delays.
>>>>
>>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:
>>>>
>>>>> What are your requirements? Do you need to process the records as soon
>>>>> as they are put into the cluster?
>>>>>
>>>>>
>>>>>
>>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:
>>>>>
>>>>>> Thank you Dennis for the reply.
>>>>>> From the perspective of performance/resource overhead and
>>>>>> reliability, which approach is preferable? Does a continuous query based
>>>>>> approach impose a lot more overhead?
>>>>>>
>>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>>>>>>
>>>>>>> Hi Narges,
>>>>>>>
>>>>>>> Use continuous queries if you need to be notified in real-time, i.e.
>>>>>>> 1) a record is inserted, 2) the continuous filter confirms the record's
>>>>>>> time satisfies your condition, 3) the continuous queries notifies your
>>>>>>> application that does require processing.
>>>>>>>
>>>>>>> The jobs are better for a batching use case when it's ok to process
>>>>>>> records together with some delay.
>>>>>>>
>>>>>>>
>>>>>>> -
>>>>>>> Denis
>>>>>>>
>>>>>>>
>>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Hi All,
>>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>>> An example is to watch for all the records whose timestamp ends in
>>>>>>>> 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>>
>>>>>>>> thanks
>>>>>>>>
>>>>>>>>
>>>>>
>>>>> --
>>>>> -
>>>>> Denis
>>>>>
>>>>>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

Good. Another clarification:

   - Does that calculation change the state of the record (updates any
   fields)?
   - Does the calculation read or update any other records?

-
Denis


On Sat, Oct 3, 2020 at 1:34 PM narges saleh <sn...@gmail.com> wrote:

> The latter; the server needs to perform some calculations on the data
> without sending any notification to the app.
>
> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>
>> And after you detect a record that satisfies the condition, do you need
>> to send any notification to the application? Or is it more like a server
>> detects and does some calculation logically without updating the app.
>>
>> -
>> Denis
>>
>>
>> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>
>> wrote:
>>
>>> The detection should happen at most a couple of minutes after a record
>>> is inserted in the cache but all the detections are local to the node. But
>>> some records with the current timestamp might show up in the system with
>>> big delays.
>>>
>>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:
>>>
>>>> What are your requirements? Do you need to process the records as soon
>>>> as they are put into the cluster?
>>>>
>>>>
>>>>
>>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:
>>>>
>>>>> Thank you Dennis for the reply.
>>>>> From the perspective of performance/resource overhead and reliability,
>>>>> which approach is preferable? Does a continuous query based approach impose
>>>>> a lot more overhead?
>>>>>
>>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>>>>>
>>>>>> Hi Narges,
>>>>>>
>>>>>> Use continuous queries if you need to be notified in real-time, i.e.
>>>>>> 1) a record is inserted, 2) the continuous filter confirms the record's
>>>>>> time satisfies your condition, 3) the continuous queries notifies your
>>>>>> application that does require processing.
>>>>>>
>>>>>> The jobs are better for a batching use case when it's ok to process
>>>>>> records together with some delay.
>>>>>>
>>>>>>
>>>>>> -
>>>>>> Denis
>>>>>>
>>>>>>
>>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>>> Hi All,
>>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>>> An example is to watch for all the records whose timestamp ends in
>>>>>>> 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>>
>>>>>>> thanks
>>>>>>>
>>>>>>>
>>>>
>>>> --
>>>> -
>>>> Denis
>>>>
>>>>

Re: Continuous Query

Posted by Ilya Kasnacheev <il...@gmail.com>.

Please send an empty message to: user-unsubscribe@ignite.apache.org to
unsubscribe yourself from the list.

Regards,
-- 
Ilya Kasnacheev


пн, 5 окт. 2020 г. в 07:35, Priya Yadav <Pr...@fico.com>:

> unsubscribe
> ------------------------------
> *From:* narges saleh <sn...@gmail.com>
> *Sent:* Sunday, 4 October 2020 2:03 AM
> *To:* user@ignite.apache.org <us...@ignite.apache.org>
> *Subject:* Re: Continuous Query
>
> The latter; the server needs to perform some calculations on the data
> without sending any notification to the app.
>
> On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:
>
> And after you detect a record that satisfies the condition, do you need to
> send any notification to the application? Or is it more like a server
> detects and does some calculation logically without updating the app.
>
> -
> Denis
>
>
> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com> wrote:
>
> The detection should happen at most a couple of minutes after a record is
> inserted in the cache but all the detections are local to the node. But
> some records with the current timestamp might show up in the system with
> big delays.
>
> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:
>
> What are your requirements? Do you need to process the records as soon as
> they are put into the cluster?
>
>
>
> On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:
>
> Thank you Dennis for the reply.
> From the perspective of performance/resource overhead and reliability,
> which approach is preferable? Does a continuous query based approach impose
> a lot more overhead?
>
> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>
> Hi Narges,
>
> Use continuous queries if you need to be notified in real-time, i.e. 1) a
> record is inserted, 2) the continuous filter confirms the record's time
> satisfies your condition, 3) the continuous queries notifies your
> application that does require processing.
>
> The jobs are better for a batching use case when it's ok to process
> records together with some delay.
>
>
> -
> Denis
>
>
> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com> wrote:
>
> Hi All,
>  If I want to watch for a rolling timestamp pattern in all the records
> that get inserted to all my caches, is it more efficient to use timer based
> jobs (that checks all the records in some interval) or  continuous queries
> that locally filter on the pattern? These records can get inserted in any
> order  and some can arrive with delays.
> An example is to watch for all the records whose timestamp ends in 50, if
> the timestamp is in the format yyyy-mm-dd hh:mi.
>
> thanks
>
>
>
> --
> -
> Denis
>
> This email and any files transmitted with it are confidential, proprietary
> and intended solely for the individual or entity to whom they are
> addressed. If you have received this email in error please delete it
> immediately.
>

Re: Continuous Query

Posted by Priya Yadav <Pr...@fico.com>.

unsubscribe
________________________________
From: narges saleh <sn...@gmail.com>
Sent: Sunday, 4 October 2020 2:03 AM
To: user@ignite.apache.org <us...@ignite.apache.org>
Subject: Re: Continuous Query

The latter; the server needs to perform some calculations on the data without sending any notification to the app.

On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org>> wrote:
And after you detect a record that satisfies the condition, do you need to send any notification to the application? Or is it more like a server detects and does some calculation logically without updating the app.

-
Denis

On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com>> wrote:
The detection should happen at most a couple of minutes after a record is inserted in the cache but all the detections are local to the node. But some records with the current timestamp might show up in the system with big delays.

On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org>> wrote:
What are your requirements? Do you need to process the records as soon as they are put into the cluster?

On Friday, October 2, 2020, narges saleh <sn...@gmail.com>> wrote:
Thank you Dennis for the reply.
From the perspective of performance/resource overhead and reliability, which approach is preferable? Does a continuous query based approach impose a lot more overhead?

On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org>> wrote:
Hi Narges,

Use continuous queries if you need to be notified in real-time, i.e. 1) a record is inserted, 2) the continuous filter confirms the record's time satisfies your condition, 3) the continuous queries notifies your application that does require processing.

The jobs are better for a batching use case when it's ok to process records together with some delay.

-
Denis

On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>> wrote:
Hi All,
 If I want to watch for a rolling timestamp pattern in all the records that get inserted to all my caches, is it more efficient to use timer based jobs (that checks all the records in some interval) or  continuous queries that locally filter on the pattern? These records can get inserted in any order  and some can arrive with delays.
An example is to watch for all the records whose timestamp ends in 50, if the timestamp is in the format yyyy-mm-dd hh:mi.

thanks

--
-
Denis

This email and any files transmitted with it are confidential, proprietary and intended solely for the individual or entity to whom they are addressed. If you have received this email in error please delete it immediately.

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

The latter; the server needs to perform some calculations on the data
without sending any notification to the app.

On Fri, Oct 2, 2020 at 4:25 PM Denis Magda <dm...@apache.org> wrote:

> And after you detect a record that satisfies the condition, do you need to
> send any notification to the application? Or is it more like a server
> detects and does some calculation logically without updating the app.
>
> -
> Denis
>
>
> On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com> wrote:
>
>> The detection should happen at most a couple of minutes after a record is
>> inserted in the cache but all the detections are local to the node. But
>> some records with the current timestamp might show up in the system with
>> big delays.
>>
>> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:
>>
>>> What are your requirements? Do you need to process the records as soon
>>> as they are put into the cluster?
>>>
>>>
>>>
>>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:
>>>
>>>> Thank you Dennis for the reply.
>>>> From the perspective of performance/resource overhead and reliability,
>>>> which approach is preferable? Does a continuous query based approach impose
>>>> a lot more overhead?
>>>>
>>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>>>>
>>>>> Hi Narges,
>>>>>
>>>>> Use continuous queries if you need to be notified in real-time, i.e.
>>>>> 1) a record is inserted, 2) the continuous filter confirms the record's
>>>>> time satisfies your condition, 3) the continuous queries notifies your
>>>>> application that does require processing.
>>>>>
>>>>> The jobs are better for a batching use case when it's ok to process
>>>>> records together with some delay.
>>>>>
>>>>>
>>>>> -
>>>>> Denis
>>>>>
>>>>>
>>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>  If I want to watch for a rolling timestamp pattern in all the
>>>>>> records that get inserted to all my caches, is it more efficient to use
>>>>>> timer based jobs (that checks all the records in some interval) or
>>>>>> continuous queries that locally filter on the pattern? These records can
>>>>>> get inserted in any order  and some can arrive with delays.
>>>>>> An example is to watch for all the records whose timestamp ends in
>>>>>> 50, if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>>
>>>>>> thanks
>>>>>>
>>>>>>
>>>
>>> --
>>> -
>>> Denis
>>>
>>>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

And after you detect a record that satisfies the condition, do you need to
send any notification to the application? Or is it more like a server
detects and does some calculation logically without updating the app.

-
Denis


On Fri, Oct 2, 2020 at 11:22 AM narges saleh <sn...@gmail.com> wrote:

> The detection should happen at most a couple of minutes after a record is
> inserted in the cache but all the detections are local to the node. But
> some records with the current timestamp might show up in the system with
> big delays.
>
> On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:
>
>> What are your requirements? Do you need to process the records as soon as
>> they are put into the cluster?
>>
>>
>>
>> On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:
>>
>>> Thank you Dennis for the reply.
>>> From the perspective of performance/resource overhead and reliability,
>>> which approach is preferable? Does a continuous query based approach impose
>>> a lot more overhead?
>>>
>>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>>>
>>>> Hi Narges,
>>>>
>>>> Use continuous queries if you need to be notified in real-time, i.e. 1)
>>>> a record is inserted, 2) the continuous filter confirms the record's time
>>>> satisfies your condition, 3) the continuous queries notifies your
>>>> application that does require processing.
>>>>
>>>> The jobs are better for a batching use case when it's ok to process
>>>> records together with some delay.
>>>>
>>>>
>>>> -
>>>> Denis
>>>>
>>>>
>>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>>> wrote:
>>>>
>>>>> Hi All,
>>>>>  If I want to watch for a rolling timestamp pattern in all the records
>>>>> that get inserted to all my caches, is it more efficient to use timer based
>>>>> jobs (that checks all the records in some interval) or  continuous queries
>>>>> that locally filter on the pattern? These records can get inserted in any
>>>>> order  and some can arrive with delays.
>>>>> An example is to watch for all the records whose timestamp ends in 50,
>>>>> if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>>
>>>>> thanks
>>>>>
>>>>>
>>
>> --
>> -
>> Denis
>>
>>

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

The detection should happen at most a couple of minutes after a record is
inserted in the cache but all the detections are local to the node. But
some records with the current timestamp might show up in the system with
big delays.

On Fri, Oct 2, 2020 at 12:23 PM Denis Magda <dm...@apache.org> wrote:

> What are your requirements? Do you need to process the records as soon as
> they are put into the cluster?
>
>
>
> On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:
>
>> Thank you Dennis for the reply.
>> From the perspective of performance/resource overhead and reliability,
>> which approach is preferable? Does a continuous query based approach impose
>> a lot more overhead?
>>
>> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>>
>>> Hi Narges,
>>>
>>> Use continuous queries if you need to be notified in real-time, i.e. 1)
>>> a record is inserted, 2) the continuous filter confirms the record's time
>>> satisfies your condition, 3) the continuous queries notifies your
>>> application that does require processing.
>>>
>>> The jobs are better for a batching use case when it's ok to process
>>> records together with some delay.
>>>
>>>
>>> -
>>> Denis
>>>
>>>
>>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com>
>>> wrote:
>>>
>>>> Hi All,
>>>>  If I want to watch for a rolling timestamp pattern in all the records
>>>> that get inserted to all my caches, is it more efficient to use timer based
>>>> jobs (that checks all the records in some interval) or  continuous queries
>>>> that locally filter on the pattern? These records can get inserted in any
>>>> order  and some can arrive with delays.
>>>> An example is to watch for all the records whose timestamp ends in 50,
>>>> if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>>
>>>> thanks
>>>>
>>>>
>
> --
> -
> Denis
>
>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

What are your requirements? Do you need to process the records as soon as
they are put into the cluster?



On Friday, October 2, 2020, narges saleh <sn...@gmail.com> wrote:

> Thank you Dennis for the reply.
> From the perspective of performance/resource overhead and reliability,
> which approach is preferable? Does a continuous query based approach impose
> a lot more overhead?
>
> On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:
>
>> Hi Narges,
>>
>> Use continuous queries if you need to be notified in real-time, i.e. 1) a
>> record is inserted, 2) the continuous filter confirms the record's time
>> satisfies your condition, 3) the continuous queries notifies your
>> application that does require processing.
>>
>> The jobs are better for a batching use case when it's ok to process
>> records together with some delay.
>>
>>
>> -
>> Denis
>>
>>
>> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com> wrote:
>>
>>> Hi All,
>>>  If I want to watch for a rolling timestamp pattern in all the records
>>> that get inserted to all my caches, is it more efficient to use timer based
>>> jobs (that checks all the records in some interval) or  continuous queries
>>> that locally filter on the pattern? These records can get inserted in any
>>> order  and some can arrive with delays.
>>> An example is to watch for all the records whose timestamp ends in 50,
>>> if the timestamp is in the format yyyy-mm-dd hh:mi.
>>>
>>> thanks
>>>
>>>

-- 
-
Denis

Re: Continuous Query

Posted by narges saleh <sn...@gmail.com>.

Thank you Dennis for the reply.
From the perspective of performance/resource overhead and reliability,
which approach is preferable? Does a continuous query based approach impose
a lot more overhead?

On Fri, Oct 2, 2020 at 9:52 AM Denis Magda <dm...@apache.org> wrote:

> Hi Narges,
>
> Use continuous queries if you need to be notified in real-time, i.e. 1) a
> record is inserted, 2) the continuous filter confirms the record's time
> satisfies your condition, 3) the continuous queries notifies your
> application that does require processing.
>
> The jobs are better for a batching use case when it's ok to process
> records together with some delay.
>
>
> -
> Denis
>
>
> On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com> wrote:
>
>> Hi All,
>>  If I want to watch for a rolling timestamp pattern in all the records
>> that get inserted to all my caches, is it more efficient to use timer based
>> jobs (that checks all the records in some interval) or  continuous queries
>> that locally filter on the pattern? These records can get inserted in any
>> order  and some can arrive with delays.
>> An example is to watch for all the records whose timestamp ends in 50, if
>> the timestamp is in the format yyyy-mm-dd hh:mi.
>>
>> thanks
>>
>>

Re: Continuous Query

Posted by Denis Magda <dm...@apache.org>.

Hi Narges,

Use continuous queries if you need to be notified in real-time, i.e. 1) a
record is inserted, 2) the continuous filter confirms the record's time
satisfies your condition, 3) the continuous queries notifies your
application that does require processing.

The jobs are better for a batching use case when it's ok to process records
together with some delay.

-
Denis

On Fri, Oct 2, 2020 at 3:50 AM narges saleh <sn...@gmail.com> wrote:

> Hi All,
>  If I want to watch for a rolling timestamp pattern in all the records
> that get inserted to all my caches, is it more efficient to use timer based
> jobs (that checks all the records in some interval) or  continuous queries
> that locally filter on the pattern? These records can get inserted in any
> order  and some can arrive with delays.
> An example is to watch for all the records whose timestamp ends in 50, if
> the timestamp is in the format yyyy-mm-dd hh:mi.
>
> thanks
>
>