Posted to user@ignite.apache.org by VeenaMithare <v....@cmcmarkets.com> on 2021/01/18 16:43:48 UTC

2.8.1 : Continuous Query Initial query not returning any result sometimes

Hi Team,

Our env : 
Servers : 3
Cache Configuration : REPLICATED

On one of our clients, the initial query of the continuous query returned no
result at client start, although it was expected to return a single row. This
happened 3-4 times over a period of 2 hours. It did return the right result
after a couple of restarts.

I am trying to debug what happened when it did not return the result. My
suspicions are as below:

1. Last week we changed the client communication timeouts as shown below, as
per the discussion here (
http://apache-ignite-users.70518.x6.nabble.com/IgniteSpiOperationTimeoutException-Operation-timed-out-timeoutStrategy-ExponentialBackoffTimeoutStray-tp34196p34377.html
). The servers were already on these timeout values.

 <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi"
       scope="prototype">
     <property name="connectTimeout" value="5000"/>
     <property name="maxConnectTimeout" value="10000"/>
 </bean>

Is it possible that TcpCommunicationSpi timed out before the results reached
the client side?
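
For reference, the same two settings can be written programmatically. This is
only a sketch of the Spring XML fragment above, not the configuration from the
actual deployment: the class name ClientCommunicationConfig is hypothetical,
and client mode is assumed because the change was made on the client side.
Both properties control how long the SPI waits while establishing a connection
to a remote node.

```java
import org.apache.ignite.configuration.IgniteConfiguration;
import org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi;

// Hypothetical helper class; mirrors the XML bean shown in the message.
public class ClientCommunicationConfig {
    public static IgniteConfiguration clientConfiguration() {
        TcpCommunicationSpi commSpi = new TcpCommunicationSpi();
        // Initial timeout (ms) for establishing a connection to a remote node.
        commSpi.setConnectTimeout(5000);
        // Upper bound (ms) the connect timeout may grow to on repeated attempts.
        commSpi.setMaxConnectTimeout(10000);

        IgniteConfiguration cfg = new IgniteConfiguration();
        cfg.setClientMode(true); // assumption: this node is a client
        cfg.setCommunicationSpi(commSpi);
        return cfg;
    }
}
```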

2. Did the replication between the servers fail? Is it possible that one of
the servers had the wrong value? How do I debug this? Please note that when I
connect to each of the 3 servers using DBeaver, they all return the correct
row right now. (Since I got to know of this issue 2 days later, I am not sure
what the row value was at the time of the issue.)

3. How do I know which of the 3 servers the client's continuous query ->
initial query ran on?

How do I debug this further?

regards,
Veena.





--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes and UnregisteredBinaryTypeExceptio

Posted by VeenaMithare <v....@cmcmarkets.com>.
On further debugging, it looks like all the clients that got the wrong count
were executing their query on the same server, machinename002.

From the server logs, it looks like the missing records were not present on
this server when the query was executed, but were present on the other two
servers. (Kindly note that the cache is a replicated one, so all the records
should ideally be present on all 3 servers.)

In the exception logs, I see UnregisteredBinaryTypeException for these
records:
Attempted to update binary metadata inside a critical synchronization block
(will be automatically retried). This exception must not be wrapped to any
other exception class. If you encounter this exception outside of
EntryProcessor, please report to Apache Ignite dev-list.

Typically we don't do anything when we encounter this exception, as per the
message. Please let me know if we need to do anything here.

In this case, I suspect the retry of the entry processor did not work and left
machinename002 out of sync with the other two servers. How do we know whether
the UnregisteredBinaryTypeException was resolved on the next attempt?
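
One possible way to check whether a single server is out of sync is to ask
every server node how many entries of the cache it holds locally. The sketch
below is an illustration, not something taken from this thread: the class
names, the cache name "myCache" and the config path "client-config.xml" are
hypothetical. It assumes the job class is deployed on the servers (or peer
class loading is enabled) and that ignite-spring is on the classpath for the
XML config. For a REPLICATED cache each server is expected to report the same
primary + backup count.

```java
import java.util.Collection;

import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.cache.CachePeekMode;
import org.apache.ignite.lang.IgniteCallable;
import org.apache.ignite.resources.IgniteInstanceResource;

public class ReplicaSizeCheck {
    /** Runs on a server node and reports how many entries it holds locally. */
    static class LocalSizeJob implements IgniteCallable<String> {
        // Injected by Ignite with the instance of the node executing the job.
        @IgniteInstanceResource
        private transient Ignite ignite;

        private final String cacheName;

        LocalSizeJob(String cacheName) {
            this.cacheName = cacheName;
        }

        @Override public String call() {
            // Count primary and backup copies stored on this node only.
            int size = ignite.cache(cacheName)
                .localSize(CachePeekMode.PRIMARY, CachePeekMode.BACKUP);
            return ignite.cluster().localNode().consistentId() + " -> " + size;
        }
    }

    public static void main(String[] args) {
        // Hypothetical client config path and cache name.
        try (Ignite client = Ignition.start("client-config.xml")) {
            Collection<String> sizes = client
                .compute(client.cluster().forServers())
                .broadcast(new LocalSizeJob("myCache"));
            sizes.forEach(System.out::println);
        }
    }
}
```

If one node reports a smaller number than the others at the time the initial
query misbehaves, that would be consistent with the missing records on that
node; equal counts would point away from a replication problem.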










Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi Andrei, 

Did you get a chance to look at my comments?

Regarding structuring the continuous query as per this example:
https://github.com/gridgain/gridgain/blob/master/examples/src/main/java/org/apache/ignite/examples/datagrid/CacheContinuousQueryExample.java

Our code is pretty much as shown in the example. I cannot use cache.query in a
try-with-resources block, since I want the continuous query to live for the
lifetime of the client. Hence the cursor is closed only when the client is
stopped.
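
The lifetime pattern described above (keep the cursor open while the client
runs, close it once at shutdown) could be sketched as follows. This is not the
code from the attached project: CursorHolder is a hypothetical name, and it is
written against AutoCloseable, which Ignite's QueryCursor extends, so the
sketch has no Ignite dependency.

```java
import java.util.Objects;
import java.util.concurrent.atomic.AtomicBoolean;

/** Holds a cursor-like resource for the lifetime of the client. */
class CursorHolder implements AutoCloseable {
    // Strong reference: the cursor stays reachable as long as the holder is.
    private final AutoCloseable cursor;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    CursorHolder(AutoCloseable cursor) {
        this.cursor = Objects.requireNonNull(cursor, "cursor");
    }

    boolean isClosed() {
        return closed.get();
    }

    /** Closes the underlying cursor at most once, even if called repeatedly. */
    @Override public void close() {
        if (closed.compareAndSet(false, true)) {
            try {
                cursor.close();
            }
            catch (Exception e) {
                throw new IllegalStateException("Failed to close cursor", e);
            }
        }
    }
}
```

A client could create the holder with the value returned by cache.query(qry),
keep it in a field, and call holder.close() from its stop method or from a
shutdown hook such as Runtime.getRuntime().addShutdownHook(new
Thread(holder::close)).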

The changed code is in
InitialQueryProject-1.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/t2757/InitialQueryProject-1.zip>  

regards,
Veena.




Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi Andrei,

Some more points:

1. Also, the Javadoc for QueryCursor says:
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/query/QueryCursor.html#getAll--

List<T> getAll()
Gets all query results and stores them in the collection. Use this method
when you know in advance that query result is relatively small and will not
cause memory utilization issues.
/Since all the results will be fetched, all the resources will be closed
automatically after this call, e.g. there is no need to call close() method
in this case.
/

We are closing the cursor as mentioned in my previous comment, but according
to this Javadoc, it should not be an issue even if the cursor is not closed,
since a call to the getAll method will close it.

Please let me know if we are missing something.

2. Some observations: on all the nodes where I got the wrong count, between
the time the client executes cache.query and cur.getAll, I see this log:
2021-01-15T15:37:15,167 INFO  o.a.i.s.c.t.TcpCommunicationSpi
[grid-nio-worker-tcp-comm-2-#62%InstanceName%]: Established outgoing
communication connection [locAddr=/a.b.c.153:50546,
rmtAddr=machinename003.cmc.local/a.b.c.202:47130 ]

I am not sure why it is trying to connect to another server between
cache.query and cur.getAll.

On the nodes where the count was right, I don't see that line.

3. Could this be linked to this issue here : 

http://apache-ignite-users.70518.x6.nabble.com/Regarding-Connection-timed-out-observed-during-client-startup-td35157.html










Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi Andrei,

Also, as I said, it is a trimmed-down version of the actual code. In the
actual code, the cursor is kept in an instance variable and closed.

I have modified the client class accordingly. Please find it attached:
InitialQueryProject-1.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/t2757/InitialQueryProject-1.zip>  
regards,
Veena.




Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi Andrei,
>>From your source code, I can see that the QueryCursor was created inside
startContinuousQuery. Then you return it from that method, but don't store
it in any variable:

>>continuousQueryHelper.startContinuousQuery(ignite,qry,testCache,subscribeParameters.callBackList,outputFieldsList);

I don't need to store the cursor returned from startContinuousQuery, because
the cursor is read within startContinuousQuery.

Please find below the code within startContinuousQuery in the
ContinuousQueryHelper class. The cursor is stored in a variable and read
through in the loop.

        // Create new continuous query.
        QueryCursor<Cache.Entry<BinaryObject, BinaryObject>> cur = cache.query(qry);
        LOGGER.info(
            "PROJECT LISTENS: >>> Cache continuous query started successfully. Start Initial Query...{}",
            qry.getInitialQuery());

        // Iterate through existing data.
        // Please note there is a strange behaviour here. The cursor is declared
        // as an iterator of Cache.Entry<BinaryObject, BinaryObject>, but each
        // element is actually not a Cache.Entry<BinaryObject, BinaryObject>.
        // Each element is a List of the fields in a row (i.e. it matches what
        // a SqlFieldsQuery is supposed to return from getAll()).
        List addedOrModifiedList = new ArrayList<>();
        List removedList = new ArrayList<>();

        int count = 0;
        for (Object e : cur.getAll()) {
            BinaryObject binaryObject =
                createBinaryObject((List<?>) e, outputFieldsList, ignite, "TEST");
            ++count;
            addedOrModifiedList.add(binaryObject);
            LOGGER.debug("PROJECT LISTENS: Initial entry binaryObject : {} ", binaryObject);
        }

I am not sure how it can be garbage collected before it is read. 

regards,
Veena.





Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by andrei <ae...@gmail.com>.
Hi,

 From your source code, I can see that the QueryCursor was created 
inside startContinuousQuery. Then you return it from that method, but 
don't store it in any variable:

continuousQueryHelper.startContinuousQuery(ignite,qry,testCache,subscribeParameters.callBackList,outputFieldsList);

for (int i = 0; i < 100000; i++) {
    LOGGER.info("ListenerClient running... i=" + i);
    try {
        Thread.sleep(20000);
    } catch (Exception e) {
        // Ignore this block
    }
}

My guess is that your QueryCursor instance can be collected by the GC
within those 20 seconds, and I think that in this case your query can
be canceled. As a result, you see only part of the updates.

Could you please rework your code according to this example:

https://github.com/gridgain/gridgain/blob/master/examples/src/main/java/org/apache/ignite/examples/datagrid/CacheContinuousQueryExample.java

And try one more time.

BR,
Andrei

On 1/20/2021 1:36 PM, VeenaMithare wrote:
> Hi Andrei,
>
> Please find a trimmed down version of the code attached.
> Have a look at the ContinuousQueryHelper -> setInitialQuery to check how the
> initial query has been coded.
>
> regards,
> Veena.
> InitialQueryProject.zip
> <http://apache-ignite-users.70518.x6.nabble.com/file/t2757/InitialQueryProject.zip>

Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi Andrei,

Please find a trimmed down version of the code attached. 
Have a look at the ContinuousQueryHelper -> setInitialQuery to check how the
initial query has been coded.

regards,
Veena.
InitialQueryProject.zip
<http://apache-ignite-users.70518.x6.nabble.com/file/t2757/InitialQueryProject.zip>  




Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi Andrei,

The initial query (a SQL query) is failing, not the continuous updates.

Also, we have had no issues with the initial query so far (for the past year
or so). I can share a reproducer if that will help.

regards,
Veena.






Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by andrei <ae...@gmail.com>.
Hi,

Could you please provide your client's code where you are listening to 
continuous query? Probably something is wrong there.

BR,
Andrei

On 1/19/2021 4:13 PM, VeenaMithare wrote:
> Hi ,
>
> It happened a couple of times on Friday ( 15th ) . Today I started a dummy
> client with a continuous query against the same client multiple times
> against the same cluster - and I am not able to reproduce it. This cache
> contains 58 records , and I see all the 58 records being recorded in the
> initial query .
>
> I have so far looked into the below :
> 1. Baseline topology of the cluster on 15th - It looks okay . The topology
> log shows servers as 3 all the time.i.e The cluster was not segmented.
> 2. Another client which has a continuous query against the same cache -
> seemed to fetch only 40 records as against 58 records on friday.
> 3. I have checked the cache entry distribution for this cache yesterday /
> today and the primary entries for the  58 records seem to be distributed as
> 17, 18, 23 on the 3 server nodes. However I dont know what the distribution
> was on friday.
>
> How do I debug the below  :
> 2. Did the replication between the servers not work ? Is it possible that
> one of the servers had the wrong value ? How do I debug this ? Please note
> when I connect to each of the 3 servers using dbeaver, they all return the
> correct row right now.( Since I got to know of this issue after 2 days, I am
> not sure what was the row value at the time of this issue )
>
> 3. How do I know which server( out of the 3 servers ) was the client's
> continuous query -> initial query run on
>
> regards,
> Veena.

Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
Hi,

It happened a couple of times on Friday (15th). Today I started a dummy
client with a continuous query multiple times against the same cluster, and I
am not able to reproduce it. This cache contains 58 records, and I see all 58
records being returned by the initial query.

I have so far looked into the below:
1. Baseline topology of the cluster on the 15th: it looks okay. The topology
log shows 3 servers all the time, i.e. the cluster was not segmented.
2. Another client, which has a continuous query against the same cache,
seemed to fetch only 40 records instead of 58 on Friday.
3. I have checked the cache entry distribution for this cache yesterday and
today, and the primary entries for the 58 records seem to be distributed as
17, 18, 23 on the 3 server nodes. However, I don't know what the distribution
was on Friday.

How do I debug the below (points 2 and 3 from my first post):
2. Did the replication between the servers fail? Is it possible that one of
the servers had the wrong value? How do I debug this? Please note that when I
connect to each of the 3 servers using DBeaver, they all return the correct
row right now. (Since I got to know of this issue 2 days later, I am not sure
what the row value was at the time of the issue.)

3. How do I know which of the 3 servers the client's continuous query ->
initial query ran on?

regards,
Veena.







Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Does this happen if you just run this SQL query during client start-up? Can
you reproduce this by bringing nodes down and back up?

Regards,
-- 
Ilya Kasnacheev


On Tue, 19 Jan 2021 at 15:11, VeenaMithare <v....@cmcmarkets.com> wrote:

> No . Its a SQL query.

Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by VeenaMithare <v....@cmcmarkets.com>.
No. It's a SQL query.




Re: 2.8.1 : Continuous Query Initial query not returning any result sometimes

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

What kind of initial query do you have? Is it a scan query?

Regards,
-- 
Ilya Kasnacheev


On Mon, 18 Jan 2021 at 19:44, VeenaMithare <v....@cmcmarkets.com> wrote:

> Hi Team,
>
> Our env :
> Servers : 3
> Cache Configuration : REPLICATED
>
> In one of our clients, the continuous query - Initial query did not return
> any result during client start whereas it was expected to return a single
> row. This happened for about 3-4 times over a period of 2 hours. It did
> return the right result after a couple of restarts.
>
> I am trying to debug what happened when it did not return the result . My
> suspicions are as below :
>
> 1. Last week we changed the  client communication timeout as below as per
> the discussion here (
>
> http://apache-ignite-users.70518.x6.nabble.com/IgniteSpiOperationTimeoutException-Operation-timed-out-timeoutStrategy-ExponentialBackoffTimeoutStray-tp34196p34377.html
> ). Server was already on the below timeout values.
>
>  :
>
>  <bean
> class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi"
> scope="prototype">
>                 <property name="connectTimeout" value="5000"/>
>                 <property name="maxConnectTimeout" value="10000"/>
>
>
> Could it be possible that TcpCommunicationSpi timed out before the results
> were seen on the client side  ?
>
> 2. Did the replication between the servers not work ? Is it possible that
> one of the servers had the wrong value ? How do I debug this ? Please note
> when I connect to each of the 3 servers using dbeaver, they all return the
> correct row right now.( Since I got to know of this issue after 2 days, I
> am
> not sure what was the row value at the time of this issue )
>
> 3. How do I know which server( out of the 3 servers ) was the client's
> continuous query -> initial query run on ?
>
> How do I debug this further ?
>
> regards,
> Veena.