You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@geode.apache.org by "Thacker, Dharam" <dh...@jpmorgan.com> on 2018/11/21 16:52:52 UTC

RE: Inconsistent behaviour with functional indexes

Thank you Jason & Anil!

[+Corrected typo in email subject]

As per your suggestions, I tried to redefine all regions and indexes as shown below with fresh cluster configuration. That helped to increase performance and it was even improving on subsequent calls to same query.

Though I still could not understand index picking strategy, as ideally it should have used equi join index expressions. As shown below, it’s using indexoption_region_indx.

[info 2018/11/21 22:03:41.630 IST AServer <Function Execution Processor2> tid=0x5e] Query Executed in 159.0429 ms; rowCount = 430; indexesUsed(2):indexoption_region_indx(Results: 430),ga_dealref_indx(Results: 1) "select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and i.expiryDate = '2018-11-21' and i.agentCity = 'EMEA' LIMIT 5000"

[info 2018/11/21 22:04:00.664 IST AServer <Function Execution Processor2> tid=0x5e] Query Executed in 68.54671 ms; rowCount = 430; indexesUsed(2):indexoption_region_indx(Results: 430),ga_dealref_indx(Results: 1) "select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and i.expiryDate = '2018-11-21' and i.agentCity = 'EMEA' LIMIT 5000"

[info 2018/11/21 22:04:18.224 IST AServer <Function Execution Processor2> tid=0x5e] Query Executed in 47.1064 ms; rowCount = 430; indexesUsed(2):indexoption_region_indx(Results: 430),ga_dealref_indx(Results: 1) "select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and i.expiryDate = '2018-11-21' and i.agentCity = 'EMEA' LIMIT 5000"

[info 2018/11/21 22:04:48.360 IST AServer <Function Execution Processor2> tid=0x5e] Query Executed in 22.3961 ms; rowCount = 229; indexesUsed(2):indexoption_region_indx(Results: 229),ga_dealref_indx(Results: 1) "select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and i.expiryDate = '2018-11-21' and i.agentCity = 'NA' LIMIT 5000"

[cid:image001.png@01D481E7.11A0DB10]

More attempting to give hints resulted into “NullPointerException” for me as shown below with same above queries.

<HINT 'indexoption_expirydt_indx','ga_dealref_indx'>select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and i.expiryDate = '2018-11-21' and i.agentCity = 'NA' LIMIT 5000

Exception:


[warning 2018/11/21 22:13:15.950 IST StarfishServer <Function Execution Processor1> tid=0x5f] java.lang.NullPointerException
org.apache.geode.cache.execute.FunctionException: java.lang.NullPointerException
                at org.apache.geode.management.internal.beans.QueryDataFunction$LocalQueryFunction.execute(QueryDataFunction.java:545)
                at org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionLocally(AbstractExecution.java:333)
                at org.apache.geode.internal.cache.execute.AbstractExecution$2.run(AbstractExecution.java:302)
                at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
                at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
                at org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
                at org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
                at org.apache.geode.distributed.internal.ClusterDistributionManager$9$1.run(ClusterDistributionManager.java:990)
                at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.NullPointerException
                at org.apache.geode.cache.query.internal.IndexInfo.evaluateIndexKey(IndexInfo.java:55)
                at org.apache.geode.cache.query.internal.CompiledComparison.getSizeEstimate(CompiledComparison.java:301)
                at org.apache.geode.cache.query.internal.AbstractGroupOrRangeJunction.getCondtionsSortedOnIncreasingEstimatedIndexResultSize(AbstractGroupOrRangeJunction.java:226)
                at org.apache.geode.cache.query.internal.AbstractGroupOrRangeJunction.auxFilterEvaluate(AbstractGroupOrRangeJunction.java:257)
                at org.apache.geode.cache.query.internal.AbstractGroupOrRangeJunction.filterEvaluate(AbstractGroupOrRangeJunction.java:169)
                at org.apache.geode.cache.query.internal.CompiledJunction.filterEvaluate(CompiledJunction.java:187)
                at org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:537)
                at org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:55)
                at org.apache.geode.cache.query.internal.DefaultQuery.executeUsingContext(DefaultQuery.java:443)
                at org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:270)
                at org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:203)
                at org.apache.geode.management.internal.beans.QueryDataFunction$LocalQueryFunction.execute(QueryDataFunction.java:540)
                ... 8 more


Thanks & Regards,
Dharam

From: Jason Huynh [mailto:jhuynh@pivotal.io]
Sent: Friday, November 16, 2018 11:55 PM
To: user@geode.apache.org
Subject: Re: Incosistent behaviour with functional indexes


The query tracing might have some bugs around it, I would guess this is the case if you are getting the expected/correct results from the query itself..

The second case seems like a good fit for the join optimizations we made a while back.  I am not sure if the toString() is causing the optimization not to take effect or not.  We had only tested for leaf nodes being of an identifier/field type and not necessarily a method invocation.  Maybe try using c.dealRefId = i.dealRefId and see if that helps?



On Fri, Nov 16, 2018 at 10:16 AM Anilkumar Gingade <ag...@pivotal.io>> wrote:
Hi Dharam,

Here is what i think may be happening, please note i haven't tried it on my side.

In Query 1, its using index "group_name_indx_1" as a input and doing index lookup using "indexoption_dealref_1", reason why it only shows results=632 with indexoption_dealref_1. I was expecting it to be 640 not sure why its 632.

In Query 2 case, its using the "indexoption_expirydt_indx" and ignoring the others; with "AND" condition, in certain (or all) cases, query uses only one index as it is found to be faster.
Have you tried using hint in this case? If its not working there could be some issue with hint; please hint is just a suggestion to query engine.

Can you try changing the order of the condition to,
where c.dealRefId.toString() = i.dealRefId.toString() and i.expiryDate = '2018-11-21'
With and without hint.

-Anil.






On Fri, Nov 16, 2018 at 5:19 AM Thacker, Dharam <dh...@jpmorgan.com>> wrote:
Missed to clarify few things,


1.       I am using GEODE VERSION 1.6.0

2.       expiryDate in below query is also a ISO string only (YYYY-MM-DD) format

Thanks,
Dharam

From: Thacker, Dharam
Sent: Friday, November 16, 2018 6:36 PM
To: user@geode.apache.org<ma...@geode.apache.org>
Subject: Incosistent behaviour with functional indexes

Hi Team,

I am seeing inconsistent behavior in usage of indexes in GEODE OQL queries.

Could someone help me to validate if there is any known issue/new issue?

gfsh>list indexes
Error! Filename not specified.


[Here I see 2 indexes being used correctly but I am not sure why group_name_indx_1 has 0 results]

[info 2018/11/16 18:05:44.092 IST StarfishServer <Function Execution Processor1> tid=0x5b] Query Executed in 26.6535 ms; rowCount = 640; indexesUsed(2):group_name_indx_1(Results: 0),indexoption_dealref_1(Results: 632) "select c as assignment,i.cptySpn,i.cptyName,i.bookName from /GroupAssignment c,/IndexOption i where c.dealRefId.toString() = i.dealRefId.toString() LIMIT 1000"


[Here it takes > 3 seconds even though I have index on expiryDate and dealRef attribute in both regions]

[info 2018/11/16 18:07:14.632 IST StarfishServer <Function Execution Processor1> tid=0x5b] Query Executed in 3840.7922 ms; rowCount = 640; indexesUsed(1):indexoption_expirydt_indx(Results: 640) "select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate from /GroupAssignment c,/IndexOption i where i.expiryDate = '2018-11-21' and c.dealRefId.toString() = i.dealRefId.toString()  LIMIT 1000"

[Even though I give hints to query, it’s not using it well. I even tried to tweak query but does not look good yet]

[info 2018/11/16 18:17:44.236 IST StarfishServer <Function Execution Processor1> tid=0x5b] Query Executed in 325.6136 ms; rowCount = 229; indexesUsed(1):indexoption_region_indx(Results: 229) "<HINT 'indexoption_dealref_indx', 'indexoption_expirydt_indx','indexoption_cptySpn_indx','ga_dealref_indx'>select c as assignment, r.bookName from /GroupAssignment c,(select i.dealRefId,i.bookName,i.cptySpn,i.cptyName from /IndexOption i where i.expiryDate = '2018-11-21' and i.agentCity = 'NA') r where c.dealRefId = r.dealRefId"


Thanks,
Dharam

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer<https://www.jpmorgan.com/emaildisclaimer> including on confidentiality, legal privilege, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer<https://www.jpmorgan.com/emaildisclaimer> including on confidentiality, legal privilege, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

This message is confidential and subject to terms at: https://www.jpmorgan.com/emaildisclaimer including on confidentiality, legal privilege, viruses and monitoring of electronic messages. If you are not the intended recipient, please delete this message and notify the sender immediately. Any unauthorized use is strictly prohibited.

Re: Inconsistent behaviour with functional indexes

Posted by Anilkumar Gingade <ag...@pivotal.io>.
Hi Dharam,

Good to hear that the performance issue got resolved.

Can you please create a Geode ticket for NPE.

About use of Index at join level; I think its working as expected, applying
the region level index at the leaf level to filter out the results and then
applying join condition, using index look up.

-Anil.



The query engine has minimalistic






On Wed, Nov 21, 2018 at 8:53 AM Thacker, Dharam <dh...@jpmorgan.com>
wrote:

> Thank you Jason & Anil!
>
>
>
> *[+Corrected typo in email subject]*
>
>
>
> As per your suggestions, I tried to redefine all regions and indexes as
> shown below with fresh cluster configuration. That helped to increase
> performance and it was even improving on subsequent calls to same query.
>
>
>
> Though I still could not understand index picking strategy, as ideally it
> should have used equi join index expressions. As shown below, it’s using
> indexoption_region_indx.
>
>
>
> [info 2018/11/21 22:03:41.630 IST AServer <Function Execution Processor2>
> tid=0x5e] Query Executed in 159.0429 ms; rowCount = 430;
> indexesUsed(2):indexoption_region_indx(Results:
> 430),ga_dealref_indx(Results: 1) "select c as
> assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from
> /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and
> i.expiryDate = '2018-11-21' and i.agentCity = 'EMEA' LIMIT 5000"
>
>
>
> [info 2018/11/21 22:04:00.664 IST AServer <Function Execution Processor2>
> tid=0x5e] Query Executed in 68.54671 ms; rowCount = 430;
> indexesUsed(2):indexoption_region_indx(Results:
> 430),ga_dealref_indx(Results: 1) "select c as
> assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from
> /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and
> i.expiryDate = '2018-11-21' and i.agentCity = 'EMEA' LIMIT 5000"
>
>
>
> [info 2018/11/21 22:04:18.224 IST AServer <Function Execution Processor2>
> tid=0x5e] Query Executed in 47.1064 ms; rowCount = 430;
> indexesUsed(2):indexoption_region_indx(Results:
> 430),ga_dealref_indx(Results: 1) "select c as
> assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from
> /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and
> i.expiryDate = '2018-11-21' and i.agentCity = 'EMEA' LIMIT 5000"
>
>
>
> [info 2018/11/21 22:04:48.360 IST AServer <Function Execution Processor2>
> tid=0x5e] Query Executed in 22.3961 ms; rowCount = 229;
> indexesUsed(2):indexoption_region_indx(Results:
> 229),ga_dealref_indx(Results: 1) "select c as
> assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from
> /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and
> i.expiryDate = '2018-11-21' and i.agentCity = 'NA' LIMIT 5000"
>
>
>
>
>
> More attempting to give hints resulted into “NullPointerException” for me
> as shown below with same above queries.
>
>
>
> <HINT 'indexoption_expirydt_indx','ga_dealref_indx'>select c as
> assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate,i.agentCity from
> /GroupAssignment c,/IndexOption i where c.dealRefId = i.dealRefId and
> i.expiryDate = '2018-11-21' and i.agentCity = 'NA' LIMIT 5000
>
>
>
> *Exception:*
>
>
>
>
>
> [warning 2018/11/21 22:13:15.950 IST StarfishServer <Function Execution
> Processor1> tid=0x5f] java.lang.NullPointerException
>
> org.apache.geode.cache.execute.FunctionException:
> java.lang.NullPointerException
>
>                 at
> org.apache.geode.management.internal.beans.QueryDataFunction$LocalQueryFunction.execute(QueryDataFunction.java:545)
>
>                 at
> org.apache.geode.internal.cache.execute.AbstractExecution.executeFunctionLocally(AbstractExecution.java:333)
>
>                 at
> org.apache.geode.internal.cache.execute.AbstractExecution$2.run(AbstractExecution.java:302)
>
>                 at
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>
>                 at
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
>                 at
> org.apache.geode.distributed.internal.ClusterDistributionManager.runUntilShutdown(ClusterDistributionManager.java:1121)
>
>                 at
> org.apache.geode.distributed.internal.ClusterDistributionManager.access$000(ClusterDistributionManager.java:109)
>
>                 at
> org.apache.geode.distributed.internal.ClusterDistributionManager$9$1.run(ClusterDistributionManager.java:990)
>
>                 at java.lang.Thread.run(Thread.java:748)
>
> Caused by: java.lang.NullPointerException
>
>                 at
> org.apache.geode.cache.query.internal.IndexInfo.evaluateIndexKey(IndexInfo.java:55)
>
>                 at
> org.apache.geode.cache.query.internal.CompiledComparison.getSizeEstimate(CompiledComparison.java:301)
>
>                 at
> org.apache.geode.cache.query.internal.AbstractGroupOrRangeJunction.getCondtionsSortedOnIncreasingEstimatedIndexResultSize(AbstractGroupOrRangeJunction.java:226)
>
>                 at
> org.apache.geode.cache.query.internal.AbstractGroupOrRangeJunction.auxFilterEvaluate(AbstractGroupOrRangeJunction.java:257)
>
>                 at
> org.apache.geode.cache.query.internal.AbstractGroupOrRangeJunction.filterEvaluate(AbstractGroupOrRangeJunction.java:169)
>
>                 at
> org.apache.geode.cache.query.internal.CompiledJunction.filterEvaluate(CompiledJunction.java:187)
>
>                 at
> org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:537)
>
>                 at
> org.apache.geode.cache.query.internal.CompiledSelect.evaluate(CompiledSelect.java:55)
>
>                 at
> org.apache.geode.cache.query.internal.DefaultQuery.executeUsingContext(DefaultQuery.java:443)
>
>                 at
> org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:270)
>
>                 at
> org.apache.geode.cache.query.internal.DefaultQuery.execute(DefaultQuery.java:203)
>
>                 at
> org.apache.geode.management.internal.beans.QueryDataFunction$LocalQueryFunction.execute(QueryDataFunction.java:540)
>
>                 ... 8 more
>
>
>
>
>
> Thanks & Regards,
>
> Dharam
>
>
>
> *From:* Jason Huynh [mailto:jhuynh@pivotal.io]
> *Sent:* Friday, November 16, 2018 11:55 PM
> *To:* user@geode.apache.org
> *Subject:* Re: Incosistent behaviour with functional indexes
>
>
>
>
>
> The query tracing might have some bugs around it, I would guess this is
> the case if you are getting the expected/correct results from the query
> itself..
>
>
>
> The second case seems like a good fit for the join optimizations we made a
> while back.  I am not sure if the toString() is causing the optimization
> not to take effect or not.  We had only tested for leaf nodes being of an
> identifier/field type and not necessarily a method invocation.  Maybe try
> using c.dealRefId = i.dealRefId and see if that helps?
>
>
>
>
>
>
>
> On Fri, Nov 16, 2018 at 10:16 AM Anilkumar Gingade <ag...@pivotal.io>
> wrote:
>
> Hi Dharam,
>
>
>
> Here is what i think may be happening, please note i haven't tried it on
> my side.
>
>
>
> In Query 1, its using index "group_name_indx_1" as a input and doing index
> lookup using "indexoption_dealref_1", reason why it only shows results=632
> with indexoption_dealref_1. I was expecting it to be 640 not sure why its
> 632.
>
>
>
> In Query 2 case, its using the "indexoption_expirydt_indx" and ignoring
> the others; with "AND" condition, in certain (or all) cases, query uses
> only one index as it is found to be faster.
>
> Have you tried using hint in this case? If its not working there could be
> some issue with hint; please hint is just a suggestion to query engine.
>
>
>
> Can you try changing the order of the condition to,
>
> where c.dealRefId.toString() = i.dealRefId.toString() and i.expiryDate =
> '2018-11-21'
>
> With and without hint.
>
>
>
> -Anil.
>
>
>
>
>
>
>
>
>
>
>
>
>
> On Fri, Nov 16, 2018 at 5:19 AM Thacker, Dharam <
> dharam.thacker@jpmorgan.com> wrote:
>
> Missed to clarify few things,
>
>
>
> 1.       I am using GEODE VERSION 1.6.0
>
> 2.       expiryDate in below query is also a ISO string only (YYYY-MM-DD)
> format
>
>
>
> Thanks,
>
> Dharam
>
>
>
> *From:* Thacker, Dharam
> *Sent:* Friday, November 16, 2018 6:36 PM
> *To:* user@geode.apache.org
> *Subject:* Incosistent behaviour with functional indexes
>
>
>
> Hi Team,
>
>
>
> I am seeing inconsistent behavior in usage of indexes in GEODE OQL queries.
>
>
>
> Could someone help me to validate if there is any known issue/new issue?
>
>
>
> gfsh>list indexes
>
> *Error! Filename not specified.*
>
>
>
>
>
> [Here I see 2 indexes being used correctly but I am not sure why
> group_name_indx_1 has 0 results]
>
>
>
> [info 2018/11/16 18:05:44.092 IST StarfishServer <Function Execution
> Processor1> tid=0x5b] Query Executed in 26.6535 ms; rowCount = 640; indexesUsed(2):group_name_indx_1(Results:
> 0),indexoption_dealref_1(Results: 632) "select c as
> assignment,i.cptySpn,i.cptyName,i.bookName from /GroupAssignment
> c,/IndexOption i where c.dealRefId.toString() = i.dealRefId.toString()
> LIMIT 1000"
>
>
>
>
>
> [Here it takes > 3 seconds even though I have index on expiryDate and
> dealRef attribute in both regions]
>
>
>
> [info 2018/11/16 18:07:14.632 IST StarfishServer <Function Execution
> Processor1> tid=0x5b] Query Executed in 3840.7922 ms; rowCount = 640; indexesUsed(1):indexoption_expirydt_indx(Results:
> 640) "select c as assignment,i.cptySpn,i.cptyName,i.bookName,i.expiryDate
> from /GroupAssignment c,/IndexOption i where i.expiryDate = '2018-11-21'
> and c.dealRefId.toString() = i.dealRefId.toString()  LIMIT 1000"
>
>
>
> [Even though I give hints to query, it’s not using it well. I even tried
> to tweak query but does not look good yet]
>
>
>
> [info 2018/11/16 18:17:44.236 IST StarfishServer <Function Execution
> Processor1> tid=0x5b] Query Executed in 325.6136 ms; rowCount = 229;
> indexesUsed(1):indexoption_region_indx(Results: 229) "<HINT
> 'indexoption_dealref_indx',
> 'indexoption_expirydt_indx','indexoption_cptySpn_indx','ga_dealref_indx'>select
> c as assignment, r.bookName from /GroupAssignment c,(select
> i.dealRefId,i.bookName,i.cptySpn,i.cptyName from /IndexOption i where
> i.expiryDate = '2018-11-21' and i.agentCity = 'NA') r where c.dealRefId =
> r.dealRefId"
>
>
>
>
>
> Thanks,
>
> Dharam
>
> This message is confidential and subject to terms at: https://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>
> This message is confidential and subject to terms at: https://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>
> This message is confidential and subject to terms at: https://
> www.jpmorgan.com/emaildisclaimer including on confidentiality, legal
> privilege, viruses and monitoring of electronic messages. If you are not
> the intended recipient, please delete this message and notify the sender
> immediately. Any unauthorized use is strictly prohibited.
>