You are viewing a plain text version of this content. The canonical link for it is here.

Posted to user@ignite.apache.org by Naveen <na...@gmail.com> on 2018/03/29 05:57:51 UTC

Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Hi

I am using ignite 2.3 with native persistence layer as backing store

We do have close to half to 1 billion records in each of the tables.

There are some adhoc requirements to query the tables with diffrent where
conditions, columns which we use in where clause may not have indexes, which
may take time to execute the query, but it should slow the down or crash the
cluster. We are not using eviction, all our data is residing in RAM.

My question is - shall we have any means to run the queries directly on
persistence store instead of on RAM, so that whatever queries we run will
not impact cluster ??

Hope you understood my requirement ?

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Andrey Mashenkov <an...@gmail.com>.

Hi Naveen,

Ignite is memory centric database and I doubt it should run query on Native
persistent that is on DISK.
However, for now, Ignite can use Heap memory only for heavy queries.
It will be fixed in future versions via introducing memory region for SQL
queries
that will allow Ignite to use disk for very large queries as a trade off to
avoid OOM.

Looks like, Map queries fails due to OOM. It is a known issue for pre 2.3
versions [1] when Map queries fetch full result dataset.
Is it possible ASSOCIATED_PARTIES objects are too large or you have too low
heap size?
Is it possible ResultSet or connections are not closed properly and there
is some leakage?

There should be enough heap memory on all nodes at least 1 page of query
results per query.
Page size is equals to 1024 by default. So, you need free memory for 1024
ASSOCIATED_PARTIES object per query.

[1] https://issues.apache.org/jira/browse/IGNITE-5991

On Fri, Apr 20, 2018 at 9:42 AM, Naveen <na...@gmail.com> wrote:

> HI Andrey
>
> The reason I was trying to explore this feature is,
>
> I will give you an example. I have a cache with 20M records and when I run
> this query
> SELECT * FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES
>
> Query took more than 200 secs and ran out of memory, here is the error
> thrown
> This I have tried with lary=true, guess syntax and usage is correct.
> jdbc:ignite:thin://10.144.114.113?lazy=true
>
> SQL Error [50000]: javax.cache.CacheException: Failed to run map query
> remotely.Failed to execute map query on the node:
> ef5b4e7d-3423-4b84-8427-0491cd13f6c4, class
> org.apache.ignite.IgniteCheckedException:Failed to execute SQL query. Out
> of
> memory.; SQL statement:
> SELECT
> __Z0.ASSOCIATED_PARTY_ID __C0_0,
> __Z0.WALLETID __C0_1,
> __Z0.UPDATEDDATETIME __C0_2,
> __Z0.UPDATEDBY __C0_3,
> __Z0.PARTY_ID __C0_4
> FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES __Z0 [90108-195]
>   javax.cache.CacheException: Failed to run map query remotely.Failed to
> execute map query on the node: ef5b4e7d-3423-4b84-8427-0491cd13f6c4, class
> org.apache.ignite.IgniteCheckedException:Failed to execute SQL query. Out
> of
> memory.; SQL statement:
> SELECT
> __Z0.ASSOCIATED_PARTY_ID __C0_0,
> __Z0.WALLETID __C0_1,
> __Z0.UPDATEDDATETIME __C0_2,
> __Z0.UPDATEDBY __C0_3,
> __Z0.PARTY_ID __C0_4
> FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES __Z0 [90108-195]
>   javax.cache.CacheException: Failed to run map query remotely.Failed to
> execute map query on the node: ef5b4e7d-3423-4b84-8427-0491cd13f6c4, class
> org.apache.ignite.IgniteCheckedException:Failed to execute SQL query. Out
> of
> memory.; SQL statement:
> SELECT
> __Z0.ASSOCIATED_PARTY_ID __C0_0,
> __Z0.WALLETID __C0_1,
> __Z0.UPDATEDDATETIME __C0_2,
> __Z0.UPDATEDBY __C0_3,
> __Z0.PARTY_ID __C0_4
> FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES __Z0 [90108-195]
>
> In spite of, lazy=true, I got into this issue.
>
> The way I was looking at possible solution to overcome this issue is.
>
> Long running queries should not impact server RAM, Ignite should run query
> on Native persistent that is on DISK, the query which I am running should
> not bring cluster down, I am still fine if the query takes couple of
> minutes
> to execute, but ultimately should not disturb the cluster
>
> After this error, cluster stopped working, we had to restart the cluster
> again to make it work.
> I believe there are definitely ways to overcome this issue, we do have
> billion records in some of the tables, for some reason unknowingly runs
> query on those tables, it should not bring down the cluster abtruptly
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>



-- 
Best regards,
Andrey V. Mashenkov

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Naveen <na...@gmail.com>.

HI Andrey 

The reason I was trying to explore this feature is, 

I will give you an example. I have a cache with 20M records and when I run
this query 
SELECT * FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES

Query took more than 200 secs and ran out of memory, here is the error
thrown
This I have tried with lary=true, guess syntax and usage is correct. 
jdbc:ignite:thin://10.144.114.113?lazy=true

SQL Error [50000]: javax.cache.CacheException: Failed to run map query
remotely.Failed to execute map query on the node:
ef5b4e7d-3423-4b84-8427-0491cd13f6c4, class
org.apache.ignite.IgniteCheckedException:Failed to execute SQL query. Out of
memory.; SQL statement:
SELECT
__Z0.ASSOCIATED_PARTY_ID __C0_0,
__Z0.WALLETID __C0_1,
__Z0.UPDATEDDATETIME __C0_2,
__Z0.UPDATEDBY __C0_3,
__Z0.PARTY_ID __C0_4
FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES __Z0 [90108-195]
  javax.cache.CacheException: Failed to run map query remotely.Failed to
execute map query on the node: ef5b4e7d-3423-4b84-8427-0491cd13f6c4, class
org.apache.ignite.IgniteCheckedException:Failed to execute SQL query. Out of
memory.; SQL statement:
SELECT
__Z0.ASSOCIATED_PARTY_ID __C0_0,
__Z0.WALLETID __C0_1,
__Z0.UPDATEDDATETIME __C0_2,
__Z0.UPDATEDBY __C0_3,
__Z0.PARTY_ID __C0_4
FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES __Z0 [90108-195]
  javax.cache.CacheException: Failed to run map query remotely.Failed to
execute map query on the node: ef5b4e7d-3423-4b84-8427-0491cd13f6c4, class
org.apache.ignite.IgniteCheckedException:Failed to execute SQL query. Out of
memory.; SQL statement:
SELECT
__Z0.ASSOCIATED_PARTY_ID __C0_0,
__Z0.WALLETID __C0_1,
__Z0.UPDATEDDATETIME __C0_2,
__Z0.UPDATEDBY __C0_3,
__Z0.PARTY_ID __C0_4
FROM "AssociatedPartiesCache".ASSOCIATED_PARTIES __Z0 [90108-195]

In spite of, lazy=true, I got into this issue.

The way I was looking at possible solution to overcome this issue is. 

Long running queries should not impact server RAM, Ignite should run query
on Native persistent that is on DISK, the query which I am running should
not bring cluster down, I am still fine if the query takes couple of minutes
to execute, but ultimately should not disturb the cluster

After this error, cluster stopped working, we had to restart the cluster
again to make it work.
I believe there are definitely ways to overcome this issue, we do have
billion records in some of the tables, for some reason unknowingly runs
query on those tables, it should not bring down the cluster abtruptly

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Andrey Mashenkov <an...@gmail.com>.

Hi,

>So are you saying, query execution will not have any impact on the cluster
>activities like GET/PUTs in general ??
Why? Queries and get\put share same resources. So, they will affect each
other.

>I was thinking, if we can run these queries directly on backing store, it
>may not have any impact on RAM from where we always do GETs, will that
work?
When you use CacheStore, you have some external DB and you can run query on
DB bypassing Ignite. It is clear.

1. What "backing store" do you mean when Ignite persistence is used?
2. How are you imagine query will be run on "backing store" directly?
3.Why do you think it will help you as anyway disk pressure will be
increased? Moreover, disk is usually a bottleneck for PUT operations and
for GET in some cases (load from disk or TTL update).

On Tue, Apr 3, 2018 at 9:27 AM, Naveen <na...@gmail.com> wrote:

> Hi ANdrew
>
> There were cases, when I just run select * from table on SQLLINE
> unknowingly, we could see queries getting slowed down and OOM errors. Our
> dev machines not very high end ones.
>
> When we deliver this solution, production support guys can run queries to
> debug any data related issues, during which they may need to join tables,
> for every adhoc query we cant create indexes to speedup the query execution
> right ??
>
> So are you saying, query execution will not have any impact on the cluster
> activities like GET/PUTs in general ??
>
> I was thinking, if we can run these queries directly on backing store, it
> may not have any impact on RAM from where we always do GETs, will that
> work?
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>



-- 
Best regards,
Andrey V. Mashenkov

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Naveen <na...@gmail.com>.

Hi ANdrew

There were cases, when I just run select * from table on SQLLINE
unknowingly, we could see queries getting slowed down and OOM errors. Our
dev machines not very high end ones.

When we deliver this solution, production support guys can run queries to
debug any data related issues, during which they may need to join tables,
for every adhoc query we cant create indexes to speedup the query execution
right ??

So are you saying, query execution will not have any impact on the cluster
activities like GET/PUTs in general ??

I was thinking, if we can run these queries directly on backing store, it
may not have any impact on RAM from where we always do GETs, will that work?

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Andrey Mashenkov <an...@gmail.com>.

Hi,

>In this case also, since we dont have eviction in place, all the time data
>is retrieved from RAM only, the only time request goes to Oracle is for
>Upserts and delete.
Here, all queries run on data in RAM only. Ignite just propagate updates to
back store (to Oracle via CacheStore impl)
with keeping consistency guaranties according to CacheStore configuration.

>So if oracle DB is loaded heavily while running these queries also, it does
>not affect the cluster performance when it comes to data retrievals, am i
>right ??
Not quite, CacheStore affects performance due to either CacheStore updates
are synchronous or CacheStore buffer is limited.

With Ignite persistence you are not limited with RAM. PageMemory concept
allow you to query data that resides on disk.
What OOM do you mean Ignite or JVM, and on what side: client or server?
Why you think with same dataset OOM with persistence has higher probability
than with no persistence and no eviction?


On Mon, Apr 2, 2018 at 9:09 AM, Naveen <na...@gmail.com> wrote:

> HI
>
> Let me rephrase my question, guess I have conveyed my question correctly.
>
> Lets take an example
>
> Ignite cluster with backing store as RDBMS - oracle and no eviction in
> place.
>
> As we all know, we can run complex queries on Oracle to retrieve desired
> data.
> In this case also, since we dont have eviction in place, all the time data
> is retrieved from RAM only, the only time request goes to Oracle is for
> Upserts and delete.
> So if oracle DB is loaded heavily while running these queries also, it does
> not affect the cluster performance when it comes to data retrievals, am i
> right ??
> Inserts/Update/Deletes on DB may get slower if DB is loaded with these
> heavy
> queries.
>
> Similar way, how can we achieve this with ignite cluster with native
> persistence ?
> When native persistence is used, if we run some heavy queries on cluster
> thru SQLLINE, it may give out of memory error and node might crash also.
> How
> can we avoid this, it should run on backing store, I am fine if thge query
> exection takes longer time, but cluster should not get crashed.
>
> Hope I conveyed my requirements more clear this time.
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>



-- 
Best regards,
Andrey V. Mashenkov

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Naveen <na...@gmail.com>.

HI

Let me rephrase my question, guess I have conveyed my question correctly.

Lets take an example

Ignite cluster with backing store as RDBMS - oracle and no eviction in
place.

As we all know, we can run complex queries on Oracle to retrieve desired
data. 
In this case also, since we dont have eviction in place, all the time data
is retrieved from RAM only, the only time request goes to Oracle is for
Upserts and delete. 
So if oracle DB is loaded heavily while running these queries also, it does
not affect the cluster performance when it comes to data retrievals, am i
right ??
Inserts/Update/Deletes on DB may get slower if DB is loaded with these heavy
queries.

Similar way, how can we achieve this with ignite cluster with native
persistence ?
When native persistence is used, if we run some heavy queries on cluster
thru SQLLINE, it may give out of memory error and node might crash also. How
can we avoid this, it should run on backing store, I am fine if thge query
exection takes longer time, but cluster should not get crashed.

Hope I conveyed my requirements more clear this time.

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Andrey Mashenkov <an...@gmail.com>.

Hi Naveen,

Ignite implements page memory concept since version 2.0 has been released.
This means, Ignite reads page from disk if it is not loaded to memory and
evict pages to disk to free memory.
Page load and eviction work automatically. So, I can't understand what
"feature like eviction" and  "hint to go to backing store" do you mean.

Join performance issue usually caused by lack of H2 optimizer which
sometimes make full table scan (or index scan with low selectivity) and
then join other tables to first one.
You can always try to change table join order to reduce dataset that will
be scanned. EnforceJoinOrder flag is designed to helps you with this by
disabling H2 join order optimization.

If join order changes is not helpful and you always face slow scan with low
CPU usage you can try to increase query parallelism [1].
QueryParallelism value is equals to number of parts cache indices will be
splitted to. This works similar way as adding new nodesand allow you to
utilize more CPU on every node
by running query in several threads (equals to the option value) in
parallel. Each thread will work with lesser dataset that should reduce
query execution time.

Of course there is a limitation, you can't join tables from
(Partitioned)  caches
with different parallelizm level. Replicated cache parallelizm level can be
>1 which is default.
Another issue related to this option is simple queries latency can be
increased as heavy query speed trade off.

[1]
https://apacheignite.readme.io/v2.0/docs/sql-performance-and-debugging#query-parallelism

On Fri, Mar 30, 2018 at 8:14 AM, Naveen <na...@gmail.com> wrote:

> Lazy is something which we can use when we are expecting huge result set to
> minimize the memory consumption at the cost of little performance hit.
> However my requirement to run a query by joining couple of tables with
> complex where clause to debug some data related issues, this query may not
> return huge result set, but scan the the complete table to get the result
> set, during this time it should not hit the performance.
>
> So my question here is, can we try applying the same feature like eviction,
> in case of eviction, if the record is not there in RAM, it will go to
> backing store and gets the data and also load it to memory. Similar way,
> can
> we hint the cluster to run the query on backing store instead of RAM, so
> that that the impact on RAM would be less if you are running the query on
> backing store.
>
> In our case, we have 2 TB servers with 128 core machine, very high end
> machines, RAM is utilized upto 60 to 70%, however CPU resources are not
> utilized.
>
> Hope I was able to convey my adhoc requirements to run various SQL queries
> to debug data related issues and consequences of these queries on the
> performance on the cluster
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

-- 
Best regards,
Andrey V. Mashenkov

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Naveen <na...@gmail.com>.

Lazy is something which we can use when we are expecting huge result set to
minimize the memory consumption at the cost of little performance hit.
However my requirement to run a query by joining couple of tables with
complex where clause to debug some data related issues, this query may not
return huge result set, but scan the the complete table to get the result
set, during this time it should not hit the performance. 

So my question here is, can we try applying the same feature like eviction,
in case of eviction, if the record is not there in RAM, it will go to
backing store and gets the data and also load it to memory. Similar way, can
we hint the cluster to run the query on backing store instead of RAM, so
that that the impact on RAM would be less if you are running the query on
backing store.

In our case, we have 2 TB servers with 128 core machine, very high end
machines, RAM is utilized upto 60 to 70%, however CPU resources are not
utilized. 

Hope I was able to convey my adhoc requirements to run various SQL queries
to debug data related issues and consequences of these queries on the
performance on the cluster

Thanks
Naveen



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Running heavy queries on Ignite cluster on backing store directly without impacting the cluster

Posted by Andrey Mashenkov <an...@gmail.com>.

Hi Naveen,

You can try a 'lazy' flag for query. It is available from ignite-2.4 that
has been released recently.
See SqlFieldQuery javadoc [1] and JDBC doc [2] for details.

[1]
https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/query/SqlFieldsQuery.html#setLazy-boolean-
[2] https://apacheignite-sql.readme.io/docs/jdbc-client-driver


On Thu, Mar 29, 2018 at 8:57 AM, Naveen <na...@gmail.com> wrote:

> Hi
>
> I am using ignite 2.3 with native persistence layer as backing store
>
> We do have close to half to 1 billion records in each of the tables.
>
> There are some adhoc requirements to query the tables with diffrent where
> conditions, columns which we use in where clause may not have indexes,
> which
> may take time to execute the query, but it should slow the down or crash
> the
> cluster. We are not using eviction, all our data is residing in RAM.
>
> My question is - shall we have any means to run the queries directly on
> persistence store instead of on RAM, so that whatever queries we run will
> not impact cluster ??
>
> Hope you understood my requirement ?
>
> Thanks
> Naveen
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>



-- 
Best regards,
Andrey V. Mashenkov