Posted to user@ignite.apache.org by seiferma <se...@fzi.de> on 2019/05/28 09:09:10 UTC

Efficiency of key queries

Dear all,

we are evaluating techniques for finding all keys contained in a cache. So
far, we have tried the simplest approaches mentioned on this mailing list,
such as iterating over all cache entries or creating a SqlFieldsQuery that
selects the _key column. While the results are correct, the performance is
unsatisfactory if some cache content is not held in memory.

Our setup is as follows: We have a small Spring Boot application that is
running an embedded Ignite instance. We manually activated the cluster
consisting of this single node. We create all caches in partitioned mode,
so we can modify indexed fields at runtime by issuing SQL statements. We
are aware that partitioned mode is pointless with a single node, but it
should not do any harm. We limited the available heap memory of the
application to one gigabyte and pushed more data to the cache than the heap
memory can hold. Therefore, some cache entries are written to the hard
drive. In addition, we enabled persistent storage to make the cache survive
restarts.

What we see is that both the query and iterating over the cache entries take
a long time to complete. While running the queries, we saw considerable load
on the hard drive. This makes sense for iterating over cache entries but not
so much for the query. We expected that the query just asks the database for
the content of the (indexed?) _key column, which should not require loading
the whole entity. This request should be pretty fast even if most of the
entities are not available in the heap. With less data, the requests speed
up by more than the smaller data volume alone would explain.

Could you give us some hints about what we could have done wrong or how we
could retrieve all keys used in a cache in a more efficient way? If you need
more information, I am keen to provide it.

Best regards
Stephan



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Efficiency of key queries

Posted by aealexsandrov <ae...@gmail.com>.
Hi,

In the case of a single node, I don't think that you can speed up this
process. But you can scale your performance by adding new nodes.

This can be done using compute tasks together with local SQL queries or
local cache operations. That way you can run part of your logic on every
data node and send only the results to the client.

You can read more about compute tasks here:

https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/compute/ComputeTask.html

An example is available here:

https://github.com/apache/ignite/blob/master/examples/src/main/java/org/apache/ignite/examples/computegrid/ComputeTaskMapExample.java

Inside a compute task, you can run a local SqlFieldsQuery by setting this flag:

https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/cache/query/SqlFieldsQuery.html#setLocal-boolean-

Local cache operations are available through methods such as:

https://ignite.apache.org/releases/latest/javadoc/org/apache/ignite/IgniteCache.html#localPeek-K-org.apache.ignite.cache.CachePeekMode...-
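
The scatter/gather idea described above can be sketched in plain Java. This
is only a model under stated assumptions: the class LocalKeyScan, its
methods, and the in-memory maps standing in for per-node partitions are
hypothetical, and no Ignite classes are used. In a real cluster, the per-node
step would presumably be a compute job running a local query or a local cache
operation, as in the links above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Plain-Java model of the scatter/gather idea; not Ignite API.
class LocalKeyScan {

    // Hypothetical stand-in for one data node: the job sees only its own partition.
    static Callable<Set<Integer>> localKeysJob(Map<Integer, String> localPartition) {
        return () -> new TreeSet<>(localPartition.keySet());
    }

    // Runs one job per "node" in parallel and merges the partial key sets.
    static Set<Integer> collectAllKeys(List<Map<Integer, String>> partitions) {
        ExecutorService pool =
                Executors.newFixedThreadPool(Math.max(1, partitions.size()));
        try {
            // Scatter: submit one local key scan per partition.
            List<Future<Set<Integer>>> futures = new ArrayList<>();
            for (Map<Integer, String> partition : partitions) {
                futures.add(pool.submit(localKeysJob(partition)));
            }
            // Gather: only keys travel back to the caller, never the values.
            Set<Integer> allKeys = new TreeSet<>();
            for (Future<Set<Integer>> future : futures) {
                allKeys.addAll(future.get());
            }
            return allKeys;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("local key scan failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        List<Map<Integer, String>> partitions = List.of(
                Map.of(1, "a", 4, "d"),
                Map.of(2, "b", 5, "e"),
                Map.of(3, "c"));
        System.out.println(collectAllKeys(partitions)); // prints [1, 2, 3, 4, 5]
    }
}
```

Each job still has to read its own partition, so the total work is unchanged.
The gain comes from every node scanning only its share in parallel.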

BR,
Andrei


Re: Efficiency of key queries

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

- I don't think Ignite can read indexed column values directly from the
index. It will load the key-value pair either way.
- Ignite stores key-value pairs together in pages, so when you iterate over
_key, all pages will be loaded into off-heap memory.
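
As a rough illustration of the two points above, here is a toy model in plain
Java. Everything in it is an assumption made for the sketch: the class
PageScanModel, the page size of four entries, and the LRU page cache are
hypothetical and do not reproduce Ignite's storage engine. It only shows that
when keys share pages with their values, a key-only scan touches every page,
and that a cache smaller than the data set re-reads all pages on each scan.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only; it does not use or reproduce Ignite's storage engine.
class PageScanModel {
    static final int ENTRIES_PER_PAGE = 4; // assumed, purely illustrative

    // Small LRU "memory" that holds a limited number of pages and counts disk reads.
    static class PageCache {
        int diskReads = 0;
        final LinkedHashMap<Integer, Boolean> resident;

        PageCache(final int capacity) {
            // accessOrder=true: the eldest entry is the least recently used page
            this.resident = new LinkedHashMap<Integer, Boolean>(16, 0.75f, true) {
                @Override
                protected boolean removeEldestEntry(Map.Entry<Integer, Boolean> eldest) {
                    return size() > capacity;
                }
            };
        }

        void touch(int pageId) {
            if (resident.get(pageId) == null) {
                diskReads++; // page miss: the whole page is read from disk
                resident.put(pageId, Boolean.TRUE);
            }
        }
    }

    // Collects only the keys, yet each key lives on the same page as its value.
    static List<Integer> scanKeys(int entryCount, PageCache cache) {
        List<Integer> keys = new ArrayList<>();
        for (int key = 0; key < entryCount; key++) {
            cache.touch(key / ENTRIES_PER_PAGE);
            keys.add(key);
        }
        return keys;
    }

    public static void main(String[] args) {
        PageCache small = new PageCache(5);  // data (25 pages) exceeds memory
        scanKeys(100, small);
        scanKeys(100, small);
        System.out.println(small.diskReads); // prints 50

        PageCache large = new PageCache(25); // data fits in memory
        scanKeys(100, large);
        scanKeys(100, large);
        System.out.println(large.diskReads); // prints 25
    }
}
```

In this model the second scan is free once all pages fit in memory, but costs
a full re-read otherwise, which would be consistent with requests speeding up
disproportionately on smaller data sets.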

I think that there is no easy solution for your use case.

Regards,
-- 
Ilya Kasnacheev

