Posted to user@ignite.apache.org by Raymond Wilson <ra...@trimble.com> on 2018/10/17 20:04:46 UTC

IO implications of ScanQuery

I have a potential workflow where I may need to find a set of elements in
a cache where the values are not small (say 1-100 KB) and where the
number of elements in the cache may be large (many millions).

Each key contains fields the scan query could use to select the entries I
want.

Will the scan query read only keys from the store while it determines
which items in the cache match the query, and then pull the matching values
to return to the scan query client? Or does the scan query read all
key/value data and then apply its conditions before returning the matching
items?

Thanks,

Raymond.

Re: IO implications of ScanQuery

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

I would assume the answer is no, but you can also write some fields as named
fields with the regular binary writer and write the rest with the raw binary
writer. That way you should be able to index and query the named fields.

Can you try both approaches and see which works for you?

Regards.
-- 
Ilya Kasnacheev


On Thu, 18 Oct 2018 at 23:12, Raymond Wilson <ra...@trimble.com> wrote:

> Hi Ilya,
>
> Thanks for the clarification. I assume this is not the case if the cache is
> represented as an SQL table, as the SQL table will construct a primary key
> from the entire key of the ICache<TK, TV> cache.
>
> Shifting the question a little: if the keys are serialized via a raw
> IBinarizable interface implemented on the key types (so the field names are
> not included in the serialization), do SQL queries still allow predicate
> functions that refer to those fields in the key? My assumption is ‘yes’ 😊.
>
> Thanks,
>
> Raymond.

RE: IO implications of ScanQuery

Posted by Raymond Wilson <ra...@trimble.com>.
Hi Ilya,

Thanks for the clarification. I assume this is not the case if the cache is
represented as an SQL table, as the SQL table will construct a primary key
from the entire key of the ICache<TK, TV> cache.

Shifting the question a little: if the keys are serialized via a raw
IBinarizable interface implemented on the key types (so the field names are
not included in the serialization), do SQL queries still allow predicate
functions that refer to those fields in the key? My assumption is ‘yes’ 😊.

Thanks,

Raymond.



*From:* Ilya Kasnacheev <il...@gmail.com>
*Sent:* Thursday, October 18, 2018 11:06 PM
*To:* user@ignite.apache.org
*Subject:* Re: IO implications of ScanQuery



Hello!



As far as I understand, keys are stored alongside values in
Durable Memory pages, and both are likely pulled from those pages during a
scan query.

This means the latter explanation is closer to reality.



Regards,

-- 

Ilya Kasnacheev






Re: IO implications of ScanQuery

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

As far as I understand, keys are stored alongside values in
Durable Memory pages, and both are likely pulled from those pages during a
scan query.

This means the latter explanation is closer to reality.
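
To make the "latter" behaviour concrete, here is a minimal sketch in plain
Java. It is not Ignite code: the Key and Entry classes, the scan method and
the valueBytesRead counter are hypothetical stand-ins, and the model simply
assumes that each entry is read as a whole before the filter sees its key.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class ScanModel {
    // Hypothetical key type: small, with fields a filter can inspect.
    static final class Key {
        final int region;
        final long id;
        Key(int region, long id) { this.region = region; this.id = id; }
    }

    // Hypothetical entry: key and value live together, as on one page.
    static final class Entry {
        final Key key;
        final byte[] value;
        Entry(Key key, byte[] value) { this.key = key; this.value = value; }
    }

    // Value bytes touched by the most recent scan (illustrative counter only).
    static long valueBytesRead = 0;

    // Model of a full scan: every entry is read as a whole (key and value),
    // and only then is the filter applied to decide whether to return it.
    static List<Entry> scan(List<Entry> cache, Predicate<Key> filter) {
        valueBytesRead = 0;
        List<Entry> matches = new ArrayList<>();
        for (Entry e : cache) {
            valueBytesRead += e.value.length; // counted even when the key does not match
            if (filter.test(e.key)) {
                matches.add(e);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<Entry> cache = Arrays.asList(
            new Entry(new Key(1, 100L), new byte[1024]),
            new Entry(new Key(2, 200L), new byte[2048]),
            new Entry(new Key(1, 300L), new byte[4096]));

        List<Entry> hits = scan(cache, k -> k.region == 2);
        // Prints: matches=1, valueBytesRead=7168
        System.out.println("matches=" + hits.size() + ", valueBytesRead=" + valueBytesRead);
    }
}
```

In this model one entry matches, yet all three values are touched, which is
the IO cost the original question asks about.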

Regards,
-- 
Ilya Kasnacheev


On Wed, 17 Oct 2018 at 23:05, Raymond Wilson <ra...@trimble.com> wrote:

> I have a potential workflow where I may need to find a set of elements in
> a cache where the values are not small (say 1-100 KB) and where the
> number of elements in the cache may be large (many millions).
>
> Each key contains fields the scan query could use to select the entries I
> want.
>
> Will the scan query read only keys from the store while it determines
> which items in the cache match the query, and then pull the matching values
> to return to the scan query client? Or does the scan query read all
> key/value data and then apply its conditions before returning the matching
> items?
>
> Thanks,
>
> Raymond.