You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@ignite.apache.org by adipro <ad...@zohocorp.com> on 2020/05/11 16:11:39 UTC

Suggest a better way to access a particular K-V store

The K is key with type String.
The V is the value with type JSONObject/HashMap.

V is having the following structure:

{"agentId":<long>,"score":<double>,"url":<string>}

Now the query on this will be in such a way that -> Get top 50 values of "K"
where values in that K-V store are sorted based on "score". Provided that
K-V store has 10-50 million K-V pairs.

Can someone please suggest which data structure I need to use to solve this
particular case?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Suggest a better way to access a particular K-V store

Posted by akorensh <al...@gmail.com>.
Hi,
   Use these tips for memory planning:
https://apacheignite.readme.io/docs/capacity-planning#memory-capacity-planning-example

 https://apacheignite.readme.io/docs/capacity-planning#capacity-planning-faq 
(this has a spreadsheet w/capacity calculator)


 You could make you cache partitioned, and use compute jobs to run the query
simultaneously on multiple machines then reduce the results to have only the
top 50.

   see: https://apacheignite.readme.io/docs/compute-grid
    https://apacheignite.readme.io/docs/compute-tasks
   https://apacheignite.readme.io/docs/cache-modes
Thanks, Alex



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Suggest a better way to access a particular K-V store

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Please try doing cursor.close() before return;

Regards,
-- 
Ilya Kasnacheev


ср, 3 июн. 2020 г. в 09:17, adipro <ad...@zohocorp.com>:

> Can someone please help regarding this issue?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Suggest a better way to access a particular K-V store

Posted by adipro <ad...@zohocorp.com>.
Can someone please help regarding this issue?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Suggest a better way to access a particular K-V store

Posted by adipro <ad...@zohocorp.com>.
Thanks for the reply.

For your reference I'm sharing the code.. Please have a look at it..

```
public Set<String> execQuery(Object... args){
            Set<String> urls = new LinkedHashSet<>();
            Long start = (Long)objects[2];
            Long end = (Long)objects[3];
            int i = 0;
            SqlFieldsQuery query = new SqlFieldsQuery("SELECT * FROM URLS
WHERE PROCESS_APPNAME_ID = ? ORDER BY SCORE LIMIT ?"));
            List queryObj = new ArrayList<>();
            queryObj.add(getGlobalAppId(objects[0].toString()));
            if(start != null && end != null){
                queryObj.add(end-start);
            }
            query.setArgs(queryObj.toArray());
            FieldsQueryCursor<List&lt;?>> cursor =
cacheHolder.getCache().query(query);
            for (List<?> row : cursor) {
                urls.add((String)row.get(1));
                i++;
            }
            return urls;
}
```

Is it that cursor.close() is missed here? What happens if it's not there?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Suggest a better way to access a particular K-V store

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Is it possible that you are not closing your JDBC result sets (or
corresponding Ignite QueryCursor)?

It's actually strange. The query in question seem rather bulky. It's not
expected that it is sufficiently fast for you.

Regards,
-- 
Ilya Kasnacheev


вт, 19 мая 2020 г. в 20:36, adipro <ad...@zohocorp.com>:

> My query is
>
> SELECT * FROM URLS WHERE APPNAME_ID = ? ORDER BY SCORE LIMIT ?
>
> That is giving me 0.7-1.0 msec read performance for a test run for about
> some time. But the index with URLS (SCORE ASC, APPNAME_ID), I'm getting a
> read performance of about 0.3 to 0.5 msec. i found it to be constant when
> data grows.. whereas the index which you provided is getting delayed with
> growing data. Ours is write intensive application.
>
> But the thing is in deployment servers we are constantly receiving these
> warnings
>
> ```
> [22:26:31,046][WARNING][query-#48349][IgniteH2Indexing] Long running query
> is finished [duration=3006ms, type=MAP, distributedJoin=false,
> enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
> "__Z0"."ID" "__C0_0",
> "__Z0"."URL" "__C0_1",
> "__Z0"."SCORE" "__C0_2",
> "__Z0"."APPNAME_ID" "__C0_3"
> FROM "PUBLIC"."URLS" "__Z0"
> WHERE "__Z0"."APPNAME_ID" = ?1
> ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
>     __Z0.ID AS __C0_0,
>     __Z0.URL AS __C0_1,
>     __Z0.SCORE AS __C0_2,
>     __Z0.APPNAME_ID AS __C0_3
> FROM PUBLIC.URLS __Z0
>     /* PUBLIC.IDX_2_URLS */
>     /* scanCount: 1106428 */
> WHERE __Z0.APPNAME_ID = ?1
> ORDER BY 3
> FETCH FIRST ?2 ROWS ONLY
> /* index sorted */, node=TcpDiscoveryNode
> [id=9a62f525-98a8-43d6-85da-2270f1ee4e7a,
> consistentId=9a62f525-98a8-43d6-85da-2270f1ee4e7a, addrs=ArrayList
> [127.0.0.1, 172.20.42.17], sockAddrs=HashSet [/172.20.42.17:0,
> /127.0.0.1:0], discPort=0, order=3, intOrder=3,
> lastExchangeTime=1589893316075, loc=false,
> ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=262124, segment=0]
> ```
>
>
> data metric at the point of warning ->
>
> ```
> [22:34:05,866][INFO][grid-timeout-worker-#407][IgniteKernal]
> Metrics for local node (to disable set 'metricsLogFrequency' to 0)
>     ^-- Node [id=4f2b808c, uptime=04:19:01.321]
>     ^-- H/N/C [hosts=4, nodes=4, CPUs=192]
>     ^-- CPU [cur=0.03%, avg=13.79%, GC=0%]
>     ^-- PageMemory [pages=927396]
>     ^-- Heap [used=1521MB, free=62.86%, comm=4096MB]
>     ^-- Off-heap [used=3665MB, free=56.84%, comm=8392MB]
>     ^--   sysMemPlc region [used=0MB, free=99.98%, comm=100MB]
>     ^--   default region [used=3665MB, free=55.26%, comm=8192MB]
>     ^--   metastoreMemPlc region [used=0MB, free=99.94%, comm=0MB]
>     ^--   TxLog region [used=0MB, free=100%, comm=100MB]
>     ^-- Ignite persistence [used=3622MB]
>     ^--   sysMemPlc region [used=0MB]
>     ^--   default region [used=3622MB]
>     ^--   metastoreMemPlc region [used=0MB]
>     ^--   TxLog region [used=0MB]
>     ^-- Outbound messages queue [size=0]
>     ^-- Public thread pool [active=0, idle=0, qSize=0]
>     ^-- System thread pool [active=0, idle=200, qSize=0]
>     ^-- Striped thread pool [active=2, idle=198, qSize=0]
> ```
>
> Can you please tell why these warnings are coming. Although in client side,
> I added a check if query delays to print a warning. But it didn't through
> any warnings in client machine. It's weird why this warning is coming in
> server logs.
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Suggest a better way to access a particular K-V store

Posted by adipro <ad...@zohocorp.com>.
My query is 

SELECT * FROM URLS WHERE APPNAME_ID = ? ORDER BY SCORE LIMIT ?

That is giving me 0.7-1.0 msec read performance for a test run for about
some time. But the index with URLS (SCORE ASC, APPNAME_ID), I'm getting a
read performance of about 0.3 to 0.5 msec. i found it to be constant when
data grows.. whereas the index which you provided is getting delayed with
growing data. Ours is write intensive application.

But the thing is in deployment servers we are constantly receiving these
warnings

```
[22:26:31,046][WARNING][query-#48349][IgniteH2Indexing] Long running query
is finished [duration=3006ms, type=MAP, distributedJoin=false,
enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 1106428 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode
[id=9a62f525-98a8-43d6-85da-2270f1ee4e7a,
consistentId=9a62f525-98a8-43d6-85da-2270f1ee4e7a, addrs=ArrayList
[127.0.0.1, 172.20.42.17], sockAddrs=HashSet [/172.20.42.17:0,
/127.0.0.1:0], discPort=0, order=3, intOrder=3,
lastExchangeTime=1589893316075, loc=false,
ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=262124, segment=0]
```


data metric at the point of warning ->

```
[22:34:05,866][INFO][grid-timeout-worker-#407][IgniteKernal] 
Metrics for local node (to disable set 'metricsLogFrequency' to 0)
    ^-- Node [id=4f2b808c, uptime=04:19:01.321]
    ^-- H/N/C [hosts=4, nodes=4, CPUs=192]
    ^-- CPU [cur=0.03%, avg=13.79%, GC=0%]
    ^-- PageMemory [pages=927396]
    ^-- Heap [used=1521MB, free=62.86%, comm=4096MB]
    ^-- Off-heap [used=3665MB, free=56.84%, comm=8392MB]
    ^--   sysMemPlc region [used=0MB, free=99.98%, comm=100MB]
    ^--   default region [used=3665MB, free=55.26%, comm=8192MB]
    ^--   metastoreMemPlc region [used=0MB, free=99.94%, comm=0MB]
    ^--   TxLog region [used=0MB, free=100%, comm=100MB]
    ^-- Ignite persistence [used=3622MB]
    ^--   sysMemPlc region [used=0MB]
    ^--   default region [used=3622MB]
    ^--   metastoreMemPlc region [used=0MB]
    ^--   TxLog region [used=0MB]
    ^-- Outbound messages queue [size=0]
    ^-- Public thread pool [active=0, idle=0, qSize=0]
    ^-- System thread pool [active=0, idle=200, qSize=0]
    ^-- Striped thread pool [active=2, idle=198, qSize=0]
```

Can you please tell why these warnings are coming. Although in client side,
I added a check if query delays to print a warning. But it didn't through
any warnings in client machine. It's weird why this warning is coming in
server logs.



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Suggest a better way to access a particular K-V store

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Have you tried adding an index on URLS (APPNAME_ID, SCORE DESC)? (or ASC if
you are going to order ascending)

Regards,
-- 
Ilya Kasnacheev


вт, 12 мая 2020 г. в 13:42, adipro <ad...@zohocorp.com>:

> We've already tried that but we are receiving these errors. We are using
> index in a correct way only
> We are using Ignite persistence.
>
> ```
> [15:32:19,481][WARNING][long-qry-#110][LongRunningQueryManager] Query
> execution is too long [duration=3898ms, type=MAP, distributedJoin=false,
> enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
> "__Z0"."ID" "__C0_0",
> "__Z0"."URL" "__C0_1",
> "__Z0"."SCORE" "__C0_2",
> "__Z0"."APPNAME_ID" "__C0_3"
> FROM "PUBLIC"."URLS" "__Z0"
> WHERE "__Z0"."APPNAME_ID" = ?1
> ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
>     __Z0.ID AS __C0_0,
>     __Z0.URL AS __C0_1,
>     __Z0.SCORE AS __C0_2,
>     __Z0.APPNAME_ID AS __C0_3
> FROM PUBLIC.URLS __Z0
>     /* PUBLIC.IDX_2_URLS */
>     /* scanCount: 1342938 */
> WHERE __Z0.APPNAME_ID = ?1
> ORDER BY 3
> FETCH FIRST ?2 ROWS ONLY
> /* index sorted */, node=TcpDiscoveryNode
> [id=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8,
> consistentId=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8, addrs=ArrayList
> [127.0.0.1, 192.168.177.58], sockAddrs=HashSet [/127.0.0.1:0,
> /192.168.177.58:0], discPort=0, order=4, intOrder=4,
> lastExchangeTime=1589276400188, loc=false,
> ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=482, segment=0]
> ```
> The thing is we are running this in an 8 GB machine. According to the
> documents about tuning, we were told that 70% of the RAM should be given to
> heap and non heap so we gave 2GB heap + 2GB offheap + 1GB
> directbuffmemory..
>
> Till 60Lakh records, it didn't throw any warnings, but after that we are
> receiving many warnings.
>
> Is it got to do something with Off-heap size or BTree data size that is
> stored in RAM? Should we increase RAM? We don't want to increase RAM. Is it
> possible without increasing RAM? Or is there a way we can calculate how
> much
> RAM we need for how much data?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>

Re: Suggest a better way to access a particular K-V store

Posted by adipro <ad...@zohocorp.com>.
We've already tried that but we are receiving these errors. We are using
index in a correct way only
We are using Ignite persistence.

```
[15:32:19,481][WARNING][long-qry-#110][LongRunningQueryManager] Query
execution is too long [duration=3898ms, type=MAP, distributedJoin=false,
enforceJoinOrder=false, lazy=false, schema=PUBLIC, sql='SELECT
"__Z0"."ID" "__C0_0",
"__Z0"."URL" "__C0_1",
"__Z0"."SCORE" "__C0_2",
"__Z0"."APPNAME_ID" "__C0_3"
FROM "PUBLIC"."URLS" "__Z0"
WHERE "__Z0"."APPNAME_ID" = ?1
ORDER BY 3 FETCH FIRST ?2 ROWS ONLY', plan=SELECT
    __Z0.ID AS __C0_0,
    __Z0.URL AS __C0_1,
    __Z0.SCORE AS __C0_2,
    __Z0.APPNAME_ID AS __C0_3
FROM PUBLIC.URLS __Z0
    /* PUBLIC.IDX_2_URLS */
    /* scanCount: 1342938 */
WHERE __Z0.APPNAME_ID = ?1
ORDER BY 3
FETCH FIRST ?2 ROWS ONLY
/* index sorted */, node=TcpDiscoveryNode
[id=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8,
consistentId=1acf85ab-112a-4bbf-95f3-b6d3eb8c38b8, addrs=ArrayList
[127.0.0.1, 192.168.177.58], sockAddrs=HashSet [/127.0.0.1:0,
/192.168.177.58:0], discPort=0, order=4, intOrder=4,
lastExchangeTime=1589276400188, loc=false,
ver=8.7.10#20191227-sha1:c481441d, isClient=true], reqId=482, segment=0]
```
The thing is we are running this in an 8 GB machine. According to the
documents about tuning, we were told that 70% of the RAM should be given to
heap and non heap so we gave 2GB heap + 2GB offheap + 1GB directbuffmemory.. 

Till 60Lakh records, it didn't throw any warnings, but after that we are
receiving many warnings.

Is it got to do something with Off-heap size or BTree data size that is
stored in RAM? Should we increase RAM? We don't want to increase RAM. Is it
possible without increasing RAM? Or is there a way we can calculate how much
RAM we need for how much data?



--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Suggest a better way to access a particular K-V store

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

Ignite SQL table with index on score descending would fit it nicely.

You will have to convert JSON into BinaryObject or just extract score as a
column.

Regards,
-- 
Ilya Kasnacheev


пн, 11 мая 2020 г. в 19:11, adipro <ad...@zohocorp.com>:

> The K is key with type String.
> The V is the value with type JSONObject/HashMap.
>
> V is having the following structure:
>
> {"agentId":<long>,"score":<double>,"url":<string>}
>
> Now the query on this will be in such a way that -> Get top 50 values of
> "K"
> where values in that K-V store are sorted based on "score". Provided that
> K-V store has 10-50 million K-V pairs.
>
> Can someone please suggest which data structure I need to use to solve this
> particular case?
>
>
>
> --
> Sent from: http://apache-ignite-users.70518.x6.nabble.com/
>