Posted to user@ignite.apache.org by 胡永亮/Bob <hu...@neusoft.com> on 2016/10/18 06:34:56 UTC

Can't increase the speed of loadCache() when increasing more Ignite node

Hi, 

    I load data into Ignite from Oracle with loadCache().

    I load 1 million rows (100w). With one node in the Ignite cluster, the load takes 2m27s.
    With two nodes it takes 2m18s, with three 2m15s, and with four 2m11s.

    For comparison, reading the same 1 million rows directly through JDBC takes 40s.

    Why doesn't the loading speed increase with more Ignite nodes?

    Thanks.



Bob


---------------------------------------------------------------------------------------------------
Confidentiality Notice: The information contained in this e-mail and any accompanying attachment(s)
is intended only for the use of the intended recipient and may be confidential and/or privileged of
Neusoft Corporation, its subsidiaries and/or its affiliates. If any reader of this communication is
not the intended recipient, unauthorized use, forwarding, printing,  storing, disclosure or copying
is strictly prohibited, and may be unlawful.If you have received this communication in error,please
immediately notify the sender by return e-mail, and delete the original message and all copies from
your system. Thank you.
---------------------------------------------------------------------------------------------------

Re: Re: Can't increase the speed of loadCache() when increasing more Ignite node

Posted by 胡永亮/Bob <hu...@neusoft.com>.
Alexey, 

    I see. Thank you very much.


 
From: Alexey Kuznetsov
Date: 2016-10-18 15:06
To: user@ignite.apache.org
Subject: Re: Can't increase the speed of loadCache() when increasing more Ignite node
Bob,

In the current Ignite implementation, if you load data via a cache store, each node iterates over the whole data set
and keeps only those keys that satisfy the affinity function.

So,
 with one node: all keys are loaded.
 with two nodes: the first node iterates over the WHOLE data set but keeps only 50% of the keys; the same goes for the second node.
 with three nodes: each node iterates over the WHOLE data set but keeps only about 33% of the keys.

There is no general solution to speed up the load.





-- 
Alexey Kuznetsov



Re: Can't increase the speed of loadCache() when increasing more Ignite node

Posted by Manuel Núñez <ma...@hotmail.com>.
Have you tried partitioning your data? It's pretty simple: add a field (integer partitionId) to your table, so that each node loads only its own partitions. You can see an example here: http://apacheignite.gridgain.org/docs/data-loading


On 18 Oct 2016, at 9:06, Alexey Kuznetsov <ak...@apache.org> wrote:

Bob,

In the current Ignite implementation, if you load data via a cache store, each node iterates over the whole data set
and keeps only those keys that satisfy the affinity function.

So,
 with one node: all keys are loaded.
 with two nodes: the first node iterates over the WHOLE data set but keeps only 50% of the keys; the same goes for the second node.
 with three nodes: each node iterates over the WHOLE data set but keeps only about 33% of the keys.

There is no general solution to speed up the load.





--
Alexey Kuznetsov


Re: Can't increase the speed of loadCache() when increasing more Ignite node

Posted by Alexey Kuznetsov <ak...@apache.org>.
Bob,

In the current Ignite implementation, if you load data via a cache store,
each node iterates over the whole data set
and keeps only those keys that satisfy the affinity function.

So,
 with one node: all keys are loaded.
 with two nodes: the first node iterates over the WHOLE data set but keeps
only 50% of the keys; the same goes for the second node.
 with three nodes: each node iterates over the WHOLE data set but keeps
only about 33% of the keys.

There is no general solution to speed up the load.
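
The effect described above can be sketched in plain Java, with no Ignite
dependency. This is only an illustration under stated assumptions: ownerOf is
a hypothetical stand-in for the affinity function (a simple modulo, not
Ignite's real function), and the code just counts rows instead of reading
them from a database. The row count of 1,000,000 is taken from the question.

```java
final class FullScanLoadSketch {
    /** Hypothetical stand-in for the affinity function: maps a key to one of nodeCount nodes. */
    static int ownerOf(long key, int nodeCount) {
        return (int) Math.floorMod(key, (long) nodeCount);
    }

    /**
     * Simulates one node loading via a cache store in full-scan mode.
     * Returns {rowsScanned, rowsKept}.
     */
    static long[] loadOnNode(long totalRows, int nodeId, int nodeCount) {
        long scanned = 0;
        long kept = 0;
        for (long key = 0; key < totalRows; key++) {
            scanned++; // every node reads every row from the database
            if (ownerOf(key, nodeCount) == nodeId) {
                kept++; // only keys that map to this node are stored
            }
        }
        return new long[] {scanned, kept};
    }

    public static void main(String[] args) {
        long totalRows = 1_000_000L;
        for (int nodes = 1; nodes <= 4; nodes++) {
            long[] r = loadOnNode(totalRows, 0, nodes);
            System.out.printf("nodes=%d scannedPerNode=%d keptPerNode=%d%n", nodes, r[0], r[1]);
        }
    }
}
```

In this sketch the scanned count per node stays at the full table size for
every cluster size, and only the kept count shrinks. If reading the rows is
the dominant cost, adding nodes would not be expected to shorten the load
much, which is consistent with the timings in the question.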


On Tue, Oct 18, 2016 at 1:34 PM, 胡永亮/Bob <hu...@neusoft.com> wrote:

> Hi,
>
>     I load data into Ignite from Oracle with loadCache().
>
>     I load 1 million rows (100w). With one node in the Ignite cluster, the
> load takes 2m27s.
>     With two nodes it takes 2m18s, with three 2m15s, and with four 2m11s.
>
>     For comparison, reading the same 1 million rows directly through JDBC
> takes 40s.
>
>     Why doesn't the loading speed increase with more Ignite nodes?
>
>     Thanks.
>
> ------------------------------
> Bob
>



-- 
Alexey Kuznetsov

Re: Can't increase the speed of loadCache() when increasing more Ignite node

Posted by Manu <ma...@hotmail.com>.
Of course, it's not trivial, and changes to the database are required: either
a new field on the primary table (better), or a new "extended partition
table" with a 1-to-1 relationship to the primary table (primary table id,
partitionId). But with a CacheStoreAdapter implementation it's not that
complex. I would do the following:

1. Override the "write" method in your CacheStoreAdapter implementation to
ensure that new entries get the proper partition when they are written to the
database: using ignite.affinity(cacheName).partition(entryKey), insert/update
the new partition field on the primary table or on the "extended partition
table" (in the latter case you will need two inserts per write: one on the
primary table with the entity and a second one on the extended table with
the partitionId).
2. Override loadCache(IgniteBiInClosure<Object, BinaryObject> clo,
Object... args) in your CacheStoreAdapter implementation, with a flag
argument that selects full scan mode or partition mode (the latter using the
partitionId field on the primary table, or a join with the "extended
partition table").
3. Call cache.loadCache(null, fullScanFlag) for full scan mode.
4. Once loaded, iterate over the cache (cache.forEach...) and put each entry
back into the cache to force a re-write with the correct partition.
5. Now the table (primary or "extended partition table") is updated with the
correct partitions.
6. From now on you can call cache.loadCache(null) (for convenience, partition
mode is the default when no params are given) and each node will load only
its own partitioned data.
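
The partition-mode load from the steps above can be sketched in plain Java,
again without any Ignite dependency. Everything here is an assumption made
for illustration: PARTITIONS is a small made-up partition count, partitionOf
is a modulo stand-in for ignite.affinity(cacheName).partition(key), the map
stands in for a table indexed on partitionId, and the round-robin
localPartitions is not how Ignite actually assigns partitions to nodes.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class PartitionAwareLoadSketch {
    /** Small hypothetical partition count, for illustration only. */
    static final int PARTITIONS = 8;

    /** Stand-in for the partition lookup done when an entry is written (step 1). */
    static int partitionOf(long key) {
        return (int) Math.floorMod(key, (long) PARTITIONS);
    }

    /** Simulates a table indexed on partitionId: partitionId -> keys stored with it. */
    static Map<Integer, List<Long>> buildIndexedTable(long totalRows) {
        Map<Integer, List<Long>> table = new HashMap<>();
        for (long key = 0; key < totalRows; key++) {
            table.computeIfAbsent(partitionOf(key), p -> new ArrayList<>()).add(key);
        }
        return table;
    }

    /** Round-robin stand-in for asking the cluster which partitions a node owns. */
    static List<Integer> localPartitions(int nodeId, int nodeCount) {
        List<Integer> parts = new ArrayList<>();
        for (int p = 0; p < PARTITIONS; p++) {
            if (p % nodeCount == nodeId) {
                parts.add(p);
            }
        }
        return parts;
    }

    /** Rows a node reads when it queries only its own partitions (step 6). */
    static long rowsReadByNode(Map<Integer, List<Long>> table, int nodeId, int nodeCount) {
        long read = 0;
        for (int p : localPartitions(nodeId, nodeCount)) {
            read += table.getOrDefault(p, Collections.emptyList()).size();
        }
        return read;
    }

    public static void main(String[] args) {
        Map<Integer, List<Long>> table = buildIndexedTable(1_000_000L);
        for (int nodes = 1; nodes <= 4; nodes++) {
            System.out.printf("nodes=%d rowsReadByNode0=%d%n", nodes, rowsReadByNode(table, 0, nodes));
        }
    }
}
```

Unlike a full scan, the number of rows each node reads here shrinks as nodes
are added, because the filter on partitionId is applied on the database side.
In a real store the per-node query would be something like a WHERE clause on
the partitionId column, with the partition list obtained from Ignite itself.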

Take a look at Affinity Collocation
(https://apacheignite.readme.io/docs/affinity-collocation) to improve
performance, and for other important recommendations when using SQL joins
with partitioned caches.



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Can-t-increase-the-speed-of-loadCache-when-increasing-more-Ignite-node-tp8336p8347.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Can't increase the speed of loadCache() when increasing more Ignite node

Posted by Alexey Kuznetsov <ak...@apache.org>.
Manuel,

Good point!

But this *may* require some changes in the database, which in general is not
always possible.
Also, it is not so trivial to update the database with the correct partition ID.


On Tue, Oct 18, 2016 at 3:22 PM, Manu <ma...@hotmail.com> wrote:

> Have you tried partitioning your data? It's pretty simple: add a field
> (integer partitionId) to your table, so that each node loads only its
> own partitions. You can see an example here:
> http://apacheignite.gridgain.org/docs/data-loading#section-partition-aware-data-loading



-- 
Alexey Kuznetsov

Re: Can't increase the speed of loadCache() when increasing more Ignite node

Posted by Manu <ma...@hotmail.com>.
Have you tried partitioning your data? It's pretty simple: add a field
(integer partitionId) to your table, so that each node loads only its
own partitions. You can see an example here:
http://apacheignite.gridgain.org/docs/data-loading#section-partition-aware-data-loading



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Can-t-increase-the-speed-of-loadCache-when-increasing-more-Ignite-node-tp8336p8340.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.