Posted to user@ignite.apache.org by rishi007bansod <ri...@gmail.com> on 2016/12/22 06:36:12 UTC

Improve data loading speed from persistent data storage

I am loading data from an Oracle database using *cache.loadCache()*, but
loading is slow (it takes about *5 minutes to load 300 MB* of data). Can we
use *IgniteDataStreamer* here to improve the loading speed? Or is there any
other way to bulk-load data from the underlying persistent database?



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/Improve-data-loading-speed-from-persistant-data-storage-tp9692.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

Re: Improve data loading speed from persistent data storage

Posted by vkulichenko <va...@gmail.com>.
IgniteDataStreamer and loadCache() are different approaches and separate
APIs. If you want to use IgniteDataStreamer to load the data, simply start a
client node, fetch the data from the database, and stream it into the cluster
through the streamer.
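
A minimal sketch of the fetch-and-stream loop described above, written against
plain JDK types so it runs without a cluster. The class name ClientSideLoader,
the pump() helper and the sample rows are hypothetical. The assumption is that
in a real Ignite client the sink would be streamer::addData, with the streamer
obtained from ignite.dataStreamer("cacheName"), and that the rows would come
from a JDBC ResultSet rather than an in-memory list.

```java
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;

/** Sketch: pump rows fetched on a client node into a key/value sink. */
final class ClientSideLoader {
    /**
     * Drains 'rows' into 'sink' and returns the number of entries pushed.
     * Assumption: with Ignite, 'sink' would be streamer::addData.
     */
    static <K, V> long pump(Iterator<Map.Entry<K, V>> rows, BiConsumer<K, V> sink) {
        long count = 0;
        while (rows.hasNext()) {
            Map.Entry<K, V> row = rows.next();
            sink.accept(row.getKey(), row.getValue());
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for rows read from the database.
        List<Map.Entry<Long, String>> rows = List.of(
            Map.entry(1L, "alice"), Map.entry(2L, "bob"), Map.entry(3L, "carol"));

        // Stand-in for the cluster: a local map instead of a data streamer.
        Map<Long, String> cache = new ConcurrentHashMap<>();
        long loaded = pump(rows.iterator(), cache::put);
        System.out.println("Loaded " + loaded + " entries");
    }
}
```

Keeping the sink behind a BiConsumer means the same loop can be tried against a
local map first and pointed at a streamer later without changes.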

-Val




Re: Improve data loading speed from persistent data storage

Posted by rishi007bansod <ri...@gmail.com>.
Can you give an example of how we can *connect IgniteDataStreamer with
loadCache()*?




Re: Improve data loading speed from persistent data storage

Posted by Denis Magda <dm...@apache.org>.
Absolutely, give the IgniteDataStreamer approach a try. You can create a streamer per cache and inject the data from multiple threads in parallel. Also keep in mind that the streamer is tunable, and you can improve performance by changing some of its parameters.
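
A minimal sketch of feeding one shared sink from several threads, as suggested above. It uses only JDK classes; ParallelFeeder, feed() and the fetch function are hypothetical names, and a ConcurrentHashMap stands in for the cluster. The assumption is that with Ignite the sink would be streamer::addData on a single streamer shared by all threads. Tuning setters such as perNodeBufferSize(...) and perNodeParallelOperations(...) on IgniteDataStreamer are not exercised here, and suitable values would have to be found by measurement.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;
import java.util.function.LongFunction;

/** Sketch: split a key range into slices and feed each slice from its own thread. */
final class ParallelFeeder {
    /**
     * Feeds keys [0, total) to 'sink' using 'threads' workers and returns the
     * number of entries fed. The sink must be safe to call concurrently.
     */
    static <V> long feed(long total, int threads, LongFunction<V> fetch, BiConsumer<Long, V> sink) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Long>> parts = new ArrayList<>();
            long chunk = (total + threads - 1) / threads; // ceiling division
            for (int t = 0; t < threads; t++) {
                final long from = Math.min(total, t * chunk);
                final long to = Math.min(total, from + chunk);
                parts.add(pool.submit(() -> {
                    for (long k = from; k < to; k++)
                        sink.accept(k, fetch.apply(k)); // e.g. read row k, then push it
                    return to - from;
                }));
            }
            long fed = 0;
            for (Future<Long> part : parts)
                fed += part.get(); // wait for every slice to finish
            return fed;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Parallel load failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        Map<Long, String> cache = new ConcurrentHashMap<>(); // stand-in for the cluster
        long fed = ParallelFeeder.<String>feed(100_000, 4, k -> "row-" + k, cache::put);
        System.out.println("Fed " + fed + " entries from 4 threads");
    }
}
```

Slicing by contiguous key range keeps each worker independent; in a real loader each slice would typically map to its own database query.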

—
Denis



Re: Improve data loading speed from persistent data storage

Posted by dkarachentsev <dk...@gridgain.com>.
So most of the time, 3.5 min, is spent getting the data from the DB. The
rest, I think, could be the indexing process in the cache (you can check this
by removing setIndexedTypes()). To speed things up you could load the data
from a CSV file, but Ignite doesn't provide such a parser, so you would have
to implement it yourself. Of course, only if that suits your requirements.
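
A minimal sketch of such a hand-written CSV loader, using only JDK classes.
CsvFeeder and feed() are hypothetical names. The sketch assumes a numeric key
in the first column, a plain comma separator and no quoted fields; a real
export would need a proper CSV parser. With Ignite the sink would presumably
be streamer::addData, with the columns converted to the cache value type first.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.BiConsumer;

/** Sketch: read CSV lines and hand (key, columns) pairs to a sink. */
final class CsvFeeder {
    /** Returns the number of non-blank lines passed to 'sink'. */
    static long feed(Reader in, BiConsumer<Long, String[]> sink) {
        long count = 0;
        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty())
                    continue; // skip blank lines
                // Limit -1 keeps trailing empty columns instead of dropping them.
                String[] cols = line.split(",", -1);
                sink.accept(Long.parseLong(cols[0].trim()), cols);
                count++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical sample data; a real run would pass a FileReader.
        String csv = "1,alice,30\n2,bob,41\n";
        Map<Long, String[]> cache = new HashMap<>(); // stand-in for the cache
        long loaded = feed(new StringReader(csv), cache::put);
        System.out.println("Parsed " + loaded + " rows");
    }
}
```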




Re: Improve data loading speed from persistent data storage

Posted by rishi007bansod <ri...@gmail.com>.
Data loading time: 214126 milliseconds
DB size: 370 MB
Number of entries: 2.4 million

Only 1 node is used for loading data from the Oracle DB into the Ignite cache.
Indexing is applied only on primary keys (the default generated by schema
import).
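
For reference, these figures work out to roughly 11,200 entries per second
and about 1.7 MB/s. A small sketch of the arithmetic (LoadThroughput and
perSecond() are hypothetical names; the inputs are the numbers quoted above):

```java
/** Sketch: turn the reported load time into throughput figures. */
final class LoadThroughput {
    /** Amount processed per second, given the elapsed time in milliseconds. */
    static double perSecond(double amount, long elapsedMillis) {
        return amount / (elapsedMillis / 1000.0);
    }

    public static void main(String[] args) {
        long elapsedMillis = 214_126L; // reported data loading time
        System.out.printf("%.0f entries/s%n", perSecond(2_400_000, elapsedMillis));
        System.out.printf("%.2f MB/s%n", perSecond(370, elapsedMillis));
    }
}
```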




Re: Improve data loading speed from persistent data storage

Posted by dkarachentsev <dk...@gridgain.com>.
loadCache() already uses IgniteDataStreamer internally. How long does it take
to fetch all the data from the DB? Do you use SQL indexing for the entities?
How many data nodes are in the topology?
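
One way to answer the first question is to time the database fetch separately
from the cache insert. A minimal sketch with JDK classes only; PhaseTimer and
time() are hypothetical names, and the two phases in main() are stand-ins for
the real JDBC query and the real cache load.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Sketch: measure each loading phase on its own. */
final class PhaseTimer {
    /** Runs 'phase', prints how long it took, and returns its result. */
    static <T> T time(String label, Supplier<T> phase) {
        long start = System.nanoTime();
        T result = phase.get();
        long millis = (System.nanoTime() - start) / 1_000_000;
        System.out.println(label + " took " + millis + " ms");
        return result;
    }

    public static void main(String[] args) {
        // Stand-in for fetching all rows from the database.
        List<String> rows = time("fetch", () -> {
            List<String> out = new ArrayList<>();
            for (int i = 0; i < 100_000; i++)
                out.add("row-" + i);
            return out;
        });

        // Stand-in for putting the fetched rows into the cache.
        Map<Integer, String> cache = time("insert", () -> {
            Map<Integer, String> out = new HashMap<>();
            for (int i = 0; i < rows.size(); i++)
                out.put(i, rows.get(i));
            return out;
        });

        System.out.println("Loaded " + cache.size() + " entries");
    }
}
```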


