Posted to user@ignite.apache.org by nithin91 <ni...@franklintempleton.com> on 2020/02/18 13:34:03 UTC

Data Load from Oracle to Ignite is very slow

Hi,

I have multiple Oracle tables with more than 10 million rows. I want to load
these tables into an Ignite cache. To load the cache I am using the Cache JDBC
POJO Store, with the project structure generated by Web Console.

But loading the data with the Cache JDBC POJO Store (i.e.
ignite.cache("CacheName").loadCache(null)) is taking a lot of time. Is there
any alternative approach to load the data from the Oracle DB into an Ignite
cache?

I also tried the data streamer, but I am not clear on how to use it. It would
be helpful if someone could share sample code that loads data into an Ignite
cache using the data streamer.

I have one question regarding the usage of the data streamer.

The documentation describes the following way to use the data streamer:

  // Stream words into the streamer cache.
  for (String word : text)
    stmr.addData(word, 1L);

In my case, I would loop through the result set produced by executing the
prepared statement over a JDBC connection and add each row to the data
streamer. Will this be efficient, given that I have to loop through 10 million
rows? Please correct me if this is not the right way to use the data streamer.

 






--
Sent from: http://apache-ignite-users.70518.x6.nabble.com/

Re: Data Load from Oracle to Ignite is very slow

Posted by nithin91 <ni...@franklintempleton.com>.
Hi,

When I ran it on the server, the data loaded very fast. Thanks for the
inputs.

With respect to the data streamer, it would be really helpful if you could
share some sample code other than the one provided in the documentation.




Re: Data Load from Oracle to Ignite is very slow

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

There's no way loading 20,000 records should take 25 minutes. That's about 13
records per second. I just can't think of any reason why it might take such
a monumental amount of time.

With regard to the data streamer, as I have said, I recommend partitioning
your data and loading every segment from its own thread, using a shared data
streamer instance.
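
A minimal sketch of that pattern, using only the JDK so it can be run without
a cluster. The class SegmentedLoader, its load method, and the in-memory
reader are hypothetical names introduced for illustration; the shared sink is
a plain BiConsumer standing in for the shared streamer. With Ignite, the sink
would presumably be stmr::addData on a single IgniteDataStreamer that the
caller opens and closes, and the reader would run one JDBC query per segment.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;
import java.util.function.IntFunction;

class SegmentedLoader {

    /**
     * Reads each segment on its own thread and pushes every entry into one
     * shared sink. Returns the total number of entries pushed.
     */
    static <K, V> long load(int segments,
                            IntFunction<Iterable<Map.Entry<K, V>>> readSegment,
                            BiConsumer<K, V> sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(segments);
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (int i = 0; i < segments; i++) {
                final int segment = i;
                results.add(pool.submit(() -> {
                    long count = 0;
                    for (Map.Entry<K, V> e : readSegment.apply(segment)) {
                        sink.accept(e.getKey(), e.getValue());
                        count++;
                    }
                    return count;
                }));
            }
            long total = 0;
            for (Future<Long> f : results)
                total += f.get(); // rethrows a failure from any worker
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        int segments = 4;
        // Stand-in for the cache; assumption: the real sink is a shared streamer.
        Map<Long, String> cache = new ConcurrentHashMap<>();

        // Stand-in for one query per segment: ids 0..999 split by id % segments.
        IntFunction<Iterable<Map.Entry<Long, String>>> reader = seg -> {
            List<Map.Entry<Long, String>> rows = new ArrayList<>();
            for (long id = seg; id < 1000; id += segments)
                rows.add(new AbstractMap.SimpleEntry<Long, String>(id, "row-" + id));
            return rows;
        };

        long loaded = load(segments, reader, cache::put);
        System.out.println("Loaded " + loaded + " rows, cache size " + cache.size());
    }
}
```

The sink must be safe to call from several threads at once; the sketch uses a
ConcurrentHashMap for that reason.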

Regards,
-- 
Ilya Kasnacheev


On Tue, 18 Feb 2020 at 20:14, nithin91 <
nithinbharadwaj.govindaraju@franklintempleton.com> wrote:


Re: Data Load from Oracle to Ignite is very slow

Posted by nithin91 <ni...@franklintempleton.com>.
Hi,

We are doing a POC, so we are running it in local mode.

Currently it takes 25 minutes to load 20,000 records with the Cache JDBC POJO
Store, even though I pass an initial filter to exclude unnecessary records:

ignite.cache("PieProductRiskCache").loadCache(null,
    "ignite.example.IgniteUnixImplementation.PieProductRiskKey",
    "select * from Table where as_of_Date_Std='31-Dec-2019'");

Regarding the data streamer code I shared: is that the way to use data
streamers, or is there another way? Even if the approach is correct, won't it
still be slow, since we are looping through the result set?




Re: Data Load from Oracle to Ignite is very slow

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

The code that you have provided will not have any edge over CacheStore.

I'm not sure that running two nodes on the same machine is a sound approach
for benchmarking data load speed. Are you sure that performance is not
sufficient? What is the actual run time, and what are your expectations?

Regards,
-- 
Ilya Kasnacheev


On Tue, 18 Feb 2020 at 17:53, nithin91 <
nithinbharadwaj.govindaraju@franklintempleton.com> wrote:


Re: Data Load from Oracle to Ignite is very slow

Posted by nithin91 <ni...@franklintempleton.com>.
I have two nodes running on my local machine.

Following is the logic I implemented to load data using the data streamer.
Can you please check whether the implementation is correct, and can you also
share sample code showing how to push entries to the data streamer from
multiple threads?

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

import org.apache.ignite.Ignite;
import org.apache.ignite.IgniteDataStreamer;
import org.apache.ignite.Ignition;

public class PerformanceCacheStore {

    public static void main(String[] args) throws Exception {
        String url = "..";

        try (Ignite ignite = Ignition.start("Ignite-Client.xml");
             Connection conn = DriverManager.getConnection(url, "..", "..");
             PreparedStatement stmt = conn.prepareStatement(
                 "select  .. where COUNTRY_CODE=? and as_of_Date_Std=?");
             // Closing the streamer flushes any entries still buffered in it.
             IgniteDataStreamer<PieProductPerformanceKey, PieProductPerformance> stmr =
                 ignite.dataStreamer("PieProductPerformanceCache")) {

            stmt.setString(1, "USA");
            stmt.setString(2, "31-Dec-2019");

            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    PieProductPerformance perf = new PieProductPerformance();
                    /* perf setter methods */

                    PieProductPerformanceKey perfkey = new PieProductPerformanceKey();
                    /* perfkey setter methods */

                    stmr.addData(perfkey, perf);
                }
            }
        }

        // Printed only after the streamer has been closed and flushed.
        System.out.println("Completed");
    }
}





Re: Data Load from Oracle to Ignite is very slow

Posted by Ilya Kasnacheev <il...@gmail.com>.
Hello!

loadCache will pull all rows from the result set on all nodes. How many nodes
do you have?

10 million rows is actually a very modest number. I could understand you
worrying if you had 10B rows.

The best approach is to partition your table and push entries to the data
streamer from multiple threads, for example with "select * from table where
MOD(id, ?) = ?", parametrized with the number of threads and the thread
number.
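
As a rough sketch of the partitioning step, the helper below builds one
segment query and shows which values each thread would bind. The class name
PartitionedQuery, the table name PERF, and the column name ID are hypothetical,
and the sketch assumes a non-negative numeric id column. It uses Oracle's
MOD(a, b) function form and only prepares the SQL text; it does not open a
JDBC connection.

```java
class PartitionedQuery {

    /** SQL for one segment; bind (threadCount, threadIndex) to the placeholders. */
    static String segmentSql(String table, String idColumn) {
        return "SELECT * FROM " + table + " WHERE MOD(" + idColumn + ", ?) = ?";
    }

    /** The same predicate in Java, useful for checking segment coverage. */
    static boolean inSegment(long id, int threadCount, int threadIndex) {
        // Assumption: id is non-negative, so the remainder is in [0, threadCount).
        return id % threadCount == threadIndex;
    }

    public static void main(String[] args) {
        int threadCount = 4;
        String sql = segmentSql("PERF", "ID");
        for (int thread = 0; thread < threadCount; thread++) {
            // Each loader thread would run the same statement with its own index.
            System.out.println(sql + "  -- bind: " + threadCount + ", " + thread);
        }
    }
}
```

Because every id leaves exactly one remainder, each row lands in exactly one
segment, so the threads never load the same row twice.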

Regards,
-- 
Ilya Kasnacheev


On Tue, 18 Feb 2020 at 16:34, nithin91 <
nithinbharadwaj.govindaraju@franklintempleton.com> wrote:
