Posted to user@ignite.apache.org by arthi <Ar...@nielsen.com> on 2016/04/26 10:25:35 UTC

loadCache takes long time to complete with million rows

Hi Team,

I am loading a partitioned cache with 30 million rows from a persistence
store using the loadCache API.
The data gets loaded into the cache, but the process takes a long time to
complete.

Here is the config - 
<bean class="org.apache.ignite.configuration.CacheConfiguration">
    <property name="name" value="SHOP_ITEM_BITMAP_CACHE"/>
    <property name="eagerTtl" value="false"/>
    <property name="copyOnRead" value="false"/>
    <property name="atomicityMode" value="ATOMIC"/>
    <property name="backups" value="0"/>
    <property name="memoryMode" value="OFFHEAP_TIERED"/>
    <property name="offHeapMaxMemory" value="0"/>
    <property name="swapEnabled" value="false"/>
    <property name="cacheMode" value="PARTITIONED"/>
    <property name="affinity">
        <bean class="org.apache.ignite.cache.affinity.rendezvous.RendezvousAffinityFunction">
            <property name="partitions" value="64"/>
        </bean>
    </property>
    <property name="cacheStoreFactory">
        <bean class="javax.cache.configuration.FactoryBuilder$SingletonFactory">
            <constructor-arg>
                <bean class="com.nielsen.poc.aggregation.ignite.datagrid.store.ShopItemBitmapStore"/>
            </constructor-arg>
        </bean>
    </property>
    <property name="readThrough" value="true"/>
    <property name="queryEntities">
        <list>
            <bean class="org.apache.ignite.cache.QueryEntity">
                <property name="keyType" value="org.apache.ignite.cache.AffinityKey"/>
                <property name="valueType" value="com.nielsen.poc.aggregation.ignite.datagrid.model.ShopItemBitmap"/>
                <property name="fields">
                    <map>
                        <entry key="id" value="java.lang.Integer"/>
                        <entry key="sid_per_id" value="java.lang.Integer"/>
                        <entry key="sid_mah_id" value="java.lang.Integer"/>
                        <entry key="sid_itm_id" value="java.lang.Integer"/>
                        <entry key="sid_prm_id" value="java.lang.Integer"/>
                        <entry key="sid_cha_code" value="java.lang.String"/>
                        <entry key="sid_service" value="java.lang.String"/>
                        <entry key="sid_itm_dist" value="java.lang.String"/>
                        <entry key="category" value="java.lang.String"/>
                    </map>
                </property>
                <property name="indexes">
                    <list>
                        <bean class="org.apache.ignite.cache.QueryIndex">
                            <constructor-arg index="0">
                                <list>
                                    <value>sid_mah_id</value>
                                    <value>category</value>
                                    <value>sid_per_id</value>
                                </list>
                            </constructor-arg>
                            <constructor-arg index="1" value="SORTED"/>
                        </bean>
                    </list>
                </property>
            </bean>
        </list>
    </property>
</bean>

Can you please help me see what the issue is?
The same loadCache API works faster for smaller data sets.

Thanks,
Arthi



--
View this message in context: http://apache-ignite-users.70518.x6.nabble.com/loadCache-takes-long-time-to-complete-with-million-rows-tp4534.html
Sent from the Apache Ignite Users mailing list archive at Nabble.com.

RE: loadCache takes long time to complete with million rows

Posted by vkulichenko <va...@gmail.com>.
Hi Arthi,

I'm a bit confused. You say that the data is in cache after 30 minutes, but
the process finishes only after 3 hours. While debugging this, did you get
any idea what is happening during this additional time?

-Val




RE: loadCache takes long time to complete with million rows

Posted by arthi <Ar...@nielsen.com>.
Hi Val,

I debugged it. I have 11 nodes in my cluster. I have a partitioned cache,
and it has affinity co-location defined.

A cache load of 29786526 values into the partitioned cache takes
close to 3 hours. But when I monitor the cache during loading, the entire
data set is in the cache after just 30 minutes. I use
cache.size(CachePeekMode.OFFHEAP) to peek into the cache. But the process
takes an extra 2.5 hours to complete.

I am loading the cache progressively over multiple runs, and each run needs
to load 29786526 values. Ideally, I would want each run to take a
little more than 30 minutes, but it takes close to 3 hours. Is this related to
any commit config, or to the data rebalancing config?

Please advise.

Thanks,
Arthi





RE: loadCache takes long time to complete with million rows

Posted by vkulichenko <va...@gmail.com>.
Hi Arthi,

I see that you're using your own implementation of the store, so it's really
hard to say why it is slow. I would recommend debugging the code first to
see where the time is spent and whether resources are fully utilized. One of
the first possible optimizations would be to load the data in a multithreaded
fashion within the CacheStore.loadCache() implementation.
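
As a rough illustration of the multithreaded loading idea, here is a minimal sketch using only the JDK. It is not the actual store from this thread: BiConsumer stands in for the closure that Ignite passes to CacheStore.loadCache(), loadRange() is a hypothetical placeholder for a database read, and the sketch assumes that the closure may safely be called from several threads at once.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.BiConsumer;

// Sketch only: BiConsumer stands in for the closure Ignite hands to
// CacheStore.loadCache(); the id ranges and loadRange() are hypothetical.
class ParallelLoadSketch {
    // Hypothetical stand-in for "read rows with ids in [from, to) from the database".
    static void loadRange(int from, int to, BiConsumer<Integer, String> clo) {
        for (int id = from; id < to; id++)
            clo.accept(id, "row-" + id);
    }

    // Splits [0, total) into chunks and loads the chunks on a fixed thread pool.
    static void loadAll(int total, int chunk, int threads,
                        BiConsumer<Integer, String> clo) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int from = 0; from < total; from += chunk) {
                final int start = from;
                final int end = Math.min(from + chunk, total);
                futures.add(pool.submit(() -> loadRange(start, end, clo)));
            }
            // Wait for every chunk; get() rethrows any loader failure.
            for (Future<?> f : futures)
                f.get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // A concurrent map plays the role of the cache in this sketch.
        ConcurrentHashMap<Integer, String> sink = new ConcurrentHashMap<>();
        loadAll(10_000, 1_000, 4, sink::put);
        System.out.println("loaded " + sink.size() + " entries");
    }
}
```

In a real store the chunk boundaries would more likely follow whatever the database can scan efficiently (for example key ranges), and the thread count would be tuned against the database and CPU limits.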

Hi Shaomin,

This depends on a number of parameters, such as the size of the values, the
network, etc. If you feel that performance in your test could be better,
please provide the code and we will take a look.

-Val




RE: loadCache takes long time to complete with million rows

Posted by Shaomin Zhang <Sh...@tudor.com>.
Hi Arthi

I am also loading a large number of records into a cache (not as many as you), and trying to understand the performance. I use a data streamer to load around 3M records into the cache in about 20 minutes, including reading from the database and putting into the cache. From the log file, I can see that putting objects into the cache takes nearly all the time. I am running multiple nodes on more than one hardware box, so there might be some network cost, but it takes roughly 0.4 ms per put. What are the measurements in your case?
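
For reference, the per-put figure follows directly from the totals: 20 minutes is 1,200,000 ms, and 1,200,000 ms / 3,000,000 records = 0.4 ms per record. A small sketch of that arithmetic is below; the class and method names are made up for illustration, and the second call uses the 29786526 values in about 30 minutes reported in this thread, which works out to roughly 0.06 ms per entry.

```java
// Sketch only: averages wall-clock time over the number of entries loaded.
class PutLatencySketch {
    // Average milliseconds per entry for a load of `entries` rows taking `minutes`.
    static double millisPerEntry(long entries, double minutes) {
        return minutes * 60_000.0 / entries;
    }

    public static void main(String[] args) {
        // About 3M records in about 20 minutes.
        System.out.println(millisPerEntry(3_000_000L, 20) + " ms per put");
        // 29786526 values visible in the cache after about 30 minutes.
        System.out.println(millisPerEntry(29_786_526L, 30) + " ms per entry");
    }
}
```

These are averages over the whole load, so they include database reads and network transfer, not only the put itself.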

Val, what is the typical performance of get/put operations in Ignite?

Thanks

Shaomin


Re: loadCache takes long time to complete with million rows

Posted by arthi <Ar...@nielsen.com>.
hi Val,

There is enough heap available. I initiated the process using 10g and the
utilization is below 5g.
profile.PNG
<http://apache-ignite-users.70518.x6.nabble.com/file/n4543/profile.PNG>  

Thanks,
Arthi




Re: loadCache takes long time to complete with million rows

Posted by Vladimir Ozerov <vo...@gridgain.com>.
Hi,

The most obvious reason is insufficient heap. Even though you specified
OFFHEAP_TIERED mode, Ignite still generates some intermediate objects on the
heap, as well as some long-lived ones. Please monitor your application with a
profiler to confirm that it has enough heap and that GC pauses are not too long.
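
If attaching a profiler is not convenient, a rough first check can be made from inside the JVM with the standard java.lang.management beans. The sketch below is only an illustration, with a made-up class name; it reports used heap and the cumulative GC time since JVM start, which is less detailed than a profiler or GC logs.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Sketch only: a coarse heap/GC snapshot using standard JDK management beans.
class HeapGcSnapshot {
    // Total approximate time (ms) spent in all collectors since JVM start.
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 when the collector does not report it
            if (t > 0)
                total += t;
        }
        return total;
    }

    // Heap bytes currently in use.
    static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        System.out.println("used heap (MB): " + usedHeapBytes() / (1024 * 1024));
        System.out.println("total GC time (ms): " + totalGcMillis());
    }
}
```

Sampling these two values periodically during the load would show whether GC time climbs steeply once the data is already in the cache.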

Vladimir.
